Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
Introduction
============
Apache MADlib is released as both a source tarball and a series of
binary convenience artifacts for Linux and Mac OS X operating systems.
The MADlib community takes great care to ensure that each release
complies with the ASF's release policy:
http://www.apache.org/legal/release-policy.html
The latest version of MADlib's recommended release process can be found
on the MADlib wiki: https://cwiki.apache.org/confluence/display/MADLIB/Release+Process
In all of this, MADlib looks like any other project developed at the Apache Software
Foundation. There is, however, one major difference that anybody reviewing
MADlib releases or considering consuming MADlib downstream needs to be aware of:
portions of the MADlib source code lack the obligatory ASF licensing header:
http://www.apache.org/legal/release-policy.html#license-headers
This is very much intentional and simply reflects the nature of the original
BSD license that MADlib had (more on that later in the Historical Background
section). In fact, this was explicitly approved by the ASF's VP Legal:
https://s.apache.org/EOT5
https://issues.apache.org/jira/browse/LEGAL-293
It does, however, trip up human reviewers as well as tools like the Apache Release Audit
Tool (RAT). For every release of MADlib, the community itself and all
downstream consumers (including external reviewers) have to make sure that every
NEW file added to the project carries the proper licensing header.
This may appear daunting at first, but with the few tips
summarized below it doesn't have to be.
Tips for reviewers and consumers of MADlib source code
======================================================
1. MADlib provides an exclusion list for the RAT tool in its pom.xml file.
Running RAT via
$ mvn apache-rat:check
and inspecting RAT's report afterwards provides a good baseline for which
source files don't need a license header.
2. A second level of validation is to see how this exclusion list differs
between the previous official release of MADlib and the one under review.
Running a simple diff or a git diff on the pom.xml file will provide all
the details.
3. Finally, a third level of validation is to see what new code was added to
the project. This is where you would have to use the magic of git by running
something along the lines of:
$ git diff --stat rel/XXXX..HEAD
where XXXX is the release tag of the official release immediately preceding the
one being reviewed. Correlating the output of this command with the RAT exclusion
list will provide a full understanding of where licensing headers belong and
where they don't.
4. For the really paranoid, you could always compare ANY release of MADlib to
the state of the source code base when it was imported into the ASF's repository
by running:
$ git diff --stat asf_import..HEAD
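Tips 2 and 3 can be partly automated. The following is a minimal sketch, not
part of the MADlib source tree: the function names (has_asf_header,
diff_rat_exclusions, list_new_files_without_header) are hypothetical, and the
header check is only a heuristic that assumes the phrase "Licensed to the
Apache Software Foundation" appears on a single line within the first 30 lines
of a file. It is meant to be sourced and run from the repository root.

```shell
#!/bin/sh
# Heuristic sketch (assumption): an ASF header is detected by this phrase
# appearing on one line near the top of the file.
ASF_MARKER='Licensed to the Apache Software Foundation'

# Succeeds (exit status 0) if the first 30 lines of file $1 contain the marker.
has_asf_header() {
    head -n 30 "$1" | grep -q -F "$ASF_MARKER"
}

# Tip 2: show how pom.xml (which holds the RAT exclusion list) changed
# between base ref $1 and HEAD.
diff_rat_exclusions() {
    git diff "$1"..HEAD -- pom.xml
}

# Tip 3: print every file that was added between base ref $1 and HEAD,
# still exists in the working tree, and has no detectable ASF header.
list_new_files_without_header() {
    git diff --name-only --diff-filter=A "$1"..HEAD |
    while IFS= read -r path; do
        [ -f "$path" ] || continue
        has_asf_header "$path" || printf '%s\n' "$path"
    done
}
```

For example, after sourcing the script, running
list_new_files_without_header rel/XXXX (with XXXX replaced by the previous
release tag) prints candidate files. Each printed path should then be checked
by hand against the RAT exclusion list, since the heuristic can miss headers
that are formatted unusually.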
Historical Background
=====================
Prior to the software grant to ASF on Sept 15, 2015 as an incubating project,
MADlib was an open-source library licensed under a 2-clause BSD license,
with multiple contributors since its inception in approximately 2011. After
the grant to ASF, the MADlib community requested guidance from ASF legal
regarding how to manage license headers for legacy BSD-licensed files,
modified BSD-licensed files, and new files. The intent of the request was
to ensure that the Apache MADlib (incubating) project was acting as a
"good Apache citizen" and respecting the guidelines of ASF with respect to
software licensing.
The ultimate resolution (articulated in LEGAL-293) came down to:
* leave the license headers of existing (BSD) files untouched, even if we edit those files
* every new file we create gets an ASF license header