Contents

Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

Introduction
============
Apache MADlib is released as both source tarball and a series of
binary convenience artifacts for Linux and Mac OS X operating systems.
MADlib's community takes great care of making sure that each release
is done in accordance with ASF's release policy:
http://www.apache.org/legal/release-policy.html

The latest state of the recommended MADlib's release process can be found
on MADlib's wiki: https://cwiki.apache.org/confluence/display/MADLIB/Release+Process

In all this, MADlib looks like any other project developed in Apache Software
Foundation. There is, however, one major difference that anybody reviewing
MADlib releases or considering to consume MADlib downstream need to be aware of:
portions of MADlib source code lack the obligatory ASF licensing header information:
  http://www.apache.org/legal/release-policy.html#license-headers

This is very much intentional and simply reflects the nature of the original
BSD license that MADlib had (more on that later in the Historical Background
section). In fact, this was explicitly approved by the ASF's VP Legal:
   https://s.apache.org/EOT5
   https://issues.apache.org/jira/browse/LEGAL-293

It does, however, trip up human reviewers and also tools like Apache Release Audit
Tool (RAT). Basically, for every release of MADlib the community itself and all the
downstream consumers (including external reviewers) have to make sure that for any
NEW file added to the project the proper licensing header is added as well.

This could appear as a daunting task at first, but fortunately with a few tips
summarized below it doesn't have to be.


Tips for reviewers and consumers of MADlib source code
=====================================================
  1. MADlib provides an exclusion list for RAT tool in its pom.xml file.
  Running RAT via
     $ mvn apache-rat:check
  and ispecting RAT's report afterwards provides a good baseline on which
  source files don't need to have an license header.

  2. A second level of validation is to see how this exclusion list differs
  between the previous official release of MADlib and the one under review.
  Running a simple diff or a git diff on the pom.xml file will provide all
  the details.

  3. Finally a 3d level of validation is to see what new code was added to
  the project. This is where you would have to use the magic of git by running
  something along the lines of:
     $ git diff --stat rel/XXXX..HEAD
  where XXX is the release tag of an official release immediately preceding the
  one being reviewed. Correlating the output of this command with RAT list will
  provide a full understanding of where licensing headers belong and where they
  don't.

  4. For the really paranoid, you could always compare ANY release of MADlib to
  the state of the source code base when it was imported into the ASF's repository
  by running:
     $ git diff --stat asf_import..HEAD


Historical Background
=====================
Prior to the software grant to ASF on Sept 15, 2015 as an incubating project,
MADlib was an open-source library licensed under a 2-clause BSD license,
with multiple contributors since its inception in approximately 2011. After
the grant to ASF, the MADlib community requested guidance from ASF legal
regarding how to manage license headers for legacy BSD-licensed files,
modified BSD-licensed files, and new files.  The intent of the request was
to ensure that the Apache MADlib (incubating) project was acting as a
"good Apache citizen" and respecting the guidelines of ASF with respect to
software licensing.

Ultimate resolution (articulated in LEGAL-293) came down to:
   * don't do anything with existing (BSD) files even if we edit them
   * every new file we create gets an ASF license header