American National Corpus

The Linguistic Data Consortium will distribute the American National Corpus and handle all intellectual property arrangements and license agreements. Members of the ANC Steering Committee have settled on a general approach with two types of license, one more restricted and one less restricted. The more restricted license is modeled on the licenses used in the BNC effort. The less restricted license will be modeled on the Open Publication License, adapted to the particular circumstances of this project.

Background: BNC licenses

BNC permissions discussion and permissions letters, including the initial contact letter (used to open a dialogue with the author/publisher and find out whether and how to go further), the spoken permission letter used for speakers in the spoken part of the corpus, the "second permissions request" (the crucial document in which authors and publishers actually grant a license to the BNC), the background letter (a sort of BNC FAQ intended to reassure authors and publishers about what they are involved with), and the end user license (which corpus recipients sign, and which is referenced in the "second permissions request").

Background: "Open" licenses

Licenses for open/free documentation are listed on the GNU web site. Of these, the GNU Free Documentation License is probably not suitable, as it includes permission for modification, which is useful for documentation but is not appropriate here. However, liberal use of the options for "invariant sections" and "cover texts" might obviate this problem. The Open Content License also permits modification, with no apparent provision for exceptions. The Open Publication License has an option to forbid "substantive" modification (i.e., changes in formatting are OK, changes in content are not).

Discussion

The ANC will have two general license types, a more restricted and a less restricted one, applying to two different parts of the corpus. We have adapted the BNC end user license for the more restricted portion of the corpus and the Open Publication License for the less restricted portion. We urge authors and publishers to use the less restricted license. For all the material, the license has to be worldwide and perpetual. Both the open and restricted licenses explicitly register our understanding that short quotations in lexicographic works are permitted. The permissions request letter should also include such language.

Components

License Agreements

Data Contribution