Text and Data Mining Library Databases: Known Vendor Policies

Guidance for text and data mining of subscription resources

Known policies

This is not an exhaustive list, and we cannot promise this list will always have the most current information. Please contact us if you have questions or for information on other possible data sources.

Procedures and policies can change very quickly; this list is only meant to give a sense of the different approaches to obtaining text mining data. As the library works with vendors, we will provide a synopsis of procedures and policies below.

 

Each entry lists the provider, whether there is a fee, whether licensing is involved, and additional notes.

Association for Computing Machinery

(ACM Digital Library, ACM Transactions)

Fee: No
Licensing involved: Yes; researcher negotiates and signs directly with provider.
Additional notes: Researcher contacts ACM and describes project. ACM decides whether to allow access.

Elsevier

(ScienceDirect, Compendex/INSPEC, ClinicalKey, etc.)

Fee: No
Licensing involved: Necessary text-mining addenda have been added to library licenses; click-through license for researcher on accessing API.
Additional notes: Access is only available through API. For access to data not available through API, researcher must contact Elsevier directly and negotiate.

Gale 

(17th, 18th, and 19th C. British Newspapers, Times of London Archive, etc.)

Fee: Yes
Licensing involved: Still TBD.
Additional notes: Only Gale primary source collections are available, not journal collections. Fee covers only a single project.

HathiTrust

Fee: No
Licensing involved: Yes; use restrictions spelled out on sign-up.
Additional notes: Researcher signs up at the HathiTrust website for either Sandbox (hosted) or API access. HathiTrust has also released a large extracted features dataset for download; that dataset only includes public domain volumes.

IEEE

(IEEE Xplore, IEEE Transactions)

Fee: (not listed)
Licensing involved: Yes; IEEE requires library/university to sign an addendum to the existing agreement to allow researcher to mine data.
Additional notes: Library must facilitate on a case-by-case basis. Researcher completes IEEE questionnaire, and based on that information IEEE creates an addendum to the existing library license allowing use. These addenda must be cleared through General Counsel.

ProQuest  

(PsycINFO, American Periodicals Series Online, MLA International Bibliography. See here for more.)

Fee: Yes
Licensing involved: Yes; researcher negotiates and signs directly with provider.
Additional notes: Fee covers only a single project, and is not insignificant.

Springer

(Specializes in resources for science, technology, and medicine)

Fee: No
Licensing involved: No; for non-commercial research, rights will be included in all new and renewed SpringerLink subscription agreements as an additional TDM clause.
Additional notes: No registration or API key is required for text mining. Full-text content can be accessed easily and programmatically at friendly URLs based on the content’s Digital Object Identifier (DOI). See here for more.

JSTOR Data for Research (Beta)

(Academic articles across disciplines)

Fee: No
Licensing involved: No; this is a free service to researchers.
Additional notes: Create an account to download sets of up to 1,000 documents, or contact JSTOR for more.
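
The Springer entry above mentions that full text can be reached programmatically at URLs based on a DOI. As a rough illustration only, the sketch below shows how a researcher might turn a list of DOIs into candidate URLs. The URL template, the function names, and the sample DOI are all assumptions made for this example and are not taken from Springer's documentation; confirm the actual URL pattern and terms of use with the provider before running any downloads.

```python
from urllib.parse import quote

# ASSUMPTION: hypothetical URL template for illustration only.
# Check the provider's text-mining documentation for the real pattern.
ASSUMED_TEMPLATE = "https://link.springer.com/content/pdf/{doi}.pdf"

# Common prefixes people paste in front of a bare DOI.
_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(raw: str) -> str:
    """Strip whitespace and a leading resolver/'doi:' prefix from a DOI."""
    doi = raw.strip()
    lowered = doi.lower()
    for prefix in _PREFIXES:
        if lowered.startswith(prefix):
            doi = doi[len(prefix):]
            break
    # A DOI starts with the directory indicator "10." and has a prefix/suffix split.
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"does not look like a DOI: {raw!r}")
    return doi


def doi_to_url(raw: str, template: str = ASSUMED_TEMPLATE) -> str:
    """Build a candidate full-text URL from a DOI using the given template."""
    doi = normalize_doi(raw)
    # Percent-encode unusual characters but keep the "/" separator readable.
    return template.format(doi=quote(doi, safe="/"))


if __name__ == "__main__":
    # Placeholder DOI, not a real article.
    for item in ["doi:10.1007/placeholder-0001"]:
        print(doi_to_url(item))
```

Keeping the template as a parameter means the same helper can be pointed at whatever URL pattern the provider actually documents, without changing the DOI handling.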

Tell us!

Are you working on a text mining project?  Do you have feedback on your experiences working with library vendors?  Contact us - we'd love to hear about it!