Data and data analysis are an important part of many types of research. Finding, using, and evaluating data can differ from working with other types of sources. Consult our Data Research Tutorials for more information and explore a variety of data collections below.
Data & Statistics Collections
- Sage Data: Sage Data is a collection of U.S. and international datasets sourced from governmental, commercial, and private organizations. Sage Data allows you to search and browse millions of datasets, compare and contrast variables of interest, and create customized exportable charts and tables. Includes the Claritas Consumer Profiles dataset.
- IBISWorld: Economic, demographic, and market data on thousands of industries worldwide.
- MarketLine Advantage: Provides company, industry, country, and financial data for every major marketplace in the world. Includes company SWOTs, company overviews, industry profiles, market research, case studies, financial deals, country analysis, news, and a statistics database covering over half a million data points for 215 countries and 46 political and geographic groupings.
- Mergent Intellect: A fully searchable database with detailed information on active and inactive businesses, as well as daily updates on executives.
- Mintel U.S. Reports & Global New Products: Category-specific reports with quantitative and qualitative market, brand, and consumer insights, as well as a global database of new consumer packaged goods launches in 86 markets.
- Passport (Euromonitor): Passport GMID provides international marketing data, analysis, and information sources for countries, consumers, and industries. For example, you can search industries and products to locate leading companies in different regions of the world.
- SimplyAnalytics: This mapping application enables users to develop interactive maps and reports using thousands of demographic, business, health, crime, and marketing data variables, including Mediamark and Simmons Consumer Data. Most data begins in 2000.
- Statista: Statista is the first statistics portal in the world to integrate data on over 60,000 topics from over 18,000 sources onto a single professional platform. Categorized into 21 market sectors, Statista.com provides companies, business customers, research institutions, and the academic community with direct access to quantitative data on media, business, finance, politics, and a wide variety of other areas of interest or markets.
- ICPSR: ICPSR is a membership-based, non-profit data archive located at the University of Michigan. It provides access to datasets and related studies on a wide variety of topics.
- Data Citation Index (1900 to current): The Data Citation Index provides a single point of access to research data from repositories across disciplines and around the world.
- ResearchDataGov: ResearchDataGov is a web portal for discovering and requesting access to restricted microdata from federal statistical agencies.
Free Data Sets
- Bureau of Labor Statistics: Many important economic indicators for the United States (like unemployment and inflation) can be found here. Most of the data can be divided both by time and by geography.
- CDC Data and Statistics: The Centers for Disease Control and Prevention maintains a database on cause of death. The data can be segmented by age, race, year, and many other variables.
- Data.gov: A portal to U.S. government data on everything from climate to crime.
- DataCite: Search for datasets across more than 1,000 major data centers and repositories, including ICPSR, Harvard Dataverse, and Data-Planet.
- Data.World: One key differentiator of data.world is the tooling built to make working with data easier: you can write SQL queries within its interface to explore data and join multiple datasets. It also offers SDKs for R and Python to make it easier to acquire and work with data in your tool of choice.
- FBI Crime Data Explorer: National crime statistics, with free data available at the national, state, and county levels.
- HealthData.gov: U.S. healthcare data, including claim-level Medicare data, epidemiology, and population statistics.
- Pew Research Center Datasets: Raw data from Pew's research into American life.
- United Nations Data (UNdata): World data about population, education, labor, and more from a variety of global organizations.
- United States Census: Business, economic, and population data.
- World Bank Databank: Datasets covering population demographics and a huge number of economic and development indicators from across the world.
- iPOLL: Database of the Roper Center at Cornell University. Provides data from national opinion polls, 1935 to present.
- IPPSR Correlates of State Policy: The Correlates of State Policy Project aims to compile, disseminate, and encourage the use of data relevant to U.S. state policy research, tracking policy differences across the 50 states and changes over time. The project has gathered more than 3,000 variables from various sources and assembled them into one large dataset.
- The Government Finance Database: A prepared dataset of government finance data, based on census data.
- World Health Organization - Data and Statistics: Offers world hunger, health, and disease statistics.
- Google Dataset: Google's search engine for datasets.
State and Local Data
- Analyze Boston: The City of Boston provides an open data hub. Locate datasets and other projects built on this open data.
- Boston Indicators Project: Provides key indicators for data trends in Boston. The reports tend to include data from fields such as health, education, and transportation.
- The Health of Boston: Includes reports that provide descriptive information about the health status of Boston residents and the factors that influence their health.
- Massachusetts Office of Data Management and Outcomes Assessment: ODMOA facilitates and coordinates the collection of, access to, and use of public health data in order to monitor and improve population health.
- State Budget Sources (The Volcker Alliance): State Budget Sources is designed to provide improved tools for public officials, policy advocates, journalists, academics, and concerned citizens researching the critical fiscal decisions that governors and legislators must make.
Data Mining Resources
- Article discussing the challenges of data mining library resources: McCracken, P., & Raub, E. (2023). “Licensing Challenges Associated With Text and Data Mining: How Do We Get Our Patrons What They Need?” Journal of Librarianship and Scholarly Communication 11(1). doi: https://doi.org/10.31274/jlsc.15530
- Google Books Ngram Viewer: Search Google's text collection, including printed sources published between 1500 and 2019 in several languages.
- Hathi Trust Data Availability and APIs: Perform text mining on Hathi Trust's collection through a variety of channels.
- English-Corpora: Large collections of text for text mining.
More!
- California General Plan Database Mapping Tool: Land-use design in California is controlled by local communities through documents known as General Plans. These documents commit city and county governments to long-term planning and development goals that shape the fortunes and health of the jurisdiction through zoning regulations. There is currently little coordination across cities or at the state level for monitoring where and which policies are adopted. This searchable database of California’s General Plans is the first of its kind.