OPMT 600 Statistical Methods for Decisionmaking: Datasets: Getting Started
Sample datasets for statistical analysis for OPMT 600
Use this guide to find and use statistical datasets for OPMT 600.
Don't hesitate to ask a librarian for help!
The resources here constitute test, sample, or practice data, to be used in teaching and learning statistical analysis techniques. I recommend using these when the actual topic or question of your research is secondary to learning the techniques. If the substance of the data matters, see our Datasets & Statistics guide or contact our Data Services Librarian, John Heintz.
A GitHub repository of datasets available through R packages (the download files will work in any statistical analysis tool). You can see the number of rows, the number of columns, and also how many columns are binary, character, numeric, etc. Includes download (.csv) and documentation links.
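If you want the same kind of overview for a .csv file you have already downloaded, it can be sketched in a few lines of Python. This is a minimal illustration, not the repository's own method: the sample data, the `summarize_columns` and `classify` names, and the rule that a column with exactly two distinct values counts as "binary" are all assumptions made for the example.

```python
import csv
import io

# Hypothetical sample data standing in for a downloaded .csv file.
SAMPLE_CSV = """id,group,score,passed
1,control,71.5,0
2,treatment,88.0,1
3,treatment,NA,1
4,control,64.25,0
"""


def is_number(text):
    """Return True if the text can be read as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def classify(values):
    """Label a column as binary, numeric, or character (assumed rules)."""
    if len(set(values)) == 2:
        return "binary"
    if values and all(is_number(v) for v in values):
        return "numeric"
    return "character"


def summarize_columns(csv_text):
    """Return (row count, column count, {column name: kind})."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    columns = reader.fieldnames or []
    kinds = {}
    for col in columns:
        # Treat empty cells and R-style "NA" as missing values.
        values = [row[col] for row in rows if row[col] not in ("", "NA")]
        kinds[col] = classify(values)
    return len(rows), len(columns), kinds


n_rows, n_cols, kinds = summarize_columns(SAMPLE_CSV)
print(n_rows, n_cols, kinds)
```

To use it on a real download, read the file's text with `open(path, newline="").read()` and pass it to `summarize_columns`.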
Mergent Online provides in-depth information on publicly traded companies, including balance sheet/income statement data, ratios, industry benchmarks, and analyst reports. If you are asked for a username/password, that means we've reached our maximum number of simultaneous users (8).
Use of the Mergent data and service is intended for non-commercial academic purposes only.
An excellent source for company and industry research, including financial and ratio analysis. Contains the former Standard & Poor's Industry Surveys and Stock Reports (now produced by CFRA Equity Research). Also includes mutual fund/ETF and bond information, investment research reports, stock screening and charting tools.
RMA's e-Statement Studies provides industry-level financial ratio benchmarks that can be compared with company data; industries are classified by North American Industry Classification System (NAICS) code.
Demographics, psychographics, product usage, spending behavior, and media habits of U.S. adult consumers, based on national consumer surveys. Use it to create your own custom reports. Available only to current UST students, faculty, and staff. Limited to 20 simultaneous users.
eCommerceDB provides data on online marketing and sales channels across the global economy. It includes data for individual online stores, market segments by country, and brand and trend reports for thousands of stores in 50 countries, with detailed revenue analytics, competitor analysis, market development, marketing budget, and other indicators such as web traffic, shipping providers, payment options, and social media activity. Because it is affiliated with our Statista subscription, some reports will open on that platform. Ignore notations about cost and just click through; the site should recognize our subscription and take you to the content.
If you do get to a page that will not open the content without payment, copy the page URL and send an explanation with the content title to firstname.lastname@example.org or use our Ask a Librarian services.
The PrivCo database is UST Library’s premier source for finding information on privately held companies. Company profiles include an overview of the firm, its leadership team/owners, and company financials. The profiles also contain, when available, detailed information on VC and private equity funding, private M&A deals, patent and copyright litigation, and IPO activities. Register for a free personal account or use the Direct Access button.
Data-Planet allows users to retrieve statistics & data, and create tables, charts, & maps from a variety of sources. Holdings for the United States are significant, with some data available at state, county, or local geographies, including daily, weekly, monthly, quarterly, and annual time series varying by metric. International data are at the country level, and include the enhanced China Data Center. Data are organized by subject and source; users can browse by folders or keyword search.
Subject categories include: Banking, Finance, and Insurance; Criminal Justice and Law; Education; Energy Resources and Demand; Food and Agriculture; Government and Politics; Health and Vital Statistics; Housing and Construction; Industry and Commerce; International; Labor and Employment; Natural Resources and Environment; Population and Income; Prices and Cost of Living; Stocks and Commodities; Transportation and Travel. Java must be enabled to use this database.
Use SimplyAnalytics to create datasets using thousands of demographic, business, and marketing data variables for the U.S. Create extracts by city, county, or zip code to make a larger dataset. Create a free personal account to save your work. Limited to 5 simultaneous users.
Statista contains frequently sought statistics and studies gathered by government sources, scientific publications, market researchers, non-profits, trade organizations, etc. Indicator stats are usually displayed as a chart, and the underlying data is downloadable in Excel; charts can be downloaded in multiple formats or embedded in web pages, and they work well in papers or presentations. Search results also display dossiers (topical collections of statistics), infographics, country/industry/company reports, market reports, and a global consumer survey. Results are also available in French, German, and Spanish.
The GSS contains a standard 'core' of demographic, behavioral, and attitudinal questions, plus topics of special interest. Create an extract, and then select a delimited or Excel format to download the data, and use an unzipping utility to open the zipped file. Spreadsheets will include the data, labels, and codes for your download.
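Once you have a delimited extract, the numeric codes usually need to be matched to their labels before the data are readable. The sketch below shows one way to do that in Python. It is only an illustration under assumptions: the tab-delimited layout, the variable names, and the code-to-label pairs are made up for the example, so check them against the labels and codes included in your own download.

```python
import csv
import io

# Hypothetical tab-delimited extract: one row per respondent, coded values.
DATA_TSV = "year\tsex\thappy\n2018\t1\t2\n2018\t2\t1\n2021\t2\t3\n"

# Hypothetical codebook, as you might copy it from the labels/codes sheet.
CODEBOOK = {
    "sex": {"1": "male", "2": "female"},
    "happy": {"1": "very happy", "2": "pretty happy", "3": "not too happy"},
}


def decode(data_text, codebook, delimiter="\t"):
    """Replace coded values with labels; leave unknown codes unchanged."""
    reader = csv.DictReader(io.StringIO(data_text), delimiter=delimiter)
    decoded = []
    for row in reader:
        labeled = {}
        for column, value in row.items():
            # Columns without a codebook entry (e.g. year) pass through as-is.
            labeled[column] = codebook.get(column, {}).get(value, value)
        decoded.append(labeled)
    return decoded


decoded = decode(DATA_TSV, CODEBOOK)
for record in decoded:
    print(record)
```

Keeping unknown codes unchanged, rather than dropping them, makes it easier to spot values such as "not applicable" or "no answer" that your codebook does not yet cover.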
ICPSR is the world's largest archive of social science data sets; includes data for criminal justice, demography, economics, education, foreign policy, gerontology, history, law, political science, public health, and sociology. Also hosts extensive learning modules on statistical analysis for use in courses. Register for a (free, required) MyData account using your St. Thomas email address (and independent password) to download data sets. If you have any problems accessing the data, contact our campus representative for ICPSR, John Heintz (email@example.com).
To create a MyData account:
Select the Log In button at upper right, then follow the New User>Create Account instructions. When prompted, we encourage you to select the option to allow your campus Official Representative (OR) to view your name and email address. This can help us assist you with troubleshooting and in analyzing our usage.
Email Account Validation:
Typically, ICPSR will automatically recognize your institution and validate your account. If you're trying to download data and receive an unexpected alert that you're not from a member institution, even though you are, your account may need to be validated. We can update this manually.
To request a validation, send an email to firstname.lastname@example.org. Include your name, your institutional email address, and the name of your institution.
A U.S. federal government cross-agency repository of open data sets. Links are available filtered to comma-separated-value (.csv) formatted data or to Excel format (.xlsx). You can also go to the main page of the site and browse/search topically.
Appears to be a cache of practice datasets linked from the home page of Professor Larry Winner of the University of Florida Department of Statistics. Each contains a .dat file and a very brief plain text description file.
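Because .dat files typically have no header row, the column names have to come from the description file. A minimal Python sketch for reading such a file follows. It assumes the values are separated by spaces or tabs, which may not hold for every file in the collection (some may use fixed-width columns); the sample text, the column names, and the `parse_dat` helper are hypothetical.

```python
# Hypothetical contents of a whitespace-delimited .dat file (no header row).
SAMPLE_DAT = """  1   12.5  A
  2    9.75 B

  3   11.0  A
"""

# Column names as you would take them from the plain text description file.
COLUMN_NAMES = ["plot", "yield", "variety"]


def to_number(text):
    """Convert text to int or float when possible, else keep the string."""
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            pass
    return text


def parse_dat(text, column_names):
    """Parse whitespace-delimited lines into a list of dictionaries."""
    records = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(column_names):
            raise ValueError(
                f"line {line_number}: expected {len(column_names)} fields, "
                f"found {len(fields)}"
            )
        records.append(dict(zip(column_names, map(to_number, fields))))
    return records


records = parse_dat(SAMPLE_DAT, COLUMN_NAMES)
for record in records:
    print(record)
```

The field-count check is deliberate: if a file turns out to be fixed-width with spaces inside a value, the function raises an error instead of silently misaligning the columns.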