Human Centered Data Science/Datasets

From CommunityData
Revision as of 02:37, 30 October 2017 by Jtmorgan

In order to complete your project, you will each need a dataset. If you are at the stage of your career where you already have a dataset, great! If not, there are many datasets to draw from. Here are some ideas:

  • If there's an author of a study you loved, you can send a polite email asking whether they are able and willing to share an archival or replication version of the dataset used in their paper. Be very polite and make it clear that this is starting as a class project but that it might turn into a paper for publication. Make your timeline clear. In the field of communication, replication datasets are still very rare, so be prepared for a negative answer.
  • Search Google Scholar and regular Google for datasets in your research area. You'd be surprised at what's available.
  • This Google Doc provides documentation and descriptions of datasets (and potential research questions) from Yelp, Data.seattle.gov, and Wikimedia.
  • Take a look at datasets available in the Harvard Dataverse (the largest collection of social science research data) or one of the other members of the Dataverse network.
  • Look at the collection of social scientific datasets at ICPSR (UW is a member). There are an enormous number of very rich datasets.
  • Use the ISA Explorer to find datasets. Keep in mind that the large majority of the datasets it searches are drawn from the natural sciences.


Open online datasets

  • Data.gov provides access to a variety of US Federal datasets and data sources, along with an API and online tools for searching for data.
  • The Sunlight Foundation used to provide a one-stop shop for datasets and tools around political activity in the USA. The Foundation has closed down, but its website points to a variety of other organizations, datasets, and tools for accessing public civic data.
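
If you would rather search Data.gov from a script than from the website, the sketch below shows one way that might look. It assumes the Data.gov catalog exposes a CKAN-style search endpoint at catalog.data.gov/api/3/action/package_search and that responses follow the usual CKAN shape (a "success" flag and a "result" object containing a "results" list). Check the current Data.gov API documentation before relying on either assumption. The function names are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the Data.gov catalog offers a CKAN-style search API at this URL.
CATALOG_SEARCH_URL = "https://catalog.data.gov/api/3/action/package_search"


def build_search_url(query, rows=5):
    """Return a catalog search URL for a free-text query."""
    return CATALOG_SEARCH_URL + "?" + urlencode({"q": query, "rows": rows})


def dataset_titles(payload):
    """Pull dataset titles out of a CKAN-style search response (assumed shape)."""
    if not payload.get("success"):
        return []
    return [item.get("title", "") for item in payload["result"]["results"]]


def search_datasets(query, rows=5):
    """Fetch matching dataset titles. Requires network access."""
    with urlopen(build_search_url(query, rows), timeout=30) as response:
        return dataset_titles(json.load(response))
```

Calling search_datasets("bicycle counts") should then return a short list of dataset titles to follow up on by hand, provided the endpoint behaves as assumed.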