Community Data Science Workshops (Winter 2020)/Resources
Latest revision as of 16:25, 17 March 2020
- Resources Winter 2020
This list of resources was created to help CDSW workshop participants continue developing their skills and to answer some of the questions that arose during the workshops.
Scraping Data from the Web
Helena (Helena-Lang.org) lets you scrape data from the web automatically. It requires no programming and is available as a free Chrome plug-in. The website has a series of tutorials available at http://helena-lang.org/demonstration/
Quantitative Data Analysis
Tea (Tea-Lang.org) lets you provide a high-level specification of your data and hypotheses and get back valid statistical test results and explanations. It requires minimal programming and some comfort with Python, and it is available as a free Python package.
Data Visualization
Altair (Altair-Viz.github.io) lets you write a high-level specification of your data and the visualization you want, and get back the finished visualization. It requires some programming and comfort with Python, and it is available as a free Python package.
Finding a dataset
If you are looking for datasets for your projects, here are some potential leads:
- Search Google Scholar and the web in general for datasets in your research area. You'll probably be surprised at what's available.
- Take a look at datasets available in the Harvard Dataverse (a very large collection of social science research data) or one of the other members of the Dataverse network.
- Look at the collection of social scientific datasets at ICPSR at the University of Michigan (NU is a member). It holds an enormous number of very rich datasets.
- Use the ISA Explorer to find datasets. Keep in mind the large majority of datasets it will search are drawn from the natural sciences.
- The City of Chicago has one of the best data portal sites of any municipality in the U.S. (and better than many federal agencies). There are also numerous administrative datasets released by other public entities (try searching!) that you might find inspiring.
- FiveThirtyEight.com has published a GitHub repository and an R package with pre-processed and cleaned versions of many of the datasets they use for articles published on their website.