Latest revision |
Your text |
Line 348: |
Line 348: |
| '''Work-in-progress Presentations:''' | | '''Work-in-progress Presentations:''' |
| * Elizabeth Thompson | | * Elizabeth Thompson |
| | * Muqing Liu |
|
| |
|
| == Week 11: Share and discuss works-in-progress (July 25) == | | == Week 11: Share and discuss works-in-progress (July 25) == |
Line 368: |
Line 369: |
| '''Work-in-progress Presentations:''' | | '''Work-in-progress Presentations:''' |
| * Ryan Funkhouser | | * Ryan Funkhouser |
| * Muqing Liu | | * |
|
| |
|
| '''Assignment Due:''' | | '''Assignment Due:''' |
Line 405: |
Line 406: |
|
| |
|
| '''Visualization for Exploratory Data Analysis''' | | '''Visualization for Exploratory Data Analysis''' |
| * Academic article that goes over objectives and processes for EDA using visualizations: https://www.researchgate.net/profile/Dr-Subhendu-Pani/publication/337146539_IJITEE/links/5dc70b124585151435fb427e/IJITEE.pdf | | * Great academic article that goes over objectives and processes for EDA using visualizations: https://www.researchgate.net/profile/Dr-Subhendu-Pani/publication/337146539_IJITEE/links/5dc70b124585151435fb427e/IJITEE.pdf |
| * Great article that shows how useful visualizations are for EDA in NLP scenarios, for example when examining the distribution of sentiments: https://medium.com/towards-data-science/a-complete-exploratory-data-analysis-and-visualization-for-text-data-29fb1b96fb6a | | * Great article that shows how useful visualizations are for EDA in NLP scenarios, for example when examining the distribution of sentiments: https://medium.com/towards-data-science/a-complete-exploratory-data-analysis-and-visualization-for-text-data-29fb1b96fb6a |
| * EDA applied to Machine Learning contexts: https://medium.com/open-machine-learning-course/open-machine-learning-course-topic-1-exploratory-data-analysis-with-pandas-de57880f1a68 and visualizations applied to machine learning contexts: https://medium.com/open-machine-learning-course/open-machine-learning-course-topic-2-visual-data-analysis-in-python-846b989675cd | | * EDA applied to Machine Learning contexts: https://medium.com/open-machine-learning-course/open-machine-learning-course-topic-1-exploratory-data-analysis-with-pandas-de57880f1a68 and visualizations applied to machine learning contexts: https://medium.com/open-machine-learning-course/open-machine-learning-course-topic-2-visual-data-analysis-in-python-846b989675cd |
Line 411: |
Line 412: |
|
| |
|
| == Advanced Pandas == | | == Advanced Pandas == |
| [https://pandas.pydata.org/pandas-docs/stable/index.html '''Pandas Documentation''']
| |
|
| |
| [https://pandas.pydata.org/Pandas_Cheat_Sheet.pdf '''Pandas Cheatsheet''']
| |
|
| |
| '''Tutorials:'''
| |
| * [https://www.packtpub.com/product/pandas-1x-cookbook-second-edition/9781839213106 Pandas Cookbook]
| |
| * [https://tomaugspurger.net/posts/modern-1-intro/ Modern Pandas]
| |
| * [https://www.youtube.com/playlist?list=PL-osiE80TeTsWmV9i9c58mdDCSskIFdDS Video Series of Tutorials]
| |
| * [https://wesmckinney.com/book/ Python for Data Analysis]
| |
| * [https://realpython.com/pandas-project-gradebook/ Make a Gradebook with Pandas]
| |
| * [https://jakevdp.github.io/PythonDataScienceHandbook/ Python Data Science Handbook]
| |
|
| |
| '''GPT & Pandas:'''
| |
| * [https://www.sharpsightlabs.com/blog/gpt-writes-bad-pandas-code/ GPT Writes Horrible Pandas Code]
| |
| * [https://github.com/rvanasa/pandas-gpt Package to have GPT Write Good Pandas Code]
| |
|
| |
| '''Extra:'''
| |
|
| |
| [https://towardsdatascience.com/one-word-of-code-to-stop-using-pandas-so-slowly-793e0a81343c Make Pandas Run Faster with Swifter]
| |
|
| |
| '''Class Tutorial:'''
| |
|
| |
|
| '''[https://drive.google.com/file/d/162nO8u2Sr3bPOqoq8KLLhKR8OhmosGjJ/view?usp=sharing Jupyter Notebook]'''
| | Christina (I think this is where you want me to sign up? - lol) |
|
| |
|
| == Agent-based modeling == | | == Agent-based modeling == |
Line 524: |
Line 504: |
| Muqing Liu | | Muqing Liu |
|
| |
|
| Introduction to SQL:
| | An overall introduction to SQL: |
| General introduction to SQL https://www.khanacademy.org/computing/computer-programming/sql
| | https://www.khanacademy.org/computing/computer-programming/sql |
| Relational model and the foundation of SQL https://dl.acm.org/doi/10.1145/362384.362685
| |
| Principles and rules for relational database management systems https://www.dcs.warwick.ac.uk/~hugh/TTM/
| |
| | |
| Textbook guidance for writing SQL:
| |
| "The complete idiot's guide to SQL" Steven Holzner This beginner-friendly guide introduces SQL concepts and commands. https://www.amazon.com/Complete-Idiots-Guide-SQL/dp/1615641092
| |
| "SQL and Relational Theory: How to write accurate SQL code" C.J. Date
| |
| This book provides a comprehensive guide to understanding SQL and relational theory https://www.amazon.com/SQL-Relational-Theory-Write-Accurate/dp/1449316409
| |
| "SQL pocket guide" Jonathan Gennick
| |
| This book is a handy reference for SQL syntax and commands https://www.amazon.com/SQL-Pocket-Guide-Usage/dp/1449394094
| |
| | |
| Online courses:
| |
| SQL for beginners https://www.udemy.com/course/sql-for-beginners/
| |
| This beginner-friendly course covers database design, querying with SQL, data manipulation, and database management.
| |
| SQL essential training https://www.linkedin.com/learning/sql-essential-training/
| |
| This course covers basic SQL commands and querying techniques.
| |
| The complete SQL bootcamp https://www.udemy.com/course/the-complete-sql-bootcamp/
| |
| This course covers both SQL fundamentals and advanced concepts. It also includes real-world projects and hands-on exercises.
| |
| SQL for data science https://www.coursera.org/learn/sql-for-data-science
| |
| This course is designed for data science professionals to use SQL for data manipulation and analysis. It covers SQL queries, joins, and aggregations for data science tasks.
| |
| Advanced SQL for query tuning https://www.pluralsight.com/courses/advanced-sql-query-tuning
| |
| This course is for intermediate to advanced SQL users looking to optimize their SQL queries and improve database performance.
| |
| | |
|
| |
|
| SQL tutorial videos: | | SQL tutorial videos: |
| MySQL tutorial for beginners https://www.youtube.com/watch?v=7S_tz1z_5bA
| |
| SQL Tutorial - Full Database Course for Beginners: https://www.youtube.com/watch?v=HXV3zeQKqGY | | SQL Tutorial - Full Database Course for Beginners: https://www.youtube.com/watch?v=HXV3zeQKqGY |
| SQL Advanced Tutorial|Advanced SQL Tutorial With Examples https://www.youtube.com/watch?v=M-55BmjOuXY | | SQL Advanced Tutorial|Advanced SQL Tutorial With Examples https://www.youtube.com/watch?v=M-55BmjOuXY |
|
| |
| The use of SQL in data science:
| |
| A Comparative Analysis on different aspects of Database Management System https://www.researchgate.net/publication/352178674_A_Comparative_Analysis_on_different_aspects_of_Database_Management_System
| |
| This paper compares different database management systems for handling big data storage and processing tasks.
| |
| Twitter Sentiment Analysis Approaches: A Survey https://www.learntechlib.org/p/217980/
| |
| Analysis of Healthcare Data using SQL https://www.linkedin.com/pulse/analysis-healthcare-data-using-sql-kristopher-bosch/
| |
| SQL for Stock Market Analysis https://medium.datadriveninvestor.com/sql-for-stock-market-analysis-f2145031e125
| |
|
| |
|
| == Command line == | | == Command line == |
|
| |
|
| == Large language models == | | == Building your own language model == |
| | | Dyuti |
| Resources posted by Dyuti
| |
| | |
| -[https://www.techtarget.com/searchenterpriseai/definition/languagemodeling#:~:text=Importance%20of%20language%20modeling&text=It%20is%20the%20reason%20that,other%20to%20a%20limited%20extent What is a language model and why do we need it?]
| |
| | |
| - [https://medium.com/analytics-vidhya/a-comprehensive-guide-to-build-your-own-language-model-in-python-5141b3917d6d A comprehensive guide to build your own language model]
| |
| | |
| LLMs and Research:
| |
| | |
| Large Language Models and Underrepresented Languages [https://arxiv.org/ftp/arxiv/papers/2007/2007.05872.pdf Paper]
| |
| | |
| -Social Biases:
| |
| | |
| -[http://proceedings.mlr.press/v139/liang21a.html Towards Understanding and Mitigating Social Biases in Language Models]
| |
| | |
| - [https://medium.com/@arpitnarain/unmasking-bias-assessing-fairness-in-large-language-models-a722624e4483 Unmasking Bias —Assessing Fairness in Large Language Models]
| |
| | |
| - [https://aclanthology.org/2022.bigscience-1.6.pdf Pipelines for Social Bias Testing of Large Language Models]
| |
| | |
| - [https://huggingface.co/blog/evaluating-llm-bias#evaluating-language-model-bias-with-%F0%9F%A4%97-evaluate Evaluating Language Model Bias with 🤗 Evaluate ]
| |
| | |
| Mitigating Bias:
| |
| | |
| - [https://www.aneesmerchant.com/personal-musings/large-language-models-and-bias-an-unresolved-issue#:~:text=Bias%20in%20LLMs%20can%20manifest,these%20models%20are%20trained%20on. LLM and Biases]
| |
| | |
| | |
| - [https://news.mit.edu/2023/large-language-models-are-biased-can-logic-help-save-them-0303 Logic-aware models (MIT)]
| |
| | |
| LLMs and Research:
| |
|
| |
|
| - [https://proceedings.mlr.press/v202/aher23a/aher23a.pdf Using LLMs to Simulate Multiple Humans and Replicate Human Subject Studies] (I am a little dicey about the ethics of it? Would like to hear what everyone else thinks)
| |
|
| |
|
| == Cluster / large-scale computing == | | == Cluster / large-scale computing == |
Line 607: |
Line 528: |
|
| |
|
| - A workshop summary on reproducibility and large-scale computing: https://arxiv.org/ftp/arxiv/papers/1412/1412.5557.pdf | | - A workshop summary on reproducibility and large-scale computing: https://arxiv.org/ftp/arxiv/papers/1412/1412.5557.pdf |
|
| |
| - Basics of high performance computing: https://hbctraining.github.io/Intro-to-shell-flipped/lessons/08_HPC_intro_and_terms.html
| |
|
| |
| - RedHat and HPC: https://www.redhat.com/en/products/high-performance-computing
| |
|
| |
| -
| |
|
| |
|
| == Network analysis == | | == Network analysis == |
| * [https://youtu.be/flwcAf1_1RU Network Analysis Introduction Video]
| | Hazel |
| | | * [https://youtu.be/flwcAf1_1RU Network Analysis] |
| Resources added by Hazel
| |
| | |
| NetworkX
| |
| * [https://towardsdatascience.com/network-analysis-d734cd7270f8 What is Network Analysis]
| |
| * [https://www.researchgate.net/publication/236407765_Exploring_Network_Structure_Dynamics_and_Function_Using_NetworkX Exploring Network Structure, Dynamics, and Function Using NetworkX]
| |
| * [https://youtu.be/VetBkjcm9Go NetworkX Crash Course on YouTube] | |
| *[https://trenton3983.github.io/files/projects/2020-05-21_intro_to_network_analysis_in_python/2020-05-21_intro_to_network_analysis_in_python.html Python Notebook Introduction to NetworkX]
| |
| | |
| Applications of NetworkX in academic research
| |
| *[https://doi.org/10.1080/13683500.2020.1777950 Valeri, M., & Baggio, R. (2020). Italian tourism intermediaries: A social network analysis exploration. Current Issues in Tourism, 24(9), 1270–1283.]
| |
| *[https://doi.org/10.1016/j.gloenvcha.2015.03.006 Williams, H. T. P., McMurray, J. R., Kurz, T., & Hugo Lambert, F. (2015). Network analysis reveals open forums and Echo Chambers in social media discussions of climate change. Global Environmental Change, 32, 126–138.]
| |
| | |
| iGraph
| |
| *[https://towardsdatascience.com/newbies-guide-to-python-igraph-4e51689c35b4 Newbies Guide to Python-igraph]
| |
| *[https://www.cs.rhul.ac.uk/home/tamas/development/igraph/tutorial/tutorial.html iGraph Tutorial]
| |
| *[https://www.youtube.com/watch?v=DuTROLV1760 iGraph with R Video Tutorial]
| |
| | |
| Application of iGraph in academic research
| |
| *[https://doi.org/10.1016/j.socnet.2015.07.003 González-Bailón, S., & Wang, N. (2016). Networked discontent: The anatomy of protest campaigns in social media. Social Networks, 44, 95–104]
| |
| *[https://doi.org/10.1187/cbe.13-08-0162 Grunspan, D. Z., Wiggins, B. L., & Goodreau, S. M. (2014). Understanding Classrooms through Social Network Analysis: A Primer for Social Network Analysis in Education Research. CBE—Life Sciences Education, 13(2), 167–178]
| |
| *[https://doi.org/10.1080/01292986.2018.1453849 Kokil Jaidka, Saifuddin Ahmed, Marko Skoric & Martin Hilbert (2019) Predicting elections from social media: a three-country, three-method comparative study, Asian Journal of Communication, 29:3, 252-273]
| |
|
| |
|
| == Object-oriented programming == | | == Object-oriented programming == |