Editing Advanced Computational Communication Methods (Summer 2023)

From CommunityData

* Class overview and expectations — We'll walk through this syllabus.
* Make assignments for topic exploration
'''Slides:'''
[https://jeremydfoote.com/computational_communication_resources/welcome_slides/lecture/welcome.html#/welcome-to-com-682 Welcome slides]


== Week 2: Reproducible Research I (May 23) ==
'''Resources:'''
* Paper: Wilson G, Bryan J, Cranston K, Kitzes J, Nederbragt L, Teal TK (2017). [https://doi.org/10.1371/journal.pcbi.1005510 Good enough practices in scientific computing]. PLoS Comput Biol 13(6): e1005510.
* Paper: Gentzkow M, Shapiro JM. Code and Data for the Social Sciences: A Practitioner's Guide; 2014. https://web.stanford.edu/~gentzkow/research/CodeAndData.pdf.
* Video: [https://www.youtube.com/watch?v=4rBX6r5emgQ Reproducible Research: Concepts and Ideas]. Roger Peng. YouTube
'''Slides:'''
* [https://jeremydfoote.com/computational_communication_resources/reproducible_research/lecture/reproducible_research.html Week 2 and 3 slides]
* [https://purdue.brightspace.com/d2l/le/content/798129/viewContent/13239600/View Video of class meeting]
=== Organization ===
Key ideas:
* Folder structure
** Different options, but separate code from data
** Jeremy's approach:
<pre>
my_cool_project
|
|-- README.md # Explanation of project and how to navigate it
|-- Snakefile # Or Makefile - workflow tool
|
|-- data/
|  |-- raw_data/
|  |-- processed_data/
|
|-- code/
|
|-- results/
|  |-- figures/
|
|-- papers/
|
|-- presentations/
</pre>
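A minimal sketch of how the layout above could be created programmatically. The directory and file names are copied from the tree; the function name <code>scaffold_project</code> is a hypothetical helper, not part of any course tooling.

```python
from pathlib import Path

# Directory and file names taken from the layout above.
PROJECT_DIRS = [
    "data/raw_data",
    "data/processed_data",
    "code",
    "results/figures",
    "papers",
    "presentations",
]
PROJECT_FILES = ["README.md", "Snakefile"]


def scaffold_project(root):
    """Create the folder layout under `root` and return the project path."""
    root = Path(root)
    for rel_dir in PROJECT_DIRS:
        # parents=True also creates intermediate folders such as data/ and results/
        (root / rel_dir).mkdir(parents=True, exist_ok=True)
    for rel_file in PROJECT_FILES:
        # touch() creates an empty file and leaves an existing one untouched
        (root / rel_file).touch(exist_ok=True)
    return root
```

Calling <code>scaffold_project("my_cool_project")</code> would create the tree in the current directory; running it a second time should leave existing files in place.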


=== Data Management ===
Key ideas:
* Back up raw data
* Keep raw data (and make it read-only)
* Step one is to clean the data: create the data you wish you received
** Name variables well
** Use a [https://www.jstatsoft.org/article/view/v059i10 tidy] data structure
* Share data (when possible)
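One way to act on "keep raw data (and make it read-only)" is to clear the write permission bits on every raw file. The sketch below assumes the raw files live in a single folder; both function names are hypothetical, and on Windows <code>chmod</code> only toggles the read-only flag.

```python
import stat
from pathlib import Path


def make_read_only(path):
    """Remove write permission for user, group, and others on one file."""
    path = Path(path)
    mode = path.stat().st_mode
    path.chmod(mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))


def protect_raw_data(raw_dir):
    """Mark every file under `raw_dir` read-only and return the paths touched."""
    protected = []
    for path in sorted(Path(raw_dir).rglob("*")):
        if path.is_file():
            make_read_only(path)
            protected.append(path)
    return protected
```

This guards against accidental overwrites only; it is not a substitute for backing up the raw data.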
=== Code management ===
Key ideas:
* Version control
* Don't repeat yourself (DRY)
* Build at least a few high-level test cases
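As an illustration of the last two ideas, a cleaning step that would otherwise be copy-pasted can live in one function with a small test beside it. The function <code>normalize_text</code> and its cleaning rules are hypothetical examples, not course code.

```python
import re


def normalize_text(text):
    """Lowercase, strip punctuation, and collapse runs of whitespace."""
    text = text.lower()
    text = re.sub(r"[^\w\s]", "", text)  # drop anything that is not a word character or whitespace
    return re.sub(r"\s+", " ", text).strip()


def test_normalize_text():
    """High-level check that the cleaning step behaves as intended."""
    assert normalize_text("  Hello,   World! ") == "hello world"
    assert normalize_text("") == ""
```

A test runner such as pytest would pick up <code>test_normalize_text</code> automatically; calling it directly also works.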
== Week 3: Reproducible Research II (May 30) ==


'''Resources:'''
* Blog post: [http://datasci.kitzes.com/lessons/python/reproducible_workflow.html Reproducible Workflows] by Justin Kitzes
* [https://www.youtube.com/watch?v=r9PWnEmz_tc Introduction to Snakemake Tutorial (video)]
* [https://lachlandeer.github.io/snakemake-econ-r-tutorial/index.html An Introduction to Snakemake for social science]
* [https://www.youtube.com/watch?v=zqQM66uAig0 LaTeX introduction (video)]
* [https://www.overleaf.com/learn/latex/Knitr knitr introduction]


'''Slides:'''


* [https://jeremydfoote.com/computational_communication_resources/reproducible_research/lecture/reproducible_research.html Week 2 and 3 slides]
* [https://purdue.brightspace.com/d2l/le/content/798129/viewContent/13254014/View Video of class meeting]
 
 
=== Reproducible analyses and papers ===




Key ideas:
* Some big benefits (and some drawbacks) to using text-based tools ([https://bookdown.org/yihui/rmarkdown-cookbook/ Markdown] or [https://www.overleaf.com/ LaTeX])
** Can be put in version control
** Tools like [https://yihui.org/knitr/ knitr] can be used to put code directly into a document
* Make figure creation part of your workflow, have documents point to your figures directory
* Use citation management software that integrates with your document (use [https://www.zotero.org/ Zotero])




=== Sharing ===


Key ideas:
* Share your code and data whenever possible!
* Lots of options - [https://osf.io/ OSF.io], [https://dataverse.harvard.edu/ Harvard Dataverse], etc.
* Share preprints online






=== Advanced: Workflow Management ===
 
Key ideas:
* Tools to reproduce as much of the workflow as possible
* README file is much better than nothing
* Even better is a "wrapper" script that runs everything
** Very clear exactly what is run and how
** Some fairly simple options:
*** Python file
*** [https://www.gnu.org/software/make/ GNU Make]
*** [https://snakemake.github.io/ Snakemake]
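The simplest option in the list, a Python "wrapper" file, might look like the sketch below. The step names and script paths are placeholders, and <code>run_pipeline</code> is a hypothetical helper; Make or Snakemake would add dependency tracking that this sketch does not attempt.

```python
import subprocess
import sys


def run_pipeline(steps):
    """Run each (name, command) pair in order and stop at the first failure."""
    for name, command in steps:
        print(f"Running step: {name}")
        # check=True raises CalledProcessError if the command exits with a non-zero status
        subprocess.run(command, check=True)


# Hypothetical steps: these script paths are placeholders, not real course files.
EXAMPLE_STEPS = [
    ("clean", [sys.executable, "code/clean_data.py"]),
    ("analyze", [sys.executable, "code/analyze.py"]),
    ("figures", [sys.executable, "code/make_figures.py"]),
]
```

Calling <code>run_pipeline(EXAMPLE_STEPS)</code> from the project root would run the whole workflow, which makes it clear exactly what is run and in what order.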
 
 
== Week 4: Computational text analysis: Introduction and Key Concepts (June 6) ==


'''Resources:'''
* Book: Text As Data: A New Framework for Machine Learning and the Social Sciences (2022). Justin Grimmer, Margaret E. Roberts, and Brandon M. Stewart. Read chapters 1-7.
== Week 5: Computational text analysis: Some "traditional" approaches (June 13) ==
=== Topic modeling ===


=== Embeddings ===


'''Resources:'''
* [https://huggingface.co/blog/getting-started-with-embeddings Getting started with embeddings (Huggingface)]
=== Classification ===
=== Semantic networks ===
* [https://www.youtube.com/playlist?list=PLeo1K3hjS3uuvuAXhYjV2lMEShq2UYSwX NLP Tutorial Playlist Python (YouTube videos)]


== Week 6: Computational text analysis: using LLMs for research (June 20) ==
'''Due:'''
* Final project proposal (details on Brightspace)


'''Resources:'''


Intro to LLMs:
* [https://mark-riedl.medium.com/a-very-gentle-introduction-to-large-language-models-without-the-hype-5f67941fa59e Mark Riedl. A very gentle introduction to Large Language Models without the hype]
* [https://www.youtube.com/watch?v=bSvTVREwSNw How ChatGPT Works Technically | ChatGPT Architecture (YouTube)]
* [https://amatriain.net/blog/transformer-models-an-introduction-and-catalog-2d1e9039f376/ Transformer Models: An introduction and catalog]
* [https://www.youtube.com/watch?v=iR2O2GPbB0E What are Large Language Models (LLMs)?]


Reflections on LLMs for research:
* [https://arxiv.org/abs/2305.03514 Ziems, C., Held, W., Shaikh, O., Chen, J., Zhang, Z., & Yang, D. (2023). Can Large Language Models Transform Computational Social Science?. arXiv preprint arXiv:2305.03514.]


Papers using LLMs:
* [https://arxiv.org/abs/2304.03442 Park, J. S., O'Brien, J. C., Cai, C. J., Morris, M. R., Liang, P., & Bernstein, M. S. (2023). Generative agents: Interactive simulacra of human behavior. arXiv preprint arXiv:2304.03442.]
 
== Week 7: Share and discuss works-in-progress (June 27) ==


'''Assignment Due:''' [[Self_Assessment_Reflection | self-assessment reflection]]


'''Topic Presentations:'''
* Juan Pablo (JP) Loaiza-Ramírez


'''Work-in-progress Presentations:'''
* Juan Pablo (JP) Loaiza-Ramírez
* Christina Walker


== Week 8: No class - July 4 ==
== Week 9: Share and discuss works-in-progress (July 11) ==


'''Topic Presentations:'''
* Elizabeth Thompson
* Ryan Funkhouser
 
'''Work-in-progress Presentations:'''
* Dyuti Jha


== Week 10: Share and discuss works-in-progress (July 18) ==


'''Topic Presentations:'''
* Christina Walker
* Hazel Chiu
 
'''Work-in-progress Presentations:'''
* Elizabeth Thompson


== Week 11: Share and discuss works-in-progress (July 25) ==


Visit from [https://sites.google.com/view/billrand/ Bill Rand], an expert in agent-based modeling.
 
'''Topic Presentations:'''
* Dyuti Jha
 
 
'''Work-in-progress Presentations:'''
* Hazel Chiu


== Week 12: Share and discuss works-in-progress (August 1) ==


'''Topic Presentations:'''
* Muqing Liu
 
'''Work-in-progress Presentations:'''
* Ryan Funkhouser
* Muqing Liu


'''Assignment Due:'''
== Visualization in Python ==


===Resources added by Ryan===
'''Refresher/Basic resources'''
* Quick video overview: https://www.youtube.com/watch?v=a9UrKTVEeZA
* Longer, but still simple, video course outlining visualization techniques: https://www.simplilearn.com/tutorials/python-tutorial/data-visualization-in-python
* And of course, don't forget that one of the greatest resources for getting input on how to change visualizations is ChatGPT: https://chat.openai.com/
'''Understanding which visualization libraries to learn/use'''
* A useful academic article suggesting Matplotlib, Seaborn, and Plotly as the best: - https://ieeexplore.ieee.org/abstract/document/8757088?casa_token=REAm2SOC93MAAAAA:fCJHaTYgHA8FXZMbVEdZcevcXKsNJBBvB83F5HGgSEh504YPfROjnI08K1f2CJ1b6ZDVhhxF
* An excellent article on Medium about what use case scenarios are best for each of Matplotlib, Seaborn, and Plotly: https://medium.com/mlearning-ai/comparing-python-libraries-for-visualization-b2eb6c862542#:~:text=Matplotlib%20is%20a%20great%20choice,choice%20for%20creating%20interactive%20visualizations.
'''Matplotlib'''
* Excellent general overview: https://towardsdatascience.com/introduction-to-data-visualization-in-python-89a54c97fbed
* Great, more in-depth guide on how to really take visualizations to the next level: https://towardsdatascience.com/5-steps-to-build-beautiful-bar-charts-with-python-3691d434117a
* Documentation: https://matplotlib.org/stable/index.html
'''Seaborn'''
* Great overview of Seaborn: https://medium.com/insight-data/data-visualization-in-python-advanced-functionality-in-seaborn-20d217f1a9a6
* Third-party documentation-style site that helps make it really easy to figure out how to do each kind of visualization: https://www.geeksforgeeks.org/python-seaborn-tutorial/
* Documentation: https://seaborn.pydata.org/
'''Plotly'''
* Excellent quick overview of what Plotly can do and how to use it: https://towardsdatascience.com/the-next-level-of-data-visualization-in-python-dd6e99039d5e
* Third-party documentation-style site that helps make it really easy to figure out how to do each kind of visualization: https://www.geeksforgeeks.org/python-plotly-tutorial/
* Documentation: https://plotly.com/python/
'''Visualization for Exploratory Data Analysis'''
* Academic article that goes over objectives and processes for EDA using visualizations: https://www.researchgate.net/profile/Dr-Subhendu-Pani/publication/337146539_IJITEE/links/5dc70b124585151435fb427e/IJITEE.pdf
* Great article that shows how visualizations are really useful for EDA in even more NLP scenarios. For example, what are the distributions of sentiments?: - https://medium.com/towards-data-science/a-complete-exploratory-data-analysis-and-visualization-for-text-data-29fb1b96fb6a
* EDA applied to Machine Learning contexts: https://medium.com/open-machine-learning-course/open-machine-learning-course-topic-1-exploratory-data-analysis-with-pandas-de57880f1a68 and visualizations applied to machine learning contexts: https://medium.com/open-machine-learning-course/open-machine-learning-course-topic-2-visual-data-analysis-in-python-846b989675cd
* A gentle introduction to EDA: https://towardsdatascience.com/a-gentle-introduction-to-exploratory-data-analysis-f11d843b8184


== Advanced Pandas ==
[https://pandas.pydata.org/pandas-docs/stable/index.html '''Pandas Documentation''']


[https://pandas.pydata.org/Pandas_Cheat_Sheet.pdf '''Pandas Cheatsheet''']
'''Tutorials:'''
* [https://www.packtpub.com/product/pandas-1x-cookbook-second-edition/9781839213106 Pandas Cookbook]
* [https://tomaugspurger.net/posts/modern-1-intro/ Modern Pandas]
* [https://www.youtube.com/playlist?list=PL-osiE80TeTsWmV9i9c58mdDCSskIFdDS Video Series of Tutorials]
* [https://wesmckinney.com/book/ Python for Data Analysis]
* [https://realpython.com/pandas-project-gradebook/ Make a Gradebook with Pandas]
* [https://jakevdp.github.io/PythonDataScienceHandbook/ Python Data Science Handbook]
'''GPT & Pandas:'''
* [https://www.sharpsightlabs.com/blog/gpt-writes-bad-pandas-code/ GPT Writes Horrible Pandas Code]
* [https://github.com/rvanasa/pandas-gpt Package to have GPT Write Good Pandas Code]
'''Extra:'''
[https://towardsdatascience.com/one-word-of-code-to-stop-using-pandas-so-slowly-793e0a81343c Make Pandas Run Faster with Swifter]
'''Class Tutorial:'''
'''[https://drive.google.com/file/d/162nO8u2Sr3bPOqoq8KLLhKR8OhmosGjJ/view?usp=sharing Jupyter Notebook]'''


== Agent-based modeling ==


'''Resources added by Juan Pablo (JP) Loaiza-Ramírez'''
The following resources are listed in order of importance. Consider them as a "gentle" introduction to agent-based modeling.
'''Best papers overall'''
* [https://doi.org/10.1016/j.ijresmar.2011.04.002 Rand, W., & Rust, R. T. (2011). Agent-based modeling in marketing: Guidelines for rigor. International Journal of Research in Marketing, 28(3), 181–193. https://doi.org/10.1016/j.ijresmar.2011.04.002]
* [https://doi.org/10.1287/mnsc.2017.2877 Smith, E. B., & Rand, W. (2018). Simulating macro-level effects from micro-level observations. Management Science, 64(11), 5405–5421. https://doi.org/10.1287/mnsc.2017.2877]
* [https://doi.org/10.1080/19312458.2021.1986478 Waldherr, A., Hilbert, M., & González-Bailón, S. (2021). Worlds of agents: Prospects of agent-based modeling for communication research. Communication Methods and Measures, 15(4), 243–254. https://doi.org/10.1080/19312458.2021.1986478]
* [https://ijoc.org/index.php/ijoc/article/view/10588 Waldherr, A., & Wettstein, M. (2019). Bridging the gaps: Using agent-based modeling to reconcile data and theory in computational communication science. International Journal of Communication, 13, 3976–3999. https://ijoc.org/index.php/ijoc/article/view/10588]
* [http://www.jstor.org/stable/3069238 Macy, M. W., & Willer, R. (2002). From Factors to Actors: Computational Sociology and Agent-Based Modeling. Annual Review of Sociology, 28, 143–166.]
'''Seminal papers'''
* [https://www.jstor.org/stable/2117868 Arthur, W. B. (1994). Inductive Reasoning and Bounded Rationality. The American Economic Review, 84(2), 406–411. http://www.jstor.org/stable/2117868]
* [https://onlinelibrary.wiley.com/doi/10.1002/%28SICI%291099-0526%28199711/12%293%3A2%3C16%3A%3AAID-CPLX4%3E3.0.CO%3B2-K Axelrod, R. (1997). Advancing the art of simulation in the social sciences. Complexity, 3(2), 16–22. https://doi.org/10.1002/(SICI)1099-0526(199711/12)3:2<16::AID-CPLX4>3.0.CO;2-K]
* [https://doi.org/10.1007/BF01299065 Axtell, R., Axelrod, R., Epstein, J. M., & Cohen, M. D. (1996). Aligning simulation models: A case study and results. In Computational and Mathematical Organization Theory (Vol. 1, Issue 2, pp. 123–141). Springer Science and Business Media LLC. https://doi.org/10.1007/bf01299065]
* [https://doi.org/10.1073/pnas.082080899 Bonabeau, E. (2002). Agent-based modeling: Methods and techniques for simulating human systems. Proceedings of the National Academy of Sciences, 99, 7280–7287. https://doi.org/10.1073/pnas.082080899]
* [https://doi.org/10.1002/cplx.6130010503 Casti, J. L. (1996). Seeing the light at El Farol: A look at the most important problem in complex systems theory. Complexity, 1(5), 7–10. https://doi.org/10.1002/cplx.6130010503]
* [https://doi.org/10.1016/j.ecolmodel.2006.04.023 Grimm, V., Berger, U., Bastiansen, F., Eliassen, S., Ginot, V., Giske, J., Goss-Custard, J., Grand, T., Heinz, S. K., Huse, G., Huth, A., Jepsen, J. U., Jørgensen, C., Mooij, W. M., Müller, B., Pe’er, G., Piou, C., Railsback, S. F., Robbins, A. M., … DeAngelis, D. L. (2006). A standard protocol for describing individual-based and agent-based models. Ecological Modelling, 198(1–2), 115–126. https://doi.org/10.1016/j.ecolmodel.2006.04.023]
* [https://www.jstor.org/stable/1823701 Schelling, T. C. (1969). Models of Segregation. The American Economic Review, 59(2), 488–493. http://www.jstor.org/stable/1823701]
'''Examples of agent-based models in communication and other research fields'''
* [https://doi.org/10.1086/681254 DellaPosta, D., Shi, Y., & Macy, M. (2015). Why Do Liberals Drink Lattes? American Journal of Sociology, 120(5), 1473–1511. https://doi.org/10.1086/681254]
* [https://doi.org/10.1016/j.ecolecon.2022.107651 Foramitti, J. (2023). A framework for agent-based models of human needs and ecological limits. Ecological Economics, 204, 107651. https://doi.org/10.1016/j.ecolecon.2022.107651]
* [https://doi.org/10.1142/S1793962319500375 Forero, D. S., Ceballos, Y. F., & Torres, G. S. (2019). Simulation of consumers decision-making process using agent-based model approach. International Journal of Modeling, Simulation, and Scientific Computing, 10(06), 1950037. https://doi.org/10.1142/S1793962319500375]
* [https://doi.org/10.1007/s10614-021-10158-x Kato, J. S., & Sbicca, A. (2022). Bounded rationality, group formation and the emergence of trust: An agent-based economic model. Computational Economics, 60(2), 571–599. https://doi.org/10.1007/s10614-021-10158-x]
* [https://doi.org/10.24084/repqj08.367 Lopez Rodriguez, I., & Hernández Tejera, M. (2010). Agent-based services for building markets in distributed energy environments. Renewable Energy and Power Quality Journal, 1(08), 482–487. https://doi.org/10.24084/repqj08.367]
* [https://doi.org/10.1371/journal.pone.0031043 Luan, S., Katsikopoulos, K. V., & Reimer, T. (2012). When does diversity trump ability (and vice versa) in group decision making? A simulation study. PLoS ONE, 7(2). https://doi.org/10.1371/journal.pone.0031043]
* [https://doi.org/10.1016/j.solener.2019.08.040 Mittal, A., Krejci, C. C., Dorneich, M. C., & Fickes, D. (2019). An agent-based approach to modeling zero energy communities. Solar Energy, 191, 193–204. https://doi.org/10.1016/j.solener.2019.08.040]
* [https://doi.org/10.1177/0093650219856510 Sohn, D. (2022). Spiral of silence in the social media era: A simulation approach to the interplay between social networks and mass media. Communication Research, 49(1), 139–166. https://doi.org/10.1177/0093650219856510]
* [https://doi.org/10.1111/jcom.12288 Song, H., & Boomgaarden, H. G. (2017). Dynamic spirals put to test: An agent-based model of reinforcing spirals between selective exposure, interpersonal networks, and attitude polarization. Journal of Communication, 67(2), 256–281. https://doi.org/10.1111/jcom.12288]
'''YouTube Playlists'''
* [https://www.youtube.com/playlist?list=PLD4TWcPfbZO9HmaSutF_R2Y2RmiNDxvaP KaVe 101 - Agent Based Modeling with Python]
* [https://www.youtube.com/playlist?list=PLF0b3ThojznRKYcrw8moYMUUJK2Ra8Hwl Agent-Based Modeling (NetLogo)]
'''GitHub Repositories'''
* [https://github.com/topics/agent-based-modeling Different frameworks for agent-based-modeling, including mesa, agentpy, among others]
* [https://github.com/azvoleff/pyabm pyabm - Another agent-based modeling toolkit]
'''Online courses'''
* [https://www.publichealth.columbia.edu/research/population-health-methods/agent-based-modeling#Overview Agent-Based Modeling - Columbia University Irving Medical Center (General overview)]
* [https://www.coursera.org/learn/modeling-simulation-natural-processes#syllabus Simulation and modeling of natural processes - University of Geneva (Coursera)]
* [https://www.complexityexplorer.org/courses/171-introduction-to-agent-based-modeling Introduction to Agent-Based Modeling - Santa Fe Institute]
* [https://www.udemy.com/course/2020-intro-to-agent-based-modeling-simulation-ai-in-netlogo/ 2022 Intro to Agent-Based Modeling Simulation AI in NetLogo - Udemy]
'''Tutorials'''
* [https://www.complexityexplorer.org/courses/172-agent-based-models-with-python-an-introduction-to-mesa Agent-Based Models with Python: An Introduction to Mesa - Santa Fe Institute]


== SQL ==
Resources added by Muqing Liu


'''Introduction to SQL:'''
* General introduction to SQL: https://www.khanacademy.org/computing/computer-programming/sql
* Relational model and the foundation of SQL: https://dl.acm.org/doi/10.1145/362384.362685
* Principles and rules for relational database management systems: https://www.dcs.warwick.ac.uk/~hugh/TTM/
'''Textbooks on writing SQL:'''
* "The Complete Idiot's Guide to SQL" by Steven Holzner. A beginner-friendly guide that introduces SQL concepts and commands. https://www.amazon.com/Complete-Idiots-Guide-SQL/dp/1615641092
* "SQL and Relational Theory: How to Write Accurate SQL Code" by C.J. Date. A comprehensive guide to understanding SQL and relational theory. https://www.amazon.com/SQL-Relational-Theory-Write-Accurate/dp/1449316409
* "SQL Pocket Guide" by Jonathan Gennick. A handy reference for SQL syntax and commands. https://www.amazon.com/SQL-Pocket-Guide-Usage/dp/1449394094
'''Online courses:'''
* SQL for beginners: https://www.udemy.com/course/sql-for-beginners/ This beginner-friendly course covers database design, querying with SQL, data manipulation, and database management.
* SQL essential training: https://www.linkedin.com/learning/sql-essential-training/ This course covers basic SQL commands and querying techniques.
* The complete SQL bootcamp: https://www.udemy.com/course/the-complete-sql-bootcamp/ This course covers both SQL fundamentals and advanced concepts. It also includes real-world projects and hands-on exercises.
* SQL for data science: https://www.coursera.org/learn/sql-for-data-science This course is designed for data science professionals who use SQL for data manipulation and analysis. It covers SQL queries, joins, and aggregations for data science tasks.
* Advanced SQL for query tuning: https://www.pluralsight.com/courses/advanced-sql-query-tuning This course is for intermediate to advanced SQL users looking to optimize their SQL queries and improve database performance.
'''SQL tutorial videos:'''
* MySQL tutorial for beginners: https://www.youtube.com/watch?v=7S_tz1z_5bA
* SQL Tutorial - Full Database Course for Beginners: https://www.youtube.com/watch?v=HXV3zeQKqGY
* SQL Advanced Tutorial | Advanced SQL Tutorial With Examples: https://www.youtube.com/watch?v=M-55BmjOuXY
'''The use of SQL in data science:'''
* A Comparative Analysis on different aspects of Database Management System: https://www.researchgate.net/publication/352178674_A_Comparative_Analysis_on_different_aspects_of_Database_Management_System This paper compares different database management systems for handling big data storage and processing tasks.
* Twitter Sentiment Analysis Approaches: A Survey: https://www.learntechlib.org/p/217980/
* Analysis of Healthcare Data using SQL: https://www.linkedin.com/pulse/analysis-healthcare-data-using-sql-kristopher-bosch/
* SQL for Stock Market Analysis: https://medium.datadriveninvestor.com/sql-for-stock-market-analysis-f2145031e125
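
Several of the courses above cover queries, joins, and aggregations. The sketch below shows all three in one statement, using Python's built-in <code>sqlite3</code> module so that no database server is needed. The <code>users</code> and <code>posts</code> tables and their contents are made up for illustration.

```python
import sqlite3


def posts_per_user():
    """Build a tiny in-memory database and return (name, n_posts, total_likes) rows."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical toy data: two users and three posts.
    conn.executescript(
        """
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, likes INTEGER);
        INSERT INTO users VALUES (1, 'ana'), (2, 'ben');
        INSERT INTO posts VALUES (1, 1, 10), (2, 1, 5), (3, 2, 7);
        """
    )
    # JOIN matches posts to their authors; GROUP BY aggregates per user.
    query = """
        SELECT users.name, COUNT(posts.id) AS n_posts, SUM(posts.likes) AS total_likes
        FROM users
        JOIN posts ON posts.user_id = users.id
        GROUP BY users.name
        ORDER BY users.name;
    """
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows
```

The same query text should run largely unchanged on MySQL or PostgreSQL; only the connection code would differ.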


== Command line ==


== Large language models ==
Resources posted by Dyuti
-[https://www.techtarget.com/searchenterpriseai/definition/languagemodeling#:~:text=Importance%20of%20language%20modeling&text=It%20is%20the%20reason%20that,other%20to%20a%20limited%20extent What is a language model and why do we need it?]
- [https://medium.com/analytics-vidhya/a-comprehensive-guide-to-build-your-own-language-model-in-python-5141b3917d6d A comprehensive guide to build your own language model]
LLMs and Research:
Large Language Models and Underrepresented Languages [https://arxiv.org/ftp/arxiv/papers/2007/2007.05872.pdf Paper]
-Social Biases:
-[http://proceedings.mlr.press/v139/liang21a.html Towards Understanding and Mitigating Social Biases in Language Models]
- [https://medium.com/@arpitnarain/unmasking-bias-assessing-fairness-in-large-language-models-a722624e4483 Unmasking Bias —Assessing Fairness in Large Language Models]
- [https://aclanthology.org/2022.bigscience-1.6.pdf Pipelines for Social Bias Testing of Large Language Models]
- [https://huggingface.co/blog/evaluating-llm-bias#evaluating-language-model-bias-with-%F0%9F%A4%97-evaluate Evaluating Language Model Bias with 🤗 Evaluate ]
Mitigating Bias:
- [https://www.aneesmerchant.com/personal-musings/large-language-models-and-bias-an-unresolved-issue#:~:text=Bias%20in%20LLMs%20can%20manifest,these%20models%20are%20trained%20on. LLM and Biases]
- [https://news.mit.edu/2023/large-language-models-are-biased-can-logic-help-save-them-0303 logic aware models- MIT]
LLM and Research:
- [https://proceedings.mlr.press/v202/aher23a/aher23a.pdf Using LLMs to Simulate Multiple Humans and Replicate Human Subject Studies] (I am a little dicey about the ethics of it? Would like to hear what everyone else thinks)


== Cluster / large-scale computing ==


Elizabeth: Topic presentation and additional resources
- Google intro documentation: https://cloud.google.com/architecture/using-clusters-for-large-scale-technical-computing
- A cool example tutorial of how UCLA uses a cluster: https://github.com/chris-german/Hoffman2Tutorials
- Link for Purdue RCAC: https://www.rcac.purdue.edu/compute
- A workshop summary on reproducibility and large-scale computing: https://arxiv.org/ftp/arxiv/papers/1412/1412.5557.pdf
- Basics of high performance computing: https://hbctraining.github.io/Intro-to-shell-flipped/lessons/08_HPC_intro_and_terms.html
- RedHat and HPC: https://www.redhat.com/en/products/high-performance-computing


== Network analysis ==
* [https://youtu.be/flwcAf1_1RU Network Analysis Introduction Video]
Resources added by Hazel
NetworkX
* [https://towardsdatascience.com/network-analysis-d734cd7270f8 What is Network Analysis]
* [https://www.researchgate.net/publication/236407765_Exploring_Network_Structure_Dynamics_and_Function_Using_NetworkX Exploring Network Structure, Dynamics, and Function Using NetworkX]
* [https://youtu.be/VetBkjcm9Go Crash Course of NetworkX on Youtube]
*[https://trenton3983.github.io/files/projects/2020-05-21_intro_to_network_analysis_in_python/2020-05-21_intro_to_network_analysis_in_python.html Python Notebook Introduction of NetworkX]
Applications of NetworkX in academic research
*[https://doi.org/10.1080/13683500.2020.1777950 Valeri, M., & Baggio, R. (2020). Italian tourism intermediaries: A social network analysis exploration. Current Issues in Tourism, 24(9), 1270–1283.]
*[https://doi.org/10.1016/j.gloenvcha.2015.03.006 Williams, H. T. P., McMurray, J. R., Kurz, T., & Hugo Lambert, F. (2015). Network analysis reveals open forums and Echo Chambers in social media discussions of climate change. Global Environmental Change, 32, 126–138.]


iGraph
* [https://youtu.be/flwcAf1_1RU Network Analysis]
*[https://towardsdatascience.com/newbies-guide-to-python-igraph-4e51689c35b4 Newbies Guide to Python-igraph]
*[https://www.cs.rhul.ac.uk/home/tamas/development/igraph/tutorial/tutorial.html iGraph Tutorial]
*[https://www.youtube.com/watch?v=DuTROLV1760 iGraph with R Video Tutorial]


Application of iGraph in academic research
*[https://doi.org/10.1016/j.socnet.2015.07.003 González-Bailón, S., & Wang, N. (2016). Networked discontent: The anatomy of protest campaigns in social media. Social Networks, 44, 95–104]
*[https://doi.org/10.1187/cbe.13-08-0162 Grunspan, D. Z., Wiggins, B. L., & Goodreau, S. M. (2014). Understanding Classrooms through Social Network Analysis: A Primer for Social Network Analysis in Education Research. CBE—Life Sciences Education, 13(2), 167–178]
*[https://doi.org/10.1080/01292986.2018.1453849 Kokil Jaidka, Saifuddin Ahmed, Marko Skoric & Martin Hilbert (2019) Predicting elections from social media: a three-country, three-method comparative study, Asian Journal of Communication, 29:3, 252-273]


== Object-oriented programming ==


* [https://www.youtube.com/watch?v=K8L6KVGG-7o Regular Expressions]


= Administrative Notes =