Text as Data (Spring 2026)
From CommunityData
== Readings ==

[[Image:Text_as_data_book_cover.jpg|right|350px|thumb]]

This course will rely heavily on the book ''Text as Data'' by Justin Grimmer, Maggie Roberts, and Brandon Stewart. I expect you all to have access to this book:

: Grimmer, Justin, Margaret E. Roberts, and Brandon M. Stewart. 2022. ''Text as Data: A New Framework for Machine Learning and the Social Sciences.'' Princeton, NJ: Princeton University Press.

Published in 2022, the book is an updated version of what is now "the" classic text. It's excellent, but there have been some very big changes in the last several years. The most obvious one is the massive advances in transformers and large language models (LLMs), which are only touched on very briefly in the book. I will be supplementing and/or replacing some of the text.

You will be asked to conduct analyses in R or Python throughout the course and to modify code that either I or the TA share with you. There are two books, one for folks using Python and one for folks using R. I won't be assigning chapters of these books because I know people's backgrounds will vary, but I will attempt to list relevant sections of these books in the optional readings:

* '''[Python]''' Hovy, Dirk. 2021. ''Text Analysis in Python for Social Scientists: Discovery and Exploration''. Cambridge University Press. https://www.cambridge.org/core/elements/text-analysis-in-python-for-social-scientists/BFAB0A3604C7E29F6198EA2F7941DFF3. {{avail-uw|https://www.cambridge.org/core/elements/text-analysis-in-python-for-social-scientists/BFAB0A3604C7E29F6198EA2F7941DFF3}}
* '''[R]''' Silge, Julia, and David Robinson. 2017. ''Text Mining with R: A Tidy Approach''. O’Reilly Media. [[https://orbiscascade-washington.primo.exlibrisgroup.com/permalink/01ALLIANCE_UW/6psp7h/alma99162160009301452 ''Available through UW libraries'']]

Some other useful books are:

* Bengfort, Benjamin, Rebecca Bilbro, and Tony Ojeda. 2018. ''Applied Text Analysis with Python: Enabling Language-Aware Data Products with Machine Learning''. O’Reilly Media. [[https://orbiscascade-washington.primo.exlibrisgroup.com/permalink/01ALLIANCE_UW/6psp7h/alma99162160303001452 ''Available through UW libraries'']]
* Brown, Taylor R. 2023. ''An Introduction to R and Python for Data Analysis: A Side-By-Side Approach''. CRC Press. https://doi.org/10.1201/9781003263241. {{avail-uw|https://doi.org/10.1201/9781003263241}}
* Hvitfeldt, Emil, and Julia Silge. 2022. ''Supervised Machine Learning for Text Analysis in R''. Chapman and Hall/CRC. {{avail-instructor}}
* Jockers, Matthew L., and Rosamond Thalken. 2020. ''Text Analysis with R''. Springer. https://link.springer.com/book/10.1007/978-3-319-03164-4. {{avail-uw|https://link.springer.com/book/10.1007/978-3-319-03164-4}}

=== Access to Readings ===

Many readings are marked as "''[Available through UW libraries]''". Most of these will be accessible to anybody who connects from the UW network, so if you're on campus, they will likely work.

Although you can go through the UW libraries website to get most of these, the easiest way is to use the [https://www.lib.washington.edu/help/connect/tools UW library proxy bookmarklet]. This is a little button you can drag and drop onto the bookmarks toolbar in your browser. When you press the button, it will ask you to log in using your UW NetID and will then automatically send your traffic through UW libraries. You can also use the other tools on [https://www.lib.washington.edu/help/connect this UW libraries webpage].