Social media comp chapter
From CommunityData
A book chapter that Jeremy, Mako, and Aaron are working on.
Current to-do list
Data collection and preparation
- Get Scopus articles about "online communities" as well
Introduction
- Revise once we have completed the sections and have a story to tell about our findings.
- Incorporate a bit more about how the chapter attempts to point people toward great resources and exemplary work applying computational techniques.
Data Collection and Descriptives (Jeremy)
- Write code to produce descriptive statistics
  - Citation Counts
  - Top Journals
  - Top Countries
  - Papers per year
- Finish writing up benefits/drawbacks section
- Incorporate mentions/citations of instructional texts/resources that interested readers can use.
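
As a starting point for the descriptive statistics item above, here is a minimal sketch using only the Python standard library. The record fields (`year`, `journal`, `country`, `citations`) and the sample rows are hypothetical placeholders, not the actual Scopus export schema or data.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical records; field names are assumptions, not the Scopus schema.
records = [
    {"year": 2012, "journal": "Journal A", "country": "United States", "citations": 10},
    {"year": 2012, "journal": "Journal B", "country": "Germany", "citations": 0},
    {"year": 2013, "journal": "Journal A", "country": "United States", "citations": 4},
    {"year": 2014, "journal": "Journal A", "country": "China", "citations": 2},
]

def describe(records, top_n=10):
    """Return the four descriptive summaries from the to-do list."""
    citations = [r["citations"] for r in records]
    return {
        "citation_counts": {
            "mean": mean(citations),
            "median": median(citations),
            "max": max(citations),
        },
        # most_common() sorts by frequency, highest first
        "top_journals": Counter(r["journal"] for r in records).most_common(top_n),
        "top_countries": Counter(r["country"] for r in records).most_common(top_n),
        # sort by year so the result reads as a time series
        "papers_per_year": dict(sorted(Counter(r["year"] for r in records).items())),
    }

summary = describe(records)
```

Citation counts tend to be heavily skewed, so reporting the median alongside the mean is probably worthwhile.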
Topic Models (Jeremy)
- Write code to produce:
  - Overall topic distribution
  - Topic distribution over time
  - Topic distribution in top journals?
- Write discussion of findings
- Write benefits/limitations of approach
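
The three topic distribution items above could share one aggregation step, sketched below. It assumes a topic model has already been fitted elsewhere and has produced one row of topic proportions per document. The `doc_topics` and `years` values are made up for illustration, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical output of an already-fitted topic model: one row of topic
# proportions per document (each row sums to 1), plus each document's year.
doc_topics = [
    [0.5, 0.5],
    [1.0, 0.0],
    [0.0, 1.0],
]
years = [2012, 2012, 2013]

def mean_topic_distribution(rows):
    """Average topic proportions across a list of documents."""
    n_topics = len(rows[0])
    return [sum(row[k] for row in rows) / len(rows) for k in range(n_topics)]

def topic_distribution_by_group(doc_topics, groups):
    """Average topic proportions within each group (e.g., year or journal)."""
    grouped = defaultdict(list)
    for row, group in zip(doc_topics, groups):
        grouped[group].append(row)
    return {g: mean_topic_distribution(rows) for g, rows in sorted(grouped.items())}

overall = mean_topic_distribution(doc_topics)
by_year = topic_distribution_by_group(doc_topics, years)
```

Passing journal names instead of years as `groups` would give the per-journal breakdown with the same function.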
Citation Prediction (Aaron)
- Incorporate links/citations to Statistical Learning book and related instructional resources.
- Prepare data for analysis
- Abstract text
- Lowercase
- Remove stop-words
- Create uni-, bi-, and trigrams.
- Determine minimum threshold of phrase occurrence for inclusion across categories/subjects (e.g., Mitra & Gilbert say 50 in their dataset).
- Control measures
- n.authors
- publication year
- publication type (conference? journal?)
- aggregate prior citations for authors (sqrt or log transform?)
- author affiliation (fixed effects? maybe just dummy for R1?)
- venue (fixed effects)
- subject area (fixed effects — various measures available)
- affiliation countries
- language
- Abstract text
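
The abstract text preparation steps above (lowercase, remove stop-words, build n-grams, apply a minimum occurrence threshold) can be sketched roughly as follows. The stop-word list is a tiny placeholder and the function names are hypothetical. The threshold here counts the number of abstracts containing a phrase, which is one possible reading of "phrase occurrence"; the per-category or per-subject threshold is still an open decision and is not implemented.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"a", "an", "and", "in", "of", "on", "the", "to", "we"}

def ngrams_for_abstract(text, max_n=3):
    """Lowercase, drop stop words, and return all 1- to max_n-grams."""
    tokens = [t for t in re.findall(r"[a-z0-9]+", text.lower())
              if t not in STOP_WORDS]
    grams = []
    for n in range(1, max_n + 1):
        grams.extend(" ".join(tokens[i:i + n])
                     for i in range(len(tokens) - n + 1))
    return grams

def build_vocabulary(abstracts, min_count=50):
    """Keep phrases that appear in at least min_count abstracts.

    Assumption: the threshold is a document frequency over the whole
    corpus, not a per-category count.
    """
    doc_freq = Counter()
    for text in abstracts:
        # set() so each phrase is counted once per abstract
        doc_freq.update(set(ngrams_for_abstract(text)))
    return {gram for gram, count in doc_freq.items() if count >= min_count}

# Toy usage with a low threshold so the small example keeps something.
abstracts = [
    "We study online communities.",
    "Online communities in the wild",
    "Social media",
]
vocab = build_vocabulary(abstracts, min_count=2)
```

Because stop words are removed before the n-grams are built, a bigram can span a dropped word ("study of online" yields "study online"). Building n-grams first and filtering afterward is the alternative if that is not wanted.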
Citation Networks (Mako)
- Mako will complete detailed outline of his section in .tex file.
Conclusion
- Expand draft conclusion as we go. Revise upon completion.