Social media comp chapter

From CommunityData

A book chapter that Jeremy, Mako, and Aaron are working on.

Current to-do list

Data collection and preparation

  • Get Scopus articles about "online communities" as well


  • Revise once we have completed the sections and have a story to tell about our findings.
  • Incorporate a bit more about how the chapter attempts to point people toward great resources and exemplary work applying computational techniques.

Data Collection and Descriptives (Jeremy)

  • Write code to produce descriptive statistics
    • Citation Counts
    • Top Journals
    • Top Countries
    • Papers per year
  • Finish writing up benefits/drawbacks section
  • Incorporate mentions/citations of instructional texts/resources that interested readers can use.
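
A minimal sketch of the descriptive statistics listed above, using only the Python standard library. The record layout and field names (`year`, `journal`, `country`, `citations`) and the sample rows are made up for illustration; they are not the real Scopus export schema.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical records: field names and values are assumptions for illustration,
# not the actual Scopus export format.
papers = [
    {"year": 2010, "journal": "Journal A", "country": "US", "citations": 12},
    {"year": 2010, "journal": "Journal B", "country": "UK", "citations": 3},
    {"year": 2011, "journal": "Journal A", "country": "US", "citations": 0},
    {"year": 2012, "journal": "Journal A", "country": "DE", "citations": 7},
]

def describe(records, top_n=10):
    """Return citation summaries, top journals/countries, and papers per year."""
    citations = [r["citations"] for r in records]
    return {
        "citation_mean": mean(citations),
        "citation_median": median(citations),
        "citation_max": max(citations),
        # most_common returns (value, count) pairs, largest count first
        "top_journals": Counter(r["journal"] for r in records).most_common(top_n),
        "top_countries": Counter(r["country"] for r in records).most_common(top_n),
        # sorted by year so the result can be plotted as a time series
        "papers_per_year": sorted(Counter(r["year"] for r in records).items()),
    }

summary = describe(papers)
```

Citation counts are usually heavily skewed, so reporting the median alongside the mean is probably worth the extra line.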

Topic Models (Jeremy)

  • Write code to produce:
    • Overall topic distribution
    • Topic distribution over time
    • Topic distribution in top journals?
  • Write discussion of findings
  • Write benefits/limitations of approach
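
One way the three topic summaries above could be computed, assuming a topic model (e.g., LDA) has already been fit and gives one vector of topic proportions per paper. The input layout, the two-topic example, and the helper names `mean_topics` and `topics_by` are hypothetical.

```python
from collections import defaultdict

# Hypothetical input: (year, journal, topic proportions) per paper.
# The proportions would come from an already-fitted topic model.
doc_topics = [
    (2010, "Journal A", [0.5, 0.5]),
    (2010, "Journal B", [1.0, 0.0]),
    (2011, "Journal A", [0.0, 1.0]),
]

def mean_topics(vectors):
    """Average a list of topic-proportion vectors, topic by topic."""
    n_topics = len(vectors[0])
    return [sum(v[k] for v in vectors) / len(vectors) for k in range(n_topics)]

def topics_by(rows, key_index):
    """Group papers by the field at key_index and average topics per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_index]].append(row[2])
    return {key: mean_topics(vecs) for key, vecs in sorted(groups.items())}

overall = mean_topics([row[2] for row in doc_topics])  # overall distribution
by_year = topics_by(doc_topics, 0)                     # distribution over time
by_journal = topics_by(doc_topics, 1)                  # distribution per journal
```

This weights every paper equally; weighting by abstract length would be a different (also defensible) choice.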

Citation Prediction (Aaron)

  • Incorporate links/citations to Statistical Learning book and related instructional resources.
  • Prepare data for analysis
    • Abstract text
      • Lowercase
      • Remove stop-words
      • Create uni-, bi-, and trigrams.
      • Determine minimum threshold of phrase occurrence for inclusion across categories/subjects (e.g., Mitra & Gilbert say 50 in their dataset).
    • Control measures
      • n.authors
      • publication year
      • publication type (conference? journal?)
      • aggregate prior citations for authors (sqrt? log?)
      • author affiliation (fixed effects? maybe just dummy for R1?)
      • venue (fixed effects)
      • subject area (fixed effects — various measures available)
      • affiliation countries
      • language
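
A rough sketch of the abstract-text steps above (lowercase, remove stop-words, build uni-/bi-/trigrams, apply a minimum-occurrence threshold). The stop-word list is a tiny placeholder, the tokenizer is a simple regex, and "occurrence" is taken here to mean the number of abstracts containing the phrase; all of these are assumptions, as are the function names.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption); a real run would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "on", "for", "to", "is"}

def ngrams(text, max_n=3):
    """Lowercase, drop stop-words, and return all 1..max_n-grams as strings."""
    tokens = [t for t in re.findall(r"[a-z0-9]+", text.lower())
              if t not in STOP_WORDS]
    return [" ".join(tokens[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(tokens) - n + 1)]

def build_vocabulary(abstracts, min_docs=50):
    """Keep phrases that appear in at least min_docs abstracts."""
    doc_freq = Counter()
    for abstract in abstracts:
        # set() so each abstract counts a phrase at most once
        doc_freq.update(set(ngrams(abstract)))
    return {phrase for phrase, count in doc_freq.items() if count >= min_docs}

example = ngrams("The study of online communities")
vocab = build_vocabulary(
    ["Online communities grow", "The online communities"], min_docs=2
)
```

Because stop-words are dropped before the n-grams are formed, a bigram like "study online" spans a removed word; forming n-grams first and filtering afterwards is the alternative if that matters.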

Citation Networks (Mako)

  • Mako will complete a detailed outline of his section in the .tex file.


  • Expand draft conclusion as we go. Revise upon completion.