CommunityData:Participation Pathways
== Overview ==
This project explores how people move between communities in online spaces. In particular, we are interested in identifying patterns in the order in which people participate in given communities. This order of participation can be represented as a directed network, where an edge from community ''A'' to community ''B'' means that people are likely to move from ''A'' to ''B''. These pathways can be used to answer questions about how membership in a community influences future behavior, to identify potentially dangerous pathways of radicalization, etc.

== Initial Data ==
=== Submission Data ===
I built a dataset from submissions to reddit, from the beginning of reddit through 2019. It looks sequentially at the submissions from each user, and the very first time that a user posts in community ''j'', the edge from the community they posted in most recently (''i'') to ''j'' is incremented.

I then took a very simple Bayesian approach to building a posterior distribution for the proportion of the time that people submit to ''j'' after submitting to ''i'' (rather than the reverse). If <math>C_{i,j}</math> is the count of times that ''j'' was posted in after ''i'', then the posterior with a uniform prior is <math>B(C_{i,j} + 1, C_{j,i} + 1)</math>, where <math>B</math> is the Beta distribution. I then calculate the proportion of the posterior distribution that is < 0.5. This (I think!) can be thought of as the probability that the true proportion is less than 0.5. By only taking edges where very little of the posterior mass is below 0.5, we can identify pairs of communities where people are much more likely to post in ''j'' after ''i'' rather than the reverse.

A dataset with only the edges where this posterior proportion is less than .05 is at [https://jeremydfoote.com/files/outgoing/reddit_significant_edges.feather dataset].
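The counting rule and the posterior calculation described above can be sketched roughly as follows. This is a minimal illustration, not the code used to build the dataset: the function names, the <code>(user, timestamp, subreddit)</code> tuple layout, and the toy data are all assumptions. It uses the fact that, for integer parameters, the Beta CDF at 0.5 equals a binomial tail sum, so only the standard library is needed.

```python
from collections import Counter
from math import comb


def count_first_transitions(submissions):
    """Count i -> j edges from (user, timestamp, subreddit) tuples.

    The edge (i, j) is incremented the first time a user posts in j,
    where i is the subreddit of that user's immediately preceding post.
    (Hypothetical helper; the tuple layout is an assumption.)
    """
    edges = Counter()
    last_sub = {}  # user -> subreddit of their most recent submission
    seen = {}      # user -> set of subreddits they have already posted in
    for user, _ts, sub in sorted(submissions, key=lambda s: (s[0], s[1])):
        visited = seen.setdefault(user, set())
        if sub not in visited:
            if user in last_sub:  # a user's very first post creates no edge
                edges[(last_sub[user], sub)] += 1
            visited.add(sub)
        last_sub[user] = sub
    return edges


def posterior_mass_below_half(c_ij, c_ji):
    """P(p < 0.5) when p ~ Beta(c_ij + 1, c_ji + 1).

    For integer a and b, the Beta(a, b) CDF at x equals
    P(Binomial(a + b - 1, x) >= a); at x = 0.5 this is a sum of
    binomial coefficients divided by 2**n. Exact, but slow for very
    large counts.
    """
    a, b = c_ij + 1, c_ji + 1
    n = a + b - 1
    return sum(comb(n, k) for k in range(a, n + 1)) / 2 ** n


# Toy data, made up purely for illustration.
toy = [
    ("alice", 1, "pics"), ("alice", 2, "conspiracy"),
    ("alice", 3, "pics"), ("alice", 4, "ufos"),
    ("bob", 1, "pics"), ("bob", 2, "conspiracy"),
    ("carol", 1, "conspiracy"), ("carol", 2, "pics"),
]
toy_edges = count_first_transitions(toy)
mass = posterior_mass_below_half(
    toy_edges[("pics", "conspiracy")], toy_edges[("conspiracy", "pics")]
)
print(dict(toy_edges), mass)
```

With real counts, an edge from ''i'' to ''j'' would be kept when the returned value falls below the chosen cutoff (for example .05).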
=== Natural Experiment Data ===
I have also been scraping the reddit homepage to grab the subreddits that appear on the [https://www.reddit.com/subreddits/leaderboard leaderboard]. My tentative plan is to identify communities which are part of strong pathways, and then to compare users who were exposed to these communities via the leaderboard during their first few days after joining with users for whom these subreddits did not appear on the leaderboard (or did not appear as high up).

== Problems and Questions ==
I've been thinking about possible problems with this approach and would love others to help me identify them (and identify solutions!)
* I think age may be a big problem: if a user posts in ''i'' before ''j'' exists, then they couldn't have posted in ''j'' first.
* At first I thought this was the same as testing whether <math>P(i|j) > P(j|i)</math>, but it is not quite. The downside of that approach is that pathways can get diluted in the case where a subreddit acts as a gateway to multiple other subreddits. I think that my new approach is closer to the intuition I wanted.
** I also only look at the immediately subsequent subreddit, and only the first time someone posts in a given subreddit. This is not obviously the best way, but it controls somewhat for heterogeneity in activity levels.
* This produces a lot of edges! I would like to figure out a good way to prune them. The simplest is to keep tightening the cutoff on the proportion of the posterior < .5, but maybe there are some other ways?
* What do I do now? What pathways are interesting? Pathways to banned subs?
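One possible way to probe the age problem raised in the list above is to skip transitions whose order was forced, i.e. where ''j'' did not yet exist when the user first posted in ''i''. The sketch below is only an illustration of that idea, not something that has been run on the real data. The function name and tuple layout are hypothetical, and it assumes that the earliest submission seen in a subreddit is an acceptable proxy for its creation time.

```python
from collections import Counter


def count_unforced_transitions(submissions):
    """Count i -> j first-time transitions, skipping forced orderings.

    submissions: iterable of (user, timestamp, subreddit) tuples.
    A transition is skipped when j's first-seen timestamp is later than
    the user's first post in i, since that user could not have posted
    in j first. (Assumption: first-seen time approximates creation.)
    """
    submissions = list(submissions)

    # Proxy for creation time: earliest submission seen per subreddit.
    first_seen = {}
    for _user, ts, sub in submissions:
        first_seen[sub] = min(ts, first_seen.get(sub, ts))

    edges = Counter()
    last_sub = {}    # user -> subreddit of their most recent submission
    first_post = {}  # (user, subreddit) -> timestamp of first post there
    for user, ts, sub in sorted(submissions, key=lambda s: (s[0], s[1])):
        if (user, sub) not in first_post:
            prev = last_sub.get(user)
            if prev is not None and first_seen[sub] <= first_post[(user, prev)]:
                edges[(prev, sub)] += 1
            first_post[(user, sub)] = ts
        last_sub[user] = sub
    return edges


# Toy data, made up for illustration: "newsub" first appears at time 10.
toy_age = [
    ("alice", 1, "pics"), ("alice", 12, "newsub"),  # forced: skipped
    ("bob", 10, "newsub"), ("bob", 11, "pics"),
    ("dave", 11, "pics"), ("dave", 13, "newsub"),   # newsub existed: kept
]
unforced = count_unforced_transitions(toy_age)
print(dict(unforced))
```

In the toy data, alice's move from pics to newsub is dropped because newsub did not exist when she first posted in pics, while dave's identical move is kept.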
== Some very preliminary results ==
Here are the neighbors of r/conspiracy, when using a cutoff of .0001:
* People move from these communities to r/conspiracy:
** adviceanimals
** askreddit
** codzombies
** dota2
** fantasybaseball
** funny
** leagueoflegends
** pewdiepiesubmissions
** pics
** politics
** prettylittleliars
** reddit.com
** showerthoughts
** squaredcircle
** teenagers
** the_donald
** trees
* People move from r/conspiracy to these communities:
** astronomy
** bad_cop_no_donut
** cannabis
** cbts_stream
** christianity
** circleoftrust
** collapse
** creepy
** documentaries
** economics
** environment
** fullmoviesonyoutube
** health
** hiphoptruth
** history
** latestagecapitalism
** legaladvice
** lifeprotips
** military
** mycology
** nottheonion
** occult
** philosophy
** quotes
** redditrequest
** science
** shadowban
** space
** thecalmbeforethestorm
** topmindsofreddit
** ufos
** upliftingnews
** futurology
** ideasfortheadmins
** activism
** anticonsumption
** conspiracytheories
** conspiratard
** hailcorporate
** preppers
** wikileaks
** 911truth
** actualconspiracies
** alternativehistory
** altnewz
** c_s_t
** conspiracies
** conspiracy_commons
** conspiracydocumentary
** conspiracyfact
** conspiracyfacts
** conspiracyhub
** conspiracyii
** conspiracymemes
** conspiracyright
** conspiracyundone
** conspiro
** culturallayer
** descentintotyranny
** endlesswar
** falseflagwatch
** fringetheory
** governmentoppression
** highstrangeness
** holofractal
** intelligence
** jfkresearcher
** limitedhangouts
** occultconspiracy
** pedogate
** propaganda
** romerules
** truthleaks
** unagenda21
** undelete
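A neighbor listing like the one above could be produced along these lines. This is a sketch under assumptions: the edge table is represented as a plain dict from <code>(i, j)</code> to the proportion of the posterior below 0.5 (the actual column layout of the feather file is not described here), and the subreddit names and values in the example are placeholders, not results.

```python
def neighbors(edge_mass, target, cutoff):
    """Return (incoming, outgoing) neighbors of target under a cutoff.

    edge_mass: dict mapping (i, j) -> proportion of the posterior for
    the i -> j direction that lies below 0.5 (hypothetical layout).
    An edge is kept when that proportion is below the cutoff.
    """
    incoming = sorted(
        i for (i, j), m in edge_mass.items() if j == target and m < cutoff
    )
    outgoing = sorted(
        j for (i, j), m in edge_mass.items() if i == target and m < cutoff
    )
    return incoming, outgoing


# Placeholder values, made up for illustration only.
example_mass = {
    ("sub_a", "target"): 0.00001,
    ("sub_b", "target"): 0.2,
    ("target", "sub_c"): 0.00005,
    ("target", "sub_d"): 0.001,
}
incoming, outgoing = neighbors(example_mass, "target", 0.0001)
print(incoming, outgoing)
```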