Communication and Social Networks (Spring 2020)/Dutch School Data Visualization challenge: Difference between revisions

From CommunityData
(Created page with "== The goal == In 2003 and 2004, researchers repeatedly surveyed a number of Dutch school students about their friendships and their behavior. They were particularly interest...")
 
No edit summary
Line 1: Line 1:
== The goal ==
== The goal ==


In 2003 and 2004, researchers repeatedly surveyed a number of Dutch school students about their friendships and their behavior. They were particularly interested in the relationship between friendships and drinking behavior.
In 2003 and 2004, researchers repeatedly surveyed a number of Dutch school students about their friendships and their behavior. They were particularly interested in the relationship between friendships and drinking behavior. They recorded information about alcohol use, gender, age, ethnicity (whether Dutch or not), and religion.


However, there are lots of different questions that you can ask about this data, and lots of different ways to visualize relationships between them. Your goal is to identify a question that you think would be interesting and to use R to visualize the network in a way that sheds light on that question.
For this homework, you are supposed to think of a question that you could ask about this data. I don't remember exactly the questions that we came up with in class, but you could ask things like:
* Are people who drink more more popular?
* Are males or females more likely to have the same drinking behavior as their friends?
* Are people of the dominant religion more likely to be popular? More likely to be friends with each other?
 
I created [https://github.com/jdfoote/Communication-and-Social-Networks/raw/master/activities/school_data_example.Rmd this R file] to show an A+ example, to give you ideas, and to provide some code that you might want to repurpose.
 
There are lots of different questions that you can ask about this data, and lots of different ways to visualize the relationships in it. Your goal is to identify a question that you think would be interesting and to use R to visualize the network in a way that sheds light on that question. I decided to look at whether mutual friendships were more likely to connect people with the same drinking behavior. I ended up coloring the nodes based on drinking behavior and coloring the edges based on whether the two friends had the same drinking behavior.


== The data ==
== The data ==
Line 9: Line 16:
'''Right-click [https://github.com/jdfoote/Communication-and-Social-Networks/raw/master/activities/network_visualization_examples_and_assignment.Rmd this link]''' and open it in RStudio. At the top of RStudio click "knit", and it should open up something that looks kind of like a web page, which was created from this file ([https://youtu.be/tKUufzpoHDE video explaining R Markdown]). It includes example code for making network visualizations, and also includes code for loading the data for this assignment.
'''Right-click [https://github.com/jdfoote/Communication-and-Social-Networks/raw/master/activities/network_visualization_examples_and_assignment.Rmd this link]''' and open it in RStudio. At the top of RStudio click "knit", and it should open up something that looks kind of like a web page, which was created from this file ([https://youtu.be/tKUufzpoHDE video explaining R Markdown]). It includes example code for making network visualizations, and also includes code for loading the data for this assignment.


The researchers made their data available in files that are difficult to get into R. In order to make things easier, I've changed this messy data of multiple matrices into an igraph object for you. You can see how I did that [https://github.com/jdfoote/Communication-and-Social-Networks/blob/master/activities/knecht_school_data.r here]. The key piece of code that you will need in your code is <code>load(url('https://github.com/jdfoote/Communication-and-Social-Networks/raw/master/activities/school_graph.Rdata'))</code>. This should grab the igraph objects <code>G</code> and <code>friend_net</code>, and load them into your environment. Descriptions of both networks is in the R Markdown file.
The R Markdown file linked above explains that I created 2 igraph objects for you:
 
* <code>G</code> is a multiplex network, which includes both friendships and edges which represent whether two people went to grade school together
* <code>friend_net</code> is just a simplified version of <code>G</code>, where I removed the grade school edges.
 
In order to load these igraph objects into R you will need to run
 
<code>load(url('https://github.com/jdfoote/Communication-and-Social-Networks/raw/master/activities/school_graph.Rdata'))</code>.  


Descriptions of what each measure means are at the [http://www.stats.ox.ac.uk/~snijders/siena/tutorial2010_data.htm this site], maintained by the people who collected the data.
This should grab the igraph objects <code>G</code> and <code>friend_net</code>, and load them into your environment. Descriptions of both networks are in the R Markdown file.


Basically, it includes information about alcohol use, gender, age, ethnicity (whether Dutch or not), and religion.
Descriptions of what each measure means are at [http://www.stats.ox.ac.uk/~snijders/siena/tutorial2010_data.htm this site], maintained by the people who collected the data.

Revision as of 23:08, 26 March 2020

The goal

In 2003 and 2004, researchers repeatedly surveyed a number of Dutch school students about their friendships and their behavior. They were particularly interested in the relationship between friendships and drinking behavior. They recorded information about alcohol use, gender, age, ethnicity (whether Dutch or not), and religion.

For this homework, you are supposed to think of a question that you could ask about this data. I don't remember exactly the questions that we came up with in class, but you could ask things like:

  • Are people who drink more more popular?
  • Are males or females more likely to have the same drinking behavior as their friends?
  • Are people of the dominant religion more likely to be popular? More likely to be friends with each other?
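
As a rough sketch of how the first question could be approached, the code below measures popularity as in-degree (the number of friendship nominations a person receives) and compares it across drinking levels. It uses a tiny made-up network, not the school data, and the attribute name <code>alcohol</code> is an assumption; check the attribute names in the real objects before reusing it.

```r
library(igraph)

# Toy directed friendship nominations (made up for illustration).
edges <- data.frame(
  from = c("A", "B", "C", "D", "D"),
  to   = c("B", "C", "B", "B", "C")
)

# Hypothetical drinking scores; the real attribute may be named differently.
students <- data.frame(
  name    = c("A", "B", "C", "D"),
  alcohol = c(1, 3, 3, 1)
)

toy_net <- graph_from_data_frame(edges, directed = TRUE, vertices = students)

# In-degree: how many people named each student as a friend.
popularity <- degree(toy_net, mode = "in")

# Average popularity within each drinking level.
mean_popularity <- tapply(popularity, V(toy_net)$alcohol, mean)
print(mean_popularity)
```

With the real data, the same two steps (compute in-degree, then summarize by an attribute) would apply to <code>friend_net</code>, and the in-degree could also be used to size the nodes in a plot.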

I created this R file to show an A+ example, to give you ideas, and to provide some code that you might want to repurpose.

There are lots of different questions that you can ask about this data, and lots of different ways to visualize the relationships in it. Your goal is to identify a question that you think would be interesting and to use R to visualize the network in a way that sheds light on that question. I decided to look at whether mutual friendships were more likely to connect people with the same drinking behavior. I ended up coloring the nodes based on drinking behavior and coloring the edges based on whether the two friends had the same drinking behavior.
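
The coloring idea described above might look roughly like the sketch below. This is not the code from the example file; it is a minimal version on a made-up network, and the <code>alcohol</code> attribute and the color choices are assumptions for illustration.

```r
library(igraph)

# Toy directed friendship nominations (made up for illustration).
edges <- data.frame(
  from = c("A", "B", "B", "C", "D", "A"),
  to   = c("B", "A", "C", "D", "C", "D")
)

# Hypothetical drinking scores; the real attribute may be named differently.
students <- data.frame(
  name    = c("A", "B", "C", "D"),
  alcohol = c(1, 1, 3, 3)
)

toy_net <- graph_from_data_frame(edges, directed = TRUE, vertices = students)

# For every edge, look up the drinking score at both ends and compare them.
edge_ends <- ends(toy_net, E(toy_net), names = FALSE)
same_drinking <- V(toy_net)$alcohol[edge_ends[, 1]] ==
  V(toy_net)$alcohol[edge_ends[, 2]]

# An edge is mutual when the reverse edge also exists.
mutual <- which_mutual(toy_net)

# Color nodes by drinking score and edges by whether the friends match.
node_palette <- c("1" = "lightblue", "3" = "tomato")
V(toy_net)$color <- unname(node_palette[as.character(V(toy_net)$alcohol)])
E(toy_net)$color <- ifelse(same_drinking, "darkgreen", "grey70")

plot(toy_net, edge.arrow.size = 0.4)

# Cross-tabulate mutual edges against matching drinking behavior.
print(table(mutual = mutual, same_drinking = same_drinking))
```

The table at the end gives a quick numeric check on what the plot suggests, which can help when the network is too dense to read by eye.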

The data

Right-click this link and open it in RStudio. At the top of RStudio click "knit", and it should open up something that looks kind of like a web page, which was created from this file (video explaining R Markdown). It includes example code for making network visualizations, and also includes code for loading the data for this assignment.

The R Markdown file linked above explains that I created 2 igraph objects for you:

  • G is a multiplex network, which includes both friendships and edges which represent whether two people went to grade school together
  • friend_net is just a simplified version of G, where I removed the grade school edges.

In order to load these igraph objects into R you will need to run

load(url('https://github.com/jdfoote/Communication-and-Social-Networks/raw/master/activities/school_graph.Rdata'))

This should grab the igraph objects G and friend_net, and load them into your environment. Descriptions of both networks are in the R Markdown file.
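
To see what <code>load()</code> does, here is a small sketch that runs offline. It saves two stand-in graphs to a temporary file and loads them back; the stand-ins are toy ring graphs, not the real data, and the temporary file takes the place of the course URL. The checks at the end are the kind of thing you could run on the real objects after loading them.

```r
library(igraph)

# Stand-ins for the course objects (toy graphs, NOT the real school data).
G <- make_ring(5)
friend_net <- make_ring(3)

# Save both objects to a temporary .Rdata file, then remove them.
tmp <- tempfile(fileext = ".Rdata")
save(G, friend_net, file = tmp)
rm(G, friend_net)

# load() restores the objects under their saved names and
# returns those names as a character vector.
loaded_names <- load(tmp)
print(loaded_names)

# Quick checks that are also worth running on the real objects.
print(is_igraph(friend_net))          # is it an igraph object?
print(vcount(friend_net))             # number of nodes
print(ecount(friend_net))             # number of edges
print(vertex_attr_names(friend_net))  # node attributes available
print(edge_attr_names(G))             # edge attributes available
```

Because <code>load()</code> puts objects into your environment under the names they were saved with, you do not assign its result to <code>G</code> or <code>friend_net</code> yourself.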

Descriptions of what each measure means are at this site, maintained by the people who collected the data.