Editing Statistics and Statistical Programming (Winter 2017)

From CommunityData


This class will focus much more on statistical programming in R than most similar classes. Most comparable classes in communication focus on an easier-to-use statistical package like SPSS.


We're focusing on programming instead of a package like SPSS for several reasons:

* A student who understands a programming language won't be limited to the "canned" functions in off-the-shelf packages.
* Pedagogically, programming helps students build a deeper understanding of the mathematics and assumptions behind the canned functions, both by letting them read the code "behind" those functions and by letting them implement the functions themselves in assignments.
* Analyses composed of code instead of clicks support reproducibility: the code can document every step of an analysis, including data cleaning and conversion, where errors are common and very difficult to detect.
* Programming is a skill that is in demand in our department and in the discipline more generally, and one that I strongly believe is generally useful.


Of course, there are other programming languages well suited to statistics, including Stata and Python. Ultimately, I'm teaching R because a few of us who seemed most likely to teach in this sequence going forward got together and decided that R made the most sense, and because there was consensus on this among the faculty in the department who were likely to teach statistics classes in the future.


Our reasoning was that:

* R is freely available and open source.
* R is becoming the most widely used package in statistical fields and is (by our estimate) already used by most academics in my cohort or later in statistics, political science, and economics.
* R is the system (along with Stata) that will be used in other CSSS advanced statistics classes that we hope students will continue to take after COM521.
* R is a better general-purpose programming language than software like Stata, which means that R programming skills will let students solve non-statistical problems, like collecting data from the web, and will make it easier to learn other programming languages.


For students with a strong psychometric focus, or whose research will be limited to linear and logistic regression, ANOVA, and similar methods on small pre-collected datasets, SPSS will likely be fine. R has a higher barrier to entry than SPSS, but its ceiling is ''much'' higher.