Not logged in
Talk
Contributions
Create account
Log in
Navigation
Main page
About
People
Publications
Teaching
Resources
Research Blog
Wiki Functions
Recent changes
Help
Licensing
Page
Discussion
Edit
View history
Editing
DS4UX (Spring 2016)/Reading and writing files
(section)
From CommunityData
Jump to:
navigation
,
search
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
== Instructions == # If you haven't done so already, [http://jtmorgan.net/ds4ux/week4/lecture.zip download the ZIPPED code and data file] called <code>lecture.zip</code>, unzip it, and navigate to it in your Terminal or Powershell. # Open the code and datafiles in that directory in TextWrangler as well, so you can see the code while we walk through it. === Notes === # The input file we will be working with in this exercise is intentionally messy—it has lots of extra spaces and tabs scattered through the various lines of the file. One of the reasons we're devoting a whole mini-lecture on reading and writing flat files is that the data you want to analyze is OFTEN messy like this when you first get it, and often creating a "clean" version is one of the first things you'll want to do with that data. # File suffixes (.txt, .tsv, .csv) are in some cases more a matter of ''conventions'' than requirements. If a file is just plain text, Python may not care what its file suffix is: you may be able to read it in and read through it line-by-line whether it has ".txt" on it or now. However, in many cases, Python (like most applications) uses the file suffix to decide how to render/execute a file, so it's best to always use the proper (conventional) suffix for any file database you have: .csv for comma-separated value files, .tsv for tab-separated value files, and .txt for generic text files (or for text files where you don't know or can't guarantee that the structure is complete or consistent)
Summary:
Please note that all contributions to CommunityData are considered to be released under the Attribution-Share Alike 3.0 Unported (see
CommunityData:Copyrights
for details). If you do not want your writing to be edited mercilessly and redistributed at will, then do not submit it here.
You are also promising us that you wrote this yourself, or copied it from a public domain or similar free resource.
Do not submit copyrighted work without permission!
To protect the wiki against automated edit spam, we kindly ask you to solve the following CAPTCHA:
Cancel
Editing help
(opens in new window)
Tools
What links here
Related changes
Special pages
Page information