CommunityData:Dataverse
The [https://dataverse.harvard.edu/ Harvard Dataverse] is an archive for datasets and code hosted by Harvard but available to anybody. The [https://dataverse.harvard.edu/dataverse/communitydata Community Data Science Collective Dataverse] is a portal within the Harvard Dataverse that is for CDSC projects and is managed by our team.

==How should I add things to the CDSC Dataverse?==

If you have not done so before, '''everyone should begin with the following''':

# [https://dataverse.harvard.edu/dataverseuser.xhtml;?editMode=CREATE&redirectPage=%2Fdataverse.xhtml%3Falias%3Dcommunitydata Create an account]. You might want to use your institutional login, or you can create a new account with a username/email and password.
# Ask an existing administrator to make you a member/administrator of the CDSC Dataverse (everybody in the group should be an admin, so it's best to ask on IRC).

You now need to choose between two options: (1) The first is to create your dataset within the CDSC Dataverse. This is usually best because anyone in the group can manage it. (2) The second is to create it outside of the CDSC Dataverse but to "link" it. You will typically do this when there is access-restricted data that should ''not'' be available to everyone in the group.

If you want to '''create a dataset in the CDSC Dataverse''', create your replication package or dataset release as follows:

# Go to the [https://dataverse.harvard.edu/dataverse/communitydata CDSC Dataverse Page]
# Click "+ Add Data" → "New Dataset"
# Make sure that "Host Dataverse" says "Community Data Science Collective Dataverse"
# Upload your files and fill out the metadata fields (at a minimum, include a README.txt file that explains how to use your data and code)
# Publish/release!
If you want to '''create your dataset outside the CDSC Dataverse but have it listed''', you will need to:

# Go to the [https://dataverse.harvard.edu/ Main Harvard Dataverse Page]
# Click "Click" → "Add a Dataset"
# Make sure that "Host Dataverse" says "Harvard Dataverse"
# Upload your files and fill out the metadata fields (at a minimum, include a README.txt file that explains how to use your data and code)
# Publish/release!
# Click the "Link Dataset" button on your dataset page and then type/select the ''Community Data Science Collective'' Dataverse.

Finally, '''if you have already created a dataset and want it moved into the CDSC Dataverse''', you will need to click the Support button at the top of each page and write a message asking the Dataverse support team to move it for you. They usually do this very quickly.

==An open science workflow using Dataverse==

There are many ways to follow open science practices. One way to fit the CDSC Dataverse into your open science workflow is as follows:

===Step 1: Anonymous while under review===

Some publications ask for an anonymized release of code and data. This is easy to do without breaking double-blind anonymity. Generate a code and data package that doesn't include information that will identify you, and then when uploading '''do not fill out metadata fields with authorship information''' and '''do not release (publish) your archive'''. Delete your name from any fields where it is autofilled.

Once your files are uploaded, there is an option under 'Edit Dataset' to 'Generate Private URL'. See details in the [https://guides.dataverse.org/en/6.0/user/dataset-management.html#private-url-to-review-unpublished-dataset Dataverse user guide]. This creates a blue box at the top of your archive which reads "Unpublished Dataset Private URL - Privately share this dataset before it is published:" followed by a link. That is the link to share with your reviewers. Test it in another browser to be sure that it doesn't reveal anything.
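
Before uploading an anonymized package, it can help to search the files for strings that would identify you. The following is a minimal sketch of such a check and is not part of Dataverse. The function name <code>find_identifying_strings</code>, the directory <code>replication_package</code>, and the example name and email address are all hypothetical. It only reads files that decode as UTF-8 text, so binary files (PDFs, images, .RData files, and so on) and their embedded metadata still need to be checked by hand.

```python
from pathlib import Path


def find_identifying_strings(package_dir, needles):
    """Return (relative path, line number, needle) for each match found.

    Matching is case-insensitive. Files that cannot be decoded as UTF-8
    are skipped, so binary files still need a manual check.
    """
    root = Path(package_dir)
    lowered = [(needle, needle.lower()) for needle in needles]
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # unreadable or non-text file: skip it
        for lineno, line in enumerate(text.splitlines(), start=1):
            line_lower = line.lower()
            for needle, needle_lower in lowered:
                if needle_lower in line_lower:
                    rel_path = path.relative_to(root).as_posix()
                    hits.append((rel_path, lineno, needle))
    return hits


if __name__ == "__main__":
    # Hypothetical directory and identifying strings; replace with your own.
    matches = find_identifying_strings(
        "replication_package", ["Jane Doe", "jdoe@example.edu"]
    )
    for rel_path, lineno, needle in matches:
        print(f"{rel_path}:{lineno}: contains {needle!r}")
```

An empty result does not prove that the package is anonymous. It only means that none of the listed strings appear in the text files that could be read.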
===Step 2: Identified after acceptance===

You might like to include a link to your Dataverse archive in your paper; you might also want to add it to your accepted preprint before uploading the paper to arXiv. Fill out as many metadata fields as you find useful (authors, description, subject, keywords), ask a colleague to take a look at your archive, and then release it.

===Step 3: Updated after publication===

After your paper is published and the DOI goes live, why not add this information to your archive so that others can find it (the 'Related Publication' metadata field)?

==Potential questions and problems==

===Oh no, I made an error in my archive!===

After an archive is released, you can make updates. But if you've realized that the previous version is sufficiently bad that you don't want it to be findable, the archive needs to be deleted or 'deaccessioned'.

===What's this message about my data format and 'tabular ingest failed'?===

Dataverse wants to be able to present your data in tabular form for people to view live without downloading, and it is having trouble parsing what you uploaded. You can reformat the file, or you can ignore this error.

===My replication package has a main directory and subdirectories -- how do I represent this?===

Dataverse assumes everything is in the root. If you have subdirectories, the way to make this work is to upload the files from those subdirectories and then specify the file path for each one using the UI that only shows up after you do the upload.
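
If your package has many subdirectories, a list of each file and the directory it belongs in can make entering the file paths less error-prone. The following is a minimal sketch that builds such a list from a local copy of the package. The function name <code>file_paths_for_upload</code> is hypothetical, and the script does not talk to Dataverse at all. It only prints what you might type into the file path field for each file.

```python
from pathlib import Path


def file_paths_for_upload(package_dir):
    """Return (file name, directory path) pairs for every file in the package.

    The directory path is relative to the package root and uses forward
    slashes; files in the root get an empty string.
    """
    root = Path(package_dir)
    pairs = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            parent = path.relative_to(root).parent.as_posix()
            pairs.append((path.name, "" if parent == "." else parent))
    return pairs


if __name__ == "__main__":
    # Hypothetical package directory; replace with your own.
    for name, directory in file_paths_for_upload("replication_package"):
        print(f"{name}\t{directory or '(root)'}")
```

Files with the same name in different subdirectories appear as separate rows, so the list also shows where a file path is needed to tell them apart.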