The [https://dataverse.harvard.edu/ Harvard Dataverse] is an archive for datasets and code that is hosted by Harvard but available to anybody. The [https://dataverse.harvard.edu/dataverse/communitydata Community Data Science Collective Dataverse] is a portal within the Harvard Dataverse for CDSC projects; it is managed by our team.
==How should I add things to the CDSC Dataverse?==

If you have not done so before, '''everyone should begin by''':

# [https://dataverse.harvard.edu/dataverseuser.xhtml;?editMode=CREATE&redirectPage=%2Fdataverse.xhtml%3Falias%3Dcommunitydata Creating an account]. You can use your institutional login or create a new account with a username/email and password.
# Asking an existing administrator to make you a member/administrator of the CDSC Dataverse. Everybody in the group should be an admin, so it's best to ask on IRC.
You now need to choose between two options. (1) Create your dataset within the CDSC Dataverse. This is usually best because anyone in the group can manage it. (2) Create it outside of the CDSC Dataverse and "link" it. You will typically do this when the dataset includes access-restricted data that should ''not'' be available to everyone in the group.

If you want to '''create a dataset in the CDSC Dataverse''', create your replication package or dataset release as follows:
# Go to the [https://dataverse.harvard.edu/dataverse/communitydata CDSC Dataverse Page]
# Click "+ Add Data" → "New Dataset"
# Make sure that "Host Dataverse" says "Community Data Science Collective Dataverse"
# Upload and fill out metadata fields (minimally, include a README.txt file to explain how to use your data and code)
# Publish/release!
If you want to '''create your dataset outside the CDSC Dataverse but have it listed''' there, you will need to:

# Go to the [https://dataverse.harvard.edu/ Main Harvard Dataverse Page]
# Click "+ Add Data" → "New Dataset"
# Make sure that "Host Dataverse" says "Harvard Dataverse"
# Upload and fill out metadata fields (minimally, include a README.txt file to explain how to use your data and code)
# Publish/release!
# Click the "Link Dataset" button on your dataset page and then type/select the ''Community Data Science Collective'' Dataverse.
Finally, '''if you have already created a dataset and want it moved into the CDSC Dataverse''', you will need to click the Support button at the top of each page and write a message asking the Harvard Dataverse team to move it for you. They usually do this very quickly.
==An open science workflow using dataverse==
Dataverse assumes that every file lives in the root of the dataset. If your project has subdirectories, upload the files from those subdirectories and then specify each file's path using the UI, which only shows the file path option after the upload is done.
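If you script your uploads instead of using the web UI, the same idea applies: each file needs its subdirectory recorded as metadata. The following is a minimal sketch, not a tested upload tool. It only walks a local release directory and works out which path each file would need. The function name <code>plan_uploads</code> is hypothetical, and the metadata field name <code>directoryLabel</code> is an assumption based on the Dataverse API guide that you should verify before relying on it.

```python
import json
from pathlib import Path


def plan_uploads(root):
    """Return (relative_path, json_metadata) pairs for every file under root.

    Files directly in root get empty metadata. Files in subdirectories get
    their parent directory, relative to root, written with forward slashes.
    """
    root = Path(root)
    plan = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        relative = path.relative_to(root)
        metadata = {}
        if relative.parent != Path("."):
            # ASSUMPTION: "directoryLabel" is the name of the file path field.
            metadata["directoryLabel"] = relative.parent.as_posix()
        plan.append((relative.as_posix(), json.dumps(metadata)))
    return plan
```

Calling <code>plan_uploads("my-release")</code> on a directory that holds <code>README.txt</code> and <code>data/raw/edits.tsv</code> gives empty metadata for the README and a path of <code>data/raw</code> for the data file. Nothing is sent over the network; the output is only a checklist of the paths to enter after uploading.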
===I have uploaded a repository to a Dataverse that is not part of the CDSC one. What should I do?===