CommunityData:Zotero: Difference between revisions

From CommunityData
(Adding section about using Better BibLateX)
 
(21 intermediate revisions by 5 users not shown)
Line 1: Line 1:
We use Zotero for citation management.
We use Zotero for citation management!
 
There's an outline for a short session on how to use Zotero at [[Zotero/Workshop outline]] as well as [https://communitydata.science/~mako/zotero_tutorial-20221007-201250_Recording_1920x1080.mp4 this recording of a tutorial session] (1h long, run by [[Mako]]). The video of the tutorial is hosted in [[Mako's]] <code>cdsc_only</code> video repository so there's a username and password, but you can ask anybody in the group and they should be able to get it by searching their email for "cdsc_only".


== Install Zotero ==
== Install Zotero ==


We recommend installing Zotero on your desktop as well as your browser. You can download the software and browser plugin (for Chrome, Firefox or Safari) from [https://www.zotero.org/ the zotero website]. You'd also want to register for an account if you don't already have one.
You should register for a free Zotero account if you don't already have one. We recommend [https://www.zotero.org/download/ installing Zotero] on your desktop as well as the [https://www.zotero.org/download/connectors "connector" for your browser]. The browser plugin is used to quickly add citations from websites instead of manually entering all of the information.
 
You likely also want to [https://retorque.re/zotero-better-bibtex/installation/ install the Better BibTeX plugin]. This will let you do things like automatically rename PDF files and keep .bib files automatically updated.


== Group Directory ==
== Group Directory ==


We have a group directory that we use to collect citations for all shared projects. The directory is private, so ask a member of the collective to invite you.
We have a group directory that we use to collect citations for all shared projects: [https://www.zotero.org/groups/163394/community_data_science_collective the Community Data Science Collective Zotero repository]. The directory is private, so ask a member of the collective to invite you.
 
The directory is organized by folder, with each folder aggregating citations for a particular project or topic.


== Exporting ==
The directory is organized by ''collections'' (visualized), with each folder aggregating citations for a particular project or topic.


If you want to create a BibTex file from the citations in your folder, right-click your folder in the desktop application, and click "Export Collection". Select BibTex as the format and save to a location of your choice. Be sure to rename the file <code>refs.bib</code> so that it works with the [[CommunityData:Build papers|paper-building workflow]].
In addition to the folders for individual researchers, there are 2 general folders:
* CDSC Publications: For publications by all CDSC members.
* Duplicate Items: Identifies any duplicates across the group directory. When deciding which duplicate to keep and which to merge, simply choose whichever option contains more information.


== Zotero Better Bibtex Extension ==  
== Adding and Organizing References ==
Better Bib(la)tex [https://github.com/retorquere/zotero-better-bibtex] is a useful extension for using zotero with [[CommunityData:TeX|Tex]].


Better BibTeX has a few nice features:  
The process to add something cleanly is complicated but should always include the following steps:  
# It supports BibLaTeX which supports Unicode better than regular BibTeX
# Automatic journal abbreviations.
# No longer need refsprocessed.bib
# Automatically syncing .bib files with zotero collections
# Pulling .bib files from zotero with an http request


===Before you add a source===
# '''Sync your local Zotero repository''' (by clicking the little green symbol that looks like a recycle sign in the top right). This may not be necessary all the time, but it's better to do it!
# '''Check to see if the thing you're about to add is already in the CDSC shared folder''' by clicking on the top level of the shared group and searching. If it is already there, just drag it into your new subfolder for your project. If it's not, click back on your sub-folder and add it.


This final feature is particularly useful if you are [[CommunityData:Build papers|building papers]] with make.  
===Once you've added a source===
Let's walk through steps for setting up Better BibTeX with make.  
# '''Once you've added a source, change the title to sentence case.''' You can do this by: (a) ''Right clicking on Title → Transform Text → Sentence Case'' (b) you will then need to capitalize any proper nouns (e.g., Scratch, Wikipedia) as well as anything immediately following a ":". This is important because software like BibTeX/BibLaTeX can change from "Sentence case" to "Title Case" automatically, but not the other way around. APA 6 requires sentence case.
#* Do make sure that any proper nouns are still capitalized after converting the title to sentence case. (EX: places like "United States," websites like "Wikipedia," apps like "Airbnb")
# '''Review and revise the bibliographic record as needed.''' This varies a bit by publication type and by the data source you've imported the bibliographic metadata from, but here's a minimal set of details that you should check were imported correctly:
#* Item Type: This should match the sort of thing you're importing, such as a book, journal article, magazine article, blog post, etc.
#* Title: The title of the piece itself.
#* Authors, editors, translators: Imported metadata is often pretty messy for these. Do your best to make them right. Wherever possible, defer to the apparent preferences/conventions adopted by authors (e.g., capitalization, spellings, name changes, etc.). When names include characters or diacritics that are not part of the English language, do what you can to incorporate the correct, original characters (copy/paste is your friend here).
#* Publication: For journals, magazines, newspapers, etc.
#* Volume + Issue: Usually only applies to periodicals.
#* Pages: For periodicals, book chapters, or other selections.
#* Date: This is also often a little weird in the metadata. Should match the publication date used by the publication. For books and journals, year alone is enough. For everything else, there should be a ''yy/mm/dd'' (or whatever format) entry.
#* DOI (Digital Object Identifier): Ensure that there's a DOI for your entry if it's available. In general, this applies to journals and conference proceedings. Some publishers and conferences (like AAAI, which publishes ICWSM) do not have DOIs but these are extremely rare. If your publication does not have a DOI, it needs to have a URL because APA 6 requires one or the other... which brings us to:
#* URL: Web addresses should reflect canonical sources (publisher websites, institutional repositories, pre-print servers, etc.) to the extent possible. Personal websites are fine if that seems like the best option (i.e., there's not an archival version anywhere else). Sometimes URLs include DOI information and, if DOI metadata was missing otherwise, you should extract DOIs in this way.
#* Publisher: According to [http://blog.apastyle.org/apastyle/2010/01/the-generic-reference-where.html APA 6 style] we should drop common words like "Press," "Publisher," "Inc." as well as first names (i.e., just Wiley, not John Wiley Inc.)
#* Place: [http://blog.apastyle.org/apastyle/2010/01/the-generic-reference-where.html APA 6 style] requires "City, State" within the USA and "City, Country" outside. So, it's "New York, New York" for the ACM and "Cambridge, UK" for Cambridge University Press.
# '''Remove anything in the "Extra" field''' unless it's something you want to be printed every time. Sometimes things like "ⓒ JSTOR" sneak in.
# '''Make sure that there's a clearly named PDF attached.''' You can attach PDFs by: ''Right-click on the item → Add attachment → Attach Stored Copy of File.'' Once the PDF is uploaded, you should rename the PDF to ''Name-YYYY-Short_title.pdf''. After your bibliographic record is cleaned up and accurate, you can do this by: ''Right click on PDF → Rename File from Parent Metadata''
# '''Ensure that there aren't extraneous files attached.''' Just delete anything that doesn't look critical or useful. For example, when using the browser plugin, sometimes a 'Snapshot' will be attached which is just a link to the website you sourced from (which should be in the URL field anyway). Anything left attached will show up in fulltext searches, which can be a reason to either leave something or remove it depending on what it is.


=== Setup Better BibTeX with Make ===
== Tips and Tricks ==
You'll need make, our [[CommunityData:TeX|TeX]] setup and wget to make this work.


==== Installing and Configuring Better BibTeX ====
* In Zotero, folders are not like traditional folders. They are like tags. Do not add things if they are already there!
# Install from instructions on github https://github.com/retorquere/zotero-better-bibtex/wiki/Installation
* Holding down the <tt>Option key</tt> (macOS), <tt>Ctrl</tt> key (Windows), or <tt>Alt</tt> (Zotero 6)/<tt>Ctrl</tt> (Zotero 7) key (Linux) is extremely useful! It will show you which folders the selected item is in!
# Open Zotero preferences, go to the Better BibTex tab
* When you're done adding a bunch of things, look at the ''Duplicate'' pseudo-folder underneath the ''Community Data Science Collective'' folder.
# Open export subtab (under Better BibTeX) and enable "Enable export by HTTP". This turns on an HTTP server that can export your Zotero collections.
* Do not download the [https://github.com/beloglazov/zotero-scholar-citations Google Scholar add-on]. This adds citation data to the "extra" column for all papers in Zotero which shows up in some types of reference lists and is a huge pain to undo.
# Open the Automatic export subtab and select "Disabled".
** If you had already imported something from Google Scholar, it is usually easier to find the publication online again and re-import the information using the browser connector than to find and input the missing information manually.
* Note that every time you import a new publication, you will automatically scroll to the newly-added citation. If you are updating information for many citations, keep track of the order so that you don't miss any as you add new citations!
* Zotero allows you to customize what columns of information are shown if you click on the small grid-like icon in the top right of the citations pane (next to the default paperclip icon that shows whether a citation has any attachments). It may be helpful to add columns such as Year or Extra (to make it easier to identify citations that need to be fixed).


== Exporting ==


The below steps may not be necessary, but they help in getting cleaner bib files that won't add much strange junk to your bibliography.
If you want to create a BibTex file from the citations in your folder, right-click your folder in the desktop application, and click "Export Collection". Select BibLaTeX as the format and save to a location of your choice. Be sure to rename the file <code>refs.bib</code> so that it works with the [[CommunityData:Build papers|paper-building workflow]].  
 
* Add "keywords,note,abstract,eprint,isbn,rights,issue" to the "Fields to omit from export (comma-separated)" field. If any of your Zotero entries start adding weird crap to your bibliography you can stop it using this setting.
* I also select DOI for "When a reference has both DOI and an URL, export
 
There are many other Better BibTeX settings, but the above are the ones that I think are useful for use with make.


==== Make sure your Tex file is using BibLaTeX ====
You'll need make and our [[CommunityData:TeX|TeX]] setup to make this work.


If you are using the latest versions of mako's templates[http://projects.mako.cc/source/?p=latex_mako] you should be using BibLaTeX.
If you are using the latest versions of mako's templates[http://projects.mako.cc/source/?p=latex_mako] you should be using BibLaTeX.
Line 57: Line 71:
\addbibresource{refs.bib}   
\addbibresource{refs.bib}   
</code>
</code>
If you aren't using Better BibTeX I suggest merging your document with Mako's template so that you are using the bibliography setup from the template.
==== Modifying Makefile to use your Zotero Collection ====
Make a few small changes to Makefile in paper_template.tex so running make syncs refs.bib to your Zotero collection.
# Identify the url you can use to download your Zotero Collection.
## Right click on the collection and then click on BibLaTeX url.
## There may be two URLs shown. I prefer the one that doesn't have whitespace. Instead of the name of the collection it has a unique identifier. It looks like this: http://localhost:23119/better-bibtex/collection?/1233037/QZE3X7B3.biblatex.
## Copy this url
# Test that the url works by opening it in your browser. You should see your .bib file!
# Now modify the Makefile to download the file and output it to refs.bib. Add these two lines:<br /><code>refs.bib: <br/><TAB> wget -r -q -O refs.bib "your_collection_url"</code> <br/>
# Next add refs.bib to the dependencies of .tex. Add <code>refs.bib</code> to the end of the line that starts with <code>%.tex:</code>
# Finally make refs.bib phony so that you update refs.bib every time you run make. Add <code>refs.bib</code> to the end of the line that starts with <code>.PHONY</code>.
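As a rough alternative to the <code>wget</code> line, the same download can be sketched in Python. This is only an illustration, not part of the paper-building workflow: the function name <code>fetch_bib</code> and its <code>opener</code> parameter are made up for this example, and the URL is the sample one from the steps above, which you would replace with the BibLaTeX URL of your own collection. It assumes Zotero is running with Better BibTeX's HTTP export enabled.

```python
import urllib.request
from pathlib import Path

# Sample URL from the steps above; replace it with your own collection's URL.
COLLECTION_URL = "http://localhost:23119/better-bibtex/collection?/1233037/QZE3X7B3.biblatex"


def fetch_bib(url, dest="refs.bib", opener=urllib.request.urlopen):
    """Download the exported collection at `url` and write it to `dest`.

    `opener` (hypothetical parameter) can be swapped out so the function
    can be tried without Zotero running. Returns the number of bytes written.
    """
    with opener(url) as response:
        data = response.read()
    if not data.strip():
        # An empty reply usually means nothing was exported.
        raise ValueError("empty export; is Zotero running with HTTP export enabled?")
    Path(dest).write_bytes(data)
    return len(data)


# Typical use, with Zotero open on the same machine:
#     fetch_bib(COLLECTION_URL)
```

Unlike the Makefile rule, this does nothing on its own; it would still need to be called from make or run by hand before building the paper.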


That should be it! You should be ready to go! Test it out by running.
That should be it! You should be ready to go! Test it out by running.
<code> make refs.bib </code> and <code> make pdf </code>
<code> make refs.bib </code> and <code> make pdf </code>

Latest revision as of 18:38, 12 September 2024

We use Zotero for citation management!

There's an outline for a short session on how to use Zotero at Zotero/Workshop outline, as well as this recording of a tutorial session (1h long, run by Mako). The video of the tutorial is hosted in Mako's cdsc_only video repository, so it requires a username and password. You can ask anybody in the group, and they should be able to find the credentials by searching their email for "cdsc_only".

Install Zotero

You should register for a free Zotero account if you don't already have one. We recommend installing Zotero on your desktop as well as the "connector" for your browser. The browser plugin is used to quickly add citations from websites instead of manually entering all of the information.

You likely also want to install the Better BibTeX plugin. This will let you do things like automatically rename PDF files and keep .bib files automatically updated.

Group Directory

We have a group directory that we use to collect citations for all shared projects: the Community Data Science Collective Zotero repository. The directory is private, so ask a member of the collective to invite you.

The directory is organized by collections (displayed as folders), with each folder aggregating citations for a particular project or topic.

In addition to the folders for individual researchers, there are 2 general folders:

  • CDSC Publications: For publications by all CDSC members.
  • Duplicate Items: Identifies any duplicates across the group directory. When deciding which duplicate to keep and which to merge, simply choose whichever option contains more information.
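
The "keep whichever contains more information" rule of thumb can be sketched in a few lines of Python. This is only an illustration of the idea: the records are hypothetical dictionaries, not Zotero's actual data model, and the function names are invented for the example. In practice you make this choice by eye in Zotero's duplicate-merging pane.

```python
def filled_fields(record):
    """Count the fields of a record that hold a non-empty value."""
    return sum(
        1 for value in record.values()
        if value is not None and str(value).strip()
    )


def pick_record_to_keep(duplicates):
    """Return the duplicate with the most filled-in fields.

    On a tie, the first record in the list is returned.
    """
    return max(duplicates, key=filled_fields)


# Hypothetical duplicates of the same item (all values are made up).
sparse = {"title": "Example title", "date": "2020", "DOI": "", "publisher": None}
fuller = {"title": "Example title", "date": "2020", "DOI": "10.0000/example", "publisher": "Example"}
best = pick_record_to_keep([sparse, fuller])  # selects `fuller`
```

Counting fields is a crude proxy: a record with fewer but more accurate fields can still be the better one to keep.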

Adding and Organizing References

The process to add something cleanly is complicated but should always include the following steps:

Before you add a source

  1. Sync your local Zotero repository (by clicking the little green symbol that looks like a recycle sign in the top right). This may not be necessary all the time, but it's better to do it!
  2. Check to see if the thing you're about to add is already in the CDSC shared folder by clicking on the top level of the shared group and searching. If it is already there, just drag it into your new subfolder for your project. If it's not, click back on your subfolder and add it.

Once you've added a source

  1. Change the title to sentence case. To do this: (a) right-click on Title → Transform Text → Sentence Case, then (b) re-capitalize any proper nouns (e.g., Scratch, Wikipedia) as well as the first word immediately following a ":". This is important because software like BibTeX/BibLaTeX can change from "Sentence case" to "Title Case" automatically, but not the other way around. APA 6 requires sentence case.
    • Do make sure that any proper nouns are still capitalized after converting the title to sentence case (e.g., places like "United States," websites like "Wikipedia," apps like "Airbnb").
  2. Review and revise the bibliographic record as needed. This varies a bit by publication type and by the data source you've imported the bibliographic metadata from, but here's a minimal set of details that you should check were imported correctly:
    • Item Type: This should match the sort of thing you're importing, such as a book, journal article, magazine article, blog post, etc.
    • Title: The title of the piece itself.
    • Authors, editors, translators: Imported metadata is often pretty messy for these. Do your best to make them right. Wherever possible, defer to the apparent preferences/conventions adopted by authors (e.g., capitalization, spellings, name changes, etc.). When names include characters or diacritics that are not part of the English language, do what you can to incorporate the correct, original characters (copy/paste is your friend here).
    • Publication: For journals, magazines, newspapers, etc.
    • Volume + Issue: Usually only applies to periodicals.
    • Pages: For periodicals, book chapters, or other selections.
    • Date: This is also often a little weird in the metadata. It should match the publication date used by the publication. For books and journals, the year alone is enough. For everything else, there should be a full date (yy/mm/dd or a similar format).
    • DOI (Digital Object Identifier): Ensure that there's a DOI for your entry if one is available. In general, this applies to journals and conference proceedings. Some publishers and conferences (like AAAI, which publishes ICWSM) do not have DOIs, but these are extremely rare. If your publication does not have a DOI, it needs to have a URL because APA 6 requires one or the other, which brings us to:
    • URL: Web addresses should reflect canonical sources (publisher websites, institutional repositories, pre-print servers, etc.) to the extent possible. Personal websites are fine if that seems like the best option (i.e., there's not an archival version anywhere else). Sometimes a URL includes the DOI; if the DOI metadata is otherwise missing, copy the DOI out of the URL into the DOI field.
    • Publisher: According to APA 6 style we should drop common words like "Press," "Publisher," "Inc." as well as first names (i.e., just Wiley, not John Wiley Inc.)
    • Place: APA 6 style requires "City, State" within the USA and "City, Country" outside. So, it's "New York, New York" for the ACM and "Cambridge, UK" for Cambridge University Press.
  3. Remove anything in the "Extra" field unless it's something you want to be printed every time. Sometimes things like "ⓒ JSTOR" sneak in.
  4. Make sure that there's a clearly named PDF attached. You can attach PDFs by: Right-click on the item → Add attachment → Attach Stored Copy of File. Once the PDF is uploaded, you should rename it to Name-YYYY-Short_title.pdf. After your bibliographic record is cleaned up and accurate, you can do this by: Right-click on PDF → Rename File from Parent Metadata.
  5. Ensure that there aren't extraneous files attached. Just delete anything that doesn't look critical or useful. For example, when using the browser plugin, sometimes a 'Snapshot' will be attached, which is just a link to the website you sourced from (which should be in the URL field anyway). Anything left attached will show up in full-text searches, which can be a reason to either leave something or remove it depending on what it is.
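
Two of the checks above are mechanical enough to sketch in Python: converting a title to sentence case while keeping proper nouns capitalized, and pulling a DOI out of a URL. This is a rough illustration rather than anything Zotero itself runs. The function names, the proper-noun list, and the example DOI are all made up, and the DOI pattern is a simplified heuristic, so the result still needs a human check.

```python
import re

# Illustrative list only; extend it with the proper nouns in your own titles.
PROPER_NOUNS = ["Scratch", "Wikipedia", "Airbnb", "United States"]

# Simplified heuristic: "10.", a 4-9 digit prefix, "/", then a suffix that
# runs until whitespace, "?" or "#". Real DOIs can be messier than this.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[^\s?#]+")


def to_sentence_case(title, proper_nouns=PROPER_NOUNS):
    """Lowercase a title, then re-capitalize the first word, the first word
    after each colon, and any listed proper nouns."""
    text = title.lower()
    # Capitalize the first letter of the title and of any subtitle after ":".
    text = re.sub(r"(^|:\s+)(\w)", lambda m: m.group(1) + m.group(2).upper(), text)
    # Restore proper nouns, matching whole words regardless of case.
    for noun in proper_nouns:
        text = re.sub(rf"\b{re.escape(noun)}\b", lambda m: noun, text, flags=re.IGNORECASE)
    return text


def doi_from_url(url):
    """Return the first DOI-looking substring of `url`, or None if there is none."""
    match = DOI_PATTERN.search(url)
    return match.group(0) if match else None


title = to_sentence_case("Learning To Code On Scratch: A Study Of Wikipedia Editors")
# title == "Learning to code on Scratch: A study of Wikipedia editors"
doi = doi_from_url("https://doi.org/10.0000/example.123")  # made-up DOI
# doi == "10.0000/example.123"
```

The noun list cannot tell "Scratch" the platform from "scratch" the ordinary word, which is one reason the manual review in step 1 still matters.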

Tips and Tricks

  • In Zotero, folders are not like traditional folders. They are like tags. Do not add things if they are already there!
  • Holding down the Option key (macOS), Ctrl key (Windows), or Alt (Zotero 6)/Ctrl (Zotero 7) key (Linux) is extremely useful! It will show you which folders the selected item is in!
  • When you're done adding a bunch of things, look at the Duplicate pseudo-folder underneath the Community Data Science Collective folder.
  • Do not download the Google Scholar add-on. This adds citation data to the "Extra" field for all papers in Zotero, which shows up in some types of reference lists and is a huge pain to undo.
    • If you have already imported something from Google Scholar, it is usually easier to find the publication online again and re-import the information using the browser connector than to find and input the missing information manually.
  • Note that every time you import a new publication, you will automatically scroll to the newly-added citation. If you are updating information for many citations, keep track of the order so that you don't miss any as you add new citations!
  • Zotero allows you to customize what columns of information are shown if you click on the small grid-like icon in the top right of the citations pane (next to the default paperclip icon that shows whether a citation has any attachments). It may be helpful to add columns such as Year or Extra (to make it easier to identify citations that need to be fixed).

Exporting

If you want to create a .bib file from the citations in your folder, right-click your folder in the desktop application and click "Export Collection". Select BibLaTeX as the format and save to a location of your choice. Be sure to name the file refs.bib so that it works with the paper-building workflow.

You'll need make and our TeX setup to make this work.

If you are using the latest versions of Mako's templates [1], you should be using BibLaTeX.

You should see these lines:

\usepackage[natbib=true,style=apa,backend=biber]{biblatex}

\addbibresource{refs.bib}
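
If you would rather check this automatically than by eye, here is a minimal Python sketch. The function name uses_biblatex is invented for this example, and the check is deliberately naive: it only searches the text for the two commands shown above and does not notice whether they have been commented out.

```python
import re


def uses_biblatex(tex_source, bib_file="refs.bib"):
    """Return True if the TeX source loads biblatex and adds `bib_file`
    as a bibliography resource."""
    # \usepackage{biblatex}, with or without [options].
    loads_biblatex = re.search(r"\\usepackage(\[[^\]]*\])?\{biblatex\}", tex_source)
    # \addbibresource{refs.bib} (or whatever `bib_file` is set to).
    adds_resource = re.search(r"\\addbibresource\{" + re.escape(bib_file) + r"\}", tex_source)
    return bool(loads_biblatex and adds_resource)


# The two lines shown above, as they would appear in a preamble.
preamble = (
    r"\usepackage[natbib=true,style=apa,backend=biber]{biblatex}" + "\n"
    + r"\addbibresource{refs.bib}" + "\n"
)
ok = uses_biblatex(preamble)
```

To check a real file, read it first, for example with pathlib.Path("paper.tex").read_text(), where paper.tex stands in for your own file name.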

That should be it! You should be ready to go! Test it out by running make refs.bib and make pdf.