Otter is a software service that automates transcription of audio and video files. This is very helpful in qualitative research, such as interview studies, that relies on transcripts of audio. It could also be helpful in quantitative content analyses that require text as data from video or audio files (e.g. podcasts or videos).

We currently (2023) have a subscription to a Business account, ending/renewing around February 20 each year. This account has one user seat that we share.

Access to the CDSC account

Multiple projects under the ecology grant benefited from this software service, and the username/password was shared with the ecology-group listserv via email in Feb 2021. You can get in touch with anyone in the ecology group for login details.

Best practices

  • We have one user seat on a business account that gives us 6,000 min of transcription a month. This is 100 hours. While this is generous, since there are multiple people using the same account, please be mindful of how many minutes you consume in a month.
  • If your audio contains sensitive and private information (e.g. health information), this may not be the best service for you to use. We know that Otter uses uploaded audio (in aggregate form) to train its AI model.
  • Look through the data security section below to help you decide.
  • For now, as best practice:
    • (1) *decouple your audio files* from any identifying metadata as much as possible,
    • (2) upload your audio, fix any issues in the generated transcript once it has been processed (e.g. transcription errors), and then download the corrected transcript to your secure machine, and
    • (3) then delete the transcript and the audio from the system as soon as possible.
    • Alternatively, you can download the transcript generated by Otter without correcting it first, and then fix it up on your personal machine. But in Sohyeon's experience, fixing the transcript in the Otter interface and then downloading the final corrected version is easier (because the text is matched to timestamps).
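
Step (1) above can be partly automated. The following is a minimal sketch, not an established lab procedure: it copies audio files under random names before upload and writes the key that links the new names to the originals to a CSV that stays on your secure machine. The function name `pseudonymize_audio`, the `rec_` prefix, and the CSV column names are all hypothetical choices made for illustration. It only handles file names; tags embedded inside the audio (e.g. ID3 title or artist fields) are not touched and would need to be checked separately.

```python
import csv
import secrets
import shutil
from pathlib import Path


def pseudonymize_audio(src_dir, out_dir, key_path):
    """Copy every file in src_dir to out_dir under a random name.

    Writes a CSV key (original name -> upload name) to key_path and
    returns the same mapping as a dict. Hypothetical helper for
    illustration; keep the key file off the transcription service.
    """
    src_dir, out_dir, key_path = Path(src_dir), Path(out_dir), Path(key_path)
    out_dir.mkdir(parents=True, exist_ok=True)

    mapping = {}
    for src in sorted(p for p in src_dir.iterdir() if p.is_file()):
        suffix = src.suffix.lower()
        new_name = f"rec_{secrets.token_hex(4)}{suffix}"
        # Extremely unlikely, but re-draw if the random name is taken.
        while (out_dir / new_name).exists():
            new_name = f"rec_{secrets.token_hex(4)}{suffix}"
        # copyfile copies contents only, not timestamps or permissions.
        shutil.copyfile(src, out_dir / new_name)
        mapping[src.name] = new_name

    with key_path.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["original_name", "upload_name"])
        writer.writerows(mapping.items())
    return mapping
```

For example, `pseudonymize_audio("interviews", "to_upload", "name_key.csv")` would leave the originals in place, put renamed copies in `to_upload`, and store the key in `name_key.csv`. Only the contents of `to_upload` would then go to the service.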

Data security on Otter

On 24 February 2021, Sohyeon asked Otter support via email about data security. Their response:

Security at Otter is extremely important to us. You trust Otter to keep your data secure, and responsible custodianship of your data is one of our core values.
Otter files at rest are encrypted using 256-bit Advanced Encryption Standard (AES). We use Secure Sockets Layer (SSL)/Transport Layer Security (TLS) to protect data in transit between Otter apps and our servers located in North America. SSL/TLS creates a secure tunnel protected by 128-bit or higher AES encryption.
We do not sell or share your data with third parties, nor access your data without your explicit permission. You also have full control to delete your conversations. Deleting a conversation permanently deletes it from Otter's servers, and can't be undone.
See our Security Whitepaper
  • Certifications
    • SOC 2 Type 1 certified
    • SOC 2 Type 2 (targeting a completion date of later summer 2021)
See our Terms of Service to review our contract with you and our security practices.
See our Privacy Policy for more information on the information we collect and how it may be used.
See our California Resident Privacy Notice for information to individuals residing in California and compliance with the California Consumer Privacy Act (CCPA).
See a list of subprocessors with access to certain customer data.

Questions and problems

Feel free to ping Sohyeon with any questions or issues. The support team is also pretty responsive, in Sohyeon's experience, especially if you mention being on a Business account.

Other options

LLMs have turned out to be very good at transcribing audio, and there are a number of new transcription services. For less sensitive data, you may want to consider: