UncommonVoice: A crowdsourced dataset of dysphonic speech

Meredith Moore; Piyush Papreja; Michael Saxon; Visar Berisha; Sethuraman Panchanathan

doi:10.21437/Interspeech.2020-3093

Research output: Contribution to journal › Conference article › peer-review

3 Scopus citations

Abstract

To facilitate more accessible spoken language technologies and advance the study of dysphonic speech, this paper presents UncommonVoice, a freely available, crowdsourced speech corpus consisting of 8.5 hours of speech from 57 individuals, 48 of whom have spasmodic dysphonia. The speech material consists of non-words (prolonged vowels and the prompt for diadochokinetic rate), sentences (randomly selected from TIMIT prompts and the CAPE-V intelligibility analysis), and spontaneous image descriptions. The data was recorded in a crowdsourced manner using a web-based application. This dataset is a fundamental resource for the development of voice-assistive technologies for individuals with dysphonia as well as for improving the accessibility of voice-based technologies (automatic speech recognition, virtual assistants, etc.). Research on articulation differences, as well as on how best to model and represent dysphonic speech, will greatly benefit from a free and publicly available dataset of dysphonic speech. The dataset will be made freely and publicly available at www.uncommonvoice.org. In the following sections, we detail the data collection process and provide an initial analysis of the speech corpus.

Original language	English (US)
Pages (from-to)	2532-2536
Number of pages	5
Journal	Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume	2020-October
DOIs	https://doi.org/10.21437/Interspeech.2020-3093
State	Published - 2020
Event	21st Annual Conference of the International Speech Communication Association, INTERSPEECH 2020 - Shanghai, China
Duration	Oct 25 2020 → Oct 29 2020

Keywords

Dataset human-computer interaction
Spasmodic dysphonia
Voice disorder

ASJC Scopus subject areas

Language and Linguistics
Human-Computer Interaction
Signal Processing
Software
Modeling and Simulation

Cite this

@article{5e7775aa5b2a45c6a888374fa186bca1,

title = "UncommonVoice: A crowdsourced dataset of dysphonic speech",

abstract = "To facilitate more accessible spoken language technologies and advance the study of dysphonic speech this paper presents UncommonVoice, a freely-available, crowd-sourced speech corpus consisting of 8.5 hours of speech from 57 individuals, 48 of whom have spasmodic dysphonia. The speech material consists of non-words (prolonged vowels, and the prompt for diadochokinetic rate), sentences (randomly selected from TIMIT prompts and the CAPE-V intelligibility analysis), and spontaneous image descriptions. The data was recorded in a crowdsourced manner using a web-based application. This dataset is a fundamental resource for the development of voice-assistive technologies for individuals with dysphonia as well as the enhancement of the accessibility of voice-based technologies (automatic speech recognition, virtual assistants, etc). Research on articulation differences as well as how best to model and represent dysphonic speech will greatly benefit from a free and publicly available dataset of dysphonic speech. The dataset will be made freely and publicly available at www.uncommonvoice.org. In the following sections, we detail the data collection process as well as provide an initial analysis of the speech corpus.",

keywords = "Dataset human-computer interaction, Spasmodic dysphonia, Voice disorder",

author = "Meredith Moore and Piyush Papreja and Michael Saxon and Visar Berisha and Sethuraman Panchanathan",

note = "Funding Information: The authors would like to acknowledge the National Spasmodic Dysphonia Association for their support throughout the development of UncommonVoice, particularly for their effort in recruiting speakers for UncommonVoice. Also a special thank you to the National Science Foundation Graduate Research Fellowship. Publisher Copyright: Copyright {\textcopyright} 2020 ISCA; 21st Annual Conference of the International Speech Communication Association, INTERSPEECH 2020 ; Conference date: 25-10-2020 Through 29-10-2020",

year = "2020",

doi = "10.21437/Interspeech.2020-3093",

language = "English (US)",

volume = "2020-October",

pages = "2532--2536",

journal = "Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH",

issn = "2308-457X",

}

TY - JOUR

T1 - UncommonVoice: A crowdsourced dataset of dysphonic speech

T2 - 21st Annual Conference of the International Speech Communication Association, INTERSPEECH 2020

AU - Moore, Meredith

AU - Papreja, Piyush

AU - Saxon, Michael

AU - Berisha, Visar

AU - Panchanathan, Sethuraman

N1 - Funding Information: The authors would like to acknowledge the National Spasmodic Dysphonia Association for their support throughout the development of UncommonVoice, particularly for their effort in recruiting speakers for UncommonVoice. Also a special thank you to the National Science Foundation Graduate Research Fellowship. Publisher Copyright: Copyright © 2020 ISCA

PY - 2020

Y1 - 2020

N2 - To facilitate more accessible spoken language technologies and advance the study of dysphonic speech this paper presents UncommonVoice, a freely-available, crowd-sourced speech corpus consisting of 8.5 hours of speech from 57 individuals, 48 of whom have spasmodic dysphonia. The speech material consists of non-words (prolonged vowels, and the prompt for diadochokinetic rate), sentences (randomly selected from TIMIT prompts and the CAPE-V intelligibility analysis), and spontaneous image descriptions. The data was recorded in a crowdsourced manner using a web-based application. This dataset is a fundamental resource for the development of voice-assistive technologies for individuals with dysphonia as well as the enhancement of the accessibility of voice-based technologies (automatic speech recognition, virtual assistants, etc). Research on articulation differences as well as how best to model and represent dysphonic speech will greatly benefit from a free and publicly available dataset of dysphonic speech. The dataset will be made freely and publicly available at www.uncommonvoice.org. In the following sections, we detail the data collection process as well as provide an initial analysis of the speech corpus.

AB - To facilitate more accessible spoken language technologies and advance the study of dysphonic speech this paper presents UncommonVoice, a freely-available, crowd-sourced speech corpus consisting of 8.5 hours of speech from 57 individuals, 48 of whom have spasmodic dysphonia. The speech material consists of non-words (prolonged vowels, and the prompt for diadochokinetic rate), sentences (randomly selected from TIMIT prompts and the CAPE-V intelligibility analysis), and spontaneous image descriptions. The data was recorded in a crowdsourced manner using a web-based application. This dataset is a fundamental resource for the development of voice-assistive technologies for individuals with dysphonia as well as the enhancement of the accessibility of voice-based technologies (automatic speech recognition, virtual assistants, etc). Research on articulation differences as well as how best to model and represent dysphonic speech will greatly benefit from a free and publicly available dataset of dysphonic speech. The dataset will be made freely and publicly available at www.uncommonvoice.org. In the following sections, we detail the data collection process as well as provide an initial analysis of the speech corpus.

KW - Dataset human-computer interaction

KW - Spasmodic dysphonia

KW - Voice disorder

UR - http://www.scopus.com/inward/record.url?scp=85098128165&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85098128165&partnerID=8YFLogxK

U2 - 10.21437/Interspeech.2020-3093

DO - 10.21437/Interspeech.2020-3093

M3 - Conference article

AN - SCOPUS:85098128165

SN - 2308-457X

VL - 2020-October

SP - 2532

EP - 2536

JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Y2 - 25 October 2020 through 29 October 2020

ER -
