TY - GEN
T1 - COVID-19 Detection using Audio Spectral Features and Machine Learning
AU - Esposito, Michael
AU - Rao, Sunil
AU - Narayanaswamy, Vivek
AU - Spanias, Andreas
N1 - Funding Information:
We began this REU study in the spring of 2020 by first providing “bootcamp” training in research protocols, digital signal processing (DSP), and machine learning (ML) basics. The REU engagement was virtual due to pandemic restrictions. Online lectures and hands-on programming activities covered introductory content in DSP, speech processing, and machine learning. Lecture topics included the basics of fast Fourier transforms, analog-to-digital conversion and sampling theory, basic properties of speech and audio signals, and fundamentals of machine learning. Initial hands-on DSP activities were supported by the lab book content [11] and simulations in the Java-DSP (J-DSP) [12] object-oriented environment. Specific preparation for audio included understanding and identifying harmonics, formants, time- and frequency-domain representations of audio [13], spectrograms, and Mel-frequency cepstral coefficients (MFCCs) [14]. The initial interactive machine learning training activities used specific J-DSP ML functions for k-means clustering and sound recognition from the spectrum [15], followed by an introduction to MATLAB. MATLAB was used to implement the k-means algorithm [16] and to understand the basics of neural network classification. While simple ML concepts and algorithms were introduced in J-DSP and MATLAB [17], it became critical for the REU student to gain skills in Python programming. Sample ML Python code was provided by the SenSIP center labs, and a formal online course in the basics of Python [18] was completed as part of this training. The course covered Python syntax with quizzes and short projects. In addition to this formal course [18], simulations were run using Python ML packages [19] to provide hands-on experience.
Once pre-training was completed, specific tasks included: a) a literature review of recent audio-based efforts toward COVID-19 detection, b) building knowledge of feature extraction, and c) understanding ML and, more specifically, neural network algorithms. The REU student worked with two PhD students and a faculty member at ASU and held weekly meetings with informal progress presentations. Feedback was provided at every step, and the student documented his progress in a working document. The student gave quarterly formal presentations to faculty and industry members of the ASU SenSIP industry-university center to sharpen his research presentation skills. The REU student also gave a formal presentation at an undergraduate research conference, NCUR 2021, which was held as a virtual meeting that year [20].
Publisher Copyright:
© 2021 IEEE.
PY - 2021
Y1 - 2021
N2 - In this research and education REU project, we use audio waveform signatures of coughing to investigate whether COVID-19 can be diagnosed from cough sounds. More specifically, we extract spectral features from cough audio and use neural network architectures to develop diagnostics for COVID-19. The non-invasive, rapid, and remote nature of this approach, relative to existing nasal swab, saliva, and blood tests, makes it attractive for deployment on smartphones. Challenges include distorted or low-quality audio samples, limited availability of reliably labeled data, confusability with other respiratory diseases, and the lack of baseline (healthy) audio recordings for comparison. We studied, compared, tuned, and implemented an array of convolutional neural network architectures in Python. Results using a unique parallel machine learning architecture with a fusion unit are presented.
AB - In this research and education REU project, we use audio waveform signatures of coughing to investigate whether COVID-19 can be diagnosed from cough sounds. More specifically, we extract spectral features from cough audio and use neural network architectures to develop diagnostics for COVID-19. The non-invasive, rapid, and remote nature of this approach, relative to existing nasal swab, saliva, and blood tests, makes it attractive for deployment on smartphones. Challenges include distorted or low-quality audio samples, limited availability of reliably labeled data, confusability with other respiratory diseases, and the lack of baseline (healthy) audio recordings for comparison. We studied, compared, tuned, and implemented an array of convolutional neural network architectures in Python. Results using a unique parallel machine learning architecture with a fusion unit are presented.
KW - COVID-19
KW - cough audio
KW - machine learning
KW - neural networks
KW - spectral features
KW - tachypnea
UR - http://www.scopus.com/inward/record.url?scp=85117492843&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85117492843&partnerID=8YFLogxK
U2 - 10.1109/IEEECONF53345.2021.9723323
DO - 10.1109/IEEECONF53345.2021.9723323
M3 - Conference contribution
AN - SCOPUS:85117492843
T3 - Conference Record - Asilomar Conference on Signals, Systems and Computers
SP - 1146
EP - 1150
BT - 55th Asilomar Conference on Signals, Systems and Computers, ACSSC 2021
A2 - Matthews, Michael B.
PB - IEEE Computer Society
T2 - 55th Asilomar Conference on Signals, Systems and Computers, ACSSC 2021
Y2 - 31 October 2021 through 3 November 2021
ER -