Naturalness and rapport in a pitch adaptive learning companion

Nichola Lubold; Heather Pon-Barry; Erin Walker

doi:10.1109/ASRU.2015.7404781

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

29 Scopus citations

Abstract

Entrainment, observed frequently in human-human interactions, is a social phenomenon in which speakers become more like each other over the course of a conversation. Acoustic-prosodic entrainment occurs when individuals adapt their acoustic-prosodic speech features, such as pitch and intensity. Entrainment is correlated with communicative success, naturalness, and conversational flow, as well as with social variables such as rapport, so a dialogue system that automatically entrains has the potential to improve verbal interactions by increasing rapport, naturalness, and conversational flow. In an application like the learning companion, such a socially responsive dialogue system may improve learning and motivation. However, it is not clear how to produce entrainment in an automatic dialogue system in ways that reproduce the effects seen in human-human dialogue. In this paper, we take the first steps towards implementing a spoken dialogue system that can entrain. We propose three methods of pitch adaptation based on analysis of human entrainment, and we design and implement a system that adaptively manipulates the pitch of text-to-speech output. We find a clear relationship between perceptions of rapport and different forms of pitch adaptation. Certain adaptations are perceived as significantly more natural and rapport-like. Ultimately, adapting by shifting the pitch contour of the text-to-speech output by the mean pitch of the user results in the highest reported measures of rapport and naturalness.
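The abstract does not spell out how the shift by the user's mean pitch is computed, so the following is only an illustrative sketch of one plausible reading: add a constant offset to the text-to-speech pitch contour so that its mean over voiced frames matches the user's. The function names, the convention that 0 marks an unvoiced frame, and the `floor_hz` safeguard are all assumptions made for this example and are not taken from the paper.

```python
from statistics import fmean


def voiced_mean(f0_hz):
    """Mean pitch in Hz over voiced frames (assumed convention: 0 = unvoiced)."""
    voiced = [f for f in f0_hz if f > 0]
    if not voiced:
        raise ValueError("contour has no voiced frames")
    return fmean(voiced)


def shift_to_user_mean(tts_f0_hz, user_f0_hz, floor_hz=50.0):
    """Shift the TTS contour by a constant so its voiced mean matches the user's.

    Hypothetical helper: unvoiced frames stay at 0, and shifted values are
    clamped at floor_hz so a large downward shift cannot turn a voiced frame
    into a non-positive one. When clamping occurs, the means match only
    approximately.
    """
    offset = voiced_mean(user_f0_hz) - voiced_mean(tts_f0_hz)
    return [max(f + offset, floor_hz) if f > 0 else 0.0 for f in tts_f0_hz]


# Toy contours in Hz, one value per analysis frame (made-up numbers).
tts_contour = [100.0, 0.0, 120.0, 140.0]
user_contour = [200.0, 220.0, 0.0, 240.0]
adapted = shift_to_user_mean(tts_contour, user_contour)
```

A constant offset preserves the shape of the synthesized contour and only moves its register, which is why it is used here as the simplest instance of a mean-based shift. A real system would also need a pitch tracker for the user's speech and a resynthesis step for the adapted contour, both of which are outside this sketch.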

Original language	English (US)
Title of host publication	2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015 - Proceedings
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	103-110
Number of pages	8
ISBN (Electronic)	9781479972913
DOIs	https://doi.org/10.1109/ASRU.2015.7404781
State	Published - Feb 10 2016
Event	IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015 - Scottsdale, United States
Duration	Dec 13 2015 → Dec 17 2015

Publication series

Name	2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015 - Proceedings

Other

Other	IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015
Country/Territory	United States
City	Scottsdale
Period	12/13/15 → 12/17/15

Keywords

adaptation
dialogue system
naturalness
pitch
rapport

ASJC Scopus subject areas

Artificial Intelligence
Computer Networks and Communications
Computer Vision and Pattern Recognition

Access to Document

10.1109/ASRU.2015.7404781

Cite this

Lubold, N., Pon-Barry, H., & Walker, E. (2016). Naturalness and rapport in a pitch adaptive learning companion. In 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015 - Proceedings (pp. 103-110). Article 7404781. Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ASRU.2015.7404781
