TY - GEN
T1 - Naturalness and rapport in a pitch adaptive learning companion
AU - Lubold, Nichola
AU - Pon-Barry, Heather
AU - Walker, Erin
N1 - Publisher Copyright:
© 2015 IEEE.
PY - 2016/2/10
Y1 - 2016/2/10
N2 - Observed frequently in human-human interactions, entrainment is a social phenomenon in which speakers become more like each other over the course of a conversation. Acoustic-prosodic entrainment occurs when individuals adapt their acoustic-prosodic speech features, such as pitch and intensity. Because entrainment is correlated with communicative success, naturalness, and conversational flow as well as social variables such as rapport, a dialogue system which automatically entrains has the potential to improve verbal interactions by increasing rapport, naturalness, and conversational flow. In an application like the learning companion, such a socially responsive dialogue system may improve learning and motivation. However, it is not clear how to produce entrainment in an automatic dialogue system in ways that produce the effects seen in human-human dialogue. In this paper, we take the first steps towards implementing a spoken dialogue system which can entrain. We propose three methods of pitch adaptation based on analysis of human entrainment, and design and implement a system which can manipulate the pitch of text-to-speech output adaptively. We find a clear relationship between perceptions of rapport and different forms of pitch adaptations. Certain adaptations are perceived as significantly more natural and rapport-like. Ultimately, adapting by shifting the pitch contour of the text-to-speech output by the mean pitch of the user results in the highest reported measures of rapport and naturalness.
KW - adaptation
KW - dialogue system
KW - naturalness
KW - pitch
KW - rapport
UR - http://www.scopus.com/inward/record.url?scp=84964563990&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84964563990&partnerID=8YFLogxK
U2 - 10.1109/ASRU.2015.7404781
DO - 10.1109/ASRU.2015.7404781
M3 - Conference contribution
AN - SCOPUS:84964563990
T3 - 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015 - Proceedings
SP - 103
EP - 110
BT - 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015
Y2 - 13 December 2015 through 17 December 2015
ER -