Text-to-Speech Software and Learning

Investigating the Relevancy of the Voice Effect

Scotty Craig, Noah L. Schroeder

Research output: Contribution to journal › Article

Abstract

Technology advances quickly in today’s society. This is particularly true in regard to instructional multimedia. One increasingly important aspect of instructional multimedia design is determining the type of voice that will provide the narration; however, research in the area is dated and limited in scope. Using a randomized pretest–posttest design, we examined the efficacy of learning from an instructional animation where narration was provided by an older text-to-speech engine, a modern text-to-speech engine, or a recorded human voice. In most respects, those who learned from the modern text-to-speech engine were not statistically different in regard to their perceptions, learning outcomes, or cognitive efficiency measures compared with those who learned from the recorded human voice. Our results imply that software technologies may have reached a point where they can credibly and effectively deliver the narration for multimedia learning environments.

Original language: English (US)
Journal: Journal of Educational Computing Research
DOI: 10.1177/0735633118802877
State: Accepted/In press - Jan 1 2018
Externally published: Yes

Keywords

  • multimedia learning
  • narration
  • synthesized voice
  • voice effect

ASJC Scopus subject areas

  • Education
  • Computer Science Applications

Cite this

@article{e2f67b7753744b5082ece89ce3fb39d3,
title = "Text-to-Speech Software and Learning: Investigating the Relevancy of the Voice Effect",
abstract = "Technology advances quickly in today’s society. This is particularly true in regard to instructional multimedia. One increasingly important aspect of instructional multimedia design is determining the type of voice that will provide the narration; however, research in the area is dated and limited in scope. Using a randomized pretest–posttest design, we examined the efficacy of learning from an instructional animation where narration was provided by an older text-to-speech engine, a modern text-to-speech engine, or a recorded human voice. In most respects, those who learned from the modern text-to-speech engine were not statistically different in regard to their perceptions, learning outcomes, or cognitive efficiency measures compared with those who learned from the recorded human voice. Our results imply that software technologies may have reached a point where they can credibly and effectively deliver the narration for multimedia learning environments.",
keywords = "multimedia learning, narration, synthesized voice, voice effect",
author = "Craig, Scotty and Schroeder, {Noah L.}",
year = "2018",
month = "1",
day = "1",
doi = "10.1177/0735633118802877",
language = "English (US)",
journal = "Journal of Educational Computing Research",
issn = "0735-6331",
publisher = "Baywood Publishing Co. Inc.",
}

TY - JOUR

T1 - Text-to-Speech Software and Learning

T2 - Investigating the Relevancy of the Voice Effect

AU - Craig, Scotty

AU - Schroeder, Noah L.

PY - 2018/1/1

Y1 - 2018/1/1

AB - Technology advances quickly in today’s society. This is particularly true in regard to instructional multimedia. One increasingly important aspect of instructional multimedia design is determining the type of voice that will provide the narration; however, research in the area is dated and limited in scope. Using a randomized pretest–posttest design, we examined the efficacy of learning from an instructional animation where narration was provided by an older text-to-speech engine, a modern text-to-speech engine, or a recorded human voice. In most respects, those who learned from the modern text-to-speech engine were not statistically different in regard to their perceptions, learning outcomes, or cognitive efficiency measures compared with those who learned from the recorded human voice. Our results imply that software technologies may have reached a point where they can credibly and effectively deliver the narration for multimedia learning environments.

KW - multimedia learning

KW - narration

KW - synthesized voice

KW - voice effect

UR - http://www.scopus.com/inward/record.url?scp=85059340445&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85059340445&partnerID=8YFLogxK

U2 - 10.1177/0735633118802877

DO - 10.1177/0735633118802877

M3 - Article

JO - Journal of Educational Computing Research

JF - Journal of Educational Computing Research

SN - 0735-6331

ER -