Training parents and other primary caregivers in pivotal response treatment (PRT) has been shown to help children with autism increase their communication skills. The training is most effective when the parent maintains a high degree of fidelity to the PRT methodology. Evaluation of a parent's implementation is currently limited to manual review of PRT sessions by a trained clinician. This process is time-consuming and limits the amount of feedback that can be provided; it also makes long-term support for parents who have undergone training difficult. Automated data extraction and analysis would reduce the cost of providing feedback to parents. Since vocal communication is one of the most common target skills in PRT implementation, audio analysis is critical to a successful feedback system. Speech patterns in PRT sessions differ from typical speech in ways that challenge audio analysis systems: adults involved in the treatment often use child-directed language and exaggerated exclamations to engage the child. Child speech recognition is itself a difficult problem, and it is compounded when children have limited vocal expression. Additionally, PRT sessions involve joint play activities, which often produce loud, sustained noise. To address these challenges, audio classification techniques were explored to determine a methodology for labeling audio segments in videos of PRT sessions. By training separate support vector machine (SVM) classifiers for speech activity detection and speaker separation, an average accuracy of 79% was achieved.
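The two-stage classification described above could be sketched as follows. This is a minimal illustration, not the authors' implementation: the feature vectors are random placeholders standing in for per-segment acoustic features (e.g. MFCC statistics), and the labels, kernel choice, and `label_segment` helper are all assumptions made for the example.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Placeholder "features" for 200 training segments; a real system would
# extract acoustic features (e.g. 13-dim MFCC means) from each segment.
X = rng.normal(size=(200, 13))
is_speech = rng.integers(0, 2, size=200)  # 1 = speech, 0 = noise/silence
speaker = rng.integers(0, 2, size=200)    # 1 = adult, 0 = child (hypothetical)

# Stage 1: speech activity detection (speech vs. non-speech).
sad_clf = SVC(kernel="rbf").fit(X, is_speech)

# Stage 2: speaker separation, trained only on the speech segments.
speech_mask = is_speech == 1
spk_clf = SVC(kernel="rbf").fit(X[speech_mask], speaker[speech_mask])

def label_segment(features):
    """Return 'non-speech', 'child', or 'adult' for one feature vector."""
    if sad_clf.predict(features.reshape(1, -1))[0] == 0:
        return "non-speech"
    return "adult" if spk_clf.predict(features.reshape(1, -1))[0] == 1 else "child"

print(label_segment(X[0]))
```

Cascading the two classifiers keeps each decision boundary simple: the speaker model never has to account for non-speech segments, since those are filtered out by the first stage.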