In simulation-based surgical training, a key task is to rate the performance of the operator, which is currently done by senior surgeons. This practice is costly, and objective, quantifiable assessment metrics are often missing. Researchers have therefore been working toward automated systems that achieve computational understanding of surgical skills, largely through analysis of motion data captured by video or other sensors. In this paper, we extend the Hierarchical Dirichlet Process Hidden Markov Model (HDP-HMM) for this purpose. We first detect spatio-temporal interest points in the video capturing the tool motion of an operator, and then generate visual words from the descriptors of those interest points. For each frame, we construct a histogram over the associated interest points, i.e., a "bag of words", so that every video is represented as a sequence of histograms. For the sequences of each motion expertise level, we infer an HDP-HMM. Finally, a test sequence is classified by choosing the model that maximizes the likelihood of that sequence. Compared with other action recognition algorithms, such as the kernel SVM, our method yields better results. Furthermore, the proposed approach provides important cues on the motion patterns characteristic of each expertise level.
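The final classification step described above — scoring a test sequence under one model per expertise level and picking the maximum-likelihood model — can be sketched as follows. This is an illustrative simplification, not the paper's implementation: it uses fixed-parameter discrete HMMs (evaluated with the standard scaled forward algorithm) as stand-ins for the inferred HDP-HMMs, and the expertise labels, parameters, and symbol alphabet are hypothetical.

```python
import numpy as np

def forward_loglik(obs, start, trans, emit):
    """Log-likelihood of a discrete observation sequence under an HMM,
    computed with the forward algorithm (rescaled at each step to avoid
    numerical underflow on long sequences)."""
    alpha = start * emit[:, obs[0]]        # initial forward probabilities
    logp = np.log(alpha.sum())
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ trans) * emit[:, o]  # propagate and weight by emission
        s = alpha.sum()
        logp += np.log(s)
        alpha /= s
    return logp

def classify(obs, models):
    """Assign the expertise level whose model maximizes the sequence likelihood.
    `models` maps a label to (start, trans, emit) parameters."""
    return max(models, key=lambda lvl: forward_loglik(obs, *models[lvl]))

# Hypothetical single-state models over a binary visual-word alphabet:
# "novice" favors symbol 0, "expert" favors symbol 1.
models = {
    "novice": (np.array([1.0]), np.array([[1.0]]), np.array([[0.9, 0.1]])),
    "expert": (np.array([1.0]), np.array([[1.0]]), np.array([[0.1, 0.9]])),
}
print(classify([1, 1, 1, 1], models))  # → expert
```

In the paper's setting, the discrete symbols would be the visual-word histogram observations and the per-level models would be the inferred HDP-HMMs, but the argmax-over-likelihoods decision rule is the same.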