Serious games show promise as an effective training method, but such games are complex and few guidelines exist for evaluating them effectively. We draw on the design science literature to develop a serious game evaluation framework that emphasizes grounding evaluation in each of four key areas: theoretical, technical, empirical, and external. We further recommend that serious game developers adopt an iterative, adaptive approach to grounding an evaluation effort in these four areas, emphasizing some areas more than others at different stages of the development cycle. We illustrate our framework using a case study of a large-scale serious game development project. The case study demonstrates a holistic approach to serious game evaluation that is valuable to both researchers and practitioners.