Inverse statistical methods for analyzing utility billing data and/or hourly or daily monitored energy consumption data are increasingly being adopted by building professionals engaged in improving the energy efficiency of existing buildings. The field is quite mature but still evolving as it adapts to the surge of energy interval data provided by smart meters. For example, many of the simpler physics-based models relying on regression methods are being enhanced by machine learning or data mining techniques coupled with knowledge bases of energy use data from different generic building types and other types of data streams. Although the published literature is quite sophisticated, it seems to lack robust and sensitive methods for some of the simpler analysis tasks. Two such tasks are the automated identification and removal of outlier data points from days when the building is operated abnormally, and the identification of day types or, more correctly, of a parsimonious set of building operational and occupancy schedules. This paper proposes a methodology to perform those tasks (which can be automated) using robust techniques borrowed from the data mining literature. These tasks are essential if one wishes to subsequently use the data to operate buildings more efficiently, for example, to identify baseline models, to perform condition monitoring, or to predict short-term loads for demand response programs. Such issues are addressed in a companion paper (Jalori and Reddy 2015). The application of the proposed methodology is illustrated using two buildings: one synthetic (the U.S. Department of Energy (DOE) medium-office prototype building) and the other an actual office building.