As manufacturing transitions to real-time sensing, it becomes increasingly important to handle multiple high-dimensional, non-stationary time series that generate thousands of measurements for each batch. Predictive models often struggle with such high-dimensional data, so reducing the dimensionality is important for good performance. With thousands of measurements, even wavelet coefficients do not reduce the dimensionality sufficiently. We propose a two-stage method. In the first (screening) stage, energy statistics from a discrete wavelet transform are used to identify the relevant process variables and the appropriate resolutions of wavelet coefficients, based on variable importance scores from a modern random forest classifier. In the second stage, the coefficients that correspond to the identified variables and resolutions are selected as inputs to a predictive model. The approach is shown to provide good performance, along with interpretable results, in an example where multiple time series are used to indicate the need for preventive maintenance. In general, the two-stage approach can handle high dimensionality while still providing interpretable features, linked to the relevant process variables and wavelet resolutions, that can be used for further analysis.
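The two-stage idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Haar wavelet, scikit-learn's standard `RandomForestClassifier`, the `top_k` selection rule, the synthetic data, and every function name below are assumptions made for illustration only.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def haar_dwt(signal, levels):
    """Orthonormal Haar DWT. Returns [d_1, ..., d_levels, a_levels],
    finest detail band first. Length must be divisible by 2**levels."""
    approx = np.asarray(signal, dtype=float)
    coeffs = []
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        coeffs.append((even - odd) / np.sqrt(2.0))  # detail coefficients
        approx = (even + odd) / np.sqrt(2.0)        # coarser approximation
    coeffs.append(approx)
    return coeffs


def energy_features(X, levels):
    """X has shape (n_batches, n_vars, n_time). Returns one energy
    (sum of squared coefficients) per (variable, band) pair."""
    n, p, _ = X.shape
    E = np.empty((n, p, levels + 1))
    for i in range(n):
        for j in range(p):
            E[i, j] = [np.sum(c ** 2) for c in haar_dwt(X[i, j], levels)]
    return E.reshape(n, -1)  # column index k = var * (levels + 1) + band


def coefficient_features(X, levels, selected):
    """Concatenate the raw coefficients of the selected (var, band) pairs."""
    return np.array([
        np.concatenate([haar_dwt(x[j], levels)[b] for j, b in selected])
        for x in X
    ])


def two_stage_fit(X, y, levels=3, top_k=2, seed=0):
    # Stage 1 (screening): rank (variable, band) pairs by the importance
    # of their energy statistic in a random forest.
    E = energy_features(X, levels)
    screen = RandomForestClassifier(n_estimators=200, random_state=seed).fit(E, y)
    top = np.argsort(screen.feature_importances_)[::-1][:top_k]
    selected = [(int(k // (levels + 1)), int(k % (levels + 1))) for k in top]
    # Stage 2: fit the predictive model on the coefficients of the
    # selected variables and resolutions only.
    C = coefficient_features(X, levels, selected)
    model = RandomForestClassifier(n_estimators=200, random_state=seed).fit(C, y)
    return selected, model


# Synthetic data (assumed): 200 batches, 5 process variables, 64 time points.
# Class 1 carries a high-frequency oscillation in variable 2 only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5, 64))
y = np.array([0, 1] * 100)
X[y == 1, 2, :] += 1.5 * np.where(np.arange(64) % 2 == 0, 1.0, -1.0)

selected, model = two_stage_fit(X, y)
print("selected (variable, band) pairs:", selected)
```

In this sketch, each entry of `selected` names a process variable and a wavelet band (band 0 is the finest detail level), which is where the interpretability comes from. A hand-written Haar transform keeps the example dependency-light; in practice a wavelet library and a different wavelet family could be substituted.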
Keywords
- discrete wavelet transformation
- preventive maintenance
- random forest
ASJC Scopus subject areas
- Safety, Risk, Reliability and Quality
- Management Science and Operations Research