One challenge in traffic state estimation (TSE) is accounting for the spatiotemporal dependence between traffic states when those states deviate from historical patterns. Although many data-driven learning methods, such as the Markov chain (MC) model, have been used to estimate traffic state variables including flow, density, and speed, it remains difficult to update the evolution of traffic states by integrating traffic flow fundamentals with real-time data. This paper combines Newell's kinematic wave (KW) model with the MC model to overcome this limitation. The MC model captures the regular patterns of dynamic traffic states, and the impacts of daily deviations are inferred from the forward and backward propagation of kinematic waves on freeways. A Bayesian classifier and a weighted average model are used to merge the resulting probability scores. Traffic state variables are expressed through a discretized state representation on fundamental diagrams. Traffic speed and count data from Arizona Department of Transportation (ADOT) detectors are used to train and validate the method. Through a case study, we also attempt to provide insights into the following question: what kind of state set is needed to achieve the best estimation?
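
To make the idea of merging MC-based and KW-based probabilities concrete, the following minimal Python sketch estimates a transition matrix from discretized state sequences and fuses its prediction with a second probability vector by a weighted average. It is an illustration under stated assumptions, not the implementation used in this paper: the three-state set, the Laplace smoothing, the weight of 0.6, the function names, and the placeholder vector `p_kw` (standing in for probabilities inferred from kinematic wave propagation) are all hypothetical, and the Bayesian classifier step is omitted.

```python
def estimate_transition_matrix(sequences, n_states, smoothing=1.0):
    """Estimate a row-stochastic MC transition matrix from state sequences.

    Laplace smoothing (an assumption of this sketch) avoids zero rows
    for states that are rarely observed.
    """
    counts = [[smoothing] * n_states for _ in range(n_states)]
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1.0
    return [[c / sum(row) for c in row] for row in counts]


def fuse(p_mc, p_kw, weight_mc=0.5):
    """Weighted average of two probability vectors, renormalized to sum to 1."""
    mixed = [weight_mc * a + (1.0 - weight_mc) * b for a, b in zip(p_mc, p_kw)]
    total = sum(mixed)
    return [m / total for m in mixed]


# Hypothetical discretized states: 0 = free flow, 1 = transitional, 2 = congested.
history = [
    [0, 0, 1, 2, 2, 1, 0],
    [0, 1, 1, 2, 2, 2, 1],
]
P = estimate_transition_matrix(history, n_states=3)

p_mc = P[1]             # MC prediction of the next state, given current state 1
p_kw = [0.1, 0.2, 0.7]  # placeholder for probabilities derived from the KW model

p_fused = fuse(p_mc, p_kw, weight_mc=0.6)
estimated_state = max(range(3), key=lambda s: p_fused[s])
print(estimated_state)
```

In this sketch the weight controls how strongly the historical pattern is trusted relative to the real-time, wave-based evidence; a larger state set would only change `n_states` and the length of the probability vectors.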