The origin of switching parameter variations in metal oxide resistive switching random access memory (RRAM) is studied. The stochastic formation and rupture of the conductive filaments (CFs) is modeled and coupled to a trap-assisted-tunneling (TAT) current solver. The experimental DC I-V characteristics and the pulse transient waveforms, including the current fluctuation during the reset process, are reproduced by Monte Carlo simulations. It is found that the wide spread of the high resistance state (HRS) is due to variation of the tunneling gap distance, and that the tail bits of the HRS are due to traps newly generated near the electrode at the end of the reset process. To solve the over-reset and tail-bit problems, a device structure with active/buffer bi-layer oxides combined with the reset-verify technique is proposed. Our model is corroborated by experimental data from HfOx-based RRAM.
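To make the link between gap-distance variation and HRS spread concrete, the following is a minimal Monte Carlo sketch, not the model used in this work. It assumes a toy exponential dependence of resistance on tunneling gap in place of the TAT current solver, and it treats reset-verify as a loop of small reset pulses that stops once the read resistance reaches a target. All function names and parameter values (`r_lrs`, `g0_nm`, the gap statistics) are hypothetical and chosen for illustration only. The sketch does not capture trap generation near the electrode or the bi-layer structure.

```python
import math
import random
import statistics


def hrs_resistance(gap_nm, r_lrs=1e4, g0_nm=0.25):
    """Toy assumption: resistance grows exponentially with the tunneling gap."""
    return r_lrs * math.exp(gap_nm / g0_nm)


def single_shot_reset(rng, mean_gap_nm=1.0, sigma_nm=0.2):
    """One large reset pulse: the final gap is a single random draw (assumed Gaussian)."""
    return max(0.0, rng.gauss(mean_gap_nm, sigma_nm))


def reset_verify(rng, r_target, step_nm=0.1, sigma_nm=0.05, max_pulses=50):
    """Apply small reset pulses, reading after each one, until R >= r_target.

    Returns (final_gap_nm, pulses_used). If the target is never reached,
    the gap after max_pulses is returned.
    """
    gap = 0.0
    for pulse in range(1, max_pulses + 1):
        # Each pulse widens the gap by a random, non-negative increment.
        gap += max(0.0, rng.gauss(step_nm, sigma_nm))
        if hrs_resistance(gap) >= r_target:
            return gap, pulse
    return gap, max_pulses


def log_spread(resistances):
    """Standard deviation of log10(R), i.e. the spread in decades."""
    return statistics.pstdev(math.log10(r) for r in resistances)


if __name__ == "__main__":
    rng = random.Random(0)
    n_cycles = 2000
    single = [hrs_resistance(single_shot_reset(rng)) for _ in range(n_cycles)]
    verified = [hrs_resistance(reset_verify(rng, 5e5)[0]) for _ in range(n_cycles)]
    print("single-shot spread (decades):", log_spread(single))
    print("reset-verify spread (decades):", log_spread(verified))
```

Under these assumptions, a modest Gaussian variation in gap distance becomes a much wider spread in resistance because of the exponential dependence, while the verify loop limits the overshoot to roughly one pulse increment.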