Memory array architectures based on emerging non-volatile memory devices have been proposed for on-chip acceleration of dot-product computation in neural networks. As recent advances in machine learning have shown that precision reduction is an effective technique for reducing computation and memory storage requirements, it is desirable to evaluate its hardware cost. In this paper, we use a circuit-level macro model, NeuroSim, to benchmark circuit-level performance metrics such as chip area, latency, and dynamic energy for the XNOR-RRAM and conventional 8-bit RRAM architectures. Both architectures are implemented to process the dot-product operation on a 512×512 synaptic matrix, in sequential row-by-row and in parallel read-out fashion. Based on simulations with RRAM device models and a 32 nm CMOS PDK, the energy efficiency of the parallel XNOR-RRAM architecture reaches 311 TOPS/W, at least a ~15× and ~621× improvement over the parallel and sequential conventional 8-bit RRAM architectures, respectively.
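To make the XNOR-RRAM computation concrete, the sketch below (an illustrative assumption, not taken from the paper's implementation) shows the standard identity that maps a dot product of binarized {-1, +1} activations and weights onto bitwise XNOR plus a popcount; an XNOR-RRAM crossbar evaluates this sum in analog along its bit lines, in parallel across a weight column.

```python
import numpy as np

# Hypothetical illustration of the XNOR/popcount identity used by
# binary neural-network accelerators such as XNOR-RRAM.
N = 512  # matrix dimension matching the paper's 512x512 benchmark

rng = np.random.default_rng(0)
x = rng.choice([-1, 1], size=N)   # binarized activation vector
w = rng.choice([-1, 1], size=N)   # one binarized weight row

# Encode -1 -> 0 and +1 -> 1 so bitwise XNOR detects sign agreement.
xb = (x > 0).astype(np.uint8)
wb = (w > 0).astype(np.uint8)

# XNOR = 1 where signs agree; dot = agreements - disagreements.
agree = np.count_nonzero(~(xb ^ wb) & 1)
dot_xnor = 2 * agree - N

assert dot_xnor == int(np.dot(x, w))
```

The same identity holds for every row of the synaptic matrix, which is why a crossbar that activates all rows at once (the parallel read-out mode compared in this paper) can produce one dot-product result per column per read cycle.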