Bayesian compressive sensing (BCS) addresses ill-posed signal recovery problems using the Bayesian estimation framework. Gibbs sampling is a technique used in Bayesian estimation that iteratively draws samples from conditional posterior distributions. However, Gibbs sampling is inherently sequential, and existing parallel implementations focus on reducing the communication between computing units at the cost of increased recovery error. In this work, we propose a two-stage parallel coefficient update scheme for wavelet-based Bayesian compressive sensing, where the first stage approximates the real distributions of the wavelet coefficients and the second stage computes the final estimate of the coefficients. In the first stage, the parallel computing units share information with each other; in the second stage, they work independently. In both stages, the scheme updates coefficients based on data generated a few rounds earlier, which relaxes the timing constraints for communication in the first stage and for computation in the second stage. We design the corresponding parallel architecture and synthesize it in a 7 nm technology node. We show that in a system with 8 computing units, our method reduces the execution time by 17.4× compared to a sequential implementation without any increase in the signal recovery error.