A systolic array architecture for image coding using adaptive vector quantization is presented. Vector quantization (VQ) essentially involves two steps: codebook generation and encoding. In the codebook generation step, a set (codebook) of representative vectors (codewords) is generated from a training set of vectors. In the encoding step, each input vector to be coded is quantized to the closest codeword, and the index (label) of that codeword is transmitted. The high-speed VQ architectures reported thus far in the literature implement only image encoding, not codebook generation. In the proposed architecture, the encoding and codebook generation operations are overlapped in the same structure. This structure consists of three modules: 1) a systolic array module, 2) a label extractor module, and 3) a delay buffer module. The systolic array module essentially consists of an array of L × N basic systolic cells, connected in parallel in the direction of the vector dimension, L, and in pipeline in the direction of the codeword dimension, N. The basic systolic cell is designed with two modes of operation (forward and reverse). In the forward mode, the cell executes the basic operation in a VQ encoder, namely, distortion computation. In the reverse mode, the cell executes the new codeword (centroid) computation. The label extractor module consists of N label cells connected in pipeline. The label cells are designed to determine the label and also have two modes of operation (forward and reverse). The two modes of operation in both the systolic array module and the label extractor module take place simultaneously, and synchronization is maintained by the delay buffer cells in the delay buffer module.
This architecture results in a speedup proportional to NL and has the following advantages: 1) there is no need for separate hardware to compute the new centroids; 2) there is no need for a high-speed interface to transfer the new centroids into the systolic array; and 3) there are no delays involved in the computation and transfer of new centroids. The regular and iterable structure makes VLSI implementation of the architecture feasible.
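The two operations that the abstract assigns to the forward and reverse modes can be illustrated with a minimal sequential sketch in software. It is not the systolic hardware described above. It assumes squared Euclidean distance as the distortion measure (the abstract does not name one) and keeps the old codeword when no input vector maps to it. The function names `squared_distortion`, `encode`, and `update_codebook` are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distortion(x: Vector, c: Vector) -> float:
    """Distortion between input vector x and codeword c.

    Assumption: squared Euclidean distance, summed over the L components.
    """
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


def encode(x: Vector, codebook: Sequence[Vector]) -> Tuple[int, float]:
    """Forward-mode analogue: return the label of the closest codeword
    among the N codewords, together with its distortion."""
    best_label = 0
    best_d = squared_distortion(x, codebook[0])
    for label in range(1, len(codebook)):
        d = squared_distortion(x, codebook[label])
        if d < best_d:
            best_label, best_d = label, d
    return best_label, best_d


def update_codebook(
    vectors: Sequence[Vector], codebook: Sequence[Vector]
) -> Tuple[List[int], List[List[float]]]:
    """Encode every vector and compute the new centroids (reverse-mode
    analogue) from the vectors assigned to each label."""
    dim = len(codebook[0])
    sums = [[0.0] * dim for _ in codebook]
    counts = [0] * len(codebook)
    labels: List[int] = []
    for x in vectors:
        label, _ = encode(x, codebook)
        labels.append(label)
        counts[label] += 1
        for k in range(dim):
            sums[label][k] += x[k]
    new_codebook: List[List[float]] = []
    for old, s, n in zip(codebook, sums, counts):
        # Assumption: a codeword with no assigned vectors is left unchanged.
        new_codebook.append([v / n for v in s] if n else list(old))
    return labels, new_codebook


if __name__ == "__main__":
    codebook = [[0.0, 0.0], [10.0, 10.0]]
    vectors = [[1.0, 1.0], [-1.0, 1.0], [9.0, 11.0], [11.0, 11.0]]
    labels, new_codebook = update_codebook(vectors, codebook)
    print(labels)
    print(new_codebook)
```

In this sketch, encoding one vector costs on the order of N·L multiply-accumulate steps when run serially, which is the work the abstract describes as being distributed over the L × N cell array.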
|Original language||English (US)|
|Number of pages||8|
|Journal||IEEE Transactions on Circuits and Systems for Video Technology|
|State||Published - Jun 1991|
ASJC Scopus subject areas
- Media Technology
- Electrical and Electronic Engineering