A distributed Canny edge detector: Algorithm and FPGA implementation

Qian Xu; Srenivas Varadarajan; Chaitali Chakrabarti; Lina Karam

doi:10.1109/TIP.2014.2311656

Research output: Contribution to journal › Article › peer-review

153 Scopus citations

Abstract

The Canny edge detector is one of the most widely used edge detection algorithms due to its superior performance. Unfortunately, not only is it more computationally intensive than other edge detection algorithms, but it also has a higher latency because it is based on frame-level statistics. In this paper, we propose a mechanism to implement the Canny algorithm at the block level without any loss in edge detection performance compared with the original frame-level Canny algorithm. Directly applying the original Canny algorithm at the block level leads to excessive edges in smooth regions and to loss of significant edges in highly detailed regions, since the original Canny computes the high and low thresholds based on the frame-level statistics. To solve this problem, we present a distributed Canny edge detection algorithm that adaptively computes the edge detection thresholds based on the block type and the local distribution of the gradients in the image block. In addition, the new algorithm uses a nonuniform gradient magnitude histogram to compute block-based hysteresis thresholds. The resulting block-based algorithm has a significantly reduced latency and can be easily integrated with other block-based image codecs. It is capable of supporting fast edge detection of images and videos with high resolutions, including full HD, since the latency is now a function of the block size instead of the frame size. In addition, quantitative conformance evaluations and subjective tests show that the edge detection performance of the proposed algorithm is better than that of the original frame-based algorithm, especially when noise is present in the images. Finally, this algorithm is implemented using a 32-computing-engine architecture and is synthesized on the Xilinx Virtex-5 FPGA. The synthesized architecture takes only 0.721 ms (including the SRAM read/write time and the computation time) to detect edges of \(512\times 512\) images in the USC SIPI database when clocked at 100 MHz and is faster than existing FPGA and GPU implementations.
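
The timing figures quoted in the abstract can be turned into cycle and throughput numbers with a few lines of arithmetic. The sketch below is only a back-of-the-envelope check: the function name `throughput_summary` and its parameters are hypothetical, and the per-engine figure assumes that the work is split evenly across the 32 computing engines, which the abstract does not state.

```python
# Back-of-the-envelope throughput check using only figures from the abstract.
LATENCY_S = 0.721e-3   # 0.721 ms per image, SRAM read/write time included
CLOCK_HZ = 100e6       # 100 MHz clock
WIDTH = 512            # image width in pixels
HEIGHT = 512           # image height in pixels
NUM_ENGINES = 32       # computing engines in the synthesized architecture


def throughput_summary(latency_s=LATENCY_S, clock_hz=CLOCK_HZ,
                       width=WIDTH, height=HEIGHT, engines=NUM_ENGINES):
    """Derive cycle counts and pixel rates from latency and clock frequency."""
    pixels = width * height
    cycles = latency_s * clock_hz
    return {
        "cycles_per_image": cycles,
        "pixels_per_cycle": pixels / cycles,
        "megapixels_per_second": pixels / latency_s / 1e6,
        "images_per_second": 1.0 / latency_s,
        # Assumption: pixels are distributed evenly over all engines.
        "engine_cycles_per_pixel": cycles * engines / pixels,
    }


if __name__ == "__main__":
    for name, value in throughput_summary().items():
        print(f"{name}: {value:.3f}")
```

Under these figures, one image takes about 72,100 clock cycles, which corresponds to roughly 3.6 pixels per clock cycle across the whole design, or about 364 Mpixels/s.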

Original language	English (US)
Article number	6774938
Pages (from-to)	2944-2960
Number of pages	17
Journal	IEEE Transactions on Image Processing
Volume	23
Issue number	7
DOIs	https://doi.org/10.1109/TIP.2014.2311656
State	Published - Jul 2014

Keywords

Canny edge detector
Distributed image processing
FPGA
high throughput
parallel processing

ASJC Scopus subject areas

Software
Computer Graphics and Computer-Aided Design

Cite this

@article{747dde5140a940a2a04ccaf3f272ec8d,

title = "A distributed {C}anny edge detector: Algorithm and {FPGA} implementation",

abstract = "The Canny edge detector is one of the most widely used edge detection algorithms due to its superior performance. Unfortunately, not only is it computationally more intensive as compared with other edge detection algorithms, but it also has a higher latency because it is based on frame-level statistics. In this paper, we propose a mechanism to implement the Canny algorithm at the block level without any loss in edge detection performance compared with the original frame-level Canny algorithm. Directly applying the original Canny algorithm at the block-level leads to excessive edges in smooth regions and to loss of significant edges in high-detailed regions since the original Canny computes the high and low thresholds based on the frame-level statistics. To solve this problem, we present a distributed Canny edge detection algorithm that adaptively computes the edge detection thresholds based on the block type and the local distribution of the gradients in the image block. In addition, the new algorithm uses a nonuniform gradient magnitude histogram to compute block-based hysteresis thresholds. The resulting block-based algorithm has a significantly reduced latency and can be easily integrated with other block-based image codecs. It is capable of supporting fast edge detection of images and videos with high resolutions, including full-HD since the latency is now a function of the block size instead of the frame size. In addition, quantitative conformance evaluations and subjective tests show that the edge detection performance of the proposed algorithm is better than the original frame-based algorithm, especially when noise is present in the images. Finally, this algorithm is implemented using a 32 computing engine architecture and is synthesized on the Xilinx Virtex-5 FPGA. 
The synthesized architecture takes only 0.721 ms (including the SRAM read/write time and the computation time) to detect edges of \(512\times 512\) images in the USC SIPI database when clocked at 100 MHz and is faster than existing FPGA and GPU implementations.",

keywords = "Canny edge detector, Distributed image processing, FPGA, high throughput, parallel processing",

author = "Qian Xu and Srenivas Varadarajan and Chaitali Chakrabarti and Lina Karam",

year = "2014",

month = jul,

doi = "10.1109/TIP.2014.2311656",

language = "English (US)",

volume = "23",

pages = "2944--2960",

journal = "IEEE Transactions on Image Processing",

issn = "1057-7149",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

number = "7",

}

TY - JOUR

T1 - A distributed Canny edge detector

T2 - Algorithm and FPGA implementation

AU - Xu, Qian

AU - Varadarajan, Srenivas

AU - Chakrabarti, Chaitali

AU - Karam, Lina

PY - 2014/7

Y1 - 2014/7

AB - The Canny edge detector is one of the most widely used edge detection algorithms due to its superior performance. Unfortunately, not only is it computationally more intensive as compared with other edge detection algorithms, but it also has a higher latency because it is based on frame-level statistics. In this paper, we propose a mechanism to implement the Canny algorithm at the block level without any loss in edge detection performance compared with the original frame-level Canny algorithm. Directly applying the original Canny algorithm at the block-level leads to excessive edges in smooth regions and to loss of significant edges in high-detailed regions since the original Canny computes the high and low thresholds based on the frame-level statistics. To solve this problem, we present a distributed Canny edge detection algorithm that adaptively computes the edge detection thresholds based on the block type and the local distribution of the gradients in the image block. In addition, the new algorithm uses a nonuniform gradient magnitude histogram to compute block-based hysteresis thresholds. The resulting block-based algorithm has a significantly reduced latency and can be easily integrated with other block-based image codecs. It is capable of supporting fast edge detection of images and videos with high resolutions, including full-HD since the latency is now a function of the block size instead of the frame size. In addition, quantitative conformance evaluations and subjective tests show that the edge detection performance of the proposed algorithm is better than the original frame-based algorithm, especially when noise is present in the images. Finally, this algorithm is implemented using a 32 computing engine architecture and is synthesized on the Xilinx Virtex-5 FPGA. The synthesized architecture takes only 0.721 ms (including the SRAM read/write time and the computation time) to detect edges of \(512\times 512\) images in the USC SIPI database when clocked at 100 MHz and is faster than existing FPGA and GPU implementations.

KW - Canny edge detector

KW - Distributed image processing

KW - FPGA

KW - high throughput

KW - parallel processing

UR - http://www.scopus.com/inward/record.url?scp=84902213086&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84902213086&partnerID=8YFLogxK

U2 - 10.1109/TIP.2014.2311656

DO - 10.1109/TIP.2014.2311656

M3 - Article

AN - SCOPUS:84902213086

SN - 1057-7149

VL - 23

SP - 2944

EP - 2960

JO - IEEE Transactions on Image Processing

JF - IEEE Transactions on Image Processing

IS - 7

M1 - 6774938

ER -
