A distributed Canny edge detector

Algorithm and FPGA implementation

Qian Xu, Srenivas Varadarajan, Chaitali Chakrabarti, Lina Karam

Research output: Contribution to journal › Article

69 Citations (Scopus)

Abstract

The Canny edge detector is one of the most widely used edge detection algorithms due to its superior performance. Unfortunately, it is not only more computationally intensive than other edge detection algorithms, but it also has a higher latency because it is based on frame-level statistics. In this paper, we propose a mechanism to implement the Canny algorithm at the block level without any loss in edge detection performance compared with the original frame-level Canny algorithm. Directly applying the original Canny algorithm at the block level leads to excessive edges in smooth regions and to loss of significant edges in highly detailed regions, since the original Canny algorithm computes the high and low thresholds based on frame-level statistics. To solve this problem, we present a distributed Canny edge detection algorithm that adaptively computes the edge detection thresholds based on the block type and the local distribution of the gradients in the image block. In addition, the new algorithm uses a nonuniform gradient magnitude histogram to compute block-based hysteresis thresholds. The resulting block-based algorithm has a significantly reduced latency and can be easily integrated with other block-based image codecs. It is capable of supporting fast edge detection of images and videos with high resolutions, including full HD, since the latency is now a function of the block size instead of the frame size. In addition, quantitative conformance evaluations and subjective tests show that the edge detection performance of the proposed algorithm is better than that of the original frame-based algorithm, especially when noise is present in the images. Finally, this algorithm is implemented using a 32-computing-engine architecture and is synthesized on the Xilinx Virtex-5 FPGA. The synthesized architecture takes only 0.721 ms (including the SRAM read/write time and the computation time) to detect edges of $512 \times 512$ images in the USC SIPI database when clocked at 100 MHz and is faster than existing FPGA and GPU implementations.
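The thresholding problem described in the abstract can be illustrated with a small sketch. It is not the paper's adaptive method: it only shows why reusing a frame-style percentile rule inside each block misbehaves. The percentile rule itself, the 0.7 non-edge fraction, the synthetic uniform gradient magnitudes, and the helper names `high_threshold` and `strong_edge_count` are all assumptions made for illustration.

```python
import random


def high_threshold(magnitudes, non_edge_fraction=0.7):
    """Return the magnitude at roughly the given quantile.

    Assumption: the high threshold is chosen so that about
    `non_edge_fraction` of the samples fall at or below it.
    """
    ordered = sorted(magnitudes)
    index = min(len(ordered) - 1, int(non_edge_fraction * len(ordered)))
    return ordered[index]


def strong_edge_count(magnitudes, threshold):
    """Count samples strictly above the high threshold."""
    return sum(1 for m in magnitudes if m > threshold)


rng = random.Random(0)

# Hypothetical gradient magnitudes for two 64 x 64 blocks of one frame.
smooth = [rng.uniform(0.0, 5.0) for _ in range(64 * 64)]      # flat region
detailed = [rng.uniform(0.0, 100.0) for _ in range(64 * 64)]  # textured region
frame = smooth + detailed

t_frame = high_threshold(frame)        # threshold from frame-level statistics
t_smooth = high_threshold(smooth)      # naive per-block threshold
t_detailed = high_threshold(detailed)  # naive per-block threshold

print("thresholds (frame, smooth, detailed):", t_frame, t_smooth, t_detailed)
print("smooth block, frame threshold:", strong_edge_count(smooth, t_frame))
print("smooth block, own threshold:", strong_edge_count(smooth, t_smooth))
print("detailed block, frame threshold:", strong_edge_count(detailed, t_frame))
print("detailed block, own threshold:", strong_edge_count(detailed, t_detailed))
```

Under these assumptions the smooth block's own threshold is far lower than the frame threshold, so it marks many weak gradients as strong edges, while the detailed block's own threshold is higher and drops edges the frame-level rule would keep. This is the behavior the block-type-aware thresholds in the abstract are meant to avoid.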

Original language: English (US)
Article number: 6774938
Pages (from-to): 2944-2960
Number of pages: 17
Journal: IEEE Transactions on Image Processing
Volume: 23
Issue number: 7
DOI: https://doi.org/10.1109/TIP.2014.2311656
State: Published - 2014

Fingerprint

Field programmable gate arrays (FPGA)
Edge detection
Detectors
Statistics
Static random access storage
Hysteresis
Engines

Keywords

• Canny edge detector
• Distributed image processing
• FPGA
• high throughput
• parallel processing

ASJC Scopus subject areas

• Computer Graphics and Computer-Aided Design
• Software

Cite this

A distributed Canny edge detector: Algorithm and FPGA implementation. / Xu, Qian; Varadarajan, Srenivas; Chakrabarti, Chaitali; Karam, Lina.

In: IEEE Transactions on Image Processing, Vol. 23, No. 7, 6774938, 2014, p. 2944-2960.

@article{747dde5140a940a2a04ccaf3f272ec8d,
title = "A distributed {C}anny edge detector: Algorithm and {FPGA} implementation",
abstract = "The Canny edge detector is one of the most widely used edge detection algorithms due to its superior performance. Unfortunately, not only is it computationally more intensive as compared with other edge detection algorithms, but it also has a higher latency because it is based on frame-level statistics. In this paper, we propose a mechanism to implement the Canny algorithm at the block level without any loss in edge detection performance compared with the original frame-level Canny algorithm. Directly applying the original Canny algorithm at the block-level leads to excessive edges in smooth regions and to loss of significant edges in high-detailed regions since the original Canny computes the high and low thresholds based on the frame-level statistics. To solve this problem, we present a distributed Canny edge detection algorithm that adaptively computes the edge detection thresholds based on the block type and the local distribution of the gradients in the image block. In addition, the new algorithm uses a nonuniform gradient magnitude histogram to compute block-based hysteresis thresholds. The resulting block-based algorithm has a significantly reduced latency and can be easily integrated with other block-based image codecs. It is capable of supporting fast edge detection of images and videos with high resolutions, including full-HD since the latency is now a function of the block size instead of the frame size. In addition, quantitative conformance evaluations and subjective tests show that the edge detection performance of the proposed algorithm is better than the original frame-based algorithm, especially when noise is present in the images. Finally, this algorithm is implemented using a 32 computing engine architecture and is synthesized on the Xilinx Virtex-5 FPGA. 
The synthesized architecture takes only 0.721 ms (including the SRAM read/write time and the computation time) to detect edges of $512 \times 512$ images in the USC SIPI database when clocked at 100 MHz and is faster than existing FPGA and GPU implementations.",
keywords = "Canny edge detector, Distributed image processing, FPGA, high throughput, parallel processing",
author = "Qian Xu and Srenivas Varadarajan and Chaitali Chakrabarti and Lina Karam",
year = "2014",
doi = "10.1109/TIP.2014.2311656",
language = "English (US)",
volume = "23",
pages = "2944--2960",
journal = "IEEE Transactions on Image Processing",
issn = "1057-7149",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "7",

}

TY - JOUR

T1 - A distributed Canny edge detector

T2 - Algorithm and FPGA implementation

AU - Xu, Qian

AU - Varadarajan, Srenivas

AU - Chakrabarti, Chaitali

AU - Karam, Lina

PY - 2014

Y1 - 2014

AB - The Canny edge detector is one of the most widely used edge detection algorithms due to its superior performance. Unfortunately, not only is it computationally more intensive as compared with other edge detection algorithms, but it also has a higher latency because it is based on frame-level statistics. In this paper, we propose a mechanism to implement the Canny algorithm at the block level without any loss in edge detection performance compared with the original frame-level Canny algorithm. Directly applying the original Canny algorithm at the block-level leads to excessive edges in smooth regions and to loss of significant edges in high-detailed regions since the original Canny computes the high and low thresholds based on the frame-level statistics. To solve this problem, we present a distributed Canny edge detection algorithm that adaptively computes the edge detection thresholds based on the block type and the local distribution of the gradients in the image block. In addition, the new algorithm uses a nonuniform gradient magnitude histogram to compute block-based hysteresis thresholds. The resulting block-based algorithm has a significantly reduced latency and can be easily integrated with other block-based image codecs. It is capable of supporting fast edge detection of images and videos with high resolutions, including full-HD since the latency is now a function of the block size instead of the frame size. In addition, quantitative conformance evaluations and subjective tests show that the edge detection performance of the proposed algorithm is better than the original frame-based algorithm, especially when noise is present in the images. Finally, this algorithm is implemented using a 32 computing engine architecture and is synthesized on the Xilinx Virtex-5 FPGA. The synthesized architecture takes only 0.721 ms (including the SRAM read/write time and the computation time) to detect edges of $512 \times 512$ images in the USC SIPI database when clocked at 100 MHz and is faster than existing FPGA and GPU implementations.

KW - Canny edge detector

KW - Distributed image processing

KW - FPGA

KW - high throughput

KW - parallel processing

UR - http://www.scopus.com/inward/record.url?scp=84902213086&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84902213086&partnerID=8YFLogxK

U2 - 10.1109/TIP.2014.2311656

DO - 10.1109/TIP.2014.2311656

M3 - Article

VL - 23

SP - 2944

EP - 2960

JO - IEEE Transactions on Image Processing

JF - IEEE Transactions on Image Processing

SN - 1057-7149

IS - 7

M1 - 6774938

ER -