Darkroom: Compiling high-level image processing code into hardware pipelines

James Hegarty, John Brunhaver, Zachary DeVito, Jonathan Ragan-Kelley, Noy Cohen, Steven Bell, Artem Vasilyev, Mark Horowitz, Pat Hanrahan

Research output: Contribution to journal › Article

59 Citations (Scopus)

Abstract

Specialized image signal processors (ISPs) exploit the structure of image processing pipelines to minimize memory bandwidth using the architectural pattern of line-buffering, where all intermediate data between each stage is stored in small on-chip buffers. This provides high energy efficiency, allowing long pipelines with tera-op/sec. image processing in battery-powered devices, but traditionally requires painstaking manual design in hardware. Based on this pattern, we present Darkroom, a language and compiler for image processing. The semantics of the Darkroom language allow it to compile programs directly into line-buffered pipelines, with all intermediate values in local line-buffer storage, eliminating unnecessary communication with off-chip DRAM. We formulate the problem of optimally scheduling line-buffered pipelines to minimize buffering as an integer linear program. Finally, given an optimally scheduled pipeline, Darkroom synthesizes hardware descriptions for ASIC or FPGA, or fast CPU code. We evaluate Darkroom implementations of a range of applications, including a camera pipeline, low-level feature detection algorithms, and deblurring. For many applications, we demonstrate gigapixel/sec. performance in under 0.5 mm² of ASIC silicon at 250 mW (simulated on a 45 nm foundry process), realtime 1080p/60 video processing using a fraction of the resources of a modern FPGA, and tens of megapixels/sec. of throughput on a quad-core x86 processor.
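
As a rough illustration of the line-buffering pattern described in the abstract, the following Python sketch chains two 3x3 box filters so that each stage keeps only the three most recent rows of its input rather than a full intermediate image. It is not Darkroom syntax or compiler output: the function name `box3x3` and the test image are hypothetical, and Python generators merely stand in for hardware pipeline stages.

```python
from collections import deque


def box3x3(rows):
    """Yield 3x3 box-filtered rows (interior pixels only) from a stream of rows.

    Hypothetical helper for illustration: only the three most recent input
    rows are held at any time, playing the role of a line buffer.
    """
    window = deque(maxlen=3)  # line buffer: oldest row is evicted automatically
    for row in rows:  # rows arrive one at a time, in scanline order
        window.append(row)
        if len(window) < 3:
            continue  # buffer still filling; no output row can be produced yet
        out = []
        for x in range(1, len(row) - 1):
            total = sum(window[dy][x + dx] for dy in range(3) for dx in (-1, 0, 1))
            out.append(total // 9)  # integer average over the 3x3 neighbourhood
        yield out


# A 5x5 test image with a single bright pixel in the centre.
image = [[0] * 5 for _ in range(5)]
image[2][2] = 81

# Chaining generators means the output of the first stage is consumed row by
# row by the second stage and is never materialized as a whole image.
one_stage = list(box3x3(iter(image)))
two_stage = list(box3x3(box3x3(iter(image))))
print(one_stage)  # [[9, 9, 9], [9, 9, 9], [9, 9, 9]]
print(two_stage)  # [[9]]
```

In this sketch the storage per stage is three rows regardless of image height, which is the property that lets a line-buffered pipeline avoid round trips to off-chip memory between stages.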

Original language: English (US)
Article number: 144
Journal: ACM Transactions on Graphics
Volume: 33
Issue number: 4
DOIs: https://doi.org/10.1145/2601097.2601174
State: Published - 2014
Externally published: Yes

Fingerprint

Image processing
Pipelines
Hardware
Application specific integrated circuits
Field programmable gate arrays (FPGA)
Buffer storage
Dynamic random access storage
Foundries
Program processors
Energy efficiency
Semantics
Cameras
Scheduling
Throughput
Bandwidth
Data storage equipment
Silicon
Communication
Processing

Keywords

  • Domain-specific languages
  • FPGAs
  • Hardware synthesis
  • Image processing
  • Video processing

ASJC Scopus subject areas

  • Computer Science (all)

Cite this

Darkroom: Compiling high-level image processing code into hardware pipelines. / Hegarty, James; Brunhaver, John; DeVito, Zachary; Ragan-Kelley, Jonathan; Cohen, Noy; Bell, Steven; Vasilyev, Artem; Horowitz, Mark; Hanrahan, Pat.

In: ACM Transactions on Graphics, Vol. 33, No. 4, 144, 2014.

Hegarty, J, Brunhaver, J, DeVito, Z, Ragan-Kelley, J, Cohen, N, Bell, S, Vasilyev, A, Horowitz, M & Hanrahan, P 2014, 'Darkroom: Compiling high-level image processing code into hardware pipelines', ACM Transactions on Graphics, vol. 33, no. 4, 144. https://doi.org/10.1145/2601097.2601174
Hegarty, James; Brunhaver, John; DeVito, Zachary; Ragan-Kelley, Jonathan; Cohen, Noy; Bell, Steven; Vasilyev, Artem; Horowitz, Mark; Hanrahan, Pat. / Darkroom: Compiling high-level image processing code into hardware pipelines. In: ACM Transactions on Graphics. 2014; Vol. 33, No. 4.
@article{27086f979c2e410ead478598af8b93ba,
title = "Darkroom: Compiling high-level image processing code into hardware pipelines",
keywords = "Domain-specific languages, FPGAs, Hardware synthesis, Image processing, Video processing",
author = "James Hegarty and John Brunhaver and Zachary DeVito and Jonathan Ragan-Kelley and Noy Cohen and Steven Bell and Artem Vasilyev and Mark Horowitz and Pat Hanrahan",
year = "2014",
doi = "10.1145/2601097.2601174",
language = "English (US)",
volume = "33",
journal = "ACM Transactions on Graphics",
issn = "0730-0301",
publisher = "Association for Computing Machinery (ACM)",
number = "4",

}

TY - JOUR

T1 - Darkroom

T2 - Compiling high-level image processing code into hardware pipelines

AU - Hegarty, James

AU - Brunhaver, John

AU - DeVito, Zachary

AU - Ragan-Kelley, Jonathan

AU - Cohen, Noy

AU - Bell, Steven

AU - Vasilyev, Artem

AU - Horowitz, Mark

AU - Hanrahan, Pat

PY - 2014

Y1 - 2014

KW - Domain-specific languages

KW - FPGAs

KW - Hardware synthesis

KW - Image processing

KW - Video processing

UR - http://www.scopus.com/inward/record.url?scp=84905730031&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84905730031&partnerID=8YFLogxK

U2 - 10.1145/2601097.2601174

DO - 10.1145/2601097.2601174

M3 - Article

AN - SCOPUS:84905730031

VL - 33

JO - ACM Transactions on Graphics

JF - ACM Transactions on Graphics

SN - 0730-0301

IS - 4

M1 - 144

ER -