To accomplish frequent, simple tasks with high efficiency, it is necessary to leverage low-power, microcontroller-like processors that are increasingly available on mobile systems. However, existing solutions require developers to directly program the low-power processors and carefully manage inter-processor communication. We present Reflex, a suite of compiler and runtime techniques that significantly lower the barrier for developers to leverage such low-power processors. The heart of Reflex is a software Distributed Shared Memory (DSM) that enables shared memory objects with release consistency among code running on loosely coupled processors. In order to achieve high energy efficiency without sacrificing performance much, the Reflex DSM leverages (i) extreme architectural asymmetry between low-power processors and powerful central processors, (ii) aggressive compile-time optimization, and (iii) a minimalist runtime that supports efficient message passing and event-driven execution. We report a complete realization of Reflex that runs on a TI OMAP4430-based development platform as well as on a custom tri-processor mobile platform. Using smartphone sensing applications reported in recent literature, we show that Reflex supports a programming style very close to contemporary smartphone programming. Compared to message passing, the Reflex DSM greatly reduces efforts in programming heterogeneous smartphones, eliminating up to 38% of the source lines of application code. Compared to running the same applications on existing smartphones, Reflex reduces the average system power consumption by up to 81%.