The past few years have witnessed the explosive growth of Internet of Things (IoT) devices. The necessity of real-time edge intelligence for IoT applications demands that decision making must take place right here right now at the network edge, thus dictating that a high percentage of the IoT created data should be stored and analyzed locally. However, the computing resources are constrained and the amount of local data is often very limited at edge nodes. To tackle these challenges, we propose a distributionally robust optimization (DRO)-based edge intelligence framework, which is based on an innovative synergy of cloud knowledge transfer and local learning. More specifically, the knowledge transfer from the cloud learning is in the form of a reference distribution and its associated uncertainty set. Further, based on its local data, the edge device constructs an uncertainty set centered around its empirical distribution. The edge learning problem is then cast as a DRO problem subject to the above two distribution uncertainty sets. Building on this framework, we investigate two problem formulations for DRO-based edge intelligence, where the uncertainty sets are constructed using the Kullback-Leibler divergence and the Wasserstein distance, respectively. Numerical results demonstrate the effectiveness of the proposed DRO-based framework.