This paper investigates a novel research direction that leverages vision to help overcome critical wireless communication challenges. In particular, it considers millimeter wave (mmWave) communication systems, which are principal components of 5G and beyond. These systems face two important challenges: (i) the large training overhead associated with selecting the optimal beam and (ii) reduced reliability due to the high sensitivity of mmWave links to blockages. Interestingly, most devices that employ mmWave arrays, such as 5G phones, self-driving vehicles, and virtual/augmented reality headsets, will likely also use cameras. Therefore, we investigate the potential gains of employing cameras at the mmWave base stations and leveraging their visual data to help overcome the beam selection and blockage prediction challenges. To do so, this paper uses computer vision and deep learning tools to predict mmWave beams and blockages directly from camera RGB images and sub-6 GHz channels. The experimental results reveal interesting insights into the effectiveness of such solutions. For example, the deep learning model achieves over 90% beam prediction accuracy while requiring only a snapshot of the scene and incurring zero overhead.