Scaling the number of antennas up is a key characteristic of current and future wireless communication systems. The hardware cost and power consumption, however, motivate large-scale MIMO systems, especially at millimeter wave (mmWave) bands, to rely on analog-only or hybrid analog/digital transceiver architectures. With these architectures, mmWave base stations normally use pre-defined beamforming codebooks for both initial access and data transmissions. Current beam codebooks, however, generally adopt single-lobe narrow beams and scan the entire angular space. This leads to high beam training overhead and loss in the achievable beamforming gains. In this paper, we propose a new machine learning framework for learning beamforming codebooks in hardware-constrained large-scale MIMO systems. More specifically, we develop a neural network architecture that accounts for the hardware constraints and learns beam codebooks that adapt to the surrounding environment and the user locations. Simulation results highlight the capability of the proposed solution in learning multi-lobe beams and reducing the codebook size, which leads to noticeable gains compared to classical codebook design approaches.