In this paper we formulate a Multi-Armed Bandit Compressive Spectrum Sensing (MAB-CSS) problem, in which a Cognitive Receiver (CR) dynamically decides how best to sense the states of N sub-channels, which switch between occupied and available as independent, statistically identical Markov chains. We assume that the CR is endowed with K compressive samplers, each sensing an arbitrary mixture of the N signals in the sub-channels; upon deciding which channels are available, the CR collects an equal reward from each unoccupied channel it senses. The MAB-CSS problem captures the CR's ability to sweep a large spectrum and reconstruct the exact support of the N-channel occupancy pattern, provided the latter is sufficiently sparse. This generalizes the typical model in which the CR can sense K out of the N sub-channels. In choosing the compressive sensing strategy, the CR must gather the most informative statistics on the spectrum while not exceeding the limits beyond which the occupancy pattern is no longer identifiable. In this work, we study a simplified, noiseless discrete sensing model and establish the structure of the optimal MAB-CSS myopic policy.
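The noiseless sensing model described above can be illustrated with a minimal sketch. All concrete values here (N = 8 sub-channels, K = 4 samplers, a Gaussian mixing matrix, and a brute-force support search) are illustrative assumptions, not the paper's construction: each of the K samplers observes a linear mixture of the N channel signals, and when the occupancy pattern is sufficiently sparse its exact support can be recovered from only K < N measurements.

```python
import itertools
import numpy as np

def recover_support(A, y, max_sparsity):
    """Brute-force noiseless support recovery: return the sparsest
    0/1 occupancy vector x (as a support set) satisfying A @ x == y.
    Illustrative only -- practical CSS would use an efficient solver."""
    K, N = A.shape
    for s in range(max_sparsity + 1):
        for support in itertools.combinations(range(N), s):
            x = np.zeros(N)
            x[list(support)] = 1.0       # occupied sub-channels carry a unit signal
            if np.allclose(A @ x, y):
                return set(support)
    return None

rng = np.random.default_rng(0)
N, K = 8, 4                              # N sub-channels, K compressive samplers (assumed values)
A = rng.standard_normal((K, N))          # row k: sampler k's mixture weights over the N channels
true_support = {1, 5}                    # occupied sub-channels: a sparse pattern
x = np.zeros(N)
x[list(true_support)] = 1.0
y = A @ x                                # the K noiseless compressive measurements

print(recover_support(A, y, max_sparsity=3))
```

With a generic (here Gaussian) mixing matrix, any occupancy pattern with at most 2 occupied channels is identifiable from K = 4 measurements, which is the identifiability limit the abstract alludes to: increasing the number of occupied channels past it makes the support no longer uniquely recoverable.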