A key challenge in computational material design is optimizing for particular material properties by searching an often high-dimensional design space of microstructures. A tractable approach to this optimization task is to identify an encoder that maps microstructures, which are 2D or 3D images, to a lower-dimensional feature space, and a decoder that generates new microstructures from samples in that feature space. This two-way mapping has been achieved through feature learning, since microstructures from the same material system often share common features. However, existing approaches limit the size of the generated images to that of the training samples, making them less applicable to designing microstructures at arbitrary scales. This paper proposes a hybrid model that learns both the common features and their spatial distributions. We show across several material systems that, unlike existing reconstruction methods, our method can generate new microstructure samples of arbitrary size that are both visually and statistically close to the training samples while preserving local microstructure patterns.
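The encoder/decoder mapping described above can be sketched minimally as follows. This is a hypothetical illustration, not the paper's model: it uses a linear (PCA-like) projection onto a random orthonormal basis to stand in for a learned encoder and decoder, just to make the two-way mapping between images and a lower-dimensional feature space concrete. All names (`encode`, `decode`, `feat_dim`) are assumptions for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

img_size = 16 * 16      # flattened 2D microstructure image (assumed size)
feat_dim = 8            # lower-dimensional feature space (assumed size)

# Orthonormal basis spanning the feature subspace; a learned model
# would replace this with trained encoder/decoder networks.
basis, _ = np.linalg.qr(rng.standard_normal((img_size, feat_dim)))

def encode(x):
    """Map a flattened microstructure image to feature-space coordinates."""
    return basis.T @ x

def decode(z):
    """Generate a flattened image from a feature-space sample."""
    return basis @ z

x = rng.standard_normal(img_size)
x_hat = decode(encode(x))   # reconstruction within the learned subspace

# Decoding a fresh feature-space sample yields a new image whose
# re-encoding recovers the same features: the two-way mapping.
z_new = rng.standard_normal(feat_dim)
assert np.allclose(encode(decode(z_new)), z_new)
```

Note that a linear projection like this shares the limitation the abstract points out: the decoder's output size is fixed by the training image size, which motivates models that also learn the spatial distribution of features.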