Interactive virtual characters are nowadays commonplace in games, animations, and Virtual Reality (VR) applications. However, relatively little work has so far considered the animation of interactive object manipulations performed by virtual humans. In this paper, we first present a hierarchical control architecture incorporating plans, behaviors, and motor programs that enables virtual humans to accurately manipulate scene objects using different grasp types. As a second main contribution, we introduce a method by which virtual humans learn to imitate object manipulations performed by human VR users. To this end, movements of the VR user are analyzed and processed into abstract actions. A new data structure called grasp events is used to store information about user interactions with scene objects. High-level plans are then instantiated from grasp events to drive the virtual humans' animation. Because they are represented at a high level rather than as raw motion data, recorded manipulations often adapt naturally to new situations without losing plausibility.
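To make the idea of grasp events more concrete, the following is a minimal illustrative sketch, not the implementation described in the paper. The field names (`time`, `kind`, `object_id`, `grasp_type`, `hand`), the function `instantiate_plan`, and the action tuples are all hypothetical assumptions chosen for illustration. The sketch only shows how a recording that refers to objects by identity, rather than by absolute hand trajectories, might be turned into an abstract action sequence.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class GraspEvent:
    """Hypothetical record of one user interaction with a scene object."""
    time: float        # seconds since the start of the recording
    kind: str          # "grasp" or "release" (assumed event kinds)
    object_id: str     # scene object referred to by name, not by position
    grasp_type: str    # e.g. "power" or "precision" (assumed labels)
    hand: str = "right"


def instantiate_plan(events: List[GraspEvent]) -> List[Tuple[str, ...]]:
    """Turn recorded grasp events into an ordered list of abstract actions.

    Actions name objects symbolically, so object positions can be looked up
    when the plan is executed rather than when it was recorded.
    """
    plan: List[Tuple[str, ...]] = []
    for ev in sorted(events, key=lambda e: e.time):
        if ev.kind == "grasp":
            plan.append(("reach", ev.hand, ev.object_id))
            plan.append(("grasp", ev.hand, ev.object_id, ev.grasp_type))
        elif ev.kind == "release":
            plan.append(("release", ev.hand, ev.object_id))
        else:
            raise ValueError(f"unknown event kind: {ev.kind!r}")
    return plan


if __name__ == "__main__":
    recorded = [
        GraspEvent(time=2.0, kind="release", object_id="cup", grasp_type="power"),
        GraspEvent(time=0.5, kind="grasp", object_id="cup", grasp_type="power"),
    ]
    for action in instantiate_plan(recorded):
        print(action)
```

Under these assumptions, the plan stores no coordinates, which is one way a recorded manipulation could be replayed in a scene where the object has been moved.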