In this paper, a novel system is presented to detect and track multiple targets in Unmanned Air Vehicles (UAV) video sequences. Since the output of the system is based on target motion, we first segment foreground moving areas from the background in each video frame using background subtraction. To stabilize the video, a multi-point-descriptor-based image registration method is performed where a projective model is employed to describe the global transformation between frames. For each detected foreground blob, an object model is used to describe its appearance and motion information. Rather than immediately classifying the detected objects as targets, we track them for a certain period of time and only those with qualified motion patterns are labeled as targets. In the subsequent tracking process, a Kalman filter is assigned to each tracked target to dynamically estimate its position in each frame. Blobs detected at a later time are used as observations to update the state of the tracked targets to which they are associated. The proposed overlap-rate-based data association method considers the splitting and merging of the observations, and therefore is able to maintain tracks more consistently. Experimental results demonstrate that the system performs well on real-world UAV video sequences. Moreover, careful consideration given to each component in the system has made the proposed system feasible for real-time applications.