This paper proposes a trainable computer vision approach for registering visual objects relative to a collection of training images obtained a priori. The algorithm first determines whether the input image belongs to the scene location; if it does, the algorithm identifies objects of interest within the image and geo-registers them. To accomplish this, the processing chain relies on 3-D structure derived from motion to represent feature locations in a proposed model. Objects are detected with current state-of-the-art algorithms, and their two-dimensional sizes in pixels are converted into relative 3-D real-world coordinates using scene information, homography, and camera geometry. Object locations can then be reported together with distance alignment information. These tasks can be accomplished efficiently. Finally, the algorithm is evaluated with receiver operating characteristics, computational analysis, and registration errors in physical distances.
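As a rough illustration of the pixel-to-world conversion summarized above, the following sketch shows two standard building blocks: mapping an image point to a plane through a 3x3 homography, and estimating range from an object's pixel height with the pinhole camera model. This is not the paper's implementation; the function names, the example matrix, and the numeric values are hypothetical and chosen only to make the geometry concrete.

```python
def apply_homography(H, x, y):
    """Map image pixel (x, y) to plane coordinates with a 3x3 homography H.

    H is a nested list [[h11, h12, h13], [h21, h22, h23], [h31, h32, h33]].
    The result is the de-homogenized point (u / w, v / w).
    """
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    return u / w, v / w


def distance_from_pixel_height(focal_px, real_height_m, pixel_height):
    """Pinhole-model range estimate: Z = f * H_real / h_pixels.

    Assumes the object is roughly fronto-parallel to the image plane and
    that its real-world height is known (a hypothetical prior here).
    """
    if pixel_height <= 0:
        raise ValueError("pixel_height must be positive")
    return focal_px * real_height_m / pixel_height


if __name__ == "__main__":
    # Hypothetical homography: scale by 0.5 and translate by (1, 2).
    H = [[0.5, 0.0, 1.0],
         [0.0, 0.5, 2.0],
         [0.0, 0.0, 1.0]]
    print(apply_homography(H, 4.0, 6.0))
    # Hypothetical camera: 1000 px focal length, 1.5 m object, 100 px tall.
    print(distance_from_pixel_height(1000.0, 1.5, 100.0))
```

In practice the homography would be estimated from matched features between the input image and the training collection, and the focal length would come from camera calibration; both are treated as given here.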