A system, a computer-readable medium, and a method for creating an annotated 3D model are provided. First, 3D coordinates of at least two real alignment points for or on a real object are acquired. Second, a 3D virtual space, in which a 3D model exists, is merged with a 3D real space, in which the real object exists, by matching at least two virtual alignment points of the 3D model with the at least two real alignment points of the real object, thereby aligning the 3D model with the real object. Third, an annotated 2D image/video of the real object is prepared and projected onto surfaces of the 3D model by translating the 3D coordinate and orientation, in the 3D real space, of the visual sensor used to acquire the annotated 2D image/video into a 3D coordinate and orientation of the visual sensor in the 3D virtual space, thereby creating an annotated 3D model.
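The alignment and pose-translation steps can be illustrated with a minimal sketch. It is not the claimed implementation, and every function name in it is hypothetical. It makes two assumptions not stated in the text: the real and virtual spaces share the same scale, and they share the same vertical (z) axis. Under those assumptions two alignment-point pairs are enough to fix a rotation about z plus a translation. Without them, two points leave the rotation about the line joining them undetermined. The same transform is then applied to the position and viewing direction of the visual sensor. The projection of the annotated image onto the model surfaces is not shown.

```python
import math


def rotate_z(p, yaw):
    """Rotate a 3D point or direction about the z axis by yaw radians."""
    c, s = math.cos(yaw), math.sin(yaw)
    x, y, z = p
    return (c * x - s * y, s * x + c * y, z)


def estimate_alignment(real_pts, virtual_pts):
    """Estimate (yaw, translation) mapping real space onto virtual space.

    Assumption: same scale and same vertical axis in both spaces, and the
    two alignment points are not stacked vertically (otherwise the
    horizontal direction between them is undefined).
    """
    r1, r2 = real_pts[:2]
    v1, v2 = virtual_pts[:2]
    # Compare the horizontal heading of the point pair in each space.
    heading_real = math.atan2(r2[1] - r1[1], r2[0] - r1[0])
    heading_virtual = math.atan2(v2[1] - v1[1], v2[0] - v1[0])
    yaw = heading_virtual - heading_real
    # Translation that carries the rotated first real point onto the
    # first virtual point.
    rotated_r1 = rotate_z(r1, yaw)
    translation = tuple(v - r for v, r in zip(v1, rotated_r1))
    return yaw, translation


def real_to_virtual(p, yaw, translation):
    """Map a real-space position into virtual space."""
    q = rotate_z(p, yaw)
    return tuple(a + b for a, b in zip(q, translation))


def sensor_pose_to_virtual(position, forward, yaw, translation):
    """Translate a sensor pose: positions rotate and shift, directions only rotate."""
    return real_to_virtual(position, yaw, translation), rotate_z(forward, yaw)


# Hypothetical example values, chosen only for illustration.
real_alignment = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
virtual_alignment = [(2.0, 3.0, 1.0), (2.0, 4.0, 1.0)]
yaw, translation = estimate_alignment(real_alignment, virtual_alignment)
sensor_position, sensor_forward = sensor_pose_to_virtual(
    (0.0, -1.0, 0.5), (0.0, 1.0, 0.0), yaw, translation
)
```

With the sensor pose expressed in the virtual space, each pixel of the annotated 2D image/video could be cast as a ray from that pose onto the model surfaces. When three or more non-collinear alignment points are available, a general rigid-body fit would remove the shared-vertical-axis assumption.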