When a vehicle navigation system is operated by capturing a user's hand motions and gestures with a camera, the number of associated hand shapes and hand motions grows as the number of apparatuses and operational objects increases, making manipulation complex for the user. Furthermore, when a hand is detected with the camera, detection accuracy is reduced if a face having color tone information similar to that of the hand appears in the captured image, or if outside light such as sunlight or illumination varies. To overcome these problems, a manipulation input device is provided that includes a limited hand manipulation determination unit and a menu representation unit, whereby manipulation is kept simple and can be determined accurately. In addition, detection accuracy can be improved by a unit that selects a single result from the results determined by a plurality of determination units, based on images taken with a plurality of cameras.
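The selection step described in the last sentence can be illustrated with a minimal sketch. The text does not state how the single result is chosen, so the rule used here (take the detection with the highest confidence score) is purely an assumption. The names `Determination`, `select_result`, `camera_a`, `camera_b`, the gesture labels, and the confidence field are all hypothetical and do not come from the source.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Determination:
    """Output of one determination unit (all fields are hypothetical)."""
    source: str              # label of the camera / determination unit
    gesture: Optional[str]   # None when no hand manipulation was detected
    confidence: float        # assumed score in [0, 1]; not defined in the source


def select_result(results: Sequence[Determination]) -> Optional[Determination]:
    """Select a single determination from several units.

    Assumed rule: ignore units that detected nothing, then take the
    detection with the highest confidence. Returns None if no unit
    detected a hand manipulation.
    """
    candidates = [r for r in results if r.gesture is not None]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.confidence)


if __name__ == "__main__":
    results = [
        Determination("camera_a", "two_fingers", 0.35),  # e.g. degraded by outside light
        Determination("camera_b", "two_fingers", 0.90),
    ]
    chosen = select_result(results)
    print(chosen.source if chosen else "no detection")
```

Other selection rules, such as a fixed priority between cameras depending on lighting conditions, would fit the same interface; only the body of `select_result` would change.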