271 results about "Monocular image" patented technology

Deep learning system for cuboid detection

Systems and methods for cuboid detection and keypoint localization in images are disclosed. In one aspect, a deep cuboid detector can be used for simultaneous cuboid detection and keypoint localization in monocular images. The deep cuboid detector can include a plurality of convolutional layers and non-convolutional layers of a trained convolution neural network for determining a convolutional feature map from an input image. A region proposal network of the deep cuboid detector can determine a bounding box surrounding a cuboid in the image using the convolutional feature map. The pooling layer and regressor layers of the deep cuboid detector can implement iterative feature pooling for determining a refined bounding box and a parameterized representation of the cuboid.
Owner:MAGIC LEAP INC

Full convolution neural network (FCN)-based monocular image depth estimation method

The invention discloses a fully convolutional network (FCN)-based monocular image depth estimation method. The method comprises the steps of: acquiring training image data; inputting the training image data into a fully convolutional network (FCN) and passing it through successive pooling layers to obtain feature maps; upsampling each feature map output by the last pooling layer to the same dimensions as the feature map output by the previous pooling layer, and fusing the two feature maps; fusing the output feature maps of the pooling layers in turn from back to front to obtain a final predicted depth image; training the parameters of the FCN with stochastic gradient descent (SGD); and acquiring an RGB image for which depth is to be predicted and inputting it into the trained FCN to obtain the corresponding predicted depth image. The method addresses the low resolution of the output image caused by the convolution process. Because the network is fully convolutional, the fully connected layers are removed, which effectively reduces the number of parameters in the network.
Owner:NANJING UNIV OF POSTS & TELECOMM
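
The upsample-and-fuse step in the FCN abstract above can be sketched in plain Python. This is only an illustration under assumptions the abstract does not state: nearest-neighbour 2x upsampling and element-wise addition as the fusion rule. The function names upsample2x and fuse are hypothetical.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x enlargement of a 2D feature map (list of rows)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each value horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat the row vertically
    return out


def fuse(coarse, fine):
    """Upsample the coarse map to the size of the fine map and add them.

    Element-wise addition is an assumed fusion rule, not taken from the patent.
    """
    up = upsample2x(coarse)
    if len(up) != len(fine) or len(up[0]) != len(fine[0]):
        raise ValueError("feature maps do not match after upsampling")
    return [[a + b for a, b in zip(r_up, r_fine)] for r_up, r_fine in zip(up, fine)]


coarse = [[1.0, 2.0]]                      # output of the last pooling layer
fine = [[0.5, 0.5, 0.5, 0.5],
        [0.5, 0.5, 0.5, 0.5]]              # output of the previous pooling layer
fused = fuse(coarse, fine)
```

Repeating fuse from the deepest pooling layer towards the shallowest one mirrors the back-to-front fusion the abstract describes.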

Depth estimation method for monocular image based on multi-scale CNN and continuous CRF

A depth estimation method for a monocular image based on a multi-scale CNN and a continuous CRF is disclosed in this invention. A CRF module is adopted to calculate a unary potential energy according to the output depth map of a DCNN, and the pairwise sparse potential energy according to input RGB images. MAP (maximum a posteriori estimation) algorithm is used to infer the optimized depth map at last. The present invention integrates optimization theories of the multi-scale CNN with that of the continuous CRF. High accuracy and a clear contour are both achieved in the estimated depth map; the depth estimated by the present invention has a high resolution and detailed contour information can be kept for all objects in the scene, which provides better visual effects.
Owner:ZHEJIANG GONGSHANG UNIVERSITY

Image file processing apparatus, image file reproduction apparatus and image file working/editing apparatus

The present invention is provided with: a stereo image data generation unit for generating stereo image data indicating multiple monocular images obtained with a predetermined parallax between the images for a same subject; a metadata generation unit for generating metadata about the stereo image data; an image characteristic information generation unit for generating image characteristic information showing characteristics of an image indicated on the basis of the stereo image data; and an image file generation unit for synthesizing the stereo image data generated by the stereo image data generation unit and the metadata generated by the metadata generation unit to generate an image file; wherein the image file generation unit adds the image characteristic information generated by the image characteristic information generation unit to the metadata generated by the metadata generation unit.
Owner:OLYMPUS IMAGING CORP

Monocular 3D pose estimation and tracking by detection

Methods and apparatus are described for monocular 3D human pose estimation and tracking, which are able to recover poses of people in realistic street conditions captured using a monocular, potentially moving camera. Embodiments of the present invention provide a three-stage process involving estimating (10, 60, 110) a 3D pose of each of the multiple objects using an output of 2D tracking-by-detection (50) and 2D viewpoint estimation (46). The present invention provides a sound Bayesian formulation to address the above problems. The present invention can provide articulated 3D tracking in realistic street conditions. The present invention provides methods and apparatus for people detection and 2D pose estimation combined with a dynamic motion prior. The present invention not only provides 2D pose estimation for people in side views, it goes beyond this by estimating poses in 3D from multiple viewpoints. The estimation of poses is done in monocular images and does not require stereo images. The present invention also does not require detection of characteristic poses of people.
Owner:TECH UNIV DARMSTADT +1

Monocular image-oriented three-dimensional object detection method based on three-dimensional reconstruction

The invention discloses a monocular image-oriented three-dimensional object detection method based on three-dimensional reconstruction, and belongs to the field of image processing and computer vision. The method comprises the following steps: first, converting input data from a two-dimensional image plane into a three-dimensional point cloud space by utilizing an independent module so as to obtain a better input representation; then performing three-dimensional detection by using a PointNet network as a backbone network to obtain a three-dimensional position, a three-dimensional size and a three-dimensional direction of the object. In order to improve the recognition capability of the point cloud, the invention provides a multi-modal feature fusion module, in which RGB information of points and RGB features of the ROI are supplemented and embedded into the generated point cloud representation. Compared with working on a two-dimensional image, deriving the three-dimensional bounding box from the three-dimensional scene is more efficient; compared with similar three-dimensional object detection methods based on a monocular camera, the method provided by the invention is more efficient.
Owner:DALIAN UNIV OF TECH
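
The conversion from the two-dimensional image plane to a three-dimensional point cloud mentioned above is commonly done by back-projecting a depth map through a pinhole camera model. The sketch below assumes that a per-pixel depth map and the intrinsics fx, fy, cx, cy are already available; the abstract does not say how its independent module works, so this is illustrative only and the function name backproject is hypothetical.

```python
def backproject(depth, fx, fy, cx, cy):
    """Turn a depth map (list of rows, metres) into camera-frame 3D points.

    Pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    Pixels with non-positive depth are treated as invalid and skipped.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


depth = [[2.0, 0.0],
         [4.0, 2.0]]
pts = backproject(depth, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
```

Each returned tuple could then be extended with the RGB value of its source pixel, in the spirit of the feature fusion the abstract describes.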

Three-dimensional object recognition system and method based on laser radar and monocular vision

The invention discloses a three-dimensional object recognition system and method based on laser radar and monocular vision. The method comprises the following steps: projecting the point cloud onto a two-dimensional image, extracting and matching features of the projected image and a monocular image, and calculating a projective transformation relationship between the projected image and the monocular image; carrying out object recognition using the rich color information of the monocular image and extracting an object region of interest; inversely calculating the corresponding point cloud blocks of the object in the laser radar point cloud through the transformation relationship between the laser radar and the monocular camera, and finally outputting the spatial position information of the point cloud blocks. The three-dimensional object recognition system and method overcome the difficulty of recognizing objects directly in a three-dimensional point cloud, and achieve recognition and positioning of an object in a three-dimensional scene by means of the mature technique of two-dimensional image object recognition and the high-precision ranging performance of laser radar.
Owner:XI'AN INST OF OPTICS & FINE MECHANICS - CHINESE ACAD OF SCI
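
A minimal sketch of the two geometric steps in the abstract above: projecting laser radar points into the image, then recovering which points fall inside a detected 2D region. It assumes the points are already expressed in the camera frame and that a simple pinhole model applies; the feature matching and projective transformation estimation from the patent are not reproduced, and all names are hypothetical.

```python
def project_points(points, fx, fy, cx, cy, width, height):
    """Map each pixel (u, v) to the indices of the 3D points that land on it."""
    pixels = {}
    for idx, (x, y, z) in enumerate(points):
        if z <= 0:                      # behind the camera
            continue
        u = int(round(fx * x / z + cx))
        v = int(round(fy * y / z + cy))
        if 0 <= u < width and 0 <= v < height:
            pixels.setdefault((u, v), []).append(idx)
    return pixels


def points_in_box(pixels, box):
    """Return sorted point indices whose projection lies in box = (u0, v0, u1, v1)."""
    u0, v0, u1, v1 = box
    hits = []
    for (u, v), indices in pixels.items():
        if u0 <= u <= u1 and v0 <= v <= v1:
            hits.extend(indices)
    return sorted(hits)


cloud = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0), (0.0, 0.0, -1.0), (10.0, 0.0, 1.0)]
pixels = project_points(cloud, fx=100.0, fy=100.0, cx=50.0, cy=50.0,
                        width=200, height=100)
block = points_in_box(pixels, (40, 40, 60, 60))
```

Here block holds the indices of the point cloud block for one detected object; its 3D coordinates can be read back from cloud.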

Article classification method based on depth recovery information

The invention relates to the technical field of article classification and monocular image depth estimation in the field of computer vision, and proposes a model which enhances classification performance by introducing depth information and only needs an RGB image as input instead of a real depth image acquired by a sensor. The invention provides an article classification method based on depth recovery information. The method includes the following steps: (1) preprocessing a data set; (2) establishing a depth recovery model within the model; (3) training two image classification models which respectively receive RGB and depth images as input; (4) establishing a final fusion model, and performing training and tests; (5) migrating the fusion network trained in step 4 to a classification dataset of natural images; and (6) comparing the image classification results of the model on two public data sets and performing visualization. The method is mainly applied to article classification and monocular image depth estimation in the field of computer vision.
Owner:TIANJIN UNIV

Extraction method of monocular image depth map based on foreground and background fusion

The invention discloses an extraction method of a monocular image depth map based on foreground and background fusion, and belongs to the field of three-dimensional image reconstruction in computer vision. The method of the invention comprises the following steps: step A, a non-parametric machine learning method is used to extract a foreground depth map from an original monocular image; step B, a linear perspective method is used to estimate a background depth map with an overall distribution trend for the original monocular image; step C, the foreground depth map and the background depth map of the original monocular image are globally integrated to obtain a final depth map of the original monocular image. Compared with the prior art, the extraction method does not need to compute camera parameters, has low computational complexity, and is simple and practicable.
Owner:NANJING UNIV OF POSTS & TELECOMM

Sparse laser observation-based image depth estimation method

The invention discloses a sparse laser observation-based image depth estimation method. The method proposes that monocular image-based depth dense reconstruction is realized by utilizing sparse observation of single-line laser or multi-line laser. A deep neural network is trained in a mode of constructing a reference depth map and a residual error depth map, and sparse partial observation depth information is fully utilized. Compared with a method for performing depth estimation only by using a monocular image, the method provided by the invention has remarkable advantages.
Owner:ZHEJIANG UNIV

Image outputting apparatus and program

An operation unit switches the operating mode between a 3D display output mode in which the images of the 3D image file are displayed as a stereoscopic image and a 2D display output mode in which one of the images of the 3D image file is displayed as an ordinary planar image. The 3D image file is composed of stereoscopic image data which represents a plurality of monocular images constituting a multi-ocular stereoscopic image and information which is added to the stereoscopic image data and which indicates that the data is stereoscopic data. In the 2D display output mode, a control unit makes an output unit display an image of the 3D image file read from a medium by a media reader and also display a mark on the same screen, indicating that the image displayed in the 2D display output mode is based on the 3D image file.
Owner:OLYMPUS IMAGING CORP +2

Unmanned aerial vehicle scene dense reconstruction method based on VI-SLAM and depth estimation network

The invention relates to an unmanned aerial vehicle scene dense reconstruction method based on VI-SLAM and depth estimation, and the method comprises the steps of: (1) fixing an inertial measurement unit (IMU) to an unmanned aerial vehicle, and calibrating the internal parameters and external parameters of a monocular camera of the unmanned aerial vehicle and the external parameters of the IMU; (2) collecting an image sequence and IMU information of an unmanned aerial vehicle scene by using the monocular camera and the IMU; (3) processing the images and the IMU information acquired in step (2) by using VI-SLAM to obtain a camera pose with scale information; (4) inputting the monocular image as an original view into a viewpoint generation network to obtain a right view, and inputting the original view and the right view into a depth estimation network to obtain depth information of the image; (5) combining the camera pose obtained in step (3) with the depth map obtained in step (4) to obtain a local point cloud; and (6) through point cloud optimization and registration, fusing the SLAM tracking trajectory with the local point cloud to obtain a dense point cloud model of the unmanned aerial vehicle scene.
Owner:BEIHANG UNIV
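
Step (5) of the reconstruction method above, combining a camera pose with a depth map, can be illustrated for a single pixel. The sketch assumes a pinhole camera and a camera-to-world pose given as a 3x3 rotation matrix and a translation vector; both conventions and the function name are assumptions for illustration, not details taken from the patent.

```python
def depth_pixel_to_world(u, v, z, intrinsics, rotation, translation):
    """Back-project pixel (u, v) with depth z, then move it into the world frame.

    intrinsics  = (fx, fy, cx, cy) of an assumed pinhole camera
    rotation    = 3x3 camera-to-world rotation (list of rows)
    translation = camera position in the world frame
    """
    fx, fy, cx, cy = intrinsics
    cam = ((u - cx) * z / fx, (v - cy) * z / fy, z)     # camera-frame point
    return tuple(
        sum(rotation[i][j] * cam[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


# Camera rotated 90 degrees about the z axis and shifted by (1, 2, 3).
rot_z_90 = [[0, -1, 0],
            [1, 0, 0],
            [0, 0, 1]]
world_pt = depth_pixel_to_world(2, 1, 2.0, (1.0, 1.0, 1.0, 1.0),
                                rot_z_90, (1, 2, 3))
```

Applying this to every valid pixel of a frame gives that frame's local point cloud, already expressed in the common world frame used for later registration.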

Method and apparatus of generating three dimensional image data having one file structure and recording the image data on a recording medium, and recording medium for storing the three dimensional image data having one file structure

A stereo digital camera which generates a three-dimensional image or stereograph having one file structure, and records the three-dimensional image on a recording medium, includes a stereo adapter having optical axes corresponding to parallax, an imaging lens for transferring an object image via the stereo adapter, and a single CCD pickup unit having monocular regions on which the monocular images of the object image are projected by the imaging lens. Two monocular region images from the monocular regions form one multi-ocular stereo image, and are compressed as one image data. This image data is appended with header information which contains an item indicating that this image data is a stereo image, an item corresponding to the number of monocular images which form the stereo image, and an item associated with addresses of the first and second monocular images, and is stored in a recording medium as one file.
Owner:OM DIGITAL SOLUTIONS CORP +1

Method of Estimating Depths from a Single Image Displayed on Display

A method of estimating depths on a monocular image displayed on a display is utilized for improving correctness of depths shown on the display. Feature vectors are calculated for each patch on the monocular image for determining an intermediate depth map of the monocular image in advance. For improving the correctness of the intermediate depth map, an energy function in forms of vectors is minimized for calculating a best solution of the depth map of the monocular image. Therefore, the display may display the monocular image according to a calculated output depth map for having an observer of the display to correctly perceive depths on the monocular image.
Owner:ARCSOFT CORP LTD

Monocular image depth estimation method based on pyramid pooling module

The invention discloses a monocular image depth estimation method based on a pyramid pooling module. In the training stage, a neural network is first constructed, which comprises an input layer, a hidden layer and an output layer. The hidden layer includes a separate first convolution layer, a feature extraction network framework, a scale recovery network framework, a separate second convolution layer, a pyramid pooling module, and a separate connection layer. Each original monocular image in the training set is used as the original input image and is input into the neural network for training. The optimal weight vector and the optimal bias term of the trained neural network model are obtained by calculating the loss function value between the predicted depth image and the real depth image corresponding to each original monocular image in the training set. In the testing phase, the monocular image to be predicted is input into the neural network model, and the predicted depth image is obtained by using the optimal weight vector and the optimal bias term. The advantages are high prediction accuracy and low computational complexity.
Owner:ZHEJIANG UNIVERSITY OF SCIENCE AND TECHNOLOGY

UAV sequence monocular image based method for distance detection of ground feature under power line

Publication: CN107314762A (Active). Tags: Realize safe distance detection, Picture interpretation, Point cloud, Distance detection
The embodiment of the invention provides a UAV sequence monocular image based method for distance detection of ground features under a power line. In the method, GPS-assisted aerotriangulation is performed on sequence monocular camera images with absolute GPS positioning information; then a dense three-dimensional point cloud of the ground features under the power line and a stereo-measured conductor vector model are obtained based on the aerotriangulation result; and then safe distance detection of the ground features under the power line is realized based on the conductor vector model and the dense three-dimensional point cloud. Therefore, safe distance detection of the ground features under the power line can be achieved quickly, automatically and with high accuracy, which solves the technical problem that existing methods achieve accurate measurement only under demanding measurement conditions or require manual assistance. The embodiment of the invention also provides a UAV sequence monocular image based device for the distance detection of ground features under the power line.
Owner:ELECTRIC POWER RES INST OF GUANGDONG POWER GRID
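
The final safe distance check above compares a conductor model against a dense point cloud. A minimal sketch follows, assuming the conductor span is approximated by a straight 3D segment (a real conductor sags, and the patent's vector model is not described in detail) and that both are in the same metric coordinate frame. All names are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from 3D point p to the segment from a to b."""
    ab = [b[i] - a[i] for i in range(3)]
    ap = [p[i] - a[i] for i in range(3)]
    denom = sum(c * c for c in ab)
    # Projection parameter clamped to [0, 1] so the closest point stays on the segment.
    t = 0.0 if denom == 0 else max(0.0, min(1.0, sum(x * y for x, y in zip(ap, ab)) / denom))
    closest = [a[i] + t * ab[i] for i in range(3)]
    return math.sqrt(sum((p[i] - closest[i]) ** 2 for i in range(3)))


def clearance_violations(cloud, a, b, safe_distance):
    """Return (point, distance) pairs closer to the conductor than safe_distance."""
    result = []
    for p in cloud:
        d = point_segment_distance(p, a, b)
        if d < safe_distance:
            result.append((p, d))
    return result


conductor_start, conductor_end = (0.0, 0.0, 10.0), (10.0, 0.0, 10.0)
ground_cloud = [(5.0, 0.0, 0.0), (5.0, 0.0, 7.0), (20.0, 0.0, 10.0)]
violations = clearance_violations(ground_cloud, conductor_start, conductor_end, 5.0)
```

In this toy data only the point at height 7 (for example a treetop) is within 5 m of the conductor.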

Image processing apparatus, image processing and editing apparatus, image file reproducing apparatus, image processing method, image processing and editing method and image file reproducing method

In generating stereo image data based on a plurality of monocular images of the same subject with a predetermined parallax, a metadata generating section generates collateral data, related to stereo image data, and an image file generating section generates information related to a date and time at which collateral data was generated or updated. Stereo image data and collateral data are synthesized into an image file. Information, related to a date and time at which collateral data was generated or updated, and information, related to a date and time at which the image file is generated or updated, are further added for conversion to a predetermined file format to generate the image file. Using information related to such date and time makes it possible to appropriately process, edit and reproduce a subsequent stereo image even when the stereo image is processed and edited with non-3D-compliant equipment or software.
Owner:OLYMPUS CORP

Method of restoring three-dimensional human body posture from unmarked monocular image in combination with height map

The invention discloses a method of restoring a three-dimensional human body posture from an unmarked monocular image in combination with a height map. The method comprises the following steps: 1) a color image and a height image are used for training to obtain a deep convolutional network-based two-dimensional joint point recognition model; 2) a video frame image sequence and a camera parameter are inputted, and a height map corresponding to each frame of image is calculated; 3) the video frame image and the height map obtained in the second step are inputted, and the two-dimensional joint point recognition model obtained through training the first step is used to obtain two-dimensional joint point coordinates of a human body in each frame of image; and 4) the two-dimensional joint point coordinates obtained in the third step are inputted, and the human body three-dimensional posture is restored according to an optimization model. During the two-dimensional joint point recognition process, the color image and the height image are used integrally, and the two-dimensional joint point recognition accuracy is improved; and time sequence consistency constraints are added to the optimization model which can restore the three-dimensional human body posture from the two-dimensional joint point, and thus the restored three-dimensional human body posture is closer to the real human body posture.
Owner:ZHEJIANG UNIV

Three-dimensional target detection method and device and storage medium

The invention provides a three-dimensional target detection method and device and a storage medium, and the method comprises the steps of: setting a first coordinate center of a target object in a monocular image as a second coordinate center of a 3D bounding box; setting spatial coordinate constraints of the 3D bounding box according to external parameters and internal parameters, setting a direction loss function and a size loss function of the 3D bounding box, and generating a model loss function; and training the convolutional neural network model by using monocular image training samples and based on the spatial coordinate constraints and the model loss function so as to perform three-dimensional target detection processing on the monocular image. The spatial coordinate constraints, the direction loss function and the size loss function are set, and the convolutional neural network model is trained to construct a multi-task neural network, so that 3D target detection on a monocular image can be realized; the efficiency and precision of three-dimensional target detection can be improved, and the cost of use is reduced.
Owner:JINGDONG TECH HLDG CO LTD

Front face feature-based vehicle type recognition method

The invention provides a front face feature-based vehicle type recognition method, which comprises the following parts: S01, executing an image histogram information-based road surface vehicle automatic extraction method: analyzing road surface images sent back by a traffic checkpoint on a road, and extracting possible vehicle areas in the road surface images by adopting a monocular image analysis method; S02, executing a color and gradient information-fused vehicle front face interception method: analyzing the color and the gradient information of a target in vehicle area images obtained in the step S01 to complete the interception of a vehicle front face; S03, performing heterogeneous sample analysis-based vehicle type online training, and establishing vehicle templates of various vehicle types; S04, executing a vehicle front face feature subspace-based vehicle type judging method: matching the vehicle front face intercepted in the step S02 and the vehicle templates obtained in the step S03 to obtain the judging decision of the vehicle types. According to the front face feature-based vehicle type recognition method disclosed by the invention, the automatic recognition of the vehicle types can be accurately performed, and the daily work of relevant departments requiring vehicle type information is greatly facilitated.
Owner:江苏博世建设有限公司

Three-dimensional image quality objective evaluation method based on visual fidelity

The invention discloses a three-dimensional image quality objective evaluation method based on visual fidelity. The method includes: in a training stage, selecting multiple original distortionless three-dimensional images to form a training image set, determining through area detection whether pixel points in the distortionless three-dimensional images belong to an occluded area or a matching area, and constructing a monocular vision dictionary table and a binocular vision dictionary table for the training image set through unsupervised learning; in a testing stage, for the test three-dimensional images and the original distortionless three-dimensional images, estimating the sparse coefficient array of each subblock belonging to the occluded area and the matching area in the test three-dimensional images and the corresponding distortionless three-dimensional images according to the monocular vision dictionary table and the binocular vision dictionary table, calculating a monocular image quality objective evaluation prediction value and a binocular image quality objective evaluation prediction value from the sparse coefficient arrays, and finally combining them to obtain an image quality evaluation prediction value. The advantage of the method is that the obtained image quality objective evaluation prediction value is highly consistent with subjective evaluation values.
Owner:NINGBO UNIV

Monocular vision scene depth estimation method based on deep learning

The invention relates to a monocular vision scene depth estimation method based on deep learning, and the method comprises the steps of: adopting a VGG-13 network model in which depthwise separable convolution layers replace standard convolution layers so as to reduce the number of model parameters, and obtaining a network model that can be used to predict a disparity map; inputting the monocular image into the trained network model to generate disparity maps at multiple scales, and generating a single disparity map at the same scale as the input image by combining multi-scale fusion and disparity map smoothing; and generating a corresponding depth image according to the geometric transformation relationship between the disparity map and the depth map in multi-view geometry. The method has the beneficial effects that simple and easily available binocular visible light images are used for training the network model without acquiring high-cost real depth data, and that replacing the standard convolution with the depthwise separable convolution reduces the number of parameters of the network model to one seventh of the previous number and increases the inference speed of the model.
Owner:NORTHWESTERN POLYTECHNICAL UNIV
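
Two pieces of arithmetic behind the abstract above can be made concrete. The first function uses the standard rectified-stereo relation depth = focal length x baseline / disparity; the other two count the weights of a standard k x k convolution and of a depthwise separable one (depthwise k x k followed by 1 x 1 pointwise), with bias terms omitted. The layer sizes in the example are arbitrary and are not taken from the patent.

```python
def disparity_to_depth(disparity, focal_px, baseline_m):
    """Convert a disparity map (pixels) to depth (metres); zero disparity maps to infinity."""
    return [[focal_px * baseline_m / d if d > 0 else float("inf") for d in row]
            for row in disparity]


def conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (no bias)."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Weights in a depthwise k x k plus pointwise 1 x 1 convolution (no bias)."""
    return k * k * c_in + c_in * c_out


depth = disparity_to_depth([[10.0, 0.0]], focal_px=400.0, baseline_m=0.5)
standard = conv_params(3, 64, 128)
separable = separable_conv_params(3, 64, 128)
```

The reduction for a whole network depends on its channel widths and on which layers are replaced, so the per-layer ratio here should not be read as the figure quoted in the abstract.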

Object detecting apparatus and object detecting method

An object of the present invention is to further improve the detection accuracy of a lateral position of a target in an object detecting apparatus for detecting an object by using a radar and a monocular image sensor. In the present invention, a target corresponding to a target recognized by the radar is extracted, and a right edge and a left edge of the target are acquired from an image picked up by the monocular image sensor. Further, locus approximation lines, which are straight lines or predetermined curved lines for approximating the loci of the right edge and the left edge, are derived for both edges. Of the right edge and the left edge, the one with the larger number of edge points lying on its locus approximation line is selected as the true edge of the target. The lateral position of the target is derived on the basis of the position of the selected edge.
Owner:TOYOTA JIDOSHA KK
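
The edge selection rule above can be sketched as follows. The sketch assumes a straight locus approximation line fitted by ordinary least squares and a fixed tolerance for deciding whether an observation lies on the line; neither the fitting method nor the tolerance comes from the patent, and the function names are hypothetical.

```python
def fit_line(ts, xs):
    """Least-squares fit x = slope * t + intercept."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_x = sum(xs) / n
    var_t = sum((t - mean_t) ** 2 for t in ts)
    if var_t == 0:
        return 0.0, mean_x
    slope = sum((t - mean_t) * (x - mean_x) for t, x in zip(ts, xs)) / var_t
    return slope, mean_x - slope * mean_t


def count_on_line(ts, xs, tol):
    """Number of observations within tol of the fitted locus line."""
    slope, intercept = fit_line(ts, xs)
    return sum(1 for t, x in zip(ts, xs) if abs(x - (slope * t + intercept)) <= tol)


def select_true_edge(ts, left, right, tol=0.1):
    """Pick the edge whose track agrees better with its own locus line."""
    if count_on_line(ts, left, tol) >= count_on_line(ts, right, tol):
        return "left"
    return "right"


frames = [0, 1, 2, 3, 4]
left_edge = [1.0, 1.1, 1.2, 1.3, 1.4]      # smooth lateral track
right_edge = [3.0, 3.1, 4.0, 3.3, 2.5]     # jittery track, e.g. a misdetected edge
chosen = select_true_edge(frames, left_edge, right_edge)
```

The lateral position of the target would then be derived from the most recent position of the chosen edge.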

Monocular image depth-of-field real-time calculation method based on unsupervised deep learning

The invention discloses a monocular image depth-of-field real-time calculation method based on unsupervised deep learning, and the method comprises the steps of: constructing a supervision signal by employing the geometric constraint relationship between binocular sequence images in place of a conventional manually labeled data set, thereby completing the design of an unsupervised algorithm. In the Depth-CNN network, the loss function not only considers geometric constraints between images but also includes consistency constraint terms on the depth-of-field estimation results for the left image and the right image, so that the algorithm accuracy is improved; the output of Depth-CNN is used as part of the Pose-CNN input to construct an overall objective function, and the geometric relationship between binocular images and the geometric relationship between sequence images are used to construct a supervision signal, thereby further improving the accuracy and robustness of the algorithm.
Owner:XIAMEN UNIV +1

Automobile cruise control method based on monocular vision and implement system thereof

The invention discloses a vehicle cruise control method based on monocular vision and a system for implementing it. The method comprises the following steps: a camera is fixedly arranged near the rearview mirror in front of the driver; a monocular image is obtained from the camera; segmentation and vehicle detection are carried out on the image; if no car is in front, the cruise speed is kept or restored and processing of the current frame ends; if a car is detected in front, the distance and speed of the car are calculated; if the distance to the car in front is less than the safe distance, appropriate braking is carried out through the automobile braking control system; otherwise, processing of the current frame ends and the method moves on to the next frame. The method is implemented by a system which consists of the camera, an embedded calculation module, a display and a loudspeaker. The vehicle cruise control method has the advantages of high safety, low computational complexity, low hardware requirements, easy batch production and low cost.
Owner:NANJING UNIV OF SCI & TECH
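
The per-frame decision logic in the cruise control abstract above can be written as a small function. The detection format (a distance and speed pair, or None when no car is found) and the returned action labels are hypothetical; image segmentation, vehicle detection and the distance calculation are outside this sketch.

```python
def cruise_step(detection, safe_distance_m):
    """Decide the action for one frame.

    detection is None when no car is ahead, otherwise (distance_m, speed_mps).
    """
    if detection is None:
        return "keep_or_restore_cruise_speed"
    distance_m, _speed_mps = detection
    if distance_m < safe_distance_m:
        return "brake"
    return "continue_to_next_frame"


# Simulated detections for three consecutive frames.
frames = [None, (40.0, 20.0), (12.0, 15.0)]
actions = [cruise_step(d, safe_distance_m=25.0) for d in frames]
```

A fixed safe distance keeps the example short; a real controller would more plausibly derive it from the current speed.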

Method and device for detecting road obstacle

The invention discloses a method and device for detecting a road obstacle. The method comprises: acquiring road images through a binocular device and processing the road images to obtain a disparity map of the road images; carrying out pavement-information-based image segmentation to obtain a binary image with the obstacle region as foreground and other regions as background; carrying out a morphological operation on the binary image to obtain a to-be-detected region template of a suspected obstacle; carrying out information fusion of the to-be-detected region template and a monocular image, which is one of the road images, to obtain a grayscale image of the to-be-detected region; and inputting the grayscale image into a preset obstacle detection model to carry out obstacle determination on the road images and determining the suspected obstacle based on the determination result. By fusing the respective advantages of binocular disparity information and monocular image information, the invention carries out machine-learning-based target recognition and target classification to obtain a quick and accurate obstacle detection result.
Owner:BEIJING SMARTER EYE TECH CO LTD
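
The abstract above does not say which morphological operation is applied to the binary image. As one plausible illustration, the sketch below performs an opening (erosion followed by dilation) with a 3 x 3 structuring element, which removes isolated foreground pixels while keeping larger regions. Pixels outside the image are treated as background; the function names are hypothetical.

```python
NEIGHBOURS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def erode(img):
    """A pixel stays 1 only if its whole 3 x 3 neighbourhood is foreground."""
    h, w = len(img), len(img[0])
    return [[int(all(0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx] == 1
                     for dy, dx in NEIGHBOURS))
             for x in range(w)] for y in range(h)]


def dilate(img):
    """A pixel becomes 1 if any pixel in its 3 x 3 neighbourhood is foreground."""
    h, w = len(img), len(img[0])
    return [[int(any(0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx] == 1
                     for dy, dx in NEIGHBOURS))
             for x in range(w)] for y in range(h)]


def opening(img):
    return dilate(erode(img))


binary = [[0, 0, 0, 0, 1],     # the lone 1 in the corner is segmentation noise
          [0, 1, 1, 1, 0],
          [0, 1, 1, 1, 0],
          [0, 1, 1, 1, 0],
          [0, 0, 0, 0, 0]]
template = opening(binary)
```

The resulting template keeps the 3 x 3 obstacle candidate and drops the noise pixel; it could then be used to mask the monocular image.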