Three-dimensional model classification and retrieval method based on visual saliency information sharing
A 3D model and information sharing technology, applied to neural learning methods, biological neural network models, and character and pattern recognition. It addresses the large amount of information carried by 3D models and the high time and space complexity of 3D model classification and retrieval tasks, achieving a more comprehensive description and improved accuracy, flexibility, and stability.
Examples
Embodiment 1
[0031] A 3D model classification and retrieval method based on visual saliency information sharing, see Figure 1, mainly includes three parts: first, attention-based view weight calculation; second, view attention pooling; and finally, shape descriptor generation. The specific implementation steps are as follows:
[0032] 101: Given a 3D model, extract a view every 30 degrees around its Z-axis, and extract a feature descriptor for each virtual image with a deep convolutional neural network;
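The view-sampling step above can be sketched as follows. This is an illustrative, numpy-only sketch (the function name `camera_positions` and the camera radius are assumptions, not from the patent): it places one virtual camera every 30 degrees on a circle around the Z-axis, yielding the 12 viewpoints from which views would be rendered.

```python
import numpy as np

def camera_positions(radius=2.0, step_deg=30):
    """Virtual camera positions on a circle around the Z axis,
    one every `step_deg` degrees (12 cameras for 30-degree steps).
    radius is a hypothetical camera distance from the model center."""
    az = np.deg2rad(np.arange(0, 360, step_deg))
    return np.stack([radius * np.cos(az),
                     radius * np.sin(az),
                     np.zeros_like(az)], axis=1)

cams = camera_positions()
print(cams.shape)  # (12, 3): one XYZ position per view
```

Rendering the model from each of these positions would produce the 12-view image sequence that the network consumes.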
[0033] 102: Use the feature descriptors as input to the visual saliency branch: generate the view weights with the first LSTM module and a soft attention mechanism, and generate the feature descriptor of the visual saliency branch with the second LSTM module;
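A minimal sketch of the soft-attention weighting in step 102, assuming a standard dot-product scoring scheme (the patent does not specify the scoring function, and the context vector `c` here is a hypothetical learned parameter). Each per-view LSTM hidden state is scored against the context vector, and a softmax turns the scores into view weights that sum to one.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())  # subtract max for numerical stability
    return e / e.sum()

def view_attention_weights(hidden_states, context):
    """Soft attention: score each LSTM hidden state against a context
    vector, then normalize with softmax to get one weight per view."""
    scores = hidden_states @ context          # shape: (n_views,)
    return softmax(scores)

rng = np.random.default_rng(0)
H = rng.standard_normal((12, 64))   # stand-in LSTM hidden state per view
c = rng.standard_normal(64)         # hypothetical learned context vector
w = view_attention_weights(H, c)
print(w.shape)                      # (12,): one weight per view
```

The resulting weights are the per-view saliency scores that step 103 uses to guide fusion in the MVCNN branch.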
[0034] 103: Also feed the feature descriptors into the MVCNN branch, use the view weights to guide the fusion of visual information in the MVCNN model, and then obtain the...
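The weight-guided fusion in step 103 can be sketched as a weighted sum over per-view descriptors; this is an assumed interpretation (plain MVCNN uses element-wise max-pooling, and the exact fusion rule is truncated in the text above), shown here only to make the idea concrete.

```python
import numpy as np

def weighted_view_pooling(features, weights):
    """Fuse per-view CNN descriptors into one shape descriptor by a
    weight-guided sum, with attention weights replacing uniform pooling."""
    weights = np.asarray(weights).reshape(-1, 1)  # (n_views, 1) for broadcast
    return (weights * features).sum(axis=0)

F = np.ones((12, 4096))        # stand-in 4096-dim descriptor per view
w = np.full(12, 1.0 / 12.0)    # uniform weights as a degenerate example
d = weighted_view_pooling(F, w)
print(d.shape)  # (4096,)
```

With uniform weights this reduces to average pooling; salient views with larger attention weights would contribute more to the fused shape descriptor.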
Embodiment 2
[0055] The scheme in Embodiment 1 is further described below in connection with the following network structure and Figures 1 and 2; see the following description for details:
[0056] To extract the first feature descriptor, the present invention takes the Z-axis as the rotation axis, samples views of the 3D model at 30° intervals, and extracts the feature descriptor of each view with a mature deep convolutional neural network, as follows:
[0057] 1. Normalize each 3D model with the NPCA (3D Principal Component Analysis) method. Then a visualization tool built on OpenGL acts as a human observer, extracting one view every 30 degrees around the Z-axis of each 3D model. Twelve views are extracted to represent the visual and structural information of the 3D model. These views can thus be treated as an image sequence v1, v2, ..., v12, which is essential to the network structure of the present invention.
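The pose-normalization step can be sketched with a simplified PCA alignment (a stand-in for NPCA, which the patent names but does not detail; the function `npca_align` is hypothetical): the point cloud is centered and rotated so its principal axes line up with the coordinate axes, largest-variance axis first.

```python
import numpy as np

def npca_align(points):
    """Rotate a point cloud so its principal axes align with X/Y/Z,
    a simplified stand-in for the NPCA pose-normalization step."""
    centered = points - points.mean(axis=0)
    cov = centered.T @ centered / len(points)
    _, vecs = np.linalg.eigh(cov)        # eigenvectors, ascending eigenvalues
    return centered @ vecs[:, ::-1]      # largest-variance axis first

pts = np.random.default_rng(1).standard_normal((100, 3)) * [5, 2, 1]
aligned = npca_align(pts)
var = aligned.var(axis=0)
print(var[0] >= var[1] >= var[2])  # True: axes ordered by variance
```

After this alignment, the 12 views rendered around the Z-axis are consistent across models with different original poses.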
[0058] 2. Use the network structure of C...
Embodiment 3
[0071] The feasibility of the schemes in Embodiments 1 and 2 is verified below with a specific example; see the following description for details:
[0072] The databases in the embodiment of the present invention are ModelNet40 and ShapeNetCore55. ModelNet40 is a subset of ModelNet that contains 12,311 CAD models grouped into 40 categories. The models have been manually cleaned, but their poses have not been normalized. The ModelNet40 models used in the embodiment of the present invention are all in *.off format. ShapeNetCore55 is a subset of ShapeNet that contains 55 categories and about 51,300 3D models. Each category is subdivided into several subcategories, and the data are split into a 70% training set, a 10% validation set, and a 20% test set. The ShapeNetCore55 models used in the embodiment of the present invention are in *.obj format.
[0073] The following table shows the accuracy of different parts of the network on the ModelNet40 dataset for classification experime...