309 results about "Feature aggregation" patented technology

Lightweight semantic segmentation method for high-resolution remote sensing image

Active | CN112183360A | Solve the inefficiency of operation | Run fast | Scene recognition | Computation complexity | Encoder decoder
A lightweight semantic segmentation method for high-resolution remote sensing images comprises network construction, training and testing. Specifically, a deep semantic segmentation network with an encoder-decoder structure is constructed in the PyTorch deep learning framework; after the network is trained on a remote sensing image data sample set, a to-be-tested remote sensing image is fed to the network as input and its segmentation result is obtained. On one hand, the method reduces model parameters and computational complexity by decomposing depthwise separable convolution, which shortens the semantic segmentation time for high-resolution remote sensing images and improves efficiency; on the other hand, it improves segmentation precision through multi-scale feature aggregation, a spatial attention module and gating convolution, so that the proposed lightweight deep semantic segmentation network can segment high-resolution remote sensing images accurately and efficiently.
Owner:SHANGHAI JIAO TONG UNIV
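
To make the parameter saving from depthwise separable convolution concrete, the following is a minimal sketch that only counts weights. It assumes the plain depthwise-plus-pointwise factorization (not the further decomposed variant the abstract mentions), ignores biases, and uses illustrative channel sizes that do not come from the patent.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Illustrative sizes only (assumed, not taken from the patent).
c_in, c_out, k = 64, 128, 3
std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)
ratio = sep / std  # equals 1/c_out + 1/k**2
print(std, sep, round(ratio, 4))
```

For these sizes the separable layer needs roughly 12% of the weights of the standard layer, which is the kind of reduction the abstract relies on.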

Point cloud data classification method based on deep learning

Active | CN110197223A | Guaranteed affine transformation invariance | Excellent division effect | Character and pattern recognition | Point cloud | Data set
The invention discloses a point cloud data classification method based on deep learning. The method provides a multi-scale point cloud classification network and comprises the steps of: first, providing a multi-scale local area division algorithm based on the completeness, adaptivity, overlap and multi-scale requirements of local area division, and obtaining multi-scale local areas by taking the point cloud and features of different levels as input; and then constructing the multi-scale point cloud classification network comprising a single-scale feature extraction module, a low-level feature aggregation module, a multi-scale feature fusion module and the like. The network mimics the working principle of a convolutional neural network: as network scale and depth increase, the local receptive field grows and the features become more abstract. The method obtains classification accuracies of 94.71% and 91.73% on the standard public data sets ModelNet10 and ModelNet40 respectively, which is at a leading or equivalent level among similar work, and verifies the feasibility and effectiveness of the method.
Owner:BEIFANG UNIV OF NATITIES

Cross-mode pedestrian re-identification method and system based on a heterogeneous hierarchical attention mechanism

The invention provides a cross-modal pedestrian re-identification method and system based on a heterogeneous hierarchical attention mechanism. The method comprises the steps of: extracting pedestrian image features and text description features, and using them as the initial global features of a pedestrian image channel and a text description channel; establishing a heterogeneous hierarchical attention model, which enhances the pedestrian image features and text description features by using a bidirectional cross-modal fine-grained matching attention mechanism and a context-guided local feature aggregation attention mechanism; training the heterogeneous hierarchical attention model in a two-stage training mode, wherein preliminary training is carried out in the first stage by utilizing pedestrian category supervision information, and training in the second stage is carried out on that basis by utilizing cross-modal samples to match the pedestrian category supervision information; and carrying out pedestrian re-identification by utilizing the trained model. The method and system can improve the accuracy of pedestrian re-identification.
Owner:中科人工智能创新技术研究院(青岛)有限公司

Crowd counting model construction method, counting method and device

The invention relates to a crowd counting model construction method, counting method and device, and the method comprises the steps: carrying out the head labeling of a crowd image in a pre-stored data set, obtaining a crowd density map, and enabling the crowd image and the crowd density map corresponding to the crowd image to be combined, and obtaining a target data set; performing data amplification processing on a training set obtained by dividing the target data set according to a first preset proportion to obtain a target training set; and finally, training the multi-scale feature aggregation convolutional neural network model after parameter initialization processing by using the training crowd image and the training crowd density map in the target training set to obtain a crowd counting model. By the adoption of the technical scheme, when the model is trained and applied, the problem of scale change of figures in the image can be solved, the calculation amount can be reduced, the recognition efficiency can be improved, a perspective drawing does not need to be used in the training process, and the applicability of the model is improved.
Owner:CHANGSHA UNIVERSITY OF SCIENCE AND TECHNOLOGY

Frame level feature aggregation method for video target detection

The invention provides a frame-level feature aggregation method for video target detection, and relates to the technical field of computer vision. The method comprises the following steps: first, extracting deep features from a single-frame image through a feature network; extracting the inter-frame optical flow using the optical flow network FlowNet; aligning the frame-level features of adjacent frames to the current frame based on the optical flow to realize frame-level feature propagation; and finally, calculating scaled cosine similarity weights through a mapping network and a weight scaling network, and using these weights to aggregate the multi-frame features into the aggregated features. With this method the weight distribution is more reasonable, and feeding the aggregated features into the video target detection network gives better detection performance and robustness in complex scenes such as motion blur, low resolution, lens zooming and occlusion.
Owner:NORTHEASTERN UNIV
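
The weighting step can be illustrated with a small sketch. It assumes the adjacent-frame features have already been aligned to the current frame, treats each frame feature as a flat vector, and replaces the learned mapping and weight scaling networks with a fixed, hypothetical `scale` factor followed by a softmax; none of these choices are confirmed by the abstract.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def aggregate(current, aligned_frames, scale=1.0):
    """Weight each aligned frame by its scaled cosine similarity to the
    current frame (softmax-normalised) and return the weighted sum.

    `scale` is a stand-in for the learned weight scaling (assumption).
    """
    sims = [scale * cosine(f, current) for f in aligned_frames]
    top = max(sims)
    exps = [math.exp(s - top) for s in sims]  # numerically stable softmax
    total = sum(exps)
    weights = [e / total for e in exps]
    fused = [
        sum(w * f[i] for w, f in zip(weights, aligned_frames))
        for i in range(len(current))
    ]
    return weights, fused


weights, fused = aggregate(
    [1.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]], scale=4.0
)
print(weights, fused)
```

Frames that resemble the current frame receive larger weights, and a larger `scale` sharpens the distribution so that dissimilar (for example blurred or occluded) frames contribute less.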

License plate detection model based on improved YOLOv3 network and construction method

The invention discloses a license plate detection model based on an improved YOLOv3 network and a construction method. The improved YOLOv3 network takes a license plate image as input and extracts three feature maps of different scales; the three feature maps are up-sampled so that the depth features are scaled to the same proportion, then down-sampled, and decoded through a constructed convolution layer to generate feature-enhanced feature maps; feature aggregation is performed on the three feature-enhanced feature maps of different scales and the three feature maps of different scales extracted by the YOLOv3 feature extraction network to generate a feature pyramid, which gives the improved YOLOv3 license plate detection model; and the license plate detection model is trained to obtain the final model. The method greatly improves detection speed, and the pyramid multi-scale feature network it introduces enhances the features of the backbone network and generates a more effective multi-scale feature pyramid, so that features are better extracted from the input image.
Owner:NANJING UNIV OF POSTS & TELECOMM

Image semantic segmentation method and device, electronic equipment and readable storage medium

The invention discloses an image semantic segmentation method and device and electronic equipment. The method is implemented through a semantic segmentation model comprising a feature extraction module, a feature aggregation module and a feature fusion module, and comprises the following steps: extracting shallow features and deep features of a target image through the feature extraction module, and constructing a feature pyramid of the target image according to the deep features; wherein the feature pyramid comprises deep features of the corresponding image on different scales; aggregating the deep features of different scales in the feature pyramid of the target image through a feature aggregation module to obtain an aggregated feature map; and fusing the shallow features of the target image and the aggregation feature map through a feature fusion module to obtain a fusion feature map, and obtaining a corresponding semantic segmentation result according to the fusion feature map.
Owner:QINGDAO RES INST OF BEIHANG UNIV

RGBD saliency detection method based on feature aggregation

The invention provides an RGBD saliency detection method based on feature aggregation. The method comprises the following steps: first, an input image is preprocessed; then a saliency detection network is constructed, comprising a feature extraction network and a feature aggregation network. The feature extraction network is a pair of asymmetric double-flow backbone networks constructed based on ResNet50, divided into an RGB image feature extraction branch and a depth image feature extraction branch. The feature aggregation network comprises K-nearest-neighbor GNNs, a region enhancement module, a hierarchical fusion module and a block Non-local module. Finally, the saliency detection network is trained, and saliency detection is performed through the trained network. The method efficiently combines and reasons over 2D appearance and 3D geometric information, fully fuses the information from the two modalities (RGB image and depth image), and further improves the multi-scale expression ability of the model through the hierarchical fusion module, so that coarse-level and fine-level features are well fused together.
Owner:HANGZHOU DIANZI UNIV

Target identification method based on quality evaluation

Active | CN108765394A | Solve the problem of object recognition | Valid description | Image enhancement | Image analysis | Imaging quality | Goal recognition
The invention provides a target identification method based on quality evaluation. The method includes the steps of: constructing a target identification model which includes a quality evaluation network, a feature extraction network, and a feature aggregation network, wherein the target identification model is used for extracting the target feature from a video so as to characterize the overall structural information and local information of the target; training the target identification model, and adjusting the parameters of the quality evaluation network and the feature extraction network during training so that the target identification model outputs target features that meet the preset requirement; and performing target identification on the video through the trained target identification model. The method thereby solves the target identification problem caused by changeable appearance and uneven image quality in a video sequence, and adds inter-frame correlation information to quality evaluation to obtain more effective target information, so that the target is characterized more accurately and identification accuracy improves.
Owner:SHANGHAI JIAO TONG UNIV

Optical flow multilayer frame feature propagation and aggregation method for video target detection

The invention provides an optical flow multilayer frame feature propagation and aggregation method for video target detection, and relates to the technical field of computer vision. The method comprises the following steps: first, extracting multilayer features of adjacent frames through a feature network and extracting optical flow through an optical flow network; then propagating the multilayer frame-level features of the previous frame and the next frame to the current frame by using the optical flow, with the optical flow up-sampled or down-sampled for layers with different strides, to obtain multilayer propagation features; and then aggregating the propagation features layer by layer, finally generating multilayer aggregated frame-level features for video target detection. The output frame-level aggregated features combine the high resolution of the shallow network with the high-dimensional semantic features of the deep network, which improves detection performance, in particular on small targets.
Owner:NORTHEASTERN UNIV

Remote sensing image semantic segmentation method, system and equipment and storage medium

The invention discloses a remote sensing image semantic segmentation method, system, equipment and a storage medium, belongs to the field of image processing, and aims to solve the technical problems of low semantic segmentation precision and low segmentation efficiency. The method comprises the following steps: constructing, training and testing a network, wherein the network is specifically a deep semantic segmentation network of an encoder-decoder structure constructed by a Pytorch deep learning framework; performing network training based on the remote sensing image data sample set; and taking a to-be-measured remote sensing image as network input to obtain a segmentation result of the remote sensing image. On one hand, model parameters are reduced through a bottleneck type module, depth separable convolution, asymmetric convolution, convolution with holes and the like, the calculation complexity is reduced, and the time of remote sensing image semantic segmentation is shortened; on the other hand, the semantic segmentation precision is improved through multi-scale feature aggregation and a mixed attention module, so that the provided remote sensing image semantic segmentation network can accurately and efficiently realize the semantic segmentation of the remote sensing image.
Owner:HUANENG CLEAN ENERGY RES INST +2

Robot loopback detection method and device

The invention provides a robot loopback detection method and device. The method comprises the steps of: obtaining a current image collected by a robot and inputting it to a densely connected convolutional neural network (DenseNet) to obtain a global feature, wherein the DenseNet is composed of multiple layers of dense blocks and each layer of dense blocks is connected with the other layers of dense blocks in a feedforward manner; decoupling the global feature according to a feature mapping decoupling algorithm to obtain local features; encoding the local features according to a weighted local feature aggregation descriptor encoding algorithm to obtain an encoding result; and calculating a first locality-sensitive hash value corresponding to the encoding result, and determining a target image similar to the current image according to the first locality-sensitive hash value. The method can improve the robustness of robot loop detection against changes in viewing angle, illumination, season and the like, and also improves the ability to distinguish different scenes that contain similar textures or similar surface features.
Owner:TSINGHUA UNIV
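
The abstract does not say which locality-sensitive hash family is used. As one common possibility, the sketch below uses random-hyperplane hashing, where similar descriptors tend to share most bits and candidates can be compared by Hamming distance. The function names, the 16-bit code length and the seed are all hypothetical.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplanes in `dim` dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def lsh_code(vec, planes):
    """Pack the sign of each projection into one integer hash code."""
    bits = 0
    for plane in planes:
        dot = sum(a * b for a, b in zip(vec, plane))
        bits = (bits << 1) | (1 if dot >= 0 else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two hash codes."""
    return bin(a ^ b).count("1")


planes = make_hyperplanes(dim=4, n_bits=16)
descriptor = [0.5, -1.0, 2.0, 0.1]
brighter = [3.0 * x for x in descriptor]  # same direction, larger magnitude
print(hamming(lsh_code(descriptor, planes), lsh_code(brighter, planes)))
```

Because only the sign of each projection is kept, rescaling a descriptor leaves its code unchanged, which is one reason this family suits cosine-style similarity between encoded image descriptors.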

Training method and device of RGBT target tracking model

The invention discloses a training method and device of an RGBT target tracking model. The method comprises the steps of: 1) building a tracking model consisting of a dense feature aggregation module followed by a classification module, where the dense feature aggregation module comprises a first convolution layer sequence for extracting visible light image features and a second convolution layer sequence for extracting thermal infrared image features; a convolution layer in the first sequence and the convolution layer at the same depth in the second sequence form a pair of paired convolution layers; every pair except the first corresponds to one feature aggregation layer, and the convolution result of the first pair is input into the feature aggregation layer of the next pair; the classification module comprises a plurality of fully connected layers connected in series; and 2) training the tracking model with pre-labeled visible light image samples and pre-labeled thermal infrared image samples to obtain the target tracking model. The method can make the target identification result more accurate.
Owner:ANHUI UNIVERSITY

Pedestrian detection method and apparatus based on Haar-like intermediate layer filtering features

The present invention discloses a pedestrian detection method based on Haar-like intermediate layer filtering features, comprising the steps of: extracting object features of various training images in a training image set, training an Adaboost classifier based on decision trees by using the extracted object feature data to obtain a classification model; extracting object features of an image to be detected under a plurality of scales and inputting the object features to the classification model to obtain a pedestrian detection result, wherein a method for extracting the object features comprises the following steps: respectively extracting a plurality of different channel features of an original image to obtain multiple channel feature patterns of the original image; respectively performing downsampling for each channel feature pattern; respectively extracting corresponding Haar-like features of each channel feature pattern which has been subjected to the downsampling by using a group of preset Haar-like feature templates; and clustering all the Haar-like features of the original image into the object features of the original image. The invention also discloses a pedestrian detection apparatus based on Haar-like intermediate layer filtering features. The pedestrian detection method and the pedestrian detection apparatus can effectively improve pedestrian detection performance.
Owner:SOUTHEAST UNIV

Graph embedding method and device, and storage medium

The invention provides a graph embedding method and device, and a storage medium. The method comprises the steps of: reading graph structure data and node feature values in a target graph, and building a graph structure model; regarding each node in the graph structure model as a target node, and sampling the first-order neighbor nodes of each target node according to a non-uniform neighbor node sampling function to obtain a first-order neighborhood of each target node; constructing second-order neighborhoods of the target nodes according to the first-order neighborhoods of the target nodes, aggregating the second-order neighborhoods to the first-order neighborhoods corresponding to the target nodes, and inputting the aggregated features of the second-order neighborhoods into a fully connected neural network to obtain new features of the first-order neighborhoods of the target nodes; and aggregating the new features to the corresponding target nodes, and inputting the aggregated new features of the first-order neighborhood into the fully connected neural network to obtain output features of the target nodes. A neighborhood can be constructed flexibly and effectively for each node in the graph, and feature aggregation can be carried out rapidly, so that the graph embedding effect based on the graph neural network is improved.
Owner:GUILIN UNIV OF ELECTRONIC TECH
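
The two-stage aggregation order (second-order neighbors into first-order neighbors, then first-order neighbors into the target node) can be sketched as follows. This is a simplification under stated assumptions: it uses every neighbor instead of the non-uniform sampling function, uses a plain mean as the aggregator, and omits the fully connected layers entirely. The toy graph and feature values are hypothetical.

```python
def mean_vec(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def two_hop_aggregate(graph, features, target):
    """Aggregate second-order neighbours into each first-order neighbour,
    then aggregate the refreshed first-order features into the target."""
    refreshed = []
    for neighbour in graph[target]:
        second_order = [n for n in graph[neighbour] if n != target]
        pooled = mean_vec(
            [features[n] for n in second_order] + [features[neighbour]]
        )
        refreshed.append(pooled)
    return mean_vec(refreshed + [features[target]])


# Hypothetical toy graph given as adjacency lists, with 1-D node features.
graph = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": ["b"]}
features = {"a": [1.0], "b": [2.0], "c": [3.0], "d": [4.0]}
print(two_hop_aggregate(graph, features, "a"))
```

In a full model, each `mean_vec` result would pass through a learned layer before the next aggregation step, and the neighbor lists would come from the sampling function rather than the full adjacency.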

Automatic problem solving method for application problem based on graph neural network

The invention discloses an automatic problem solving method for application problems based on a graph neural network. The method comprises the steps of: first, encoding the input application problem text with a recurrent neural network, constructing a numerical value unit graph and a numerical value comparison graph, and using the output of the recurrent neural network (word-level representations) as node features; inputting the node features and the two constructed graphs together into a graph neural network-based encoder to learn graph representation features of the problem, so that the final graph features contain the textual relationships and magnitude information of the numerical values; using a pooling term to aggregate the graph features of different groups into one, which gives the output of the graph converter; and finally, using the output graph features as input to a tree structure-based decoder to generate the final solution expression tree. The method improves task performance by enriching the numerical representations in the problem, and achieves a better problem solving effect.
Owner:UNIV OF ELECTRONICS SCI & TECH OF CHINA +1

Single image rain removing method based on multi-scale aggregation features

The invention belongs to the field of computer vision, and relates to a single image rain removing method based on multi-scale aggregation features. The method is based on a multi-scale feature aggregation densely connected convolutional network framework composed of encoding-decoding networks, in which each encoding network corresponds to one decoding network. The encoding network performs dimensionality reduction and down-sampling through a max pooling layer and records the positions of the max pooling indices during pooling; these pooling indices guide the up-sampling recovery process of the corresponding decoding network. The encoding network and the decoding network have the same convolution layers, both built from feature aggregation densely connected convolution modules, and differ only in that one uses max pooling and the other the corresponding up-sampling. The method can effectively remove rain streaks of different densities while preserving image details well.
Owner:DALIAN UNIV OF TECH
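
The index-guided up-sampling can be shown on a one-dimensional signal. This is a minimal sketch, assuming non-overlapping windows and zero filling at positions that did not hold a maximum; the patent operates on 2-D feature maps, and the function names here are hypothetical.

```python
def max_pool_1d(x, size=2):
    """Non-overlapping max pooling that also records where each max was."""
    pooled, indices = [], []
    for start in range(0, len(x) - size + 1, size):
        window = x[start:start + size]
        offset = max(range(size), key=lambda i: window[i])
        pooled.append(window[offset])
        indices.append(start + offset)
    return pooled, indices


def max_unpool_1d(pooled, indices, length):
    """Place each pooled value back at its recorded position; zeros elsewhere."""
    out = [0.0] * length
    for value, idx in zip(pooled, indices):
        out[idx] = value
    return out


signal = [1.0, 3.0, 2.0, 0.5, 4.0, 4.5]
pooled, indices = max_pool_1d(signal)
restored = max_unpool_1d(pooled, indices, len(signal))
print(pooled, indices, restored)
```

Reusing the encoder's indices lets the decoder put each strong response back at its original location instead of interpolating, which helps keep edges and fine detail.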

Object-level depth feature aggregation method for image retrieval

The invention relates to the field of digital media and provides an object-level depth feature aggregation method for image retrieval. First, an unsupervised method is used to generate candidate regions that may contain objects; then the corresponding convolutional neural network features are extracted; finally, the region features are aggregated to obtain an image feature representation that is highly robust to image transformation, for use in image retrieval applications. The invention addresses the lack of invariance to geometric transformation and spatial layout in existing models by adopting an object-based approach. The image features generated by the method are highly robust to geometric transformation and changes in spatial arrangement, which increases image retrieval accuracy; the resulting image representation is also compact and concise, which reduces the complexity of similarity calculation between images and increases retrieval efficiency.
Owner:DALIAN UNIV OF TECH

A music automatic labeling method based on label depth analysis

Inactive | CN109918535A | Improve performance | Overcoming problems such as poor learning effect | Metadata audio data retrieval | Neural architectures | Learning based | Data set
The invention discloses a music automatic labeling method based on label depth analysis. The method comprises the following steps: S1, collecting music data and cleaning the data in combination with a music label system; S2, sampling the music data, converting it into a Mel-frequency spectrogram, and slicing the Mel-frequency spectrogram; S3, constructing an audio multi-level feature extraction network based on a one-dimensional convolutional network, and pre-training its parameters through supervised learning; S4, performing music label vector representation learning based on a two-dimensional convolutional network to obtain music label features; S5, aggregating the audio multi-level features and the music label features; and S6, performing the final music label prediction based on the aggregated features. The method overcomes the difficulty that traditional music labeling cannot be applied to large-scale music data sets, labels music automatically according to its audio content, reduces the workload of manually maintaining a music label library, and has very good usability.
Owner:SOUTH CHINA UNIV OF TECH

Visual rich document information extraction method for actual OCR scene

Active | CN112801010A | Solve the problem of extraction error accumulation | Decoupling prediction results | Semantic analysis | Special data processing applications | Pattern recognition | Named entity classification
The invention discloses a visual rich document information extraction method for an actual OCR scene. The method comprises the following steps: collecting a visual rich text image in the actual scene; extracting text word embedding features and position embedding features of character levels and word levels by utilizing a pre-training word embedding model; training a named entity classification module; constructing a global document graph structure based on graph convolution GAT, and introducing a self-attention mechanism; training a named entity boundary positioning module; constructing a multi-feature aggregation structure; and training an error semantic correction module, adopting a decoding structure of a GRU, extracting a coding hidden state of a corresponding dimension feature according to an optimal path of a CRF, and guiding output of a decoder every time by taking category information of a named entity as prior guidance information to obtain entity naming information in a standard format. According to the visual rich document information extraction method, the precision of the visual rich document information extraction method in actual OCR detection and recognition application is effectively improved, and the visual rich document information extraction method is of great significance to structured storage of visual rich document information.
Owner:SOUTH CHINA UNIV OF TECH

Scene text detection method and system based on sequential deformation

A method and a system for detecting a scene text may include extracting a first feature map for a scene image input based on a convolutional neural network, and delivering the first feature map to a sequential deformation module; obtaining sampled feature maps corresponding to sampling positions by performing iterative sampling for the first feature map, obtaining a second feature map by performing a concatenation operation in deep learning according to a channel dimension for the first feature map and the sampled feature maps; obtaining a third feature map by performing a feature aggregation operation for the second feature map in the channel dimension, and delivering the third feature map to the object detection baseline network; and performing text area candidate box extraction for the third feature map and obtaining a text area prediction result as a scene text detection result through regression fitting.
Owner:TSINGHUA UNIV +2

A social network rumor identification method based on feature aggregation

Active | CN109685153A | Solve the problem that it is difficult to deal with heterogeneous information | Solve problems that are difficult to feed into machine learning models | Data processing applications | Character and pattern recognition | Study methods | Data quality
The invention discloses a social network rumor identification method based on feature aggregation. The method comprises the steps of: designing time-sequence propagation mode features and time-sequence text features that a deep neural network can accept, constructing a rumor detection model using a feature aggregation technique, and carrying out final detection and early detection of rumors. The method solves the problem that the propagation mode features of social network event propagation are difficult to use as machine learning model input; the propagation mode features do not depend on feature engineering or domain knowledge and comprehensively reflect the influence of various factors in the actual propagation process, so the method can be applied effectively to different rumor identification scenes. It avoids the loss of feature data quality caused by large differences in the number of messages contained in different samples, and solves the problem that a single model in traditional machine learning methods struggles to handle heterogeneous information; compared with existing rumor identification methods, the accuracy is clearly improved.
Owner:WUHAN UNIV

Spatial-temporal feature aggregation method and system combined with attention mechanism and terminal

Pending | CN111967310A | Improve pedestrian recognition rate | Biometric pattern recognition | Neural architectures | Time domain | Algorithm
The invention provides a spatial-temporal feature aggregation method and system combined with an attention mechanism, and a terminal. The method comprises the steps of: extracting the spatial domain features of a pedestrian in a deep network through a convolutional neural network, and obtaining the time domain features of the pedestrian by integrating the extracted spatial domain features through a recurrent neural network; generating corresponding quality-sensitive attention scores and frame-sensitive attention scores with a feature extraction network so as to dynamically fuse the spatial domain and time domain features; carrying out linear superposition fusion to obtain quality-sensitive spatial domain features and frame-sensitive time domain features, which give the pedestrian spatial-temporal feature expression; and carrying out network training on the upper, middle and lower parts of a pedestrian to obtain corresponding local features with complementary properties, which are spliced to obtain feature expressions with higher discrimination. The method and system have good robustness and adapt better to conditions such as occlusion and lighting changes; by combining the spatial domain and time domain features of the pedestrian, the detail features of the pedestrian are mined, so that the method and system can deliver better performance and efficiency in subsequent pedestrian recognition.
Owner:SHANGHAI JIAO TONG UNIV

Method and device for processing event roles in text, equipment and storage medium

The invention provides a method and device for processing event roles in a text, electronic equipment and a storage medium. The method comprises the steps of encoding words in a text through an encoder in a role processing model to obtain encoding information of the words in the text; performing feature aggregation on the encoding information of the words in the text through an encoder to obtain aggregation features of the text; associating a plurality of upper-layer concepts of at least one event role with the encoding information of the words in the text through an attention model in the role processing model to obtain a semantic vector of the text facing the event role; and classifying at least one semantic vector, facing the event role, of the text and the aggregation feature of the text through a classifier in the role processing model to obtain the event role corresponding to the text. Through the method and the device, the event roles in the text can be automatically and accurately extracted according to the upper-layer concept of the event roles.
Owner:TSINGHUA UNIV +1
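
One way to picture the attention step in this abstract is sketched below. It assumes dot-product scoring between each upper-layer concept embedding and each word encoding, a softmax over the words, and averaging across concepts; the abstract specifies none of these, and role_oriented_vector is a hypothetical name.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def role_oriented_vector(word_encodings, concept_embeddings):
    """Attend over words once per concept, then average the results."""
    dim = len(word_encodings[0])
    pooled = [0.0] * dim
    for concept in concept_embeddings:
        # Assumed scoring rule: similarity of the concept to each word.
        weights = softmax([dot(concept, w) for w in word_encodings])
        for weight, encoding in zip(weights, word_encodings):
            for d in range(dim):
                pooled[d] += weight * encoding[d]
    return [v / len(concept_embeddings) for v in pooled]
```

The resulting vector would then be passed, together with the aggregation features of the text, to the classifier mentioned in the abstract.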

Panoramic segmentation method with bidirectional connection and shielding processing

The invention discloses a panoramic segmentation method with bidirectional connection and shielding processing. The method comprises the following steps: 1) obtaining a data set for training panoramic segmentation, and defining an algorithm target; 2) performing feature learning on intra-group images by using a full convolutional network; 3) extracting semantic features from a feature map through semantic feature extraction branches; 4) extracting instance features from the feature map through instance feature extraction branches; 5) establishing connection from instance segmentation to semantic segmentation, and aggregating the semantic features and the instance features to perform semantic segmentation; 6) establishing connection from semantic segmentation to instance segmentation, and aggregating the instance features and the semantic features to perform instance segmentation; and 7) using an occlusion processing algorithm, fusing the results of semantic segmentation and instance segmentation, and outputting the result of panoramic segmentation. According to the method, the complementarity between the semantic segmentation and the instance segmentation is fully utilized, meanwhile, the occlusion processing algorithm provided by the apparent information of the bottom-layer features is applied, and the panoramic segmentation of the image is efficiently completed.
Owner:ZHEJIANG UNIV
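
Step 7 can be pictured with a small sketch of result fusion. Note that this is not the occlusion processing algorithm of the patent, which relies on appearance information from bottom-layer features; as a stand-in, overlapping instances are resolved by confidence score, a common simple heuristic. The dictionary keys mask, score and class_id are hypothetical.

```python
def fuse_panoptic(semantic, instances):
    """Merge a semantic label map with instance masks.

    semantic:  2D list of class ids.
    instances: list of dicts with "mask" (2D list of 0/1), "score", "class_id".
    Returns a 2D list of (class_id, instance_id); instance_id 0 means
    the pixel was not claimed by any instance.
    """
    height, width = len(semantic), len(semantic[0])
    panoptic = [[(semantic[y][x], 0) for x in range(width)]
                for y in range(height)]
    claimed = [[False] * width for _ in range(height)]
    # Stand-in occlusion rule: higher-scoring instances win overlaps.
    ordered = sorted(instances, key=lambda inst: inst["score"], reverse=True)
    for instance_id, inst in enumerate(ordered, start=1):
        for y in range(height):
            for x in range(width):
                if inst["mask"][y][x] and not claimed[y][x]:
                    panoptic[y][x] = (inst["class_id"], instance_id)
                    claimed[y][x] = True
    return panoptic
```

Pixels not covered by any instance keep their semantic class, so the output assigns every pixel exactly one label.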

Vehicle tire pressure anomaly recognition method and device and data analysis equipment

The invention provides a vehicle tire pressure anomaly recognition method and device and data analysis equipment, and relates to the technical field of vehicle safety performance detection. The method comprises the following steps: firstly, obtaining tire pressure values of a plurality of vehicles in a preset time period every day; then, extracting a tire pressure sample value from the tire pressure value, and extracting tire pressure feature data meeting feature aggregation based on the tire pressure sample value; and finally, judging whether the tire pressure of each vehicle is abnormal or not according to whether the tire pressure characteristic data of each vehicle is located in the data range of high-density aggregation of the tire pressure characteristic data or not. According to the tire pressure value of the vehicle every day, the vehicle with abnormal tire pressure can be identified when the tire pressure of the vehicle is slowly reduced, and a vehicle owner can conveniently know the tire pressure condition of the vehicle in time.
Owner:亚美智联数据科技有限公司
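
The three steps of this abstract can be sketched as follows. The abstract does not say how the sample value or the high-density range is computed, so this sketch assumes the per-vehicle median as the sample value and a fleet-wide band of median plus or minus k times the median absolute deviation as the dense range; dense_range, flag_abnormal and the parameter k are hypothetical.

```python
from statistics import median

def dense_range(values, k=3.0):
    """Assumed dense band: median +/- k * median absolute deviation."""
    center = median(values)
    mad = median(abs(v - center) for v in values)
    return center - k * mad, center + k * mad

def flag_abnormal(readings_by_vehicle, k=3.0):
    """Map each vehicle id to True if its sample value leaves the band."""
    # Assumed sample value: the median of the day's readings per vehicle.
    samples = {vid: median(r) for vid, r in readings_by_vehicle.items()}
    low, high = dense_range(list(samples.values()), k)
    return {vid: not (low <= s <= high) for vid, s in samples.items()}
```

Because the band is derived from the whole fleet, a vehicle whose pressure drifts slowly away from the others is flagged once it leaves the range where most vehicles cluster.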

Target detection method based on dense connection structure

Pending; CN112541532A; Alleviating the Gradient Descent Problem; Improve detection efficiency; Image enhancement; Image analysis; Pattern recognition; Data set
The invention provides a target detection method based on a dense connection structure, and the method comprises the steps: defining a to-be-detected target type, labeling a target object in collected image data, obtaining the actual frame of the target object in the image data, and marking the target type of the target object, and obtaining a data set; constructing a target detection network model composed of a basic network module, a feature fusion module, a dense connection module and a feature aggregation module, and determining a loss function; training the constructed target detection network model by using the data set until the loss function converges, finishing the training process, and storing the corresponding weight parameter at the moment to obtain a trained target detection network model; and inputting the image of the to-be-detected target category into the trained target detection model to realize target detection. According to the method, a dense connection mode and a feature fusion and aggregation mode are combined, so that the feature extraction capability is improved, the gradient descent problem is relieved, and the detection efficiency and accuracy are effectively improved.
Owner:CHANGSHA UNIVERSITY OF SCIENCE AND TECHNOLOGY
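
The dense connection idea can be illustrated with a toy sketch in which every layer receives the concatenation of the block input and all earlier layer outputs. This follows the general dense-connectivity pattern and is an assumption about the module in this patent; dense_block is a hypothetical name and the layers are plain callables standing in for convolutional layers.

```python
def dense_block(x, layers):
    """Run layers with dense connectivity over flat feature lists.

    Each layer sees the concatenation of the block input and the outputs
    of all preceding layers; the block returns the concatenation of
    everything produced.
    """
    features = [list(x)]
    for layer in layers:
        concatenated = [v for feat in features for v in feat]
        features.append(list(layer(concatenated)))
    return [v for feat in features for v in feat]
```

Since early features are passed directly to every later layer, gradients have short paths back to the first layers, which is the usual argument for why dense connections ease training.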

Video annotation method based on multiple modes

The invention discloses a video annotation method based on multiple modes, and belongs to the technical field of computer vision and video annotation. The method comprises the following steps: extracting a key frame of a video through a clustering method; extracting features of the key frames, and aggregating continuous key frame features through a learning pool to generate visual features of the video; extracting audios in the video, and dividing the audios into a plurality of independent frames; extracting audio frame features, and aggregating the continuous audio frame features through a learning pool to generate audio features of the video; fusing the visual features and the audio features and inputting into a prediction module; and performing video tagging. Compared with the prior art, the method has the advantages that visual features and audio features of the video are considered at the same time, and an attention mechanism is added during frame feature aggregation, so that the extracted video features are more representative, and the video annotation accuracy is greatly improved.
Owner:HUAZHONG UNIV OF SCI & TECH
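
The key-frame step can be sketched as below. The abstract names only "a clustering method", so this sketch assumes a simple sequential clustering: consecutive frames join the current cluster while they stay within a distance threshold of its centroid, and the frame nearest each centroid becomes the key frame. The name key_frames and the threshold parameter are hypothetical.

```python
import math

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def key_frames(features, threshold):
    """Return the index of one key frame per cluster of similar frames."""
    clusters = []  # each cluster is a list of consecutive frame indices
    for idx, feat in enumerate(features):
        if clusters:
            current = [features[i] for i in clusters[-1]]
            if distance(feat, centroid(current)) <= threshold:
                clusters[-1].append(idx)
                continue
        clusters.append([idx])
    keys = []
    for members in clusters:
        center = centroid([features[i] for i in members])
        keys.append(min(members, key=lambda i: distance(features[i], center)))
    return keys
```

The features of the selected frames would then be aggregated, for example with attention weights, and fused with the audio features before prediction.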

Remote sensing image semantic segmentation method and device

The invention relates to a remote sensing image semantic segmentation method and device. The method comprises the steps of: obtaining a to-be-segmented remote sensing image; inputting the to-be-segmented remote sensing image into a pre-established self-attention multi-scale feature aggregation network, and obtaining the initial prediction result of the to-be-segmented remote sensing image output by that network; and up-sampling the initial prediction result to the image size of the to-be-segmented remote sensing image to obtain its final prediction result. The method provided by the invention effectively enhances the association between modal features and spatial information, improves the perception of context information for multi-scale targets, and yields a finer semantic annotation result.
Owner:AEROSPACE INFORMATION RES INST CAS
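
The final up-sampling step can be sketched as follows. The abstract does not name the interpolation method; this sketch assumes nearest-neighbour up-sampling of a coarse label map, whereas a real network might instead interpolate class scores bilinearly before taking the argmax. The name upsample_nearest is hypothetical.

```python
def upsample_nearest(label_map, out_height, out_width):
    """Resize a 2D label map to (out_height, out_width) by nearest neighbour."""
    in_height, in_width = len(label_map), len(label_map[0])
    return [
        [label_map[y * in_height // out_height][x * in_width // out_width]
         for x in range(out_width)]
        for y in range(out_height)
    ]
```

For example, a 2x2 prediction up-sampled to 4x4 repeats each label over a 2x2 block of output pixels.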