
612 results about "Attention model" patented technology

In this model, attention is focused on information the individual deems important, while information judged less important is processed less thoroughly by the brain. In this attenuation model, information passes through a filter and is processed for its physical characteristics and for the recognition of words.
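
The following is a minimal sketch, written in Python with PyTorch (a choice made here, since the page itself contains no code), of the soft weighting used by neural attention models: items judged relevant to a query receive large weights, while the rest are attenuated rather than discarded outright, loosely mirroring the attenuation idea described above. The function name and tensor shapes are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def soft_attention(query, keys, values):
    # query: (d,); keys, values: (n, d)
    scores = keys @ query / keys.shape[-1] ** 0.5   # similarity of each item to the query
    weights = F.softmax(scores, dim=0)              # soft weighting, not an all-or-none filter
    return weights @ values, weights                # weighted summary plus the weights

keys = values = torch.randn(5, 16)
query = keys[2] + 0.1 * torch.randn(16)             # query resembles item 2
summary, w = soft_attention(query, keys, values)
print(w)                                            # item 2 dominates; the rest are attenuated
```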

Space-time attention based video classification method

Status: Active | Patent: CN107330362A | Benefits: improves classification performance; accurate time-domain saliency information | Topics: character and pattern recognition, attention model, time domain
The invention relates to a space-time attention based video classification method, which comprises the steps of extracting frames and optical flows from training videos and videos to be predicted, and stacking a plurality of optical flows into a multi-channel image; building a space-time attention model, wherein the space-time attention model comprises a space-domain attention network, a time-domain attention network and a connection network; training the three components of the space-time attention model in a joint manner so that the effects of the space-domain attention and the time-domain attention are improved simultaneously, thereby obtaining a space-time attention model that accurately models space-domain saliency and time-domain saliency and is applicable to video classification; and extracting the space-domain saliency and the time-domain saliency for the frames and optical flows of the video to be predicted by using the learned space-time attention model, performing prediction, and integrating the prediction scores of the frames and the optical flows to obtain the final semantic category of the video to be predicted. According to the space-time attention based video classification method, modeling can be performed on the space-domain attention and the time-domain attention simultaneously, and their cooperation can be fully exploited through joint training, thereby learning more accurate space-domain saliency and time-domain saliency and improving the accuracy of video classification.
Owner: PEKING UNIV
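
As a rough illustration of the two-branch structure described in the abstract above, the following Python/PyTorch sketch applies a spatial attention over locations within each frame and a temporal attention over frames before classification. The backbone features, layer sizes and fusion choices are assumptions made for illustration, not the patented implementation.

```python
import torch
import torch.nn as nn

class SpaceTimeAttention(nn.Module):
    def __init__(self, feat_dim=256, num_classes=101):
        super().__init__()
        self.spatial_att = nn.Conv2d(feat_dim, 1, kernel_size=1)  # saliency per spatial location
        self.temporal_att = nn.Linear(feat_dim, 1)                 # saliency per frame
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, frame_feats):
        # frame_feats: (T, C, H, W) convolutional features of T frames (RGB or stacked flow)
        T, C, H, W = frame_feats.shape
        a_sp = torch.softmax(self.spatial_att(frame_feats).view(T, -1), dim=1)   # (T, H*W)
        frame_vec = (frame_feats.view(T, C, -1) * a_sp.unsqueeze(1)).sum(-1)     # (T, C)
        a_t = torch.softmax(self.temporal_att(frame_vec), dim=0)                 # (T, 1)
        video_vec = (a_t * frame_vec).sum(0)                                     # (C,)
        return self.classifier(video_vec)                                        # class scores

# Example: 8 frames of 7x7x256 backbone features -> class logits
logits = SpaceTimeAttention()(torch.randn(8, 256, 7, 7))
```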

Image subtitle generation method and system fusing visual attention and semantic attention

The invention discloses an image subtitle generation method and system fusing visual attention and semantic attention. The method comprises the steps of extracting an image feature from each image to be subjected to subtitle generation through a convolutional neural network to obtain an image feature set; building an LSTM model, and feeding the previously labeled text description corresponding to each image to be subjected to subtitle generation into the LSTM model to obtain time sequence information; in combination with the image feature set and the time sequence information, generating a visual attention model; in combination with the image feature set, the time sequence information and the word of the previous time step, generating a semantic attention model; according to the visual attention model and the semantic attention model, generating an automatic balance policy model; according to the image feature set and the text corresponding to the image to be subjected to subtitle generation, building a gLSTM model; according to the gLSTM model and the automatic balance policy model, generating the words corresponding to the image to be subjected to subtitle generation by utilizing an MLP (multilayer perceptron) model; and combining all the obtained words in sequence to generate a subtitle.
Owner: CHINA UNIV OF PETROLEUM (EAST CHINA)
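
The sketch below illustrates, under assumed module names and dimensions, how a visual attention context and a semantic attention context can be combined with a learned balance gate before a word decoder, in the spirit of the "automatic balance policy" mentioned above. It is a simplified stand-in, not the patented gLSTM pipeline.

```python
import torch
import torch.nn as nn

class BalancedFusion(nn.Module):
    def __init__(self, dim=512, vocab=10000):
        super().__init__()
        self.att_v = nn.Linear(dim * 2, 1)   # scores image region features
        self.att_s = nn.Linear(dim * 2, 1)   # scores semantic (word) embeddings
        self.gate = nn.Linear(dim, 1)        # learned balance between the two contexts
        self.decoder = nn.LSTMCell(dim, dim)
        self.out = nn.Linear(dim, vocab)

    def attend(self, scorer, items, h):
        # items: (N, dim); h: (dim,) current decoder hidden state
        pairs = torch.cat([items, h.expand(items.size(0), -1)], dim=-1)
        w = torch.softmax(scorer(pairs).squeeze(-1), dim=0)
        return w @ items

    def step(self, regions, word_embs, h, c):
        ctx_v = self.attend(self.att_v, regions, h)     # visual attention context
        ctx_s = self.attend(self.att_s, word_embs, h)   # semantic attention context
        beta = torch.sigmoid(self.gate(h))              # balance weight in [0, 1]
        ctx = beta * ctx_v + (1 - beta) * ctx_s
        h, c = self.decoder(ctx.unsqueeze(0), (h.unsqueeze(0), c.unsqueeze(0)))
        h, c = h.squeeze(0), c.squeeze(0)
        return self.out(h), h, c                        # word scores for this time step

# Example step: 49 image regions, 5 candidate semantic words
m = BalancedFusion()
h = c = torch.zeros(512)
word_scores, h, c = m.step(torch.randn(49, 512), torch.randn(5, 512), h, c)
```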

Fused attention model-based Chinese text classification method

The invention discloses a fused attention model-based Chinese text classification method. The method comprises the following steps: segmenting a text into a corresponding word set and a corresponding character set through word segmentation preprocessing and character segmentation preprocessing, and training a word vector and a character vector corresponding to the text by a feature embedding method according to the obtained word set and character set; carrying out semantic encoding on the word vector and the character vector respectively by taking a bidirectional gated recurrent unit neural network as the encoder, and obtaining a word attention vector and a character attention vector of the text by means of a word vector attention mechanism and a character vector attention mechanism; obtaining a fused attention vector; and predicting the category of the text through a softmax classifier. The method addresses the problems that existing Chinese text classification methods neglect the character feature information of texts, so that redundant features appear in the classification process, the extracted text features are limited, all the semantic information of the texts is difficult to cover, and features that contribute clearly to the classification are not focused on.
Owner: SUZHOU INSTITUTE, INSTITUTE OF ELECTRONICS, CHINESE ACADEMY OF SCIENCES
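
A hedged sketch of the fused word/character attention design above: two bidirectional GRU encoders, one attention vector per granularity, and the concatenated result fed to a softmax classifier. The dimensions, the use of pre-computed embeddings as inputs, and fusion by concatenation are illustrative assumptions.

```python
import torch
import torch.nn as nn

class FusedAttentionClassifier(nn.Module):
    def __init__(self, emb=128, hid=64, num_classes=10):
        super().__init__()
        self.word_enc = nn.GRU(emb, hid, bidirectional=True, batch_first=True)
        self.char_enc = nn.GRU(emb, hid, bidirectional=True, batch_first=True)
        self.word_att = nn.Linear(2 * hid, 1)
        self.char_att = nn.Linear(2 * hid, 1)
        self.fc = nn.Linear(4 * hid, num_classes)

    def attend(self, states, scorer):
        # states: (B, L, 2*hid) encoder outputs -> (B, 2*hid) attention vector
        w = torch.softmax(scorer(states).squeeze(-1), dim=1)   # (B, L)
        return torch.bmm(w.unsqueeze(1), states).squeeze(1)

    def forward(self, word_vecs, char_vecs):
        # word_vecs: (B, Lw, emb), char_vecs: (B, Lc, emb) pre-trained embeddings
        hw, _ = self.word_enc(word_vecs)
        hc, _ = self.char_enc(char_vecs)
        fused = torch.cat([self.attend(hw, self.word_att),
                           self.attend(hc, self.char_att)], dim=-1)
        return torch.log_softmax(self.fc(fused), dim=-1)

# Example: batch of 2 texts, 20 words and 40 characters each
model = FusedAttentionClassifier()
log_probs = model(torch.randn(2, 20, 128), torch.randn(2, 40, 128))
```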

Character identifying method and character identifying system

An embodiment of the invention provides a character identifying method and system. The method includes collecting an original image of a natural scene; performing preprocessing on the original image; performing OCR layout analysis on the preprocessed original image and obtaining a plurality of pixel matrixes; performing feature extraction on the pixel matrixes by adopting a CNN (Convolutional Neural Network) and obtaining a plurality of feature maps; and performing character identification on the feature maps by adopting an LSTM (Long Short-Term Memory) network provided with an Attention Model and obtaining a character sequence, wherein the forget gate of the LSTM provided with the Attention Model is replaced with the Attention Model. According to the invention, by utilizing the LSTM algorithm provided with the Attention Model, the feature sequence extracted by the CNN algorithm is identified as the corresponding character sequence, so that the required text information is obtained and the number of operation parameters is reduced. At the same time, by controlling how different context contents influence the current characters, information in long-term memory can be effectively transmitted to the current characters, so that character identification accuracy is improved.
Owner: BEIJING SINOVOICE TECH CO LTD
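
The abstract above replaces the LSTM forget gate with an attention model. The sketch below shows one way such a cell could be structured: the remember strength applied to the cell state is derived from attention scores over the CNN feature sequence instead of the standard learned forget gate. The exact gating and attention form are assumptions, shown only to make the structural change concrete.

```python
import torch
import torch.nn as nn

class AttentionForgetLSTMCell(nn.Module):
    def __init__(self, inp, hid):
        super().__init__()
        self.gates = nn.Linear(inp + hid, 3 * hid)  # input gate, output gate, candidate
        self.att = nn.Linear(inp + hid, 1)          # scores each CNN feature column

    def forward(self, x_t, feats, h, c):
        # x_t: (inp,) current feature column; feats: (L, inp) full CNN feature
        # sequence; h, c: (hid,) previous hidden and cell states
        z = torch.cat([x_t, h])
        i, o, g = self.gates(z).chunk(3)
        i, o, g = torch.sigmoid(i), torch.sigmoid(o), torch.tanh(g)
        # attention over the feature sequence takes the place of the forget gate
        scores = self.att(torch.cat([feats, h.expand(feats.size(0), -1)], dim=-1))
        f = torch.sigmoid(scores.mean())            # scalar "remember" strength
        c = f * c + i * g
        h = o * torch.tanh(c)
        return h, c

# Example step over a sequence of 20 CNN feature columns of width 64
cell = AttentionForgetLSTMCell(64, 128)
h, c = cell(torch.randn(64), torch.randn(20, 64), torch.zeros(128), torch.zeros(128))
```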

Speech recognition model establishing method based on bottleneck characteristics and multi-scale and multi-headed attention mechanism

The invention provides a speech recognition model establishing method based on bottleneck characteristics and a multi-scale, multi-headed attention mechanism, and belongs to the field of model establishing methods. A traditional attention model has the problems of poor recognition performance and a single attention scale. According to the method, bottleneck characteristics extracted through a deep belief network serve as the front end, which improves the robustness of the model; a multi-scale, multi-headed attention model constituted by convolution kernels of different scales is adopted as the back end, modeling is conducted on speech elements at the phoneme, syllable and word levels, and the recurrent neural network hidden-layer state sequences and output sequences are calculated one by one; the elements at the positions of the output sequences are calculated through the decoding networks corresponding to the attention networks of all heads, and finally all the output sequences are integrated into a new output sequence. The recognition effect of a speech recognition system can thereby be improved.
Owner: HARBIN INST OF TECH
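
To make the multi-scale, multi-headed back end concrete, the sketch below gives each attention head a 1-D convolution with a different kernel size over the encoder states, so that heads attend over roughly phoneme-, syllable- and word-scale context, and then merges the per-head contexts. Kernel sizes and dimensions are illustrative assumptions, not the patented configuration.

```python
import torch
import torch.nn as nn

class MultiScaleAttention(nn.Module):
    def __init__(self, dim=256, scales=(3, 5, 9)):
        super().__init__()
        # one convolution per head, each with a different (odd) kernel size
        self.convs = nn.ModuleList([nn.Conv1d(dim, dim, k, padding=k // 2) for k in scales])
        self.score = nn.ModuleList([nn.Linear(2 * dim, 1) for _ in scales])
        self.merge = nn.Linear(len(scales) * dim, dim)

    def forward(self, enc, dec_state):
        # enc: (T, dim) encoder states over bottleneck features; dec_state: (dim,)
        contexts = []
        for conv, score in zip(self.convs, self.score):
            scaled = conv(enc.t().unsqueeze(0)).squeeze(0).t()           # (T, dim) at this scale
            pairs = torch.cat([scaled, dec_state.expand_as(scaled)], -1)
            w = torch.softmax(score(pairs).squeeze(-1), dim=0)           # attention weights (T,)
            contexts.append(w @ scaled)                                  # per-head context (dim,)
        return self.merge(torch.cat(contexts))                           # fused context vector

# Example: 120 encoder frames, one decoder state
ctx = MultiScaleAttention()(torch.randn(120, 256), torch.randn(256))
```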

CNN and selective attention mechanism based SAR image target detection method

Status: Inactive | Patent: CN107247930A | Benefits: improves accuracy; overcomes pixel-level processing | Topics: scene recognition, neural architectures, attention model, data set
The invention discloses a CNN and selective attention mechanism based SAR image target detection method. An SAR image is obtained; a training data set is expanded; a classification model composed of the CNN is constructed; the expanded training data set is used to train the classification model; saliency detection is carried out on a test image via a simple attention model of image visual saliency (the spectral residual method) to obtain a salient feature image; and morphological processing is carried out on the salient feature image, connected domains are marked in the processed feature image, target candidate areas corresponding to different centroids are extracted by taking the centroids of the connected domains as centers, and the target candidate areas are translated within the surrounding pixels to generate a target detection result. According to the invention, the CNN and the selective attention mechanism are applied jointly to SAR image target detection, the efficiency and accuracy of SAR image target detection are improved, the method can be applied to target classification and identification, and the problem that detection in the prior art has low efficiency and accuracy is solved.
Owner: XIDIAN UNIV
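
The "simple attention model (spectral residual method)" in the abstract refers to the spectral residual saliency algorithm; the sketch below implements that saliency step plus a rough thresholding of candidate regions. The blur size, smoothing sigma and threshold are common defaults, not the patented settings.

```python
import numpy as np
from scipy.ndimage import uniform_filter, gaussian_filter

def spectral_residual_saliency(img):
    # img: 2-D grayscale SAR image as a float array
    spectrum = np.fft.fft2(img)
    log_amp = np.log(np.abs(spectrum) + 1e-8)
    phase = np.angle(spectrum)
    residual = log_amp - uniform_filter(log_amp, size=3)        # spectral residual
    saliency = np.abs(np.fft.ifft2(np.exp(residual + 1j * phase))) ** 2
    return gaussian_filter(saliency, sigma=2.5)                  # smoothed saliency map

# Rough candidate mask: keep strongly salient pixels for region extraction
sal = spectral_residual_saliency(np.random.rand(128, 128))
candidates = sal > sal.mean() + 2 * sal.std()
```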

Semantic segmentation method and system for RGB-D image

The invention discloses a semantic segmentation method and system for an RGB-D image. The semantic segmentation method comprises the steps of: extracting RGB coding features and depth coding features of an RGB-D image in multiple stages; inputting the RGB coding features and the depth coding features of each of the stages into an attention model to obtain the multi-modal fusion feature corresponding to each stage; extracting context semantic information of the multi-modal fusion features of the fifth stage by using a long short-term memory network; splicing the multi-modal fusion features of the fifth stage with the context semantic information to obtain context semantic features; and performing up-sampling on the context semantic features, and fusing them with the multi-modal fusion features of the corresponding stages through skip connections to obtain a semantic segmentation map and a semantic segmentation model. By extracting RGB coding features and depth coding features of the RGB-D image in multiple stages, the semantic segmentation method effectively utilizes the color information and depth information of the RGB-D image, and by using a long short-term memory network it effectively mines the context semantic information of the image, so that the semantic segmentation accuracy of the RGB-D image is improved.
Owner: HANGZHOU WEIMING XINKE TECH CO LTD +1
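
A minimal sketch of the per-stage attention fusion described above: channel attention weights, computed from the concatenated RGB and depth features, decide how much of each modality enters the fused multi-modal feature. The gating form and sizes are assumptions for illustration; the LSTM context branch and decoder are omitted.

```python
import torch
import torch.nn as nn

class ModalityAttentionFusion(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # channel attention computed from the concatenated RGB and depth features
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(2 * channels, 2 * channels, kernel_size=1),
            nn.Sigmoid())

    def forward(self, rgb_feat, depth_feat):
        # rgb_feat, depth_feat: (B, C, H, W) features from the same encoder stage
        w = self.gate(torch.cat([rgb_feat, depth_feat], dim=1))   # (B, 2C, 1, 1)
        w_rgb, w_depth = w.chunk(2, dim=1)
        return w_rgb * rgb_feat + w_depth * depth_feat            # fused multi-modal feature

# Example: fusing stage features of 64 channels at 56x56 resolution
fused = ModalityAttentionFusion(64)(torch.randn(1, 64, 56, 56), torch.randn(1, 64, 56, 56))
```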

Recurrent neural network attention model-based pedestrian attribute recognition network and technology

Status: Active | Patent: CN108921051A | Benefits: high pedestrian attribute recognition accuracy; highlights attribute regions for accurate recognition | Topics: character and pattern recognition, neural architectures, attention model, prediction probability
The invention provides a recurrent neural network attention model-based pedestrian attribute recognition network and technology. The pedestrian attribute recognition network comprises a first convolutional neural network, a recurrent neural network and a second convolutional neural network, wherein the first convolutional neural network is used for extracting a whole-body image feature of a pedestrian by taking an original body image of the pedestrian as input; the recurrent neural network is used for outputting an attention heat map of the attribute group concerned at the current moment and locally highlighted pedestrian features, taking the whole-body image feature of the pedestrian as a first input and the attention heat map of the attribute group concerned at the previous moment as a second input; and the second convolutional neural network is used for outputting the attribute prediction probability of the currently concerned group by taking the locally highlighted pedestrian features as input. According to the network and technology, a recurrent neural network attention model is utilized to mine the association relationship of the spatial positions of pedestrian attribute areas so as to highlight the positions of the areas corresponding to the attributes in images, so that higher pedestrian attribute recognition precision is realized.
Owner: TSINGHUA UNIV
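
The sketch below illustrates the recurrent attention loop described above: at each step a GRU-based module, conditioned on the previous attention heat map, pools the whole-body feature map, emits a new heat map for the current attribute group, and a small head predicts that group's attributes. The GRU choice, feature sizes and prediction heads are illustrative assumptions.

```python
import torch
import torch.nn as nn

class RecurrentGroupAttention(nn.Module):
    def __init__(self, channels=512, groups=(4, 3, 5)):
        super().__init__()
        self.rnn = nn.GRUCell(channels, channels)
        self.to_heatmap = nn.Conv2d(channels, 1, kernel_size=1)
        self.heads = nn.ModuleList([nn.Linear(channels, g) for g in groups])

    def forward(self, feat):
        # feat: (C, H, W) whole-body feature map from the first CNN
        C, H, W = feat.shape
        h = torch.zeros(1, C)
        heat = torch.full((1, H, W), 1.0 / (H * W))      # uniform initial attention
        preds = []
        for head in self.heads:                          # one step per attribute group
            pooled = (feat * heat).flatten(1).sum(-1)    # (C,) attention-pooled feature
            h = self.rnn(pooled.unsqueeze(0), h)
            heat = torch.softmax(
                self.to_heatmap((feat * h.view(C, 1, 1)).unsqueeze(0)).flatten(),
                dim=0).view(1, H, W)                     # heat map for the current group
            preds.append(torch.sigmoid(head(h.squeeze(0))))
        return preds                                     # per-group attribute probabilities

# Example: a 7x7x512 whole-body feature map, three attribute groups
group_probs = RecurrentGroupAttention()(torch.randn(512, 7, 7))
```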