326 results for "Enhanced Representational Capabilities" patented technology

Target detection method based on a dense connection characteristic pyramid network

The invention discloses a target detection method based on a dense connection feature pyramid network, and relates to image processing and computer vision technology. The method comprises the steps of collecting an image data set labeled with target bounding boxes and category information; constructing a dense connection feature pyramid network containing a Squeeze-and-Excitation structure, which can learn the dependency relationships between feature channels, as the feature extraction backbone network; alternately training the RPN subnet and the R-FCN subnet to obtain a target detection model; and detecting specific targets in an image by using the model. The Squeeze-and-Excitation structure and the dense connection structure are introduced into the feature extraction backbone network to enhance the representation capability of the model; the feature pyramid structure enhances the adaptability of the model to targets of different sizes; and the R-FCN detection head maximizes computation sharing across the whole network model, which saves computing resources and improves the performance of the whole target detection model.
Owner:SOUTH CHINA UNIV OF TECH
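
The channel re-weighting idea behind a Squeeze-and-Excitation structure can be illustrated with a minimal pure-Python sketch. It is not the patented network: the function name `squeeze_excite`, the weight layout, and the omission of biases are assumptions made for illustration only.

```python
import math

def squeeze_excite(feature_map, w1, w2):
    """Re-weight channels of a feature map (illustrative sketch, biases omitted).

    feature_map: list of C channels, each an H x W list of lists.
    w1: hidden x C weight rows (the channel reduction layer).
    w2: C x hidden weight rows (the channel expansion layer).
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation: fully connected layer + ReLU, then fully connected layer + sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value of a channel by that channel's gate.
    scaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates

# Two 2x2 channels; the hypothetical weights make channel 1 more important.
fmap = [[[1.0, 1.0], [1.0, 1.0]], [[3.0, 3.0], [3.0, 3.0]]]
scaled, gates = squeeze_excite(fmap, w1=[[1.0, 0.0]], w2=[[0.0], [1.0]])
```

Because each gate lies between 0 and 1, the block can only attenuate channels relative to one another, which is how it expresses a learned dependency between feature channels.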

Method for detecting and identifying curve characters in natural scene image

The invention discloses a method for detecting and identifying curve characters in a natural scene image, which is used for solving the problems of fuzzy boundaries and low background contrast in curve character identification and for improving the curve character detection precision. The method mainly comprises the following steps: (1) training a curve character detection network based on a Mask RCNN network, detecting a natural scene image by using the trained curve character detection network, and detecting a character region in the image; (2) correcting curve characters in the character region into horizontal characters by utilizing a correction network, and outputting a corrected image; and (3) training a curve character recognition network, extracting convolution features of the corrected image by utilizing the trained curve character recognition network, decoding the convolution features, and recognizing the characters.
Owner:INST OF INFORMATION ENG CAS

Implementation method for fusing network question and answer system based on multi-attention mechanism

The invention discloses an implementation method of a fusion network question and answer system based on a multi-attention mechanism, which comprises the following steps: constructing a question and answer system network model, preprocessing an original data set to obtain a standby data set, and performing text length distribution analysis; representing the text in the standby data set as one-hot vectors, using a CBOW model to train the one-hot word vectors, and forming a word2vec word list; adjusting the sequence length of each sentence in the text, and adding a sentence end mark; training the word2vec vectors by using an ELMo language model to obtain ELMo word vectors; encoding the ELMo vectors to obtain sentence vectors; performing coarse-grained and fine-grained attention on the sentence vectors respectively to obtain memory vectors and attention vectors based on each word; splicing the vectors to obtain representation vectors based on words and sentences; and decoding the representation vectors to generate an answer to the question sentence. According to the method, the representation ability of sentences is improved through the ELMo language model; and various attention mechanisms are fused, so that the decision making accuracy of the system is improved, and the interpretability of the system is enhanced.
Owner:GUANGDONG UNIV OF TECH
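
The preprocessing steps named in the abstract (one-hot representation, sequence length adjustment, sentence end mark) can be sketched as follows. The helper names and the `<eos>` / `<pad>` tokens are hypothetical choices, not taken from the patent.

```python
def one_hot(index, vocab_size):
    """Return a one-hot list with a 1 at position `index`."""
    vector = [0] * vocab_size
    vector[index] = 1
    return vector

def pad_with_end_mark(tokens, max_len, eos="<eos>", pad="<pad>"):
    """Truncate or pad a token list to exactly max_len, always keeping an end mark."""
    # Reserve one slot for the end mark, then fill the remainder with padding.
    trimmed = tokens[: max_len - 1] + [eos]
    return trimmed + [pad] * (max_len - len(trimmed))

# Build a toy vocabulary and encode one fixed-length sentence.
sentence = pad_with_end_mark(["what", "is", "attention"], max_len=5)
vocab = {token: i for i, token in enumerate(sorted(set(sentence)))}
encoded = [one_hot(vocab[token], len(vocab)) for token in sentence]
```

Fixing the length before encoding keeps every sentence the same shape, which is what a batch-trained encoder typically expects.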

Hyperspectral image classification method based on depth feature cross fusion

The invention proposes a hyperspectral image classification method based on depth feature cross fusion, and mainly solves the problem of spatial feature loss of a traditional convolutional neural network during hyperspectral data classification. According to the technical scheme, the method comprises the following steps: 1, reading hyperspectral data and preprocessing each spectral band; 2, constructing data samples by using the preprocessed hyperspectral data, and generating training set and test set data; 3, constructing a hyperspectral image classification network based on depth feature cross fusion; 4, training the network by using the training set data; and 5, performing classification prediction on the test set data by using the trained network. According to the method, depth features of different branch stages and different scales are fused for multi-channel original data, and information is continuously exchanged among the multi-scale representations, so that the depth feature expression capability of the model is improved; the multi-scale spatial information of different layer depth features of the hyperspectral data is effectively utilized, and the classification precision is improved.
Owner:XIDIAN UNIV

Multi-class entity recognition model training method, entity recognition method, server and terminal

The invention discloses a multi-class entity recognition model training method, an entity recognition method, a server and a terminal. The multi-class entity recognition model training method comprises the steps: carrying out the entity and entity class labeling of corpus information, and obtaining the target annotated corpus information comprising an entity and an entity class label; performing multi-dimensional feature analysis processing on the corpus information in the target annotated corpus information to obtain multi-dimensional information of the target annotated corpus information; performing multi-class entity recognition training on the preset deep learning model based on the multi-dimensional information and entities and entity class tags in the target annotation corpus information to obtain a multi-class entity recognition model, wherein the preset deep learning model comprises a feature input conversion layer, a semantic sequence representation layer, an entity feature screening layer and a class entity output layer. By utilizing the technical scheme provided by the invention, the entities and the entity categories in the corpus information can be quickly and accurately identified, and multi-category entity identification is realized.
Owner:TENCENT TECH (SHENZHEN) CO LTD

Text word vector model training method, electronic equipment and computer storage medium

The invention relates to the technical field of computer processing, and discloses a training method of a text word vector model, electronic equipment and a computer readable storage medium. The training method of the text word vector model comprises the steps of: determining a sub-model corresponding to each training statement according to statement tags included in each training statement, the statement tags being used for indicating the sub-models corresponding to the training statements; and respectively training the corresponding semantic word vector sub-model and the text classification sub-model through each training statement to update a first word vector matrix of the text word vector model, so as to train the text word vector model by updating the first word vector matrix. According to the method provided by the embodiment of the invention, through the combination of the semantic word vector sub-model and the text classification sub-model, the close connection and fusion between the word vector training method and the text classification method are realized, and the representation capability of the word vector is enhanced.
Owner:TENCENT TECH (SHENZHEN) CO LTD

Multi-class small target detection method based on metric learning

The invention relates to a multi-class small target detection method based on metric learning, and designs a novel deep neural network structure by combining the feature expression capability of deep learning with the similarity discrimination capability of metric learning according to the recognition characteristics of multi-class small targets. The method is characterized in that a Faster R-CNN (Region-based Convolutional Neural Network) structure combined with a Feature Pyramid Network (FPN) is adopted to detect multiple types of small targets on the basis of data of a whole image; a graph network module is embedded into the network to carry out transmission calculation on similarity information among all regions in the image; and a similarity measurement module based on triplet loss is adopted at the rear end of the network to distinguish detail information among samples, so that feature information of small targets and similarity relations among the targets are fully extracted, and the accuracy of multi-class small target detection is improved.
Owner:NORTHWESTERN POLYTECHNICAL UNIV
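
The triplet loss mentioned in the abstract can be illustrated with a minimal sketch. It uses the common hinge form with plain Euclidean distance; the patent's exact distance, margin, and sampling scheme are not stated, so the `margin` default below is an assumption.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: pull the positive closer than the negative by `margin`."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

# The negative is far away, so the constraint is already satisfied (zero loss).
easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
# The negative is only slightly farther than the positive, so the loss is positive.
hard = triplet_loss([0.0, 0.0], [0.0, 1.0], [0.0, 1.5])
```

Only the "hard" triplet contributes a gradient, which is why such a loss focuses training on fine detail differences between similar samples.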

Multi-mode complex activity recognition method based on deep learning model

The invention discloses a multi-mode complex activity recognition method based on a deep learning model. To be specific, the method comprises: step one, classifying different-mode time sequence data into different types and carrying out expression extraction by using convolutional neural networks (CNN) with different structures; step two, carrying out fusion on expressions in different modes by using a longitudinal splicing layer and the convolutional layers; and step three, extracting sequence features further by using an LSTM network to obtain a complex activity tag. According to the invention, complex activities are identified by using a deep learning model. The multi-mode complex activity recognition method has the broad application prospects in fields of health care, industrial assistance, skill evaluation and the like.
Owner:ZHEJIANG UNIV

Pedestrian re-identification method based on natural language description

The invention discloses a pedestrian re-identification method based on natural language description and relates to processing of a recording carrier for identifying graphics. Specifically, a double-branch network structure for images and natural language descriptions is designed. In the image branch network structure, a MobileNet convolutional network is adopted to carry out image feature extraction; in the natural language description branch network structure, text features are extracted through a BiLSTM network. A stacking loss function is created for the similarity measurement part between the image features and the text features, and network training is carried out. The trained network is then used to search the to-be-detected image set for the corresponding pedestrian image, achieving pedestrian re-identification based on natural language description with the stacking loss function. The method overcomes the defects of the prior art, in which the text feature characterization of the feature extraction part is weak, the loss function part makes the network difficult and slow to train, and a large amount of memory is consumed in the training process.
Owner:HEBEI UNIV OF TECH

Gesture detection method based on multi-feature fusion

The invention provides a gesture detection method based on multi-feature fusion. The method includes the steps that a cascaded Gentle Adaboost classifier is trained according to a method of fusion of the HOG feature, the variance feature and the Haar feature, so that a gesture classifier is formed; a skin color pre-examination is conducted through a skin color pre-examination module on an image collected by a camera and regions with the color similar to the skin color are screened out; the skin color regions which are screened out are traversed according to a sliding window method, and the input image considered to contain gestures is calibrated through rectangular frames; repeated rectangular region windows which are repeatedly judged as candidate gesture regions multiple times by the classifier are combined, and therefore the gesture of the image is calibrated. The method of fusion of the HOG feature, the variance feature and the Haar feature is selected for training the classifier and the image is measured through various feature values according to multiple peculiarities of the hands, so that the representing performance of the human hands is improved and the accuracy of a detection system is improved.
Owner:XI AN JIAOTONG UNIV

Fabric defect detection method based on multi-feature matrix low-rank decomposition

The invention discloses a fabric defect detection method based on multi-feature matrix low-rank decomposition. The method comprises the steps of image blocking, multi-channel feature matrix extraction, united low-rank decomposition, and saliency map generation and partitioning, wherein a fabric image is divided into image blocks with the same size, a second-order gradient direction map of each image block is calculated, a retina P-type ganglion cell coding mode is adopted to extract image features, and a feature matrix is generated; an effective low-rank decomposition model is constructed according to the feature matrix, optimal solving is performed through the alternating direction method of multipliers, and a low-rank matrix and a sparse matrix are generated; and a threshold segmentation algorithm is adopted to partition a saliency map generated by the sparse matrix, and defect positions are found. According to the method, the complexity of fabric texture features and the diversity of defect types are comprehensively considered, second-order features capable of effectively representing the fabric texture features are extracted, the united low-rank decomposition model is adopted to effectively realize quick separation of defects and background, and the method has high detection precision.
Owner:ZHONGYUAN ENGINEERING COLLEGE

A natural interaction method of virtual learning environment based on speech emotion recognition

The invention relates to a natural interaction method of a virtual learning environment based on speech emotion recognition, belonging to the field of deep learning. The method comprises the following steps: 1, collecting speech signals of students and users through Kinect, and performing resampling, framing with windowing, and silence processing to obtain short-time single frame signals; 2, carrying out a fast Fourier transform on the signals to obtain frequency domain data, obtaining the power spectrum thereof, and adopting a Mel filter bank to obtain a Mel spectrogram; 3, inputting the features of the Mel spectrogram into a convolutional neural network, performing convolution and pooling operations, and inputting the matrix vectors of the last downsampling layer to the fully connected layer to form a vector output feature; 4, compressing the output feature and inputting it into a bi-directional long short-term memory neural network; 5, inputting the output features into a support vector machine to classify and output a classification result; and 6, feeding back the classification result to the virtual learning system for virtual learning environment interaction. The invention drives learners to adjust their learning state and enhances the practicability of the virtual learning environment.
Owner:CHONGQING UNIV OF POSTS & TELECOMM
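
Steps 1 and 2 of the abstract (framing with a window, frequency transform, power spectrum, Mel scale) can be sketched in pure Python. This is an illustrative front end, not the patented pipeline: it uses a naive DFT in place of an FFT, a Hamming window as an assumed window type, and the widely used `2595 * log10(1 + f / 700)` Mel conversion; the Mel filter bank itself is left out.

```python
import cmath
import math

def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames and apply a Hamming window to each."""
    window = [
        0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
        for n in range(frame_len)
    ]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append([samples[start + n] * window[n] for n in range(frame_len)])
    return frames

def power_spectrum(frame):
    """Power spectrum for bins 0..N/2 using a naive DFT (an FFT yields the same values)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum

def hz_to_mel(freq_hz):
    """Convert a frequency in Hz to the Mel scale (common 2595/700 formula)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)

# A pure tone that falls exactly on DFT bin 4 of a 32-sample frame.
samples = [math.sin(2 * math.pi * 4 * t / 32) for t in range(64)]
frames = frame_signal(samples, frame_len=32, hop=16)
spectrum = power_spectrum(frames[0])
```

A Mel filter bank would then sum this power spectrum inside triangular bands spaced evenly on the `hz_to_mel` axis, giving one row of the Mel spectrogram per frame.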

A Method for Inkjet Printing Texture Image Registration Based on Cell Decomposition Optical Flow Field

Active | CN102262781A | Reduce bias defects | Avoid difficulties | Image analysis | Pattern recognition | Incomplete Cholesky factorization
The invention relates to a method for the registration of an ink-jet printing texture image based on a unit decomposition optical flow field. The method comprises the following steps of: 1, inputting a reference image and a distorted image and setting iteration implementation parameters; 2, performing incomplete Cholesky decomposition on a coefficient matrix of a registration displacement vector; 3, taking a primary function as a linear primary function, and solving a unit decomposition coefficient column vector by a preconditioned conjugate gradient method according to an iteration error value; 4, calculating an overall error estimation value, and adjusting a scale space to obtain a rough scale space; 5, taking the primary function as a second order primary function, and solving the unit decomposition coefficient column vector on the rough scale space according to a local error estimation result to obtain a fine scale space; and 6, stacking and expanding the unit decomposition coefficient column vector on the fine scale space to obtain the registration displacement vector and complete registration. By the method, the registration representation capability of a feature texture curve can be effectively improved, and both the registration precision and the registration speed in a noisy environment are increased. Therefore, the method is applicable to the registration of ink-jet printing texture images.
Owner:ZHEJIANG UNIV OF TECH

Method for recognizing compressed-domain sensitive images based on visual attention models

The invention discloses a method for recognizing compressed-domain sensitive images based on visual attention models, and belongs to the image recognition field. Prior methods for recognizing sensitive images based on visual words are improved, visual attention models are built according to human visual attention mechanisms, sensitive areas in accordance with human subjective feelings are detected, and relevant characteristics are extracted to generate a visual word bank, accordingly, representation of the visual words is improved effectively, characteristics capable of describing image sensitive information accurately are obtained, and the purpose for improving sensitive image recognition accuracy is achieved. Besides, the invention further provides a compressed-domain sensitive image processing technology, characteristics of joint photographic experts group (JPEG) image compressed data are used, the sensitive areas of images are detected rapidly, and characteristics of the sensitive areas are extracted, so that visual word bank building and image recognition speeds are increased effectively.
Owner:BEIJING UNIV OF TECH

Full convolutional network fabric defect detection method based on attention mechanism

The invention provides a full convolutional network fabric defect detection method based on an attention mechanism. The full convolutional network fabric defect detection method comprises the steps of: firstly, extracting multi-stage and multi-scale intermediate depth feature maps of a fabric image through an improved VGG16 network, and processing them through the attention mechanism to obtain multi-stage and multi-scale depth feature maps; then, up-sampling the multi-level and multi-scale depth feature maps by utilizing bilinear interpolation to obtain multi-level feature maps with the same size, and fusing them by utilizing a short connection structure to obtain multi-level saliency maps; and finally, fusing the multi-level saliency maps by adopting weighted fusion to obtain a final saliency map of the defect image. According to the full convolutional network fabric defect detection method, complex defect characteristics and various backgrounds of the fabric image are comprehensively considered, the representation capability of the fabric image is improved by simulating the attention mechanism of human visual cognition, and the noise influence in the image is eliminated, so that the detection result has higher adaptivity and detection precision.
Owner:ZHONGYUAN ENGINEERING COLLEGE

Speaker recognition method based on Gaussian super vector and deep neural network

The invention discloses a speaker recognition method based on a Gaussian super vector and a deep neural network. The method comprises a speaker feature extraction stage, a deep neural network design stage, and a speaker identification and decision-making stage. According to the invention, the deep neural network is fused with a speaker recognition system model, and the multilayer structure combining the Gaussian super vector and the deep neural network markedly improves the characterization capability of the evaluation model. The speaker recognition method provided by the invention can effectively improve the recognition performance of a system in the presence of background noise, reduces the influence of the noise on the system performance, improves the noise robustness of the system, optimizes the system structure, and improves the competitiveness of a corresponding speaker recognition product.
Owner:HUBEI UNIV OF TECH

Handwritten character recognition method

The invention relates to a handwritten character recognition method. The method comprises the steps of: a, normalizing handwritten input data, defining the number of neurons, building an autoencoder model, and initializing weights and biases; b, carrying out data compression and sampling through a compressed sensing model; c, automatically encoding and decoding the obtained data, rebuilding the handwritten input data, and minimizing the errors between the rebuilt data and the original handwritten input data; d, stacking the built models layer by layer to form an n-layer neuron deep feature learning model, and carrying out deep feature learning by traversing the n layers of neurons, wherein n is a natural number; and e, outputting the recognized handwritten character. By simulating how human brain visual neurons sense the features of objects, compressed sensing and deep learning are combined, detailed features representing handwritten characters are mined automatically, the representation ability of the handwritten character and the model learning efficiency are effectively improved, and the recognition precision and the recognition efficiency of a handwritten character, especially a handwritten number, are greatly improved.
Owner:SOUTHWEST JIAOTONG UNIV

Super-resolution reconstruction method and system for image

Pending | CN111461973A | Super resolution effect is good | The importance of effective identification | Geometric image transformation | Character and pattern recognition | Pattern recognition | Computer graphics (images)
The invention discloses a super-resolution reconstruction method and system for an image. The method is based on a channel attention mechanism and a position attention mechanism of an image. Features of each image block are extracted at different depths and then fused, and the importance of the feature channel information and the position information is effectively identified; and since the features of different depths make different contributions to the image super-resolution, the method and the system make full use of and fuse the features of different depths, so that the obtained image features retain a larger receptive field and better detail features at the same time, and the super-resolution effect of the obtained image is better after reconstruction based on the features. Besides, the relation between the features is fully mined so that the performance bottleneck problem caused by an overly deep network can be alleviated, the feature representation capability of the image is strong, and robustness can be achieved.
Owner:HUAZHONG UNIV OF SCI & TECH

DSPP (Discriminant Sparsity Preserving Projections) method for unconstrained face recognition

The invention provides a DSPP (Discriminant Sparsity Preserving Projections) method for unconstrained face recognition. The method comprises the following steps of: (1) when calculating a sample reconstructing relation matrix W, increasing weight coefficients of similar non-neighboring samples by using category labels and intra-class compactness constraints; (2) when calculating a low-dimensional mapping matrix P, adding a global constraint factor to further reduce the influence of dissimilar false-neighboring samples on the projection matrix, to ensure that the low-dimensional manifold essential structure hidden in complex redundant data can be more accurately mined; and (3) realizing low-dimensional linear mapping of high-dimensional sample data. The method provided by the invention has the beneficial effects that, for unconstrained face images obtained in a real environment, the DSPP can more accurately eliminate redundant information in high-dimensional data, extract essential features and enhance the representation ability; and in the meantime, the data dimension is also reduced, the storage space is saved, and the reliability and the effectiveness of the face recognition are greatly increased.
Owner:NANJING INST OF TECH

Video saliency detection method based on Bayesian fusion

The invention discloses a video saliency detection method based on Bayesian fusion and mainly solves the problem that existing video saliency detection methods cannot detect small targets. The implementation scheme is: 1) extracting a video sequence static boundary probability saliency map, a color mean value saliency map and a color contrast saliency map, and carrying out weight fusion to generate a static saliency map; 2) extracting a video sequence dynamic boundary probability saliency map, a PCA prior saliency map and a background prior saliency map, and carrying out weight fusion to generate a dynamic saliency map; and 3) carrying out fusion on the static saliency map and the dynamic saliency map through a Bayesian model to obtain a video sequence saliency map. Compared with a conventional video saliency algorithm, the method enhances the spatial and temporal representation ability of the features, reduces the influence of complex backgrounds on the detection effect, can detect small targets in a video effectively, and can be used for early-stage preprocessing of video target tracking and video segmentation.
Owner:XIDIAN UNIV

Optical remote sensing scene classification method based on deep twin capsule network

Inactive | CN110321859A | Improve performance | Eliminate the vanishing gradient problem | Scene recognition | Sensing data | Data set
The invention discloses an optical remote sensing image scene classification method based on a deep twin capsule network. The method comprises the following steps of: 1, deleting the average pooling layer in the deep part of a deep residual network and the layers behind the average pooling layer; 2, taking the fine-tuned deep residual network as a feature extractor; 3, extracting features of the input images respectively, and converting the obtained features into capsule features; 4, introducing the idea of a twin network, and copying a single deep capsule network into a double deep capsule network to form two identical networks with shared feature extractor parameters; 5, calculating the distance between the two features to represent the similarity degree of the image pairs; and 6, carrying out capsule propagation by using a dynamic routing algorithm to finish image classification. Feature space information is preserved by utilizing capsules, and a twin network structure is combined, so that the features have higher identifiability on a remote sensing data set. In addition, a regularization term is added to enhance the robustness of the model.
Owner:CHINA UNIV OF MINING & TECH

Gearbox fault diagnosis method and system based on singular value spectrum manifold analysis

The invention relates to the technical field of mechanical fault diagnosis, pattern recognition and the like, in particular to a gearbox fault diagnosis method and system based on singular value spectrum manifold analysis. The method comprises the steps of: acquiring a fault vibration signal of a gearbox and preprocessing the signal to form a plurality of one-dimensional original vibration signal data; carrying out phase space reconstruction processing to obtain a plurality of two-dimensional matrixes; carrying out singular value decomposition on the reconstructed two-dimensional matrixes to obtain a singular value spectrum of the two-dimensional matrixes; calculating a slope of the singular value spectrum to obtain manifold characteristics of the singular value spectrum; training a support vector machine by adopting the characteristic data and constructing a fault diagnosis model; and inputting vibration signal data of a to-be-tested gearbox into the fault diagnosis model and outputting a fault diagnosis classification result of the to-be-tested gearbox. According to the invention, the characteristic extraction of fault data of the gearbox is realized by using the singular value spectrum manifold analysis; the change trend of signal components can be effectively extracted; the influence of noise can be removed; the characterization capability of faults by characteristics can be enhanced; and the gearbox fault diagnosis precision can be improved.
Owner:CHONGQING UNIV OF POSTS & TELECOMM
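
Two of the steps above lend themselves to a short sketch: phase space reconstruction of a one-dimensional signal into a two-dimensional matrix, and the slope of a singular value spectrum. The delay-embedding form, the function names, and the finite-difference slope are assumptions for illustration; the singular value decomposition itself is omitted and the singular values below are made-up inputs.

```python
def phase_space_reconstruct(signal, dim, delay=1):
    """Build a trajectory matrix: row i is [x[i], x[i+delay], ..., x[i+(dim-1)*delay]]."""
    n_rows = len(signal) - (dim - 1) * delay
    if n_rows <= 0:
        raise ValueError("signal too short for the requested dimension and delay")
    return [[signal[i + j * delay] for j in range(dim)] for i in range(n_rows)]

def spectrum_slopes(singular_values):
    """Finite-difference slope between consecutive values of a singular value spectrum."""
    return [b - a for a, b in zip(singular_values, singular_values[1:])]

# Embed a short signal into a 3-column matrix.
matrix = phase_space_reconstruct([1, 2, 3, 4, 5], dim=3)
# Slopes of a hypothetical, already-computed singular value spectrum.
slopes = spectrum_slopes([5.0, 2.0, 1.0])
```

In practice the singular values would come from a decomposition of `matrix`, and a vector such as `slopes` would be one feature sample for the support vector machine.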

Rolling bearing residual life prediction method based on improved multi-granularity cascade forest

The invention discloses a rolling bearing residual life prediction method based on an improved multi-granularity cascade forest, belongs to the field of rolling bearing residual life prediction, and solves the problems of poor precision and low operation efficiency of existing artificial intelligence methods in rolling bearing residual life prediction. The method includes: firstly, carrying out iterative computation on a rolling bearing frequency domain signal obtained through fast Fourier transform to obtain iterative features; replacing the multi-granularity scanning structure in the multi-granularity cascade forest with a convolutional neural network, extracting deep features of the iterative features by using the convolutional neural network, and constructing a performance degradation feature set; then integrating single CatBoost models capable of GPU parallel acceleration, introducing a determination coefficient R² to construct a CasCatBoost structure so as to improve the representation capability of the model, and selecting the average life percentage p of the last cascade layer of the model as the output; and finally, fitting p by using a linear function to predict the residual life of the bearing. The method has relatively high operation efficiency and prediction precision.
Owner:HARBIN UNIV OF SCI & TECH
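
The final step (fitting the life percentage p with a linear function) and the determination coefficient R² can be sketched as below. This is a generic least-squares illustration; the assumption that failure corresponds to p = 1.0, and the helper names, are not taken from the patent.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def remaining_life(slope, intercept, t_now, p_failure=1.0):
    """Time left until the fitted line reaches p_failure (assumed failure threshold)."""
    if slope <= 0:
        raise ValueError("life percentage must increase over time")
    return (p_failure - intercept) / slope - t_now

# Toy life-percentage readings that grow linearly with time.
times = [0.0, 1.0, 2.0, 3.0]
p_values = [0.1, 0.2, 0.3, 0.4]
slope, intercept = fit_line(times, p_values)
fitted = [slope * t + intercept for t in times]
```

With these toy readings, `remaining_life(slope, intercept, t_now=3.0)` extrapolates the line to the failure threshold, and `r_squared(p_values, fitted)` indicates how well the linear assumption holds.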

High-speed correlation filtering tracking method based on a high-confidence updating strategy

The invention relates to a high-speed correlation filtering tracking method based on a high-confidence updating strategy. A target positioning module and a high-confidence updating module are respectively designed. In the tracking process, the target positioning module fuses the gray scale, histogram of oriented gradients and color space features, combines them with a feature dimension reduction method to train a correlation filter, and achieves quick positioning of the target center based on a correlation filtering algorithm. The high-confidence updating module designs a high-confidence updating strategy by utilizing the response map obtained by the target positioning module: the highest response value of the response map and the Average Peak-to-Correlation Energy (APCE) are calculated, and scale estimation and model updating are carried out only when both index values meet their conditions at the same time. This avoids redundant scale estimation and filter model updating operations that may introduce noise and cause tracking drift under low confidence, and allows the method to adapt to complex scenes such as cluttered backgrounds and occlusion.
Owner:NORTHWESTERN POLYTECHNICAL UNIV
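
The two confidence indices named in the abstract above can be sketched as follows. The APCE formula used here is the commonly cited definition, APCE = (F_max - F_min)^2 / mean((F - F_min)^2). The update condition is an assumption: the abstract only says both indices must "meet conditions", so the ratio thresholds `beta_peak` and `beta_apce` and the use of historical averages are hypothetical.

```python
def apce(response):
    """Average peak-to-correlation energy of a 2-D response map (list of rows)."""
    values = [v for row in response for v in row]
    f_max, f_min = max(values), min(values)
    mean_sq = sum((v - f_min) ** 2 for v in values) / len(values)
    if mean_sq == 0:
        return 0.0  # completely flat map: no distinguishable peak
    return (f_max - f_min) ** 2 / mean_sq


def should_update(response, hist_peak, hist_apce, beta_peak=0.6, beta_apce=0.45):
    """Hypothetical rule: update only if BOTH the peak value and APCE reach a
    fraction of their historical averages (hist_peak, hist_apce)."""
    peak = max(v for row in response for v in row)
    return peak >= beta_peak * hist_peak and apce(response) >= beta_apce * hist_apce
```

A single sharp peak yields a high APCE, while a flat or multi-peaked map yields a low one, which is why the value can serve as a gate before scale estimation and model updating.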

Speaker identification method based on deep stack autoencoder network

The invention relates to a speaker identification method based on the deep stack autoencoder network. The method comprises the following steps: S1, speaker feature extraction; S2, stack autoencoder network design; and S3, speaker identification and decision making. Compared with traditional speaker identification, the method fuses the deep stack autoencoder network with the speaker identification system model and uses the multi-layer structure of the stack autoencoder to improve the characterization ability of the evaluation model. As a result, identification performance in the presence of background noise is improved, the influence of noise on system performance is reduced, noise robustness is increased, the system structure is optimized, and identification timeliness is effectively enhanced.
Owner:HUBEI UNIV OF TECH

Monocular vision road target detection and distance estimation method based on improved YOLOv3

The invention discloses a monocular vision road target detection and distance estimation method based on improved YOLOv3. It belongs to the field of image processing and computer vision and is suitable for intelligent systems such as driver assistance and road environment perception. The method comprises the following steps: acquiring road target images annotated with bounding box, distance and category information; constructing an improved YOLOv3 network model based on a Global-Context structure and a dilated (atrous) convolution pooling pyramid structure; redesigning the loss function of the network by combining the perspective projection relationship of the camera system with the bounding-box prediction mechanism of YOLOv3; and training a road target detection and distance estimation model, which is then used to detect road targets and estimate their distance. With this method, the distance of a road target can be estimated while the target is quickly and accurately recognized and located in a natural scene, which is of practical significance for real-world applications.
Owner:SOUTH CHINA UNIV OF TECH

Mobile equipment source identification method and system based on multimode fusion depth features

The invention belongs to the technical field of voice forensics and discloses a mobile device source identification method and system based on multimode fusion deep features. The method comprises the following steps: first, MFCC and GSV features are extracted from the test data and segmented correspondingly into multiple paths; CNNs are then trained on each path and fused to obtain fused deep features; the fused deep features are then classified by a trained deep residual network; finally, a voting method makes a joint decision over the classification results of the short samples of each path. When the GMM-UBM model is trained, the data is screened according to the phoneme and tone characteristics of the voice data and a small amount of representative data is selected, which preserves the representational generality of the model while reducing the amount of computation and improving modeling efficiency. The deep neural network is trained with supervision to extract the deep features, which eliminates redundant and interfering information in the feature data, simplifies the feature data, improves its representational quality, reduces its dimensionality, and lowers the amount of computation.
Owner:HUAZHONG NORMAL UNIV
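
The last step of the abstract above, a joint decision by voting over per-path results, can be sketched as a plain majority vote. The abstract does not specify the voting rule or how ties are broken, so both are assumptions here; `majority_vote` is a hypothetical name.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first (assumed rule)."""
    if not labels:
        raise ValueError("no per-path decisions to vote on")
    counts = Counter(labels)
    top = max(counts.values())
    for label in labels:
        if counts[label] == top:
            return label
```

For example, if the short samples of four paths are classified as device "A", "B", "A" and "C", the joint decision is "A".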

Bag-of-word model-based image security retrieval method for cloud environment

The invention belongs to the field of multimedia information security protection and particularly relates to a bag-of-word model-based image security retrieval method for a cloud environment. The method can be used for secure retrieval of ciphertext images. A content owner extracts feature operation data of the images based on a bag-of-word model, performs orthogonal decomposition to obtain an encryption operation domain and a feature extraction operation domain, performs the encryption operation and the feature extraction operation, superposes the operation results in the same data domain through the inverse orthogonal transformation to form encrypted features, and uploads the encrypted features to a cloud server. When a user needs to retrieve images, the cloud server can compute the features of the ciphertext images in the feature extraction data domain by performing orthogonal decomposition directly on the ciphertext features, without decryption. The features of the ciphertext images and the features of the query image are then compared by a similarity measure, and the images with the highest similarity are returned as the retrieval result. The retrieval method does not depend on a specific encryption method, is highly secure and widely applicable, and the bag-of-word model gives it very high retrieval precision.
Owner:WUHAN UNIV
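
The similarity-measurement step in the abstract above can be illustrated on plain bag-of-word histograms. This sketch deliberately leaves out the orthogonal decomposition and the encrypted domain, and the choice of cosine similarity is an assumption, since the abstract does not name the measure. The names `cosine` and `rank_images` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length histograms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_images(query_hist, database):
    """Return image names sorted from most to least similar to the query.
    database maps image name -> visual-word histogram."""
    return sorted(database, key=lambda name: cosine(query_hist, database[name]),
                  reverse=True)
```

Given histograms a = [1, 0, 0], b = [0, 1, 0] and c = [1, 1, 0], a query of [2, 0, 0] ranks a first, then c, then b.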

Chinese tourism field named entity identification method based on graph convolution neural network

In the Chinese tourism field named entity identification method based on the graph convolution neural network, the graph convolution neural network comprises an input layer, an embedded layer, a graph convolution layer and a hierarchical structure, and the input comprises named entities and non-entities. The method comprises the following steps: S1, simultaneously expanding to two sides by taking any non-entity of a tourism domain text as a center until every single character in a complete sentence is traversed; S2, extracting character features; S3, extracting character features; S4, inputting and training; S5, optimizing the graph convolution layer; S6, labeling all named entities in the tourism field text data, and introducing a Laplace regularization loss function into the graph convolution layer so as to mine the internal structure information of nodes and extract character features; and S7, obtaining the hierarchical relationship between the named entities and the non-entities. The method builds its character feature extraction on the graph convolution neural network and performs semantic modeling on the character features so as to correctly identify the named entities in the text.
Owner:XINJIANG UNIVERSITY
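
Step S6 above mentions a Laplace regularization loss on the graph convolution layer. The abstract does not give its form; one common choice, shown here purely as an assumption, is the graph Laplacian regularizer tr(HᵀLH) with L = D - A, which for a symmetric adjacency matrix equals half the sum over node pairs of A_ij times the squared distance between their feature vectors. The function name is hypothetical.

```python
def laplacian_regularizer(adj, feats):
    """Graph Laplacian penalty 0.5 * sum_ij adj[i][j] * ||feats[i] - feats[j]||^2.
    adj is a symmetric adjacency matrix; feats holds one feature vector per node."""
    n = len(adj)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if adj[i][j]:
                dist_sq = sum((a - b) ** 2 for a, b in zip(feats[i], feats[j]))
                total += adj[i][j] * dist_sq
    return 0.5 * total
```

The penalty is zero when connected nodes (here, characters) carry identical features and grows as neighbouring features diverge, so adding it to the training loss encourages smooth features along graph edges.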

Violent behavior recognition method based on sequential guidance of spatial attention

The invention discloses a violent behavior recognition method based on sequential guidance of spatial attention. In this method, features of RGB images and frame difference images are extracted by a two-stream deep convolutional network with shared parameters and serve as representations of spatial-domain and temporal-domain information respectively; the two streams of features are then fused, which improves the ability of the features to characterize violent behavior. The spatial attention module is guided by the time sequence: the spatial attention weights are guided by the hidden temporal state of a ConvLSTM. Compared with traditional self-attention, the temporally guided spatial attention assigns spatial weights according to global motion information, directing the network to attend to motion regions and ignore interference from background information. At the same time, increasing the proportion of motion-region features reduces missed detections when the target is relatively small. Test results on public data sets verify that the method improves violent behavior recognition performance.
Owner:XI AN JIAOTONG UNIV +1
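
The frame difference images used as the temporal stream in the abstract above can be sketched on small grayscale frames stored as nested lists. The abstract does not say how the difference is computed, so the absolute per-pixel difference and the `threshold` value are assumptions, and both function names are hypothetical.

```python
def frame_difference(prev_frame, curr_frame):
    """Absolute per-pixel difference between two equally sized grayscale frames."""
    return [[abs(c - p) for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(prev_frame, curr_frame)]


def motion_ratio(diff, threshold=10):
    """Fraction of pixels whose difference exceeds threshold (illustrative value)."""
    total = sum(len(row) for row in diff)
    moving = sum(1 for row in diff for value in row if value > threshold)
    return moving / total
```

Static background pixels difference to roughly zero, so the resulting image highlights moving regions, which is the kind of motion cue the temporal stream is meant to carry.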