42 results about "Word-sense disambiguation" patented technology

In computational linguistics, word-sense disambiguation (WSD) is an open problem concerned with identifying which sense of a word is used in a sentence. Solving it benefits other language-processing tasks, such as discourse analysis, search-engine relevance, anaphora resolution, coherence, and inference.

Word multi-prototype vector representation and word sense disambiguation method based on CRP clustering

The invention discloses a word multi-prototype vector representation and word sense disambiguation method based on CRP clustering, which comprises the following steps: 1, the text in a massive text corpus is cleaned and preprocessed to obtain plain text; the CRP algorithm is used to cluster the context-window representations of each target polysemous word in the corpus; the target polysemous words in the corpus are labeled according to their cluster assignments; and word vectors are trained on the labeled corpus to obtain the multi-prototype vector representation of the polysemous words; 2, the target short text is preprocessed to obtain a short-text word sequence; a target polysemous word in the word sequence is identified; the similarity between the context-window representation of the target polysemous word and the centroid of each cluster corresponding to that word in the corpus is computed; and the word vector of the cluster with the maximum similarity is used as the representation of the word's specific meaning in that context, thereby disambiguating the polysemous word. The invention addresses the problem of representing polysemy in word representations and the problem of identifying ambiguity in word meanings.
Owner:NORTH CHINA UNIV OF WATER RESOURCES & ELECTRIC POWER
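
A minimal sketch of the two stages described above, assuming CRP refers to the Chinese restaurant process and that context windows are already available as dense vectors. The function names, the `alpha` concentration parameter, and the similarity-weighted assignment rule are illustrative assumptions, not details taken from the patent.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def crp_cluster(contexts, alpha=1.0, seed=0):
    """Assign each context vector to a cluster, CRP style.

    An existing cluster is chosen with weight (cluster size) x (non-negative
    cosine similarity to its centroid); a new cluster is opened with weight
    alpha. This weighting is an assumption made for illustration.
    """
    rng = random.Random(seed)
    clusters = []
    for vec in contexts:
        weights = [len(c) * max(cosine(vec, centroid(c)), 0.0) for c in clusters]
        weights.append(alpha)  # weight of opening a new cluster
        choice = rng.choices(range(len(weights)), weights=weights)[0]
        if choice == len(clusters):
            clusters.append([vec])
        else:
            clusters[choice].append(vec)
    return clusters


def disambiguate(context_vec, centroids):
    """Return the index of the cluster centroid most similar to the context."""
    return max(range(len(centroids)), key=lambda k: cosine(context_vec, centroids[k]))
```

The index returned by `disambiguate` would then select the sense-specific word vector trained for that cluster.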

Training method and device for word sense disambiguation model

The embodiment of the invention provides a training method and device for a word sense disambiguation model. The method comprises the steps of: obtaining a word co-occurrence graph and semantic association graphs; selecting a first word from the training text; obtaining a positive example sample and a negative example sample corresponding to the first word; calculating the similarity between each word in the training text and the word represented by each node in each semantic association graph, and selecting a target association graph based on the similarity; based on the target association graph, determining a semantic vector of the first word, and based on the word co-occurrence graph, determining word vectors of the other words; encoding the training text with an encoder based on the determined semantic vector and word vectors; based on the word co-occurrence graph, determining a word vector of each word in the two samples, and encoding the samples with the encoder according to the determined word vectors; based on the encoding results, calculating a first text distance between the training text and the positive example sample and a second text distance between the training text and the negative example sample; and training the encoder with the objective that the first text distance be smaller than the second text distance.
Owner:ALIPAY (HANGZHOU) INFORMATION TECH CO LTD
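
The training objective above (first text distance smaller than second) resembles a triplet-style ranking loss. The sketch below is one possible formulation; the Euclidean distance and the `margin` parameter are assumptions, since the abstract names neither a distance function nor a margin.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two encoded texts (equal-length vectors)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss that is zero once the positive sample is closer to the
    anchor than the negative sample by at least `margin`."""
    d_pos = euclidean(anchor, positive)  # first text distance
    d_neg = euclidean(anchor, negative)  # second text distance
    return max(0.0, d_pos - d_neg + margin)
```

Minimizing this loss over many (training text, positive, negative) triples pushes the encoder toward the stated condition.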

A word sense disambiguation method and system based on graph model

The invention discloses a word sense disambiguation method and system based on a graph model, and belongs to the field of natural language processing technology. The technical problem to be solved by the invention is how to combine multiple Chinese and English resources so that they complement each other, fully exploit the disambiguation knowledge in those resources, and improve word sense disambiguation performance. The technical scheme adopted is as follows: 1, a word sense disambiguation method based on a graph model, comprising the following steps: S1, extracting contextual knowledge: carrying out part-of-speech tagging on the ambiguous sentence and extracting substantive words as contextual knowledge, wherein the substantive words are nouns, verbs, adjectives and adverbs; S2, similarity calculation: performing similarity calculation based on English, similarity calculation based on word vectors and similarity calculation based on HowNet; S3, constructing a disambiguation graph; S4, selecting the correct word sense. 2, a word sense disambiguation system based on a graph model, which comprises a contextual knowledge extraction unit, a similarity calculation unit, a disambiguation graph construction unit and a correct word sense selection unit.
Owner:ZAOZHUANG UNIV
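
Steps S2 to S4 can be sketched as follows. The abstract does not say how the three similarity scores are combined or how the graph is scored, so the weighted average and the sum-of-edge-weights rule here are assumptions, and all names are hypothetical.

```python
def combined_similarity(scores, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted average of (English-based, word-vector, HowNet) similarities."""
    return sum(w * s for w, s in zip(weights, scores))


def build_graph(sense_context_scores):
    """Build a disambiguation graph as {sense: {context_word: edge_weight}}.

    `sense_context_scores` maps (sense, context_word) to a tuple of the
    three similarity scores from step S2.
    """
    graph = {}
    for (sense, word), scores in sense_context_scores.items():
        graph.setdefault(sense, {})[word] = combined_similarity(scores)
    return graph


def choose_sense(graph):
    """Pick the sense node whose edges to context words carry the most weight."""
    return max(graph, key=lambda sense: sum(graph[sense].values()))
```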

Ambiguity elimination method and apparatus for e-commerce product comment vocabularies

The invention discloses an ambiguity elimination method and apparatus for e-commerce product comment vocabularies. The method comprises the steps of obtaining a historical comment text, and extracting historical feature words and historical sentiment words matched with the historical feature words from the historical comment text; according to a co-occurrence relationship between the historical feature words and the historical sentiment words corresponding to the historical feature words, screening out a combination of the historical feature words and the historical sentiment words, which occur most frequently; according to the combination of the historical feature words and the historical sentiment words, which occur most frequently, generating meaning item annotations of the historical feature words; obtaining a new comment text, and extracting a combination of new feature words and corresponding new sentiment words from the new comment text; and according to the combination of the new feature words and the new sentiment words, querying the combination of the historical feature words and the historical sentiment words, matched with the combination of the new feature words and the new sentiment words, and taking the meaning item annotations of the matched historical feature words as word meanings of the new feature words. According to the technical scheme, word meaning analysis and meaning item determination of the product comment vocabularies are realized, so that the meanings of product feature words are accurately determined in different comment contexts.
Owner:美云智数科技有限公司
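
The co-occurrence lookup described above can be sketched as a frequency table keyed by (feature word, sentiment word) pairs. The `min_count` threshold and the hand-written `glosses` dictionary are assumptions; the abstract only says that the most frequent combinations are kept and annotated.

```python
from collections import Counter


def frequent_combinations(pairs, min_count=2):
    """Keep (feature, sentiment) combinations seen at least `min_count` times."""
    counts = Counter(pairs)
    return {combo for combo, n in counts.items() if n >= min_count}


def build_sense_table(frequent, glosses):
    """Attach a meaning annotation to each frequent combination that has one."""
    return {combo: glosses[combo] for combo in frequent if combo in glosses}


def lookup_sense(sense_table, feature, sentiment):
    """Return the annotation for a new (feature, sentiment) pair, or None."""
    return sense_table.get((feature, sentiment))
```

For example, a table built from historical reviews could map `("screen", "sharp")` to a display-quality sense and `("blade", "sharp")` to a cutting-edge sense.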

Word sense disambiguation method and device based on combination of graph model and word vectors

The invention discloses a word sense disambiguation method and device based on the combination of a graph model and word vectors, belongs to the field of natural language processing, and is used for solving the problem of low accuracy of existing word sense disambiguation methods. The method comprises the steps of preprocessing a data set to obtain ambiguous words; constructing a graph model, and obtaining context background knowledge according to the graph model; training a word vector model, and representing the obtained ambiguous words and context background knowledge as word vectors according to the word vector model; and performing cross-weighted similarity calculation between the vector representations of the ambiguous words and the context background knowledge, taking the mean value, and determining the meaning item with the highest mean similarity as the correct meaning item of the ambiguous word. By combining the graph model with word vectors, the invention improves the accuracy of word sense disambiguation and achieves a good disambiguation effect. The method is superior to traditional word sense disambiguation methods and can meet the requirements of practical application.
Owner:INNER MONGOLIA UNIV OF SCI & TECH +1
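
The final selection step can be sketched as choosing the candidate sense whose vector has the highest mean similarity to the context vectors. The abstract does not define its "cross-weighted" scheme, so this sketch uses an unweighted mean of cosine similarities as a stand-in; the names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def pick_sense(sense_vectors, context_vectors):
    """Return the sense with the highest mean similarity to the context.

    `sense_vectors` maps a sense label to its vector; `context_vectors`
    must be a non-empty list of vectors for the background knowledge.
    """
    def mean_similarity(sense):
        sims = [cosine(sense_vectors[sense], c) for c in context_vectors]
        return sum(sims) / len(sims)

    return max(sense_vectors, key=mean_similarity)
```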

Bayesian word sense disambiguation method based on mass pseudo-data

The invention relates to a new Bayesian word sense disambiguation method based on mass pseudo-data. It addresses the problems that current word sense disambiguation methods disambiguate poorly and that acquiring disambiguation knowledge is time-consuming and labor-intensive. The method includes the steps of: using a dependency grammar analyzer to parse the training examples containing ambiguous words in a training corpus, and collecting the tuples that have a dependency relationship with the ambiguous words; then, through a machine translation system, searching a machine translation corpus for example sentences containing the tuples. These steps are repeated, the retrieved example sentences are added to a pseudo-training corpus, and a Bayesian disambiguation model is trained on the training corpus together with the pseudo-training corpus; the senses of the ambiguous words are then decided by the disambiguation model. Starting from a small amount of manually annotated corpus, the method can effectively alleviate the data sparsity problem of word sense disambiguation and increase its accuracy, and it has broad development prospects.
Owner:SHANXI UNIV
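
A minimal sketch of the last stage, assuming the Bayesian model is a naive Bayes classifier over context words with add-one smoothing (the abstract does not specify the model form). Retrieving pseudo-examples through dependency parsing and machine translation is out of scope here; the pseudo-data is simply concatenated with the labeled data.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """Count senses and context words from (sense, context_words) examples."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for sense, words in examples:
        sense_counts[sense] += 1
        word_counts[sense].update(words)
        vocab.update(words)
    return sense_counts, word_counts, vocab


def classify(model, words):
    """Return the sense with the highest log-posterior for the context words."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())

    def log_posterior(sense):
        score = math.log(sense_counts[sense] / total)
        denom = sum(word_counts[sense].values()) + len(vocab)
        for w in words:
            score += math.log((word_counts[sense][w] + 1) / denom)  # add-one smoothing
        return score

    return max(sense_counts, key=log_posterior)
```

Training on `labeled + pseudo` rather than `labeled` alone is what lets the model see words that never occur in the small hand-annotated set.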

Method and device for noun word sense disambiguation based on dependency constraint and knowledge

The invention discloses a method and a device for noun word sense disambiguation based on dependency constraint and knowledge. The method comprises: performing dependency syntactic analysis on a large-scale corpus, collecting the resulting dependency tuples and counting their frequencies, and establishing a dependency knowledge base; performing dependency syntactic analysis on the sentence containing an ambiguous noun, extracting 16 kinds of dependency tuples which meet a set condition, and using these dependency tuples as the dependency constraint set of the ambiguous noun; according to a semantic dictionary, for each meaning of the ambiguous noun, extracting a synonym set, an antonym set, and a hypernym set in sequence, and using them as the representative word set of the corresponding meaning; according to the dependency knowledge base and the representative word sets, calculating the posterior probability of each meaning of the ambiguous noun given the dependency constraint set; and according to the posterior probability, selecting the correct meaning of the ambiguous noun. The method makes full use of dependency syntactic analysis to determine the meaning of the ambiguous noun accurately and effectively.
Owner:QILU UNIV OF TECH
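
The scoring step can be sketched as follows. The abstract does not give the posterior formula, so this is a simplified stand-in: each sense is scored by how often its representative words fill the ambiguous noun's slot in the knowledge base, with smoothing, and the scores are normalized across senses. The tuple layout and all names are hypothetical.

```python
def sense_scores(constraints, representatives, kb, smoothing=1.0):
    """Score each sense of an ambiguous noun and normalize to sum to 1.

    constraints: list of (relation, other_word) pairs from the sentence
    representatives: {sense: set of representative words}
    kb: {(relation, other_word, representative_word): frequency}
    """
    raw = {}
    for sense, words in representatives.items():
        score = 1.0
        for relation, other in constraints:
            freq = sum(kb.get((relation, other, w), 0) for w in words)
            score *= freq + smoothing  # smoothing avoids zeroing out a sense
        raw[sense] = score
    total = sum(raw.values())
    return {sense: value / total for sense, value in raw.items()}
```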

Chinese word sense disambiguation method based on graph convolutional neural network

The invention relates to a Chinese word sense disambiguation method based on a graph convolutional neural network (GCN). First, the Chinese corpora are preprocessed: word segmentation, part-of-speech tagging and semantic tagging are performed on the statements containing ambiguous words in the training and testing corpora. A word sense disambiguation feature graph is then constructed by taking the sentences where ambiguous words are located, together with the word forms, parts of speech and semantic classes they contain, as disambiguation features and nodes, and weights are assigned to the nodes and edges by using the Word2Vec and Doc2Vec tools and the pointwise mutual information (PMI) and TF-IDF methods. The GCN model is trained on the training corpus and thereby optimized. Word sense disambiguation is performed on the test corpus with the optimized GCN model to obtain the probability distribution of each ambiguous word over the semantic categories, and the semantic category with the maximum probability is taken as the semantic category of the ambiguous word. The invention has a good word sense disambiguation effect, and the real meaning of the ambiguous vocabulary is judged more accurately.
Owner:HARBIN UNIV OF SCI & TECH
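
As an illustration of the PMI edge weights mentioned above, the sketch below computes sentence-level PMI between word pairs and keeps only positive values. Counting co-occurrence per sentence and discarding non-positive PMI are assumptions; the abstract does not state the window or the cutoff.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_weights(documents):
    """Return {(word_a, word_b): PMI} for pairs with positive PMI.

    Each document is a list of tokens; probabilities are estimated as the
    fraction of documents in which a word (or a pair) appears.
    """
    n = len(documents)
    word_df = Counter()
    pair_df = Counter()
    for doc in documents:
        words = sorted(set(doc))  # sorted so each pair has one canonical order
        word_df.update(words)
        pair_df.update(combinations(words, 2))
    weights = {}
    for (a, b), n_ab in pair_df.items():
        pmi = math.log(n_ab * n / (word_df[a] * word_df[b]))
        if pmi > 0:
            weights[(a, b)] = pmi
    return weights
```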

Chinese word sense disambiguation method based on fusion of graph convolutional neural network and support vector machine

The invention relates to a Chinese word sense disambiguation method based on a graph convolutional neural network (GCN) fused with a support vector machine (SVM). The method comprises the steps of first preprocessing the corpora by performing word segmentation, part-of-speech tagging and semantic tagging on the statements of the training and testing corpora. A word sense disambiguation graph is constructed by taking the sentences where ambiguous words are located, together with the word forms, parts of speech and semantic classes of the vocabulary units on both sides of the ambiguous words, as disambiguation features, and taking the disambiguation features as nodes. Weights of nodes and edges in the graph are calculated by using the Word2Vec and Doc2Vec tools, pointwise mutual information (PMI) and the TF-IDF algorithm. The GCN model is trained on the training corpus and optimized. The optimized GCN model is then used to calculate disambiguation features of the training and testing corpora; the features of the training corpus are input into an SVM classifier to optimize it, and the testing corpus is classified to obtain the classification of ambiguous vocabularies under the semantic categories. The method has a good word sense disambiguation effect, and the real meaning of the ambiguous vocabulary is accurately judged.
Owner:HARBIN UNIV OF SCI & TECH
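
The feature-extraction role of the GCN can be illustrated with a single graph convolution layer in the widely used form ReLU(D^-1/2 (A + I) D^-1/2 H W). Whether the patent uses exactly this propagation rule is an assumption; the pure-Python matrix helpers are for illustration only, and the SVM stage is omitted.

```python
import math


def normalize_adjacency(adj):
    """Add self-loops and apply symmetric degree normalization."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adj, features, weights):
    """One GCN layer: propagate features over the graph, project, apply ReLU."""
    propagated = matmul(normalize_adjacency(adj), features)
    return [[max(0.0, v) for v in row] for row in matmul(propagated, weights)]
```

Each output row is a node's feature vector; in the method above, such vectors would be passed to the SVM classifier.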

Mobile equipment machine translation system based on hybrid strategy

The invention discloses a mobile device machine translation system based on a hybrid strategy. The system comprises a translation information acquisition module, a translation keyword grouping module, a machine translation engine, a neural machine translation engine, an interactive machine translation module, a word sense disambiguation module and a self-updating learning module. According to the invention, the machine translation system is integrated into the mobile device, so that the mobile device can obtain translation information more quickly and conveniently. The translations produced by the rule-based, statistical and neural machine translation systems are fused through the interactive machine translation module, so that the translation of the source statement is more accurate. Because the rule-based, statistical and neural machine translation systems are all located in the mobile device, translation does not rely on the cloud during actual operation, and no network connection is needed at any point in the translation process, which overcomes the defect that a traditional translation system must be online to translate by means of cloud resources.
Owner:南京莱科智能工程研究院有限公司

Chinese commercial text preprocessing method based on machine learning

Pending | CN110457685A | Solve questions that do not answer the question and have limited response scenarios | Improve accuracy | Machine learning | Special data processing applications | Pretreatment method | Polysemy
The invention discloses a Chinese commercial text preprocessing method based on machine learning. The input Chinese commercial text is processed through the following steps: (1) performing sentence segmentation and word segmentation on the Chinese commercial text; (2) carrying out part-of-speech tagging on the segmented words by utilizing a decision tree; (3) performing word sense disambiguation by utilizing conditional probability based on a Bayesian classifier; (4) representing word vectors by using a hybrid model combining One-Hot coding and a Skip-Gram model; (5) adjusting word weights by utilizing TF-IDF, and determining the meaning of each polysemous word in the current context; and (6) outputting the preprocessed Chinese commercial text. The method can effectively solve the problems that, because of insufficient text preprocessing, a Chinese commercial question-answering system gives answers that do not address the question and covers only limited response scenarios; it can improve the accuracy of a computer's text understanding and makes downstream work such as machine translation and intelligent question answering feasible.
Owner:NANJING UNIV OF POSTS & TELECOMM
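
Step (5) relies on TF-IDF word weights. The sketch below uses one common variant (term frequency normalized by document length, multiplied by log(N / df)); the abstract does not say which variant the method uses, so the formula and the function name are assumptions.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {word: weight} dictionary per tokenized document."""
    n = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))  # count each word once per document
    weighted = []
    for doc in documents:
        term_freq = Counter(doc)
        weighted.append({
            word: (count / len(doc)) * math.log(n / doc_freq[word])
            for word, count in term_freq.items()
        })
    return weighted
```

A word that appears in every document gets weight zero under this variant, which is why context-specific words dominate the weighting.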

A method and device for adverb word sense disambiguation based on dependency constraints and knowledge

The invention discloses a dependency constraint and knowledge-based adverb meaning disambiguation method and apparatus. The method comprises the steps of performing dependency syntax analysis on large-scale corpora, collecting obtained dependency tuples and performing statistics on frequency numbers of the dependency tuples, and establishing a dependency knowledge library; performing dependency syntax analysis on a sentence in which ambiguous adverbs are located, and extracting the two dependency tuples meeting a set condition as a dependency constraint set of the ambiguous adverbs; sequentially extracting a synonym set and an antonym set as a word meaning representative word set of corresponding word meanings for the word meanings of the ambiguous adverbs according to a semantic dictionary; sequentially calculating posterior probabilities of the word meanings of the ambiguous adverbs in the dependency constraint set according to the dependency knowledge library and the word meaning representative word set; and selecting correct word meanings of the ambiguous adverbs according to the posterior probabilities. By utilizing the method and the apparatus, the effect of the dependency syntax analysis can be brought into full play and the word meanings of the ambiguous adverbs can be determined more accurately and effectively.
Owner:SHANDONG EVAYINFO TECH CO LTD