
364 results about "Part-of-speech tagging" patented technology

In corpus linguistics, part-of-speech tagging (POS tagging or PoS tagging or POST), also called grammatical tagging or word-category disambiguation, is the process of marking up a word in a text (corpus) as corresponding to a particular part of speech, based on both its definition and its context—i.e., its relationship with adjacent and related words in a phrase, sentence, or paragraph. A simplified form of this is commonly taught to school-age children, in the identification of words as nouns, verbs, adjectives, adverbs, etc.
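As a minimal sketch of the idea that a tag depends on both a word's definition and its context, the toy tagger below looks each word up in a small lexicon and uses the previous tag to choose between candidates. The lexicon, the tag names and the two tie-breaking rules are assumptions made up for illustration; practical taggers are typically statistical or neural models trained on annotated corpora.

```python
# Toy lexicon: each word maps to the tags it can take (hypothetical data).
LEXICON = {
    "the": ["DET"],
    "a": ["DET"],
    "to": ["PART"],
    "i": ["PRON"],
    "want": ["VERB"],
    "read": ["VERB"],
    "book": ["NOUN", "VERB"],  # ambiguous without context
    "flight": ["NOUN"],
}


def tag(tokens):
    """Tag tokens from the lexicon, using the previous tag to break ties."""
    tags = []
    for token in tokens:
        # Unknown words default to NOUN (an arbitrary choice for this sketch).
        candidates = LEXICON.get(token.lower(), ["NOUN"])
        prev = tags[-1] if tags else None
        if len(candidates) == 1:
            choice = candidates[0]
        elif prev == "DET" and "NOUN" in candidates:
            choice = "NOUN"  # e.g. "the book"
        elif prev == "PART" and "VERB" in candidates:
            choice = "VERB"  # e.g. "to book"
        else:
            choice = candidates[0]
        tags.append(choice)
    return list(zip(tokens, tags))


print(tag("I want to book a flight".split()))
print(tag("I read the book".split()))
```

Here "book" is tagged as a verb after "to" and as a noun after "the", which is the disambiguation step the definition describes.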

Enquiry statement analytical method and system for information retrieval

The invention discloses a query sentence analyzing method based on natural language understanding and a system thereof, and belongs to the technical field of information retrieval. The query sentence analyzing method comprises the following steps: (1) automatic segmentation, named entity identification and part-of-speech tagging are performed on an input Chinese query sentence; (2) the syntax structure of the segmented sentence is analyzed to obtain a syntax structure tree, and the meaning of each word is determined from the part-of-speech-tagged sentence; (3) according to the syntax structure and the meaning of each word, the semantic roles of the predicates in the sentence are tagged; and (4) according to the results of analyzing the sentence at the lexical, syntactic and semantic levels, keywords are expanded and the keywords that reflect the user's information retrieval requirements are extracted. The query sentence analyzing system of the invention comprises a lexical analyzing module, a syntax analyzing module, a semantic analyzing module and a keyword extracting module. The abstract states that the method and system can greatly improve the accuracy of query results and provide users with the results they want.
Owner:PEKING UNIV

Chinese natural language interrogative sentence semantization knowledge base automatic question-answering method

The invention discloses a Chinese natural language interrogative sentence semantization knowledge base automatic question-answering method. The method includes the following steps that Chinese natural language processing is performed on a fact type question input by a user, word segmentation, part-of-speech tagging and identification and expanding of a named entity are achieved, and a semantic dependency tree is generated; a generalization template and a semantic analysis technology are used for acquiring time, space, a fact entity, a fact object and the like in an interrogative sentence, then semantic processing is performed, composition element attributes relevant to all events in the interrogative sentence and values of the attributes are extracted, a plurality of 'attribute-value' pairs are generated, to-be-answered elements are substituted by interrogatives, and a complex fact triple set is formed; after a triple where a to-be-answered part is located is combined with other relevant fact triple sets to form knowledge base query with conditional constraints, and query matching based on similarity calculation is performed in a knowledge base, a result is extracted from the knowledge base, and a final answer is obtained. Fast and accurate query response to the knowledge base is achieved.
Owner:NANJING UNIV
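The abstract above describes replacing the to-be-answered element with a placeholder and matching the remaining attribute-value pairs against a knowledge base by similarity. The following is a rough sketch of that matching step only, not the patented method: the knowledge base layout, the `?` placeholder, the `answer` function and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

# Illustrative knowledge base: each event is a set of attribute-value pairs.
KB = [
    {"event": "founding", "entity": "Nanjing University", "time": "1902"},
    {"event": "founding", "entity": "Peking University", "time": "1898"},
]


def similarity(a, b):
    """String similarity in [0, 1]; stands in for the similarity calculation."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def answer(query, kb, threshold=0.9):
    """query maps attributes to values; the unknown attribute has value '?'."""
    unknown = [k for k, v in query.items() if v == "?"]
    constraints = {k: v for k, v in query.items() if v != "?"}
    best, best_score = None, 0.0
    for fact in kb:
        # A fact matches only as well as its worst-matching constraint.
        scores = [similarity(v, fact.get(k, "")) for k, v in constraints.items()]
        score = min(scores) if scores else 0.0
        if score >= threshold and score > best_score:
            best, best_score = fact, score
    return best[unknown[0]] if best and unknown else None


# "When was Peking University founded?" expressed as attribute-value pairs.
print(answer({"event": "founding", "entity": "peking university", "time": "?"}, KB))
```

Taking the minimum over constraints is one simple way to require that every condition is satisfied; a weighted combination would be an equally reasonable choice.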

Dependency semantic-based Chinese unsupervised open entity relationship extraction method

The invention relates to a dependency semantic-based Chinese unsupervised open entity relationship extraction method. The method comprises the following steps of preprocessing an input text: performing Chinese word segmentation, part-of-speech tagging and dependency grammar analysis on the input text; performing named entity identification on the input text; arbitrarily selecting two entities from identified entities to form candidate entity pairs; searching for a dependency path between two entities in the candidate entity pairs; and analyzing whether a syntactic structure mapped by the dependency path is matched with a normal form of a dependency semantic normal form set or not, if yes, extracting words or phrases from the residual part of the input text according to the matched normal form to serve as relational words, forming a relational triple by the extracted relational words and the candidate entity pairs, and if not, performing normal form matching of a next group of the candidate entity pairs; and outputting the relational triple. Compared with the prior art, the method has the advantages that the calculation complexity is low; the extraction efficiency is high; distance position limitation is overcome; a simple sentence also can be extracted and the like.
Owner:TONGJI UNIV
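The step of searching for a dependency path between two candidate entities can be sketched as a breadth-first search over the dependency tree treated as an undirected graph. This is an illustrative sketch under assumptions: the head-index encoding, the `dependency_path` function and the English example parse are hypothetical, and the matching of paths against the patent's set of normal forms is not shown.

```python
from collections import deque


def dependency_path(heads, start, end):
    """Shortest path between two token indices in a dependency tree.

    heads[i] is the index of token i's head, or -1 for the root.
    """
    # Build an undirected adjacency list from the head pointers.
    neighbours = {i: set() for i in range(len(heads))}
    for child, head in enumerate(heads):
        if head >= 0:
            neighbours[child].add(head)
            neighbours[head].add(child)
    # Breadth-first search, remembering each node's predecessor.
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == end:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in sorted(neighbours[node]):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None


# Hypothetical parse of "Alice founded Acme in Paris"; "founded" is the root.
tokens = ["Alice", "founded", "Acme", "in", "Paris"]
heads = [1, -1, 1, 4, 1]
path = dependency_path(heads, 0, 2)
print([tokens[i] for i in path])
```

In this toy parse, the word lying between the two entities on the path ("founded") is the natural candidate for the relational word of a triple.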

Multi-granularity semantic chunk based entity attribute and attribute value extracting method

The invention relates to a multi-granularity semantic chunk based entity attribute and attribute value extracting method, and belongs to the technical field of Web mining and information extraction. The method comprises the following steps that a corpus set is constructed and free text extraction is performed; a corpus is subjected to word segmentation, part-of-speech tagging and phrase recognition; the corpus is subjected to semantic role labeling; the corpus is subjected to dependency grammar analysis; the corpus is subjected to semantic dependency analysis; candidate entities, attributes and attribute value triads based on three granularities of words, phrases and semantic roles are extracted; the candidate entities, attributes and attribute value triads are corrected and subjected to error classification by means of a trained classifier. Compared with the prior art, the entities, attributes and attribute value triads based on three granularities of words, phrases and semantic roles are automatically extracted from a free text, the entity attribute and attribute value extraction accuracy and efficiency are improved, and the wide application prospect is achieved in the fields of theme detection, information retrieval, automatic abstracting, question and answer systems and the like.
Owner:BEIJING INSTITUTE OF TECHNOLOGY

Automatic legal knowledge graph construction method

The invention provides an automatic legal knowledge graph construction method, and aims at automatically constructing legal knowledge graphs according to trial documents. The method comprises the following steps of carrying out stop word removal and word segmentation on obtained trial documents; respectively extracting subject words of three types of trial documents, carrying out part-of-speech tagging and filtration on the extracted subject words, and extracting noun or noun phrase subject word to serve as entity concepts of a legal knowledge graph according to the filtration result; obtaining words similar with each extracted noun or noun phrase subject word, carrying out part-of-speech tagging and filtration on the obtained similar words, and extracting noun or noun phrase subject word similar words as entity concepts of the legal knowledge graph according to the filtration result; and constructing the legal knowledge graph according to the extracted subject word entity concepts, the similar word entity concepts and triple structures such as subject word-subject relationship-subject word and subject word-similar relationship-similar word formed by a relationship between the subject word entity concepts and the similar word entity concepts. The invention relates to the technical field of knowledge engineering.
Owner:UNIV OF SCI & TECH BEIJING

Construction and utilization method for context-aware dynamic word or character vector on the basis of deep learning

The invention belongs to the technical field of computer natural language processing, and in particular relates to a method for constructing and using context-aware dynamic word or character vectors based on deep learning. The construction method comprises the following steps: from massive text, an unsupervised learning method simultaneously learns a global feature vector for each word or character and a feature vector representation for the specific context in which it appears; the global feature vector is then combined with the context feature vector to dynamically generate the word or character vector representation. Word or character vectors constructed dynamically from context in this way can be applied in a natural language processing system. The method mainly addresses the problem that a word or character expresses different meanings in different contexts, i.e. that one word or character has multiple meanings. According to the abstract, the dynamic word or character vectors can markedly improve the performance of various natural language processing tasks in different languages, including Chinese word segmentation, part-of-speech tagging, named entity recognition, grammatical analysis, semantic role tagging, sentiment analysis, text classification, machine translation and the like.
Owner:FUDAN UNIV

Text analysis method and text analyzer

The invention discloses a text analysis method and a text analyzer. The method comprises the following steps of: performing splitting processing on an acquired text by utilizing characters as a unit, and performing characteristic tagging on characters obtained by splitting according to preset character characteristics so as to form tagged word strings; performing word segmentation processing on the tagged word strings according to pre-constructed word segmentation models so as to obtain word segmentation results containing word orders; performing merging processing on the word orders contained in the word segmentation results, and performing characteristic tagging on words obtained by merging according to the preset character characteristics so as to obtain tagged word strings; performing part-of-speech tagging on the tagged word strings according to pre-constructed part-of-speech tagging models so as to obtain part-of-speech tagging results; and if confirming that the part-of-speech tagging results contain part-of-speech tags of entity words, merging the entity words containing the part-of-speech tags in the part-of-speech tagging results according to same adjacent rules, so as to obtain a text analysis result. By applying the text analysis method and the text analyzer, the entity word text analysis accuracy rate can be improved.
Owner:新浪技术(中国)有限公司
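The final step of the abstract above, merging entity words in the tagging result, can be read as joining neighbouring words that carry the same entity tag. The sketch below implements that reading only; the tag set, the `merge_adjacent_entities` function and the sample sentence are assumptions for illustration, and the segmentation and tagging models themselves are not shown.

```python
ENTITY_TAGS = {"PER", "LOC", "ORG"}  # hypothetical entity tag set


def merge_adjacent_entities(tagged):
    """Join neighbouring words that share the same entity tag."""
    merged = []
    for word, tag in tagged:
        if merged and tag in ENTITY_TAGS and merged[-1][1] == tag:
            # Chinese text has no spaces, so the words are concatenated directly.
            merged[-1] = (merged[-1][0] + word, tag)
        else:
            merged.append((word, tag))
    return merged


tagged = [("新浪", "ORG"), ("技术", "ORG"), ("发布", "VERB"), ("公告", "NOUN")]
print(merge_adjacent_entities(tagged))
```

Only tags in `ENTITY_TAGS` are merged, so two ordinary nouns that happen to be adjacent stay separate.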

An information retrieval-based question and answer system and method for knowledge graph energization

The invention discloses an information retrieval-based question and answer system and method for knowledge graph energization, which improve the overall question and answer effect of the system, expand the range of user consultation and improve the accuracy of question feedback. According to the technical scheme, the system comprises a knowledge graph database for storing domain knowledge graph information; a word segmentation and part-of-speech tagging module which segments the user questions and tags their parts of speech; an entity identification and link module which identifies entities in the user questions and links the entities to nodes in the knowledge graph database; an intention understanding module which obtains an intention understanding result for the user question based on the entity link result and the distributed representation vector; a retrieval module which, based on the retrieval data source, retrieves a plurality of question and answer pairs matching the information in the user questions as coarse results; a sorting module which re-ranks the coarse results using the distributed representation vectors of the entities; and a semantic matching module which scores the re-ranked results using the distributed representation vectors of the entities and finally outputs an answer.
Owner:上海乐言科技股份有限公司
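The retrieval and sorting modules in the abstract above follow a retrieve-then-re-rank pattern, which can be sketched as below. This is a sketch under assumptions rather than the system's actual design: term overlap stands in for the retrieval step, cosine similarity over two-dimensional vectors stands in for the use of distributed entity representations, and all names and data are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve(question_terms, qa_pairs, k=3):
    """Coarse retrieval: rank stored questions by term overlap, keep the top k."""
    scored = [(len(set(question_terms) & set(qa["terms"])), qa) for qa in qa_pairs]
    scored = [s for s in scored if s[0] > 0]
    scored.sort(key=lambda s: s[0], reverse=True)
    return [qa for _, qa in scored[:k]]


def rerank(entity_vector, candidates):
    """Re-rank coarse results by similarity of entity vectors."""
    return sorted(
        candidates,
        key=lambda qa: cosine(entity_vector, qa["entity_vector"]),
        reverse=True,
    )


# Hypothetical question-answer store with toy entity vectors.
qa_pairs = [
    {"terms": ["refund", "order"], "entity_vector": [1.0, 0.0], "answer": "A"},
    {"terms": ["refund", "policy"], "entity_vector": [0.0, 1.0], "answer": "B"},
    {"terms": ["shipping"], "entity_vector": [0.5, 0.5], "answer": "C"},
]

coarse = retrieve(["refund", "order", "status"], qa_pairs)
final = rerank([0.1, 0.9], coarse)
print([qa["answer"] for qa in coarse], [qa["answer"] for qa in final])
```

In this toy run the coarse step prefers answer A on term overlap, while the re-ranking step moves B to the top because its entity vector is closer to the question's.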