
47 patent results for "Concept vector"

Multi-subject extracting method based on semantic categories

The invention provides a multi-subject extraction method based on semantic categories. The method comprises the following steps. First, a document is preprocessed in the conventional way to obtain a preliminary vector of feature words. Second, synonyms are merged using the correspondence between word senses and concepts in 'HowNet', polysemous words are disambiguated according to the correlation between the semantic categories and the context, and a concept vector model is constructed to represent the document. The concept vector model is then converted into a semantic category model using the one-to-one correspondence between concepts and semantic categories. Concept similarity is calculated from the semantic information attached to the concepts in 'HowNet', and semantic similarity is derived from it. The semantic categories are clustered with a K-means algorithm improved by presetting seeds, forming several subject semantic category clusters. Finally, several sub-subject word sets are recovered by mapping back through the correspondences between semantic categories and concepts and between concepts and words. The method takes semantic information into account, overcomes the K-means algorithm's sensitivity to the initial centers and its unstable time and space cost, and improves the quality of the extracted subjects.
Owner:HOHAI UNIV
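
The seed-preset clustering step in the abstract above can be illustrated with a minimal Python sketch. It is not the patented algorithm: cosine similarity stands in for the HowNet-based semantic similarity, and the function names (`cosine`, `seeded_kmeans`) and the toy vectors are assumptions made for illustration only.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def seeded_kmeans(points, seeds, max_iter=100):
    """K-means whose initial centers are the given seeds rather than random picks."""
    centers = [list(s) for s in seeds]
    labels = None
    for _ in range(max_iter):
        # Assign every point to the center it is most similar to.
        new_labels = [
            max(range(len(centers)), key=lambda k: cosine(p, centers[k]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Recompute each center as the mean of its members;
        # an empty cluster keeps its previous center.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Toy semantic-category vectors (hypothetical) and two preset seeds.
points = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels, centers = seeded_kmeans(points, seeds=[[1.0, 0.0], [0.0, 1.0]])
print(labels)
```

Because the starting centers are fixed, repeated runs on the same data give the same clusters, which is the stability benefit the abstract attributes to presetting seeds.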

Cross-language recommendation method and system

The invention discloses a cross-language recommendation method and system. The method comprises the following steps: creating and updating a bilingual search term vector model from users' search session logs and mining the relevance between bilingual search terms; and creating and updating a bilingual concept vector model from a Chinese-English parallel corpus, creating and updating a concept word vector model, and mining related bilingual concepts. The system comprises: a search string preprocessing module that parses the string entered by a user and filters out noise characters; a recommendation word calculation module, built on the bilingual search term vector model and the bilingual concept word vector model, that searches for and scores similar recommendation words; a long-tail search term processing module that handles uncommon, low-frequency search terms by rewriting them and looking up synonyms; and a result output module that presents the processed recommendation words to the user. The method and system have the following beneficial effects: search efficiency improves because no online manual translation is needed; recommendation coverage increases because related terms are also recommended for long-tail search terms; the range of supported related search terms is broadened; and, because the recommendation model is updated dynamically, it can promptly reflect the latest research hotspots and trends that users of the search system follow.
Owner:《中国学术期刊(光盘版)》电子杂志社有限公司
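
As a rough illustration of what the recommendation word calculation module might do, the sketch below ranks terms in a shared bilingual vector space by cosine similarity to the query term. The abstract does not specify the similarity measure or the model format, so the `recommend` function, the dictionary layout, and the toy vectors are all assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def recommend(term, vectors, top_k=3):
    """Return the top_k terms most similar to `term`, excluding the term itself."""
    query = vectors[term]
    scored = [
        (cosine(query, vec), other)
        for other, vec in vectors.items()
        if other != term
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [other for _, other in scored[:top_k]]

# Hypothetical bilingual model: Chinese and English terms share one vector space.
vectors = {
    "机器学习": [0.9, 0.1],
    "machine learning": [0.88, 0.12],
    "deep learning": [0.7, 0.3],
    "hydrology": [0.0, 1.0],
}
print(recommend("机器学习", vectors, top_k=2))
```

A Chinese query can surface English terms here only because both languages were embedded in the same space, which is what removes the need for a translation step at query time.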

Wiki semantic matching-based document classification method and system

The invention discloses a wiki semantic matching-based document classification method and system. The method comprises the following steps: (1) for each text document D in a document set, obtaining the document's keyword set by keyword matching, and matching against a wiki semantic reference space with a matching rule to obtain the set of reference concepts related to the document; (2) generating keyword vectors of the text document from its keyword set, and generating concept vectors of the text document from the keyword vectors and the reference concept set; (3) calculating the comprehensive similarity between any two text documents in the set of to-be-classified documents from the concept vectors and the keyword vectors; and (4) classifying the documents according to the comprehensive similarity between each pair. The system comprises a first module, a second module, a third module and a fourth module. The method and system resolve the trade-off between effectiveness and efficiency faced by wiki semantic matching methods and provide an efficient online document classification method.
Owner:WENZHOU UNIV OUJIANG COLLEGE
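
Step (3) above combines keyword vectors and concept vectors into one comprehensive similarity. The abstract does not give the combination rule, so the sketch below assumes a simple linear blend of two cosine similarities; the weight `alpha` and the function name `comprehensive_similarity` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def comprehensive_similarity(kw_a, kw_b, concept_a, concept_b, alpha=0.5):
    """Blend keyword-level and concept-level similarity (assumed linear weighting)."""
    return alpha * cosine(kw_a, kw_b) + (1 - alpha) * cosine(concept_a, concept_b)

# Two documents with no keywords in common but identical concept vectors.
score = comprehensive_similarity(
    kw_a=[1.0, 0.0], kw_b=[0.0, 1.0],
    concept_a=[1.0, 1.0], concept_b=[1.0, 1.0],
)
print(round(score, 3))
```

The example shows why a concept-level term helps: documents that share no surface keywords still receive a nonzero score when they map to the same reference concepts.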

Knowledge graph completion, deduction and storage method and device based on entity concepts

The invention discloses a knowledge graph completion, deduction and storage method and device based on entity concepts. The method comprises: determining concept vectors in one-to-one correspondence with the concepts of an entity, and relation vectors corresponding to the relations in a knowledge graph; determining the entity vector of the entity from its concept vectors; calculating the unknown vector of an unknown triple from any two known vectors among its head entity vector, tail entity vector and relation vector; and traversing the entity vectors or relation vectors already determined in the knowledge graph, finding the one with the highest cosine similarity to the calculated unknown vector, and inferring the entity or relation that the unknown vector corresponds to, thereby completing the knowledge graph. The method and device fully fuse concept information with the structural knowledge already in the knowledge graph, vectorize concepts and relations, and can effectively improve the accuracy and expressive capacity of the knowledge graph vectorization model.
Owner:CHINA ACADEMY OF ELECTRONICS & INFORMATION TECH OF CETC
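
The completion step described above can be sketched as follows. Two points are assumptions rather than statements from the abstract: the entity vector is taken as the mean of its concept vectors, and the unknown tail vector is computed with a translation-style rule (head + relation). The helper names and all vectors are hypothetical; only the final cosine-similarity lookup comes directly from the abstract.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def mean_vector(vectors):
    """Entity vector as the element-wise mean of its concept vectors (assumption)."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def predict_tail(head, relation, entity_vectors):
    """Return the entity whose vector is most cosine-similar to head + relation."""
    target = [h + r for h, r in zip(head, relation)]
    return max(entity_vectors, key=lambda name: cosine(entity_vectors[name], target))

# Hypothetical concept vectors for the entity "paris": "city" and "capital".
paris = mean_vector([[1.0, 0.0], [0.0, 1.0]])  # [0.5, 0.5]
entity_vectors = {
    "paris": paris,
    "france": [2.0, 0.0],
    "tokyo": [0.0, 2.0],
}
capital_of = [1.5, -0.5]  # hypothetical relation vector

print(predict_tail(paris, capital_of, entity_vectors))
```

Predicting a missing head or relation would follow the same pattern with the corresponding rearrangement of the assumed rule.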

Text coding representation method based on transformer model and multiple reference systems

Active | CN110399454A | Solve the problem of polysemy that is difficult to learn | Machine learning | Text database querying | Transformer | Euclidean vector
The invention discloses a text coding representation method based on a transformer model and multiple reference systems. The method comprises the steps of: concatenating a word vector with the separator vector of the sentence containing the word, based on the encoded word vectors and separator vectors of the context text, to obtain a concatenated word vector; mapping the concatenated word vector onto at least two preset semantic concepts to obtain at least two semantic concept vectors for the word, where, when the absolute number of semantic concepts of the word is smaller than the preset total number of semantic concepts, the semantic concept vectors of the word converge, finally leaving p dissimilar semantic concept vectors; selecting, through maximum pooling, the semantic concept vector among the dissimilar ones that best suits the word in the current context, and taking it as the semantic prediction result for the word in that context; and obtaining a probability vector for the word vector and determining from it the word probability under the semantic concept corresponding to the word vector.
Owner:深思考人工智能机器人科技(北京)有限公司

Document classification method based on Hadoop data mining

The invention discloses a document classification method based on Hadoop data mining. The method comprises the following steps: A. preprocessing the data documents to determine the keywords and the correspondence between each keyword and the document it belongs to; B. describing the attribute characteristics of the data in a document by attribute feature transformation; C. using a matching rule to generate keyword vectors from the keyword set, and generating concept vectors from the keyword vectors and the data attribute characteristics obtained in step B; D. calculating the similarity between any two text documents among the data documents to be classified from the keyword vectors and concept vectors of step C; E. performing a clustering-based classification operation on the attribute vectors to obtain their classification result, which indicates the class of the target object corresponding to each attribute vector; F. having Hadoop automatically collect the classification results and classify the data documents. The invention is easy to implement and achieves high classification accuracy.
Owner:NANJING UNIV OF POSTS & TELECOMM

Personal big data management hierarchy concept vectorizing incrementation processing method

Provided is an incremental processing method for vectorizing hierarchical concepts in personal big data management. The method comprises the following steps: 1, when the system runs for the first time, all concepts are vectorized and a concept vector merging operation is performed on all branch nodes; 2, when a user operates on the concept tree, the following substeps are executed: 2.1, obtaining the concept vectors and total word counts of the operated node and its parent node; 2.2, modifying the concept vector of the parent node according to a formula; 2.3, repeating recursively from substep 2.1 with the parent node as the operated node until the root node is reached; and 2.4, updating the inverse document frequency vector; 3, when errors have accumulated to a certain degree, the following substeps are executed: 3.1, obtaining the current inverse document frequency vector and the initial inverse document frequency vector; 3.2, updating all vector weights in the vector space in a batch; and 3.3, updating the initial inverse document frequency vector. The method enables incremental computation of hierarchical concept vectors for personal big data management, so that the concept vectors in the concept space can be adjusted quickly and execution efficiency is improved.
Owner:ZHEJIANG UNIV OF TECH
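
Substeps 2.1 to 2.3 above describe propagating a change from the operated node up to the root. The abstract does not state the update formula, so the sketch below assumes each ancestor's concept vector is a word-count-weighted average and folds a newly added leaf into it. The `ConceptNode` class and `add_child` function are hypothetical, and the inverse document frequency update of substep 2.4 is not covered.

```python
class ConceptNode:
    """A node in the concept tree with its concept vector and total word count."""

    def __init__(self, name, vector, word_count, parent=None):
        self.name = name
        self.vector = list(vector)
        self.word_count = word_count
        self.parent = parent

def add_child(parent, name, vector, word_count):
    """Attach a new leaf and fold its vector into every ancestor up to the root."""
    child = ConceptNode(name, vector, word_count, parent)
    node = parent
    while node is not None:
        total = node.word_count + word_count
        if total > 0:
            # Assumed formula: word-count-weighted average of old and new vectors.
            node.vector = [
                (a * node.word_count + b * word_count) / total
                for a, b in zip(node.vector, vector)
            ]
        node.word_count = total
        node = node.parent  # continue toward the root
    return child

root = ConceptNode("root", [1.0, 0.0], 10)
mid = ConceptNode("mid", [1.0, 0.0], 10, parent=root)
add_child(mid, "leaf", [0.0, 1.0], 10)
print(mid.vector, root.vector)
```

Only the nodes on the path from the operated node to the root are touched, so the cost of one operation grows with the depth of the tree rather than with the total number of concepts.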
