Hypernym aggregation method and apparatus, applied in the field of information processing, to avoid laborious manual feature design, enhance generalization ability, and improve accuracy
Active Publication Date: 2018-08-17
TENCENT TECH (SHENZHEN) CO LTD
7 Cites, 4 Cited by
AI Technical Summary
Problems solved by technology
Merging semantically similar hypernyms on the basis of edit distance therefore has certain limitations.
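The text here does not spell out which limitations are meant. One plausible illustration, using two hypernyms quoted in Embodiment 1, is that strings sharing a long template are close in edit distance even though their key information differs. The sketch below is a standard Levenshtein distance; the function name and the length normalization are assumptions made for illustration and are not taken from the patent.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


first = "the works of the poet Zhao Gong"
second = "the works of the poet Lu Zhi"
distance = edit_distance(first, second)
# Normalizing by the longer length is an illustrative choice, not the patent's.
ratio = distance / max(len(first), len(second))
```

The two strings differ only in the poet's name, so the normalized distance stays small even though the hypernyms refer to different poets.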
Examples
Embodiment 1
[0116] The hypernyms to be processed are: "the works of the poet Chen Xiangyan", "the works of the poet Mei Shaojing", "the works of the poet Zhao Gong", "the works of the poet Lu Zhi" and "the works of the poet Wang Yi".
[0117] Although these hypernyms look similar, their key information differs. After the similar text parts are removed, the remaining text parts are: Chen Xiangyan, Mei Shaojing, Zhao Gong, Lu Zhi, Wang Yi.
[0118] The semantic similarity between these remaining text parts is lower than the third preset threshold, and the average number of words they contain is about 2.4, which is higher than the fourth preset threshold of 2.2. The aggregation is therefore invalid, and these hypernyms cannot be aggregated.
Embodiment 2
[0120] The hypernyms to be processed are: "simple homemade steamed dumplings", "homemade steamed dumplings" and "steamed dumplings". After the similar text parts, stop words and plural words are removed, the remaining text parts are: NULL, NULL and NULL.
[0121] Because all of the remaining text parts are empty, the aggregation is valid, and these hypernyms can be merged.
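The validity check described in Embodiments 1 and 2 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: whitespace tokenization, the `removable` word list, the token-level Jaccard stand-in for semantic similarity, and the demo threshold values are all assumptions. In particular, the English translation tokenizes differently from the source text, so the demo uses a hypothetical length threshold of 1.5 rather than the 2.2 quoted above, and plural handling is not modeled.

```python
from itertools import combinations
from statistics import mean


def remainders(hypernyms, removable):
    """Drop tokens shared by every hypernym, plus removable words, from each one."""
    if not hypernyms:
        return []
    token_lists = [h.lower().split() for h in hypernyms]
    shared = set(token_lists[0]).intersection(*token_lists[1:])
    return [[t for t in tokens if t not in shared and t not in removable]
            for tokens in token_lists]


def token_jaccard(a, b):
    """Stand-in for semantic similarity (assumption): Jaccard overlap of tokens."""
    union = set(a) | set(b)
    return len(set(a) & set(b)) / len(union) if union else 1.0


def is_valid_aggregation(hypernyms, removable, similarity,
                         sim_threshold, len_threshold):
    """Valid if nothing distinguishing remains; invalid if the remainders are
    both dissimilar (below sim_threshold) and long (above len_threshold)."""
    if len(hypernyms) < 2:
        return True
    rest = remainders(hypernyms, removable)
    if not any(rest):  # every remainder is empty, as in Embodiment 2
        return True
    avg_len = mean(len(r) for r in rest)
    avg_sim = mean(similarity(a, b) for a, b in combinations(rest, 2))
    return not (avg_sim < sim_threshold and avg_len > len_threshold)


poets = ["the works of the poet Chen Xiangyan",
         "the works of the poet Mei Shaojing",
         "the works of the poet Zhao Gong",
         "the works of the poet Lu Zhi",
         "the works of the poet Wang Yi"]
dumplings = ["simple homemade steamed dumplings",
             "homemade steamed dumplings",
             "steamed dumplings"]

# Hypothetical removable-word list, chosen so that the translated example
# reproduces the empty remainders reported in Embodiment 2.
modifiers = {"simple", "homemade"}

poets_ok = is_valid_aggregation(poets, set(), token_jaccard, 0.5, 1.5)
dumplings_ok = is_valid_aggregation(dumplings, modifiers, token_jaccard, 0.5, 1.5)
```

Under these assumptions the poet group is rejected and the dumpling group is accepted, matching the outcomes of the two embodiments.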
[0122] Further, once it is determined that a group of hypernyms to be processed can be aggregated, the longest common character string shared by the hypernyms in that group can be used as the name of the group after aggregation.
[0123] For example, for "simple homemade steamed dumplings", "homemade steamed dumplings" and "steamed dumplings", the longest common character string is "steamed dumplings", so "steamed dumplings" can be used to name the aggregated hypernyms, and these are...
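The naming step in [0122] and [0123] can be illustrated with a brute-force longest common substring search. This is a sketch under assumptions: the function name is hypothetical, ties are resolved by taking the first candidate found, and the search works on characters of the English translation rather than the original text.

```python
def longest_common_substring(strings):
    """Return the longest substring present in every string ('' if none)."""
    if not strings:
        return ""
    shortest = min(strings, key=len)
    # Try substrings of the shortest string, longest candidates first.
    for length in range(len(shortest), 0, -1):
        for start in range(len(shortest) - length + 1):
            candidate = shortest[start:start + length]
            if all(candidate in s for s in strings):
                return candidate
    return ""


group = ["simple homemade steamed dumplings",
         "homemade steamed dumplings",
         "steamed dumplings"]
group_name = longest_common_substring(group)  # "steamed dumplings"
```

The brute-force search is adequate for short texts such as hypernyms; a suffix-based approach would scale better for long strings.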
Abstract
The present invention relates to information processing technology, and in particular to a hypernym aggregation method and apparatus, with the aim of improving the accuracy of hypernym aggregation. In the method, a terminal device calculates the word vector similarity between hypernyms according to the word vectors contained in each hypernym, calculates the entity type similarity between hypernyms according to the entity types associated with the entities corresponding to each hypernym, and aggregates the hypernyms whose word vector similarity reaches a first preset threshold and whose entity type similarity reaches a second preset threshold. With this technical scheme, short texts such as hypernyms can be processed effectively: the key textual information contained in a hypernym can be mined, and the type characteristics of the hypernym can be described accurately. At the same time, the laborious manual design of features is avoided and the generalization ability of the model is enhanced, so that invalid hypernyms can be identified, redundant data in the hypernyms can be removed, and the accuracy of hypernym aggregation can be significantly improved.
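The two-threshold rule in the abstract can be sketched as below. The abstract does not say how word vectors are combined or how entity type similarity is computed, so the averaged word vectors with cosine similarity, the Jaccard overlap of entity type sets, the union-find grouping, and all names, toy data and threshold values are assumptions made for illustration.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean_vector(tokens, embeddings):
    """Average the vectors of known tokens (assumed way to embed a hypernym)."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        return None
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def jaccard(a, b):
    """Overlap of two entity type sets (assumed entity type similarity)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def aggregate(hypernyms, embeddings, entity_types, vec_threshold, type_threshold):
    """Group hypernyms whose pairwise similarities reach both thresholds."""
    parent = {h: h for h in hypernyms}

    def find(h):
        while parent[h] != h:
            parent[h] = parent[parent[h]]  # path halving
            h = parent[h]
        return h

    for a, b in combinations(hypernyms, 2):
        va = mean_vector(a.split(), embeddings)
        vb = mean_vector(b.split(), embeddings)
        if va is None or vb is None:
            continue
        if (cosine(va, vb) >= vec_threshold
                and jaccard(entity_types[a], entity_types[b]) >= type_threshold):
            parent[find(a)] = find(b)

    groups = {}
    for h in hypernyms:
        groups.setdefault(find(h), []).append(h)
    return list(groups.values())


# Toy data, invented for illustration only.
toy_embeddings = {"slr": [1.0, 0.0], "camera": [0.9, 0.1],
                  "digital": [0.8, 0.2], "poet": [0.0, 1.0]}
toy_types = {"slr camera": {"product", "device"},
             "digital slr camera": {"product", "device"},
             "poet": {"person"}}
toy_groups = aggregate(["slr camera", "digital slr camera", "poet"],
                       toy_embeddings, toy_types, 0.9, 0.5)
```

With this toy data the two camera hypernyms pass both thresholds and are merged, while "poet" stays in its own group.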
Description
Technical field
[0001] The present invention relates to information processing technology, and in particular to a hypernym aggregation method and device.
Background
[0002] In a hypernym network generated from a knowledge graph, hypernyms with the same meaning usually need to be aggregated to avoid redundancy; that is, hypernyms that express the same meaning in different ways are extracted and merged. For example, hypernyms about SLR cameras include "a SLR camera", "commonly known as a SLR camera", "SLR camera", "LR camera", etc. Hypernyms like these, which have the same meaning but different descriptions, are called same-semantic hypernyms. The process of merging same-semantic hypernyms and expressing them with a common name is called hypernym aggregation. Merging hypernyms with the same semantics can reduce the redundancy of the hypernym network and improve the qual...