Hypernym aggregation method and apparatus

A hypernym aggregation method and technique, applied in the field of information processing, that avoids the heavy workload of manual feature design, enhances generalization ability, and improves aggregation accuracy.

Active Publication Date: 2018-08-17
TENCENT TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

Therefore, the method based on edit distance to merge hypernyms with similar semantics also has certain limitations.



Examples


Embodiment 1

[0116] The hypernyms to be processed are: "the works of the poet Chen Xiangyan", "the works of the poet Mei Shaojing", "the works of the poet Zhao Gong", "the works of the poet Lu Zhi" and "the works of the poet Wang Yi".

[0117] Although these hypernyms to be processed look similar, their key information differs. After removing the similar text parts, the remaining text parts are: Chen Xiangyan, Mei Shaojing, Zhao Gong, Lu Zhi, Wang Yi.

[0118] The semantic similarity between these remaining text parts is lower than the third preset threshold, and the average number of words they contain is about 2.4, which is higher than the fourth preset threshold of 2.2. This aggregation is therefore invalid, and these hypernyms to be processed cannot be aggregated.

Embodiment 2

[0120] The hypernyms to be processed are: "simple homemade steamed dumplings", "homemade steamed dumplings" and "steamed dumplings". After removing similar text parts, stop words and plural words, the remaining text parts are: NULL, NULL and NULL.

[0121] If the remaining text parts are all empty, the aggregation is valid, and these hypernyms to be processed can be merged.
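The validity check described in Embodiments 1 and 2 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name, the use of mean pairwise similarity, the `similarity` callable, the similarity threshold value 0.5, and the way the remainders are split into units are all assumptions. Only the fourth threshold value of 2.2 and the average of about 2.4 come from the text.

```python
from itertools import combinations
from typing import Callable, Sequence


def aggregation_is_valid(
    remainders: Sequence[Sequence[str]],
    similarity: Callable[[Sequence[str], Sequence[str]], float],
    sim_threshold: float,
    len_threshold: float,
) -> bool:
    """Decide whether a candidate group of hypernyms may be aggregated.

    `remainders` holds, per hypernym, the units left after the shared
    text parts (and stop words) have been removed.
    """
    # Embodiment 2: nothing is left over, so the aggregation is valid.
    if all(len(r) == 0 for r in remainders):
        return True

    avg_len = sum(len(r) for r in remainders) / len(remainders)

    # Assumption: summarize similarity as the mean over all pairs.
    pairs = list(combinations(remainders, 2))
    mean_sim = (
        sum(similarity(a, b) for a, b in pairs) / len(pairs) if pairs else 1.0
    )

    # Embodiment 1: dissimilar and long remainders make the aggregation
    # invalid. Assumption: every other combination is treated as valid.
    return not (mean_sim < sim_threshold and avg_len > len_threshold)


# Hypothetical unit split chosen so the average length is 12 / 5 = 2.4.
poets = [
    ["Chen", "Xiang", "yan"],
    ["Mei", "Shao", "jing"],
    ["Zhao", "Gong"],
    ["Lu", "Zhi"],
    ["Wang", "Yi"],
]


def unrelated(a, b):
    """Stub similarity: treats all remainders as unrelated."""
    return 0.0


print(aggregation_is_valid(poets, unrelated, 0.5, 2.2))         # Embodiment 1
print(aggregation_is_valid([[], [], []], unrelated, 0.5, 2.2))  # Embodiment 2
```

In practice the stub similarity would be replaced by whatever semantic similarity measure the system uses; the text does not specify one.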

[0122] Further, after it is determined that a group of hypernyms to be processed can be aggregated, the largest common character string shared by the hypernyms in the group can be used as the name of the group after aggregation.

[0123] For example, for "simple homemade steamed dumplings", "homemade steamed dumplings" and "steamed dumplings", the largest common character string is "steamed dumplings", so "steamed dumplings" can be used to name the hypernyms to be processed after aggregation, and these are...
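The naming step in [0122] and [0123] amounts to finding the longest substring common to every hypernym in the group. A minimal sketch follows, assuming character-level matching; the function name and the brute-force search strategy are illustrative choices, not taken from the patent.

```python
from typing import Sequence


def longest_common_substring(strings: Sequence[str]) -> str:
    """Return the longest substring that occurs in every input string."""
    if not strings:
        return ""
    # Any common substring must fit inside the shortest string.
    shortest = min(strings, key=len)
    # Try candidates from longest to shortest and stop at the first hit.
    for length in range(len(shortest), 0, -1):
        for start in range(len(shortest) - length + 1):
            candidate = shortest[start:start + length]
            if all(candidate in s for s in strings):
                return candidate
    return ""


group = [
    "simple homemade steamed dumplings",
    "homemade steamed dumplings",
    "steamed dumplings",
]
name = longest_common_substring(group)
print(name)  # steamed dumplings
```

The brute-force search is cubic in the length of the shortest string, which is acceptable for short texts such as hypernyms; a suffix-based structure would be a better fit for long inputs.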


Abstract

The present invention relates to information processing technology, and in particular to a hypernym aggregation method and apparatus, intended to improve the accuracy of hypernym aggregation. The method comprises: a terminal device calculates the word vector similarity between hypernyms according to the word vectors contained in each hypernym, calculates the entity type similarity between hypernyms according to the entity types associated with the entities corresponding to each hypernym, and aggregates the hypernyms whose word vector similarity reaches a first preset threshold and whose entity type similarity reaches a second preset threshold. With this technical scheme, short texts such as hypernyms can be processed effectively: the key textual information contained in a hypernym can be mined, and the type characteristics of the hypernym can be described accurately. At the same time, the heavy workload of manual feature design is avoided and the generalization ability of the model is enhanced, so that invalid hypernyms can be identified, redundant data in the hypernyms can be removed, and the accuracy of hypernym aggregation can be significantly improved.
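The two-threshold aggregation in the abstract can be sketched as below. Several details are assumptions made for illustration, since the abstract does not specify them: a hypernym vector is taken as the mean of its word vectors, word vector similarity is cosine similarity, entity type similarity is Jaccard overlap of type sets, and pairs that pass both thresholds are grouped transitively with a union-find structure. The toy vectors, entity types, and threshold values are invented.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def jaccard(a, b):
    """Jaccard overlap of two sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def aggregate(hypernyms, word_vectors, entity_types, vec_threshold, type_threshold):
    """Group hypernyms whose pairwise similarities reach both thresholds."""
    parent = {h: h for h in hypernyms}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    vec = {h: mean_vector([word_vectors[w] for w in h.split()]) for h in hypernyms}
    for a, b in combinations(hypernyms, 2):
        if (cosine(vec[a], vec[b]) >= vec_threshold
                and jaccard(entity_types[a], entity_types[b]) >= type_threshold):
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for h in hypernyms:
        groups.setdefault(find(h), []).append(h)
    return list(groups.values())


# Toy data (hypothetical values, two-dimensional vectors for readability).
word_vectors = {
    "SLR": [1.0, 0.0], "camera": [0.9, 0.1], "a": [0.5, 0.5],
    "steamed": [0.0, 1.0], "dumplings": [0.1, 0.9],
}
entity_types = {
    "SLR camera": {"product", "device"},
    "a SLR camera": {"product", "device"},
    "steamed dumplings": {"food"},
}
hypernyms = ["SLR camera", "a SLR camera", "steamed dumplings"]
groups = aggregate(hypernyms, word_vectors, entity_types, 0.9, 0.5)
print(groups)
```

With these toy inputs the two camera hypernyms end up in one group and "steamed dumplings" stays on its own, because only the camera pair reaches both thresholds.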

Description

Technical field

[0001] The present invention relates to information processing technology, in particular to a hypernym aggregation method and device.

Background

[0002] In a hypernym network generated from a knowledge graph, avoiding hypernym redundancy usually requires aggregating hypernyms with the same semantic meaning, that is, extracting and merging hypernyms that express the same meaning in different ways. For example, hypernyms about SLR cameras include "a SLR camera", "commonly known as a SLR camera", "SLR camera", "LR camera", etc. These hypernyms, which have the same semantic meaning but different descriptions, are called same-semantic hypernyms. The process of merging same-semantic hypernyms and expressing them with a common name is called hypernym aggregation. Merging hypernyms with the same semantics can reduce the redundancy of the hypernym network and improve the qual...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/355, G06F16/36, G06F16/367
Inventor: 郑孙聪, 李潇
Owner: TENCENT TECH (SHENZHEN) CO LTD