Candidate synonym determination method and device, server and storage medium

A technology for determining candidate synonyms, applied in the fields of instruments, electrical digital data processing, and calculation. It addresses inaccurate candidate synonym determination, which in turn makes synonym mining results inaccurate, and it improves the accuracy of candidate synonym determination.

Active Publication Date: 2020-05-08
TENCENT TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

[0004] The core of this search corpus automatic alignment technology is translation, and translation focuses on the alignment between the same words. When determining candidate synonyms between search corpora, it f


Embodiment

[0035] Traditional synonym dictionaries are built mainly by hand: linguists compile them by interpreting modern Chinese dictionaries, which incurs a huge labor cost. Current synonym dictionary construction technology can instead take a search engine's massive co-clicked search corpus (that is, a collection of different search corpora that clicked on the same search result), apply computer automatic alignment to it to obtain synonyms between search corpora, and finally construct a synonym dictionary through manual pruning with the help of various statistical language features. The automatic alignment technology used here is the core of the synonym mining task. The problem is usually treated as a translation problem from a source string to a target string, so that classic automatic translation technology can be used to mine synonyms from the search corpus. The automatic translation technology will first determine the cand...
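
The co-clicked search corpus described above can be illustrated with a minimal sketch. It assumes a click log of (query, clicked result) pairs; the function name `co_clicked_groups` and the sample log are hypothetical and do not come from the patent text.

```python
from collections import defaultdict

def co_clicked_groups(click_log):
    """Group distinct queries by the search result they clicked on.

    click_log: iterable of (query, result) pairs (an assumed log format).
    """
    by_result = defaultdict(list)
    for query, result in click_log:
        # Keep each query once per result, preserving first-seen order.
        if query not in by_result[result]:
            by_result[result].append(query)
    # Only results reached from two or more different queries can yield
    # aligned query pairs, so single-query groups are dropped.
    return {r: qs for r, qs in by_result.items() if len(qs) >= 2}

# Hypothetical sample log, for illustration only.
click_log = [
    ("cheap flights beijing", "url_a"),
    ("low cost flights beijing", "url_a"),
    ("cheap flights beijing", "url_a"),
    ("weather today", "url_b"),
]
groups = co_clicked_groups(click_log)
```

Each surviving group is a set of queries that could then be aligned pairwise to look for candidate synonyms.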

Abstract

The invention provides a candidate synonym determination method and device, a server and a storage medium. The method comprises: obtaining a plurality of search corpora and performing word segmentation on each search corpus to obtain its segmented word sequence; for a first segmented word sequence and a second segmented word sequence among the segmented word sequences, determining, in the second segmented word sequence, a first candidate segmented word that matches a target segmented word of the first segmented word sequence, based on the ordering information of the segmented words and the number of segmented words in the corresponding segmented word sequence; and comparing first segmented word information, related to the order of the target segmented word in the first segmented word sequence, with second segmented word information, related to the order of the first candidate segmented word in the second segmented word sequence, to obtain a comparison result. A second candidate segmented word, used to form a candidate synonym with the target segmented word of the first segmented word sequence, is then determined in the second segmented word sequence according to the comparison result and the first candidate segmented word, so that the accuracy of the candidate synonym determination result is improved.
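
The steps in the abstract can be sketched roughly as follows. This is only one possible reading, not the patented procedure. It assumes that whitespace splitting stands in for word segmentation, that "ordering information and number of segmented words" means relative position, that the order-related information is the pair of neighboring words, and that a small search window around the first candidate is allowed. All function names are hypothetical.

```python
def segment(query):
    # Assumption: whitespace tokenization stands in for a real word segmenter.
    return query.split()

def first_candidate_index(i, n1, n2):
    """Map position i in a sequence of n1 words to a position in a sequence
    of n2 words by relative position (an assumed interpretation)."""
    if n1 <= 1:
        return 0
    return min(n2 - 1, round(i * (n2 - 1) / (n1 - 1)))

def neighbors(seq, i):
    """Words immediately before and after position i (None at the edges)."""
    prev_word = seq[i - 1] if i > 0 else None
    next_word = seq[i + 1] if i + 1 < len(seq) else None
    return prev_word, next_word

def context_score(seq1, i, seq2, j):
    """Count how many of the two neighbors agree (0, 1, or 2).
    Two sequence edges (None, None) are treated as agreeing."""
    return sum(a == b for a, b in zip(neighbors(seq1, i), neighbors(seq2, j)))

def second_candidate(seq1, i, seq2, window=1):
    """Return the word in seq2 to pair with seq1[i], or None if no position
    near the first candidate shares any context with the target."""
    j0 = first_candidate_index(i, len(seq1), len(seq2))
    lo, hi = max(0, j0 - window), min(len(seq2) - 1, j0 + window)
    # Prefer the best context agreement; break ties by closeness to j0.
    best = max(range(lo, hi + 1),
               key=lambda j: (context_score(seq1, i, seq2, j), -abs(j - j0)))
    if context_score(seq1, i, seq2, best) == 0:
        return None
    return seq2[best]

# Hypothetical queries, for illustration only.
a = segment("buy cheap phone")
b = segment("buy inexpensive phone")
pair = (a[1], second_candidate(a, 1, b))
```

In this sketch a caller would still need to discard pairs where both words are identical, since the function only aligns positions and does not judge meaning.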

Description

Technical field

[0001] The present invention relates to the technical field of synonym mining, and more specifically, to a method, device, server and storage medium for determining candidate synonyms.

Background technique

[0002] At present, a synonym dictionary is usually constructed by first automatically aligning the search corpora that clicked on the same search result on a search engine to obtain synonyms between the search corpora, and then manually selecting synonyms from the obtained synonyms to form the synonym dictionary.

[0003] The existing technology regards the automatic alignment of search corpora as a translation problem. It uses automatic translation technology to first determine candidate synonyms between search corpora, and then iteratively optimizes based on the candidate synonyms to obtain synonyms between search corpora.

[0004] The core of this search corpus automatic alignment technology is translation, and translatio...

Application Information

IPC(8): G06F40/247, G06F40/284, G06F40/42
Inventor: 康战辉
Owner: TENCENT TECH (SHENZHEN) CO LTD