Method and system for automatic synonym discovery based on topic model

A topic-model-based automatic discovery technology, applied in natural language data processing, special data processing applications, instruments, etc. It addresses the error rate and limited efficiency of existing synonym discovery methods, improving the efficiency of synonym discovery and, to a certain extent, solving the problem of semantic approximation.

Inactive Publication Date: 2019-01-01
SHANGHAI XINFEIFAN E COMMERCE CO LTD
1 Cites, 4 Cited by

AI Technical Summary

Problems solved by technology

The existing mainstream automatic synonym discovery algorithm needs prior knowledge to construct the reference text pattern for synonym discovery, which limits the efficiency of synonym discovery. Another method, reference text pattern matching, requires manual annotation of the part of speech and semantics of known vocabulary in advance in order to build the reference text pattern.
[0003] Referring to figure 1, it can be seen that the discovery of synonyms in the existing system needs to be supplemented by manual...

Examples

Embodiment Construction

[0042] The technical solutions of the present invention will be further described below in conjunction with the embodiments and the accompanying drawings.

[0043] The automatic synonym discovery method based on a topic model provided by the present invention comprises at least the following steps:

[0044] Import the data in which synonyms are to be discovered;

[0045] Perform word segmentation on the imported data according to the information in the database;

[0046] Build a topic model and perform topic model clustering;

[0047] Perform minimum correlation clustering on the topic clusters;

[0048] Output synonyms.
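
As an illustration of step [0045], the following is a minimal sketch of dictionary-driven word segmentation using forward maximum matching. The text does not say which segmentation algorithm is used or what the database contains, so the algorithm choice, the function name `segment`, the `max_len` parameter and the sample vocabulary are all assumptions made for this example.

```python
def segment(text, vocabulary, max_len=4):
    """Forward maximum matching: at each position take the longest known word.

    `vocabulary` is a hypothetical stand-in for "the information in the
    database"; characters that match no entry are emitted one at a time.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in vocabulary:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical vocabulary and input, for illustration only.
VOCABULARY = {"手机", "手机壳", "保护套", "苹果"}
print(segment("苹果手机壳保护套", VOCABULARY))
```

Longest-match-first is only one common heuristic; a production system would more likely rely on a statistical segmenter, with the database supplying domain-specific terms.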

[0049] In the method of the present invention, the step of outputting synonyms is followed by a step of manually screening the synonyms. Manual screening can be implemented with existing techniques, so it is not described further in this embodiment.

[0050] In the present invention, the topic model may be a latent Dirichlet allocation (LDA) model, and the steps of...
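
Because the text is cut off before it describes the LDA steps, the following is only a generic sketch of how an LDA topic model could be fitted to segmented text and used to group words by topic. It uses collapsed Gibbs sampling, a standard LDA inference method that is not confirmed as the patent's procedure. The function names, hyperparameters (`alpha`, `beta`, `iterations`) and sample documents are assumptions, and the minimum correlation clustering step is not shown because the available text does not define it.

```python
import random
from collections import defaultdict


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; returns (vocab, topic-word counts)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    vocab_size = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [defaultdict(int) for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []
    # Give every token a random initial topic and record the counts.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(zs)
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for n, w in enumerate(doc):
                z = assignments[d][n]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Conditional probability of each topic, up to a constant.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][n] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return vocab, topic_word


def cluster_by_topic(vocab, topic_word):
    """Group each word under the topic that holds most of its occurrences."""
    clusters = defaultdict(list)
    for w in vocab:
        best = max(range(len(topic_word)), key=lambda k: topic_word[k][w])
        clusters[best].append(w)
    return dict(clusters)


# Hypothetical segmented documents, for illustration only.
DOCS = [
    ["phone", "case", "cover"],
    ["phone", "cover", "shell"],
    ["shirt", "tee", "top"],
    ["tee", "shirt", "top"],
]
vocab, topic_word = lda_gibbs(DOCS, num_topics=2)
print(cluster_by_topic(vocab, topic_word))
```

Words that land in the same topic are only candidate synonyms. In the method described above they would still go through minimum correlation clustering and the final manual screening.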

Abstract

The invention discloses a method for automatically discovering synonyms based on a topic model, which comprises at least the following steps: importing the data in which synonyms are to be discovered; performing word segmentation on the imported data according to the information in the database; constructing a topic model and performing topic model clustering; performing minimum correlation clustering on the topic clusters; and outputting synonyms. The invention needs no prior knowledge or manual labeling and clusters synonyms automatically. To a certain extent it solves the problem of semantic similarity. Apart from the final screening, no manual intervention is needed during implementation, which greatly improves the efficiency of automatic synonym discovery.

Description

Technical field

[0001] The invention relates to the technical field of natural language processing, in particular to a topic-model-based automatic synonym discovery method and system.

Background

[0002] With the development of the information age, the scale of network text data keeps growing, so natural language processing has become increasingly important, and the importance of synonym discovery technology is increasingly evident as well. The existing mainstream automatic synonym discovery algorithm needs prior knowledge to construct the reference text pattern for synonym discovery, which limits the efficiency of synonym discovery. Another method, reference text pattern matching, requires manual annotation of the part of speech and semantics of known vocabulary in advance in order to build the reference text pattern.

[0003] Referring to figure 1, it can be seen that the discovery of synonyms in the existing systems needs to be suppleme...

Application Information

IPC(8): G06F16/35, G06F17/27
CPC: G06F40/247
Inventor: 曲德君, 李进岭, 曹大军, 杨冠军, 郁抒思
Owner SHANGHAI XINFEIFAN E COMMERCE CO LTD