Maximum entropy based Vietnamese cross ambiguity elimination method

A method based on a maximum entropy model, applied in natural language data processing, instrumentation, network data indexing, etc.

Active Publication Date: 2016-07-06
KUNMING UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

[0003] The present invention provides a maximum entropy-based Vietnamese cross-disambiguation disambiguati...



Examples


Embodiment 1

[0046] Embodiment 1: As shown in Figures 1-3, a maximum entropy-based Vietnamese cross-ambiguity disambiguation method comprises the following specific steps:

[0047] Step1. First, perform disambiguation modeling on the Vietnamese cross-ambiguity field corpus in the constructed Vietnamese cross-ambiguity field database to obtain the Vietnamese maximum entropy cross-ambiguity disambiguation model;

[0048] Step2. Randomly select a test corpus from the Vietnamese cross-ambiguity field corpus and disambiguate it with the established Vietnamese maximum entropy cross-ambiguity disambiguation model to obtain the disambiguated parameter sequence.
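The two steps above can be illustrated with a minimal sketch of a conditional maximum entropy classifier, p(y|x) = exp(Σ λ·f(x, y)) / Z(x), trained by plain gradient ascent on the log-likelihood. Everything specific here is an assumption made for illustration: the feature names (`prev=`, `next=`), the two labels `AB|C` and `A|BC` for the two ways of cutting an overlapping field, the toy samples built around the field "học sinh học", and the training procedure. The excerpt does not show the patent's actual feature templates or parameter estimation method.

```python
import math
from collections import defaultdict

# Hypothetical labels: the two ways to cut an overlapping field A B C.
LABELS = ["AB|C", "A|BC"]

def predict_proba(weights, feats, labels=LABELS):
    """Return p(y | feats) under the maximum entropy (softmax) model."""
    scores = {y: sum(weights.get((f, y), 0.0) for f in feats) for y in labels}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    z = sum(exps.values())
    return {y: e / z for y, e in exps.items()}

def train_maxent(samples, labels=LABELS, epochs=200, lr=0.5):
    """Gradient ascent on the conditional log-likelihood.

    samples: list of (feature_list, gold_label) pairs.
    Returns a dict mapping (feature, label) to its weight.
    """
    weights = defaultdict(float)
    for _ in range(epochs):
        grad = defaultdict(float)
        for feats, gold in samples:
            probs = predict_proba(weights, feats, labels)
            for y in labels:
                for f in feats:
                    # observed feature count minus expected feature count
                    grad[(f, y)] += (1.0 if y == gold else 0.0) - probs[y]
        for key, g in grad.items():
            weights[key] += lr * g / len(samples)
    return weights

def classify(weights, feats, labels=LABELS):
    """Pick the most probable segmentation label."""
    probs = predict_proba(weights, feats, labels)
    return max(labels, key=lambda y: probs[y])

# Toy training data (illustrative only, not from the patent's corpus):
# context features around the field "học sinh học".
SAMPLES = [
    (["prev=các", "next=bài"], "AB|C"),    # "học sinh | học"
    (["prev=những", "next=bài"], "AB|C"),
    (["prev=tôi", "next=ở"], "A|BC"),      # "học | sinh học"
    (["prev=em", "next=ở"], "A|BC"),
]

weights = train_maxent(SAMPLES)
print(classify(weights, ["prev=các", "next=bài"]))
```

In this sketch each weight is tied to a (feature, label) pair, so a context word can support one segmentation and count against the other, which is the property that makes a maximum entropy model a natural fit for choosing between competing cuts.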

Embodiment 2

[0049] Embodiment 2: As shown in Figures 1-3, a maximum entropy-based Vietnamese cross-ambiguity disambiguation method comprises the following specific steps:

[0050] Step1. First, perform disambiguation modeling on the Vietnamese cross-ambiguity field corpus in the constructed Vietnamese cross-ambiguity field database to obtain the Vietnamese maximum entropy cross-ambiguity disambiguation model;

[0051] Step2. Randomly select a test corpus from the Vietnamese cross-ambiguity field corpus and disambiguate it with the established Vietnamese maximum entropy cross-ambiguity disambiguation model to obtain the disambiguated parameter sequence.

[0052] The specific steps of the disambiguation modeling in Step1 are as follows:

[0053] Step1.1. First, use a crawler program to fetch webpage information from the Internet;

[0054] Step1.2. Filter and process the crawled webpage infor...
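Steps 1.1 and 1.2 crawl web pages and then filter them. The filtering rules are truncated in this excerpt, so the following is only a sketch of one common first filtering step: stripping markup and script/style content with Python's standard `html.parser` module. The names `TextExtractor` and `extract_text` are hypothetical, and the patent's own filtering may differ.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> content."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside script/style
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)

def extract_text(html):
    """Return the visible text of an HTML page as one string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)

page = (
    "<html><head><style>p{}</style><script>var a=1;</script></head>"
    "<body><p>Tôi học sinh học.</p><p> </p></body></html>"
)
print(extract_text(page))
```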

Embodiment 3

[0060] Embodiment 3: As shown in Figures 1-3, a maximum entropy-based Vietnamese cross-ambiguity disambiguation method comprises the following specific steps:

[0061] Step1. First, perform disambiguation modeling on the Vietnamese cross-ambiguity field corpus in the constructed Vietnamese cross-ambiguity field database to obtain the Vietnamese maximum entropy cross-ambiguity disambiguation model;

[0062] Step2. Randomly select a test corpus from the Vietnamese cross-ambiguity field corpus and disambiguate it with the established Vietnamese maximum entropy cross-ambiguity disambiguation model to obtain the disambiguated parameter sequence.

[0063] The specific steps of the disambiguation modeling in Step1 are as follows:

[0064] Step1.1. First, use a crawler program to fetch webpage information from the Internet;

[0065] Step1.2. Filter and process the crawled webpage info...
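Step2 of the embodiments draws a random test corpus from the cross-ambiguity field corpus. A minimal sketch of such a random selection, together with a simple accuracy measure, might look as follows. The function names, the 20% default ratio, and the fixed seed are assumptions for illustration. The excerpt also does not say whether the test items are held out of training; this sketch holds them out, which is an assumption.

```python
import random

def split_corpus(fields, test_ratio=0.2, seed=0):
    """Randomly split annotated ambiguity fields into (train, test) lists."""
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    shuffled = list(fields)
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(predict, test_items):
    """Share of (features, gold_label) items that predict() labels correctly."""
    correct = sum(1 for feats, gold in test_items if predict(feats) == gold)
    return correct / len(test_items)

# Usage with placeholder items standing in for annotated fields.
train, test = split_corpus(list(range(10)), test_ratio=0.3, seed=1)
print(len(train), len(test))
```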


Abstract

The invention relates to a maximum entropy based Vietnamese cross ambiguity elimination method and belongs to the technical field of natural language processing. The method comprises the steps of first performing disambiguation modeling on the Vietnamese cross ambiguity field corpora in a constructed Vietnamese cross ambiguity field library to obtain a Vietnamese maximum entropy cross ambiguity elimination model, and then randomly selecting test corpora from the Vietnamese cross ambiguity field corpora and disambiguating them with the established model to obtain a disambiguated parameter sequence. The method effectively eliminates the ambiguities of Vietnamese cross ambiguity words and supports work such as lexical analysis, syntactic analysis, semantic analysis, information extraction, information retrieval and machine translation. According to the applicant, no related reports on Vietnamese cross ambiguity elimination have been found so far, and the method achieves a very good effect.

Description

Technical field

[0001] The invention relates to a maximum entropy-based Vietnamese cross-ambiguity disambiguation method, which belongs to the technical field of natural language processing.

Background

[0002] Vietnamese disambiguation is a key step in word segmentation and part-of-speech tagging, and it underlies other higher-level applications. In Vietnamese information processing software and systems, cross-ambiguity disambiguation is indispensable. With the continuous improvement of Internet search technology, disambiguation has attracted more and more attention. How well ambiguous fields are disambiguated determines the accuracy of search; at the same time, disambiguation can improve the results of applications such as lexical analysis, syntactic analysis, semantic analysis and machine translation. Ambiguity is divided into intersection ambiguity and combination am...
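As an illustration of intersection (cross) ambiguity, the sketch below uses the usual definition: a syllable string A B C is a cross-ambiguity field when both "A B" and "B C" are dictionary words, so the middle part can attach to either side. The function name, the tiny lexicon, and the `max_len` limit are assumptions for illustration; the excerpt does not show how the patent builds its cross-ambiguity field database.

```python
def find_cross_ambiguities(syllables, lexicon, max_len=4):
    """Return (start, end, text) for spans covered by two overlapping words.

    syllables: list of syllables in sentence order.
    lexicon: set of words, multi-syllable words joined by single spaces.
    max_len: longest word length, in syllables, that is considered.
    """
    spans = []
    n = len(syllables)
    for i in range(n):
        # first word: syllables[i:j], at least two syllables long
        for j in range(i + 2, min(n, i + max_len) + 1):
            if " ".join(syllables[i:j]) not in lexicon:
                continue
            # second word starts inside the first and ends beyond it
            for s in range(i + 1, j):
                for e in range(j + 1, min(n, s + max_len) + 1):
                    if " ".join(syllables[s:e]) in lexicon:
                        spans.append((i, e, " ".join(syllables[i:e])))
    return spans

# Toy lexicon: "học sinh" (student), "sinh học" (biology), "học" (to study).
lexicon = {"tôi", "học", "học sinh", "sinh học"}
found = find_cross_ambiguities("tôi học sinh học".split(), lexicon)
print(found)
```

Here "học sinh học" is flagged because it can be cut either as "học sinh | học" or as "học | sinh học"; choosing between the two cuts is the job of the disambiguation model.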


Application Information

IPC(8): G06F17/30, G06F17/27
CPC G06F16/951, G06F40/289
Inventor 郭剑毅, 刘艳超, 余正涛, 毛存礼, 线岩团, 陈玮
Owner KUNMING UNIV OF SCI & TECH