Constraint conditional random field-based Vietnamese noun chunk identification method

A constraint-based recognition technique, applied in natural language translation, semantic tool creation, and natural language data processing, that achieves good recognition results and improves lexical analysis.

Status: Inactive
Publication Date: 2018-03-13
KUNMING UNIV OF SCI & TECH
3 Cites, 9 Cited by

AI Technical Summary

Problems solved by technology

[0003] The present invention provides a Vietnamese noun chunk recognition method based on a constrained conditional random field. It addresses the problem of Vietnamese noun chunk recognition, reduces the complexity of syntactic analysis, and improves the performance and efficiency of subsequent tasks.

Examples

Embodiment 1

[0030] Embodiment 1: As shown in Figures 1-2, the Vietnamese noun chunk recognition method based on a constrained conditional random field comprises the following steps:

[0031] Step 1. Build the noun chunk corpus: first, crawl text from Vietnamese websites, perform word segmentation and part-of-speech tagging, and manually annotate the noun phrases; then manually proofread, correct the annotations, and deduplicate to form a Vietnamese noun chunk corpus. Parts of this corpus are used to construct the constraints, as the training corpus, and as the test corpus;

[0032] Step 2. Build the constraints: from the Vietnamese noun chunk corpus, select part-of-speech features of the noun chunks according to the grammatical characteristics of Vietnamese, and construct constraints from these features;

[0033] Step 3. Build the Vietnamese noun chunk recognition model based on a constrained conditional random field: first, use conditional random fields to t...
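
Steps 1 and 2 can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the toy sentences, the part-of-speech tags (M, Nc, N, A, V, P), the BIO chunk labels, and the form of the constraint (a set of part-of-speech tags permitted under each chunk label). The source only says that part-of-speech statistics of noun chunks are collected and turned into constraints.

```python
from collections import Counter

# Hypothetical hand-made sample: each token is (word, POS tag, chunk label).
# B-NP begins a noun chunk, I-NP continues it, O is outside any noun chunk.
corpus = [
    [("ba", "M", "B-NP"), ("con", "Nc", "I-NP"), ("mèo", "N", "I-NP"),
     ("đen", "A", "I-NP"), ("ngủ", "V", "O")],
    [("tôi", "P", "B-NP"), ("mua", "V", "O"), ("hai", "M", "B-NP"),
     ("quyển", "Nc", "I-NP"), ("sách", "N", "I-NP")],
]

def pos_statistics(sentences):
    """Count how often each POS tag occurs under each chunk label."""
    counts = {}
    for sentence in sentences:
        for _word, pos, label in sentence:
            counts.setdefault(label, Counter())[pos] += 1
    return counts

def build_constraints(counts, min_count=1):
    """Permit a (label, POS) pair only if it was seen at least min_count times.

    This thresholding rule is an assumed stand-in for the patent's constraints.
    """
    return {label: {pos for pos, n in counter.items() if n >= min_count}
            for label, counter in counts.items()}

stats = pos_statistics(corpus)
allowed = build_constraints(stats)
```

On this toy sample, `allowed["B-NP"]` contains only the tags that were observed at the start of a noun chunk, so a verb can never open a chunk under this constraint. A real corpus would call for a higher `min_count` to keep annotation noise out of the constraint set.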

Abstract

The invention relates to a Vietnamese noun chunk identification method based on a constrained conditional random field, and belongs to the technical field of natural language processing. The method first constructs a Vietnamese noun chunk corpus; gathers statistics on the part-of-speech characteristics of the noun chunks in the corpus and establishes constraint conditions; trains a conditional random field on the noun chunk corpus to obtain an initial identification model; adds the established constraint conditions to obtain the final constrained conditional random field identification model; and, according to the parameter sequence of the noun chunk identification model, identifies Vietnamese noun chunks to obtain the final identification result sequence. The method identifies Vietnamese noun chunks effectively and provides strong support for lexical analysis, semantic analysis, information extraction, information retrieval, machine translation, and similar tasks.
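
The abstract does not say how the constraint conditions are added to the trained model. One common way to do this, shown here purely as an assumption, is constrained Viterbi decoding: label scores come from a trained conditional random field, and any (label, part-of-speech) pair outside the constraint set is given a score of negative infinity so it can never be chosen. The scores, tags, and constraint sets below are invented for illustration.

```python
import math

LABELS = ["B-NP", "I-NP", "O"]

def constrained_viterbi(emissions, transitions, pos_tags, allowed):
    """Return the best label path in which every (label, POS) pair is permitted.

    emissions[t][y]     : score of label y at position t (e.g. from a trained CRF)
    transitions[(a, b)] : score of moving from label a to label b
    allowed[y]          : set of POS tags that label y may attach to
    """
    def emit(t, y):
        # Hard constraint: a forbidden pair scores -inf and is never selected.
        return emissions[t][y] if pos_tags[t] in allowed[y] else -math.inf

    best = [{y: emit(0, y) for y in LABELS}]
    back = []
    for t in range(1, len(pos_tags)):
        scores, pointers = {}, {}
        for y in LABELS:
            prev = max(LABELS, key=lambda p: best[-1][p] + transitions[(p, y)])
            scores[y] = best[-1][prev] + transitions[(prev, y)] + emit(t, y)
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)

    # Follow the back-pointers from the best final label.
    path = [max(LABELS, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

# Invented scores for a four-token sentence tagged M Nc N V.
pos_tags = ["M", "Nc", "N", "V"]
emissions = [
    {"B-NP": 2.0, "I-NP": 0.0, "O": 0.5},
    {"B-NP": 0.0, "I-NP": 2.0, "O": 0.1},
    {"B-NP": 0.2, "I-NP": 1.5, "O": 0.3},
    {"B-NP": 0.0, "I-NP": 1.0, "O": 0.8},
]
transitions = {(a, b): 0.0 for a in LABELS for b in LABELS}
transitions[("O", "I-NP")] = -math.inf  # a chunk cannot continue from outside

all_tags = {"M", "Nc", "N", "A", "V", "P"}
unconstrained = {y: all_tags for y in LABELS}
constrained = {"B-NP": {"M", "P"}, "I-NP": {"Nc", "N", "A"}, "O": all_tags}

free_path = constrained_viterbi(emissions, transitions, pos_tags, unconstrained)
fixed_path = constrained_viterbi(emissions, transitions, pos_tags, constrained)
```

With these invented scores, the unconstrained decoder pulls the final verb into the noun chunk, while the constrained decoder labels it O because V is not permitted under I-NP. Hard masking is only one option; constraints could also be applied as soft penalties or as extra features during training.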

Description

Technical field

[0001] The invention relates to a method for recognizing Vietnamese noun chunks based on a constrained conditional random field, and belongs to the technical field of natural language processing.

Background technique

[0002] Noun chunk recognition is a basic and important task in natural language processing (NLP). It reduces the complexity of syntactic analysis and plays an important role in improving the performance and efficiency of machine translation. Noun phrase recognition automatically extracts specific structured information from unstructured text. Chunk recognition was originally proposed by Steven Abney, who first observed that chunks can reflect the information contained in a text better than individual words. In 1995, Lance Ramshaw and Mitch Marcus proposed using machine learning to solve the chunking problem and achieved good results. Afterwards, a large number of scholars conducted in-depth ...
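
Machine-learning approaches to chunking typically recast it as per-token labeling, which is what makes a sequence model such as a conditional random field applicable. The sketch below converts between chunk spans and BIO labels. The variant used, in which every chunk starts with a B- label, is a common convention and an assumption here; the source does not state which label set the inventors used.

```python
def spans_to_bio(n_tokens, spans, tag="NP"):
    """Turn half-open (start, end) chunk spans into one BIO label per token."""
    labels = ["O"] * n_tokens
    for start, end in spans:
        labels[start] = f"B-{tag}"
        for i in range(start + 1, end):
            labels[i] = f"I-{tag}"
    return labels

def bio_to_spans(labels):
    """Recover half-open (start, end) chunk spans from BIO labels."""
    spans, start = [], None
    for i, label in enumerate(labels + ["O"]):  # sentinel closes a trailing chunk
        if start is not None and not label.startswith("I-"):
            spans.append((start, i))
            start = None
        if label.startswith("B-"):
            start = i
    return spans

# Hypothetical example sentence with two noun chunks: "tôi" and "hai quyển sách".
tokens = ["tôi", "mua", "hai", "quyển", "sách"]
labels = spans_to_bio(len(tokens), [(0, 1), (2, 5)])
round_trip = bio_to_spans(labels)
```

Because the two functions are inverses on well-formed input, predicted label sequences can be turned back into chunk spans for evaluation against the manually annotated corpus.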

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/28, G06F17/27, G06F17/30
CPC: G06F16/36, G06F16/951, G06F40/284, G06F40/56, G06F40/58
Inventors: 郭剑毅, 李佳, 余正涛, 毛存礼, 线岩团, 陈玮
Owner: KUNMING UNIV OF SCI & TECH