Multi-feature-fused controlling method for recognizing Chinese organization name

A multi-feature fusion control method for recognizing Chinese organization names. It addresses the variable length, broad coverage, and high difficulty of recognizing such names, and reduces recognition errors.

Status: Inactive
Publication Date: 2013-03-06
EAST CHINA NORMAL UNIVERSITY
2 Cites, 5 Cited by

AI Technical Summary

Problems solved by technology

Research on Chinese organization names is limited, and such names cover a broad range of entities, draw on a wide vocabulary, vary in length, and are often abbreviated by convention, so recognition results remain unsatisfactory.
Existing role-labeling methods recognize organization names well, but a complete role database is very difficult to build, and these methods perform poorly on complex organization names.
Statistics-based methods are complex, which makes them very difficult to implement.



Embodiment Construction

[0032] The present invention relies on ICTCLAS, the word segmentation software of the Chinese Academy of Sciences, to perform word segmentation and part-of-speech tagging on the input document. The right-boundary feature words and the contextual semantic features of organization names are obtained from the annotated People's Daily corpus of January 1998. The left-boundary characteristics and the composition modes are obtained by analyzing and summarizing existing organization names.
Specific operation steps:
The first step is to use ICTCLAS to perform word segmentation and part-of-speech tagging on the input document.
The second step is to determine the position of the right-boundary word of the organization name according to the right-boundary feature lexicon.
The third step is to start from the position of the right boundary and match the left-boundary rule from right to left.
In the fou...
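
The second and third steps can be illustrated with a minimal Python sketch. It assumes the input has already been segmented and tagged (the patent uses ICTCLAS for this; the sketch does not call it). The lexicon `RIGHT_BOUNDARY_WORDS`, the tag set `ALLOWED_INNER_TAGS`, the function `find_candidates`, and the `max_len` limit are all hypothetical stand-ins, not the patent's actual lexicon or left-boundary rules.

```python
# Hypothetical right-boundary feature words (organization suffixes).
# The patent derives its lexicon from a corpus; these are illustrative only.
RIGHT_BOUNDARY_WORDS = {"大学", "公司", "银行", "研究所"}

# Hypothetical left-boundary rule: a name may extend leftwards only over
# words whose part-of-speech tag is in this set.
ALLOWED_INNER_TAGS = {"ns", "n", "nz", "j", "vn"}


def find_candidates(tagged, max_len=6):
    """Return inclusive (left, right) index spans of candidate names.

    tagged: list of (word, pos_tag) pairs produced by a segmenter.
    max_len: assumed upper bound on the number of words in a name.
    """
    candidates = []
    for right, (word, _) in enumerate(tagged):
        # Step 2: locate a right-boundary feature word.
        if word not in RIGHT_BOUNDARY_WORDS:
            continue
        # Step 3: scan from right to left while the rule still matches.
        left = right
        while (
            left > 0
            and right - (left - 1) < max_len
            and tagged[left - 1][1] in ALLOWED_INNER_TAGS
        ):
            left -= 1
        if left < right:  # require at least one word before the suffix
            candidates.append((left, right))
    return candidates


# Pre-tagged example sentence: "他毕业于华东师范大学"
tagged = [("他", "r"), ("毕业", "v"), ("于", "p"),
          ("华东", "ns"), ("师范", "n"), ("大学", "n")]
spans = find_candidates(tagged)
names = ["".join(w for w, _ in tagged[l:r + 1]) for l, r in spans]
print(names)
```

Here the scan stops at the preposition 于 because its tag is not in the assumed set, which is one simple way a left-boundary rule could terminate a candidate.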



Abstract

The invention provides a multi-feature-fused control method for recognizing Chinese organization names in a natural language processing system. The method comprises the following steps: a. recognizing the left and right boundaries of organization names in a sentence to be recognized, according to a right-boundary feature word library and a left-boundary rule for Chinese organization names, and generating candidate Chinese organization names; b. determining the composition mode of each candidate Chinese organization name, and screening the candidates; and c. comparing feature words in the contextual semantic environment of the Chinese organization names, and verifying the candidates so as to determine the Chinese organization names.
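
Steps b and c of the abstract can be sketched as two filters applied to candidate spans. This is a minimal illustration under stated assumptions: the composition modes are modeled as part-of-speech tag sequences and the context check as a fixed word window, neither of which is confirmed by the source. `COMPOSITION_MODES`, `CONTEXT_WORDS`, `matches_composition`, `has_context_support`, and `verify` are hypothetical names.

```python
# Hypothetical composition modes, modeled as part-of-speech tag sequences.
COMPOSITION_MODES = {("ns", "n", "n"), ("ns", "n"), ("nz", "n")}

# Hypothetical context feature words that tend to appear near organization names.
CONTEXT_WORDS = {"毕业", "于", "就职", "访问", "成立"}


def matches_composition(tagged, span):
    """Step b: keep a candidate only if its tag sequence is a known mode."""
    left, right = span
    tags = tuple(pos for _, pos in tagged[left:right + 1])
    return tags in COMPOSITION_MODES


def has_context_support(tagged, span, window=2):
    """Step c: look for a context feature word within `window` words."""
    left, right = span
    before = tagged[max(0, left - window):left]
    after = tagged[right + 1:right + 1 + window]
    return any(word in CONTEXT_WORDS for word, _ in before + after)


def verify(tagged, spans):
    """Apply both filters and return the surviving spans."""
    return [
        s for s in spans
        if matches_composition(tagged, s) and has_context_support(tagged, s)
    ]


# Pre-tagged example sentence with two candidate spans (inclusive indices).
tagged = [("他", "r"), ("毕业", "v"), ("于", "p"),
          ("华东", "ns"), ("师范", "n"), ("大学", "n")]
result = verify(tagged, [(3, 5), (4, 5)])
print(result)
```

The span `(4, 5)` is dropped at the screening stage because its tag sequence is not among the assumed modes, while `(3, 5)` passes both checks.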

Description

Technical field
[0001] The invention relates to technical fields such as named entity recognition, relationship mining, document summarization, syntax analysis, machine translation, and information extraction, and specifically to a system for identifying and marking organization names in Chinese documents.
Background technique
[0002] With the widespread use of computers and the rapid development of the Internet, a large amount of information is presented to people in the form of electronic documents. People urgently need automated tools to help them quickly find the information they really need in massive information sources, which gave rise to document information processing. Chinese documents differ from English documents: there are no spaces between words, and proper nouns such as company names, person names, and place names have no case distinction, which increases the difficulty of processing Chinese documents to a greater ext...

Application Information

IPC(8): G06F17/30, G06F17/27
Inventors: 凌雅娟, 杨静
Owner: EAST CHINA NORMAL UNIVERSITY