Method for extracting Chinese institutional unit name from text information

A text-information and name-extraction technique, applied in special data processing applications, instruments, network data retrieval, and similar fields. It addresses the problems of low accuracy, high cost, and large corpus feature size, and achieves less computation, improved extraction accuracy, and fast extraction speed.

Active Publication Date: 2015-11-11
GUANGZHOU WANLONG SECURITIES CONSULTING CO LTD
3 Cites, 12 Cited by

AI Technical Summary

Problems solved by technology

The prior recognition method requires manually labeling a large corpus for training. The corpus has a large number of features, the cost is relatively high, and the accuracy is not very high.



Examples


Detailed Description of Embodiments

[0038] Referring to Figure 1, the invention provides a method for extracting the name of a Chinese institution from text information, comprising:

[0039] S1. Loading the text information to be analyzed;

[0040] S2. Matching the text information to be analyzed against the front-part tagging rules, marking the positions of the front-part words, and extracting the information that conforms to the front-part tagging rules;

[0041] S3. Performing back-boundary identification on the extracted information, and then extracting candidate company name data;

[0042] S4. Matching the candidate company name data against the front-part tagging rules, and obtaining a candidate company name after decision processing;

[0043] S5. Performing search verification on the candidate company name and judging whether the verification succeeds; if it succeeds, the Chinese institution name is obtained.
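Steps S1 to S5 can be sketched in Python roughly as follows. This is only an illustrative sketch, not the patented implementation: the word lists `FRONT_WORDS` and `BACK_BOUNDARY_WORDS`, the `MAX_NAME_LEN` limit, the decision rule in `decide` (keep the text starting at the last front-part word), and the `search_hits` callback that stands in for the search verification are all assumptions, since the text above does not specify them.

```python
import re
from typing import Callable, List, Optional

# Hypothetical word lists: the source text does not give the actual rules.
FRONT_WORDS = ["广州", "北京", "上海"]               # assumed front-part words
BACK_BOUNDARY_WORDS = ["有限公司", "集团", "研究所"]  # assumed back-boundary words
MAX_NAME_LEN = 30  # assumed upper bound on a name's length, in characters


def mark_front_positions(text: str) -> List[int]:
    """S2: return the sorted start offsets of every front-part word."""
    positions = set()
    for word in FRONT_WORDS:
        positions.update(m.start() for m in re.finditer(re.escape(word), text))
    return sorted(positions)


def extract_to_back_boundary(text: str, start: int) -> Optional[str]:
    """S3: cut from a front-part word to the earliest-ending back-boundary word."""
    window = text[start:start + MAX_NAME_LEN]
    ends = [window.find(w) + len(w) for w in BACK_BOUNDARY_WORDS if w in window]
    return window[:min(ends)] if ends else None


def decide(raw: str) -> str:
    """S4 (assumed rule): re-match front-part words, keep the text from the last one."""
    positions = mark_front_positions(raw)
    return raw[positions[-1]:] if positions else raw


def extract_institution_names(
    text: str, search_hits: Callable[[str], int], min_hits: int = 1
) -> List[str]:
    """S1-S5: return verified institution names in order of first appearance."""
    names: List[str] = []
    for start in mark_front_positions(text):            # S2
        raw = extract_to_back_boundary(text, start)     # S3
        if raw is None:
            continue
        candidate = decide(raw)                         # S4
        # S5: accept only candidates that the (injected) search confirms.
        if candidate not in names and search_hits(candidate) >= min_hits:
            names.append(candidate)
    return names


# Stand-in for a real search backend, used only for this example.
KNOWN_NAMES = {"广州万隆证券咨询有限公司"}


def fake_search_hits(name: str) -> int:
    return 1 if name in KNOWN_NAMES else 0


text = "记者从广州万隆证券咨询有限公司获悉,北京天气晴朗。"
print(extract_institution_names(text, fake_search_hits))
# prints ['广州万隆证券咨询有限公司']
```

Passing the verifier in as a callback keeps the rule matching testable without network access; a real deployment would presumably replace `fake_search_hits` with a query against a search engine or company registry.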

[0044] As a further preferred embodiment, the front-part tagging ...


Abstract

The present invention discloses a method for extracting a Chinese institutional unit name from text information. The method comprises: loading the text information to be analyzed; matching the text information against a front-part tagging rule, marking the positions of the front-part words, and extracting the information that conforms to the front-part tagging rule; performing back-boundary identification on the extracted information to extract candidate company name data; matching the candidate company name data against the front-part tagging rule and performing decision processing to obtain a candidate company name; and performing search verification on the candidate company name, judging whether the verification succeeds, and, if it succeeds, obtaining the Chinese institutional unit name. Because the method matches and marks the front-part words and back-boundary words of the Chinese institutional unit name and combines this with network search verification, the amount of computation is relatively small, the extraction speed is high, and the extraction precision is greatly improved. The method can therefore be widely applied to weighing instrument industries.

Description

Technical field

[0001] The invention relates to the field of text information extraction and mining, and in particular to a method for extracting names of Chinese institutions from text information.

Background art

[0002] With the rapid development of the Internet and its technologies, the information on the network is growing explosively, and a large amount of it is presented to people in the form of electronic documents. People urgently need automated tools to help them quickly find the truly important information, so research on information extraction came into being, and named entity recognition is an important part of information extraction. Named entity recognition refers to identifying entities with specific meaning in text, mainly including names of people, places, institutions, dates, and so on. Among these, institution names are a relatively important category, especially institution names in Chinese. In this app...


Application Information

IPC(8): G06F17/30
CPC: G06F16/313, G06F16/374, G06F16/951
Inventor: 吴远辉
Owner: GUANGZHOU WANLONG SECURITIES CONSULTING CO LTD