Rule-based character attribute extraction method and system

A character attribute extraction technology, applied in the field of natural language information extraction, that addresses the problems of low matching efficiency and poor extraction performance and increases the success rate.

Active Publication Date: 2022-03-11
海南港航控股有限公司
Cites: 10 | Cited by: 0

AI Technical Summary

Problems solved by technology

[0005] To address the defects of the prior art, the present invention provides a rule-based character attribute extraction method and system that solves the technical problems of low matching efficiency and poor extraction performance in prior-art character attribute extraction.



Examples

Embodiment Construction

[0039] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0040] The rule-based character attribute extraction method of the present invention, as shown in Figure 1, includes the following steps:

[0041] (1) Character attribute word acquisition step: use a Chinese word segmenter to segment a paragraph that contains character attribute information, obtain the character attribute words the paragraph contains, and tag the part of speech of each character attribute word;

[0042] (2) Character attribute word segmentation and storage step: each character attribute word with part-of-speech tagging ...
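
Steps (1) and (2) can be illustrated with a minimal sketch. It assumes the segmenter has already produced space-separated `word/tag` tokens; this text does not name a segmenter or a tag format, so the sample string, the tag names, and the `parse_tagged` helper are all hypothetical. Splitting on spaces and storing the result in a one-dimensional array follows the description given in the abstract.

```python
def parse_tagged(tagged: str) -> list[tuple[str, str]]:
    """Split space-separated 'word/tag' tokens into a flat list of (word, tag)."""
    pairs = []
    for token in tagged.split():
        # Split on the last '/', so a word that itself contains '/' stays intact.
        word, sep, tag = token.rpartition("/")
        if not sep:
            # Token carried no tag: keep the word and leave the tag empty.
            word, tag = token, ""
        pairs.append((word, tag))
    return pairs


# Illustrative segmenter output (assumed format, not taken from the patent).
sample = "张三/nr 出生/v 于/p 1980年/t 毕业/v 于/p 北京大学/nt"
tokens = parse_tagged(sample)
print(tokens)
```

Keeping each word next to its tag in one flat list lets a later pass inspect a word's neighbours by index alone.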

Abstract

The invention discloses a rule-based character attribute extraction method and system, belonging to the technical field of natural language information extraction. The method comprises the following steps: segmenting a paragraph containing character attribute information with a Chinese word segmenter to obtain the character attribute words contained in the paragraph, and tagging the part of speech of each word; separating the part-of-speech-tagged character attribute words with spaces to obtain the set of all character attribute words and parts of speech in the input paragraph, and storing the set in a one-dimensional array; and traversing the part of speech of each character attribute word in the array and matching the words against the character attribute trigger word matching rule table. For each word that matches, the trigger words before and after it are then matched as well; if that matching succeeds, the character attribute word is extracted, and otherwise it is not. The invention provides a simple and effective character attribute extraction method that reduces extraction difficulty while improving extraction efficiency.
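
The traversal and matching described in the abstract might look like the following sketch. The layout of the rule table, the tag names, the trigger words, the `window` size, and the `extract_attributes` function are assumptions made for illustration; the patent's actual character attribute trigger word matching rule table is not reproduced in this text.

```python
# Hypothetical rule table: for each attribute, the part-of-speech tag a candidate
# word must carry, and the trigger words expected shortly before or after it.
RULES = {
    "birth_date": {"pos": "t", "before": {"出生于", "生于"}, "after": {"出生", "生"}},
    "school": {"pos": "nt", "before": {"毕业于", "就读于"}, "after": set()},
}


def extract_attributes(tokens, rules=RULES, window=2):
    """Return (attribute, word) pairs whose tag matches a rule and whose
    neighbours within `window` positions include one of that rule's triggers."""
    found = []
    for i, (word, pos) in enumerate(tokens):
        for attr, rule in rules.items():
            if pos != rule["pos"]:
                continue  # part of speech does not match this rule
            before = {w for w, _ in tokens[max(0, i - window):i]}
            after = {w for w, _ in tokens[i + 1:i + 1 + window]}
            if before & rule["before"] or after & rule["after"]:
                found.append((attr, word))
    return found


# One-dimensional array of (word, part-of-speech) pairs; sample data only.
tokens = [("张三", "nr"), ("1980年", "t"), ("出生", "v"), ("，", "w"),
          ("毕业于", "v"), ("北京大学", "nt"), ("2001年", "t"), ("入职", "v")]
print(extract_attributes(tokens))
```

In this sample, `2001年` carries the same part-of-speech tag as `1980年` but has no trigger word nearby, so it is not extracted. This is the second matching stage filtering out words that pass the part-of-speech check alone.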

Description

Technical field

[0001] The invention belongs to the technical field of natural language information extraction, and more specifically relates to a rule-based character attribute extraction method and system.

Background technique

[0002] With the rapid development of the Internet, the user data acquired by various websites has increased exponentially. How to quickly and accurately extract real and useful character attribute information from this massive data, and so provide data support for user portraits and business investment decisions, has become an important issue. Information extraction is the research field that tries to solve this problem. Character attribute and relationship extraction is one of the information extraction tasks; its purpose is to extract entity attributes and the relationships between entities from unstructured text.

[0003] At present, there are two main methods for character attribute extraction. The first method is based on rule matching. ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/253, G06F40/284, G06F40/295, G06F40/30
CPC: G06F40/253, G06F40/284, G06F40/295, G06F40/30
Inventors: 王善和, 张勇, 刘如梦
Owner: 海南港航控股有限公司