
Information extraction method and device

A technology relating to information extraction and rules, applied in the fields of instruments, calculation, semantic analysis, etc., that can reduce the manpower and time consumed by rule construction

Active Publication Date: 2022-04-19
ULTRAPOWER SOFTWARE
20 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0004] A good rule model can achieve high accuracy and precision, but building one not only requires professional modelers but also requires exhaustively enumerating the text elements to be matched, which consumes a lot of manpower and time.



Embodiment Construction

[0052] The embodiments of the present application are described in detail below.

[0053] In the rule-based extraction method, the regular expressions contain information extraction rules, which are used to extract the information the user expects from the text. For example, when the information extraction rule "medium body size | average body shape" is matched against the text and either "medium body size" or "average body shape" appears in the text, that description of body shape is extracted. To extract information more comprehensively, modelers need to exhaustively enumerate all possible expressions when constructing regular expressions, which consumes a lot of manpower and time.
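The rule-based matching described in [0053] can be sketched with an ordinary regular expression. This is only an illustration: the rule name `BODY_SHAPE_RULE`, the helper `extract`, and the sample sentences are assumptions made for this example, and the patent's own rule syntax may differ from Python's `re` syntax.

```python
import re

# Hypothetical rule built from the two phrasings mentioned in [0053].
# The "|" alternation matches either phrasing.
BODY_SHAPE_RULE = re.compile(r"medium body size|average body shape")


def extract(rule, text):
    """Return every substring of `text` matched by `rule`, in order of appearance."""
    return [m.group(0) for m in rule.finditer(text)]


# Both enumerated phrasings are found.
text = "One witness said average body shape; another said medium body size."
matches = extract(BODY_SHAPE_RULE, text)
print(matches)

# A phrasing the modeler did not enumerate is missed entirely.
missed = extract(BODY_SHAPE_RULE, "He is of medium build.")
print(missed)
```

The second call returns nothing because "medium build" was never enumerated, which is the coverage problem the paragraph attributes to purely rule-based extraction.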

[0054] In addition to rule-based extraction methods, statistics-based extraction methods can also be used to extract information. That is, a corpus annotated with the information that the user wants to extract is first used to train the statistical mod...


Abstract

The embodiment of the present invention discloses an information extraction method and device. The method includes: obtaining the text from which information is to be extracted and an extraction expression, where the extraction expression includes a region determination rule and an information extraction rule, the region determination rule includes a statistical operator, and the statistical operator denotes the statistical model used to identify named entities and/or dependent components in the text; using the statistical model to identify the named entities and/or dependent components in the text, and marking each identified named entity and/or dependent component with a corresponding identification tag; using the identification tags to match the region determination rule against the text to determine the effective extraction region in the text; and extracting, from the effective extraction region, the strings matched by the information extraction rule. The method invokes the statistical model in the manner of a rule, which is convenient and flexible; it also expands the range of recognized vocabulary, reduces rule construction, and extracts the information required by the user more accurately.
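The four steps in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: `toy_ner` is a dictionary lookup standing in for the trained statistical model, the region determination rule is simplified to "any sentence containing an entity with the given tag", and the function names and sample text are hypothetical. The excerpt does not show the patent's actual expression syntax, so this should not be read as the patented implementation.

```python
import re


def toy_ner(text):
    """Stand-in for the statistical model: return sorted (start, end, tag) spans."""
    lexicon = {"Alice": "PERSON", "Bob": "PERSON", "Paris": "LOCATION"}
    spans = []
    for word, tag in lexicon.items():
        for m in re.finditer(re.escape(word), text):
            spans.append((m.start(), m.end(), tag))
    return sorted(spans)


def effective_regions(text, spans, operator_tag):
    """Simplified region rule: keep each sentence holding an entity tagged `operator_tag`."""
    regions = []
    for m in re.finditer(r"[^.!?]+[.!?]?", text):
        if any(tag == operator_tag and m.start() <= s and e <= m.end()
               for s, e, tag in spans):
            regions.append(m.group(0).strip())
    return regions


def extract(text, operator_tag, info_rule):
    """Tag entities, determine effective regions, then apply the extraction rule there."""
    spans = toy_ner(text)
    rule = re.compile(info_rule)
    results = []
    for region in effective_regions(text, spans, operator_tag):
        results.extend(m.group(0) for m in rule.finditer(region))
    return results


text = "Alice is of medium build. The car was of medium size. Bob is tall."
found = extract(text, "PERSON", r"medium build|medium size|tall")
print(found)
```

In this sketch "medium size" is not extracted even though the information extraction rule would match it, because its sentence contains no entity tagged `PERSON` and so falls outside the effective extraction region.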

Description

Technical field

[0001] The invention relates to the fields of text processing and information extraction, and in particular to an information extraction method. The invention also relates to an information extraction device.

Background technique

[0002] Information extraction is a text processing technology that extracts specified types of factual information, such as entities, relationships, and events, from natural language text and outputs it as structured data. It can serve as a preliminary processing step for operations such as intelligent question answering, deep mining of semantic information, and standardized information extraction.

[0003] The main method of information extraction is the rule-based extraction method, which generally includes two stages: constructing regular expressions, and applying the regular expressions to obtain the information needed by users. Regular expressions are mainly constructed by modelers based on extra...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F40/131, G06F40/143, G06F40/30, G06F40/295
CPC: G06F40/131, G06F40/295, G06F40/14, G06F40/30
Inventor: 李德彦, 晋耀红, 吴相博
Owner: ULTRAPOWER SOFTWARE