A document genre classification system and method based on semantic role labeling

A classification method based on semantic role labeling, applied in the field of document genre classification. It addresses the low accuracy of existing methods and the difficulty of training accurate machine learning classification models on documents that are not short, and it achieves a small data volume, low data cost, and high genre recognition accuracy.

Pending Publication Date: 2019-05-03
京华信息科技股份有限公司

AI Technical Summary

Problems solved by technology

The accuracy of traditional machine learning genre identification is low because party and government documents are not as short as news items; many run to dozens or even hundreds of pages, which makes it difficult to train an accurate classification model.


Embodiment Construction

[0052] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0053] In this embodiment of the document genre classification system and method based on semantic role labeling, the structure of the document genre classification system is shown in Figure 1. In Figure 1, the document genre classification system based on semantic role labeling includes a connected semantic role labeling engine, knowledge ontology database, genre recognition rule ...

Abstract

The invention discloses a document genre classification system and method based on semantic role labeling. The system comprises a semantic role labeling engine, a knowledge ontology library, a genre recognition rule engine, and a genre recognition rule library, which are connected to one another. The semantic role labeling engine comprises word segmenters that are connected with one another. The genre recognition rule engine comprises a genre recognition rule parser, a genre recognition rule matcher, and a genre recognition rule inference device, which are connected with one another. The genre recognition rule parser parses a genre recognition rule from text into a data structure that a computer program can recognize. The genre recognition rule matcher matches the results labeled by the semantic role labeling engine against the genre recognition rules. The genre recognition rule inference device performs inference on the matcher's results to obtain the final genre classification. The method has relatively low data cost, relatively low calculation cost, and relatively high genre recognition accuracy.
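The three components of the rule engine described in the abstract (parser, matcher, inference device) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the rule text format `genre <- predicate(ROLE, ...)`, the role names `A0`/`A1`, the representation of a labeled sentence as a `(predicate, roles)` pair, and the use of a simple majority vote as the inference step.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Parsed form of one rule (hypothetical structure)."""
    genre: str
    predicate: str
    required_roles: frozenset


def parse_rule(text):
    """Parser: turn a rule written as text into a data structure.

    Assumed rule format: "genre <- predicate(ROLE, ROLE, ...)".
    """
    genre, _, body = text.partition("<-")
    predicate, _, roles = body.strip().rstrip(")").partition("(")
    role_set = frozenset(r.strip() for r in roles.split(",") if r.strip())
    return Rule(genre.strip(), predicate.strip(), role_set)


def match_rules(frames, rules):
    """Matcher: compare labeled (predicate, roles) frames against the rules.

    A rule matches a frame when the predicates are equal and the frame
    contains every role the rule requires.
    """
    hits = []
    for predicate, roles in frames:
        for rule in rules:
            if rule.predicate == predicate and rule.required_roles <= set(roles):
                hits.append(rule)
    return hits


def infer_genre(hits):
    """Inference: pick the genre with the most matched rules (assumed policy)."""
    if not hits:
        return None
    return Counter(rule.genre for rule in hits).most_common(1)[0][0]


# Usage with made-up rules and frames.
rules = [
    parse_rule("notice <- notify(A0, A1)"),
    parse_rule("speech <- thank(A0)"),
]
frames = [
    ("notify", {"A0", "A1", "TMP"}),
    ("notify", {"A0", "A1"}),
    ("thank", {"A0"}),
]
predicted = infer_genre(match_rules(frames, rules))
```

Keeping the rules as text and parsing them at load time means new genres can be added by editing the rule library rather than by retraining a model, which is in line with the low data cost the abstract claims.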

Description

Technical field

[0001] The invention relates to the field of document genre classification, in particular to a document genre classification system and method based on semantic role labeling.

Background technique

[0002] When classifying party and government documents such as party documents, special policies, laws and regulations, and leadership speeches, the traditional method is to collect a large corpus for each document genre, train a machine learning model on it, and classify documents with the trained model. The number of documents that must be collected is large, so the data cost is high. In addition, the complete document content needs to be processed, so the calculation cost is high. The accuracy of this genre identification method is low, because party and government documents are not as short as news information, and many party and government documents have dozens or even hundreds of pages. It is difficult for machine learning techn...


Application Information

IPC(8): G06F16/35, G06F17/27, G06N5/04
CPC: Y02P90/30
Inventor: 蓝建敏
Owner: 京华信息科技股份有限公司