Intelligent analysis method and device for insurance industry documents

A document parsing method for the insurance industry, applied in the field of document parsing, that addresses the inability to parse documents into a fine-grained structure and improves parsing accuracy and efficiency.

Active Publication Date: 2021-02-02
BEIJING UNIV OF POSTS & TELECOMM
Cites: 7; Cited by: 3

AI Technical Summary

Problems solved by technology

Insurance industry documents are stored almost entirely as PDF files, a format that does not support fine-grained structured parsing.


Examples


Embodiment Construction

[0029] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be described in further detail below in conjunction with the embodiments and accompanying drawings. Here, the exemplary embodiments and descriptions of the present invention are used to explain the present invention, but not to limit the present invention.

[0030] It should also be noted that, to avoid obscuring the present invention with unnecessary detail, the drawings show only the structures and/or processing steps closely related to the solution according to the present invention, and other details of little relevance to the invention are omitted.

[0031] It should be emphasized that the term "comprising/including", when used herein, refers to the presence of a feature, element, step or component, but does not exclude the presence or addition of one or more other features, elements, steps or components.

[0032] The st...


Abstract

The invention provides an intelligent analysis method and device for insurance industry documents. The method comprises: converting original data in PDF format into data in CSV format, wherein the CSV data comprises predetermined feature dimensions for document text recognition; performing data cleaning on the converted CSV data; capturing context semantic information of the text features based on text position, thereby expanding the feature dimensions; labeling sample data with a plurality of categories to obtain a training sample set, wherein the categories comprise text content and title categories of several different levels; selecting a training set from the training sample set, training a random forest algorithm on the training set, and classifying test samples with the trained random forest to obtain a category classification result for the test sample data features; and recombining the document content based on the category classification result, generating a structured file for output, and extracting important attributes of the document.
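Two of the steps in the abstract, expanding each line's features with context from neighbouring lines and recombining classified lines into a structured file, can be illustrated with a minimal Python sketch. This is not the patented implementation: the feature names (`font_size`, `x0`), the label names (`title1`, `title2`, `text`), and both helper functions are hypothetical, and the random forest classification step is assumed to have already produced the labels.

```python
from typing import Dict, List

# Hypothetical per-line features, as they might look after PDF -> CSV conversion.
Row = Dict[str, float]


def expand_context(rows: List[Row], window: int = 1) -> List[Row]:
    """Copy each neighbouring row's features into the current row as extra columns."""
    expanded = []
    for i, row in enumerate(rows):
        out = dict(row)
        for offset in range(-window, window + 1):
            if offset == 0:
                continue
            j = i + offset
            # Positions before the first or after the last row contribute zeros.
            neighbour = rows[j] if 0 <= j < len(rows) else {}
            tag = f"prev{-offset}" if offset < 0 else f"next{offset}"
            for key in row:
                out[f"{key}_{tag}"] = neighbour.get(key, 0.0)
        expanded.append(out)
    return expanded


def build_outline(lines: List[str], labels: List[str]) -> List[dict]:
    """Nest lines under headings. Labels are 'title1', 'title2', ... or 'text'."""
    root = {"level": 0, "title": None, "text": [], "children": []}
    stack = [root]
    for line, label in zip(lines, labels):
        if label == "text":
            # Body text attaches to the most recent heading.
            stack[-1]["text"].append(line)
            continue
        level = int(label[len("title"):])
        # Close headings at the same or a deeper level before opening a new one.
        while stack[-1]["level"] >= level:
            stack.pop()
        node = {"level": level, "title": line, "text": [], "children": []}
        stack[-1]["children"].append(node)
        stack.append(node)
    return root["children"]


# Illustrative data only; not taken from the patent.
rows = [
    {"font_size": 16.0, "x0": 72.0},
    {"font_size": 10.5, "x0": 90.0},
    {"font_size": 10.5, "x0": 90.0},
]
expanded = expand_context(rows)

lines = ["Chapter 1 General", "Article 1 Scope", "This policy covers ...",
         "Article 2 Term", "One year."]
labels = ["title1", "title2", "text", "title2", "text"]
outline = build_outline(lines, labels)
```

With a window of 1, every row gains the previous and next rows' features, so a classifier can tell, for instance, that a large-font line is followed by smaller body text. The nested dictionaries returned by `build_outline` could then be serialized (for example to JSON) as the structured output file.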

Description

Technical field

[0001] The invention relates to the technical field of document analysis, in particular to an intelligent analysis method and device for documents in the insurance industry.

Background

[0002] In the 1990s, with the rapid development of artificial intelligence technology, many information science researchers abroad applied machine learning to automatic text classification. As machine learning algorithms have matured, more and more electronic documents can be parsed and classified intelligently. However, the vast majority of today's text data exists in unstructured form, while machine learning training and prediction work best on structured data. Structured parsing of text data is therefore a major problem in natural language processing today.

[0003] The existing document storage format is basically in PDF format, so it is generally necessary to conver...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/205, G06F40/211, G06F40/242, G06F40/289, G06F40/30, G06F40/103, G06Q40/08
CPC: G06Q40/08, G06F40/103, G06F40/205, G06F40/211, G06F40/242, G06F40/289, G06F40/30
Inventor: 岳潭, 胡宗海
Owner: BEIJING UNIV OF POSTS & TELECOMM