
Electronic invoice content analysis method and system

A technology for analyzing the content of electronic invoices, applied in electronic digital data processing, special data processing applications, instruments, etc. It addresses the problem that traditional invoice management systems can no longer meet requirements.

Active Publication Date: 2016-06-01
AEROSPACE INFORMATION
Cites: 5; Cited by: 17

AI Technical Summary

Problems solved by technology

As electronic invoices receive more attention and wider use, the traditional invoice management system can no longer meet requirements, and electronic invoice management systems have emerged in response.



Examples


Embodiment 1

[0040] Embodiment 1. A method for analyzing the content of an electronic invoice.

[0041] The method of the first embodiment is described in detail below with reference to Figure 1.

[0042] Figure 1 is a flow chart of the method for analyzing the content of the electronic invoice in Embodiment 1 of the present invention. As shown in Figure 1, the electronic invoice described in this embodiment is based on a layout file format, and the system includes a position analysis module, a text merging module and a text association identification module. The method includes the following steps:

[0043] Step S101, the position analysis module invokes the layout file analysis engine module to perform position analysis on the content of the electronic invoice and obtain a set of position information in units of characters.

[0044] Specifically, the position analysis module parses the position information of each character in the electronic invoice. Preferably in the embodimen...
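The output of step S101, a position information set in units of characters, could be represented roughly as follows. This is a minimal sketch, not the patent's implementation: the `CharBox` record, its fields and the `to_position_set` helper are hypothetical names, and the raw tuples stand in for whatever the layout file analysis engine actually returns, which the source does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CharBox:
    """One character and its bounding box (hypothetical layout)."""
    char: str
    x: float       # left edge
    y: float       # top edge, assumed to grow downward
    width: float
    height: float

    @property
    def right(self) -> float:
        # Right edge, useful later when measuring gaps between characters.
        return self.x + self.width


def to_position_set(raw_chars):
    """Turn (char, x, y, width, height) tuples into CharBox records.

    The tuple shape is an assumption standing in for the engine output.
    """
    return [CharBox(*item) for item in raw_chars]


raw = [
    ("N", 10.0, 20.0, 6.0, 10.0),
    ("o", 16.0, 20.0, 6.0, 10.0),
]
position_set = to_position_set(raw)
```

Keeping one record per character leaves the grouping decisions to the later text merging step.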

Embodiment 2

[0063] Embodiment 2: the processing flow of the text merging module in the method for analyzing the content of the electronic invoice.

[0064] The method of this embodiment is described in detail below with reference to Figure 2.

[0065] Figure 2 is a processing flowchart of the text merging module in the method for analyzing the content of the electronic invoice in Embodiment 2 of the present invention. As shown in Figure 2, the method of this embodiment includes the following steps:

[0066] Step S201, sort the characters in the position information set from top to bottom and from left to right.

[0067] Step S202, use the character gap threshold to preliminarily merge characters that belong to the same text field on the same line.

[0068] Step S203, use the tag dictionary to set the type attribute of each text field.

[0069] In the embodiment of the present invention, the tag dictionary defines the face elements of the electronic invo...
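Steps S201 to S203 can be sketched as below. This is an illustration under stated assumptions, not the patented procedure: the `CharBox` and `TextField` records, the `line_tol` tolerance used to decide that characters share a line, the threshold values, and the two type names `"label"` and `"text"` are all hypothetical. The source only says that a character gap threshold and a tag dictionary are used.

```python
from dataclasses import dataclass


@dataclass
class CharBox:
    char: str
    x: float
    y: float
    width: float
    height: float


@dataclass
class TextField:
    text: str
    x: float
    y: float
    right: float
    kind: str = "text"


def group_lines(chars, line_tol):
    """S201: order characters top to bottom, then left to right per line."""
    lines = []
    for c in sorted(chars, key=lambda c: c.y):
        # Same line if the top edge is close to the line's first character.
        if lines and abs(c.y - lines[-1][0].y) <= line_tol:
            lines[-1].append(c)
        else:
            lines.append([c])
    return [sorted(line, key=lambda c: c.x) for line in lines]


def merge_line(line, char_gap):
    """S202: merge neighbours whose horizontal gap is within char_gap."""
    fields = []
    for c in line:
        if fields and c.x - fields[-1].right <= char_gap:
            fields[-1].text += c.char
            fields[-1].right = c.x + c.width
        else:
            fields.append(TextField(c.char, c.x, c.y, c.x + c.width))
    return fields


def merge_text(chars, tag_dictionary, char_gap=3.0, line_tol=2.0):
    fields = []
    for line in group_lines(chars, line_tol):
        fields.extend(merge_line(line, char_gap))
    # S203: set the type attribute from the tag dictionary (assumed lookup).
    for f in fields:
        f.kind = "label" if f.text in tag_dictionary else "text"
    return fields


chars = [
    CharBox("5", 60.0, 20.5, 6.0, 10.0),
    CharBox("t", 16.0, 20.0, 6.0, 10.0),
    CharBox("Q", 10.0, 20.0, 6.0, 10.0),
    CharBox("y", 22.0, 20.0, 6.0, 10.0),
]
fields = merge_text(chars, tag_dictionary={"Qty", "Unit price"})
```

In this example the three adjacent characters merge into the field `Qty`, which the dictionary marks as a label, while the distant `5` stays a separate text field.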

Embodiment 3

[0073] Embodiment 3: the processing flow of the text association identification module in the method for analyzing the content of the electronic invoice.

[0074] Figure 3 is a processing flowchart of the text association identification module in the method for analyzing the content of the electronic invoice according to Embodiment 3 of the present invention.

[0075] Step S301, according to the tag dictionary, traverse the text line set and match one commodity row label.

[0076] Step S302, find all commodity row labels according to the row gap threshold and the matched commodity row label.

[0077] Step S303, start traversing the text line set from the end of the commodity row labels, and determine the start and end positions of the commodity row content.

[0078] Step S304, judge the attribute type of the currently indexed text.

[0079] Step S305, if the attribute type of the currently indexed text is a text type, continue traversing, and return to step S304 to ...
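The traversal in steps S301 to S305 might look like the sketch below. Several points are assumptions because the source text is truncated: fields are taken to be already in reading order, the scan of commodity row content is assumed to stop at the first field that is not a text type, and the names `TextField`, `find_item_rows`, `item_labels` and `row_gap` are hypothetical. Assigning each content field to a column is left out, since the source excerpt does not describe it.

```python
from dataclasses import dataclass


@dataclass
class TextField:
    text: str
    x: float
    y: float
    kind: str  # "label" or "text" (assumed type names)


def find_item_rows(fields, item_labels, row_gap=4.0):
    """Return (commodity row labels, commodity row content fields)."""
    def is_item_label(f):
        return f.kind == "label" and f.text in item_labels

    # S301: match the first commodity row label.
    first = next((i for i, f in enumerate(fields) if is_item_label(f)), None)
    if first is None:
        return [], []
    anchor_y = fields[first].y

    # S302: collect every commodity row label within row_gap of that row.
    header_idx = [
        i for i, f in enumerate(fields)
        if is_item_label(f) and abs(f.y - anchor_y) <= row_gap
    ]

    # S303 to S305: walk on from the last label while fields are text type.
    content = []
    for f in fields[header_idx[-1] + 1:]:
        if f.kind != "text":
            break  # assumed end of the commodity row content
        content.append(f)
    return [fields[i] for i in header_idx], content


fields = [
    TextField("Buyer", 10.0, 10.0, "label"),
    TextField("ACME", 60.0, 10.0, "text"),
    TextField("Item", 10.0, 40.0, "label"),
    TextField("Qty", 100.0, 41.0, "label"),
    TextField("Paper", 10.0, 55.0, "text"),
    TextField("5", 100.0, 55.0, "text"),
    TextField("Total", 10.0, 90.0, "label"),
    TextField("50", 100.0, 90.0, "text"),
]
headers, content = find_item_rows(fields, item_labels={"Item", "Qty"})
```

Here the content scan starts after `Qty` and ends when it reaches the `Total` label, so the buyer and total fields are not mistaken for commodity rows.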


Abstract

The invention discloses a method and system for analyzing the content of an electronic invoice, and belongs to the technical field of text content extraction. The electronic invoice is based on a layout file format. The system mainly comprises a position analysis module, a text merging module and a text association identification module. The position analysis module calls a layout file analysis engine module to perform position analysis on the content of the electronic invoice, obtaining a position information set in units of characters. The text merging module uses inter-character gaps to merge characters belonging to the same text region, obtaining a text region set. The text association identification module performs association identification on the text region set, using a tag dictionary together with the gaps between text regions. Once the whole electronic invoice has been analyzed, the resulting data is stored in a database. The method can effectively improve the universality and adaptability of text content extraction and can analyze invoice content of different types and styles.

Description

Technical field

[0001] The invention relates to the technical field of text content extraction, in particular to a method and system for analyzing the content of an electronic invoice.

Background technique

[0002] In order to save social resources, reduce tax costs, and ultimately make invoices paperless, China is stepping up efforts to promote electronic invoices. As electronic invoices receive more attention and wider use, the traditional invoice management system can no longer meet requirements, and electronic invoice management systems have emerged in its place. Since an electronic invoice management system needs to store the content of each invoice, analyzing the content of the electronic invoice is an essential step. However, because invoices come in many varieties with complex styles, improving the generality and applicability of the invoice analysis method is an urgent problem.

Contents of the invention

[0003] In view of t...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventor: 龚勇浩, 戴晓栋, 张玉魁, 尹春天, 范立波, 杜英垒, 黄新华
Owner: AEROSPACE INFORMATION