SAX-based method for parsing an Excel file into multiple tables and improving data precision

A technique for Excel parsing and data precision, applied in the field of data processing. It addresses problems such as time formats that cannot be converted normally and loss of data precision, and it improves data precision, date handling, and flexibility.

Status: Inactive
Publication Date: 2021-10-01
INSPUR SOFTWARE TECH CO LTD
0 Cites, 1 Cited by

AI Technical Summary

Problems solved by technology

[0008] In view of the needs and shortcomings of current technology, the present invention provides a SAX-based method for parsing an Excel file into multiple tables and improving data precision. It addresses loss of data precision, time formats that cannot be converted normally, and the problems caused by the traditional DOM parsing approach.
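
The excerpt does not say how precision and time formats are handled. As an illustration only, the following Python sketch assumes two common causes: numeric cell text being rounded through binary floating point, and dates being stored as serial day counts in Excel's 1900 date system. The function names `parse_number` and `parse_date_serial` are hypothetical and do not come from the patent.

```python
from datetime import datetime, timedelta
from decimal import Decimal

# Assumption: the workbook uses Excel's 1900 date system. Because Excel treats
# 1900 as a leap year, 1899-12-30 works as the epoch for dates from 1900-03-01 on.
EXCEL_EPOCH = datetime(1899, 12, 30)


def parse_number(raw: str) -> Decimal:
    """Keep the decimal digits of the cell text instead of rounding via float."""
    return Decimal(raw)


def parse_date_serial(raw: str) -> datetime:
    """Convert a date serial: whole part = days, fractional part = time of day."""
    serial = Decimal(raw)
    days = int(serial)
    seconds = round((serial - days) * 86400)  # round() on a Decimal returns int
    return EXCEL_EPOCH + timedelta(days=days, seconds=seconds)
```

Under these assumptions, `parse_date_serial("44470")` yields 2021-10-01, and a long value such as `"12345678901234567.89"` keeps all of its digits, which it would not if it passed through `float`.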


Examples

Embodiment 1

[0025] This embodiment proposes a SAX-based method for parsing an Excel file into multiple tables and improving data precision, including:

[0026] First, obtain the Excel file, which contains multiple mappings, each of which is a header in the Excel file, and configure the mapping relationship between each target data table and its header.

[0027] Set the data threshold based on the deployment machine's thread count and memory and on the size of an Excel data row.
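
The excerpt names the inputs to the threshold but not the formula. The sketch below is one hypothetical way to combine them: give the buffered rows a fraction of memory, divide it among the threads, and cap the result. The function `batch_threshold` and the parameters `memory_fraction` and `max_rows` are assumptions for illustration.

```python
def batch_threshold(threads: int, memory_bytes: int, row_bytes: int,
                    memory_fraction: float = 0.5, max_rows: int = 10_000) -> int:
    """Rows to buffer before a batch is handed to an import thread.

    Hypothetical rule: every thread may hold one batch at a time, and all
    batches together should fit in `memory_fraction` of the given memory.
    """
    if threads < 1 or row_bytes < 1:
        raise ValueError("threads and row_bytes must be positive")
    budget = int(memory_bytes * memory_fraction)
    rows_per_thread = budget // (threads * row_bytes)
    # Never return less than one row, and never exceed the configured cap.
    return max(1, min(max_rows, rows_per_thread))
```

With 4 threads, 1,000,000 bytes of memory, and 100-byte rows, the default settings give a threshold of 1250 rows.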

[0028] Then, compare the header information in the Excel file against the configured mapping relationships of the target data tables, parse each data cell of the Excel file with SAX, and verify the mapping-relationship match, specifically:

[0029] Use SAX to traverse each row of the Excel file, match it against the mapping relationships, and record the relevant information about the matching mapping relationship in the Excel file. A mapping relationship includes the mapping between multiple header...
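
The row-by-row traversal can be sketched with Python's standard `xml.sax` module. This is a simplified illustration, not the patented implementation: it assumes a sheet fragment in which every cell value sits directly in a `<v>` element (a real .xlsx sheet also uses a shared-string table and cell references), and the table names and headers in `HEADER_TO_TABLE` are made up.

```python
import xml.sax

# Hypothetical configuration: header row -> target table name.
HEADER_TO_TABLE = {
    ("id", "name"): "customers",
    ("order_id", "amount"): "orders",
}


class SheetHandler(xml.sax.ContentHandler):
    """Streams rows; a row equal to a configured header switches the target table."""

    def __init__(self, on_row):
        super().__init__()
        self.on_row = on_row  # callback(table, header, values)
        self.table = None     # table selected by the most recent header row
        self.header = None
        self.row = None       # cells of the row being read
        self.text = None      # text chunks of the <v> element being read

    def startElement(self, name, attrs):
        if name == "row":
            self.row = []
        elif name == "v":
            self.text = []

    def characters(self, content):
        if self.text is not None:  # ignore whitespace between elements
            self.text.append(content)

    def endElement(self, name):
        if name == "v":
            self.row.append("".join(self.text))
            self.text = None
        elif name == "row":
            key = tuple(self.row)
            if key in HEADER_TO_TABLE:      # a new header starts a new section
                self.table, self.header = HEADER_TO_TABLE[key], key
            elif self.table is not None:    # data row under the current header
                self.on_row(self.table, self.header, self.row)
            self.row = None


def parse_sheet(xml_bytes):
    """Return (table, values) pairs in document order."""
    rows = []
    handler = SheetHandler(lambda table, header, values: rows.append((table, values)))
    xml.sax.parseString(xml_bytes, handler)
    return rows
```

Because the handler keeps only the current row in memory, the cost of parsing does not grow with the size of the sheet, which is the property that motivates SAX over DOM here.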

Embodiment 2

[0038] This embodiment proposes a SAX-based method for parsing an Excel file into multiple tables and improving data precision, including:

[0039] First, obtain the Excel file, which contains multiple mappings, each of which is a header in the Excel file, and configure the mapping relationship between each target data table and its header.

[0040] Set the data threshold based on the deployment machine's thread count and memory and on the size of an Excel data row.

[0041] Then, compare the header information in the Excel file against the configured mapping relationships of the target data tables, parse each data cell of the Excel file with SAX, and verify the mapping-relationship match, specifically:

[0042] Use SAX to traverse each row of the Excel file, match it against the mapping relationships, and record the relevant information about the matching mapping relationship in the Excel file. A mapping relationship includes the mapping between multiple header...

Abstract

The invention discloses a SAX-based method for parsing an Excel file into multiple tables and improving data precision, and relates to the technical field of data processing. The method comprises the following steps: first, obtain an Excel file that comprises a plurality of mappings, each mapping being a header of the Excel file, and configure the mapping relationship between each target data table and its header; then, compare the header information in the Excel file against the configured mapping relationships of the target data tables, parse each data cell of the Excel file with SAX, and verify the mapping-relationship match; finally, split the Excel file for import according to the successfully matched mapping relationships, and, when the traversed data reaches the set data threshold or a new header is encountered, start the multi-threaded workers and the data-source connection pool to import the data. The method aims to import Excel files into multiple database tables by SAX parsing while solving loss of data precision, time formats that cannot be converted normally, and the problems of the traditional DOM parsing approach.
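
The final step, flushing a batch when the threshold is reached or a new header appears, can be sketched as follows. This is a minimal illustration under stated assumptions: `BatchImporter` is a hypothetical name, and `insert_batch` is a caller-supplied function that is assumed to borrow a connection from a data-source connection pool and write the rows; neither is specified in the excerpt.

```python
from concurrent.futures import ThreadPoolExecutor


class BatchImporter:
    """Buffers rows and submits a batch on threshold or on a table change."""

    def __init__(self, insert_batch, threshold, workers=4):
        self.insert_batch = insert_batch  # hypothetical: insert_batch(table, rows)
        self.threshold = threshold
        self.pool = ThreadPoolExecutor(max_workers=workers)
        self.futures = []
        self.table = None
        self.buffer = []

    def add(self, table, row):
        if table != self.table:   # a new header was seen: flush the old section
            self.flush()
            self.table = table
        self.buffer.append(row)
        if len(self.buffer) >= self.threshold:
            self.flush()

    def flush(self):
        if self.buffer:
            batch, self.buffer = self.buffer, []
            self.futures.append(self.pool.submit(self.insert_batch, self.table, batch))

    def close(self):
        """Flush the remainder, wait for all imports, and re-raise any failure."""
        self.flush()
        for future in self.futures:
            future.result()
        self.pool.shutdown()
```

Handing each full batch to a worker thread lets parsing continue while earlier batches are written, and bounding the batch size keeps the memory held by pending rows predictable.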

Description

Technical field

[0001] The invention relates to the technical field of data processing, in particular to a SAX-based method for parsing an Excel file into multiple tables and improving data precision.

Background technique

[0002] For parsing Excel files and importing them into database tables, the traditional approach has the following problems:

[0003] (1) With traditional DOM parsing, an Excel file of only a few MB occupies hundreds of MB of memory once parsed, causing JVM memory overflow;

[0004] (2) Each Excel file can have only one header. When an Excel file contains multiple headers, it must be split into multiple Excel files with different headers (one header per Excel data file) and imported one by one;

[0005] (3) The mapping between the Excel header and the data table must match completely; partial matching according to specified requirements is not possible;

[0006] (4) SAX p...
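
Problem (3) concerns header matching. The description is truncated before it explains the matching rule the method uses, so the sketch below shows only one plausible reading of "partial matching": a mapping matches when all of its configured headers appear in the Excel header row, and extra Excel columns are ignored. The function `match_mapping` and the shape of `mappings` are assumptions.

```python
def match_mapping(header_row, mappings):
    """Find the first mapping whose configured headers all occur in header_row.

    mappings: {table_name: {excel_header: db_column}}  (hypothetical shape)
    Returns (table_name, {excel_column_index: db_column}) or None.
    """
    positions = {name.strip(): index for index, name in enumerate(header_row)}
    for table, columns in mappings.items():
        if all(header in positions for header in columns):
            return table, {positions[h]: db_col for h, db_col in columns.items()}
    return None
```

For example, a header row `["ID", "Name", "Remark"]` matches a mapping that configures only `ID` and `Name`; the `Remark` column is simply not imported.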

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/18, G06F40/205
CPC: G06F40/18, G06F40/205
Inventor: 张庆兵
Owner: INSPUR SOFTWARE TECH CO LTD