Excel document data analysis method and device

A data parsing technology for Excel documents, applied in the field of data processing. It addresses problems such as memory overflow, existing parsing methods being unable to cope with large documents, and parsing of Excel document data being unable to continue, so as to improve the efficiency of data parsing.

Active Publication Date: 2016-05-11
SHENZHEN AUDAQUE DATA TECH

AI Technical Summary

Problems solved by technology

These two methods have inevitable defects. The first parsing method can only parse Excel documents in a specific format, so it is not generally applicable. The second parsing method cannot handle larger Excel documents, because it loads the entire document into memory during parsing, which may cause a memory overflow and make it impossible to continue parsing the Excel document data.
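To illustrate the memory problem, the following is a minimal sketch of incremental XML parsing, one common way to avoid holding a whole worksheet tree in memory. It is not taken from the patent text. The helper names `make_sheet` and `iter_rows` are hypothetical, and the sample worksheet XML is generated in the code purely for demonstration.

```python
import io
import xml.etree.ElementTree as ET

# Namespace used by SpreadsheetML worksheet parts.
NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def make_sheet(n_rows: int) -> bytes:
    """Hypothetical helper: build a small worksheet XML with one numeric column."""
    rows = "".join(
        f'<row r="{i}"><c r="A{i}"><v>{i}</v></c></row>'
        for i in range(1, n_rows + 1)
    )
    xml = f'<worksheet xmlns="{NS[1:-1]}"><sheetData>{rows}</sheetData></worksheet>'
    return xml.encode("utf-8")


def iter_rows(stream):
    """Yield the cell values of one row at a time instead of building the full tree."""
    for _, elem in ET.iterparse(stream):  # default event: "end" of each element
        if elem.tag == NS + "row":
            yield [c.findtext(NS + "v", default="") for c in elem.iter(NS + "c")]
            # Drop the row's cells once consumed; only an empty row shell remains.
            elem.clear()


rows = list(iter_rows(io.BytesIO(make_sheet(3))))
print(rows)
```

By contrast, calling `ET.parse` on the same stream would materialise every row before any of them could be processed, which is the behaviour the second method above is criticised for.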



Examples


Detailed Description of Embodiments

[0028] The technical solution and beneficial effects of the present invention will be apparent through the detailed description of specific embodiments of the present invention in conjunction with the accompanying drawings.

[0029] Refer to Figure 1, which is a flow chart of a preferred embodiment of the data parsing method for Excel documents of the present invention. The method mainly includes:

[0030] Step 10, obtaining the file stream of the Excel document to be parsed. In this step, the file stream is in zip format and, as specified by the Office Open XML file format, it includes at least XML files describing application data, metadata, and custom data.
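As a rough illustration of Step 10, the sketch below treats the file stream as a zip container and lists the XML parts inside it. The function names `build_minimal_package` and `list_xml_parts` are hypothetical, and the sample package contains placeholder XML rather than a valid workbook; only the part names follow the usual xlsx layout.

```python
import io
import zipfile


def build_minimal_package() -> bytes:
    """Hypothetical helper: a tiny zip with xlsx-style part names (placeholder content)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("[Content_Types].xml", "<Types/>")
        zf.writestr("_rels/.rels", "<Relationships/>")
        zf.writestr("docProps/app.xml", "<Properties/>")
        zf.writestr("xl/workbook.xml", "<workbook/>")
        zf.writestr("xl/worksheets/sheet1.xml", "<worksheet/>")
    return buf.getvalue()


def list_xml_parts(stream) -> list[str]:
    """Return the names of the .xml parts stored in a zip-format file stream."""
    if not zipfile.is_zipfile(stream):
        raise ValueError("not a zip container, so not an .xlsx file stream")
    stream.seek(0)
    with zipfile.ZipFile(stream) as zf:
        return sorted(name for name in zf.namelist() if name.endswith(".xml"))


stream = io.BytesIO(build_minimal_package())
parts = list_xml_parts(stream)
print(parts)
```

With a real document, the same function could be given a file opened in binary mode, for example `open(path, "rb")`, where `path` is whatever file the caller wants to inspect.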

[0031] Step 20, parsing the file stream to obtain information about the workbooks and worksheets in the file stream. This mainly involves reading the information of the workbook and the worksheets (worksheet or sheet) in the file stream by parsing two related file stre...
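The source text is truncated before it names the two files, so the following sketch rests on an assumption: that they are `xl/workbook.xml` and `xl/_rels/workbook.xml.rels`, the two parts that conventionally link sheet names to worksheet parts in an Office Open XML package. The helper names and sample sheet names are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

MAIN = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
DOC_REL = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
PKG_REL = "http://schemas.openxmlformats.org/package/2006/relationships"

# Sample data (hypothetical sheet names), shaped like the two assumed parts.
WORKBOOK_XML = f"""<workbook xmlns="{MAIN}" xmlns:r="{DOC_REL}">
  <sheets>
    <sheet name="Orders" sheetId="1" r:id="rId1"/>
    <sheet name="Customers" sheetId="2" r:id="rId2"/>
  </sheets>
</workbook>"""

RELS_XML = f"""<Relationships xmlns="{PKG_REL}">
  <Relationship Id="rId1" Type="{DOC_REL}/worksheet" Target="worksheets/sheet1.xml"/>
  <Relationship Id="rId2" Type="{DOC_REL}/worksheet" Target="worksheets/sheet2.xml"/>
</Relationships>"""


def build_sample() -> bytes:
    """Hypothetical helper: package the two sample parts into a zip stream."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("xl/workbook.xml", WORKBOOK_XML)
        zf.writestr("xl/_rels/workbook.xml.rels", RELS_XML)
    return buf.getvalue()


def sheet_parts(stream) -> dict[str, str]:
    """Map each sheet name to the path of its worksheet XML part."""
    with zipfile.ZipFile(stream) as zf:
        workbook = ET.fromstring(zf.read("xl/workbook.xml"))
        rels = ET.fromstring(zf.read("xl/_rels/workbook.xml.rels"))
    # Relationship Id -> Target, e.g. "rId1" -> "worksheets/sheet1.xml"
    targets = {
        rel.get("Id"): rel.get("Target")
        for rel in rels.iter(f"{{{PKG_REL}}}Relationship")
    }
    result = {}
    for sheet in workbook.iter(f"{{{MAIN}}}sheet"):
        rel_id = sheet.get(f"{{{DOC_REL}}}id")
        # Simplification: targets are assumed to be relative to the xl/ folder.
        result[sheet.get("name")] = "xl/" + targets[rel_id]
    return result


mapping = sheet_parts(io.BytesIO(build_sample()))
print(mapping)
```

The resulting mapping is the kind of workbook and worksheet information that later steps would need in order to know which XML part belongs to which sheet.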


Abstract

The invention relates to a data parsing method for Excel documents. The method comprises the following steps: step 10, obtaining the file stream of the Excel document to be parsed; step 20, parsing the file stream to obtain information about the workbooks and worksheets in the file stream; step 30, reading, through multiple threads, the XML (Extensible Markup Language) documents corresponding to the respective worksheets; step 40, parsing the XML document of shared data in the file stream, finding the storage locations of the shared data corresponding to the worksheets, and reading those storage locations through the multiple threads; step 50, parsing, through the multiple threads, the XML documents corresponding to the worksheets in combination with the shared data corresponding to the worksheets, so as to obtain the data of the Excel document. The invention further relates to a data parsing device for Excel documents. The method and device are generally applicable to Excel documents of all formats, can be applied to larger Excel documents, and improve the efficiency of data parsing.
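Steps 30 to 50 can be sketched roughly as follows, under several assumptions not confirmed by the text: the shared data is taken to be the `xl/sharedStrings.xml` part, each worksheet gets one worker thread, and the shared strings are loaded once and read by all threads rather than looked up by per-worksheet storage location as the abstract describes. All function names and sample data are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor

NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"
XMLNS = NS[1:-1]


def sheet_xml(cells) -> str:
    """Hypothetical helper: one-row worksheet whose cells index the shared strings."""
    body = "".join(f'<c r="{ref}" t="s"><v>{idx}</v></c>' for ref, idx in cells)
    return (f'<worksheet xmlns="{XMLNS}"><sheetData>'
            f'<row r="1">{body}</row></sheetData></worksheet>')


def build_sample() -> bytes:
    """Hypothetical helper: a zip with two worksheets and a shared-strings part."""
    shared = "".join(f"<si><t>{s}</t></si>" for s in ["id", "name", "city"])
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("xl/sharedStrings.xml", f'<sst xmlns="{XMLNS}">{shared}</sst>')
        zf.writestr("xl/worksheets/sheet1.xml", sheet_xml([("A1", 0), ("B1", 1)]))
        zf.writestr("xl/worksheets/sheet2.xml", sheet_xml([("A1", 0), ("B1", 2)]))
    return buf.getvalue()


def load_shared_strings(data: bytes) -> list[str]:
    """Read the assumed shared-data part into a list indexed by string id."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf, zf.open("xl/sharedStrings.xml") as f:
        root = ET.parse(f).getroot()
        return ["".join(t.text or "" for t in si.iter(NS + "t"))
                for si in root.iter(NS + "si")]


def parse_sheet(data: bytes, part: str, shared: list[str]) -> dict[str, str]:
    """Parse one worksheet part, resolving shared-string cells to their text."""
    cells = {}
    # Each worker opens its own ZipFile so threads never share a file position.
    with zipfile.ZipFile(io.BytesIO(data)) as zf, zf.open(part) as f:
        for _, elem in ET.iterparse(f):
            if elem.tag == NS + "c":
                text = elem.findtext(NS + "v", default="")
                if elem.get("t") == "s" and text:
                    text = shared[int(text)]  # cell stores an index, not the string
                cells[elem.get("r")] = text
                elem.clear()
    return cells


def parse_workbook(data: bytes, parts: list[str]) -> dict[str, dict[str, str]]:
    """Parse every listed worksheet part in its own thread."""
    shared = load_shared_strings(data)
    with ThreadPoolExecutor(max_workers=len(parts)) as pool:
        results = pool.map(lambda p: parse_sheet(data, p, shared), parts)
        return dict(zip(parts, results))


result = parse_workbook(
    build_sample(), ["xl/worksheets/sheet1.xml", "xl/worksheets/sheet2.xml"]
)
print(result)
```

Note that in CPython, threads mainly overlap I/O and decompression; whether they speed up the XML parsing itself depends on the runtime, so the efficiency gain claimed in the abstract should not be read off this sketch.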

Description

Technical field

[0001] The present application relates to the technical field of data processing, in particular to a data parsing method and device for Excel documents.

Background

[0002] Microsoft Excel is one of the components of Microsoft Office, Microsoft's office software suite. Since version 2007, Microsoft Office has used the Office Open XML file format, which differs from the binary file format used by previous versions. The container for the new file format is a compressed ZIP file built from simple components. These include XML files that describe application data, metadata, and custom data, the relationships between components, and non-XML files such as images embedded in documents or OLE object binaries. The core of the new Office Open XML format is therefore a set of XML reference schemas and a ZIP container. Take a newly created blank Excel document with the suffix xlsx as an example. After decompressing it, the folders _rels, xl, and docProps are formed in the fi...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F17/22
CPC: G06F16/258, G06F40/12
Inventors: 刘倍材, 樊文飞, 贾西贝
Owner: SHENZHEN AUDAQUE DATA TECH