
Method for analyzing Word file information and system thereof

Technology for analyzing Word file information, applied in the field of methods and systems for analyzing Word file information. It addresses problems such as program hangs, imperfect document support, and reduced program stability, and achieves the effect of saving system resources.

Status: Inactive. Publication Date: 2011-02-23
WONDERSHARE TECH CO LTD

AI Technical Summary

Problems solved by technology

[0007] However, Solution 1 requires a Com call for every Word element parsed, so parsing is inefficient and unstable. In addition, when an attribute of the parsed Word object has no value set, continuing to parse causes the program to hang, which directly affects program stability.
[0008] For Solution 2, open-source software such as Open-Office does not fully support documents in doc format, so attributes are lost when more complex elements are parsed.
[0009] For Solution 3, only docx documents are supported. Because the doc file format is not public, this method cannot parse Word 2003 or Word 2000 format files, so its coverage of Word file versions is incomplete.
[0010] Therefore, the existing technology still needs to be improved and developed.


Examples

Embodiment approach

[0062] 1. In step S100, the XML2003 file conversion module 100 first initializes the Com interface of MS_Word, then imports a Word file (for example a *.doc or *.docx file) in the background, and finally saves it as a Word_XML2003 format file. Specifically, as shown in Figure 3, the XML2003 file conversion module 100 includes an initialization unit 110, a document creation unit 120, a setting unit 130, an import unit 140, and a generation unit 150, connected in sequence for data transfer. As shown in Figure 4, step S100 may include the following steps:

[0063] Step S110: the initialization unit 110 initializes the Word object, for example by instantiating a Word program object ApplicationPtr;

[0064] Step S120: the document creation unit 120 creates a document object, for example a Word document object DocumentPtr created through the Word instance object;

[0065] Step S130: the setting unit 130 sets Word to run in the background, that is, the Word obje...
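The conversion in step S100 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function name `convert_to_word_xml` is hypothetical, the Word application object is passed in by the caller, and the save format is assumed to be Word's `wdFormatXML` constant (value 11), which corresponds to Word 2003 XML. Because the source text for the later steps is cut off, the import and save calls below are based only on the summary in paragraph [0062].

```python
from pathlib import Path

# WdSaveFormat.wdFormatXML in the Word object model (Word 2003 XML).
WD_FORMAT_XML = 11


def convert_to_word_xml(word_app, source, target):
    """Save `source` (*.doc or *.docx) as a Word 2003 XML file at `target`.

    `word_app` is an already initialized Word Application automation object
    (steps S110/S120 in the text). Hypothetical helper for illustration only.
    """
    # Keep Word in the background (step S130).
    word_app.Visible = False

    # Import the Word file; Word automation expects absolute paths.
    document = word_app.Documents.Open(str(Path(source).resolve()))
    try:
        # Generate the Word_XML2003 intermediate file.
        document.SaveAs(str(Path(target).resolve()), FileFormat=WD_FORMAT_XML)
    finally:
        # Close without saving changes back to the original file.
        document.Close(False)
    return str(Path(target).resolve())


# Usage on Windows with Word and the pywin32 package installed (assumptions):
#
#   import win32com.client
#   app = win32com.client.Dispatch("Word.Application")
#   try:
#       convert_to_word_xml(app, "input.doc", "input.xml")
#   finally:
#       app.Quit()
```

Passing the application object in, rather than creating it inside the function, means one Word instance can be reused for a whole batch of files, which matches the stated goal of avoiding frequent Com calls.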

Abstract

The invention discloses a method and system for analyzing Word file information. The method comprises the following steps: converting the Word file to be analyzed into an intermediate file in Word_XML2003 format; parsing the basic information of the elements in the Word_XML2003 document; combining the parsed information according to Word's rules; and writing the parsed and combined objects into an XML file. Because a Word_XML2003 document serves as the intermediate file, and the parsed information is combined according to Word's rules, attributes are not lost when complex elements are parsed, frequent calls to the Com interface are avoided, and system resources are saved. Word file information in all formats can be analyzed efficiently and stably, which makes the method particularly suitable for batch analysis of Word files of various versions.
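The parse, combine, and write steps named in the abstract can be illustrated with a small sketch using only the Python standard library. Everything here is an assumption made for illustration: the sample document, the function names, the output element names (`document`, `paragraph`, `span`), and above all the combining rule, which simply merges adjacent runs that share the same bold setting. The patent's actual combination rules are not given in this text.

```python
import xml.etree.ElementTree as ET

# Namespace used by Word 2003 XML (WordprocessingML 2003) documents.
W = "http://schemas.microsoft.com/office/word/2003/wordml"
NS = {"w": W}

# Tiny hand-written sample standing in for a converted Word_XML2003 file.
SAMPLE = (
    '<w:wordDocument xmlns:w="' + W + '"><w:body><w:p>'
    "<w:r><w:rPr><w:b/></w:rPr><w:t>Hel</w:t></w:r>"
    "<w:r><w:rPr><w:b/></w:rPr><w:t>lo</w:t></w:r>"
    "<w:r><w:t> world</w:t></w:r>"
    "</w:p></w:body></w:wordDocument>"
)


def parse_runs(root):
    """Parse basic information: one list of (text, bold) runs per paragraph."""
    paragraphs = []
    for p in root.iter("{%s}p" % W):
        runs = []
        for r in p.findall("w:r", NS):
            text = "".join(t.text or "" for t in r.findall("w:t", NS))
            b = r.find("w:rPr/w:b", NS)
            # <w:b/> switches bold on unless its w:val attribute is "off".
            bold = b is not None and b.get("{%s}val" % W, "on") != "off"
            runs.append((text, bold))
        paragraphs.append(runs)
    return paragraphs


def combine(paragraphs):
    """Assumed rule: merge adjacent runs that carry identical formatting."""
    merged = []
    for runs in paragraphs:
        out = []
        for text, bold in runs:
            if out and out[-1][1] == bold:
                out[-1] = (out[-1][0] + text, bold)
            else:
                out.append((text, bold))
        merged.append(out)
    return merged


def to_xml(paragraphs):
    """Write the combined objects into a simple output XML string."""
    doc = ET.Element("document")
    for runs in paragraphs:
        p = ET.SubElement(doc, "paragraph")
        for text, bold in runs:
            ET.SubElement(p, "span", bold=str(bold).lower()).text = text
    return ET.tostring(doc, encoding="unicode")


result = to_xml(combine(parse_runs(ET.fromstring(SAMPLE))))
```

Once the intermediate file exists, all of this work happens on plain XML, so no Com call is needed per element.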

Description

Technical field

[0001] The invention relates to systems capable of reading and analyzing Word file information, and more specifically to a method and system for analyzing Word file information.

Background art

[0002] Microsoft Word is a word processing application from Microsoft Corporation, and it is used more and more in office automation. With automated office applications now widespread, it is often necessary to read and identify useful information in documents. How quickly Word documents can be analyzed in batches directly affects office efficiency.

[0003] At present, the methods commonly used in the industry for parsing Word files are roughly as follows:

[0004] Solution 1: Call the automation Com interface of MS-Word and analyze the Word file information according to the Word document structure.

[0005] Solution 2: Realize the analysis of Word file information by calli...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/21
Inventor: 解辉
Owner: WONDERSHARE TECH CO LTD