Method and system for streaming analysis of document content

A parsing method based on content streaming, applied in the field of document content parsing. It addresses three problems: lag in file checking, device freezes, and reduced file parsing performance.

Active Publication Date: 2021-01-08
南京中孚信息技术有限公司 +3

AI Technical Summary

Problems solved by technology

Disadvantage 1: Files transmitted over the network
A file transmitted over the network must first be saved to the device's disk and then read into memory, which delays file checking.
Disadvantage 2: Large file processing
If a large file is loaded into memory all at once, parsing occupies too much memory and too many CPU resources, causing the device to freeze and interfering with the user's other operations.
Disadvantage 3: Compressed file processing
A compressed file is a special type of file that can contain multiple files or folders, and can also contain other compressed files, forming an archive with multiple nesting layers. If there are too many nesting layers, loading everything at once occupies a large amount of memory and reduces file parsing performance.
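
The memory problem described in Disadvantage 2 can be illustrated with a minimal Python sketch that reads a stream in fixed-size chunks instead of loading it whole. This is not the patented implementation: the names `iter_chunks` and `stream_contains`, the 64 KiB default chunk size, and the use of a keyword search as the "file check" are all assumptions made for illustration.

```python
import io
from typing import BinaryIO, Iterator


def iter_chunks(stream: BinaryIO, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield successive chunks so at most chunk_size bytes are read at a time."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


def stream_contains(stream: BinaryIO, keyword: bytes, chunk_size: int = 64 * 1024) -> bool:
    """Hypothetical check: report whether keyword occurs anywhere in the stream.

    The last len(keyword) - 1 bytes of each window are carried over to the
    next one, so a match that straddles a chunk boundary is still found.
    """
    if not keyword:
        return True
    tail = b""
    for chunk in iter_chunks(stream, chunk_size):
        window = tail + chunk
        if keyword in window:
            return True
        # Keep only as many trailing bytes as a partial match could need.
        tail = window[-(len(keyword) - 1):] if len(keyword) > 1 else b""
    return False


if __name__ == "__main__":
    data = io.BytesIO(b"aaaa secret bbbb")
    # A tiny chunk size forces the keyword to span several chunks.
    print(stream_contains(data, b"secret", chunk_size=4))
```

Because the function accepts any binary stream, the same code could in principle be fed from a network socket wrapper rather than a file on disk, which is the situation Disadvantage 1 describes.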

Embodiment Construction

[0054] To further illustrate the various embodiments, the present invention provides accompanying drawings. These drawings are part of the disclosure of the present invention and are mainly used to illustrate the embodiments; together with the relevant descriptions in the specification, they explain the operating principles of the embodiments. With reference to this material, those of ordinary skill in the art should be able to understand other possible implementations and the advantages of the present invention. The components in the figures are not drawn to scale, and similar component symbols are generally used to represent similar components.

[0055] According to an embodiment of the present invention, a method and system for stream parsing of document content are provided.

[0056] The present invention is now further described with reference to the accompanying drawings and specific embodiments. As shown in Figure 1, according to the document content stream parsing method of t...

Abstract

The invention discloses a document content streaming analysis method and system. The method comprises the following steps: S1, reading file data and completing the directory scan; S2, judging file types and classifying the different types of files; and S3, calling the corresponding parser to parse each file according to its file type. The beneficial effects are as follows. Files are first classified into structured files, text files, and compressed files according to their file structures. With the streaming analysis method provided by the invention, only part of the data is loaded for processing at a time, different processing methods are used for different types of files, and a state machine controls the whole processing procedure. The internal streaming processing differs for each file type, but the overall file analysis flow is the same.
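
Steps S1 to S3 and the controlling state machine can be sketched roughly as follows. This is an illustrative reading of the abstract, not the patent's actual design: the state names, the `classify` rule based on leading magic bytes, and the three placeholder parsers are all assumptions. The classification is also deliberately simplified; for example, formats that are themselves ZIP containers would be reported as compressed files here.

```python
import os
from enum import Enum, auto


class State(Enum):
    SCAN = auto()      # S1: read file data, complete the directory scan
    CLASSIFY = auto()  # S2: judge the file type
    PARSE = auto()     # S3: call the parser for that type
    DONE = auto()


class FileKind(Enum):
    STRUCTURED = auto()
    TEXT = auto()
    COMPRESSED = auto()


def classify(header: bytes) -> FileKind:
    """Assumed rule: decide the kind from well-known leading magic bytes."""
    if header.startswith((b"PK\x03\x04", b"\x1f\x8b")):      # ZIP, gzip
        return FileKind.COMPRESSED
    if header.startswith((b"%PDF", b"\xd0\xcf\x11\xe0")):    # PDF, OLE2
        return FileKind.STRUCTURED
    return FileKind.TEXT


def count_bytes(path: str, chunk_size: int) -> int:
    """Placeholder parser: consume the file chunk by chunk, return its size."""
    total = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            total += len(chunk)
    return total


def count_lines(path: str, chunk_size: int) -> int:
    """Placeholder text parser: count newline bytes chunk by chunk."""
    total = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            total += chunk.count(b"\n")
    return total


# Each kind gets its own streaming routine; the outer flow stays the same.
PARSERS = {
    FileKind.STRUCTURED: count_bytes,
    FileKind.TEXT: count_lines,
    FileKind.COMPRESSED: count_bytes,
}


def run_pipeline(root: str, chunk_size: int = 4096) -> dict:
    """Drive S1-S3 with a small state machine; returns {path: (kind, result)}."""
    state, pending, results = State.SCAN, [], {}
    current, kind = None, None
    while state is not State.DONE:
        if state is State.SCAN:
            for dirpath, _dirs, names in os.walk(root):
                pending.extend(os.path.join(dirpath, n) for n in sorted(names))
            state = State.CLASSIFY
        elif state is State.CLASSIFY:
            if not pending:
                state = State.DONE
            else:
                current = pending.pop()
                with open(current, "rb") as f:
                    kind = classify(f.read(8))  # only the header is loaded
                state = State.PARSE
        elif state is State.PARSE:
            results[current] = (kind, PARSERS[kind](current, chunk_size))
            state = State.CLASSIFY
    return results
```

Keeping the per-type logic behind a dispatch table is one way to reflect the abstract's point that the internal streaming differs by file type while the overall flow is shared.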

Description

Technical field

[0001] The present invention relates to the technical field of document content parsing, and in particular to a method and system for stream parsing of document content.

Background

[0002] With the advent of the era of big data, the number of files transmitted over the Internet has greatly increased, and the Internet is flooded with text files, video files, audio files, and so on. Among the text files, besides ordinary documents, there are a large number of electronic documents, some of which may be secret-related. Official documents, as the main means by which party and government agencies carry out their daily work, are the most important source of secret-related documents. Confidential documents may likewise appear on non-confidential equipment. In order to ensure the safety of national secrecy work, it is urgent to check out the secret-related official documents from the massive files in t...

Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F40/205
CPC: G06F40/205
Inventor: 殷博, 潘飚, 冯静
Owner: 南京中孚信息技术有限公司