Word document fragmentation method and device

A document fragmentation technology, applied in word processing, instruments, computing, etc., that addresses the heavy workload, long processing time, and low retrieval efficiency of existing approaches.
CN107357765B · Active · Publication Date: 2018-11-09 · ZHONGKE DINGFU BEIJING TECH DEV

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
ZHONGKE DINGFU BEIJING TECH DEV
Publication Date
2018-11-09


Abstract

An embodiment of the invention provides a Word document fragmentation method and device, intended to solve the prior-art problem of low retrieval efficiency when retrieving target content in a Word document. The method comprises: first, obtaining all paragraphs of the Word document; second, obtaining the paragraph attributes of each paragraph in the order in which the paragraphs appear in the document, extracting every paragraph attribute at its first appearance, and generating a paragraph attribute set for the document; third, using a paragraph attribute recognition model to extract all headline paragraph attributes from the paragraph attribute set, generating a headline paragraph attribute set; fourth, recognizing all headlines in the Word document according to the headline paragraph attribute set and generating a headline tree of the document, which achieves the fragmentation. A user can then directly retrieve the document paragraphs containing the target content in the fragmented Word document, or retrieve the target content in the headline tree, which improves retrieval efficiency.
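
The four steps in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the patent: the `ParagraphAttr` fields, the `Paragraph` and `HeadlineNode` types, and in particular `is_headline_attr`, a simple rule that stands in for the paragraph attribute recognition model, whose actual design the abstract does not describe. Paragraphs are modeled as plain in-memory objects rather than read from a real .docx file.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class ParagraphAttr:
    """Hypothetical paragraph attribute (formatting shared by similar paragraphs)."""
    font_size: float
    bold: bool
    outline_level: Optional[int]  # None marks body text in this sketch


@dataclass
class Paragraph:
    text: str
    attr: ParagraphAttr


@dataclass
class HeadlineNode:
    title: str
    level: int
    body: List[str] = field(default_factory=list)
    children: List["HeadlineNode"] = field(default_factory=list)


def collect_attribute_set(paragraphs: List[Paragraph]) -> List[ParagraphAttr]:
    """Step 2: keep each attribute once, in order of first appearance."""
    seen: List[ParagraphAttr] = []
    for p in paragraphs:
        if p.attr not in seen:
            seen.append(p.attr)
    return seen


def is_headline_attr(attr: ParagraphAttr) -> bool:
    """Step 3 stand-in (assumption): a rule replaces the recognition model."""
    return attr.outline_level is not None


def build_headline_tree(paragraphs: List[Paragraph]) -> HeadlineNode:
    """Step 4: recognize headlines by attribute and nest them by level."""
    headline_attrs = {a for a in collect_attribute_set(paragraphs)
                      if is_headline_attr(a)}
    root = HeadlineNode(title="<document>", level=0)
    stack = [root]  # path from the root to the most recent headline
    for p in paragraphs:
        if p.attr in headline_attrs:
            level = p.attr.outline_level
            # Pop back to the nearest headline with a smaller level.
            while stack[-1].level >= level:
                stack.pop()
            node = HeadlineNode(title=p.text, level=level)
            stack[-1].children.append(node)
            stack.append(node)
        else:
            # Body text becomes a fragment under the current headline.
            stack[-1].body.append(p.text)
    return root


# Step 1 (assumed input): paragraphs already extracted from a document.
H1 = ParagraphAttr(16.0, True, 1)
H2 = ParagraphAttr(14.0, True, 2)
BODY = ParagraphAttr(10.5, False, None)
doc = [
    Paragraph("1 Overview", H1),
    Paragraph("Intro text.", BODY),
    Paragraph("1.1 Scope", H2),
    Paragraph("Scope text.", BODY),
    Paragraph("2 Method", H1),
    Paragraph("Method text.", BODY),
]
tree = build_headline_tree(doc)
```

In this sketch each headline node carries the body paragraphs that follow it, so a search can be restricted to headline titles first and then to the fragment under a matching headline, which is the retrieval benefit the abstract claims.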

Description

technical field

[0001] The present invention relates to the field of word information processing, in particular to a Word document fragmentation method and device.

Background technique

[0002] The Word document is the native document format of Microsoft Word. Because Microsoft Word holds a dominant position among existing word processing software, the Word document has in practice become an internationally common document format. Therefore, in the prior art, most of the documents that people deal with in work and study are Word documents.

[0003] Generally speaking, the content of a Word document is an article, a report, a thesis, etc., and these Word documents include a title and a body. In a document, the title is usually a summary of the content of a section of text, reflecting the theme of a section of text, and the text is a specific description of its subject content, reflecting the specific content corresponding to the topic; in a docum...
