A Method for Identifying Scientific Formulas in Format Documents
A technique for recognizing formulas in format files, applied in the field of file processing
Embodiment Construction
[0018] To make the object, technical solution, and advantages of the present invention clearer, the invention is described in further detail below with reference to the embodiments and the accompanying drawings.
[0019] As shown in Figure 1, the process of identifying scientific formulas in the format file includes:
[0020] Step 101: Traverse the character stream information extracted from the format file, and perform content-based preprocessing on the character stream.
[0021] The extracted character stream information is preprocessed to remove redundant spaces and other redundant characters that would interfere with layout analysis and with merging operations such as column merging. A content-based method is used to remove the redundant characters, and a structure tree is designed to store the encoding information, coordinate information, and font size information of each character.
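The preprocessing in Step 101 can be illustrated with a minimal sketch. The text does not specify the structure tree's layout or the exact content-based rules, so everything named here is an assumption made for illustration: the `CharNode` and `TreeNode` types, the `preprocess` and `build_tree` functions, and the rule itself (drop non-printable characters, collapse runs of spaces, trim leading and trailing spaces).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CharNode:
    """Leaf of the hypothetical structure tree: one extracted character."""
    code: str         # encoding information (here a one-character string)
    x: float          # coordinate information: horizontal position
    y: float          # coordinate information: vertical position
    font_size: float  # font size information


@dataclass
class TreeNode:
    """Inner node of the hypothetical structure tree (for example, a page)."""
    label: str
    chars: List[CharNode] = field(default_factory=list)
    children: List["TreeNode"] = field(default_factory=list)


def preprocess(stream: List[CharNode]) -> List[CharNode]:
    """Assumed content-based cleanup of the character stream.

    Decisions depend only on character content, not on position:
    non-printable characters are dropped, consecutive spaces are
    collapsed into one, and leading/trailing spaces are removed.
    """
    cleaned: List[CharNode] = []
    for ch in stream:
        if not ch.code.isprintable():
            continue  # redundant character, e.g. a control code
        if ch.code.isspace() and (not cleaned or cleaned[-1].code.isspace()):
            continue  # leading space or repeated space
        cleaned.append(ch)
    if cleaned and cleaned[-1].code.isspace():
        cleaned.pop()  # trailing space
    return cleaned


def build_tree(stream: List[CharNode]) -> TreeNode:
    """Store the cleaned characters under a single root node."""
    return TreeNode(label="page", chars=preprocess(stream))
```

Because each `CharNode` keeps its own coordinates and font size, the cleaned tree still carries what a later layout analysis step would need. A real implementation would likely add intermediate nodes (lines, columns, blocks) below the root; this sketch leaves `children` empty.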
[0022] Step 102: Generate a document layout by applying a layout analysis algorithm to the processed c...