Semantic model-based text information extraction method and device
Patent Information
- Authority / Receiving Office
- CN · China
- Current Assignee / Owner
- ZHONGKE DINGFU BEIJING TECH DEV
- Publication Date
- 2018-01-19
Description
Technical field
[0001] The present application relates to the technical field of text processing, and in particular to a semantic model-based text information extraction method and device.
Background art
[0002] With the explosive growth of information on the Internet, the content of documents is becoming increasingly varied. Because the information people need is buried in content of many different styles, it is increasingly difficult to find. People therefore need information extraction methods to locate the required information in the relevant texts.
[0003] At present, information extraction mainly relies on HTML-structure-based methods. Such a method uses an HTML parser to scan the characters of the HTML text one by one, analyzes the structural hierarchy of the HTML text, numbers identical HTML tags sequentially starting from zero, and finally forms a DOM tree corresponding to the HTML text, and then...
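For illustration only, the HTML-structure-based approach described in [0003] can be sketched with Python's standard `html.parser` module. This is a minimal sketch under stated assumptions, not the parser used in the prior art: the names `Node`, `TreeBuilder`, `build_tree`, and `walk` are hypothetical, and the text's "numbers identical HTML tags sequentially starting from zero" is interpreted here as numbering same-named tags among siblings under one parent.

```python
from html.parser import HTMLParser

# Void elements have no closing tag, so they are never kept on the open-element stack.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class Node:
    """One element of the simplified DOM tree (hypothetical helper class)."""

    def __init__(self, tag, index, parent=None):
        self.tag = tag
        self.index = index      # zero-based number among same-tag siblings (assumption)
        self.parent = parent
        self.children = []
        self.text = ""

    def path(self):
        # Build a path such as /html[0]/body[0]/div[1]/p[0] by walking up to the root.
        parts = []
        node = self
        while node is not None and node.parent is not None:
            parts.append(f"{node.tag}[{node.index}]")
            node = node.parent
        return "/" + "/".join(reversed(parts))


class TreeBuilder(HTMLParser):
    """Scans the HTML text and records the tag hierarchy as a tree of Node objects."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document", 0)
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        parent = self.stack[-1]
        # Number identical tags sequentially from zero within the same parent.
        index = sum(1 for child in parent.children if child.tag == tag)
        node = Node(tag, index, parent)
        parent.children.append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open element; stray end tags are ignored.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        self.stack[-1].text += data.strip()


def build_tree(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def walk(node):
    # Depth-first traversal of the tree, parents before children.
    yield node
    for child in node.children:
        yield from walk(child)


if __name__ == "__main__":
    sample = "<html><body><div><p>a</p><p>b</p></div><div><p>c</p></div></body></html>"
    for n in walk(build_tree(sample)):
        if n.text:
            print(n.path(), n.text)
```

In this sketch, a piece of text is located purely by its numbered tag path, which is why an extraction rule written against one page layout stops matching as soon as the tag structure changes.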