Chapter content tiering method and device, and article content tiering method and device
A chapter and article content tiering technology in the field of special data processing applications and electronic digital data processing. It addresses problems such as the lack of document structure, the difficulty of handling document information, and the failure to consider document structure characteristics, and it saves processing time.
Examples
Embodiment 1
[0031] The analysis process of the present invention is illustrated below through a simple embodiment. For example, suppose a document contains the following chapters, each consisting of paragraphs under a heading.
[0032]
[0033] (1) The hypothesis space of the ID3 algorithm includes all decision trees, so the space it searches is a complete hypothesis space. Because every finite discrete-valued function can be represented by some decision tree, ID3 avoids the risk that the hypothesis space might not contain the target function.
[0034] (2) The ID3 algorithm uses all the current training samples at each step of the search, and decides how to refine the current hypothesis based on the information gain criterion. An advantage of using a statistical property such as information gain is that it greatly reduces the sensitivity to errors in individual training samples, so the algorithm can easily be modified to handle noisy training samples.
[0035] (3) The ID3...
Embodiment 2
[0049] Embodiment 1 explained the hierarchical processing of chapter content through a simple example. Using this method, the different chapters of an article can each be analyzed to obtain multiple subtree merge graphs, as shown in Figures 5A and 5B. For different subtree merge graphs, the relevance of vocabulary at the same level can be judged according to the associated vocabulary. If a correlation exists, the subtree merge graphs are connected through their corresponding associated vocabulary to generate a higher-level tree merge graph (see Figure 5C). For example, the related word list shows that the core words "C4.5 algorithm" and "ID3 algorithm" are both related to "decision tree", so both core words are placed under the hypernym related word "decision tree", forming the structural hierarchy diagram shown in Figure 5C.
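The merge step described in [0049] can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `merge_subtrees`, the dictionary layout of the subtree graphs and of the related word list, and the child entries under each core word are all hypothetical placeholders.

```python
def merge_subtrees(subtrees, related_words):
    """Connect subtree merge graphs that share a hypernym related word.

    subtrees: dict mapping a core word (subtree root) to its child nodes.
    related_words: dict mapping a core word to its hypernym related word
        (an assumed, simplified form of the related word list).
    Returns (merged, unmerged): `merged` maps each hypernym to the subtrees
    placed under it; `unmerged` holds subtrees with no listed hypernym.
    """
    merged = {}
    unmerged = {}
    for core_word, children in subtrees.items():
        hypernym = related_words.get(core_word)
        if hypernym is None:
            # No correlation found, so this subtree stays on its own.
            unmerged[core_word] = children
        else:
            # Place the subtree under its hypernym, one level higher.
            merged.setdefault(hypernym, {})[core_word] = children
    return merged, unmerged


# Hypothetical subtree contents; only the two core words and the
# hypernym "decision tree" come from the text.
subtrees = {
    "ID3 algorithm": ["hypothesis space", "information gain"],
    "C4.5 algorithm": ["placeholder child"],
}
related_words = {
    "ID3 algorithm": "decision tree",
    "C4.5 algorithm": "decision tree",
}
merged, unmerged = merge_subtrees(subtrees, related_words)
```

Here both core words end up under the single higher-level node "decision tree", which corresponds to the structure the text attributes to Figure 5C.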
[0050] In addition, for different subtree merge graphs, Figure 6A The existing...
Embodiment 3
[0052] For chapters without titles, a tree merge graph is formed as follows.
[0053] First, when a chapter input through the input unit is judged to have no title, each of its sentences is divided into words, and the words are ranked by their frequency of occurrence in the chapter. Then, according to the associative vocabulary list, the word most strongly associated with the second level of the multiple-subtree merge graph is found, and the sentences containing that word are placed under the second level as the third level, forming a structural hierarchy diagram.
[0054] Similarly, the tree merge graphs of different chapters can also be merged to form an article information merge graph.
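The procedure in [0053] for an untitled chapter can be sketched in Python as follows. This is a rough illustration under several assumptions that are not in the original: the regex tokenizer stands in for whatever word segmentation the method uses, the associative vocabulary list is modeled as a dictionary from each second-level node to a set of associated words, and "most associated" is approximated by taking the most frequent chapter word that appears in that list. The names `tokenize` and `place_untitled_chapter` and the sample sentences are hypothetical.

```python
import re
from collections import Counter


def tokenize(sentence):
    # Stand-in for real word segmentation: lowercase alphanumeric runs.
    return re.findall(r"\w+", sentence.lower())


def place_untitled_chapter(sentences, second_level_assoc):
    """Attach sentences of an untitled chapter under a second-level node.

    sentences: list of sentence strings from the untitled chapter.
    second_level_assoc: dict mapping each second-level node to the set of
        words associated with it (assumed form of the vocabulary list).
    Returns (node, word, third_level_sentences), or None if no chapter
    word is associated with any second-level node.
    """
    tokenized = [tokenize(s) for s in sentences]
    # Count how often each word occurs across the whole chapter.
    freq = Counter(word for words in tokenized for word in words)
    # Walk the words from most to least frequent.
    for word, _count in freq.most_common():
        for node, assoc_words in second_level_assoc.items():
            if word in assoc_words:
                # Sentences containing the word become the third level.
                third_level = [
                    s for s, words in zip(sentences, tokenized) if word in words
                ]
                return node, word, third_level
    return None


# Hypothetical chapter text and association list.
chapter = [
    "Pruning removes branches that add little accuracy.",
    "Post pruning is applied after the tree is built.",
    "The result is a smaller tree.",
]
assoc = {"C4.5 algorithm": {"pruning"}, "ID3 algorithm": {"entropy"}}
placement = place_untitled_chapter(chapter, assoc)
```

A real system would likely filter stop words before ranking and use a weighted association score; both are omitted here because the text does not specify them.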