Method and system for acquiring important knowledge points in a field
The technology concerns knowledge points and fields and is applied to digital resource processing. It addresses the problems of manual selection of knowledge points, which consumes substantial manpower and material resources, is hard to standardize, and lacks objectivity. Its effects are a reduced workload, savings in time and labor costs, and improved efficiency and accuracy.
Examples
Embodiment 1
[0033] This embodiment provides a method for acquiring important knowledge points in a field; its flow chart is shown in Figure 1. Knowledge points in a field are the words or entries that reflect the knowledge of that field. The method includes the following steps:
[0034] S1: Segment the text into words to obtain the word segmentation result.
[0035] The text here is drawn from selected digital resources in the field. So that the knowledge points covered are broad enough, a larger number of electronic resources in the field is generally selected. For example, in the field of history, e-books covering five thousand years of history and the histories of individual dynasties can be chosen. After the digital resources are selected, the text is extracted from them and then segmented into words. Segmentation yields a large number of words. These words...
Embodiment 2
[0067] This embodiment provides a method for acquiring important knowledge points in a field; its steps are the same as those in Embodiment 1. It additionally provides a specific method for calculating the semantic vector of each candidate knowledge point in that process, as follows:
[0068] The first step is to count the occurrences of each candidate knowledge point in the candidate text, which gives each candidate knowledge point together with its number of occurrences. The candidate text is the text obtained by word segmentation of the selected digital resources, and the candidate knowledge points are the words in the candidate text that remain after common words are excluded. This part is the same as in Embodiment 1 and is not repeated here.
[0069] The second step is to construct the binary tree with the minimum weighted path length from each candidate knowledge point and the number...
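The counting in the first step can be sketched as follows. This is a minimal illustration, not the embodiment's implementation: the `segment` function is a stand-in that splits English text on non-letters (a real pipeline would call a word segmentation tool suited to the source language), and the `COMMON_WORDS` stop list and the sample sentence are hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop list standing in for the "common words" that are excluded.
COMMON_WORDS = {"the", "of", "and", "in", "a", "was", "by"}


def segment(text):
    """Stand-in segmenter (assumption): lowercase and split on non-letters."""
    return [w for w in re.split(r"[^a-z]+", text.lower()) if w]


def candidate_counts(text, common_words=COMMON_WORDS):
    """Count every segmented word that is not a common word."""
    return Counter(w for w in segment(text) if w not in common_words)


# Hypothetical sample text from a history e-book.
sample = "The Han dynasty followed the Qin dynasty. The Qin unified the states."
counts = candidate_counts(sample)
```

Here `counts` maps each candidate knowledge point to its number of occurrences, which is the input the second step needs.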
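A binary tree with the minimum weighted path length is commonly known as a Huffman tree. The sketch below shows one standard way to build such a tree from occurrence counts with a priority queue. It is an illustration under assumptions: the function names, the tuple representation of internal nodes, and the sample counts are all hypothetical and are not taken from the embodiment.

```python
import heapq
import itertools


def build_huffman_codes(counts):
    """Return {word: bit string} for a tree with minimum weighted path length.

    Leaves are words (strings); internal nodes are (left, right) tuples.
    """
    tiebreak = itertools.count()  # keeps heap entries comparable on equal counts
    heap = [(n, next(tiebreak), word) for word, n in counts.items()]
    if not heap:
        return {}
    heapq.heapify(heap)
    if len(heap) == 1:
        return {heap[0][2]: "0"}
    # Repeatedly merge the two lightest subtrees until one tree remains.
    while len(heap) > 1:
        n1, _, left = heapq.heappop(heap)
        n2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (n1 + n2, next(tiebreak), (left, right)))
    # Walk the tree: "0" for a left branch, "1" for a right branch.
    codes = {}
    stack = [(heap[0][2], "")]
    while stack:
        node, prefix = stack.pop()
        if isinstance(node, tuple):
            stack.append((node[0], prefix + "0"))
            stack.append((node[1], prefix + "1"))
        else:
            codes[node] = prefix
    return codes


def weighted_path_length(counts, codes):
    """Sum of (occurrence count x depth of the leaf) over all words."""
    return sum(counts[w] * len(codes[w]) for w in counts)


# Hypothetical occurrence counts for four candidate knowledge points.
example_counts = {"dynasty": 5, "emperor": 3, "qin": 1, "han": 1}
example_codes = build_huffman_codes(example_counts)
example_wpl = weighted_path_length(example_counts, example_codes)
```

Frequent candidates end up close to the root and rare ones deeper, so the total of count times depth is as small as possible.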
Embodiment 3
[0098] Domain encyclopedias are an important digital publishing resource. They usually organize domain information in the form of entries and need to contain the important entries of the domain. However, building a domain encyclopedia requires a great deal of manual effort. This embodiment provides a method for acquiring important domain knowledge points, which here are the entries of a domain encyclopedia. In this embodiment, domain e-book text and newspaper text are used to calculate the semantic vector of each candidate entry with the skip-gram model. The semantic similarity between candidate entries is calculated from the semantic vectors, giving the semantic similarity matrix of all candidate entries. The semantic similarity matrix is then used to determine the important entries among the candidates, after which the domain encyclopedia can be built, or an existing one checked for gaps and completed, according to these important entries, which provides an objective an...
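The step from semantic vectors to a semantic similarity matrix can be sketched as follows. The passage does not state which similarity measure is used, so cosine similarity is an assumption here, and the two-dimensional vectors are toy values rather than output of a trained skip-gram model. The function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_matrix(vectors):
    """Return (entries, matrix) where matrix[i][j] compares entries i and j."""
    entries = list(vectors)
    matrix = [[cosine(vectors[a], vectors[b]) for b in entries] for a in entries]
    return entries, matrix


# Toy semantic vectors for three hypothetical candidate entries.
toy_vectors = {"qin": [1.0, 0.0], "han": [1.0, 1.0], "tax": [0.0, 2.0]}
entries, sim = similarity_matrix(toy_vectors)
```

The resulting matrix is symmetric with values close to 1 on the diagonal; how the important entries are then derived from it is not reproduced here.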