Automatic personalized abstracting method in digital library system

An automatic summarization technology for digital book systems, applied in the field of information processing, that addresses the low coverage of a document's main information and achieves strong anti-interference ability, flexible acquisition, and high accuracy.

Inactive Publication Date: 2013-04-17
成都希创掌中科技有限公司
2 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0007] However, the prior art represented by the above-mentioned patent documents still has the following technical problem: in patent CN 101231634, the weight vector is calculated per sentence, so the summary information is segmented by sentence. As a result, the extracted summary information covers little of the document's main information.

Examples

Embodiment 1

[0049] The most basic implementation mode of the present invention comprises the following steps:

[0050] a. Input query information, the query information includes keywords and personalized information of users;

[0051] b. Establish a relevant model and an irrelevant model according to the input query information. The relevant model is the probability distribution function of the natural language model of the query statement, and is obtained by querying the digital book system with the keywords and taking the top 5-50 documents returned;

[0052] The irrelevant model is the complementary probability distribution function of the relevant model, and is built from the entire document collection in the digital library system;

[0053] In a language model built from the entire document set, query-relevant documents contribute only a small share and query-irrelevant content dominates, so the entire document set can be used to build the irrelevant model.

[0054] c. For each word in the docume...
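Steps b and c can be illustrated with a minimal Python sketch. It is not the patent's implementation: the additive smoothing, the toy collection, the hand-picked top documents (standing in for the top 5-50 results of the keyword query), and the names `unigram_model` and `word_probabilities` are all assumptions made for illustration.

```python
from collections import Counter

def unigram_model(docs, vocab, alpha=1.0):
    """Unigram language model over tokenized docs, with additive smoothing (assumed)."""
    counts = Counter(tok for doc in docs for tok in doc)
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def word_probabilities(document, relevant, irrelevant):
    """Step c: for each word, its probability under the relevant and irrelevant models."""
    return [(w, relevant[w], irrelevant[w]) for w in document]

# Toy stand-in for the digital library's document collection (hypothetical data).
collection = [
    "library books loan desk".split(),
    "neural summary of digital books".split(),
    "digital library summary method".split(),
    "weather report rain sunny".split(),
]
# Assumption: these are the documents a keyword query returned as top results.
top_docs = collection[1:3]

vocab = {tok for doc in collection for tok in doc}
relevant = unigram_model(top_docs, vocab)      # built from the top-ranked documents
irrelevant = unigram_model(collection, vocab)  # built from the whole collection

probs = word_probabilities(collection[2], relevant, irrelevant)
```

In this toy data, a query-related word such as "summary" is more probable under the relevant model than under the irrelevant one, while "weather" is less probable.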

Embodiment approach

[0062] The automatic summarization system introduced here uses language-model techniques for document processing and weighting, and uses word frequency statistics to weight sentences. The processing flow of automatic summarization is shown in Figure 1: after the user inputs query information, a relevant model and an irrelevant model are established from it, and a personalized summary is generated through the word sequence analysis model (WSA for short).

[0063] The process of summary extraction by the relevant model and the irrelevant model:

[0064] We extract summary information based on statistical language models. In our research, two language models are constructed: one is a relevant model; the other is an irrelevant model. The relevant model is the probability distribution function of the natural language model of the query statement. In contrast, uncor...

Abstract

The invention discloses a personalized automatic summarization method in a digital book system, relating to the technical field of information processing. The method comprises: a. inputting query information; b. establishing a relevant model and an irrelevant model according to the input query information; c. for each word in the document from which summary information is to be obtained, calculating the probability that the word is generated under the relevant model and under the irrelevant model; d. saving the relevance degree of each keyword in a queue; e. selecting a group of consecutive keyword relevance values in the queue, the document fragment with the highest relevance being used as a document abstract; f. judging against a threshold whether to continue searching for the next abstract; g. if so, returning to step e; if not, returning all documents in the summary data collection as summary information. The method is more accurate than article summaries obtained by traditional summarization algorithms, and it shows strong anti-interference ability when simulating real data conditions.
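Steps d through g can be sketched as follows. This is a hedged illustration, not the patented procedure: the abstract does not say how the best group of consecutive values is found, so the sketch assumes a maximum-sum contiguous run (Kadane's algorithm), assumes the per-word relevance values are already computed, and masks each extracted fragment before searching again. The names `best_segment` and `extract_summaries` are hypothetical.

```python
def best_segment(scores):
    """Kadane's algorithm: (sum, start, end) of the max-sum contiguous run, end exclusive."""
    best_sum, best_start, best_end = scores[0], 0, 1
    cur_sum, cur_start = 0.0, 0
    for i, s in enumerate(scores):
        if cur_sum <= 0:
            cur_sum, cur_start = s, i  # start a new run at position i
        else:
            cur_sum += s               # extend the current run
        if cur_sum > best_sum:
            best_sum, best_start, best_end = cur_sum, cur_start, i + 1
    return best_sum, best_start, best_end

def extract_summaries(words, scores, threshold):
    """Repeatedly take the most relevant fragment until none exceeds the threshold."""
    queue = list(scores)  # step d: per-word relevance values kept in order
    summaries = []
    while queue:
        total, start, end = best_segment(queue)  # step e
        if total <= threshold:                   # step f: stop searching
            break
        summaries.append(" ".join(words[start:end]))
        for i in range(start, end):              # assumed: mask words already used
            queue[i] = float("-inf")
    return summaries                             # step g: return collected fragments

# Hypothetical relevance values, e.g. derived from the two models' word probabilities.
words = "the digital library offers a personalized summary today".split()
scores = [-1.0, 2.0, 1.5, -0.5, -4.0, 1.0, 2.0, -1.0]
print(extract_summaries(words, scores, threshold=2.0))
# prints ['digital library', 'personalized summary']
```

Raising the threshold shortens the output: with `threshold=10.0` no fragment qualifies and the list is empty.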

Description

Technical field

[0001] The invention relates to the technical field of information processing, in particular to a personalized automatic summarization method in a digital book system.

Background technique

[0002] Query-based automatic summarization returns, for a given document, one or more pieces of summary information related to the query. When a text collection is established or updated, each document is automatically divided into multiple discrete pieces of summary information.

[0003] In current automatic summarization, one method pre-estimates the length of the summary information based on documents related to the current document. Once the approximate summary length is determined, the information segment of that length that best matches the query is taken as the article summary.

[0004] Another method is to divide the document into one or more semantic information blocks through preprocessing. After the semantic information block is determined, t...

Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F17/30
Inventor: 李庆, 刘家芬, 罗旭斌, 张晨, 胡川
Owner: 成都希创掌中科技有限公司