
A method to extract desired content from text

A text content extraction technique in the field of computer programs that addresses the problem of low extraction efficiency.

Active Publication Date: 2018-03-30
广州极盛信息科技开发有限公司

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to provide a method for extracting required content from text that solves the problem of low extraction efficiency.



Examples


Embodiment Construction

[0030] The present invention is further described below with reference to the drawings and specific embodiments.

[0031] As shown in Figure 1, a method for extracting required content from text includes the following steps:

[0032] Step S1: receive the keywords set by the user and the weight of each keyword, and receive the text uploaded by the user. The user may upload multiple texts, and texts may also be obtained by other means, such as online collection. Multiple keywords may be set; for example, with two keywords, Jack Ma and Listing, the weight of Jack Ma is 0.5 and the weight of Listing is 0.3.

[0033] Step S2: according to Formula 1, select each text whose correlation degree is greater than a preset value (e.g., 20%) as a target text.

[0034] Formula 1 is defined in terms of the following quantities: G is the correlation degree of the text, P_n is the number of occurrences of the nth keyword in the text, M_n is the weight of the nth keyword, L ...
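The filtering in steps S1 and S2 can be sketched roughly as follows. Formula 1 itself is not reproduced in this text, so the scoring function below is a stand-in assumption (weighted keyword occurrences divided by the number of whitespace-separated tokens), not the patent's formula. The names `correlation` and `select_targets` and the sample texts are hypothetical.

```python
def correlation(text: str, weights: dict[str, float]) -> float:
    """Stand-in score (an assumption, NOT Formula 1): weighted keyword
    occurrences divided by the number of whitespace-separated tokens."""
    tokens = text.split()
    if not tokens:
        return 0.0
    # P_n is approximated by a plain substring count, M_n by the user weight.
    weighted_hits = sum(text.count(kw) * w for kw, w in weights.items())
    return weighted_hits / len(tokens)


def select_targets(texts: list[str], weights: dict[str, float],
                   threshold: float = 0.2) -> list[str]:
    """Keep the texts whose score is strictly greater than the preset value."""
    return [t for t in texts if correlation(t, weights) > threshold]


# Keywords and weights taken from the example in step S1.
weights = {"Jack Ma": 0.5, "Listing": 0.3}
texts = [
    "Listing: Jack Ma",                              # hypothetical sample
    "The weather was mild across the region today",  # hypothetical sample
]
targets = select_targets(texts, weights)
```

With this stand-in score, only the first sample exceeds the 20% preset value; a real implementation would substitute the patent's Formula 1 for `correlation`.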



Abstract

The invention relates to a method and a device for extracting required content from text. The method includes: preprocessing the target text with a preset word segmentation package so that each word in each sentence of the target text is assigned a part-of-speech category, the categories being subject, predicate, object, attributive, adverbial, and complement; receiving the label input by the user and extracting from it the subject and the target search word required by the user; counting, for each sentence of the target text, the total number of occurrences of the target search word and its synonyms according to a preset synonym thesaurus and semantic-field synonyms, and saving the sentences whose total exceeds a preset threshold to an extraction library; calculating a vector value for each sentence in the extraction library and, from the vector values, the included angle between every two sentences; and, if the included angle between two sentences is smaller than a preset angle, randomly deleting one of the two. The invention can effectively improve the efficiency of text content extraction.
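The last two stages of the abstract (synonym counting and angle-based removal of near-duplicate sentences) can be sketched as follows. This is a minimal illustration under stated assumptions: the abstract does not say how sentence vectors are computed, so bag-of-words count vectors are assumed; a plain Python set stands in for the synonym thesaurus; and word segmentation and label parsing are skipped. All function names and sample data are hypothetical.

```python
import math
import random
from collections import Counter


def synonym_hits(sentence: str, synonyms: set[str]) -> int:
    """Count tokens that are the target search word or one of its synonyms."""
    return sum(1 for tok in sentence.lower().split() if tok in synonyms)


def angle_degrees(a: str, b: str) -> float:
    """Included angle between bag-of-words vectors (an assumed vectorization)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 90.0  # treat an empty sentence as unrelated
    cos = max(-1.0, min(1.0, dot / (norm_a * norm_b)))  # guard rounding error
    return math.degrees(math.acos(cos))


def extract(sentences, synonyms, hit_threshold=0, min_angle=30.0, rng=None):
    """Build the extraction library, then drop one of each too-similar pair."""
    rng = rng or random.Random()
    kept = [s for s in sentences if synonym_hits(s, synonyms) > hit_threshold]
    i = 0
    while i < len(kept):
        j, removed_i = i + 1, False
        while j < len(kept):
            if angle_degrees(kept[i], kept[j]) < min_angle:
                if rng.random() < 0.5:   # randomly delete one of the two
                    del kept[i]
                    removed_i = True
                    break
                del kept[j]
            else:
                j += 1
        if not removed_i:
            i += 1
    return kept


sentences = [
    "the company plans a listing",
    "the company plans a listing soon",
    "rain fell on the hills",
]
synonyms = {"listing", "ipo", "flotation"}  # hypothetical synonym set
result = extract(sentences, synonyms)
```

Here the third sentence never enters the extraction library, and the first two are about 24 degrees apart under this vectorization, so one of them is removed at random.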

Description

Technical field

[0001] The present invention relates to computer programs.

Background

[0002] The report is an official document used when reporting work, reflecting the situation, making suggestions, and answering inquiries from higher authorities. At the same time, the report is the way to do things, the basis for success, and the prerequisite for leaders to make correct judgments and decisions. In recent years, with the impetus of the market economy, reporting has become a new industry, and the use of reports has gradually expanded, including new product development, investment and financing, company development planning, and annual development. The institutions currently writing the report include national universities, social sciences, research associations, research institutes, think tanks and other national research institutions, such as: Chinese Academy of Sciences, Chinese Academy of Social Sciences, Peking University, Tsinghua University, China Non-state...


Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F17/27
Inventor: 彭宏利
Owner: 广州极盛信息科技开发有限公司