Method and system for extracting content information on internet

Content information and Internet technology, applied in the fields of website content management, network data retrieval, and special data processing applications. It addresses the problems that users' browsing speed is slowed, that text content cannot be obtained and saved, and that users are restricted in extracting information, with the effect of improving browsing speed.

Status: Inactive
Publication Date: 2018-06-15
FOSHAN DAOJING TECH CO LTD
8 Cites, 3 Cited by

AI Technical Summary

Problems solved by technology

At present, the RSS feeds provided by many information content sites do not cover all the information on the site, but only a small part of it. Content that the RSS feed does not provide cannot be obtained with existing technologies, which limits the user's initiative in extracting information.
[0010] 3) Text content cannot be obtained and saved through RSS.
Current RSS feeds provide only a link to the text's address, not the text itself. Users must visit the URL that the link points to in order to read the text, which reduces their browsing speed.
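
The limitation described above can be illustrated with a small sketch. The feed below is a hypothetical RSS 2.0 document (the site, titles, and URLs are invented for illustration), and `list_items` is an assumed helper name, not something the source defines. It shows that each item exposes a title and a link but no body text, so a reader would still have to fetch every link to see the article.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed: each item carries a title and a link but no body text.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example site</title>
    <item>
      <title>First article</title>
      <link>https://example.com/articles/1</link>
    </item>
    <item>
      <title>Second article</title>
      <link>https://example.com/articles/2</link>
    </item>
  </channel>
</rss>"""


def list_items(rss_text):
    """Return (title, link, has_body) for every <item> in an RSS 2.0 feed."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        # <description> is where a feed would carry body text, if it had any.
        body = item.findtext("description", default="")
        items.append((title, link, bool(body.strip())))
    return items


if __name__ == "__main__":
    for title, link, has_body in list_items(SAMPLE_RSS):
        print(title, link, "body present" if has_body else "link only")
```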


Examples

Embodiment Construction

[0046] Figure 1 shows a flow chart of a method for extracting content information on the Internet according to an embodiment of the present invention. As shown in Figure 1, the method includes the following steps:

[0047] S100. Respond to an input instruction specifying the content to be extracted;

[0048] S200. Recognize the input instruction, and select a classification category according to the recognition result;

[0049] S300. Analyze the input instruction by using the classification category, so as to generate a content set associated with the classification category, where the content set includes the content to be extracted;

[0050] S400. Sort the content set according to its correlation coefficient with the input instruction, and show the result on the display.
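
Steps S100 to S400 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the category vocabulary `CATEGORY_KEYWORDS`, the `Page` record, and every function name are hypothetical, and the `relevance` score is a simple word-overlap stand-in because the source's formula for the coefficient C is not fully legible in this text.

```python
from dataclasses import dataclass

# Hypothetical category vocabulary; the source does not list its categories.
CATEGORY_KEYWORDS = {
    "news": {"news", "headline", "report"},
    "sports": {"match", "score", "league"},
}


@dataclass
class Page:
    url: str
    category: str
    text: str


def tokenize(s):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return s.lower().split()


def select_category(instruction):
    """S200: pick the category whose keywords overlap the instruction most."""
    words = set(tokenize(instruction))
    return max(CATEGORY_KEYWORDS, key=lambda c: len(words & CATEGORY_KEYWORDS[c]))


def build_content_set(instruction, category, pages):
    """S300: keep pages in the category that share a word with the instruction."""
    words = set(tokenize(instruction))
    return [
        p for p in pages
        if p.category == category and words & set(tokenize(p.text))
    ]


def relevance(instruction, page):
    """Stand-in score (shared-word count); NOT the source's coefficient C."""
    return len(set(tokenize(instruction)) & set(tokenize(page.text)))


def extract(instruction, pages):
    """S100-S400: take an instruction and return the ranked content set."""
    category = select_category(instruction)
    content = build_content_set(instruction, category, pages)
    return sorted(content, key=lambda p: relevance(instruction, p), reverse=True)


if __name__ == "__main__":
    pages = [
        Page("a", "sports", "league match score tonight"),
        Page("b", "sports", "match preview"),
        Page("c", "news", "league headline"),
    ]
    for page in extract("league match score", pages):
        print(page.url)
```

Swapping `relevance` for the actual correlation coefficient would leave the rest of the pipeline unchanged, since sorting only depends on a per-item score.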

[0051] The correlation coefficient is calculated according to the following formula:

[0052] C=R*(M+N) N

[0053] C is the correlation coefficient, R is the category correlation degre...


Abstract

The invention provides a method for extracting content information on the Internet. The method comprises the following steps: responding to an input instruction specifying the content to be extracted; recognizing the input instruction and selecting a classification category according to the recognition result; analyzing the input instruction by using the classification category to generate content sets associated with the classification category, wherein the content sets contain the content to be extracted; and sorting the content sets according to the correlation coefficients between the content sets and the input instruction and showing them on a display. With this scheme, a setting interface is provided to users so that they can directly obtain the content information in a target web page. Users no longer need to depend passively on whether an information website publishes RSS (Really Simple Syndication) or on what that RSS contains, and richer, more detailed information can be extracted from a wider range of information sources. In addition, the content information can be stored locally for users to visit, thereby improving their browsing speed.
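
The idea of obtaining a page's text directly and keeping it locally can be sketched as below. This is an illustrative assumption about one way to do it, not the method the patent claims: the names `TextExtractor`, `html_to_text`, `save_page_text`, and `load_page_text` are hypothetical, the HTML is passed in as a string rather than fetched, and the cache layout (one file per URL, named by a SHA-256 digest) is an arbitrary choice.

```python
import hashlib
from html.parser import HTMLParser
from pathlib import Path


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Return the page's visible text, one text fragment per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


def cache_path(url, cache_dir):
    """Map a URL to a stable file name inside cache_dir."""
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
    return Path(cache_dir) / f"{digest}.txt"


def save_page_text(url, html, cache_dir):
    """Extract the text of a page and store it locally; return the file path."""
    path = cache_path(url, cache_dir)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(html_to_text(html), encoding="utf-8")
    return path


def load_page_text(url, cache_dir):
    """Return the locally stored text for url, or None if it is not cached."""
    path = cache_path(url, cache_dir)
    return path.read_text(encoding="utf-8") if path.exists() else None
```

Once a page has been saved, later reads go through `load_page_text` and never touch the network, which is the browsing-speed benefit the abstract describes.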

Description

Technical field

[0001] The invention relates to the field of communication technology, in particular to a method and system for extracting content information on the Internet.

Background

[0002] With the development of the Internet, the information content it contains has reached a massive scale, but this content is scattered across thousands of sites on the Internet, which makes browsing very inconvenient. Under such circumstances, Internet content extraction technology is receiving more and more attention: it can actively extract information content and provide raw data for content aggregation, content mining, content publishing, and other services.

[0003] Extraction of Internet information content and search engines are different concepts. A search engine uses the keywords entered by the user to find web pages related to those keywords, and lists and displays the addresses of the web pages that meet ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/9566, G06F16/958
Inventor: 王森
Owner: FOSHAN DAOJING TECH CO LTD