
Method for judging news list page and method for screening news list page

A technology for identifying and screening news list pages, applied in network data navigation, special-purpose data processing applications, instruments, and related fields. It addresses the problems of vague collection targets, high cost, and low data collection efficiency, and improves collection efficiency.

Active Publication Date: 2014-12-03
INST OF COMPUTING TECH CHINESE ACAD OF SCI
5 Cites, 12 Cited by

AI Technical Summary

Problems solved by technology

[0002] The Internet is an important channel for news. Both the public and businesses rely on it to obtain the news they care about. However, the types of pages found on websites are quite mixed. For example, some comprehensive media websites contain a large number of other content pages besides news pages, so users usually pay a high cost when searching for news.
[0003] At present, some news collection tools can automatically search for news pages within a user-specified website, collect all the news pages, and then find the news content that the user cares about. Because their collection targets are relatively vague, such tools often examine a large number of non-news pages, and may even follow website links that extend the collection beyond the user-specified websites, so data collection efficiency is very low.


Examples


Embodiment Construction

[0048] The present invention is described in further detail below with reference to the accompanying drawings and embodiments. The specific embodiments described here serve only to explain the present invention, not to limit it. For convenience of description, the drawings show only the parts related to the present invention rather than all content.

[0049] Before setting forth the various embodiments of the present invention, the relevant concepts involved are explained first:

[0050] "News list page" means: a page whose main content consists of linked news titles (possibly with a brief summary below each title) or news pictures.

[0051] "Channel" means: each item in the navigation bar of a news site. For example, each item in the site navigation bar shown in Figure 4 is considered a channel.

[0052] "Parent page" (also known as father page) and "sub-page" refer to two web pages connected by a network link ...


Abstract

The invention provides a method for judging a news list page and a method for screening news list pages. The judging method comprises the following steps: acquiring a web page and judging whether it is a news page; if it is not a news page, collecting its sub-pages and repeating the judging process on them; if it is a news page and is judged to be a news page within a channel, judging whether its parent page is a news page; if the parent page is not a news page, recording the association information between the page and its parent page; and judging the news list page according to the association information, among other steps. By using the method to find news list pages, an existing news collector can take a news list page directly as its start page for collecting news content, which improves the efficiency of news data collection.
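
The steps in the abstract can be sketched roughly as follows. This is only an illustrative sketch under several assumptions, not the patented implementation: the site is modelled as an in-memory link map instead of being fetched over the network, `is_news_page` and `in_channel` are hypothetical caller-supplied predicates (the abstract does not say how either judgment is made), and the final rule "a parent with at least `min_news_children` recorded news sub-pages is a news list page" is an assumed stand-in for the unspecified judgment based on the association information.

```python
from collections import defaultdict, deque
from typing import Callable, Dict, List, Set


def find_news_list_pages(
    start_url: str,
    links: Dict[str, List[str]],
    is_news_page: Callable[[str], bool],
    in_channel: Callable[[str], bool],
    min_news_children: int = 2,  # assumed threshold, not from the source
) -> Set[str]:
    """Return parent pages judged to be news list pages (illustrative)."""
    # Association information: non-news parent -> news sub-pages it links to.
    news_children: Dict[str, Set[str]] = defaultdict(set)

    # A start page that is itself a news page has no parent to record.
    if is_news_page(start_url):
        return set()

    visited = {start_url}
    queue = deque([start_url])  # only non-news pages are ever queued

    while queue:
        parent = queue.popleft()
        for child in links.get(parent, []):
            if is_news_page(child):
                # The parent is non-news by construction, so the
                # "parent is not a news page" condition already holds.
                if in_channel(child):
                    news_children[parent].add(child)
            elif child not in visited:
                # Non-news page: collect it so its sub-pages get judged too.
                visited.add(child)
                queue.append(child)

    # Assumed judgment rule based on the recorded associations.
    return {
        page
        for page, children in news_children.items()
        if len(children) >= min_news_children
    }


# Hypothetical site: paths starting with "/news/" stand in for news pages.
demo_links = {
    "/": ["/sports", "/about", "/news/0"],
    "/sports": ["/news/1", "/news/2", "/news/3", "/"],
    "/about": ["/contact"],
}
result = find_news_list_pages(
    "/",
    demo_links,
    is_news_page=lambda url: url.startswith("/news/"),
    in_channel=lambda url: True,
)
print(result)
```

In this toy site only `/sports` is reported, because the home page links to a single news page and falls below the assumed threshold. A news collector could then start from the returned pages rather than crawling the whole site.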

Description

Technical field

[0001] The invention relates to the field of computer data processing, in particular to a method for judging news list pages and a method for screening news list pages.

Background

[0002] The Internet is an important channel for news. Both the public and businesses rely on it to obtain the news they care about. However, the types of pages found on websites are quite mixed. For example, some comprehensive media websites contain a large number of other content pages besides news pages, so users usually pay a high cost when searching for news.

[0003] At present, some news collection tools can automatically search for news pages within a user-specified website, collect all the news pages, and then find the news content that the user cares about. Because their collection targets are relatively vague, such tools often ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/9535, G06F16/954, G06F16/955
Inventor: 刘晓娜, 张凯, 程学旗, 刘悦, 张瑾, 余智华
Owner INST OF COMPUTING TECH CHINESE ACAD OF SCI