Novel method and device for incremental collection of Chinese news pages

This page-and-news technology, applied to the incremental collection of Chinese news pages, addresses the low efficiency of information processing. It avoids repeated collection and improves both efficiency and retrieval quality.

Inactive Publication Date: 2012-12-19
INST OF SCI & TECHN INFORMATION OF CHINA
Cites: 2. Cited by: 4

AI Technical Summary

Problems solved by technology

[0008] The technical problem to be solved by the present invention is to provide a new method and device for incremental collection of Chinese news pages that effectively overcomes a defect of current Chinese news page collection methods: the low efficiency of information processing caused by repeated collection of news pages.



Examples


Embodiment Construction

[0073] In order to make the above objects, features and advantages of the present invention more comprehensible, the present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0074] One of the core ideas of the present invention is to provide a novel incremental collection method for Chinese news pages, comprising four steps: identifying stable pages to obtain the identified stable pages; performing the corresponding operation on a news page classifier to obtain a generated news page classifier; collecting newly added pages to obtain the collected newly added pages; and identifying news pages to obtain the identified news pages. This method effectively solves the low information-processing efficiency that repeated collection of news pages causes in current Chinese news page collection methods.
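The four steps named in [0074] can be illustrated with a minimal Python sketch. The text available here does not define how stable pages are recognized or how the classifier is generated, so everything below is an assumption made for illustration only: a page is treated as "stable" when it appears in every crawl snapshot, the "classifier" is a toy rule learned from the first URL path segment of labeled examples, and all function names and the `example.com` URLs are hypothetical.

```python
from typing import Callable, Dict, List, Set
from urllib.parse import urlsplit

# A snapshot maps each crawled URL to the set of URLs it links to.
# This in-memory structure stands in for a real crawler (assumption).
Snapshot = Dict[str, Set[str]]


def first_path_segment(url: str) -> str:
    """Return the first non-empty path segment of a URL, or '' if none."""
    parts = [p for p in urlsplit(url).path.split("/") if p]
    return parts[0] if parts else ""


def identify_stable_pages(snapshots: List[Snapshot]) -> Set[str]:
    """Step 1 (assumed rule): a page is stable if every snapshot contains it."""
    if not snapshots:
        return set()
    stable = set(snapshots[0])
    for snap in snapshots[1:]:
        stable &= set(snap)
    return stable


def build_news_classifier(labeled: Dict[str, bool]) -> Callable[[str], bool]:
    """Step 2 (toy stand-in): keep path segments seen only on news examples."""
    news = {first_path_segment(u) for u, is_news in labeled.items() if is_news}
    other = {first_path_segment(u) for u, is_news in labeled.items() if not is_news}
    news_segments = news - other
    return lambda url: first_path_segment(url) in news_segments


def collect_new_pages(stable: Set[str], latest: Snapshot, seen: Set[str]) -> Set[str]:
    """Step 3: follow links on stable pages only, skipping URLs already collected."""
    new_pages: Set[str] = set()
    for page in stable:
        new_pages |= latest.get(page, set()) - seen
    return new_pages


def identify_news_pages(pages: Set[str], is_news: Callable[[str], bool]) -> Set[str]:
    """Step 4: keep only the newly added pages that the classifier accepts."""
    return {u for u in pages if is_news(u)}


# Hypothetical two-day example.
day1: Snapshot = {
    "http://example.com/": {
        "http://example.com/news/a.html",
        "http://example.com/about/contact.html",
    },
    "http://example.com/news/a.html": set(),
}
day2: Snapshot = {
    "http://example.com/": {
        "http://example.com/news/a.html",
        "http://example.com/news/b.html",
        "http://example.com/ads/x.html",
    },
    "http://example.com/news/b.html": set(),
}
labeled = {
    "http://example.com/news/a.html": True,
    "http://example.com/about/contact.html": False,
}
seen = set(day1) | set().union(*day1.values())

stable = identify_stable_pages([day1, day2])
is_news = build_news_classifier(labeled)
new_pages = collect_new_pages(stable, day2, seen)
news_pages = identify_news_pages(new_pages, is_news)
```

In this sketch, repeated collection is avoided because step 3 subtracts the `seen` set before anything is fetched or classified, so only pages that were not collected in an earlier round reach step 4.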

[0075] Refer to Figure 1, which shows a schematic flow chart of Embodiment 1 of a novel incremental a...


Abstract

The invention provides a novel method and device for incremental collection of Chinese news pages. The method comprises the following steps: identifying stable pages to obtain the identified stable pages; performing the corresponding operation on a news page classifier to obtain a generated news page classifier; collecting newly added pages to obtain the collected newly added pages; and identifying news pages to obtain the identified news pages. The method effectively solves the problem in existing Chinese news page collection methods that information processing efficiency is low because news pages are collected repeatedly.

Description

Technical field

[0001] The invention relates to the fields of information retrieval and data integration, in particular to a novel method and device for incremental collection of Chinese news pages.

Background art

[0002] Since its birth in the early 1990s, the Web has developed at an astonishing speed. It has become the largest information repository in the world, covering all fields of the real world, and the main way people obtain information in their work and life. Web information is published mainly in the form of web pages. According to the latest estimates, the number of web pages on the Web has exceeded 550 billion. Obviously, manual access can no longer meet people's needs for information acquisition. To allow people to access and use the massive information on the Web more effectively, researchers have started research in the field of Web information sea...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventor: 刘伟
Owner: INST OF SCI & TECHN INFORMATION OF CHINA