Method and system for identifying newspage attributive characters

Attribute-feature and web page technology, applied in website content management, network data retrieval, special data processing applications, etc. It addresses the lack of an identification scheme for junk news, which interferes with users and affects their news reading, and so achieves the effect of avoiding that interference.

Active Publication Date: 2014-03-05
BEIJING QIHOO TECH CO LTD
2 Cites, 13 Cited by

AI Technical Summary

Problems solved by technology

Obviously, the headline information provided by the news web page has nothing to do with the text information. It is not normal news but "junk news", which only disturbs users and affects their normal news reading.
"Junk news" should be identified and dealt with in a timely manner so that it does not interfere with users, but there is currently no effective identification scheme.

Examples

Embodiment Construction

[0039] Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided for more thorough understanding of the present disclosure and to fully convey the scope of the present disclosure to those skilled in the art.

[0040] As shown in Figure 1, an embodiment of the present invention provides a method for identifying the attribute features of a news web page, which includes: Step 110, extracting title information and text information from the captured news web page. For example, in the aforementioned news web page, "Come to XXX to play games and watch movies" is the title information and "In a slave society..." is the text informat...
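As an illustration of Step 110, the following is a minimal sketch of pulling title information and text information out of a captured page. The source does not say how the extraction is done, so everything here is an assumption: the title is taken from the `<title>` element, the text from `<p>` elements, and the names `TitleAndTextExtractor` and `extract_title_and_text` are hypothetical.

```python
from html.parser import HTMLParser


class TitleAndTextExtractor(HTMLParser):
    """Hypothetical helper: collects <title> content as the title
    and <p> content as the body text (both choices are assumptions)."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._in_paragraph = False
        self.title_parts = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "p":
            self._in_paragraph = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "p":
            self._in_paragraph = False

    def handle_data(self, data):
        # Route character data to the title or the body text,
        # depending on which element is currently open.
        if self._in_title:
            self.title_parts.append(data)
        elif self._in_paragraph:
            self.text_parts.append(data)


def extract_title_and_text(html):
    """Return (title, text) with runs of whitespace collapsed."""
    parser = TitleAndTextExtractor()
    parser.feed(html)
    parser.close()
    title = " ".join("".join(parser.title_parts).split())
    text = " ".join(" ".join(parser.text_parts).split())
    return title, text


page = (
    "<html><head><title>Come to XXX to play games and watch movies</title></head>"
    "<body><p>In a slave society, businessmen without property rights are weak.</p></body></html>"
)
title, text = extract_title_and_text(page)
```

Real news pages rarely keep the headline only in `<title>` or the article only in `<p>` tags, so a production extractor would likely need site-specific rules; the sketch only shows the shape of the step.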

Abstract

The invention provides a method and a system for identifying the attribute features of news web pages. The method includes: extracting title information and text information from a captured news web page; analyzing the title information and the text information separately, and calculating the matching rate between them from the resulting analysis data; and judging the attribute feature of the news web page according to that matching rate. The advantage of the method is that the attribute feature of a news web page can be identified. Because the attribute feature reflects the relevance between the page's title information and its text information, "junk" web pages can be identified.
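The analyze, match, and judge steps of the abstract can be sketched roughly as follows. The patent text available here does not specify the analysis method or how the matching rate is defined, so this is only one plausible reading: both strings are split into lowercase word sets, the matching rate is the fraction of title words that also occur in the text, and a page below a cutoff is labelled "junk". The stop-word list, the `threshold` value of 0.3, and all function names are assumptions made for illustration.

```python
import re

# Assumed stop words; the source does not mention any filtering.
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "of", "on", "the", "to"}


def tokenize(s):
    """Lowercase the string and return its set of non-stop-word tokens."""
    words = re.findall(r"[a-z0-9]+", s.lower())
    return {w for w in words if w not in STOP_WORDS}


def matching_rate(title, text):
    """Fraction of title words that also appear in the text (assumed definition)."""
    title_words = tokenize(title)
    if not title_words:
        return 0.0
    return len(title_words & tokenize(text)) / len(title_words)


def classify_page(title, text, threshold=0.3):
    """Label the page 'normal' or 'junk'; the threshold is a hypothetical value."""
    return "normal" if matching_rate(title, text) >= threshold else "junk"


junk_label = classify_page(
    "Come to XXX to play games and watch movies",
    "In a slave society, businessmen without property rights are weak.",
)
normal_rate = matching_rate(
    "City council approves new budget",
    "The city council voted on Tuesday to approve a new budget for schools.",
)
```

With these assumptions the example from the background section is labelled "junk", because none of its title words occur in its text. Exact word overlap misses inflected forms ("approves" versus "approve"), so a real system would probably add stemming or, for Chinese pages, word segmentation.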

Description

Technical field

[0001] The invention relates to a method and system for identifying attribute features of news web pages.

Background art

[0002] At present, there are a large number of news web pages on the Internet, offering a wealth of news that users browse to keep up with the latest events. Many news web pages now have irrelevant news information implanted in them. This information is usually worthless to users; it is "junk news" that only interferes with their browsing.

[0003] For example, the title information of a news web page is "Come to XXX to play games and watch movies", and the text information is: "In a slave society, businessmen without property rights are weak. In a slave society where legal labor income cannot be guaranteed, the development of commercial civilization is absolutely unfeasible...". Obviously, the headline information provided by the news web page has nothing to do with the text information. It is not normal news, but "gar...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F17/27
CPC: G06F16/958, G06F40/30
Inventor: 韩孟岗
Owner: BEIJING QIHOO TECH CO LTD