Web page classifying method and apparatus

A web page classification method and apparatus, applied in the Internet field, that addresses problems of existing approaches such as high time cost, accuracy that cannot be guaranteed, and system timeliness that cannot be guaranteed.

Active Publication Date: 2015-10-07
北京鸿享技术服务有限公司

AI Technical Summary

Problems solved by technology

[0003] Current webpage classification technology is mainly semi-automatic: webpages are classified by an algorithm and then reviewed manually. In the algorithm stage, a traditional classification algorithm (such as naive Bayes) produces an initial classification of the webpages, but the main problem at this stage is that accuracy cannot be guaranteed. In the manual review stage, human review is generally required in order to improve classification accuracy.
[0004] Because the above scheme is semi-automated, it cannot meet requirements when a large amount of data needs to be classified. Because the webpage categories are generally defined manually in advance, scalability is poor. And because every webpage must pass through two stages, the second of which (manual review) has a very high time cost, the timeliness of the entire system cannot be guaranteed.

Examples

Embodiment Construction

[0109] Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided so that the present disclosure can be understood more thoroughly and its scope fully conveyed to those skilled in the art.

[0110] Referring to Figure 1, a flowchart of the steps of a method for classifying webpages according to Embodiment 1 of the present invention is shown.

[0111] Step 101, parsing multiple webpage elements from the webpage to be predicted.

[0112] This embodiment of the present invention predicts the webpage classification from webpage elements. A webpage element is a part of the webpage to be predicted and can include, for example, any number of ...
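
As an illustration of Step 101, the following is a minimal sketch of parsing several elements out of a webpage, using only the Python standard library. The source text is truncated before it lists which elements are used, so the choice of the title and the keywords and description meta tags is an assumption made for this example, and the names ElementParser and parse_elements are hypothetical.

```python
from html.parser import HTMLParser


class ElementParser(HTMLParser):
    """Collects a few page elements. The element choice is an assumption
    for illustration; the source does not list the elements it uses."""

    def __init__(self):
        super().__init__()
        self.elements = {"title": "", "keywords": "", "description": ""}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # HTMLParser lowercases attribute names, not attribute values.
            name = (attrs.get("name") or "").lower()
            if name in ("keywords", "description"):
                self.elements[name] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self.elements["title"] += data


def parse_elements(html):
    """Return a dict mapping element name to its stripped text."""
    parser = ElementParser()
    parser.feed(html)
    parser.close()
    return {name: text.strip() for name, text in parser.elements.items()}
```

Each entry of the returned dict can then be passed to its own predictor, so that every element yields a separate candidate classification.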

Abstract

The invention provides a web page classifying method and apparatus. The method comprises the steps of: parsing a plurality of web page elements from a to-be-predicted web page; predicting, from each web page element separately, a candidate web page classification to which the to-be-predicted web page belongs; and comparing the candidate web page classifications predicted from the individual web page elements to determine the final web page classification of the to-be-predicted web page. The method realizes a fully automatic classification process that requires no manual operation and greatly improves web page classification efficiency. In particular, the massive number of web pages across the whole network, as well as web pages newly generated on the Internet, can be classified quickly and effectively, which ensures the timeliness of web page classification.
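
The predict-then-compare steps of the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the abstract does not say how the candidate classifications are compared, so a simple majority vote is assumed here, and the keyword table KEYWORDS, the functions predict_candidate and classify_page, and the two categories are all hypothetical stand-ins for trained per-element predictors.

```python
from collections import Counter

# Hypothetical keyword table standing in for a trained per-element predictor.
KEYWORDS = {
    "sports": {"football", "league", "match"},
    "finance": {"stock", "market", "bank"},
}


def predict_candidate(text):
    """Return the category whose keywords occur most often in text, or None.

    Tokenization is a naive whitespace split, so punctuation is not handled.
    """
    words = text.lower().split()
    scores = {cat: sum(w in kws for w in words) for cat, kws in KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def classify_page(elements):
    """Predict one candidate per element, then take a majority vote.

    Majority voting is an assumed comparison rule. Ties go to the candidate
    that was seen first, because Counter preserves insertion order.
    """
    candidates = [predict_candidate(text) for text in elements.values()]
    votes = Counter(c for c in candidates if c is not None)
    if not votes:
        return None
    return votes.most_common(1)[0][0]
```

For example, if the title and keywords of a page both yield the candidate "sports" while the description yields "finance", classify_page returns "sports". Other comparison rules, such as weighting elements differently, would fit the same structure.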

Description

Technical field

[0001] The invention relates to the technical field of the Internet, and in particular to a method for classifying webpages and a device for classifying webpages.

Background

[0002] Webpage classification plays a vital role in many Internet products. For example, in news media, classifying news webpages is very important for organizing news content reasonably and effectively and for improving the reading experience of users.

[0003] Current webpage classification technology is mainly semi-automatic: webpages are classified by an algorithm and then reviewed manually. In the algorithm stage, a traditional classification algorithm (such as naive Bayes) produces an initial classification of the webpages, but the main problem at this stage is that accuracy cannot be guaranteed. In the manual review stage, human review is generally required in order to improve classification accuracy.

[0004] Since the above sch...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F40/14
CPC: G06F16/951, G06F40/14, G06F40/205, G06F18/24147, G06F18/24155, G06F16/955, G06F16/958, G06F3/038, G06F3/0481, G06F3/0482
Inventors: 王建刚, 沈亮, 邓本洋, 陈培军
Owner: 北京鸿享技术服务有限公司