Web page classification method and system based on semantic extension

A semantic extension and WEB page technology, applied in special data processing applications, instruments, electrical digital data processing, etc. It addresses the poor accuracy and limited application flexibility of keyword matching schemes, and achieves accurate, reliable data processing and efficient classification.

Active Publication Date: 2017-11-10
ELECTRIC POWER RES INST OF GUANGDONG POWER GRID

AI Technical Summary

Problems solved by technology

[0002] With the popularization of the Internet and the development of information technology, there is a growing desire to mine and use information through the Internet. At present, however, data classification mostly relies on simple keyword comparison. Keyword matching schemes for automatically classifying WEB information, or for use in the search process, cannot meet requirements well and perform poorly in accuracy and application flexibility.


Examples


Embodiment Construction

[0020] The present invention will be described in further detail below in conjunction with the embodiments and accompanying drawings, but the embodiments of the present invention are not limited thereto.

[0021] Figure 1 shows a schematic flow diagram of a preferred embodiment of the semantic extension-based WEB page classification method of the present invention, which includes the following steps:

[0022] S11. Extracting the keywords of the WEB page;

[0023] S12. Semantically expanding the keywords of the WEB page to obtain a keyword combination;

[0024] In this embodiment, the keywords of the WEB page are first extracted, and semantic extension is then performed on those keywords to obtain the semantic extension set of the page, namely the keyword combination;

[0025] S13. According to the keywords of the WEB page, determining, from the category trees in the preset semantic thesaurus, the similar category tree of the WEB page, wherein the category tree ...
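Steps S11 to S13 can be illustrated with a minimal Python sketch. The source text does not say how keywords are extracted, where the semantic expansions come from, or how the similar category tree is chosen, so everything here is an assumption made for illustration: frequency-based extraction, a hypothetical `SYNONYMS` table, a hypothetical `STOP_WORDS` list, and tree selection by keyword overlap. A category tree is modelled as a plain dict mapping node names to `{keyword: weight}` dicts.

```python
import re
from collections import Counter

# Hypothetical stop-word list and synonym table; the source specifies neither.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "is", "on"}
SYNONYMS = {
    "grid": {"network"},
    "power": {"electricity", "energy"},
}


def extract_keywords(text, top_n=5):
    """S11 (assumed approach): keep the most frequent non-stop-word tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS)
    return [word for word, _ in counts.most_common(top_n)]


def expand_keywords(keywords, synonyms=SYNONYMS):
    """S12: the keyword combination is the keywords plus their expansions."""
    combination = set(keywords)
    for kw in keywords:
        combination |= synonyms.get(kw, set())
    return combination


def select_similar_tree(keywords, trees):
    """S13 (assumed criterion): pick the tree sharing the most keywords."""
    def overlap(name):
        tree_words = set()
        for node_keywords in trees[name].values():
            tree_words |= set(node_keywords)  # keys of the {keyword: weight} dict
        return len(tree_words & set(keywords))
    return max(trees, key=overlap)


if __name__ == "__main__":
    page = "Power grid maintenance: the power grid operator schedules grid inspections."
    trees = {
        "energy": {
            "transmission": {"grid": 0.5, "substation": 0.25},
            "generation": {"power": 0.5},
        },
        "sports": {"football": {"goal": 0.75}},
    }
    keywords = extract_keywords(page, top_n=2)
    print(keywords)                               # ['grid', 'power']
    print(sorted(expand_keywords(keywords)))
    print(select_similar_tree(keywords, trees))   # energy
```

A real system would likely replace the frequency count with a proper keyword extractor and the synonym table with the semantic thesaurus the embodiment refers to; the data flow between the three steps is the part this sketch is meant to show.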


Abstract

The present invention provides a WEB page classification method based on semantic extension, comprising: extracting the keywords of a WEB page; performing semantic expansion on the keywords to obtain a keyword combination; determining, according to the keywords of the WEB page, the similar category tree of the WEB page from the category trees in a preset semantic lexicon, wherein the category tree contains a plurality of nodes and each node contains a plurality of preset keywords and their preset weights; matching the keyword combination against the preset keywords contained in each node of the similar category tree and, where keywords are the same, adding up the corresponding preset weights; and classifying the WEB page under the node with the highest weight while storing the keyword combination in that node and updating the similar category tree. Correspondingly, the present invention also provides a WEB page classification system based on semantic extension. The invention can effectively improve the accuracy and flexibility of WEB page information classification.
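The matching, scoring, and update steps described in the abstract can be sketched as follows. This is an illustrative reading, not the patented implementation: the similar category tree is flattened to a dict of nodes (the parent/child hierarchy is omitted), and `DEFAULT_WEIGHT`, the weight given to keywords newly stored in the winning node, is a hypothetical value, since the abstract does not say how stored keywords are weighted.

```python
# Hypothetical weight for keywords newly stored in a node; not given in the source.
DEFAULT_WEIGHT = 0.1


def classify_page(keyword_combination, tree):
    """Score every node, pick the highest-scoring one, and update it.

    tree maps node name -> {preset keyword: preset weight}.
    Returns (best_node, scores). Raises ValueError if tree is empty.
    """
    scores = {}
    for node, preset in tree.items():
        # Add up the preset weights of keywords that also occur in the combination.
        scores[node] = sum(
            weight for kw, weight in preset.items() if kw in keyword_combination
        )

    # Classify the page under the node with the highest accumulated weight.
    best = max(scores, key=scores.get)

    # Update step: store the keyword combination in the winning node.
    # Existing keywords keep their preset weight; new ones get DEFAULT_WEIGHT.
    for kw in keyword_combination:
        tree[best].setdefault(kw, DEFAULT_WEIGHT)

    return best, scores


if __name__ == "__main__":
    tree = {
        "transmission": {"grid": 0.5, "network": 0.25},
        "generation": {"power": 0.5},
    }
    combination = {"grid", "power", "network", "electricity", "energy"}
    best, scores = classify_page(combination, tree)
    print(best)    # transmission
    print(scores)  # {'transmission': 0.75, 'generation': 0.5}
```

Because the winning node absorbs the page's keyword combination, later pages with similar vocabulary score higher against that node, which is one plausible reading of how updating the tree contributes to the flexibility the abstract claims.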

Description

Technical field

[0001] The invention relates to WEB page data processing technology, and in particular to a semantic extension-based WEB page classification method and a semantic extension-based WEB page classification system.

Background

[0002] With the popularization of the Internet and the development of information technology, there is a growing desire to mine and use information through the Internet. At present, however, data classification mostly relies on simple keyword comparison. Keyword matching schemes for automatically classifying WEB information, or for use in the search process, cannot meet requirements well and perform poorly in accuracy and application flexibility.

Summary of the invention

[0003] Based on this, the present invention provides a semantic extension-based WEB page classification method and system, which can effectively improve the accuracy and flexibility of WEB pa...


Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F17/30
Inventor: 徐立新, 付丽萍, 颜小林, 李军
Owner: ELECTRIC POWER RES INST OF GUANGDONG POWER GRID