Population property classification method and device based on anchor texts and peripheral texts in URLs

A population attribute classification technology, applied in text database clustering/classification, unstructured text data retrieval, electronic digital data processing, etc. It addresses problems such as the limited number of samples, and achieves accurate prediction of population attributes, wide coverage and complete classification.

Inactive Publication Date: 2015-03-25
RUN TECH CO LTD BEIJING
6 Cites, 8 Cited by

AI Technical Summary

Problems solved by technology

[0004] However, the above method must obtain keyword information from the web pages that the user browses. The amount of information on those web pages is huge and contains many interfering factors, so it cannot directly reflect the user's click p



Examples


Embodiment Construction

[0044] The technical solutions of the present invention are further described below in conjunction with the accompanying drawings and through specific embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit it. In addition, it should be noted that, for convenience of description, the drawings show only the parts related to the present invention rather than the entire content.

[0045] Figure 1 is a schematic flow chart of the demographic attribute classification method based on the anchor text and surrounding text in URLs provided by Embodiment 1 of the present invention. As shown in Figure 1, the method includes the following steps:

[0046] S101. Obtain the anchor text and surrounding text in the URL clicked by the unknown user within a preset time period.

[0047] Specifically, when predicting the population attribute classification of an unknown us...
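Step S101 can be illustrated with a minimal sketch, given here only as one possible reading. The source does not define how "surrounding text" is delimited, so this example assumes it means the text chunks immediately before and after the link in the page's HTML. The class `AnchorContextParser`, the function `anchor_and_surrounding`, the `window` parameter and the sample page are all hypothetical names introduced for illustration.

```python
from html.parser import HTMLParser


class AnchorContextParser(HTMLParser):
    """Collects text chunks and flags those inside a link to target_url."""

    def __init__(self, target_url):
        super().__init__()
        self.target_url = target_url
        self.chunks = []          # list of (text, is_anchor_text)
        self._in_target = False

    def handle_starttag(self, tag, attrs):
        if tag == "a" and dict(attrs).get("href") == self.target_url:
            self._in_target = True

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_target = False

    def handle_data(self, data):
        text = " ".join(data.split())  # normalise whitespace
        if text:
            self.chunks.append((text, self._in_target))


def anchor_and_surrounding(html, target_url, window=1):
    """Return (anchor_text, surrounding_text) for the clicked URL.

    Assumption: surrounding text = up to `window` text chunks on each
    side of the link; the patent text does not specify this.
    """
    parser = AnchorContextParser(target_url)
    parser.feed(html)
    parser.close()
    hits = [i for i, (_, is_anchor) in enumerate(parser.chunks) if is_anchor]
    if not hits:
        return "", ""
    anchor = " ".join(parser.chunks[i][0] for i in hits)
    lo = max(0, hits[0] - window)
    hi = min(len(parser.chunks), hits[-1] + window + 1)
    surrounding = " ".join(
        text
        for i, (text, is_anchor) in enumerate(parser.chunks)
        if lo <= i < hi and not is_anchor
    )
    return anchor, surrounding


# Hypothetical page fragment containing the clicked link.
page = (
    '<ul><li>Sports: <a href="http://example.com/a">Cup final tonight</a> live scores</li>'
    '<li>Finance: <a href="http://example.com/b">Stocks rally</a> market news</li></ul>'
)
anchor, surrounding = anchor_and_surrounding(page, "http://example.com/a")
print(anchor)
print(surrounding)
```

A real system would more likely record these texts when the click is logged rather than re-parse the page; the parser is used here only to keep the sketch self-contained.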



Abstract

The invention discloses a population property classification method and device based on anchor texts and peripheral texts in URLs. The method comprises the first step of acquiring the anchor texts and the peripheral texts in the URLs clicked by unknown users within a preset time period, the second step of classifying the URLs into different catalogues according to the anchor texts, the peripheral texts and a pre-established first classification model, wherein the first classification model is obtained by conducting classification training through classified catalogues on the Internet, and the third step of conducting population property classification prediction on the unknown users according to catalogue feature information under different catalogues and a pre-established second classification model, wherein the second classification model is obtained by conducting classification training according to the catalogue feature information under the catalogues which the URLs clicked by known users belong to and population properties.
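The two-model flow described in the abstract can be sketched roughly as follows. This is an illustrative toy, not the patented implementation: the source does not name a classifier, so a small multinomial Naive Bayes is swapped in as a stand-in for both models. The directory entries, catalogue names (`sports`, `finance`) and attribute labels (`segment_A`, `segment_B`) are hypothetical, and the "catalogue feature information" is assumed here to be simply the list of catalogues a user's clicks fall into.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Tiny multinomial Naive Bayes with add-one smoothing (stand-in classifier)."""

    def fit(self, samples, labels):
        self.label_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for features, label in zip(samples, labels):
            self.feature_counts[label].update(features)
            self.vocab.update(features)
        return self

    def predict(self, features):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)
            for f in features:
                score += math.log((counts[f] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# First model: trained on (hypothetical) entries of an Internet directory.
directory = [
    ("football cup final scores", "sports"),
    ("basketball league match", "sports"),
    ("stocks bonds market rally", "finance"),
    ("bank loan interest market", "finance"),
]
first_model = NaiveBayes().fit(
    [text.split() for text, _ in directory],
    [catalogue for _, catalogue in directory],
)


def catalogues_for(click_texts):
    """Map each click's anchor + surrounding text to a catalogue."""
    return [first_model.predict(text.split()) for text in click_texts]


# Second model: trained on catalogue features of known users (hypothetical labels).
known_users = [
    (["football match tonight", "league scores"], "segment_A"),
    (["stocks rally", "bank interest rates"], "segment_B"),
]
second_model = NaiveBayes().fit(
    [catalogues_for(clicks) for clicks, _ in known_users],
    [attribute for _, attribute in known_users],
)

# Prediction for an unknown user from the texts of the URLs they clicked.
unknown_clicks = ["cup final match", "basketball scores"]
predicted = second_model.predict(catalogues_for(unknown_clicks))
print(predicted)
```

Keeping the two models separate means the first can be trained on public directory data alone, while only the second needs users whose attributes are already known.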

Description

Technical field

[0001] The invention relates to the technical field of data mining, in particular to a population attribute classification method and device based on anchor text and surrounding text in URLs.

Background

[0002] Demographic attributes of a person include but are not limited to age, gender, family income, occupation category, education level, life stage, etc. Gaining insight into people's demographic attributes is of practical importance for optimizing personalized web applications, personalized advertisements, etc.

[0003] Most existing population attribute classification methods obtain text features from the Web pages browsed by users and query a pre-established population attribute classification model according to those text features, thereby classifying the user's population attributes. The population attribute classification model is trained by using k...


Application Information

IPC(8): G06F17/30
CPC: G06F16/35, G06F16/9558
Inventors: 张岩峰, 梁东山
Owner: RUN TECH CO LTD BEIJING