Automatic blog writer interest and character identifying method based on support vector machine

A support vector machine-based automatic identification technology, applied in special data processing applications, instruments, electronic digital data processing, etc., that addresses the lack of personalized services, the high cost of gathering personal information, and the difficulty of implementation.

Status: Inactive
Publication Date: 2012-09-12
SOUTH CHINA UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0007] In view of the huge number of current blog users, the lack of blog-based personalization services, and the high cost of manually collecting or inferring authors' personal infor


Examples


Embodiment Construction

[0045] The embodiments of the present invention will be further described below in conjunction with the accompanying drawings, but the implementation of the present invention is not limited thereto.

[0046] The support-vector-machine-based method for automatically identifying a blogger's interests and personality comprises automatic interest identification and automatic personality identification. Automatic interest identification includes collecting blog post training samples, denoising the blog post samples, Chinese lexical analysis, constructing a set of candidate interest feature items, measuring the importance of the candidate interest feature items, screening the interest classification feature item set, calculating the weights of the feature items, representing the interest classification training samples as vectors, training the interest classifier, and predicting the interest categories of other bloggers; automatic personality recognition i...
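
The feature-related steps listed in [0046] can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patent's implementation: the toy posts are invented and already segmented into words (standing in for the output of Chinese lexical analysis), the chi-square statistic is an assumed choice for the importance measure, TF-IDF is an assumed weighting scheme, and the names `posts`, `chi_square`, `select_features`, and `tfidf_vector` are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy corpus: (segmented words, interest label) per blog post.
posts = [
    (["football", "match", "goal", "team"], "sports"),
    (["team", "coach", "goal", "league"], "sports"),
    (["stock", "market", "fund", "team"], "finance"),
    (["fund", "bank", "market", "rate"], "finance"),
]

def chi_square(term, label, docs):
    """Chi-square statistic of a term/category pair (assumed importance measure)."""
    a = sum(1 for words, y in docs if term in words and y == label)
    b = sum(1 for words, y in docs if term in words and y != label)
    c = sum(1 for words, y in docs if term not in words and y == label)
    d = sum(1 for words, y in docs if term not in words and y != label)
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0

def select_features(docs, k):
    """Keep the k candidate feature items with the highest importance score."""
    vocab = sorted({w for words, _ in docs for w in words})
    labels = sorted({y for _, y in docs})
    score = {t: max(chi_square(t, y, docs) for y in labels) for t in vocab}
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(vocab, key=lambda t: (-score[t], t))[:k]

def tfidf_vector(words, features, docs):
    """Represent one post as a vector of TF-IDF weights over the feature items."""
    counts = Counter(words)
    n = len(docs)
    vec = []
    for t in features:
        df = sum(1 for ws, _ in docs if t in ws)
        idf = math.log((n + 1) / (df + 1)) + 1.0  # smoothed IDF
        vec.append(counts[t] / len(words) * idf)
    return vec

features = select_features(posts, 3)
vectors = [tfidf_vector(words, features, posts) for words, _ in posts]
print(features)
print(vectors)
```

Words that occur in only one category score highest, while a word such as "team" that appears in both categories scores low and is screened out. The resulting vectors are the input a classifier would be trained on.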


Abstract

The invention provides a method, based on a support vector machine, for automatically identifying blog writers' interests and characters. The method first builds an interest classification training sample set and a character classification training sample set. Each training sample set is processed by a Chinese lexical analyzer to obtain a candidate interest feature item set and a candidate character feature item set. The two candidate feature item sets are then analyzed with a statistical method to build an interest classification feature item set and a character classification feature item set. Using these two feature item sets, the interest and character classification training sample sets are represented as vectors. Finally, the two groups of vectors are used to train an interest classifier and a character classifier, respectively, and the classifiers are used to identify the interests and characters of other writers. The method is intended to identify writers' interests and characters accurately and can be applied to various personalized services based on writer information, so that service providers can understand their users better and improve service quality.
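
The final step of the abstract, training one classifier for interests and one for characters, can be sketched as follows. This is a hedged illustration only: it assumes a linear soft-margin SVM trained by plain sub-gradient descent on the hinge loss (the source does not state the kernel or solver), it handles two classes per classifier, and the sample vectors, the labels, and the names `train_linear_svm` and `predict` are hypothetical.

```python
def train_linear_svm(X, y, lam=0.01, eta=0.1, epochs=200):
    """Sub-gradient descent on the L2-regularized hinge loss; labels are -1 or +1."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # Regularization step: shrink the weights slightly.
            w = [wj * (1.0 - eta * lam) for wj in w]
            if margin < 1.0:
                # The sample violates the margin, so move the boundary toward it.
                w = [wj + eta * yi * xj for wj, xj in zip(w, xi)]
                b += eta * yi
    return w, b

def predict(model, x):
    """Return +1 or -1 according to the side of the separating hyperplane."""
    w, b = model
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Hypothetical, already vectorized training samples (two feature items each).
interest_X = [[1.0, 0.0], [0.8, 0.1], [0.0, 1.0], [0.1, 0.9]]
interest_y = [1, 1, -1, -1]    # assumed labels: +1 = "sports", -1 = "finance"
character_X = [[0.9, 0.2], [0.7, 0.0], [0.1, 0.8], [0.0, 1.0]]
character_y = [1, 1, -1, -1]   # assumed labels: +1 = "outgoing", -1 = "reserved"

# One classifier per task, each trained on its own vectorized sample set.
interest_clf = train_linear_svm(interest_X, interest_y)
character_clf = train_linear_svm(character_X, character_y)

# Identify the interest and character of another (unseen) writer.
new_writer = [0.9, 0.0]
print(predict(interest_clf, new_writer), predict(character_clf, new_writer))
```

More than two interest or character categories would need an extension such as one-vs-rest, which is omitted here to keep the sketch short.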

Description

Technical field

[0001] The invention relates to blog mining technology, in particular to a method for automatically identifying bloggers' interests and character based on a support vector machine.

Background

[0002] With the rapid development of the Internet, network communication methods are becoming more and more diverse. As a new way of communicating online, the blog is simple to use, highly personal, timely, and interactive, so it has attracted more and more attention. According to the "25th Statistical Report on Internet Development in China" released by the China Internet Network Information Center (CNNIC), as of December 2009 the number of blog users had reached 221 million. The number of active blogs had also grown further: 145 million bloggers had updated their blog space within the previous half year.

[0003] Today, blog applications have penetrated into every field ...


Application Information

IPC(8): G06F17/30
Inventor: 黄翰, 鲁梦平, 郝志峰, 刘伟庆, 张远峰, 蔡昭权
Owner: SOUTH CHINA UNIV OF TECH