Social media user demographic attribute prediction method based on multi-model stack fusion

A technology concerning demographic attributes and social media, applied in prediction, data processing applications, character and pattern recognition, etc. It addresses problems such as data imbalance, long training time, and large errors, with the effect of reducing generalization error and improving prediction accuracy.

Inactive Publication Date: 2018-05-29
SUN YAT SEN UNIV
3 Cites, 25 Cited by

AI Technical Summary

Problems solved by technology

[0004] 1. Content published on social media contains a large amount of material such as advertisements, shares, and news, so relying on text mining alone produces large errors.
[0005] 2. High-dimensional data problems
Traditional text classification methods generally extract TFIDF features from the text, and the dimension can reach hundreds of thousands. For a traditional SVM classification model, this causes extremely long training times and the model cannot converge effectively.
[0006] 3. The data is imbalanced: most Weibo users are male.

Embodiment Construction

[0036] The present invention is further described below in conjunction with a specific embodiment:

[0037] This embodiment describes a method for predicting demographic attributes of social media users based on multi-model stack fusion:

[0038] As shown in Figure 1, when predicting the gender attribute, the specific steps are as follows:

[0039] a1. Extract TFIDF features, statistical features and time information features;

[0040] For TFIDF feature extraction, the blog posts sent by each user are regarded as a document and each word in it as a term. The TFIDF value of each term in the document is then calculated to obtain a multi-dimensional TFIDF feature, and the final TFIDF feature is obtained by screening the multi-dimensional TFIDF feature with a chi-square test;
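
The TFIDF extraction and chi-square screening described in [0040] can be sketched as follows. This is a minimal illustration in plain Python, not the patent's implementation: the smoothed IDF formula, the whitespace tokenization (real Weibo text would first need word segmentation), the toy documents, and the function names `tfidf_matrix`, `chi2_scores`, and `select_top_k` are all assumptions introduced here. The chi-square score compares the TFIDF mass each class actually received for a term against what it would receive if the term were independent of the class.

```python
import math
from collections import Counter

def tfidf_matrix(docs):
    """Return (vocab, rows), where rows[i][j] is the TF-IDF of vocab[j] in docs[i]."""
    tokenized = [doc.split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    # Smoothed IDF (an assumption; the source does not state the exact formula).
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1
        rows.append([counts[t] / total * idf[t] for t in vocab])
    return vocab, rows

def chi2_scores(rows, labels):
    """Chi-square statistic per feature: observed vs. expected per-class feature mass."""
    classes = sorted(set(labels))
    class_prior = {c: labels.count(c) / len(rows) for c in classes}
    scores = []
    for j in range(len(rows[0])):
        total = sum(row[j] for row in rows)
        score = 0.0
        for c in classes:
            observed = sum(row[j] for row, y in zip(rows, labels) if y == c)
            expected = class_prior[c] * total
            if expected > 0:
                score += (observed - expected) ** 2 / expected
        scores.append(score)
    return scores

def select_top_k(vocab, rows, labels, k):
    """Keep the k terms with the highest chi-square score, in vocabulary order."""
    scores = chi2_scores(rows, labels)
    keep = sorted(range(len(vocab)), key=lambda j: scores[j], reverse=True)[:k]
    keep.sort()
    return [vocab[j] for j in keep], [[row[j] for j in keep] for row in rows]

# Toy data (hypothetical): each string stands for the posts of one user,
# already segmented into words.
user_docs = [
    "football match tonight football",
    "new lipstick shade today",
    "football transfer news today",
    "lipstick and skincare haul",
]
gender = ["m", "f", "m", "f"]

vocab, X = tfidf_matrix(user_docs)
terms, X_selected = select_top_k(vocab, X, gender, k=2)
print(terms)
```

A term such as "today", which appears equally in both classes, gets a score of zero and is screened out, while terms concentrated in one class are kept. The number of retained terms `k` is a free parameter in this sketch; the source does not state the value used.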

[0041] Statistical features include the total number of blog posts sent by users, the number of blog posts reposted, the number of comments, th...

Abstract

The invention relates to a social media user demographic attribute prediction method based on multi-model stack fusion, in which three demographic attributes of a user are predicted: gender, age, and region. The prediction of each attribute comprises the steps of (S1) user feature extraction, (S2) model training, and (S3) multi-model fusion to obtain a prediction result. Feature extraction in the present invention covers not only the text content of the user's Weibo posts but also statistical features, time information features, and social relationship features, which ensures the accuracy of prediction. Multi-model stack fusion is used to fuse logistic regression, random forest, and XGBoost models, which effectively reduces the generalization error and greatly improves prediction accuracy.
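
The stack-fusion step (S3) can be illustrated with a small, dependency-free sketch. It is an assumption-laden illustration of stacking in general, not the patent's procedure: the three base models here (`Stump`, `Stump`, `Centroid`) are toy stand-ins for the logistic regression, random forest, and XGBoost models named above, the logistic-regression meta-learner and the two-fold split are choices made for this example, and the data is invented. What it shows is the core mechanism: each training row's meta-features come from base models that never saw that row, and a second-level model learns how to combine them.

```python
import math

def mean(values):
    return sum(values) / len(values)

class Stump:
    """Stand-in base model: thresholds one feature midway between the class means."""
    def __init__(self, feature):
        self.feature = feature

    def fit(self, X, y):
        m0 = mean([x[self.feature] for x, t in zip(X, y) if t == 0])
        m1 = mean([x[self.feature] for x, t in zip(X, y) if t == 1])
        self.threshold = (m0 + m1) / 2
        self.positive_above = m1 >= m0
        return self

    def predict_proba(self, x):
        above = x[self.feature] > self.threshold
        return 1.0 if above == self.positive_above else 0.0

class Centroid:
    """Stand-in base model: scores a point by its relative distance to class centroids."""
    def fit(self, X, y):
        self.c0 = [mean(col) for col in zip(*[x for x, t in zip(X, y) if t == 0])]
        self.c1 = [mean(col) for col in zip(*[x for x, t in zip(X, y) if t == 1])]
        return self

    def predict_proba(self, x):
        d0, d1 = math.dist(x, self.c0), math.dist(x, self.c1)
        return 0.5 if d0 + d1 == 0 else d0 / (d0 + d1)

def make_base_models():
    return [Stump(0), Stump(1), Centroid()]

def out_of_fold_features(X, y, k=2):
    """Each row's meta-features come from base models trained without that row."""
    meta = [None] * len(X)
    for fold in range(k):
        train = [i for i in range(len(X)) if i % k != fold]
        models = [m.fit([X[i] for i in train], [y[i] for i in train])
                  for m in make_base_models()]
        for i in range(fold, len(X), k):
            meta[i] = [m.predict_proba(X[i]) for m in models]
    return meta

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=500):
    """Meta-learner (assumed): logistic regression trained by batch gradient descent."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        errors = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) - t
                  for x, t in zip(X, y)]
        w = [wj - lr * sum(e * x[j] for e, x in zip(errors, X)) / n
             for j, wj in enumerate(w)]
        b -= lr * sum(errors) / n
    return w, b

# Toy data (hypothetical): two numeric features per user and a binary label.
X = [[0.0, 1.0], [1.0, 0.5], [0.5, 0.0], [1.0, 1.0],
     [4.0, 5.0], [5.0, 4.5], [4.5, 4.0], [5.0, 5.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

meta_X = out_of_fold_features(X, y)                        # level-1 training set
w, b = fit_logistic(meta_X, y)                             # learn how to fuse base outputs
final_models = [m.fit(X, y) for m in make_base_models()]   # refit base models on all data

def stacked_predict(x):
    """Probability of class 1 from the fused model."""
    z = b + sum(wj * m.predict_proba(x) for wj, m in zip(w, final_models))
    return sigmoid(z)

print(round(stacked_predict([0.5, 0.5]), 3), round(stacked_predict([4.5, 4.8]), 3))
```

Training the meta-learner on out-of-fold outputs rather than on in-sample outputs keeps it from simply trusting whichever base model overfits most, which is the usual argument for why stacking can lower generalization error. In a realistic setting the stand-ins would be replaced by actual logistic regression, random forest, and XGBoost estimators.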

Description

Technical field

[0001] The present invention relates to the technical field of model prediction, in particular to a method for predicting social media user demographic attributes based on multi-model stack fusion.

Background

[0002] With the continuous advancement of China's informatization and the continuous development of network technology, the Internet and communication terminals are rapidly integrating into modern life. They have gradually become an independent, new channel for information exchange and dissemination, and are constantly changing people's lives. The rapid development of social media, while making social interaction more convenient, has also had a huge impact on advertising media. How can advertising media use the characteristics of social media users, mining hidden user attributes such as gender, age, and region from users' behavior preferences on social media, and deliver more ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06Q10/04, G06K9/62, G06Q50/00
CPC: G06Q10/04, G06Q50/01, G06F18/24, G06F18/25, G06F18/214
Inventor: 郑子彬, 吴垚明, 陈亮
Owner: SUN YAT SEN UNIV