Method for predicting gender of microblog user based on deep learning

A deep-learning-based prediction method applied in the field of Web mining and intelligent information processing. It addresses two problems of existing approaches: microblog text features must be constructed manually, and the resulting feature vectors are sparse and high-dimensional.

Active Publication Date: 2018-06-01
BEIJING INSTITUTE OF TECHNOLOGY
Cites: 6 | Cited by: 15

AI Technical Summary

Problems solved by technology

[0004] Existing methods for identifying the gender of microblog users have two main problems. First, the microblog text features must be constructed manually. Second, microblog text is mainly represented with the vector space model or the bag-of-words model, which yields sparse, high-dimensional feature vectors.
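The sparsity problem can be illustrated with a small sketch. The three toy documents and the helper names `bag_of_words` and `sparsity` below are assumptions made for illustration and do not come from the patent text; the point is only that each bag-of-words vector is as long as the whole vocabulary while a short post fills just a few positions.

```python
from collections import Counter

def bag_of_words(docs):
    """Return (vocabulary, count vectors) for a list of tokenized documents."""
    vocab = sorted({tok for doc in docs for tok in doc})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for doc in docs:
        vec = [0] * len(vocab)          # one dimension per vocabulary word
        for tok, n in Counter(doc).items():
            vec[index[tok]] = n
        vectors.append(vec)
    return vocab, vectors

def sparsity(vec):
    """Fraction of zero entries in a vector."""
    return sum(1 for x in vec if x == 0) / len(vec)

# Toy posts (hypothetical data, chosen so that no word repeats).
docs = [
    "great game tonight".split(),
    "new phone arrived today".split(),
    "coffee then work".split(),
]
vocab, vectors = bag_of_words(docs)
print(len(vocab), [round(sparsity(v), 2) for v in vectors])
# prints: 10 [0.7, 0.6, 0.7]
```

With only ten distinct words, 60 to 70 percent of every vector is already zero; with a realistic vocabulary of tens of thousands of words, a short post would leave almost all dimensions empty.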



Examples


Embodiment 1

[0058] Step 1, microblog information collection: use a web crawler to collect users' microblog texts from the microblog platform and save them to the computer;

[0059] The microblog texts of a number of microblog users of different genders are collected. Each user's microblog texts are stored in an Extensible Markup Language (XML) file named after the user ID, and the gender attributes of all microblog users are stored together in a file.

[0060] For example, for the microblog platform Twitter, the web crawler Scrapy is used to collect the users' tweets, which serve as the microblog text. The microblog text of the user with ID "1a4a60942a15426c9a7ec3764e7d0ede" is saved in the file "1a4a60942a15426c9a7ec3764e7d0ede.xml", in the form:

[0061]
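The file listing for this step is not reproduced on this page, so the exact XML layout is unknown. As a minimal sketch of the storage scheme described in Step 1, the code below assumes a hypothetical layout (an `author` root element with one `document` child per post) and a hypothetical tab-separated label file; the element names, the label file format, and all function names are illustrative assumptions, and the crawling itself is left out.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

def save_user_posts(out_dir, user_id, posts):
    """Write one XML file per user, named after the user ID (assumed layout)."""
    root = ET.Element("author", {"id": user_id})
    for text in posts:
        ET.SubElement(root, "document").text = text
    path = os.path.join(out_dir, user_id + ".xml")
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)
    return path

def load_user_posts(path):
    """Read the posts back from a file written by save_user_posts."""
    root = ET.parse(path).getroot()
    return [doc.text or "" for doc in root.iter("document")]

def save_gender_labels(path, labels):
    """Store the gender attribute of every user in one file (assumed TSV format)."""
    with open(path, "w", encoding="utf-8") as f:
        for user_id, gender in sorted(labels.items()):
            f.write(f"{user_id}\t{gender}\n")

def load_gender_labels(path):
    """Read the user ID to gender mapping back from the label file."""
    labels = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            user_id, gender = line.rstrip("\n").split("\t")
            labels[user_id] = gender
    return labels

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as out_dir:
        uid = "1a4a60942a15426c9a7ec3764e7d0ede"   # user ID from the example above
        xml_path = save_user_posts(out_dir, uid, ["first post", "second post"])
        print(os.path.basename(xml_path), load_user_posts(xml_path))
```

Keeping the labels in a single file separate from the per-user text files follows the description in Step 1; everything else about the format is a placeholder.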

[0062] Step 2, microblog text preprocessing: perform text extraction, lemmatization, and stop-word and punctuation filtering on the microblog text collected in Step 1;

[0063] Preprocess the XML file coll...
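The three operations named in Step 2 can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: the stop-word list and the lemma table are tiny hand-written stand-ins for a real stop-word list and lemmatizer, the XML layout in the usage example is hypothetical, and the function names are invented here.

```python
import string
import xml.etree.ElementTree as ET

# Toy resources (assumptions): a real system would use a full stop-word list
# and a proper lemmatizer instead of these small tables.
STOP_WORDS = {"a", "an", "the", "is", "be", "to", "of", "and", "i", "my"}
LEMMAS = {"dogs": "dog", "running": "run", "was": "be", "were": "be"}

def extract_texts(xml_string):
    """Text extraction: collect every non-empty text node, whatever the tag names."""
    root = ET.fromstring(xml_string)
    return [el.text.strip() for el in root.iter() if el.text and el.text.strip()]

def preprocess(text):
    """Lowercase, strip punctuation, lemmatize by lookup, then drop stop words."""
    tokens = []
    for raw in text.lower().split():
        word = raw.strip(string.punctuation)   # punctuation-only tokens become ""
        word = LEMMAS.get(word, word)          # lemmatization by table lookup
        if word and word not in STOP_WORDS:    # stop-word and punctuation filtering
            tokens.append(word)
    return tokens

# Hypothetical input file content.
xml_string = (
    "<author><document>The dogs were running, and I laughed!</document>"
    "<document>Coffee is great :)</document></author>"
)
for text in extract_texts(xml_string):
    print(preprocess(text))
# prints: ['dog', 'run', 'laughed']
#         ['coffee', 'great']
```

Lemmatization runs before stop-word filtering here so that inflected forms such as "were" are first mapped to "be" and then removed as stop words.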



Abstract

The invention relates to a method for predicting the gender of a microblog user based on deep learning and belongs to the field of Web mining and intelligent information processing. The prediction method includes the steps of collecting microblog information; preprocessing the microblog text; constructing word vectors for the words of the microblog text; using a convolutional neural network-based microblog text representation method to construct feature vectors for the sentences of the microblog text; and using a long short-term memory (LSTM) network-based method to predict, that is, classify, the gender of the microblog user. The convolutional neural network-based text representation method achieves semantic modeling of the microblog text without manually constructed text features. The LSTM-based gender prediction method extracts semantic sequence-dependency features from the microblog text. The method can accurately extract microblog text features and improve the recognition performance for the gender of microblog users, and has broad application prospects in the fields of information recommendation and product marketing.
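The data flow described in the abstract can be sketched as a forward pass in plain Python. This is a minimal illustration under several assumptions that are not stated in the source: one filter width, tanh activation, max-over-time pooling, a single LSTM layer, a sigmoid output, and random untrained weights. All function and parameter names are hypothetical, and training is omitted.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def conv_max_pool(word_vecs, filters, width):
    """Convolve each filter over windows of `width` words, then max-pool over time."""
    dim = len(filters[0]) // width
    padded = list(word_vecs) + [[0.0] * dim] * max(0, width - len(word_vecs))
    feats = []
    for w in filters:
        acts = []
        for i in range(len(padded) - width + 1):
            window = [x for vec in padded[i:i + width] for x in vec]
            acts.append(math.tanh(dot(w, window)))
        feats.append(max(acts))
    return feats  # one feature per filter: the sentence feature vector

def lstm_step(x, h, c, lstm):
    """One LSTM step on the concatenated input [x; h]."""
    z = x + h
    def gate(name, fn):
        rows, bias = lstm[name]
        return [fn(dot(row, z) + b0) for row, b0 in zip(rows, bias)]
    i, f, o = gate("i", sigmoid), gate("f", sigmoid), gate("o", sigmoid)
    g = gate("g", math.tanh)
    c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
    h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h, c

def init_params(dim, n_filters, width, hidden, seed=0):
    """Random, untrained weights (placeholder for learned parameters)."""
    rng = random.Random(seed)
    def rand(n):
        return [rng.uniform(-0.5, 0.5) for _ in range(n)]
    lstm = {g: ([rand(n_filters + hidden) for _ in range(hidden)], [0.0] * hidden)
            for g in "ifog"}
    return {"width": width, "filters": [rand(width * dim) for _ in range(n_filters)],
            "lstm": lstm, "out": rand(hidden)}

def predict(sentences, params):
    """sentences: list of sentences, each a list of word vectors. Returns P(class 1)."""
    hidden = len(params["out"])
    h, c = [0.0] * hidden, [0.0] * hidden
    for sent in sentences:  # the LSTM reads the user's sentence vectors in order
        x = conv_max_pool(sent, params["filters"], params["width"])
        h, c = lstm_step(x, h, c, params["lstm"])
    return sigmoid(dot(params["out"], h))

rng = random.Random(1)
sents = [[[rng.uniform(-1, 1) for _ in range(4)] for _ in range(n)] for n in (5, 1, 3)]
params = init_params(dim=4, n_filters=3, width=2, hidden=3, seed=0)
print(predict(sents, params))
```

The convolution turns each sentence of any length into a fixed-size vector, and the LSTM then consumes one such vector per sentence, which is where the sequence dependencies across a user's text would be captured once the weights are trained.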

Description

Technical field

[0001] The invention relates to the fields of Web mining and intelligent information processing, and in particular to a method for predicting the gender of microblog users based on deep learning. The invention has broad application prospects in the fields of information recommendation, network public opinion monitoring, and e-commerce.

Background

[0002] Gender prediction of microblog users is an important research topic in user identity profile construction. User identity profile construction refers to identifying various identity attributes of users, including gender, age, and education level. The technology can be widely used in computer forensics, network public opinion monitoring, product marketing, and other fields.

[0003] At present, user gender prediction mainly uses classification methods to identify the user's gender. In the document "Authorship Attributi...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27, G06F17/30, G06N3/08, G06Q50/00
CPC: G06N3/084, G06Q50/01, G06F16/951, G06F40/211, G06F40/284
Inventor: 张春霞冉昇武嘉玉冯丽霞牛振东黄达友
Owner: BEIJING INSTITUTE OF TECHNOLOGY