Past Weibo data collection and processing methods

A method for collecting and processing past Weibo data, applicable to electronic digital data processing, natural language data processing, and special data processing applications. It addresses the inability to obtain a large amount of past Weibo data and increases the volume of data that can be collected.

Active Publication Date: 2018-03-13
HEFEI UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to provide a method for collecting and processing past microblog data, so as to solve the problem in the prior art that a large amount of past microblog data cannot be obtained either by crawlers or by third-party API calls.


Examples

Embodiment Construction

[0015] A method for collecting and processing past microblog data. Past microblog data refers to microblog data published by users before the current time. Because this data is fixed, it is convenient for after-the-fact analysis. The method includes the following steps:

[0016] (1) Obtain active Weibo user IDs:

[0017] Call the Weibo third-party API to obtain the public Weibo data on the Weibo square. The public Weibo data carries the user information fields of each Weibo author, including the user UID and the user's city ID. Extract the user UIDs from the Weibo data published on the Weibo square and deduplicate them to obtain the usable active Weibo user IDs;
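The extraction and deduplication in step (1) can be sketched as follows. This is a minimal illustration, not the patented implementation: the record layout (a `user` dictionary with `id` and `city` keys) and the function name `extract_active_uids` are assumptions made for the example, and the sample posts are invented.

```python
from typing import Iterable


def extract_active_uids(public_posts: Iterable[dict]) -> list[str]:
    """Collect author UIDs from public posts, dropping duplicates.

    The first-seen order is preserved. The "user" / "id" field names
    are assumed for illustration only.
    """
    seen = set()
    uids = []
    for post in public_posts:
        user = post.get("user") or {}
        uid = user.get("id")
        if uid is None:
            continue  # skip records without author information
        uid = str(uid)
        if uid not in seen:
            seen.add(uid)
            uids.append(uid)
    return uids


# Invented sample records standing in for public posts from the Weibo square.
sample_posts = [
    {"text": "a", "user": {"id": 1001, "city": "1"}},
    {"text": "b", "user": {"id": 1002, "city": "5"}},
    {"text": "c", "user": {"id": 1001, "city": "1"}},  # repeated author
    {"text": "d"},                                      # no user field
]
active_uids = extract_active_uids(sample_posts)
```

Keeping the UIDs as an ordered list rather than a bare set makes it straightforward to split them into fixed partitions in the next step.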

[0018] (2) Obtain the microblog data of active microblog users:

[0019] Split the obtained user UIDs into 7 local user UID libraries and run 7 microblog third-party API tokens in parallel, increasing the number of microblogs obtained per unit time; then call the micro...
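The partitioning described in step (2) might look like the sketch below: the UID list is split into 7 shards and each shard is processed with its own token in a separate worker. The function `fetch_user_posts` is a hypothetical placeholder, not a real Weibo API call, and the token strings and round-robin split are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_SHARDS = 7  # the source uses 7 UID libraries and 7 tokens


def split_uids(uids, num_shards=NUM_SHARDS):
    """Distribute UIDs round-robin into num_shards lists."""
    return [uids[i::num_shards] for i in range(num_shards)]


def fetch_user_posts(uid, token):
    """Hypothetical placeholder for the real per-user API request."""
    return [{"uid": uid, "token": token}]


def collect_shard(shard, token):
    """Fetch posts for every UID in one shard using a single token."""
    posts = []
    for uid in shard:
        posts.extend(fetch_user_posts(uid, token))
    return posts


def collect_all(uids, tokens):
    """Run one worker per token, each over its own UID shard."""
    shards = split_uids(uids, len(tokens))
    with ThreadPoolExecutor(max_workers=len(tokens)) as pool:
        results = pool.map(collect_shard, shards, tokens)
        return [post for shard_posts in results for post in shard_posts]


tokens = [f"TOKEN_{i}" for i in range(NUM_SHARDS)]  # placeholder tokens
uids = [str(1000 + i) for i in range(20)]           # invented UIDs
shards = split_uids(uids)
all_posts = collect_all(uids, tokens)
```

Binding each shard to exactly one token keeps the request count of every token independent, which is what allows the per-token limits to add up across the 7 workers.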

Abstract

The invention discloses a method for collecting and processing past microblog data. The method comprises three steps: first, the IDs of active microblog users are obtained; second, the microblog data of those active users are obtained; finally, the microblog data are processed. The method improves on Sina's third-party API to overcome the insufficient precision of the data obtained through the microblog interface, and it meets the requirement for collecting and processing past microblog data.

Description

Technical field

[0001] The invention relates to the field of microblog data processing methods, in particular to a method for collecting and processing past microblog data.

Background technique

[0002] With the rise of Weibo, short texts containing a large number of micro-viewpoints and emotional tendencies have accumulated rapidly, and Weibo text analysis has become a popular research direction.

[0003] When a large amount of microblog data is collected, the strategy usually adopted is crawling. Crawling is fast and efficient, but the captured data is noisy: although it reduces the time of data collection, it doubles the preprocessing time needed to obtain accurate data. Crawlers are also unstable and often face the danger of being banned by Sina. A small amount of Weibo data is generally collected by calling the third-party API of Sina Weibo. The data collected by this method has less noise and obvio...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/30
CPC: G06F16/951, G06F40/279
Inventor: 任福继, 刘宁, 全昌勤, 魏希权
Owner: HEFEI UNIV OF TECH