Multi-source heterogeneous online network topic early identification method

A multi-source heterogeneous early identification technology, applied to network data retrieval, network data indexing, and other database retrieval. It addresses problems such as the difficulty of discovering topics early and the increasingly complex production, dissemination, and interaction of information.

Pending Publication Date: 2021-01-19
XI'AN PETROLEUM UNIVERSITY

AI Technical Summary

Problems solved by technology

The production, dissemination, and interaction of information across multi-source heterogeneous online networks are becoming increasingly complex, which makes early discovery of topics more difficult.
At present, many topic discovery methods mainly target the discovery and dissemination of hot topics, and there is still considerable room for research on early topic discovery methods.

Examples

Embodiment 1

[0035] Referring to Figure 2, the specific operation process of this embodiment is as follows:

[0036] 1) Analyze the structural characteristics of the different online social networks and design a distributed parallel crawler engine according to those characteristics. Use the distributed parallel crawler engine to crawl the original short text information published on the online social networks, then preprocess that text by Chinese word segmentation and text feature value extraction to obtain a short text keyword set D0;

[0037] Here, the crawled original short text information published on the online social networks includes the news headlines of each news site and the microblogs of each microblog platform, and the TF-IDF method is used to extract the original short text information through Chinese word segmentation and text feature v...
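As a rough illustration of the TF-IDF keyword extraction named in step 1), the sketch below scores tokens in short texts that are assumed to be already word-segmented. The function name `tfidf_keywords`, the `top_k` cutoff, and the sample tokens are hypothetical; the source does not specify the weighting variant, the segmentation tool, or how many keywords are kept, so this is one plain TF-IDF formulation rather than the patented procedure.

```python
import math
from collections import Counter


def tfidf_keywords(docs, top_k=3):
    """Return the top_k highest-scoring TF-IDF tokens per pre-segmented text.

    `docs` is a list of token lists. Word segmentation is assumed to have
    been done upstream; it is not performed here.
    """
    n_docs = len(docs)
    # Document frequency: number of texts that contain each token.
    df = Counter(token for doc in docs for token in set(doc))
    result = []
    for doc in docs:
        tf = Counter(doc)
        scores = {
            tok: (count / len(doc)) * math.log(n_docs / df[tok])
            for tok, count in tf.items()
        }
        # Descending score; ties are broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        result.append(ranked[:top_k])
    return result


# Hypothetical sample: three already-segmented headlines or microblogs.
docs = [
    ["earthquake", "city", "rescue", "today"],
    ["city", "marathon", "today"],
    ["earthquake", "rescue", "team", "today"],
]
keyword_set_d0 = tfidf_keywords(docs, top_k=2)
```

A token that occurs in every text, such as "today" in this sample, gets an inverse document frequency of zero and therefore drops out of the keyword set.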

Abstract

The invention discloses a multi-source heterogeneous online network topic early identification method. The method comprises the following steps: 1) obtaining a short text keyword set D0; 2) constructing a complex network; 3) performing community structure division on the complex network constructed in step 2) by using a dynamic community division method: segmenting the time interval [t0, tend] into sub-intervals of length Δt, where Δt is the time progressive increment; constructing the complex network at time t0 + Δt from the newly added short text information crawled from each heterogeneous online social network within Δt; and then performing community division on the complex network at time t0 + Δt by using the dynamic community division method; and 4) performing statistics on the complex network community division results to construct the finally discovered topic keyword set. The method can perform early topic discovery and extraction on short text information data crawled from a plurality of online social network platforms.
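The time-sliced loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented method: the network is assumed to be a keyword co-occurrence graph, and the unspecified "dynamic community division method" is replaced by a plain connected-components split as a stand-in. The names `add_posts`, `communities`, and `track_topics`, and the sample data, are hypothetical.

```python
from itertools import combinations


def add_posts(graph, keyword_lists):
    """Add co-occurrence edges between keywords that share a post."""
    for keywords in keyword_lists:
        for kw in keywords:
            graph.setdefault(kw, set())
        for a, b in combinations(sorted(set(keywords)), 2):
            graph[a].add(b)
            graph[b].add(a)


def communities(graph):
    """Stand-in community division: connected components of the graph."""
    seen, result = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        result.append(sorted(component))
    return result


def track_topics(posts, t0, t_end, delta_t):
    """Extend the network and re-divide it once per step of length delta_t."""
    graph, history = {}, []
    t = t0
    while t < t_end:
        # Newly added posts whose timestamp falls in [t, t + delta_t).
        batch = [kws for ts, kws in posts if t <= ts < t + delta_t]
        add_posts(graph, batch)
        history.append((t + delta_t, communities(graph)))
        t += delta_t
    return history


# Hypothetical (timestamp, keywords) pairs from several platforms.
posts = [
    (0, ["earthquake", "rescue"]),
    (1, ["marathon", "city"]),
    (5, ["rescue", "team"]),
    (6, ["marathon", "route"]),
]
history = track_topics(posts, t0=0, t_end=8, delta_t=4)
```

Each entry of `history` pairs a time t0 + kΔt with the keyword communities found at that time, so a topic can be followed as its community grows between slices. The final statistics step that builds the topic keyword set is not detailed in the abstract and is left out of the sketch.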

Description

Technical field

[0001] The invention belongs to the research field of early identification methods for online network topics, and relates to an early identification method for multi-source heterogeneous online network topics.

Background

[0002] On the one hand, with the rapid and in-depth development of the Internet, especially the mobile Internet, the Internet has broken the time and space constraints of traditional information exchange and circulation and subverted the traditional mode of information dissemination. The role of Internet users in the process of information dissemination has changed from information consumer to information diffuser or even information producer, and the different online social network systems have gradually begun to disseminate information to one another. The production, dissemination, and interaction of information between this multi-source heterogeneous online network are becoming more ...

Application Information

IPC(8): G06F16/9535, G06F16/9536, G06F40/289, G06Q50/00
CPC: G06F16/9535, G06F16/9536, G06F40/289, G06Q50/01, G06F16/951, Y02D10/00
Inventor: 徐小艳, 周帅鹏, 张贝贝, 吕伟
Owner: XI'AN PETROLEUM UNIVERSITY