
Detection system for identifying microblog data stream of sudden event in real time

An emergency-event detection technology, applied in digital data information retrieval, unstructured text data retrieval, data processing applications, and related fields. Stated effects: reliable results, an optimized calculation process, and fast, accurate detection and identification.

Inactive Publication Date: 2021-04-02
10TH RES INST OF CETC
Cites: 10; Cited by: 2

AI Technical Summary

Problems solved by technology

Although real-time event detection and recognition has produced many research results and some effective solutions, most existing emergency-event recognition methods only detect and identify global or regional (for example, country-level) events such as large-scale natural disasters and armed conflicts. They do not detect and identify small-scale events such as local epidemics or forest fires.
In addition, some methods require the number of events, the event types, and other information to be set manually, which often demands prior knowledge drawn from a large amount of material as well as manually labeled data.



Examples


Embodiment Construction

[0031] Refer to figure 1 and figure 2. In the preferred embodiment described below, a detection system for real-time identification of emergency events in microblog data streams includes, connected in series: an entity extraction module, an entity filtering module connected to a trend identification module, a similarity calculation module, a similarity filtering module, a clustering link module, a clustering grading module, and a data storage module. Together they form a whole-process system from the raw microblog data stream to event detection, identification, and storage. The system is characterized in that the entity extraction module is based on the RoBERTa-wwm-large-ext model, trained on the NER dataset released by the CLUE academic organization, to extract multiple types of named entities. Crawler technology is used to crawl text data in real time from the official Weibo accounts and major "big V" accounts certified at the province, city, and county levels, and the crawled data is then cleaned. Input t...
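The serial arrangement of modules described above can be sketched as a chain of functions, each consuming the output of the previous stage. This is only an illustration under stated assumptions: the function names, the gazetteer lookup that stands in for the RoBERTa-based NER model, and the sample posts and hot-word list are all hypothetical and do not come from the patent.

```python
from functools import reduce

# Hypothetical stand-in for the NER model: a fixed gazetteer lookup.
KNOWN_ENTITIES = {"Chengdu", "wildfire", "Liangshan", "concert"}

def extract_entities(posts):
    """Return one list of entity strings per post (toy gazetteer match)."""
    return [[tok for tok in post.split() if tok in KNOWN_ENTITIES] for post in posts]

def make_entity_filter(hot_words):
    """Build a stage that drops entities absent from the hot-word list."""
    def stage(entity_lists):
        return [[e for e in ents if e in hot_words] for ents in entity_lists]
    return stage

def run_pipeline(data, stages):
    """Feed the data through each stage in order, as modules connected in series."""
    return reduce(lambda acc, stage: stage(acc), stages, data)

# Assumed sample input; in the described system the hot words would come
# from the trend identification module.
posts = ["wildfire reported near Liangshan", "concert tonight in Chengdu"]
hot_words = {"wildfire", "Liangshan"}
filtered = run_pipeline(posts, [extract_entities, make_entity_filter(hot_words)])
print(filtered)
```

Later stages (similarity calculation, clustering, grading, storage) would be appended to the same list of stages in the same way.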


Abstract

The invention discloses a detection system for identifying emergency events in a microblog data stream in real time; emergency events can be detected and identified rapidly and accurately without any prior knowledge about the event. According to the technical scheme, a crawler tool crawls text data in real time; an entity extraction module extracts multiple types of named entities, and a trend recognition module produces a hot-word list for different regions; an entity filtering module filters out entities that are not trending; a similarity calculation module builds a co-occurrence matrix within a window, calculates entity similarity, and constructs an entity relation graph; a similarity filtering module removes edges with small values from the entity relation graph; an entity clustering module applies a community discovery algorithm to the entity relation graph to obtain the corresponding cluster set; a clustering link module continuously tracks events within the event window; a clustering grading module grades the linked clustering results according to the number of hot words each contains; and a data storage module stores the clustering and grading information.
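The similarity, filtering, clustering, and grading steps in the abstract can be illustrated with a small sketch. Several choices here are assumptions rather than details from the patent: Jaccard similarity is used as the entity similarity measure, connected components (via union-find) serve as a simple stand-in for a community discovery algorithm, and the threshold `min_sim`, the sample window, and the hot-word set are made up for illustration.

```python
from collections import defaultdict
from itertools import combinations

def cooccurrence(entity_lists):
    """Count, within one window, how many posts mention each entity and each pair."""
    single, pair = defaultdict(int), defaultdict(int)
    for ents in entity_lists:
        uniq = sorted(set(ents))
        for e in uniq:
            single[e] += 1
        for a, b in combinations(uniq, 2):
            pair[(a, b)] += 1
    return single, pair

def similarity_graph(single, pair, min_sim):
    """Jaccard similarity per pair (assumed measure); drop edges below min_sim."""
    edges = {}
    for (a, b), n_ab in pair.items():
        sim = n_ab / (single[a] + single[b] - n_ab)
        if sim >= min_sim:
            edges[(a, b)] = sim
    return edges

def cluster(nodes, edges):
    """Connected components via union-find, standing in for community discovery."""
    parent = {n: n for n in nodes}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for a, b in edges:
        parent[find(a)] = find(b)
    groups = defaultdict(set)
    for n in nodes:
        groups[find(n)].add(n)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: (-len(g), g))

def grade(clusters, hot_words):
    """Grade each cluster by the number of hot words it contains."""
    return [sum(e in hot_words for e in c) for c in clusters]

# Hypothetical window of per-post entity lists.
window = [
    ["wildfire", "Liangshan"],
    ["wildfire", "Liangshan", "evacuation"],
    ["flood", "Zhengzhou"],
    ["flood", "Zhengzhou"],
    ["wildfire", "flood"],
]
single, pair = cooccurrence(window)
edges = similarity_graph(single, pair, min_sim=0.4)
clusters = cluster(list(single), edges)
grades = grade(clusters, hot_words={"wildfire", "Liangshan", "flood"})
print(clusters, grades)
```

In this toy window the weak wildfire/flood link (one shared post out of five mentioning either) falls below the threshold, so the two events end up in separate clusters. A production system would likely use a dedicated community detection method instead of connected components.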

Description

Technical field

[0001] The invention belongs to the technical field of emergency event detection and identification, and in particular relates to a detection system for real-time identification of emergency events in microblog data streams.

Background

[0002] With the rapid development of Internet technology, emerging Internet services such as social network services, news sites, forums, Weibo, and smartphone-based social platforms have become important platforms for people to spread and obtain information. In recent years in particular, microblogging has risen rapidly and is popular with users because of its immediacy and convenience. People can publish and obtain information about a sudden event in the "real world" as soon as it happens. For example, the Sina Weibo account officially certified by the National Health Commission of China has become the primary way for many Chinese to understand the real-time situation o...


Application Information

IPC(8): G06F16/35, G06F16/33, G06F16/335, G06F16/31, G06F40/295, G06Q50/00
CPC: G06F16/353, G06F16/334, G06F16/335, G06F16/31, G06F40/295, G06Q50/01
Inventor: 庄旭, 尹可鑫, 甘翼, 袁鑫, 丛迅超, 李贵
Owner 10TH RES INST OF CETC