Dual-mode-based event corpus automatic construction method and device and storage medium

A technology for automatic corpus construction, applied in the field of data processing, that addresses the low accuracy, high cost, and incomplete corpora of existing approaches, with the aim of saving cost, making the corpus content more complete, and improving accuracy.

Active Publication Date: 2018-12-11
EAST CHINA UNIV OF SCI & TECH +1
8 Cites, 1 Cited by

AI Technical Summary

Problems solved by technology

[0003] The inventors found at least the following problems in the prior art: when most news topic event corpora are constructed, experts must manually label the news information related to the topic events, which is both inefficient and costly.
In addition, a news topic event usually has many related sub-topic events, so it is difficult to collect all of the relevant event corpus during manual labeling, which leaves the corpus incomplete, with partial coverage and low accuracy.



Examples


Embodiment Construction

[0030] To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the embodiments are described in detail below with reference to the accompanying drawings. Those of ordinary skill in the art will understand that each embodiment provides many technical details so that readers can better understand the present application. The technical solution claimed in this application can nevertheless be realized without these technical details, and with various changes and modifications based on the following embodiments.

[0031] The first embodiment of the present invention relates to a dual-mode-based method for automatically constructing an event corpus. The process is shown in Figure 1 and proceeds as follows:

[0032] Step 101: acquire the first topic event keyword input by the user.

[0033...


Abstract

The embodiments of the invention relate to the field of data processing and disclose a dual-mode-based method, device, and storage medium for automatically constructing an event corpus. The method includes: acquiring a first topic event keyword input by a user; retrieving a first topic event corpus according to the first topic event keyword, and expanding the first topic event corpus to obtain a second topic event corpus; and obtaining a third topic event corpus according to the correlation between the second topic event corpus and the topic, with the third topic event corpus forming the corpus. With this method, experts do not need to annotate the news information related to the topic event, which improves the efficiency of corpus construction and saves labor cost. In addition, the corpora of all relevant events can be collected automatically, which makes the corpus more complete and accurate.
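The three stages named in the abstract (retrieve, expand, filter by correlation with the topic) can be sketched as follows. This is only an illustration under stated assumptions, not the patented method: the text available here does not say how expansion or correlation is computed, so the co-occurrence rule in `expand_keywords`, the cosine score, the `threshold` value, and every function name below are hypothetical choices made for the example.

```python
from collections import Counter
import math
import re


def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def retrieve(documents, keywords):
    """Return the documents that contain at least one of the keywords."""
    wanted = {k.lower() for k in keywords}
    return [d for d in documents if wanted & set(tokenize(d))]


def expand_keywords(corpus, keywords, top_n=2):
    """Assumed expansion rule: add the top_n most frequent co-occurring terms."""
    seen = {k.lower() for k in keywords}
    counts = Counter(
        t for d in corpus for t in tokenize(d) if t not in seen and len(t) > 3
    )
    return list(keywords) + [t for t, _ in counts.most_common(top_n)]


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_corpus(documents, keyword, threshold=0.3):
    """Return the first, second, and third topic event corpora."""
    # Stage 1: retrieve the first corpus with the user's keyword.
    first = retrieve(documents, [keyword])
    # Stage 2: expand the keyword set and retrieve again (second corpus).
    second = retrieve(documents, expand_keywords(first, [keyword]))
    # Stage 3: keep documents similar enough to the topic (third corpus).
    # Assumption: the topic is represented by the word counts of stage 1.
    topic = Counter(t for d in first for t in tokenize(d))
    third = [d for d in second if cosine(Counter(tokenize(d)), topic) >= threshold]
    return first, second, third
```

In this sketch the expansion step deliberately over-collects (any document sharing a frequent co-occurring term is pulled in), and the correlation filter then discards documents whose overall vocabulary is far from the topic. A real system would likely use a search engine for retrieval and a stronger relevance model.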

Description

Technical field

[0001] The embodiments of the present invention relate to the field of data processing, and in particular to a dual-mode-based method, device, and storage medium for automatically constructing an event corpus.

Background art

[0002] In recent years, network technology has developed rapidly, and Internet data has become people's main source of information because it is updated quickly, covers a wide range, and is easy to access. According to statistics, the vast majority of network data is stored as text and records a large number of news events, which often revolve around a certain topic. In the era of big data, screening all news events related to a certain topic out of massive data and constructing a corpus of news topic events helps with the mining and analysis of news events.

[0003] The inventors found that there are at least the following problems in the prior art: when constructing most new...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27
CPC: G06F40/295, G06F40/30
Inventors: 过弋, 王志宏
Owner: EAST CHINA UNIV OF SCI & TECH