Event-based Chinese coreference corpus library establishment method
A corpus construction method, applied to building an event-based Chinese coreference corpus, that addresses the lack of an event-oriented Chinese coreference corpus and yields fewer categories, improved performance, and a clear structure.
Active Publication Date: 2017-06-27
SHANGHAI UNIV
AI Technical Summary
Problems solved by technology
Events involve many aspects of entities, which are called elements. Like static concepts in traditional texts, these elements are referred to in many different ways, and the events themselves are also referred to repeatedly. For event-oriented applications, these references introduce considerable uncertainty and need to be processed and studied, which requires a corpus. So far, however, there is no event-oriented Chinese coreference corpus.
Examples
Embodiment 1
[0044] Referring to Figure 1, this event-based Chinese coreference corpus construction method mainly includes the following steps:
[0045] (1) Select the CEC2.0 corpus as the basis for construction;
[0046] (2) Determine the targets and the method of coreference annotation;
[0047] (3) Formulate annotation specifications for each specific coreference target;
[0048] (4) Preprocess the CEC2.0 corpus text;
[0049] (5) Automatically annotate event elements and event coreference;
[0050] (6) Refine the annotation results through manual annotation;
[0051] (7) Set up a consistency check step to ensure the quality of the corpus annotation.
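The consistency check in step (7) could, for instance, compare the labels that two annotators assign to the same mentions. The text above does not say which agreement measure is used, so the sketch below assumes Cohen's kappa; the function name `cohen_kappa` and the sample labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same mentions.

    Assumption: the patent text does not name an agreement measure;
    kappa is used here only as a common illustrative choice.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Fraction of mentions on which both annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical data: each entry is the coreference chain an annotator
# assigned to one mention ("none" means the mention is not coreferent).
annotator_1 = ["e1", "e1", "e2", "none", "e2", "none"]
annotator_2 = ["e1", "e1", "e2", "none", "e1", "none"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

A low score on a document would flag it for another round of manual annotation under step (6); the threshold for "low" would have to be chosen by the corpus builders.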
Embodiment 2
[0053] This embodiment is basically the same as Embodiment 1, with the following particular features:
[0054] Step (1), selecting the CEC2.0 corpus as the basis for construction, comprises:
[0055] (1-1) Select CEC2.0 as the base corpus;
[0056] (1-2) Check the accuracy of the event and event element annotations against the CEC2.0 annotation specification;
[0057] (1-3) Supplement the annotations of incompletely annotated texts and correct incorrectly annotated texts.
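The check in step (1-2) can be pictured as a scan that lists events whose annotation looks incomplete or malformed, so that they can be fixed in step (1-3). The sketch below is only an illustration: it assumes the corpus is stored as XML, and the tag names `Event`, `Denoter`, `Object`, `Environment` and `Time` are hypothetical stand-ins, not taken from the CEC2.0 annotation specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag names; the actual CEC2.0 specification may differ.
ELEMENT_TAGS = {"Object", "Environment", "Time"}
TRIGGER_TAG = "Denoter"


def find_annotation_problems(xml_text):
    """Return (event_index, message) pairs for suspicious event annotations."""
    root = ET.fromstring(xml_text)
    problems = []
    for index, event in enumerate(root.iter("Event")):
        child_tags = [child.tag for child in event]
        # Assumed rule: every event carries exactly one trigger word.
        if child_tags.count(TRIGGER_TAG) != 1:
            problems.append((index, "expected exactly one trigger"))
        # Assumed rule: only known element tags may appear inside an event.
        for tag in child_tags:
            if tag != TRIGGER_TAG and tag not in ELEMENT_TAGS:
                problems.append((index, f"unknown element tag: {tag}"))
    return problems


# Hypothetical sample: the second event lacks a trigger and uses an unknown tag.
sample = """<Sentence>
<Event><Time>15:45</Time><Denoter>released</Denoter><Object>information</Object></Event>
<Event><Object>office</Object><Place>city</Place></Event>
</Sentence>"""
problems = find_annotation_problems(sample)
```

Each reported pair points an annotator at an event to supplement or correct; the two rules shown are placeholders for whatever the real specification requires.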
[0058] Step (2), determining the targets and method of coreference annotation, comprises:
[0059] (2-1) The coreference targets fall into two categories: coreference of event elements (object, environment and time) and coreference of events. Coreference annotation of event elements is further divided into two kinds: annotation of existing elements and annotation of default elements;
[0060] (2-2) To facilitate processing by computer, all types of re...
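The two-level classification in step (2-1) can be sketched as a small data model. All class and field names below are hypothetical, and the sketch reads "default elements" as elements that are left implicit in the text, which is an assumption rather than something stated above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TargetKind(Enum):
    """What a coreference link points at, per step (2-1)."""
    OBJECT = "object"
    ENVIRONMENT = "environment"
    TIME = "time"
    EVENT = "event"


class ElementStatus(Enum):
    """Sub-classification that applies to event elements only."""
    EXISTING = "existing"  # the element appears in the text
    DEFAULT = "default"    # assumed meaning: the element is left implicit


@dataclass(frozen=True)
class CoreferenceLink:
    mention_id: str      # hypothetical identifier of the referring mention
    antecedent_id: str   # hypothetical identifier of what it refers to
    kind: TargetKind
    status: Optional[ElementStatus] = None

    def __post_init__(self):
        is_event = self.kind is TargetKind.EVENT
        # Event links carry no status; element links must carry one.
        if is_event and self.status is not None:
            raise ValueError("event coreference has no element status")
        if not is_event and self.status is None:
            raise ValueError("element coreference needs an element status")


# Hypothetical usage: a time expression referring back to an earlier one.
link = CoreferenceLink("m2", "m1", TargetKind.TIME, ElementStatus.EXISTING)
```

Validating the combination at construction time keeps the small number of categories explicit and rejects links that mix the two levels.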
[0061] Example 1: Attribute labeling of object elements
[0062]
[0063] Shanghai Municipal Government Information Office
[0064] 15:45 on the 12th release
[0065] information
[0066]
[0067]
[0068] say
[0069]
Abstract
The invention relates to an event-based Chinese coreference corpus library establishment method. The method mainly comprises the following steps: (1) selecting the CEC2.0 corpus library as the basis of establishment; (2) determining the targets and the annotation mode of coreference annotation; (3) making a corresponding annotation specification for each specific coreference target; (4) performing text preprocessing on the CEC2.0 corpora; (5) automatically annotating event elements and event coreference; (6) further optimizing the annotation result through manual annotation; and (7) setting a consistency check step to ensure the quality of corpus annotation. The method overcomes the defects of existing coreference resolution corpus libraries: it covers all events in the corpus library, it is established on the basis of Chinese syntactic and semantic analysis and so conforms to the characteristics of Chinese, and it performs a consistency check on the annotated corpora to ensure the quality of the corpus annotation.
Description
Technical field
[0001] The invention belongs to the field of natural language processing (NLP) and relates to an event-based Chinese coreference corpus construction method.
Background art
[0002] Reference is a common linguistic phenomenon that occurs frequently in daily conversation and in text. It makes language concise and coherent, which helps both communication and writing. Heavy use of reference, however, makes it harder for computers to understand language and text. The main task of coreference resolution is to identify the different expressions in a text that describe the same entity. In the past, a large amount of research concentrated on non-event texts and achieved certain results. With the rise of the concept of "event", more and more scholars have begun to conduct event-oriented research. Events are related to many elements and are knowledge representation units wi...