A method and system for identifying Chinese coreferent events

An event recognition technology applied in the fields of instruments, computing, and electrical digital data processing. It addresses problems such as poor cross-domain generality, conflicting classification results, and weak performance.

Active Publication Date: 2018-08-07
SUZHOU UNIV

AI Technical Summary

Problems solved by technology

[0009] At present, in the field of Chinese coreferent event recognition, most approaches are either classifier-based machine learning methods or rule-based methods. These approaches have the following problems: 1) Most machine learning methods for Chinese coreferent event recognition still reuse methods designed for English and do not account for the characteristics of the Chinese language, so they perform poorly on Chinese; 2) The machine learning methods assume that event pairs are independent of each other, which can easily lead to conflicting classification results and inconsistent event chains; 3) Rule-based methods are costly to construct and are not general enough to be used across domains.

Examples

example 1

[0080] Example 1: At 7 am on December 14, 2012, more than 10 monkeys used monkey paws to create a wounding case in the corn field of Chenpeng Village. Four villagers were injured when they were scratched by the monkey's paw. Subsequently, the monkey who caused the wounding case was driven away by the police. So far, two villagers have been seriously injured. ...the group of monkeys once broke into the residence of an elderly man who lived alone. When the monkey attacked the old man, the old man resisted. After the old man was slightly injured, the monkey rushed into the cornfield of Chenpeng Village.

[0081] Event annotation information can be generated by event extraction tools or manually, as shown in Example 2:

example 2

[0082] Example 2: E1: Tri= SenID=1 Type=Attack Args={December 14th, 2012 at 7 am / TIME / Time; more than 10 monkeys / PER / Attacker; monkey paw / WEA / Instrument; Chenpeng Village cornfield / LOC / Place} Polarity=True Tense=Past

[0083] E2: Tri=Scratch SenID=2 Type=Attack Args={Villager / PER / Target; Monkey Paw / WEA / Instrument} Polarity=True Tense=Past

[0084] E3: Tri=Injured SenID=2 Type=Injure Args={Villager / PER / Victim; Monkey Paw / WEA / Instrument} Polarity=True Tense=Past

[0085] E4: Tri=Assault SenID=3 Type=Attack Args={Monkey / PER / Attacker} Polarity=True Tense=Past

[0086] E5: Tri=Drive SenID=3 Type=Arrest Args={Civil Police / PER / Agent; Monkey / PER / Person} Polarity=True Tense=Past

[0087] E6: Tri=Serious Injury SenID=4 Type=Injure Args={Current / TIME / Time; Villager / PER / Victim} Polarity=True Tense=Past

[0088] E7: Tri=Intrusion SenID=9 Type=Transport Args={Monkey / PER / Artifact; Residence / LOC / Place} Polarity=True Tense=Past

[0089] E8: Tri=Attack SenID=10 Type=Attack Args={Monkey / PER / Attacker; O...
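The annotation lines above follow a regular key=value layout, so they can be loaded into a small record structure. The following is a minimal sketch only: the class names `Event` and `Argument`, the function `parse_event`, and the regular expression are illustrative assumptions and are not part of the patent text.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Argument:
    text: str         # surface text of the argument, e.g. "Villager"
    entity_type: str  # entity type tag, e.g. "PER"
    role: str         # role in the event, e.g. "Target"


@dataclass
class Event:
    event_id: str
    trigger: str
    sentence_id: int
    event_type: str
    arguments: list = field(default_factory=list)
    polarity: bool = True
    tense: str = ""


# Hypothetical pattern matching the layout shown in Example 2.
EVENT_PATTERN = re.compile(
    r"(?P<id>E\d+):\s*Tri=(?P<tri>.*?)\s*SenID=(?P<sen>\d+)\s*"
    r"Type=(?P<type>\w+)\s*Args=\{(?P<args>.*?)\}\s*"
    r"Polarity=(?P<pol>\w+)\s*Tense=(?P<tense>\w+)"
)


def parse_event(line: str) -> Event:
    """Parse one annotation line into an Event record."""
    match = EVENT_PATTERN.search(line)
    if match is None:
        raise ValueError(f"not an event annotation: {line!r}")
    arguments = []
    for chunk in match.group("args").split(";"):
        # Each argument is written as "text / entity type / role".
        text, entity_type, role = (part.strip() for part in chunk.split("/"))
        arguments.append(Argument(text, entity_type, role))
    return Event(
        event_id=match.group("id"),
        trigger=match.group("tri"),
        sentence_id=int(match.group("sen")),
        event_type=match.group("type"),
        arguments=arguments,
        polarity=match.group("pol") == "True",
        tense=match.group("tense"),
    )


e2 = parse_event(
    "[0083] E2: Tri=Scratch SenID=2 Type=Attack "
    "Args={Villager / PER / Target; Monkey Paw / WEA / Instrument} "
    "Polarity=True Tense=Past"
)
print(e2.event_id, e2.trigger, e2.event_type, [a.role for a in e2.arguments])
```

The sketch assumes that argument text never contains "/" or ";"; a production parser would need a more robust format.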

example 3

[0095]

[0096] This indicates that E1 and E2, E1 and E4, E2 and E4, and E3 and E6 are coreferent events.
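The coreferent pairs listed in [0096] imply two event chains. As a rough illustration, the pairs can be merged into chains with a union-find structure. This is only a sketch of transitive grouping; the function name `build_chains` is hypothetical, and the patent does not state that its global optimization works this way.

```python
def build_chains(pairs):
    """Group pairwise coreference links into event chains (transitive closure)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    chains = {}
    for event_id in parent:
        chains.setdefault(find(event_id), set()).add(event_id)
    return sorted(sorted(chain) for chain in chains.values())


pairs = [("E1", "E2"), ("E1", "E4"), ("E2", "E4"), ("E3", "E6")]
print(build_chains(pairs))  # [['E1', 'E2', 'E4'], ['E3', 'E6']]
```

Grouping by chain also exposes the consistency issue raised in the problem statement: if a pairwise classifier linked E1 with E2 and E2 with E4 but rejected E1 with E4, the chain would still place all three together, so the pairwise decisions would conflict.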

[0097] Figure 2 is a detailed flowchart of step S1 of the method for identifying Chinese coreferent events provided by a preferred embodiment of the present invention. As shown in Figure 2, step S1 further includes the following steps.

[0098] S101. Invoke a word segmentation tool to segment each event sentence in the coreference-annotated text and the test text, obtaining a segmented annotation set and a segmented test set in which words are separated by spaces.
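The patent does not name the word segmentation tool. To show the kind of space-separated output that step S101 produces, the sketch below uses a toy forward maximum matching segmenter. The function `segment`, the dictionary `vocab`, and the sample sentence are all assumptions made for illustration; a real system would call a full Chinese segmenter instead.

```python
def segment(sentence, vocab, max_len=4):
    """Toy forward maximum matching: greedily take the longest dictionary word."""
    tokens = []
    i = 0
    while i < len(sentence):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            piece = sentence[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return " ".join(tokens)


# Hypothetical dictionary and sentence ("the monkey broke into the cornfield").
vocab = {"猴子", "闯入", "玉米地"}
print(segment("猴子闯入玉米地", vocab))  # 猴子 闯入 玉米地
```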

[0099] For example: the event sentence "at 7 o'clock in the morning on December 14, 2012, more than 10 monkeys used monkey paws to create a wounding case in the cornfield of Chenpeng Village." After word segmentation, it becomes:

Abstract

A method and system for identifying Chinese coreferent events, the method comprising: performing word segmentation, entity recognition, and syntactic analysis on each event-containing sentence in the coreference-annotated text and the test text, to obtain a preprocessed annotated text set and a preprocessed test text set; extracting, document by document, the event pairs of the same event type and their feature information from the preprocessed annotated text set and the preprocessed test text set, to obtain an annotated text feature set and a test text feature set; training a coreferent event recognition model on the features of each event pair in the annotated text feature set; using the coreferent event recognition model to judge whether a coreference relationship exists between the event pair corresponding to each feature in the test text feature set, to obtain a first event coreference set; and performing global optimization on the initially identified coreferent events in the first event coreference set, to obtain the final event coreference set. In this way, the performance of coreferent event recognition is improved.
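The abstract states that candidate event pairs are formed, per document, from events of the same event type. A minimal sketch of that pairing step follows, using the event types from Example 2. The function name `candidate_pairs` and the tuple layout are illustrative assumptions, and feature extraction and model training are not shown.

```python
from collections import defaultdict
from itertools import combinations


def candidate_pairs(events):
    """Return all pairs of events in one document that share an event type."""
    by_type = defaultdict(list)
    for event_id, event_type in events:
        by_type[event_type].append(event_id)
    pairs = []
    for event_ids in by_type.values():
        pairs.extend(combinations(event_ids, 2))
    return pairs


# (event id, event type) for the events listed in Example 2.
document = [
    ("E1", "Attack"), ("E2", "Attack"), ("E3", "Injure"), ("E4", "Attack"),
    ("E5", "Arrest"), ("E6", "Injure"), ("E7", "Transport"), ("E8", "Attack"),
]
for pair in candidate_pairs(document):
    print(pair)
```

Restricting pairs to the same event type reduces the eight events to seven candidate pairs instead of the 28 possible pairs, and every coreferent pair listed in [0096] is among them.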

Description

Technical field

[0001] The invention belongs to the field of natural language processing, and in particular relates to a method and system for identifying Chinese coreferent events.

Background art

[0002] An event is a main form of information representation. It is an objective fact (also called a "natural event") in which specific people and things interact at a specific time and place, such as injury and death events or food additive incidents. An article often contains many events, and there are various relationships between these events. When two events point to the same event ontology, the two events are considered to have a co-reference (or coreference) relationship. For example:

[0003] Example 1: The heads of state of the two countries held talks in Paris today. ... The two sides discussed the issue of peace in the Middle East during the talks.

[0004] Example 2: The financial crisis broke out in the United State...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/27
Inventor: 李培峰, 朱巧明, 周国栋, 朱晓旭
Owner: SUZHOU UNIV