
Evaluation object extraction method based on domain dictionary and semantic roles

A technology for evaluation object extraction using semantic roles, applied in semantic analysis, natural language data processing, and special data processing applications. It addresses two problems: poor domain adaptability, and the inability to fully mine features from the limited Chinese annotated corpus. It improves extraction accuracy on Chinese sentences with flexible and diverse structures.

Active Publication Date: 2015-01-07
BEIJING INSTITUTE OF TECHNOLOGY
4 Cites, 43 Cited by

AI Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to solve two problems: existing rule-based evaluation object extraction methods have poor domain adaptability, and machine-learning-based methods cannot fully mine features from the limited Chinese annotated corpus. To this end, the invention proposes a method for extracting evaluation objects from Chinese sentences based on a domain dictionary and semantic roles.



Examples


Embodiment Construction

[0034] The present invention is further described below with reference to an embodiment.

[0035] In this embodiment, the data set provided for Task 4 of the Sixth Chinese Opinion Analysis Evaluation (COAE2014 for short) is used as the experimental corpus for building the domain dictionary and training the CRFs. In this corpus, each sentence contains an annotated evaluation object (the label OT denotes an evaluation object).
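CRF training needs one label per token, so an annotated evaluation object is usually converted into a tag sequence first. The sketch below shows one common way to do this, BIO tagging. It is an illustration only: the source does not say how the OT annotation is encoded, and the function name `bio_tags` and the tag names `B-OT`, `I-OT`, and `O` are assumptions.

```python
def bio_tags(tokens, target_tokens):
    """Label a tokenized sentence with BIO tags for one evaluation object.

    Assumption: the annotated evaluation object is given as a token
    sequence, and only its first occurrence in the sentence is labeled.
    """
    tags = ["O"] * len(tokens)
    n = len(target_tokens)
    if n == 0:
        return tags
    for start in range(len(tokens) - n + 1):
        if tokens[start:start + n] == target_tokens:
            tags[start] = "B-OT"              # first token of the object
            for i in range(start + 1, start + n):
                tags[i] = "I-OT"              # remaining tokens of the object
            break
    return tags
```

For a hypothetical segmented sentence such as `["iphone5s", "的", "屏幕", "很", "漂亮"]` with the evaluation object `["屏幕"]`, only the third token receives `B-OT`.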

[0036] Step 1: Preliminarily filter the corpus S (mostly sentences from microblogs and forums) according to rules. The rules are as follows:

[0037] Rule 1: Remove purely English sentences (the current focus is the analysis of Chinese sentences);

[0038] Rule 2: Split the sentence at "//" and reverse the order of the clauses. For example, if user a reposts user b's microblog "iphone5s is very beautiful." and adds "I like it very much!", the corpus S represents this as: "I like it very much! // iphone5s is very...
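The two rules stated so far can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names are hypothetical, "pure English" is approximated as "contains no CJK ideograph", and the repost delimiter is assumed to appear as "//" or, after text extraction, as "/ /". Any rules beyond Rule 2 are not covered.

```python
import re

# Assumption: the repost delimiter is "//" (or "/ /" after extraction).
# The lookbehind avoids splitting the "//" inside URLs such as "http://".
REPOST_DELIMITER = re.compile(r"(?<!:)/\s?/")
# Assumption: a sentence is treated as Chinese if it contains at least
# one character in the basic CJK Unified Ideographs range.
CJK_CHAR = re.compile(r"[\u4e00-\u9fff]")


def is_pure_english(sentence):
    """Rule 1 check: True when the sentence has no Chinese characters."""
    return CJK_CHAR.search(sentence) is None


def reorder_repost(sentence):
    """Rule 2: split at the repost delimiter and reverse the clauses,
    so the reposted (original) text precedes the comment on it."""
    clauses = [c.strip() for c in REPOST_DELIMITER.split(sentence)]
    return [c for c in clauses if c][::-1]


def prefilter(corpus):
    """Apply Rule 1, then Rule 2, to every sentence in the corpus."""
    return [reorder_repost(s) for s in corpus if not is_pure_english(s)]
```

Reversing the clauses puts the original post first, so a comment that refers back to it is read after the text it refers to.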


Abstract

The invention relates to an evaluation object extraction method based on a domain dictionary and semantic roles, and belongs to the field of natural language processing applications. The method comprises the following steps. First, a domain dictionary DL of evaluation objects is built from part-of-speech, dependency, and semantic role information. Second, features covering four aspects (words, dependency, relative position, and semantic roles) are extracted; conditional random fields (CRFs) are trained on DL and these features and then used for prediction, which completes the extraction of evaluation objects. Chinese sentences, especially microblog and forum reviews, have flexible and diverse structures, variable constructions, and few usable features. Compared with the prior art, the method therefore makes full use of syntactic and semantic information at different levels and combines the advantages of rule-based and machine-learning-based extraction. It finds high-confidence evaluation objects in a corpus automatically, quickly, and accurately, and improves the accuracy of evaluation object extraction for Chinese sentences.
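The four feature groups named in the abstract can be sketched as a per-token feature dictionary, a shape that CRF toolkits such as sklearn-crfsuite accept. This is an illustration under stated assumptions, not the patent's feature set: the `Token` fields, the feature names, and the `anchor` parameter are hypothetical, and because the source does not define "relative position", it is taken here as the token's position relative to a caller-supplied anchor index (for example, an opinion word).

```python
from dataclasses import dataclass


@dataclass
class Token:
    word: str    # surface form
    pos: str     # part-of-speech tag
    head: int    # index of the dependency head, -1 for the root
    deprel: str  # dependency relation to the head
    role: str    # semantic role label, "O" if none


def token_features(sent, i, domain_dict, anchor):
    """Build one feature dict covering words, dependency,
    relative position, and semantic role for token i."""
    tok = sent[i]
    head_word = sent[tok.head].word if tok.head >= 0 else "ROOT"
    if i < anchor:
        rel_pos = "before"
    elif i > anchor:
        rel_pos = "after"
    else:
        rel_pos = "same"
    return {
        "word": tok.word,
        "pos": tok.pos,
        "in_dict": tok.word in domain_dict,  # lookup in the dictionary DL
        "deprel": tok.deprel,
        "head_word": head_word,
        "rel_pos": rel_pos,                  # assumed definition, see above
        "role": tok.role,
    }


def sentence_features(sent, domain_dict, anchor):
    """Feature dicts for every token in a parsed sentence."""
    return [token_features(sent, i, domain_dict, anchor)
            for i in range(len(sent))]
```

The part-of-speech tags, dependency parse, and semantic roles are expected to come from an external Chinese parser; the sketch only arranges their output into features.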

Description

Technical field

[0001] The invention relates to a method for extracting evaluation objects from Chinese sentences, in particular to an evaluation object extraction method based on domain dictionaries and semantic roles, and belongs to the technical field of natural language processing applications.

Background technique

[0002] With the development of the Internet, especially Web 2.0, more and more people not only obtain information through the Internet but also communicate on it. The formation and development of blogs, microblogs, and forums have greatly changed the way people use the Internet. As the Internet grows, it is becoming more and more difficult to understand, integrate, and analyze the massive amount of information on it manually. It is against this application background that the technology of capturing and analyzing web texts has emerged. Due to t...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F17/27
CPC: G06F16/9535, G06F40/242, G06F40/30
Inventors: 冯冲, 廖纯, 杨森, 黄河燕
Owner: BEIJING INSTITUTE OF TECHNOLOGY