Method and system for constructing Tibetan emotional dictionary based on Tibetan language features

A sentiment dictionary and language feature technology, applied in natural language data processing, special data processing applications, instruments, etc., that addresses problems such as the late start of Tibetan sentiment analysis research, the slow development of Tibetan language processing, and misunderstanding.

Inactive Publication Date: 2017-09-01
MINZU UNIVERSITY OF CHINA
2 Cites, 4 Cited by

AI Technical Summary

Problems solved by technology

[0003] Tibetan is an important language in China, yet Tibetan language processing has developed slowly. Tibetan sentiment analysis research started relatively late, corpora and emotional resources are scarce, Tibetan lacks a semantic dictionary, and it is difficult to analyze and det



Examples


Embodiment Construction

[0070] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0071] The purpose of the present invention is to provide a method for constructing a Tibetan emotional dictionary based on Tibetan language features. The method matches the Chinese vocabulary ontology with a Chinese-Tibetan dictionary to obtain a Tibetan basic emotional dictionary, uses the Word2vec tool to train on and screen a corpus of Tibetan microblog information, and expands the Tibetan basic emotional dictionary to provide more Tibetan emotional vocabul...
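The matching step in [0071] can be pictured as a join between two lookup tables. The following is a minimal sketch, not the patent's implementation: the function name `build_basic_dictionary`, the emotion labels, and the toy Chinese and Tibetan entries are all illustrative assumptions, and the rule of keeping the first label seen for a Tibetan word is likewise an assumption.

```python
from typing import Dict, List


def build_basic_dictionary(
    chinese_ontology: Dict[str, str],
    zh_bo_dictionary: Dict[str, List[str]],
) -> Dict[str, str]:
    """Give each Tibetan translation the emotion class of its Chinese source word."""
    tibetan_lexicon: Dict[str, str] = {}
    for zh_word, emotion in chinese_ontology.items():
        # Chinese words with no entry in the bilingual dictionary are skipped.
        for bo_word in zh_bo_dictionary.get(zh_word, []):
            # Assumption: keep the first label seen; a real system would
            # need a policy (or manual review) for conflicting labels.
            tibetan_lexicon.setdefault(bo_word, emotion)
    return tibetan_lexicon


# Toy data (hypothetical entries, not taken from the patent).
chinese_ontology = {"快乐": "joy", "悲伤": "sadness", "愤怒": "anger"}
zh_bo_dictionary = {"快乐": ["དགའ་བ", "སྐྱིད་པོ"], "悲伤": ["སྐྱོ་བ"]}

basic_dictionary = build_basic_dictionary(chinese_ontology, zh_bo_dictionary)
for bo_word, emotion in basic_dictionary.items():
    print(bo_word, emotion)
```

In this toy run, "愤怒" has no Chinese-Tibetan entry and so contributes nothing, which illustrates why the basic dictionary alone leaves gaps that the later expansion step is meant to fill.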



Abstract

The present invention discloses a method and a system for constructing a Tibetan emotional dictionary based on Tibetan language features. The method comprises: matching the Chinese vocabulary ontology, which carries emotion classifications, with a Chinese-Tibetan dictionary to obtain a Tibetan basic emotional dictionary; carrying out corpus training on preliminarily collected Tibetan microblog information by using the Word2vec tool to obtain a synonym set of the corpus training vocabulary, and taking the synonym set as an extended candidate word set; calculating the weight variance of each extended candidate word; and screening the extended candidate words according to the weight variance to obtain emotional extension words. In the disclosed method, the Chinese vocabulary ontology is matched with a Chinese-Tibetan dictionary to obtain the Tibetan basic emotional dictionary, the Tibetan microblog information is trained on and screened by using the Word2vec tool, and the Tibetan basic emotional dictionary is extended accordingly, so that more Tibetan emotional vocabulary is provided and the emotion expressed in current Tibetan microblog information can be analyzed accurately.
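The screening step can be sketched in a few lines, with one caveat stated up front: the text available here does not define "weight variance". The sketch below assumes that a candidate's weight for an emotion class is its mean cosine similarity to that class's seed words, that the variance is taken across classes, and that candidates with a variance at or above a threshold are kept. The vectors stand in for Word2vec output, and all names (`class_weights`, `screen_candidates`, `min_variance`) are hypothetical.

```python
import math
from statistics import pvariance
from typing import Dict, List


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def class_weights(candidate, seeds_by_class, vectors) -> Dict[str, float]:
    """Assumed weight: mean similarity of the candidate to each class's seeds."""
    weights = {}
    for emotion, seeds in seeds_by_class.items():
        sims = [cosine(vectors[candidate], vectors[s]) for s in seeds if s in vectors]
        weights[emotion] = sum(sims) / len(sims) if sims else 0.0
    return weights


def screen_candidates(candidates, seeds_by_class, vectors, min_variance):
    """Keep candidates whose weights differ strongly across emotion classes."""
    accepted = {}
    for word in candidates:
        weights = class_weights(word, seeds_by_class, vectors)
        # Assumption: a high variance means the word leans toward one class.
        if pvariance(list(weights.values())) >= min_variance:
            accepted[word] = max(weights, key=weights.get)
    return accepted


# Toy 2-D vectors standing in for trained Word2vec embeddings.
vectors = {
    "joy_seed": [1.0, 0.0],
    "sad_seed": [0.0, 1.0],
    "cand_a": [0.9, 0.1],   # close to the joy seed
    "cand_b": [0.7, 0.7],   # equally close to both seeds
}
seeds_by_class = {"joy": ["joy_seed"], "sadness": ["sad_seed"]}

extension_words = screen_candidates(
    ["cand_a", "cand_b"], seeds_by_class, vectors, min_variance=0.05
)
print(extension_words)
```

Under these assumptions, `cand_a` is accepted and labeled "joy", while `cand_b` is rejected because its weights are identical across classes. Whether the patent keeps high-variance or low-variance candidates, and how it computes the weights, would need to be checked against the full description.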

Description

Technical field

[0001] The invention relates to the technical field of microblog language analysis, in particular to a method and system for constructing a Tibetan emotional dictionary based on Tibetan language features.

Background technique

[0002] At present, sentiment analysis for English and Chinese is relatively mature. English sentiment processing in particular has very comprehensive sentiment lexicon resources, among them the well-known SentiWordNet, built on Princeton University's WordNet, and the General Inquirer (GI) dictionary compiled and developed by Harvard University; these dictionaries are among the resources commonly used by many researchers. They not only list the meaning of each word but also mark its emotional attributes accordingly. The resources that can be used in Chinese include "HowNet" developed by Dong Zhendong; "Dictionary of Complimentary and Derogatory Terms" compiled by Zhang Wei, Liu Jin and others; "Dic...


Application Information

IPC(8): G06F17/30, G06F17/27
CPC: G06F16/374, G06F40/289
Inventor 邱莉榕
Owner MINZU UNIVERSITY OF CHINA