Cross-domain sentiment classification method based on contrastive alignment network

A cross-domain sentiment classification technology based on a contrastive alignment network, applied in cross-domain sentiment analysis. It addresses problems such as the limited applicability of unsupervised domain adaptation (UDA) when unlabeled target-domain data is insufficient, with the stated effect of improving user experience.

Pending Publication Date: 2022-07-15
BEIJING INSTITUTE OF TECHNOLOGY

AI Technical Summary

Problems solved by technology

In practical applications, the large amount of unlabeled target-domain data that UDA requires may not be available, which limits the applicability of UDA.


Embodiment

[0050] As shown in Figure 1, a cross-domain sentiment classification method based on a contrastive alignment network includes the following steps:

[0051] Step 1: Text preprocessing.

[0052] First, load the review corpus and the pretrained language model. The pretrained language model may be BERT or another model such as RoBERTa.

[0053] Then, text preprocessing and text data formatting are performed on the review corpus.

[0054] Specifically, this comprises the following steps:

[0055] Step 1.1: For each review sentence, extract the attribute words, the opinion words, and their position information.

[0056] Step 1.2: Use the NLTK tokenizer to pre-tokenize the review sentence, separating the tokens with spaces.

[0057] Step 1.3: Add two special tokens, [CLS] at the start and [SEP] at the end, to the token sequence of the review sentence, thus constructing the general input form S = {[CLS], w_1, w_2, ..., w_n, [SEP]}, where n represents the total number of token w...
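
Steps 1.2 and 1.3 can be sketched roughly as follows. This is only an illustration under stated assumptions: a simple regular expression stands in for the NLTK tokenizer so that the snippet runs without extra downloads, and the function names simple_tokenize and build_input are hypothetical, not taken from the patent.

```python
import re

CLS, SEP = "[CLS]", "[SEP]"


def simple_tokenize(sentence):
    """Split a sentence into word and punctuation tokens.

    Assumption: this regex is a stand-in for the NLTK tokenizer
    named in Step 1.2; its output can differ from NLTK's.
    """
    return re.findall(r"\w+|[^\w\s]", sentence)


def build_input(sentence):
    """Return S = [CLS], w_1, ..., w_n, [SEP] for one review sentence."""
    tokens = simple_tokenize(sentence)
    return [CLS] + tokens + [SEP]


if __name__ == "__main__":
    review = "The battery life is great, but the screen is dim."
    sequence = build_input(review)
    # Step 1.2 separates tokens with spaces; join them the same way here.
    print(" ".join(sequence))
```

In a real pipeline the pretrained model's own subword tokenizer would normally run after this pre-tokenization step, so the sequence above is only the word-level form of S.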


Abstract

The invention relates to a cross-domain sentiment analysis method based on a contrastive alignment network and belongs to the technical field of fine-grained sentiment analysis in natural language processing. The method studies an under-explored setting of cross-domain sentiment classification in which the target domain has only a few samples. For this setting, the invention provides a neural network model named the contrastive alignment network (CAN). The model randomly draws one instance from the source domain and one from the target domain and is trained on the resulting source-target pairs under two objectives. The first objective minimizes the classification error on the source domain. The second is a pairwise contrastive objective: the distance between the target-domain instance and the source-domain instance in a pair is minimized if they express the same sentiment; otherwise it is maximized up to a constant upper bound. The method addresses the problem of limited target-domain data in cross-domain sentiment classification and thereby improves user experience.
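
The pairwise contrastive objective described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula, so this assumes the classic margin-based contrastive loss with squared Euclidean distance; the function names and the margin parameter (playing the role of the constant upper bound) are hypothetical.

```python
import math


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_pair_loss(source_vec, target_vec, same_sentiment, margin=1.0):
    """Loss for one (source instance, target instance) pair.

    Assumed form, not confirmed by the patent text:
      same sentiment      -> d^2                 (pull the pair together)
      different sentiment -> max(0, margin - d)^2 (push apart, but only
                             until the distance reaches the margin)
    """
    d = euclidean_distance(source_vec, target_vec)
    if same_sentiment:
        return d ** 2
    return max(0.0, margin - d) ** 2


if __name__ == "__main__":
    src = [0.0, 0.0]
    tgt = [0.0, 0.5]
    print(contrastive_pair_loss(src, tgt, same_sentiment=True))
    print(contrastive_pair_loss(src, tgt, same_sentiment=False))
```

Under this assumed form, a pair with different sentiments contributes no loss once its distance exceeds the margin, which matches the idea of maximizing the distance only up to a constant limit. In training, this term would be combined with the source-domain classification loss; how the two are weighted is not stated in the text above.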

Description

Technical field

[0001] The invention relates to a cross-domain sentiment classification method, in particular to a cross-domain sentiment analysis method based on a contrastive alignment network, and belongs to the technical field of fine-grained sentiment analysis in natural language processing.

Background

[0002] Cross-Domain Sentiment Classification (CDSC) is an important task that aims to transfer knowledge learned in the source domain to the target domain. CDSC enables a sentiment classification model trained in a source domain with large amounts of labeled data to perform well on target-domain data with limited training samples. The situation in which target-domain data is scarce while source-domain data is sufficient is common and challenging in industry, and the main challenge lies in the domain shift (or distribution shift) between the source domain and the target domain. The domain shift problem is mainly a...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/289, G06F40/30, G06F16/35
CPC: G06F40/289, G06F40/30, G06F16/35
Inventor: 宋大为, 马放, 张辰, 杨艺
Owner: BEIJING INSTITUTE OF TECHNOLOGY