A Cross-Domain Text Classification Method Based on Adaptive Noise Reduction Encoder

A cross-domain text classification technology, applied to the cross-domain classification of network text data. It addresses the enlarged feature space, the sensitivity to the noise coefficient, and the increased high dimensionality and sparsity of text data.

Active Publication Date: 2021-09-14

HEFEI UNIV OF TECH

6 Cites, 0 Cited by


AI Technical Summary

Problems solved by technology

Cross-domain classification tasks must process features from several different domains at the same time. This enlarges the feature space and aggravates the high dimensionality and sparsity of text data, which makes it difficult to select meaningful features and to learn a common feature space for cross-domain classification.

[0007] Second, although the edge denoising encoder can learn a relatively robust feature space in cross-domain classification tasks, its learning results are sensitive to the noise coefficient.
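To make the dependence on the noise coefficient concrete, the following is a sketch of one linear denoising layer with a closed-form solution. It assumes that the "edge denoising encoder" corresponds to what is commonly called a marginalized denoising autoencoder; the source does not give these formulas, and the symbols $X$, $S$, $W$, $m$, and $p$ are introduced here for illustration only. Each feature is zeroed independently with probability $p$ (the noise coefficient), and the reconstruction weights minimize the expected squared error.

```latex
% Assumed setup (not from the source): X = [x_1, ..., x_n] is the d x n data matrix,
% p is the noise coefficient, and m_{i,\alpha} is an independent keep/drop mask.
\tilde{x}_{i,\alpha} = m_{i,\alpha}\, x_{i,\alpha},
\qquad m_{i,\alpha} \sim \mathrm{Bernoulli}(1 - p)

W = \arg\min_{W} \sum_{i=1}^{n} \mathbb{E}\,\bigl\lVert x_i - W \tilde{x}_i \bigr\rVert^2
  = \mathbb{E}[P]\,\mathbb{E}[Q]^{-1},
\qquad P = \sum_{i=1}^{n} x_i \tilde{x}_i^{\top},
\quad Q = \sum_{i=1}^{n} \tilde{x}_i \tilde{x}_i^{\top}

% With S = X X^T, the expectations follow from E[m] = 1 - p and E[m^2] = 1 - p:
\mathbb{E}[P] = (1 - p)\, S,
\qquad
\mathbb{E}[Q]_{\alpha\beta} =
\begin{cases}
(1 - p)^2\, S_{\alpha\beta}, & \alpha \neq \beta \\
(1 - p)\, S_{\alpha\alpha},  & \alpha = \beta
\end{cases}
```

Under these assumptions, $p = 0$ gives $W = I$ whenever $S$ is invertible, while $p \to 1$ gives $W \to S\,\operatorname{diag}(S)^{-1}$. The learned mapping therefore changes with $p$, which is one way to read the sensitivity described in [0007].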


Embodiment Construction

[0052] Referring to Figure 1, the cross-domain text classification method based on the adaptive noise reduction encoder in this embodiment is carried out in the following steps:

[0053] Step 1: Count the feature words and their frequencies of occurrence in the source and target domains.

[0054] Obtain the target domain data set DT and the source domain data set DS with label information, respectively:

[0055]

[0056] $t_i$ is the $i$-th sample in the target domain data set DT; $n_t$ is the number of samples in DT; $w_a^i$ denotes the $a$-th feature word in the $i$-th sample $t_i$ of DT, $a = 1, 2, \ldots, nw_t$; and $nw_t$ is the number of feature words of the samples in DT.

[0057] $s_j$ is the $j$-th sample in the source domain data set DS; $n_s$ is the number of samples in DS; $w_b^j$ denotes the $b$-th feature word in the $j$-th sample $s_j$ of DS, $b = 1, 2, \ldots, nw$ ...
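Step 1 can be illustrated with a minimal sketch, assuming each sample has already been tokenized into a list of feature words. The function name `count_feature_words` and the toy samples are hypothetical and do not come from the patent text.

```python
from collections import Counter


def count_feature_words(samples):
    """Count feature words over a list of tokenized samples.

    Returns two Counters: total occurrences of each word, and the
    number of samples in which each word appears at least once.
    """
    term_freq = Counter()
    doc_freq = Counter()
    for words in samples:
        term_freq.update(words)       # every occurrence counts
        doc_freq.update(set(words))   # each sample counts a word once
    return term_freq, doc_freq


# Hypothetical toy data: a labeled source domain and an unlabeled target domain.
source_samples = [["good", "battery", "good"], ["poor", "screen"]]
target_samples = [["good", "plot"], ["boring", "plot", "plot"]]

source_tf, source_df = count_feature_words(source_samples)
target_tf, target_df = count_feature_words(target_samples)
```

Keeping separate counts per domain lets a later filtering step compare how a word behaves in DS and in DT, rather than only in the pooled data.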


Abstract

The invention discloses a cross-domain text classification method based on an adaptive noise reduction encoder. A feature selection method suited to cross-domain tasks filters out the low-frequency and meaningless feature words that appear in the samples of the source domain data set and the target domain data set. The optimal noise interference coefficient is then calculated adaptively from the distribution difference between the samples in the source domain set and the target domain set, and is used to perturb the feature space. Finally, a new feature space is built with the stacked edge denoising encoder approach, and a classifier is constructed. The invention can better mine the latent feature relationships between domains and reduce domain differences, and can thus improve classification accuracy.
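The first two stages of the abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: the frequency threshold `min_count`, the use of Jensen-Shannon divergence as the distribution difference, and the linear mapping onto a range `[low, high]` are all hypothetical stand-ins, since the excerpt does not give the actual selection rule or the formula for the optimal noise interference coefficient.

```python
import math
from collections import Counter


def select_features(source, target, min_count=2):
    """Keep words occurring at least min_count times across both domains.
    (Hypothetical filter; the patent's actual criterion is not shown.)"""
    counts = Counter(w for words in source + target for w in words)
    return {w for w, c in counts.items() if c >= min_count}


def word_distribution(samples, vocabulary):
    """Relative frequency of each vocabulary word within one domain."""
    counts = Counter(w for words in samples for w in words if w in vocabulary)
    total = sum(counts.values())
    return {w: (counts[w] / total if total else 0.0) for w in vocabulary}


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; lies between 0 and 1."""
    total = 0.0
    for w in p:
        m = 0.5 * (p[w] + q[w])
        for x in (p[w], q[w]):
            if x > 0:
                total += 0.5 * x * math.log2(x / m)
    return total


def adaptive_noise(source, target, vocabulary, low=0.5, high=0.9):
    """Map the domain divergence linearly onto [low, high].
    (Assumed mapping for illustration only.)"""
    d = js_divergence(word_distribution(source, vocabulary),
                      word_distribution(target, vocabulary))
    return low + (high - low) * d


# Hypothetical toy data.
source = [["good", "battery", "good"], ["poor", "screen"]]
target = [["good", "plot"], ["boring", "plot", "plot"]]

vocab = select_features(source, target)
noise = adaptive_noise(source, target, vocab)
```

In this sketch a larger divergence between the two word distributions yields a larger noise coefficient, on the assumption that more dissimilar domains call for a stronger perturbation; the direction and shape of the real mapping may differ.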

Description

Technical Field

[0001] The invention relates to a cross-domain text classification method based on an adaptive noise reduction encoder for classifying network text data, and more specifically to the cross-domain classification of network text data from different fields with different data distributions.

Background Art

[0002] In recent years, with the rapid rise of online social platforms such as blogs, WeChat, and Weibo, a large amount of text information has been generated on the Internet. These massive data often contain great potential commercial value: the information can be used to improve or upgrade products in a targeted manner, so as to meet consumer needs, increase market competitiveness, and win greater favor with consumers. In view of this, research in related fields such as text classification has extremely important value and significance.

[0003] However, because the data in the network is affected by multiple factors such as users and ti...


Application Information


Patent Type & Authority: Patents (China)

IPC(8): G06F16/35, G06F40/289

CPC: G06F40/289

Inventor: 张玉红, 杨帅, 李玉玲, 李培培

Owner: HEFEI UNIV OF TECH
