A text classification method and device based on transfer learning

A text classification technology based on transfer learning, applied to text classification methods and devices. It addresses feature engineering that is time-consuming and labor-intensive and that strongly affects results, with the effect of avoiding that workload, reducing the demand for training data, and lowering labor costs.

Inactive Publication Date: 2019-05-03
北京牡丹电子集团有限责任公司数字科技中心

AI Technical Summary

Problems solved by technology

Existing text classification methods rely on a large amount of feature engineering, which is time-consuming and labor-intensive and has a large impact on the results.



Embodiment Construction

[0036] In the following description, for purposes of illustration rather than limitation, specific details such as specific equipment structures, interfaces, and techniques are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the invention may be practiced in other embodiments without these specific details. In other instances, detailed descriptions of well-known devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.

[0037] As shown in Figure 1, a text classification method based on transfer learning includes:

[0038] S1: train a BERT model on unlabeled text to obtain a pre-trained BERT word representation model;

[0039] S2: filter out links, forwarding symbols, and user names from the text to be classified;

[0040] S3: input the filtered text into the BERT word representation model ...
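The filtering in step S2 can be sketched with a few regular expressions. This is only an illustration under stated assumptions: the text shown here does not give the actual patterns, so the regular expressions below (Weibo-style `//@user:` forwarding markers, `@user` mentions, `http(s)://` links) and the function name `filter_text` are hypothetical.

```python
import re

# All patterns below are assumptions for illustration; the source does not
# specify how links, forwarding symbols, or user names are recognized.

# Links: http(s) URLs made of ASCII URL characters only, so that text in
# other scripts directly following a link is not swallowed.
URL_RE = re.compile(r"https?://[\w\-./?%&=#:~+]+", re.ASCII)

# Forwarding marker in the Weibo style: "//@username:".
FORWARD_RE = re.compile(r"//\s*@[^\s:：]+[:：]?")

# Plain user mention: "@username", optionally followed by a colon.
# Limitation: a mention with no following space or colon also consumes
# the characters after it, up to the next whitespace.
MENTION_RE = re.compile(r"@[^\s:：]+[:：]?")


def filter_text(text: str) -> str:
    """Remove links, forwarding markers, and user names, then tidy spaces."""
    text = URL_RE.sub(" ", text)
    text = FORWARD_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    # Collapse the runs of whitespace left behind by the substitutions.
    return " ".join(text.split())


if __name__ == "__main__":
    raw = "Great phone! //@alice: see http://t.cn/abc @bob thanks"
    print(filter_text(raw))
```

Links are removed first so that the `//` inside a URL is never mistaken for a forwarding marker.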


Abstract

The invention provides a text classification method and device based on transfer learning. The method comprises the following steps: S1, training a BERT model on unlabeled text to obtain a pre-trained BERT word representation model; S2, filtering out links, forwarding symbols, and user names from the text to be classified; S3, inputting the filtered text into the BERT word representation model trained in step S1 to obtain a semantic file of the text; and S4, inputting the semantic file of the text into a convolutional neural network to obtain a category label for each sentence in the semantic file. The method applies transfer learning to text classification and provides a BERT word representation model trained on large-scale unlabeled corpora. The word representation model is general-purpose, does not depend on a specific text domain, and can also be used for other tasks such as entity extraction and sentiment analysis.
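Step S4 can be illustrated with a toy forward pass of a convolutional classifier over token vectors. This is a sketch under assumptions, not the patented network: the abstract only says "convolutional neural network", so the filter widths, ReLU, max-over-time pooling, and softmax layer are borrowed from common text-CNN designs. The random vectors stand in for BERT output, the weights are untrained, and every name here is hypothetical.

```python
import math
import random


def conv1d_max(embeddings, filt, bias):
    """Slide one filter (width x dim) along the token axis, apply ReLU,
    and keep the largest activation (max-over-time pooling)."""
    width = len(filt)
    best = 0.0  # ReLU floor; also the result when the text is too short
    for start in range(len(embeddings) - width + 1):
        act = bias
        for offset in range(width):
            token_vec = embeddings[start + offset]
            act += sum(w * x for w, x in zip(filt[offset], token_vec))
        best = max(best, act)
    return best


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(embeddings, filters, biases, out_weights, out_bias):
    """Return (label index, class probabilities) for one sentence."""
    features = [conv1d_max(embeddings, f, b) for f, b in zip(filters, biases)]
    scores = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(out_weights, out_bias)
    ]
    probs = softmax(scores)
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, probs


def random_matrix(rng, rows, cols):
    """Untrained stand-in weights or embeddings."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Assumed sizes for the demo: 8-dimensional vectors, 3 classes,
# one filter each of widths 2, 3, and 4.
rng = random.Random(0)
DIM, NUM_CLASSES, WIDTHS = 8, 3, [2, 3, 4]
filters = [random_matrix(rng, w, DIM) for w in WIDTHS]
biases = [0.0] * len(filters)
out_weights = random_matrix(rng, NUM_CLASSES, len(filters))
out_bias = [0.0] * NUM_CLASSES

sentence = random_matrix(rng, 6, DIM)  # stand-in for 6 BERT token vectors
label, probs = classify(sentence, filters, biases, out_weights, out_bias)
print(label, [round(p, 3) for p in probs])
```

In a real system the filters and output layer would be trained on labeled examples, while the BERT representation from step S1 is reused as-is or fine-tuned; that reuse is what the abstract describes as transfer learning.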

Description

Technical field

[0001] The present invention relates to the technical field of natural language processing, and in particular to a method and device for text classification based on transfer learning.

Background art

[0002] In the Web 2.0 era, every netizen has become a source of information on the Internet. Information publishing platforms for various purposes have emerged accordingly, such as Facebook, Xiaonei, and Sina Weibo, on which users publish, obtain, and share all kinds of information. Because the number of Internet users is large, each platform generates a large amount of information every day, so the total amount of information generated on the Internet every day is huge. Text classification is the process of automatically determining the category of a text from its content under a given classification system. Text classification is a very important module in text processing, and it is widely used, including gar...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/35, G06K9/62
Inventor: 柳宜江, 武开智
Owner: 北京牡丹电子集团有限责任公司数字科技中心