Method, device, system and medium for text information extraction based on domain adaptation

A text information extraction method, applied in special data processing applications, natural language data processing, instruments, etc., that addresses the problems of large discrepancies in document data between domains, low text accuracy and recall rates, and poor transferability. Stated effects: improved efficiency, enhanced interconnection, and enhanced capability.

Active Publication Date: 2018-10-16
SUZHOU UNIV
4 cites, 21 cited by

AI Technical Summary

Problems solved by technology

Document data differs greatly between domains, and the gap is especially wide between a source domain and a social media target domain. Models therefore transfer poorly, resulting in low text accuracy and recall rates in the social media domain after migration.




Embodiment Construction

[0058] The core of this application is to provide a text information extraction method based on domain adaptation, which can improve cross-domain transferability and enhance text analysis and extraction capabilities in domains such as social media. Another core of this application is to provide a text information extraction device, a system, and a readable storage medium based on domain adaptation, which have the same beneficial effects.

[0059] To make the purposes, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of this application, not all of them. Based on the embodiments in this application, all other embodiments obtained by persons of ordinar...


Abstract

The invention discloses a text information extraction method based on domain adaptation, comprising the following steps: preprocessing the input text to obtain a text vector; extracting a common feature of the text vector, shared between the second domain and the first domain, according to common feature extraction parameters, and extracting a private feature of the text vector in the first domain according to private feature extraction parameters; performing domain classification on the common feature after domain blurring; analyzing and correcting the common feature extraction parameters according to the classification result and the domain information of the first domain; performing adjacent word prediction on the text vector according to the private feature; and analyzing and correcting the private feature extraction parameters according to the prediction result and the adjacent words in the text. The method can improve text analysis and extraction capability in social media and other domains. The invention also discloses a text information extraction device and system based on domain adaptation and a readable storage medium, which have the same beneficial effects.
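
The two training signals named in the abstract can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes that the "blurring" step behaves like a gradient reversal on the common feature extractor, that both extractors are single linear maps, and that adjacent word prediction is a softmax over a toy vocabulary. All names (`common_step`, `private_step`, `W_common`, `v_dom`, `W_private`, `U_out`, `lam`) are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(m, v):
    return [dot(row, v) for row in m]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax(zs):
    top = max(zs)
    exps = [math.exp(z - top) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]


def domain_loss(W_common, v_dom, x, domain):
    """Binary cross-entropy of the domain classifier on the common feature."""
    p = sigmoid(dot(v_dom, matvec(W_common, x)))
    p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
    return -(domain * math.log(p) + (1 - domain) * math.log(1.0 - p))


def common_step(W_common, v_dom, x, domain, lr=0.1, lam=1.0):
    """Common branch (assumed gradient reversal), updating both arguments in place.

    The domain classifier descends its loss; the common extractor ascends it,
    so the common feature carries less domain-specific information.
    """
    h = matvec(W_common, x)
    err = sigmoid(dot(v_dom, h)) - domain  # d(loss)/d(logit)
    grad_W = [[err * v_i * x_j for x_j in x] for v_i in v_dom]
    for i in range(len(v_dom)):
        v_dom[i] -= lr * err * h[i]  # classifier: normal descent
    for i, row in enumerate(W_common):
        for j in range(len(row)):
            row[j] += lr * lam * grad_W[i][j]  # extractor: reversed sign


def private_step(W_private, U_out, x, next_word, lr=0.1):
    """Private branch: one descent step on adjacent word prediction.

    Returns the cross-entropy loss measured before the update.
    """
    h = matvec(W_private, x)
    probs = softmax(matvec(U_out, h))
    loss = -math.log(probs[next_word])
    d_logits = list(probs)
    d_logits[next_word] -= 1.0  # softmax cross-entropy gradient
    d_h = [sum(U_out[k][i] * d_logits[k] for k in range(len(U_out)))
           for i in range(len(h))]
    for k, row in enumerate(U_out):
        for i in range(len(row)):
            row[i] -= lr * d_logits[k] * h[i]
    for i, row in enumerate(W_private):
        for j in range(len(row)):
            row[j] -= lr * d_h[i] * x[j]
    return loss
```

In this sketch `x` stands for the preprocessed text vector, `domain` is 1 for the first domain and 0 for the second, and `next_word` is the index of the adjacent word in the toy vocabulary. Calling `common_step` and `private_step` alternately over mixed-domain batches mirrors the two parameter corrections in the abstract; `lam` controls how strongly the common extractor is pushed toward domain-indistinguishable features.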

Description

Technical field

[0001] The present application relates to the field of domain adaptation, and in particular to a method, device, system, and readable storage medium for extracting text information based on domain adaptation.

Background

[0002] At present, most text feature extraction models are trained by supervised learning on large manually annotated corpora. In named entity recognition, some domains (such as news and other formal text) can obtain large labeled datasets relatively easily, so a recognition system can be trained on them. In other domains, such as social media, annotated corpora are relatively scarce. For example, one Sina Weibo corpus used for supervised learning and evaluation contains only 1890 sentences, which is not large enough to train the model. Due to the lack of large-scale social media co...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F17/27
CPC: G06F40/295
Inventor: 陈文亮, 卢奇, 张民
Owner: SUZHOU UNIV