Method for recognizing topics of nonequilibrium interactive texts based on example obtaining

Topic recognition technology for unbalanced text, applied in special data processing applications, instruments, and electrical digital data processing. It addresses the problem of low topic recognition accuracy.

Active Publication Date: 2015-05-27
XI AN JIAOTONG UNIV
3 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

The interactive text generated in these real scenarios generally has an unbalanced distribution of topic categories. Classifiers often ignore minority categories during training, so recognition accuracy for minority categories is generally low.

Embodiment Construction

[0063] A topic recognition method for unbalanced interactive text based on instance acquisition. Referring to figure 1, it consists of three steps:

[0064] Step 1: Filter instances from the source dataset:

[0065] (1) Determine the feature set that represents instances within the common feature set. That is, from the feature set common to the source dataset (denoted Dset_Source) and the target dataset (denoted Dset_Target), select the features that can represent an instance and tend toward the minority class.

[0066] (2) Sort and filter source dataset instances by cosine similarity. Use the cosine function to compute the similarity between each minority-class target instance and the instances of the same class in the source dataset, and sort in descending order of similarity. For each minority-class target instance, obtain the first K, those most similar to the target dataset instance, among the source dataset inst...
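The ranking in step (2) can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the function names, the `(vector, label)` data layout, and the toy vectors are all assumptions, and the vectors are presumed to be already expressed over the feature set chosen in step (1).

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)

def top_k_source_instances(target_vec, target_label, source, k):
    """Return (source_index, similarity) for the K same-class source
    instances most similar to one minority-class target instance.

    `source` is a list of (vector, label) pairs (hypothetical layout).
    """
    scored = [
        (cosine_similarity(target_vec, vec), idx)
        for idx, (vec, label) in enumerate(source)
        if label == target_label  # compare only against same-class instances
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # descending similarity
    return [(idx, sim) for sim, idx in scored[:k]]

# Toy data: three minority-class source instances and one majority-class one.
source = [
    ([1.0, 0.0, 1.0], "minority"),
    ([0.0, 1.0, 0.0], "minority"),
    ([1.0, 1.0, 1.0], "majority"),
    ([2.0, 0.0, 1.0], "minority"),
]
selected = top_k_source_instances([1.0, 0.0, 1.0], "minority", source, k=2)
```

Here `selected` holds the indices of the two retained source instances together with their similarities, which a later step could reuse as weights.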

Abstract

The invention discloses a topic recognition method for unbalanced interactive text based on instance acquisition. The method comprises three steps. First, instances are filtered from a source dataset: an evaluation function is defined, the feature set within the common feature set that represents instances and tends toward the minority class is determined, and source dataset instances are then selected by cosine-similarity ranking. Second, the feature vector spaces of the instances are made consistent: the feature vector space of the acquired instances is synthesized with the similarities as weights so that it matches the feature vector space of the target instances. Third, the acquired instances are merged with the instances of the target dataset: they are added to the target dataset and used together to train a classifier model.
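The second and third steps can be illustrated with a short Python sketch. The abstract does not give the synthesis formula, so this is only one possible reading: each acquired source instance is projected onto the target feature order (features missing from the source become 0, features unknown to the target are dropped) and scaled by its similarity. All names, the dictionary layout, and the sample feature names are hypothetical.

```python
def align_and_weight(source_features, similarity, target_feature_names):
    """Project a source instance onto the target feature order and scale it.

    `source_features` maps feature name -> value. Assumed reading of
    "synthesized with the similarity as weights"; not the patent's formula.
    """
    return [similarity * source_features.get(name, 0.0)
            for name in target_feature_names]

def merge_into_target(target_set, acquired, target_feature_names):
    """Append the aligned, weighted source instances to the target training set."""
    merged = list(target_set)  # leave the original target set untouched
    for features, label, similarity in acquired:
        vec = align_and_weight(features, similarity, target_feature_names)
        merged.append((vec, label))
    return merged

# Toy data (hypothetical feature names and values).
target_feature_names = ["homework", "exam", "deadline"]
target_set = [([1.0, 0.0, 1.0], "minority")]
acquired = [({"exam": 2.0, "deadline": 1.0, "weather": 5.0}, "minority", 0.5)]

merged = merge_into_target(target_set, acquired, target_feature_names)
```

The resulting `merged` list is what would be passed to whichever classifier is trained on the enlarged target dataset.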

Description

Technical field

[0001] The invention relates to natural language processing technology for information retrieval, extraction and management, and in particular to a method for recognizing the topics of Internet interactive text.

Background

[0002] With the rapid development of Internet information technology, network applications based on interactive text keep emerging. Typical interactive text scenarios include live classrooms, online Q&A chat rooms, and community discussions. The interactive text generated in these real scenarios generally has an unbalanced distribution of topic categories, and classifiers often ignore minority categories during training, so the recognition accuracy for minority-category topics is generally low. For unbalanced interactive text, overcoming the imbalance and improving the recognition accuracy of minority topics is an important task. The applicant did not retrieve any patent documents related to...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/27, G06F17/30
Inventor: 田锋, 高鹏达, 郑庆华, 吴凡
Owner: XI AN JIAOTONG UNIV