Rapid incremental clustering method for domain question-answering system consultations

Question-answering system and incremental clustering technology, applied in fields such as special data processing applications, instruments, and electrical digital data processing, addresses problems such as long processing time, low clustering-algorithm efficiency, and failure to meet application requirements.

Active Publication Date: 2015-07-15
南方电网互联网服务有限公司

AI Technical Summary

Problems solved by technology

[0005] 1) The data volume is large, and applying a clustering algorithm directly is too inefficient to meet application needs;
[0006] 2) User consultation questions contain a great deal of semantic noise, which is a main cause of poor clustering results;
[0009] 5) Because the clustering algorithm is relatively inefficient, clustering all the data takes a long time and cannot meet application requirements;


Examples


Embodiment Construction

[0062] The present invention will be further described below in conjunction with the accompanying drawings and embodiments.

[0063] 1. A scalable clustering system for user consultation questions

[0064] As shown in Figure 1, the present invention proposes a scalable clustering system framework for user consultation questions. To improve the efficiency of the clustering algorithm, the framework divides clustering into offline clustering and online clustering, and includes the following steps:

[0065] Step 1) Offline clustering of the consultation history.

[0066] Step 2) Online clustering of incoming user consultations.

[0067] Step 3) Merging the offline and online clustering results to generate the final clustering result.
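
The three steps above can be illustrated with a minimal sketch. It is not the patent's algorithm: the single-pass clustering, the token-set Jaccard similarity, the 0.5 threshold, and all function names (`offline_cluster`, `online_cluster`, `merge`) are assumptions chosen only to show how the offline result lets the online step compare a new consultation against a few cluster representatives instead of the whole history.

```python
def jaccard(a, b):
    """Token-set Jaccard similarity (a stand-in for the patent's similarity measure)."""
    sa, sb = set(a.split()), set(b.split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def offline_cluster(history, threshold=0.5):
    """Step 1 (assumed algorithm): single-pass clustering of the consultation history.

    Each cluster is a list of texts; its first text acts as the representative.
    """
    clusters = []
    for text in history:
        best = max(clusters, key=lambda c: jaccard(text, c[0]), default=None)
        if best is not None and jaccard(text, best[0]) >= threshold:
            best.append(text)
        else:
            clusters.append([text])
    return clusters


def online_cluster(query, clusters, threshold=0.5):
    """Step 2: score a new consultation against cluster representatives only."""
    scores = [jaccard(query, c[0]) for c in clusters]
    if scores and max(scores) >= threshold:
        return scores.index(max(scores))
    return None  # no existing cluster is close enough


def merge(query, cluster_id, clusters):
    """Step 3: fold the online result back into the offline clusters."""
    if cluster_id is None:
        clusters.append([query])
        return len(clusters) - 1
    clusters[cluster_id].append(query)
    return cluster_id


history = [
    "how to pay electricity bill",
    "how to pay my electricity bill",
    "report power outage",
]
clusters = offline_cluster(history)
assigned = merge("pay electricity bill",
                 online_cluster("pay electricity bill", clusters), clusters)
```

In this sketch the online step costs one similarity computation per cluster rather than one per historical consultation, which is the efficiency argument the framework makes.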

[0068] Based on this framework, which combines offline clustering and online clustering, the rapid incremental clustering method for domain question-answering system consultations provided by the present invention includes the following steps:

[0...


Abstract

The invention discloses a rapid incremental clustering method for domain question-answering system consultations. Based on a clustering framework that combines offline clustering and online clustering, the method comprises the following steps: combining an offline clustering algorithm for the consultation history with consultation deduplication, and performing semantic preprocessing on user consultations using a semantic-irrelevance dictionary and a word-class dictionary, so that semantic normalization is achieved; computing a multi-feature similarity and constructing a similarity pattern from it; performing offline clustering on the user consultation history based on the similarity pattern; and performing online clustering on incoming user consultations using the offline clustering result as a clustering feature, then merging the offline and online clustering results to generate the final clustering result. The method responds quickly and achieves accuracy that meets practical application requirements.
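
The preprocessing and similarity steps in the abstract can be sketched as follows. The source does not give the dictionary contents, the features, or the weights, so everything here is hypothetical: the entries of `IRRELEVANT` and `WORD_CLASS`, the two features (word overlap and character-bigram overlap), and the weights 0.6 and 0.4 are illustrative assumptions only.

```python
import re

# Hypothetical dictionaries; the patent's actual entries are not given in the source.
IRRELEVANT = {"hello", "hi", "please", "thanks"}                # semantically irrelevant words
WORD_CLASS = {"bill": "fee", "charge": "fee", "tariff": "fee"}  # word -> class label


def normalize(text):
    """Drop semantically irrelevant words and map each word to its class label."""
    tokens = re.findall(r"\w+", text.lower())
    kept = [WORD_CLASS.get(t, t) for t in tokens if t not in IRRELEVANT]
    return " ".join(kept)


def deduplicate(texts):
    """Keep the first occurrence of each normalized consultation, preserving order."""
    return list(dict.fromkeys(normalize(t) for t in texts))


def char_bigrams(text):
    """Set of character bigrams, ignoring spaces."""
    s = text.replace(" ", "")
    return {s[i:i + 2] for i in range(len(s) - 1)}


def overlap(a, b):
    """Jaccard overlap of two sets; two empty sets count as identical."""
    return len(a & b) / len(a | b) if a | b else 1.0


def similarity(a, b, w_word=0.6, w_char=0.4):
    """Weighted blend of two assumed features: word overlap and bigram overlap."""
    word_sim = overlap(set(a.split()), set(b.split()))
    char_sim = overlap(char_bigrams(a), char_bigrams(b))
    return w_word * word_sim + w_char * char_sim


corpus = deduplicate(["Please check my bill", "check my fee", "Power outage"])
```

Normalizing before deduplication means that consultations differing only in filler words or in synonyms of the same word class collapse into one entry, which shrinks the input to the offline clustering step.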

Description

Technical field

[0001] The invention relates to data mining and natural language processing in the field of artificial intelligence, and in particular to a user consultation clustering method for text-based customer service consultation systems such as domain question-answering systems.

Background technique

[0002] Many natural language applications, such as domain question-answering systems, share a basic and common problem: the system holds a large history of user consultations, each consisting of a short text (hereinafter referred to as the short-text corpus or user consultation corpus). The question is how to cluster this consultation history into classes according to some similarity measure, and how to use the clustering results to help the question-answering system recognize and understand user consultations.

[0003] In the field of search engines, Baidu Zhizhi, domain question answering systems,...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F17/27
Inventor: 马健, 刘亮亮, 吴健康, 李洪梅
Owner: 南方电网互联网服务有限公司