Data cleaning method and device, electronic equipment and computer readable medium

A data cleaning technology for knowledge base sub-questions, applied in computer components, computing, electrical digital data processing, etc. It addresses problems such as inaccurate answers from customer service robots, uneven quality of knowledge base questions, and large differences among sub-questions, and achieves the effects of convenient scenario migration, improved answer accuracy, and product systemization.

Pending Publication Date: 2022-03-01
TAIKANG LIFE INSURANCE CO LTD +1

AI Technical Summary

Problems solved by technology

However, in the process of building the knowledge base, knowledge administrators vary in skill level, so the quality of newly added knowledge base questions is uneven. Problems such as redundant knowledge base questions and large differences among sub-questions under the same node inevitably arise, and they eventually make the customer service robot's answers insufficiently accurate.
[0003] In current knowledge base construction, the questions in the knowledge base are mainly cleaned by knowledge management personnel based on experience, which is time-consuming and labor-intensive.

Embodiment Construction

[0044] Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete and will fully convey the concept of example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided in order to give a thorough understanding of embodiments of the present disclosure. However, those skilled in the art will appreciate that the technical solutions of the present disclosure may be practiced with one or more of the specific details omitted, or with other methods, components, devices, steps, etc. In other instances, well-known technical solution...

Abstract

The invention relates to a data cleaning method and device, electronic equipment, and a computer-readable medium, and belongs to the technical field of data processing. The method comprises: obtaining the sub-questions under a to-be-processed node in a knowledge base together with a pre-trained language representation model, and obtaining the feature vector corresponding to each sub-question through the language representation model; constructing a vector retrieval library for the to-be-processed node from the feature vectors of all sub-questions under that node; determining the similarity between the feature vector of each sub-question under the to-be-processed node and the feature vectors in the vector retrieval library; and putting the sub-questions whose similarity is greater than or equal to a similarity threshold back into the knowledge base, while clearing from the knowledge base the sub-questions whose similarity is less than the similarity threshold. By building a vector retrieval library for each node with the language representation model and cleaning the sub-questions against it, the method makes the sub-questions remaining under the same node highly similar.
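The steps in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `toy_embed` is a hypothetical stand-in for the pre-trained language representation model, the "vector retrieval library" is a plain NumPy matrix, cosine similarity is assumed as the similarity measure, and each sub-question is scored by its mean similarity to the other sub-questions under the node. The abstract does not specify how similarities are aggregated, so that scoring rule is an assumption, as are the names `clean_node`, `embed_fn`, and the default threshold.

```python
import zlib

import numpy as np


def toy_embed(text, dim=64):
    """Hypothetical stand-in for the pre-trained language representation model.

    Hashes character bigrams into a fixed-size count vector. A real system
    would call its own sentence encoder here instead.
    """
    vec = np.zeros(dim)
    padded = f" {text.lower()} "
    for i in range(len(padded) - 1):
        bucket = zlib.crc32(padded[i:i + 2].encode("utf-8")) % dim
        vec[bucket] += 1.0
    return vec


def clean_node(sub_questions, embed_fn=toy_embed, threshold=0.5):
    """Split one node's sub-questions into (kept, removed) lists.

    Assumption: a sub-question's score is its mean cosine similarity to the
    other sub-questions under the same node.
    """
    if len(sub_questions) < 2:
        # Nothing to compare against, so keep the node unchanged.
        return list(sub_questions), []

    # Feature vectors for every sub-question; this matrix plays the role of
    # the node's vector retrieval library (embeddings assumed non-zero).
    library = np.stack([np.asarray(embed_fn(q), dtype=float) for q in sub_questions])
    library = library / np.linalg.norm(library, axis=1, keepdims=True)

    # Cosine similarity of each sub-question against the whole library.
    sims = library @ library.T

    # Mean similarity to the other entries (self-similarity excluded).
    scores = (sims.sum(axis=1) - np.diag(sims)) / (len(sub_questions) - 1)

    kept = [q for q, s in zip(sub_questions, scores) if s >= threshold]
    removed = [q for q, s in zip(sub_questions, scores) if s < threshold]
    return kept, removed


if __name__ == "__main__":
    questions = [
        "How do I reset my password?",
        "How can I reset my password?",
        "What are your opening hours?",
    ]
    kept, removed = clean_node(questions, threshold=0.5)
    print("kept:", kept)
    print("removed:", removed)
```

Passing the encoder in as `embed_fn` keeps the cleaning logic independent of any particular model, so the same routine could be reused when the knowledge base moves to a different scenario. The threshold would need tuning against whichever encoder is actually used.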

Description

Technical field

[0001] The present disclosure relates to the technical field of data processing, and in particular, to a data cleaning method, a data cleaning device, electronic equipment, and a computer-readable medium.

Background

[0002] In the construction of intelligent customer service robots, knowledge base construction is an extremely important step. However, in the process of building the knowledge base, knowledge administrators vary in skill level, so the quality of newly added knowledge base questions is uneven. Problems such as redundant knowledge base questions and large differences among sub-questions under the same node inevitably arise, and they eventually make the customer service robot's answers insufficiently accurate.

[0003] In current knowledge base construction, the questions in the knowledge base are mainly cleaned by knowledge management personnel based on experience, which is time-consuming and labor-intensive.

[000...

Application Information

IPC(8): G06F16/33, G06F16/332, G06K9/62
CPC: G06F16/3344, G06F16/3329, G06F18/214
Inventor: 杨正良, 刘设伟, 龚珊珊
Owner: TAIKANG LIFE INSURANCE CO LTD