A corpus expansion method and related equipment

The technology concerns corpora and grammatical information and is applied in the field of corpus expansion methods and related equipment. It addresses the problem of low expansion efficiency and achieves a richer corpus, automatic expansion, and improved expansion efficiency.

Active Publication Date: 2021-11-09
SIMPLECREDIT MICRO LENDING CO LTD
10 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0003] At present, an initial corpus is expanded mainly by hand. For example, a person brainstorms on a given initial corpus entry to produce a dozen or more expansion corpora that match the way the initial entry phrases its query, so the expansion efficiency is low.


Examples

Detailed Description of the Embodiments

[0023] In the embodiments of the present invention, the dynamic word vector of each word in the short text corpus to be expanded is obtained, together with the content word information, function word information, and grammatical information corresponding to the short text corpus. Based on the dynamic word vectors, a synonym candidate set matching the content word information and the function word information is determined from a corpus set. Because a dynamic word vector reflects the meaning of a word in its particular context, the accuracy of the determined synonym candidate set can be improved. The short text corpus can then be expanded using the synonym candidate set and/or the grammatical information to determine the target expansion corpus set corresponding to the short text corpus. In this way, on the one hand, the short text corpus is expanded automatically and the expansion efficiency is improved; on the other hand, the short text corpus c...
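The candidate-selection step in [0023] can be illustrated with a minimal sketch. The visible text does not say how vectors are compared, so cosine similarity, the similarity threshold, the function names `cosine` and `synonym_candidates`, and the two-dimensional toy vectors are all assumptions made for illustration. In practice the dynamic word vectors would come from a context-sensitive embedding model, which is not shown here.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def synonym_candidates(query_vec, query_kind, corpus_entries, threshold=0.9):
    """Return corpus words whose vector is close to query_vec.

    corpus_entries: list of (word, kind, vector), where kind is
    "content" or "function". Only entries of the same kind as the
    query word are considered (hypothetical matching rule).
    """
    scored = [
        (cosine(query_vec, vec), word)
        for word, kind, vec in corpus_entries
        if kind == query_kind
    ]
    # Highest similarity first; keep only candidates above the threshold.
    return [word for score, word in sorted(scored, reverse=True) if score >= threshold]


# Toy data: hand-made vectors standing in for dynamic word vectors.
corpus_entries = [
    ("request", "content", [0.9, 0.1]),
    ("file", "content", [0.95, 0.3]),
    ("banana", "content", [0.0, 1.0]),
    ("for", "function", [1.0, 0.0]),
]
candidates = synonym_candidates([1.0, 0.0], "content", corpus_entries)
print(candidates)
```

Filtering by word kind before scoring keeps a function word such as "for" out of the candidate set of a content word even when their vectors happen to coincide.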


Abstract

The embodiments of the present invention disclose a corpus expansion method and related equipment. The method is applied in the field of data processing technology and includes: obtaining the dynamic word vector of each word in the short text corpus to be expanded, together with the content word information, function word information, and grammatical information corresponding to the short text corpus; based on the dynamic word vectors, determining from the corpus a synonym candidate set matching the content word information and the function word information; and expanding the short text corpus according to the synonym candidate set and/or the grammatical information to determine the target expansion corpus set corresponding to the short text corpus. With this application, the short text corpus can be expanded automatically and its expansion efficiency can be improved.
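As a rough illustration of the final expansion step, the sketch below substitutes synonym candidates into a tokenized short text and collects the resulting variants. It covers only the synonym-substitution branch; the grammar-based branch is not sketched because the visible text does not describe it. The function name `expand_short_text`, the index-to-candidates mapping, and the example sentence are hypothetical.

```python
from itertools import product


def expand_short_text(tokens, candidates):
    """Generate variants of a short text by synonym substitution.

    tokens: list of words in the short text.
    candidates: dict mapping a token index to its synonym candidates
    (hypothetical structure; positions without candidates stay fixed).
    """
    # Each position offers the original word plus any candidates for it.
    options = [[tok] + list(candidates.get(i, [])) for i, tok in enumerate(tokens)]
    original = " ".join(tokens)
    expanded = []
    for combo in product(*options):
        sentence = " ".join(combo)
        # Skip the unmodified input and any duplicate variants.
        if sentence != original and sentence not in expanded:
            expanded.append(sentence)
    return expanded


tokens = ["how", "to", "repay", "loan"]
expansions = expand_short_text(tokens, {2: ["pay back"], 3: ["debt"]})
for sentence in expansions:
    print(sentence)
```

The number of variants grows with the product of the option counts per position, so a real system would likely cap or rank the combinations rather than enumerate all of them.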

Description

Technical field

[0001] The invention relates to the technical field of data processing, and in particular to a corpus expansion method and related equipment.

Background

[0002] In an intelligent customer service system, understanding a user's business requires learning and identifying each user's question-and-answer tag data through machine learning, but machine learning usually needs a certain amount of initial corpus. For business scenarios in different fields, it is often difficult to provide a large, standardized initial corpus. Therefore, when the initial corpus is insufficient, it usually has to be expanded.

[0003] At present, an initial corpus is expanded mainly by hand. For example, a person brainstorms on a given initial corpus entry to produce a dozen or more expansion corpora that match the way the initial entry phrases its query, and the expansion efficiency ...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/332, G06F16/33, G06F40/211, G06F16/35
CPC: G06F16/3329, G06F16/3344, G06F16/35, G06F40/211
Inventor: 张欢韵
Owner: SIMPLECREDIT MICRO LENDING CO LTD