Input method and device based on sample probability quantization and electronic equipment

A sample probability quantization technique, applied in the field of natural language processing, that addresses the problem of probability value distortion and reduces the degree of distortion.

Active Publication Date: 2021-06-18
GUANGZHOU ZIIPIN NETWORK TECH CO LTD

AI Technical Summary

Problems solved by technology

Depending on the mapping method, the probability values are distorted to varying degrees.



Examples


example 1

[0184] Example 1: Assume the input samples are one, two, two, three, three, three: 6 inputs in total, yielding 3 sample types (one, two, three). The sample probability value of one is 1/6, that of two is 2/6, and that of three is 3/6. If the predicted words are sorted by probability value, the order is three, two, one. Because the storage space of the electronic equipment is limited, the probability values need to be quantized, and the quantization target range is [0,3].
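The counting in this example can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function names `sample_probabilities` and `rank_by_value` are hypothetical and do not come from the patent text.

```python
from collections import Counter
from fractions import Fraction

def sample_probabilities(samples):
    # Relative frequency of each distinct sample, kept exact with Fraction.
    counts = Counter(samples)
    total = len(samples)
    return {word: Fraction(count, total) for word, count in counts.items()}

def rank_by_value(values):
    # Words sorted by descending value; ties keep their insertion order.
    return sorted(values, key=lambda word: values[word], reverse=True)

samples = ["one", "two", "two", "three", "three", "three"]
probs = sample_probabilities(samples)
ranking = rank_by_value(probs)

print(probs)
print(ranking)
```

Fractions are used here so that 2/6 and 3/6 compare exactly; a real input method would more likely store floating-point or already-quantized values.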

[0185] 1) If a non-well-defined mapping function (such as tanh) is used, the sample probability values of one, two, and three may be mapped to 2, 1, 1. If the predicted words are then sorted by the quantized value after mapping, the order may be two, three, one, which differs from the order obtained by sorting on the probability values.

[0186] 2) If a well-defined mapping function (such a...
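The contrast between the two cases can be illustrated with a small sketch. The steepness factor of 10 in the tanh mapping is an assumption chosen to make the saturation visible, and the linear scaling is only a hypothetical stand-in for an order-preserving mapping, since the text above is truncated before it names the function it has in mind. The quantized values therefore differ from the ones quoted in the example.

```python
import math

# Sample probability values from Example 1.
probs = {"one": 1 / 6, "two": 2 / 6, "three": 3 / 6}
HIGH = 3  # upper bound of the quantization target range [0, 3]

def quantize_tanh(p, steepness=10.0):
    # Hypothetical saturating mapping: tanh flattens out quickly, so
    # different probabilities can round to the same integer.
    return round(HIGH * math.tanh(steepness * p))

def quantize_linear(p, p_max):
    # Hypothetical order-preserving stand-in: linear scaling by the maximum.
    return round(HIGH * p / p_max)

tanh_q = {w: quantize_tanh(p) for w, p in probs.items()}
linear_q = {w: quantize_linear(p, max(probs.values())) for w, p in probs.items()}

print(tanh_q)
print(linear_q)
```

With these assumed parameters the tanh variant collapses all three words onto the same quantized value, so the original ranking can no longer be recovered, while the linear variant keeps three distinct values in the original order.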

example 2

[0188] Example 2: If quantization is performed by group, the situation is more complicated. For example, the inputs are one, two, two, three, three, three, four, four, four, four: 10 inputs in total, yielding 4 sample types (one, two, three, four). The sample probability value of one is 1/10, that of two is 2/10, that of three is 3/10, and that of four is 4/10. If the predicted words are sorted by probability value, the order is four, three, two, one. Because storage space is limited, the probability values need to be quantized, and the quantization target range is [0,3].

[0189] 1) If a non-well-defined mapping function (such as min-max) is used, the first group is one, two, and the quantized values after mapping may be 2, 1; the second group is three, four, and the quantized values after mapping may be 2, 1. After combining the results of the first group and the second ...
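The group-wise problem can be reproduced with a short sketch. It assumes a plain min-max rescaling applied separately to each group and, for comparison, once over all words; the helper `min_max_quantize` is hypothetical, and the quantized values it produces differ from the "may be 2, 1" values quoted above, but the loss of cross-group order is the same effect.

```python
from fractions import Fraction

def min_max_quantize(probs, low=0, high=3):
    # Rescale the values of one group linearly onto [low, high], then round.
    smallest, largest = min(probs.values()), max(probs.values())
    span = largest - smallest
    if span == 0:
        return {word: low for word in probs}
    return {
        word: round(low + (p - smallest) / span * (high - low))
        for word, p in probs.items()
    }

group_1 = {"one": Fraction(1, 10), "two": Fraction(2, 10)}
group_2 = {"three": Fraction(3, 10), "four": Fraction(4, 10)}

# Each group is quantized with its own minimum and maximum.
grouped = {**min_max_quantize(group_1), **min_max_quantize(group_2)}

# All words quantized together with a shared minimum and maximum.
global_q = min_max_quantize({**group_1, **group_2})

print(grouped)
print(global_q)
```

In the group-wise result, two receives a higher quantized value than three even though its probability is lower, because each group was stretched over the full target range independently.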


Abstract

The embodiment of the invention provides an input method, a device, and electronic equipment based on sample probability quantization. The method comprises: obtaining user input information and computing candidate words from it; performing probability prediction on the candidate words to obtain a probability value for each; inputting the probability values into a mapping function to obtain a probability mapping value for each candidate word, wherein the mapping function maps the probability value into a specified value range and adjusts the dispersion of the probability mapping values within that range to an expected degree, and the probability value and the probability mapping value are in a one-to-one mapping relationship; rounding the probability mapping value to obtain a probability mapping quantized value; and determining the sorting order of the candidate words according to the probability mapping quantized values, so that a candidate word list is output in that order. The embodiment reduces the distortion of the probability values after quantization, so that the candidate word order determined from the quantized values is as consistent as possible with the order before quantization.
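The steps listed in the abstract (map, round, sort) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented mapping function: the power-law mapping, the `gamma` parameter used to adjust dispersion, the 8-bit target range, and the candidate words with their probabilities are all hypothetical.

```python
def map_probability(p, p_min, p_max, high=255, gamma=0.5):
    # Hypothetical strictly increasing mapping onto [0, high].
    # gamma < 1 spreads out small probabilities; gamma > 1 compresses them.
    if p_max == p_min:
        return float(high)
    return ((p - p_min) / (p_max - p_min)) ** gamma * high

def quantize_candidates(probs, high=255, gamma=0.5):
    # Map each probability into the target range, then round to an integer.
    p_min, p_max = min(probs.values()), max(probs.values())
    return {
        word: round(map_probability(p, p_min, p_max, high, gamma))
        for word, p in probs.items()
    }

def rank_candidates(quantized):
    # Candidate list ordered by descending quantized value.
    return sorted(quantized, key=lambda word: quantized[word], reverse=True)

# Assumed output of the probability prediction step for one user input.
candidate_probs = {"hello": 0.5, "help": 0.3, "held": 0.15, "helm": 0.05}

quantized = quantize_candidates(candidate_probs)
ranked = rank_candidates(quantized)

print(quantized)
print(ranked)
```

A strictly increasing mapping keeps the order of the probabilities before rounding; whether the order also survives rounding depends on how far apart the mapped values are, which is why the abstract pairs the mapping with control over dispersion.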

Description

Technical field

[0001] The present invention relates to the technical field of natural language processing, and in particular to an input method, device, and electronic equipment based on sample probability quantization.

Background

[0002] Technology is the driving force behind social progress. At present, Ngram language models trained on large corpora can provide a good input experience for users of the most common languages, such as English and French. However, for languages of countries and regions along the Belt and Road, such as Arabic and Turkish, the long-tail effect is more prominent than in English because of their linguistic characteristics and huge vocabularies. Ordinary natural language processing technology has difficulty handling the huge amount of vocabulary in the long tail, so users in countries and regions along the Belt and Road cannot obtain a good input experience. ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F3/02, G06F16/33
CPC: G06F3/0237, G06F16/3346, G06F3/023
Inventor: 梁振兴
Owner: GUANGZHOU ZIIPIN NETWORK TECH CO LTD