Input method and device based on sample probability quantization and electronic equipment
A sample probability quantization technique, applied in the field of natural language processing, that addresses the distortion of probability values caused by quantization and reduces the degree of that distortion.
Active Publication Date: 2021-06-18
GUANGZHOU ZIIPIN NETWORK TECH CO LTD
AI Technical Summary
Problems solved by technology
Depending on the mapping method used for quantization, the probability values are distorted to varying degrees.
Examples
Example 1
[0184] Example 1: Assume the input samples are one, two, two, three, three, three, for a total of 6 inputs and 3 sample types: one, two, and three. The sample probability value of one is 1/6, that of two is 2/6, and that of three is 3/6. If the predicted words are sorted by probability value, the order is three, two, one. Because the storage space of the electronic equipment is limited, the probability values need to be quantized, and the quantization target range is [0, 3].
[0185] 1) If a non-well-defined mapping function (such as tanh) is used, the sample probability values of one, two, and three may be mapped to 2, 1, 1. If the predicted words are then sorted by the quantized value after mapping, the order may be two, three, one, which is not the same as the order obtained by sorting on the probability values.
[0186] 2) If a well-defined mapping function (such a...
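The setup of Example 1 can be sketched in a few lines of Python. This is an illustrative sketch only, not the patent's implementation: the helper names `ranking` and `has_inversion` are hypothetical, the first quantized assignment is the 2, 1, 1 mapping quoted in item 1), and the second assignment (1, 2, 3) is an assumed order-preserving result used purely for contrast.

```python
from collections import Counter
from fractions import Fraction

# Input samples from Example 1.
samples = ["one", "two", "two", "three", "three", "three"]
counts = Counter(samples)
total = len(samples)

# Sample probability value of each sample type, kept exact with Fraction.
probs = {word: Fraction(count, total) for word, count in counts.items()}


def ranking(scores):
    """Return words sorted by descending score (ties broken alphabetically)."""
    return sorted(scores, key=lambda word: (-scores[word], word))


def has_inversion(probs, quantized):
    """True if some pair of words swaps relative order after quantization."""
    words = list(probs)
    return any(
        probs[a] > probs[b] and quantized[a] < quantized[b]
        for a in words
        for b in words
    )


# Quantized values quoted in item 1) of the example.
distorted = {"one": 2, "two": 1, "three": 1}
# Hypothetical order-preserving assignment, for contrast only.
preserved = {"one": 1, "two": 2, "three": 3}

print(ranking(probs))                   # ranking by probability value
print(has_inversion(probs, distorted))  # does the 2, 1, 1 mapping invert a pair?
print(has_inversion(probs, preserved))
```

A pairwise check like this treats ties as acceptable and flags only strict reversals; a stricter check would also reject two words with different probabilities collapsing onto the same quantized value.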
Example 2
[0188] Example 2: If quantization is performed by group, the situation is more complicated. For example, the inputs one, two, two, three, three, three, four, four, four, four, a total of 10 inputs, produce 4 sample types: one, two, three, and four. The sample probability value of one is 1/10, that of two is 2/10, that of three is 3/10, and that of four is 4/10. If the predicted words are sorted by probability value, the order is four, three, two, one. Because storage space is limited, the probability values need to be quantized, and the quantization target range is [0, 3].
[0189] 1) If a non-well-defined mapping function (such as min-max) is used, the first group is one, two, and the quantized values after mapping may be 2, 1; the second group is three, four, and the quantized values after mapping may be 2, 1. After combining the results of the first group and the second ...
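Why per-group quantization can scramble the overall order can be seen in a small sketch. It assumes plain min-max rescaling to [0, 3] followed by rounding, applied once per group and once over all words; the function name `min_max_quantize` and the grouping variable are hypothetical, and the resulting values differ from the illustrative 2, 1 values quoted above.

```python
from fractions import Fraction


def min_max_quantize(values, lo=0, hi=3):
    """Rescale values linearly so min -> lo and max -> hi, then round."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        return [lo for _ in values]
    return [round(lo + (v - vmin) * (hi - lo) / (vmax - vmin)) for v in values]


# Sample probability values from Example 2.
probs = {
    "one": Fraction(1, 10),
    "two": Fraction(2, 10),
    "three": Fraction(3, 10),
    "four": Fraction(4, 10),
}
groups = [["one", "two"], ["three", "four"]]

# Quantize each group independently, then merge the results.
per_group = {}
for group in groups:
    quantized = min_max_quantize([probs[word] for word in group])
    per_group.update(zip(group, quantized))

# Quantize all words together, for comparison.
global_q = dict(zip(probs, min_max_quantize(list(probs.values()))))

print(per_group)
print(global_q)
```

In the merged per-group result, `two` ends up with a larger quantized value than `three` even though its probability is smaller, because each group was stretched over the full target range on its own. Quantizing all four values together keeps the original order in this case.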
Abstract
The embodiment of the invention provides an input method and device based on sample probability quantization, and electronic equipment. The method comprises: obtaining user input information and computing candidate words from it; performing probability prediction on the candidate words to obtain a probability value for each candidate word; inputting the probability values into a mapping function to obtain the probability mapping value corresponding to each candidate word, wherein the mapping function maps the probability value into a specified probability mapping value range and adjusts the dispersion of the probability mapping values within that range to an expected dispersion, and the probability value and the probability mapping value are in a one-to-one mapping relationship; rounding the probability mapping value to obtain a probability mapping quantized value; and determining the sorting order of the candidate words according to the probability mapping quantized values, so as to output a candidate word list in that order. The embodiment reduces the distortion of the probability values after quantization, so that the candidate word list order determined from the quantized values is, as far as possible, consistent with the order before quantization.
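The map, round, and sort steps described in the abstract can be sketched as below. This is a minimal sketch under stated assumptions: the abstract does not name a mapping function here, so a logarithmic rescaling is used as a hypothetical stand-in that is one-to-one and spreads small probabilities apart; the function names, the candidate words, their probabilities, and the [0, 3] target range are all illustrative.

```python
import math


def map_probability(p, p_min, p_max, lo=0, hi=3):
    """Hypothetical one-to-one map: log-scale p from [p_min, p_max] onto [lo, hi]."""
    if p_max == p_min:
        return float(lo)
    t = (math.log(p) - math.log(p_min)) / (math.log(p_max) - math.log(p_min))
    return lo + t * (hi - lo)


def rank_candidates(probs, lo=0, hi=3):
    """Map each probability, round it, and sort candidates by the quantized value."""
    p_min, p_max = min(probs.values()), max(probs.values())
    quantized = {
        word: round(map_probability(p, p_min, p_max, lo, hi))
        for word, p in probs.items()
    }
    # sorted() is stable, so ties keep the input order.
    order = sorted(quantized, key=lambda word: -quantized[word])
    return order, quantized


# Illustrative candidate words with long-tail probabilities (made-up values).
candidates = {"the": 0.5, "then": 0.05, "theory": 0.005, "thesis": 0.0005}
order, quantized = rank_candidates(candidates)
print(order)
print(quantized)
```

With a linear rescaling, the three smallest probabilities here would all round to 0; the logarithmic map keeps them on distinct quantized levels, which is one way to read the abstract's requirement that the dispersion of the mapped values be adjusted within the target range.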
Description
Technical field
[0001] The present invention relates to the technical field of natural language processing, and in particular to an input method, device, and electronic equipment based on sample probability quantization.
Background
[0002] Technology is the driving force behind social progress. At present, training an Ngram language model on a large corpus can provide a good input experience for users of most common languages, such as English and French. However, for languages of the countries and regions along the Belt and Road, such as Arabic and Turkish, the long-tail effect is more prominent than in English because of their language characteristics and huge vocabulary. With ordinary natural language processing technology it is difficult to handle the huge amount of vocabulary in the long tail, so users in countries and regions along the Belt and Road cannot obtain a good input experience. ...