Feature encoding method and device

A feature encoding technology, applied in the field of encoding, that addresses the storage and computation difficulties caused by the growing length of one-hot codes. It reduces the number of bits required and shortens the encoding length.

Pending Publication Date: 2019-01-29
ALIBABA GRP HLDG LTD

AI Technical Summary

Problems solved by technology

[0005] However, when a feature has a large number of possible values, the one-hot code becomes correspondingly longer, which makes storage and computation difficult.



Examples


Embodiment Construction

[0032] To address the problems of One-Hot encoding described in the background, the related art has further proposed an improved encoding, R-Hot. When R-Hot encoding converts a feature value into its encoded information, that information contains multiple bits set to 1 at the same time, so different feature values can be distinguished using fewer bits.

[0033] For example, when a feature has m=6 possible values and the number of 1 bits in the R-Hot encoding is chosen as r=2, the encoded information can be 1100, 0110, 0011, 1010, 0101, 1001. That is, only 4 bits are needed to represent six values, so R-Hot is superior to the One-Hot encoding above in both occupied space and required computing resources. R-Hot coding in the related art is usually random: after the number of bits of the encoded information is set, the encoded information corresponding to each feature value is generated randomly, and it only ...
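The bit-count comparison in the example above can be checked with a short sketch. It enumerates every 4-bit string with exactly two 1 bits and compares the code length with One-Hot for six values. The function names `one_hot_codes` and `r_hot_codes` are hypothetical, and the deterministic enumeration is an assumption made for illustration; the related-art R-Hot described above assigns codes randomly.

```python
from itertools import combinations


def one_hot_codes(m):
    """Return m one-hot codes, each m bits long with a single 1 bit."""
    return ["".join("1" if i == k else "0" for i in range(m)) for k in range(m)]


def r_hot_codes(n_bits, r):
    """Return every n_bits-long code that has exactly r bits set to 1.

    Illustrative enumeration only; not the random assignment used by
    the related-art R-Hot encoding.
    """
    codes = []
    for ones in combinations(range(n_bits), r):
        codes.append("".join("1" if i in ones else "0" for i in range(n_bits)))
    return codes


if __name__ == "__main__":
    r_hot = r_hot_codes(4, 2)
    one_hot = one_hot_codes(6)
    print("R-Hot codes:", r_hot)
    print("R-Hot length:", len(r_hot[0]), "bits for", len(r_hot), "values")
    print("One-Hot length:", len(one_hot[0]), "bits for", len(one_hot), "values")
```

With 4 bits and r=2 there are exactly six such codes, the same six listed in the example, whereas One-Hot needs 6 bits for the same six values.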



Abstract

A method and apparatus for feature encoding are provided. The method may include: mapping the identification code of a value of the feature to s+1 non-negative integers less than p, where p is a prime number not less than a formula given in the description, M is the total number of values of the feature, and s is the maximum number of collisions between the encoded information corresponding to different values of the feature; mapping the non-negative integers to an r-point subset of an r×p-point set, where r is the number of non-zero bits contained in the encoded information corresponding to a value of the feature, and r>1; and, based on the positions of the elements of the r-point subset within the r×p-point set, mapping the r-point subset to the encoded information corresponding to the value of the feature. The technical proposal of the application can shorten the encoding length while meeting the requirement on the number of collisions.
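The abstract does not give the concrete mappings or the bound on p. As one illustrative possibility (an assumption, not the construction stated in the patent), the three steps could be sketched as follows: the identifier is written as s+1 base-p digits, the digits are treated as coefficients of a polynomial modulo p, the polynomial is evaluated at x = 0, ..., r-1 to pick one point per row of an r×p grid, and each chosen point sets one bit in a code of length r·p. The function names, the base-p digit mapping, the polynomial step, and the conditions M ≤ p^(s+1) and r ≤ p are all assumptions of this sketch.

```python
def digits_base_p(ident, p, s):
    """Write ident as s+1 base-p digits, each a non-negative integer < p.

    Assumed mapping; requires 0 <= ident < p ** (s + 1).
    """
    if not 0 <= ident < p ** (s + 1):
        raise ValueError("identifier out of range for the chosen p and s")
    digits = []
    for _ in range(s + 1):
        ident, d = divmod(ident, p)
        digits.append(d)
    return digits


def encode(ident, p, s, r):
    """Return a list of r*p bits with exactly r bits set to 1.

    Assumed construction: evaluate the polynomial whose coefficients are
    the base-p digits at x = 0..r-1 (mod p), and set bit x*p + y for each
    resulting grid point (x, y). Requires r <= p so the x values are
    distinct modulo the prime p.
    """
    if r > p:
        raise ValueError("this sketch assumes r <= p")
    coeffs = digits_base_p(ident, p, s)
    bits = [0] * (r * p)
    for x in range(r):
        y = sum(c * pow(x, k, p) for k, c in enumerate(coeffs)) % p
        bits[x * p + y] = 1
    return bits


def collisions(a, b):
    """Count positions where both codes have a 1 bit."""
    return sum(u & v for u, v in zip(a, b))


if __name__ == "__main__":
    p, s, r = 5, 1, 3
    codes = [encode(i, p, s, r) for i in range(p ** (s + 1))]
    worst = max(collisions(a, b) for i, a in enumerate(codes) for b in codes[i + 1:])
    print("code length:", len(codes[0]), "values:", len(codes), "max collisions:", worst)
```

Under these assumptions, two distinct polynomials of degree at most s agree on at most s evaluation points, which is why any two codes in this sketch share at most s set bits. With p=5, s=1, r=3, the sketch encodes 25 values in 15 bits, where One-Hot would need 25.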

Description

Technical field

[0001] The present application relates to the field of coding technology, and in particular to a feature encoding method and device.

Background

[0002] In the related art, many fields involve feature encoding. For example, in machine learning, the features to be learned may include "gender" (whose values can include "male" and "female"), "region" (whose values can include "Asia", "Europe", and "Africa"), "browser used" (whose values can include "Browser A", "Browser B", "Browser C", and "Browser D"), and so on. Learning directly from such features is very inefficient.

[0003] For this reason, the above features can be encoded as numbers. For example, for the feature "gender", the value "male" can be set to 0 and the value "female" to 1; for the feature "region", the value "Asia" can be set to 0, the value "Europe" to 1, the value "Africa"...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/22
CPC: G06F40/126
Inventor: 张祺智, 游源, 李文杰, 李体云, 包洪英, 钱锟, 郭东白
Owner: ALIBABA GRP HLDG LTD