Feature coding method and device

A feature encoding method and device, applied in the computer field, that addresses the deployment cost caused by very sparse large matrices and excessive model parameters, and achieves a shorter feature encoding.

Active Publication Date: 2021-07-23
ADVANCED NEW TECH CO LTD

AI Technical Summary

Problems solved by technology

[0005] When a non-numeric variable, such as an IP address or device ID, takes too many distinct values, directly one-hot encoding all of them often produces a very large, sparse matrix. The model then has too many parameters, which adds significant cost to subsequent model deployment.
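
The sparsity problem described in [0005] can be illustrated with a minimal sketch. The helper `one_hot_encode` and the synthetic IP addresses are hypothetical and are not taken from the patent; the sketch only shows that the one-hot vector width grows with the number of distinct values.

```python
def one_hot_encode(values):
    """Map each value to a one-hot vector over the distinct values seen.

    Hypothetical helper for illustration; not part of the patent text.
    """
    vocab = sorted(set(values))
    index = {v: i for i, v in enumerate(vocab)}
    vectors = []
    for v in values:
        vec = [0] * len(vocab)  # one slot per distinct value
        vec[index[v]] = 1       # exactly one non-zero entry
        vectors.append(vec)
    return vocab, vectors


# Synthetic data (an assumption): 1000 distinct IP-like strings.
ips = [f"10.0.{i // 256}.{i % 256}" for i in range(1000)]
vocab, vectors = one_hot_encode(ips)

width = len(vectors[0])
nonzero_fraction = sum(map(sum, vectors)) / (len(vectors) * width)
print(width, nonzero_fraction)
```

With 1000 distinct values, every row is 1000 entries wide and holds a single 1, so a model consuming this matrix needs at least one parameter per distinct value for this one variable.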

Examples

Embodiment Construction

[0053] The solutions provided in this specification will be described below in conjunction with the accompanying drawings.

[0054] Figure 1 is a schematic diagram of an implementation scenario of an embodiment disclosed in this specification. As shown in Figure 1, when a machine learning model is trained, training data is used as the model's input. The training data includes variables of non-numeric type, such as product name, store name, and buyer's IP address. These non-numeric variables can be used as input to the machine learning model after feature coding. The embodiments of this specification mainly concern methods for feature coding of non-numeric variables, based on the degree to which their values discriminate the target.

[0055] Understandably, the machine learning model shown in Figure 1 is only an example and does not limit the machine learning algorithm in the embodiments of this specification. For example, the ...

Abstract

The embodiments of this specification provide a feature encoding method and device. The method includes: obtaining a variable value of a feature variable related to a business goal, the variable value being of a non-numeric type; selecting, from multiple feature encoding methods, the target encoding method corresponding to the variable value, according to a predetermined correspondence between multiple value sets of the feature variable and the multiple feature encoding methods, where the value sets are divided according to the pre-evaluated degree to which each possible value of the feature variable discriminates the business goal, and the feature encoding methods respectively encode the values in their corresponding value sets into multiple vector spaces; encoding the variable value with the target encoding method into a target vector in a target vector space, the target vector space being the one of the multiple vector spaces that corresponds to the target encoding method; and determining the feature vector of the feature variable based on the target vector. The method keeps the model from losing useful information, reduces the length of the feature encoding, and generalizes to some extent.
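
The flow in the abstract can be sketched roughly as follows. The abstract does not say how discrimination is measured, which encoding methods are used, or how the feature vector is built from the target vector, so every specific here is an assumption made for illustration: the score (absolute gap between a value's positive rate and the base rate), the thresholds `high` and `low`, the three encoders (one-hot, hash bucket, single shared slot), and the zero-padded concatenation of the three vector spaces. The names `split_by_discrimination` and `GroupedEncoder` are hypothetical.

```python
import zlib


def split_by_discrimination(stats, base_rate, high=0.3, low=0.1):
    """Assign each value to a value set by an assumed discrimination score.

    stats maps value -> (count, positives). The score and thresholds are
    illustrative assumptions, not the patent's definition.
    """
    groups = {}
    for value, (count, positives) in stats.items():
        score = abs(positives / count - base_rate)
        if score >= high:
            groups[value] = "high"
        elif score >= low:
            groups[value] = "medium"
        else:
            groups[value] = "low"
    return groups


class GroupedEncoder:
    """One encoding method per value set, each with its own vector space."""

    def __init__(self, groups, n_buckets=4):
        self.groups = groups
        high_values = sorted(v for v, g in groups.items() if g == "high")
        self.high_index = {v: i for i, v in enumerate(high_values)}
        self.n_buckets = n_buckets

    def encode(self, value):
        high = [0] * len(self.high_index)  # space 1: one-hot
        medium = [0] * self.n_buckets      # space 2: hash buckets
        low = [0]                          # space 3: single shared slot
        # Assumption: values never seen before are treated as "low".
        group = self.groups.get(value, "low")
        if group == "high":
            high[self.high_index[value]] = 1
        elif group == "medium":
            bucket = zlib.crc32(value.encode("utf-8")) % self.n_buckets
            medium[bucket] = 1
        else:
            low[0] = 1
        # Assumption: the feature vector is the target vector placed in its
        # own space, with zeros in the other spaces.
        return high + medium + low


# Made-up statistics: value -> (occurrences, positive labels).
stats = {"ip_a": (100, 90), "ip_b": (100, 5), "ip_c": (100, 22), "ip_d": (50, 40)}
groups = split_by_discrimination(stats, base_rate=0.2)
encoder = GroupedEncoder(groups)
vec_a = encoder.encode("ip_a")
vec_b = encoder.encode("ip_b")
vec_c = encoder.encode("ip_c")
vec_new = encoder.encode("ip_unseen")
print(groups, vec_a, vec_c)
```

In this sketch the vector length is fixed by the number of high-discrimination values plus the bucket count plus one, rather than by the total number of distinct values, which is one way the encoding length could shrink while strongly discriminating values keep a dedicated dimension.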

Description

Technical field

[0001] One or more embodiments of this specification relate to the computer field, and in particular to a feature encoding method and device.

Background

[0002] In classic data modeling scenarios, much of the data is often represented by variables of non-numeric types. For example, a user purchases a product: the product belongs to a first-level category and a second-level category, has its own name and type, and belongs to a certain store (identified by the store's nickname or ID). The user logs in from an Internet Protocol (IP) address, a wireless fidelity (Wi-Fi) network, or a physical device, and the transaction is finally found to be a fake transaction, a cash transaction, a stolen-card transaction, and so on.

[0003] The scenario described above contains a large amount of behavioral information, and this information often appears in the data structure as variables of non-numeric type. Classical machine learning al...

Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06N20/00
Inventors: 宋乐, 李辉, 葛志邦, 黄鑫, 王琳, 朱冠胤
Owner: ADVANCED NEW TECH CO LTD