Target characteristic data mining method and apparatus

A technique for mining target feature data from feature data, applied in fields such as special data processing applications, electrical digital data processing, and machine learning, that addresses problems such as degraded machine learning performance.

Active Publication Date: 2017-08-15
ZHEJIANG TMALL TECH CO LTD

AI Technical Summary

Problems solved by technology

[0011] This approach filters features in a blanket manner, so it may discard a large number of effective features and significantly degrade the machine learning results.



Embodiment Construction

[0097] To make the above objects, features, and advantages of the present application clearer and easier to understand, the application is described in further detail below with reference to the accompanying drawings and specific embodiments.

[0098] Referring to Figure 1, a flow chart of the steps of an embodiment of a method for mining target feature data of the present application is shown. The method may specifically include the following steps:

[0099] Step 101, counting feature frequencies for the first feature data;

[0100] In a specific implementation, source data can be collected from web logs. The source data is parsed and meaningless information, such as the field "-", is removed to obtain structured first feature data, such as the user ID, the ID of the product the user accessed, the access time, and the user behavior (such as click, purchase, or evaluation).
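
As a minimal sketch of step 101, assuming the log has already been parsed into per-event records, the counting could look like the following. The field names (`user_id`, `item_id`, `behavior`), the `field=value` feature encoding, the sample records, and the helper names `clean_record` and `count_feature_frequencies` are illustrative assumptions and are not taken from the application.

```python
from collections import Counter

# Placeholder value treated as meaningless, following the "-" example above.
EMPTY = "-"


def clean_record(record):
    """Drop fields whose value is the meaningless placeholder."""
    return {k: v for k, v in record.items() if v != EMPTY}


def count_feature_frequencies(records):
    """Count how often each (assumed) 'field=value' feature occurs."""
    freq = Counter()
    for record in records:
        for field, value in clean_record(record).items():
            freq[f"{field}={value}"] += 1
    return freq


# Made-up records, for illustration only.
records = [
    {"user_id": "u1", "item_id": "i1", "behavior": "click"},
    {"user_id": "u1", "item_id": "i2", "behavior": "purchase"},
    {"user_id": "u2", "item_id": "i1", "behavior": "-"},
]
freq = count_feature_frequencies(records)
```

The resulting `freq` mapping is the kind of per-feature frequency table that later filtering steps would consult.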

[0101] For example, the website logs are:

[0102] ...


Abstract

Embodiments of the invention provide a method and apparatus for mining target feature data. The method comprises: counting feature frequencies for first feature data; filtering low-frequency feature data out of the first feature data according to the feature frequency to obtain second feature data; and filtering at least part of the medium-frequency feature data out of the second feature data according to the feature frequency to obtain target feature data. Model performance is essentially unaffected: the machine learning effect is preserved while the number of features is greatly reduced, which in turn greatly reduces the number of machines and the amount of resources required, shortens training time, and lowers training cost.
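
The two filtering stages described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the claimed implementation: the thresholds `low_threshold` and `high_threshold`, the random thinning of medium-frequency features with probability `keep_prob`, and the function name `mine_target_features` are all hypothetical, since the excerpt does not say how the frequency bands are defined or which medium-frequency features are dropped.

```python
import random
from collections import Counter


def mine_target_features(features, low_threshold, high_threshold, keep_prob, seed=0):
    """Two-stage frequency filter (illustrative sketch).

    Assumptions, not from the source:
      - frequency < low_threshold counts as "low";
      - low_threshold <= frequency < high_threshold counts as "medium";
      - each medium-frequency feature is kept with probability keep_prob.
    """
    freq = Counter(features)  # feature frequency statistics
    rng = random.Random(seed)  # seeded so the sketch is reproducible

    # Stage 1: drop low-frequency features to get the second feature data.
    second = {f: c for f, c in freq.items() if c >= low_threshold}

    # Stage 2: drop part of the medium-frequency features to get the target.
    target = {}
    for f, c in second.items():
        if c >= high_threshold or rng.random() < keep_prob:
            target[f] = c
    return second, target


# Made-up feature occurrences: "a" is frequent, "b" medium, "c" rare.
features = ["a"] * 10 + ["b"] * 5 + ["c"] * 1
second, target = mine_target_features(features, low_threshold=2,
                                      high_threshold=8, keep_prob=0.0)
```

With `keep_prob=0.0` every medium-frequency feature is removed; a value between 0 and 1 removes only part of them, which is one possible reading of "at least part".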

Description

Technical field

[0001] The present application relates to the technical field of computer processing, and in particular to a method for mining target feature data and a device for mining target feature data.

Background

[0002] Machine learning (ML) is an interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and other disciplines. It is mainly used in artificial intelligence to acquire new knowledge or skills and to reorganize existing knowledge structures so as to continuously improve performance.

[0003] Data and features are two particularly important aspects of machine learning, and they greatly affect its results.

[0004] Take estimating the click-through rate (CTR) of a piece of information as an example. Estimating CTR requires at least two kinds of data: one is the data of the information itself, and the other is...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/2465, G06N20/00, G06N7/01, G06F16/182
Inventor: 周俊
Owner: ZHEJIANG TMALL TECH CO LTD