Bitcoin address classification method based on improved random forest

A classification method using random forest technology, applied to computer components, payment systems, payment circuits, etc. It addresses the problems of complicated data collection methods, difficult data collection, and the inability to completely cluster the address groups controlled by the user behind a transaction's inputs.

Active Publication Date: 2020-10-09
TIANJIN UNIVERSITY OF TECHNOLOGY
6 Cites, 8 Cited by

AI Technical Summary

Problems solved by technology

[0007] However, data collection in the existing transaction graph analysis methods is too complicated. The historical Bitcoin transaction records must first be turned into a graph according to construction rules that each researcher defines, and features are then extracted from that graph. Different researchers often use different construction methods, but in every case the result is a very large graph. Meanwhile, the existing multi-class classification methods extract many features per address, which makes data collection difficult and slow. As a result, it is hard to classify a Bitcoin address quickly.
[0008] The existing heuristic-based address clustering methods have two defects. First, the multi-input heuristic is only effective for specific types of transactions: the input addresses of a multi-input transaction can be clustered into one category, but the input address of a single-input transaction is not assigned to any category if it never appears in later transaction records.
Second, the change-address heuristic is affected by changes in the Bitcoin protocol: because the change address may be a new address generated automatically by the Bitcoin wallet or a new address specified by the user, this method cannot completely cluster the group of addresses controlled by the user behind the transaction inputs.
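The first limitation in [0008] can be illustrated with a minimal sketch of multi-input clustering using a union-find structure. This is an illustration of the general heuristic, not code from the patent; the function name `cluster_by_common_inputs` and the input format (one list of input addresses per transaction) are assumptions made for the example.

```python
from collections import defaultdict


def cluster_by_common_inputs(transactions):
    """Group addresses that appear together as inputs of one transaction.

    `transactions` is an iterable of input-address lists (hypothetical format).
    Returns a sorted list of sorted address clusters.
    """
    parent = {}

    def find(addr):
        # Register unseen addresses as their own cluster root.
        parent.setdefault(addr, addr)
        while parent[addr] != addr:
            parent[addr] = parent[parent[addr]]  # path halving
            addr = parent[addr]
        return addr

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for inputs in transactions:
        if not inputs:
            continue
        first = inputs[0]
        find(first)  # make sure single-input addresses are registered
        for other in inputs[1:]:
            union(first, other)

    clusters = defaultdict(set)
    for addr in list(parent):
        clusters[find(addr)].add(addr)
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])


# "A" and "B" share a transaction, as do "B" and "C"; "D" is a lone input.
example_txs = [["A", "B"], ["B", "C"], ["D"]]
clusters = cluster_by_common_inputs(example_txs)
```

In this toy input, address "D" is the sole input of one transaction and never reappears, so it ends up in a cluster of its own, which is the situation the paragraph describes as not being classified into any category.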
[0009] The existing machine learning classification methods have not reached a consensus on which features to extract from transaction history records, or on how many. Different researchers therefore extract different numbers and types of features. This unguided feature selection leaves many redundant features in the resulting feature set, which increases the training cost of the learner and provides no guidance on which features to extract for an address that needs to be classified.

Examples

Embodiment Construction

[0057] The Bitcoin address classification method based on an improved random forest provided by the present invention is described in detail below with reference to the drawings and specific embodiments.

[0058] As shown in Figures 1 to 5, the Bitcoin address classification method based on an improved random forest provided by the present invention comprises the following steps, carried out in order:

[0059] S1: Extract the original features of each address from the historical transaction records of the blockchain and add them to the feature set used by the existing machine learning classification methods, building a larger feature set;

[0060] The specific method is as follows:

[0061] S101: Set the following rules for extracting the original features of an address: the unit of address survival time is days; a survival time of less than 24 hours counts as one day, and in all other cases the number of survival days is rounded down; for self-change transactions, that is...
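The survival-time rule in S101 can be sketched as a small function. This is a minimal sketch under assumptions: the name `survival_days` and the use of first-seen and last-seen timestamps as the measure of an address's lifetime are illustrative choices, not details stated in the excerpt.

```python
from datetime import datetime, timedelta


def survival_days(first_seen, last_seen):
    """Address survival time in whole days, following the S101 rule.

    Assumes (hypothetically) that survival time is the span between the
    first and last transaction timestamps of the address.
    """
    lifetime = last_seen - first_seen
    if lifetime < timedelta(0):
        raise ValueError("last_seen must not precede first_seen")
    if lifetime < timedelta(days=1):
        return 1  # anything shorter than 24 hours counts as one day
    return lifetime.days  # otherwise round down to whole days


start = datetime(2020, 1, 1, 0, 0)
short_lived = survival_days(start, start + timedelta(hours=3))
long_lived = survival_days(start, start + timedelta(days=2, hours=23))
```

Here `short_lived` is 1 because the address lived for less than 24 hours, and `long_lived` is 2 because 2 days and 23 hours is rounded down.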


Abstract

The invention discloses a Bitcoin address classification method based on an improved random forest. The method comprises the steps of constructing a feature set, forming a data set, obtaining a labeled sample set, initializing the parameters of a learner, iterating the learner, obtaining the key features, and the like. From the perspective of Bitcoin market supervision, the problem of judging whether a user participates in illegal transactions is converted into a Bitcoin address classification problem, which helps improve market supervision. The sample set is obtained directly from the historical transaction records of the blockchain, so the difficulty of data collection is greatly reduced. Bitcoin addresses can be classified with an accuracy of about 84% using only a few statistical features. Compared with the prior art, the method classifies addresses well while eliminating redundancy among the large number of constructed statistical features. After the learner completes its final training, the key features that need to be extracted are known, so for an address that needs to be recognized, both the data collection time and the time cost of classification are reduced.
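The excerpt does not say how the learner is iterated or how the random forest is improved, so the following is only a generic sketch of importance-based backward feature elimination, one common way to remove redundant features and arrive at a small set of key features. Everything in it is hypothetical: the function `prune_features`, the `evaluate` callback (which in practice would train a random forest and return its accuracy and per-feature importances), the tolerance value, the feature names, and the toy scores.

```python
def prune_features(features, evaluate, tolerance=0.01):
    """Repeatedly drop the least important feature.

    `evaluate(feature_list)` must return (accuracy, {feature: importance}).
    Pruning stops when accuracy falls more than `tolerance` below the best
    accuracy seen so far, or when one feature remains.
    """
    current = list(features)
    best_acc, importances = evaluate(current)
    while len(current) > 1:
        weakest = min(current, key=lambda f: importances[f])
        candidate = [f for f in current if f != weakest]
        acc, cand_importances = evaluate(candidate)
        if acc < best_acc - tolerance:
            break  # removing this feature hurts too much; keep it
        current, importances = candidate, cand_importances
        best_acc = max(best_acc, acc)
    return current


# Toy stand-in for a real learner (hypothetical names and made-up scores):
# only two of the four features carry signal.
INFORMATIVE = {"lifetime_days", "tx_count"}


def toy_evaluate(feature_list):
    hits = sum(1 for f in feature_list if f in INFORMATIVE)
    accuracy = 0.5 + 0.2 * hits
    importances = {f: (1.0 if f in INFORMATIVE else 0.1) for f in feature_list}
    return accuracy, importances


key_features = prune_features(
    ["lifetime_days", "noise_a", "tx_count", "noise_b"], toy_evaluate
)
```

With the toy scorer, the two noise features are dropped without loss of accuracy, and pruning stops once removing a further feature would reduce accuracy beyond the tolerance.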

Description

Technical field

[0001] The invention belongs to the technical field of data mining and machine learning, and in particular relates to a Bitcoin address classification method based on an improved random forest.

Background

[0002] With the continuous development of the digital currency market, Bitcoin, as a typical representative of digital currency, has become more and more widely used. Bitcoin addresses are the unique identifiers with which users participate in services, but the anonymity of Bitcoin also facilitates illegal activities such as money laundering. To better understand how Bitcoin is used, it is therefore very important to explain user transaction behavior through Bitcoin addresses, yet that same anonymity makes this challenging. Hence the question of how to classify a Bitcoin address in the system quickly and efficiently, that is, how to use fewer statistical features to determine whether the address is owned by an illegal user or belongs to ...


Application Information

Patent Type & Authority: Applications (China)
IPC (8): G06Q40/04, G06Q20/38, G06Q20/06, G06K9/62
CPC: G06Q40/04, G06Q20/389, G06Q20/065, G06F18/24323, G06F18/214
Inventor: 王劲松, 陶峰, 张洪玮, 赵泽宁, 石凯
Owner: TIANJIN UNIVERSITY OF TECHNOLOGY