Method for preprocessing abnormal values of e-business sales amounts based on statistical discrimination process

An outlier preprocessing technology, applied in the fields of electrical digital data processing, data processing applications, and special data processing applications. It addresses problems such as missing data values, inconsistent data values, and inflated sales volume, with the effects of saving time when consulting data, improving data accuracy, and shortening the collection cycle.

Status: Inactive
Publication Date: 2015-05-27
INSPUR GROUP CO LTD

AI Technical Summary

Problems solved by technology

[0004] 1) Missing data values, data noise, inconsistent data values, etc., caused by omissions in data mining or other reasons
[0005] 2) Inflated sales figures caused by merchants providing false information and fabricating sales records
[0006] 3) Inflated sales volume caused by merchants maliciously brushing orders, which in turn inflates overall sales



Examples


Embodiment 1

[0022] The preprocessing method comprises the following steps:

[0023] Step 1: Improve the data mining techniques and tools;

[0024] Step 2: Perform a preliminary verification of the basic data to find outliers; store the non-outliers in the original e-commerce database and verify the outliers again;

[0025] Step 3: Classify the outliers into: 1) missing and noisy data; 2) false data; 3) order-brushing data;

[0026] Step 4: Strengthen comparison against the false information database and elimination of matching records, reduce missing and noisy data, and fill missing data with zeros;

[0027] Step 5: Use DDFAI to discriminate and verify the false data. Data judged to be false is recorded in the false information database and deleted; data judged not to be false is stored in the original e-commerce database;

[0028] Step 6: Verify the order-brushing data; the verification method is: 1) The swiping website i...
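
The preliminary verification in Step 2 can be sketched as a simple partition of sales amounts into non-outliers and suspected outliers. The source does not name the statistical test it uses, so the interquartile-range rule, the multiplier `k=1.5`, and the function name `split_outliers` are assumptions chosen only for illustration.

```python
from statistics import quantiles


def split_outliers(sales, k=1.5):
    """Partition numeric sales amounts into (non_outliers, outliers).

    Assumption: the IQR rule stands in for the unspecified statistical test.
    """
    q1, _, q3 = quantiles(sales, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    normal = [x for x in sales if low <= x <= high]
    suspect = [x for x in sales if x < low or x > high]
    return normal, suspect


normal, suspect = split_outliers([100, 120, 110, 130, 115, 125, 5000])
```

In this sketch the `normal` values would go to the original e-commerce database, while the `suspect` values would be verified again and then classified as in Step 3.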

Embodiment 2

[0032] For abnormal values of e-commerce sales, first build up the outlier database:

[0033] 1) Run an outlier test on the data. If a record is confirmed to be an outlier, delete it and record its information in the outlier database;

[0034] 2) When collecting data again, first compare the data to be collected with the outlier database. If the information matches, that record is neither collected nor stored;

[0035] 3) Run an outlier test on the newly collected data. If a record is detected as an outlier, delete it and record its information in the outlier database. Repeating this cycle continuously improves the outlier database.

[0036] Second, on the basis of the completed outlier database, make the classification judgment:

[0037] 1) When there is data noise, that is, when there is a null value, zero-fill the data. In the later stage, developers need to further improve data mining technology and improve...
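
The collection cycle in items 1) to 3) can be sketched as a single pass over each incoming batch. This is a minimal illustration, not the patented implementation: the function name `collect`, the use of a Python `set` as the outlier database, and the caller-supplied `is_outlier` test are all assumptions, since the source does not specify a storage format or a test.

```python
def collect(batch, outlier_db, is_outlier):
    """Return the records to store; grow outlier_db with new outliers.

    batch      : iterable of hashable records, e.g. (merchant_id, amount)
    outlier_db : set of records already known to be outliers (mutated in place)
    is_outlier : caller-supplied test (hypothetical; not given in the source)
    """
    stored = []
    for record in batch:
        if record in outlier_db:   # item 2): matches a known outlier, skip it
            continue
        if is_outlier(record):     # items 1) and 3): drop it and record it
            outlier_db.add(record)
            continue
        stored.append(record)
    return stored


outlier_db = set()
kept = collect([("m1", 200), ("m2", 9000)], outlier_db, lambda r: r[1] > 1000)
```

Because the database is checked before the test runs, a record that was rejected once is skipped on every later collection without being tested again, which is how repeating the cycle shortens later passes.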
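
The zero-filling of null values described in item 1) can be sketched as follows. The record layout (a list of dictionaries) and the field name `sales_amount` are hypothetical, chosen only to make the example self-contained.

```python
def zero_fill(records, field="sales_amount"):
    """Return copies of the records with a null or missing `field` set to 0."""
    filled = []
    for record in records:
        fixed = dict(record)           # copy so the input is left unchanged
        if fixed.get(field) is None:   # covers both None and an absent key
            fixed[field] = 0
        filled.append(fixed)
    return filled


rows = [{"shop": "a", "sales_amount": 12.5}, {"shop": "b", "sales_amount": None}]
clean = zero_fill(rows)
```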


Abstract

The invention discloses a method for preprocessing abnormal values of e-business sales amounts based on a statistical discrimination process. The preprocessing method comprises the following steps: improving data mining technologies and tools; performing preliminary verification on basic data; classifying abnormal values; strengthening comparison against a false information base and elimination of matching records, reducing missing data and noise data, and zero-padding genuinely missing data; discriminating and verifying false data; verifying order-brushing data; comparing each acquired result with an abnormal database during data acquisition; and, once a massive database has formed, establishing a basic information base and processing the massive data in batches. Compared with the prior art, the method is more specifically targeted at abnormal e-business data, shortens the acquisition cycle and greatly increases data accuracy once the abnormal e-business data has been checked, is simple to operate, and saves customers time when looking up information.

Description

Technical field

[0001] The invention relates to the technical field of computer network data processing, in particular to a method for preprocessing abnormal values of e-commerce sales based on a statistical discrimination method.

Background technique

[0002] Current e-commerce databases are highly susceptible to noise, missing data and inconsistent data. In practice, order brushing and merchants providing false information also persist despite repeated bans. Low-quality data leads to low-quality mining results, and low-quality basic data makes high-quality decisions impossible. How to preprocess outliers in e-commerce data, improve data quality, and make efficient statistical decisions are issues that must be addressed in data analysis.

[0003] Outliers in current e-commerce data mainly take the following forms:

[0004] 1) Missing data values, data noise, inconsistent data values, etc. due to omissions in data mining or...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06Q10/063, G06Q50/06
Inventor: 左少标, 贾亦真, 张鑫, 徐宏伟
Owner: INSPUR GROUP CO LTD