Traffic detection method and system based on interaction strategy, storage medium and electronic equipment

A traffic detection and traffic prediction technology, applied to transmission systems, electrical components, commerce and similar fields. It addresses the problem of adjusting model performance and achieves lower cost, better interpretability and better recognition performance.

Pending Publication Date: 2021-09-28
SHANGHAI MININGLAMP ARTIFICIAL INTELLIGENCE GRP CO LTD

AI-Extracted Technical Summary

Problems solved by technology

[0004] The embodiments of the present application provide a traffic detection method and system based on an interaction strategy, a storage medium and electronic equipment, to at least solve ...

Abstract

The invention discloses a traffic detection method and system based on an interaction strategy, a storage medium and electronic equipment. The detection method comprises: a preprocessing step of obtaining the return log information of the touring contacts of all users and preprocessing the traffic log in that return log information; a feature acquisition step of extracting single-dimensional data features from the preprocessed traffic log and receiving multi-dimensional data features; a traffic testing step of obtaining a traffic prediction result from a trained isolation forest model according to the single-dimensional and multi-dimensional data features; and a comparison step of comparing the traffic prediction result with the abnormal traffic identified by the initialization rules to obtain the abnormal traffic. As a weakly supervised advertising anti-fraud method based on an interaction strategy, it effectively reduces the cost of rule changes during abnormal traffic detection and improves the interpretability and recognition performance of the abnormal traffic model.

Application Domain

Transmission, Marketing

Technology Topic

Traffic volume, Interaction strategy +6

Example Embodiment

[0051] Example 1:
[0052] Please refer to Figure 1, which is a flow diagram of the traffic detection method based on an interaction strategy. As shown in Figure 1, the traffic detection method based on an interaction strategy of the present invention includes:
[0053] Preprocessing step S1: obtain the return log information of each user's tour contact, and preprocess the traffic log in the return log information;
[0054] Feature acquisition step S2: extract single-dimensional data features from the preprocessed traffic log and receive multi-dimensional data features;
[0055] Traffic testing step S3: obtain a traffic prediction result from the trained isolation forest model according to the single-dimensional data features and the multi-dimensional data features;
[0056] Comparison step S4: compare the traffic prediction result with the abnormal traffic identified by the initialization rules to obtain the abnormal traffic;
[0057] Ranking feedback step S5: rank the single-dimensional data features and the multi-dimensional data features by importance, and feed back at least one of the top-ranked data features together with the abnormal traffic;
[0058] Model improvement step S6: analyze the at least one top-ranked data feature and the abnormal traffic to obtain new multi-dimensional data features, then run a new round of training and interaction on the isolation forest model using the new multi-dimensional data features and the single-dimensional data features.
[0059] In the preprocessing step, feature fields are parsed from the traffic log, and the missing values of the parsed feature fields are filled.
[0060] Please refer to Figure 2, which is a flow chart of the feature acquisition step S2. As shown in Figure 2, the feature acquisition step S2 includes:
[0061] Single-dimensional data feature extraction step S21: extract the single-dimensional data features from the preprocessed traffic log;
[0062] Multi-dimensional data feature extraction step S22: extract the multi-dimensional data features from the expert rule base.
[0063] Please refer to Figure 3, which is a flow chart of the single-dimensional data feature extraction step S21. As shown in Figure 3, the single-dimensional data feature extraction step S21 includes:
[0064] Discrete variable encoding step S211: the discrete variables in the traffic log are encoded with one-hot encoding;
[0065] Continuous variable encoding step S212: the continuous variables in the traffic log are encoded by segment mapping, in which values in different intervals are mapped to different values.
[0066] Please refer to Figure 4, which is a flow chart of the comparison step S4. As shown in Figure 4, the comparison step S4 includes:
[0067] Prediction step S41: predict the normal and abnormal traffic in the test set with the isolation forest model to obtain the prediction result;
[0068] Identification step S42: compare the prediction result with the abnormal traffic identified by the initialization rules, and identify the abnormal traffic on which the two differ.
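The comparison in step S42 can be pictured as a set comparison between the records flagged by the model and the records flagged by the initialization rules. The following is a minimal sketch under that assumption; the function name `compare_flags` and the record IDs are hypothetical and do not appear in the patent text.

```python
def compare_flags(model_flagged, rule_flagged):
    """Split flagged record IDs by which detector raised them."""
    model_flagged, rule_flagged = set(model_flagged), set(rule_flagged)
    return {
        "both": model_flagged & rule_flagged,
        # Flagged only by the model: candidates for new expert rules.
        "model_only": model_flagged - rule_flagged,
        # Flagged only by the rules: rule hits the model did not reproduce.
        "rule_only": rule_flagged - model_flagged,
    }


# Hypothetical record IDs, for illustration only.
result = compare_flags(model_flagged=["r1", "r2", "r5"], rule_flagged=["r2", "r3"])
```

In this reading, the "model_only" set is the difference traffic that is later handed to experts for analysis.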
[0069] As shown in Figure 5, the specific steps are as follows:
[0070] Preprocessing step: the return log information of each user's tour contact is obtained through the advertisement traffic monitoring system, and the traffic log is preprocessed. Feature fields are parsed from the traffic log, and missing values in the parsed fields are filled according to the characteristics of each field. For example, the OS field is filled with the mode or with UNK, and the received-seconds field is filled with the mean.
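As an illustration of the field-dependent filling described above, here is a minimal sketch that fills categorical fields with the most frequent value (or the token UNK) and numeric fields with the mean. The field names `os` and `recv_seconds`, the function `fill_missing`, and the sample records are assumptions made for this example, not part of the patent text.

```python
from collections import Counter
from statistics import mean


def fill_missing(records, categorical, numeric, use_unk=False):
    """Return copies of `records` with None values filled per field type."""
    fills = {}
    for field in categorical:
        present = [r[field] for r in records if r.get(field) is not None]
        if use_unk or not present:
            fills[field] = "UNK"
        else:
            # Most frequent observed value (the mode).
            fills[field] = Counter(present).most_common(1)[0][0]
    for field in numeric:
        present = [r[field] for r in records if r.get(field) is not None]
        fills[field] = mean(present) if present else 0.0
    # Build new dicts so the input records are left untouched.
    return [
        {**r, **{f: v for f, v in fills.items() if r.get(f) is None}}
        for r in records
    ]


# Hypothetical parsed log records.
logs = [
    {"os": "android", "recv_seconds": 2.0},
    {"os": None, "recv_seconds": 4.0},
    {"os": "android", "recv_seconds": None},
    {"os": "ios", "recv_seconds": 6.0},
]
filled = fill_missing(logs, categorical=["os"], numeric=["recv_seconds"])
filled_unk = fill_missing(logs, categorical=["os"], numeric=[], use_unk=True)
```

Whether the mode or UNK is the better fill for a categorical field depends on how informative a missing value is; keeping UNK as its own category preserves that signal for the model.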
[0071] Feature engineering construction step: the construction of feature engineering plays a very important role in the final performance of the model. This application uses an interaction strategy to enrich the data features step by step. The features consist of two parts: single-dimensional data features obtained from log parsing, and multi-dimensional data features fed back from expert knowledge.
[0072] Single-dimensional data feature extraction step: the parsed variables include discrete variables and continuous variables. Discrete variables such as OS, MD and Region are one-hot encoded. Continuous variables such as the received seconds are encoded by segment mapping, which maps values in different intervals to different values.
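The two encodings can be sketched as follows. This is an illustrative example only: the vocabulary `OS_VOCAB`, the cut points `SECONDS_BOUNDARIES` and the field names are hypothetical, since the patent text does not give concrete categories or interval boundaries.

```python
from bisect import bisect_right


def one_hot(value, vocabulary):
    """Encode `value` as a 0/1 vector over a fixed, ordered vocabulary."""
    return [1 if value == v else 0 for v in vocabulary]


def segment_index(value, boundaries):
    """Map a continuous value to the index of the interval it falls in."""
    return bisect_right(boundaries, value)


OS_VOCAB = ["android", "ios", "UNK"]       # hypothetical categories
SECONDS_BOUNDARIES = [1.0, 5.0, 30.0]      # hypothetical cut points, ascending


def encode(record):
    """Concatenate the one-hot OS vector with the binned received seconds."""
    return one_hot(record["os"], OS_VOCAB) + [
        segment_index(record["recv_seconds"], SECONDS_BOUNDARIES)
    ]


vector = encode({"os": "ios", "recv_seconds": 7.5})
```

With these boundaries, 7.5 seconds falls in the third interval (index 2), so the encoded vector is `[0, 1, 0, 2]`.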
[0073] Multi-dimensional data features fed back from expert knowledge: experts summarize features from the expert rule base, providing the model with relevant features that enrich its feature set. Once the model has completed its initial training, the experts take the top-K features from the model's feature ranking, refine the related rules, and feed the features summarized from their expertise back into the model to enrich its feature engineering.
[0074] Model training and testing step: in the system's actual traffic, the volumes of normal and abnormal traffic are highly imbalanced, and abnormal traffic accounts for only a small share of all samples. The features of abnormal traffic differ greatly from those of normal traffic, and abnormal traffic is sparsely distributed, far from the high-density clusters. This application therefore uses the isolation forest algorithm as the base model for identifying abnormal traffic. First, n samples are drawn at random from the training data as a subsample and placed in the root node of an isolation tree. A dimension is then specified, and a cut point p is generated at random within the range of the current node's data, that is, between the minimum and maximum values of the specified dimension in the current node's data. The cut point defines a hyperplane that splits the current node's data space into two subspaces: data whose value in the specified dimension is less than p is placed in the left branch of the current node, and data whose value is greater than or equal to p is placed in the right branch. These operations are repeated on the left and right branches, constructing new nodes until a leaf node holds only one data point or the tree has reached the preset height. Finally, the results of all isolation trees are combined. Because the cutting process is random, the cutting is repeated from the beginning many times and the results of the cuts are averaged until the result converges.
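The tree construction and averaging described above can be sketched in plain Python. This is a simplified illustration, not the applicant's implementation: the score normalization `c(n)` and the formula `2 ** (-avg / c(size))` follow the standard isolation forest formulation rather than anything stated in the text, and the parameters `n_trees`, `sample_size` and the synthetic data are assumptions.

```python
import math
import random


def build_tree(rows, depth, max_depth, rng):
    """Recursively isolate rows with random axis-aligned cuts."""
    if len(rows) <= 1 or depth >= max_depth:
        return {"size": len(rows)}
    dim = rng.randrange(len(rows[0]))          # randomly chosen dimension
    lo = min(r[dim] for r in rows)
    hi = max(r[dim] for r in rows)
    if lo == hi:                               # simplification: stop on a constant dimension
        return {"size": len(rows)}
    cut = rng.uniform(lo, hi)                  # cut point p between min and max
    left = [r for r in rows if r[dim] < cut]
    right = [r for r in rows if r[dim] >= cut]
    return {"dim": dim, "cut": cut,
            "left": build_tree(left, depth + 1, max_depth, rng),
            "right": build_tree(right, depth + 1, max_depth, rng)}


def c(n):
    """Average path length of an unsuccessful binary-search-tree lookup."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def path_length(row, node, depth=0):
    """Depth at which `row` lands, plus a correction for unsplit leaves."""
    if "size" in node:
        return depth + c(node["size"])
    branch = "left" if row[node["dim"]] < node["cut"] else "right"
    return path_length(row, node[branch], depth + 1)


def fit_forest(rows, n_trees=100, sample_size=64, seed=0):
    """Build `n_trees` isolation trees, each on a random subsample."""
    if len(rows) < 2:
        raise ValueError("need at least two rows")
    rng = random.Random(seed)
    size = min(sample_size, len(rows))
    max_depth = math.ceil(math.log2(size))     # preset height limit
    trees = [build_tree(rng.sample(rows, size), 0, max_depth, rng)
             for _ in range(n_trees)]
    return trees, size


def anomaly_score(row, trees, size):
    """Score in (0, 1]; higher means the row is isolated in fewer cuts."""
    avg = sum(path_length(row, t) for t in trees) / len(trees)
    return 2.0 ** (-avg / c(size))


# Synthetic data: a dense cluster plus one far-away point.
data_rng = random.Random(1)
normal = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]
outlier = (8.0, 8.0)
trees, size = fit_forest(normal + [outlier])
```

Points that sit far from the dense cluster are separated after only a few cuts, so their average path length is short and their score is high; this is why the method suits sparse abnormal traffic.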
[0075] Result difference comparison and feature importance ranking step: the trained model predicts the normal and abnormal traffic in the test set; the predictions are compared with the abnormal traffic identified by the initialization rules, and the abnormal traffic on which the two differ is identified. The features used in the model's predictions are ranked by importance, and the top-5 features and the difference traffic are fed back.
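The text does not say how feature importance is computed for the model, so the sketch below assumes importance scores are already available as a mapping from feature name to score and only illustrates the top-5 selection and the feedback bundle. All feature names, scores and record IDs are hypothetical values chosen for illustration.

```python
def top_k_features(importance, k=5):
    """Return the k feature names with the highest importance, best first."""
    ranked = sorted(importance.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Hypothetical importance scores; how they are derived is left open here.
importance = {
    "os": 0.08, "md": 0.21, "region": 0.05, "recv_seconds": 0.31,
    "ip_click_rate": 0.19, "ua_entropy": 0.12, "hour": 0.04,
}

# Hypothetical IDs of traffic flagged by the model but not by the rules.
difference_ids = {"r1", "r5"}

feedback = {
    "features": top_k_features(importance),
    "difference_ids": sorted(difference_ids),
}
```

The `feedback` dictionary is what would be handed to experts in this reading: the most influential features together with the traffic the rules did not catch.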
[0076] Interaction strategy step: using the top-5 features and the difference traffic produced by the result difference comparison and feature importance ranking, experts analyze the rule features contained in the top-5 features and the difference traffic, add them to the expert rule base, and use the newly identified rules to enrich the model's feature engineering. A new round of model training and interaction then follows, until the model converges or expert knowledge can no longer make a judgment, that is, until the identified abnormal traffic cannot be described by rules.
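The interaction loop can be sketched as a function that alternates training with expert review until no new rules emerge. Everything below is a toy illustration: `interactive_loop`, the stub functions, the rule names and the record IDs are hypothetical, and the expert is simulated by a lookup table.

```python
def interactive_loop(train, detect, rule_detect, expert_review, features, max_rounds=10):
    """Alternate model training and expert feedback until no new rules emerge."""
    rules, model = [], None
    for round_no in range(1, max_rounds + 1):
        model = train(features)
        # Traffic flagged by the model that the current rules do not cover.
        difference = detect(model) - rule_detect(rules)
        new_rules = expert_review(difference)
        # Stop when nothing differs, or when experts cannot describe what remains.
        if not difference or not new_rules:
            return model, rules, round_no
        rules.extend(new_rules)
        features = features + [f"rule:{r}" for r in new_rules]
    return model, rules, max_rounds


# Toy stand-ins for the real components (all hypothetical).
KNOWN = {"r1": "burst_clicks", "r5": "fixed_interval"}   # simulated expert knowledge


def train(features):
    return tuple(features)                 # the "model" is just its feature list


def detect(model):
    return {"r1", "r2", "r5", "r9"}        # records the model flags


def rule_detect(rules):
    hits = {"r2"}                          # caught by the initialization rule
    hits |= {rid for rid, rule in KNOWN.items() if rule in rules}
    return hits


def expert_review(difference):
    return sorted({KNOWN[r] for r in difference if r in KNOWN})


model, rules, rounds = interactive_loop(
    train, detect, rule_detect, expert_review, features=["os", "recv_seconds"]
)
```

In this toy run the first round yields two new rules, and the second round stops because the remaining record `r9` cannot be described by any rule the simulated expert knows.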

[0077] Example 2:
[0078] Please refer to Figure 6, which is a schematic structural diagram of the traffic detection system based on an interaction strategy of the present invention.
[0079] As shown in Figure 6, the traffic detection system based on an interaction strategy of the present invention includes:
[0080] A preprocessing module, which acquires the return log information of each user's tour contact and preprocesses the traffic log in the return log information;
[0081] A feature acquisition module, which extracts single-dimensional data features from the preprocessed traffic log and receives multi-dimensional data features;
[0082] A traffic testing module, which obtains a traffic prediction result from the trained isolation forest model according to the single-dimensional data features and the multi-dimensional data features;
[0083] A comparison module, which compares the traffic prediction result with the abnormal traffic identified by the initialization rules to obtain the abnormal traffic;
[0084] A ranking feedback module, which ranks the single-dimensional data features and the multi-dimensional data features by importance and feeds back at least one of the top-ranked data features together with the abnormal traffic;
[0085] A model improvement module, which analyzes the at least one top-ranked data feature and the abnormal traffic to obtain new multi-dimensional data features, and runs a new round of training and interaction on the isolation forest model using the new multi-dimensional data features and the single-dimensional data features.

[0086] Example 3:
[0087] With reference to Figure 7, this embodiment discloses a specific implementation of an electronic device. The electronic device may include a processor 81 and a memory 82 storing computer program instructions.
[0088] Specifically, the processor 81 may include a central processing unit (CPU) or an application-specific integrated circuit (ASIC), or may be configured as one or more integrated circuits that implement the embodiments of the present application.
[0089] The memory 82 may include mass storage for data or instructions. By way of example and not limitation, the memory 82 may include a hard disk drive (HDD), a floppy disk drive, a solid-state drive (SSD), flash memory, an optical disc, a magneto-optical disc, magnetic tape, a Universal Serial Bus (USB) drive, or a combination of two or more of these. Where appropriate, the memory 82 may include removable or non-removable (fixed) media. Where appropriate, the memory 82 may be internal or external to the data processing device. In a particular embodiment, the memory 82 is non-volatile memory. In a particular embodiment, the memory 82 includes read-only memory (ROM) and random-access memory (RAM). Where appropriate, the ROM may be mask-programmed ROM, programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), electrically alterable ROM (EAROM), flash memory (FLASH), or a combination of two or more of these. Where appropriate, the RAM may be static random-access memory (SRAM) or dynamic random-access memory (DRAM), where the DRAM may be fast page mode DRAM (FPM DRAM), extended data out DRAM (EDO DRAM), synchronous DRAM (SDRAM), and the like.
[0090] The memory 82 may be used to store or cache the various data files that need to be processed and/or communicated, as well as the computer program instructions executed by the processor 81.

