Malicious website recognition and interception technology based on communication operator network transmission layer

A technology in the field of carrier networks and malicious URLs, applicable to transmission systems, electrical components, and similar areas. It addresses threats to users' online security and achieves short feedback time, high efficiency and accuracy, and accurate collection.

Pending Publication Date: 2020-04-21
多彩贵州印象网络传媒股份有限公司
4 Cites 2 Cited by

AI-Extracted Technical Summary

Problems solved by technology

[0004] At the same time, with the strengthening of domestic Internet information content management, a large number of malicio...

Method used

(5) The underlying original domain name database is consolidated and divided roughly into a domain name database, a root domain name database, and a URL database. This simplifies processing by the upper-level website collection and analysis tools, enabling more efficient targeted processing and fewer repeated operations;
For highly concurrent website request data, DPDK is used on multi-core equipment to create multiple threads, each bound to its own core, which reduces thread-scheduling overhead and improves performance. DPDK does not use conventional memory allocation functions such as malloc(); instead, it manages its own memory. More specifically, DPDK allocates huge pages, creates a heap in that memory, and provides it to user applications for the data structures they access. This gives the end application a performance advantage: because DPDK creates the memory regions that applications use, applications natively benefit from huge pages, NUMA node affinity, access to DMA addresses, IOVA contiguity, and so on, without any additional development;
Step 3, malicious website identification and blocking: using the established local malicious website database, the URLs that users access can be checked quickly and in real time, with support for custom fuzzy-matching detection rules based on domain name characteristics; for URLs missed by the local malicious website database and the u...

Abstract

The invention discloses a malicious website recognition and interception technology based on the network transmission layer of a communication operator. The technology specifically comprises the following. Malicious website data collection: a large number of malicious websites exist on the Internet, hidden among an enormous number of sites, and they are found quickly by collecting users' access traces. Malicious website recognition and blocking: a two-stage malicious URL detection mechanism ensures timeliness and accuracy; a local malicious website recognition module is established so that malicious websites can be blocked quickly and effectively, and a cloud malicious website recognition module is established to perform in-depth networked analysis of malicious websites that cannot be identified locally, with centralized processing to reduce resource consumption. Malicious website recognition efficiency: recognition requires comparing and processing massive amounts of website data, so a powerful hardware system and optimized algorithms are established, and recognition speed can reach the millisecond level.

Examples

  • Experimental program(1)

Example Embodiment

[0041] The present invention will be further clarified below in conjunction with the drawings and specific embodiments.
[0042] A malicious website identification and interception technology based on the network transport layer of communication operators comprises a system management center, a malicious website blocking engine, an intelligent detection engine, a website collection module, a cloud detection module, a data management module, a data statistics module, a strategy audit module, and a security management module. The management center provides interface-based integrated management and control for each engine and module, and configures and distributes instructions so that each module can complete its corresponding business operations;
[0043] It includes the following steps:
[0044] Step 1. Establish a local malicious URL database. The database supports image recognition and keyword detection, collects data from the Internet by following backlinks from existing data, and supports data interconnection with the owner's other sources; it can also connect to third-party open Internet malicious URL interfaces so that malicious URLs are learned intelligently;
[0045] Step 2. Collect malicious website data. Access data is collected for the websites that users request and visit, including links sent by pseudo base stations, so that malicious websites are found quickly. Collection can cover all 2G/3G/4G mobile users who access CMNET and CMWAP. The collected content relating to personal information security includes the total number of malicious URLs, their source, category, number of visits, number of times blocked, number of warnings, website location, and record information;
[0046] Step 3. Identify and block malicious URLs. Using the established local malicious URL database, the URLs that users access are checked quickly and in real time, with support for custom fuzzy-matching detection rules based on domain name characteristics. Unknown URLs that the local malicious URL database and the local detection algorithm fail to identify are sent to the cloud, where the cloud detection model analyzes them in depth to determine whether they are malicious. The cloud produces the result for an unknown URL through standalone analysis or networked analysis and feeds it back to the local side through the malicious URL detection module. The detection results are downloaded to the local detection system for storage, so the malicious URL database of the local detection system is continuously enriched. The local detection algorithm mainly checks for string suffixes that meet the requirements, using optimized regular-expression matching and the Sunday single-pattern string matching algorithm, which are more efficient and better suited to message data than traditional string matching. The core idea of the Sunday algorithm is that, when a mismatch is found during matching, the algorithm skips as many characters as possible before the next comparison, thereby improving matching efficiency. On a mismatch, it examines the text character immediately after the current window: if that character does not occur in the pattern string, the window jumps past it, that is, shift = pattern length + 1; otherwise, as in the BM algorithm, shift = the distance from the rightmost occurrence of that character in the pattern to the end of the pattern + 1. The Sunday algorithm therefore preprocesses the pattern string in advance, that is, it computes the offset table:
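[0047] Calculate the offset table of size |Σ| according to the following formula:
[0048] $$\mathrm{shift}[c] = \begin{cases} m - \max\{\, i : 0 \le i < m,\ P[i] = c \,\}, & \text{if } c \text{ occurs in } P \\ m + 1, & \text{otherwise} \end{cases}$$
[0049] where P is the pattern string, m is the length of the pattern string, and positions in P are numbered from 0.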
[0050] For example: P = "search"
[0051] m = 6
[0052] shift[s] = 6 - max(position of s) = 6 - 0 = 6
[0053] shift[e] = 6 - max(position of e) = 6 - 1 = 5
[0054] shift[a] = 6 - max(position of a) = 6 - 2 = 4
[0055] shift[r] = 6 - max(position of r) = 6 - 3 = 3
[0056] shift[c] = 6 - max(position of c) = 6 - 4 = 2
[0057] shift[h] = 6 - max(position of h) = 6 - 5 = 1
[0058] shift[other] = m + 1 = 6 + 1 = 7
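The offset table and the skip rule described above can be sketched in Python as follows. This is a minimal illustration rather than the patent's implementation; the function names `build_shift_table` and `sunday_search` are hypothetical, and characters absent from the table are given the default shift m + 1 at lookup time instead of being stored for the whole alphabet.

```python
def build_shift_table(pattern):
    """Return shift[c] = m - (rightmost 0-based index of c in pattern)."""
    m = len(pattern)
    # Later occurrences overwrite earlier ones, so the rightmost index wins.
    return {ch: m - i for i, ch in enumerate(pattern)}


def sunday_search(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    shift = build_shift_table(pattern)
    pos = 0
    while pos + m <= n:
        if text[pos:pos + m] == pattern:
            return pos
        if pos + m >= n:
            break  # no character follows the window, so no further shift exists
        # Use the character just past the window to decide how far to jump;
        # characters that never occur in the pattern shift by m + 1.
        pos += shift.get(text[pos + m], m + 1)
    return -1


table = build_shift_table("search")
hit = sunday_search("http://example.test/search?q=1", "search")
```

For P = "search", `table` holds the same six values as the worked example, and any other character falls back to 7.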
[0059] With this approach, comparing 1 million URL requests against 10 million malicious URL records completes within 50 milliseconds. An efficient hash search algorithm is used to compare request data with malicious URL data, so URLs are matched in real time and the complexity is greatly reduced. Cuckoo hashing is used to resolve hash collisions; it exchanges a small amount of computation for larger space, takes little time, and queries very quickly. The cuckoo hash algorithm works as follows: (1) the algorithm uses hashA and hashB to compute the two positions that correspond to the key; (2) if both positions are empty, choose either one and insert; if only one is empty, insert into the empty position; (3) if neither position is empty, randomly choose one of the two positions and kick out the key x stored there, put the new key in its place, compute the position given by the other hash value of the kicked key x, and return to step (2): insert x if that position is empty, and if it is still occupied, kick out the key y stored there, and so on;
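The insertion and eviction steps above can be sketched as a small Python set. This is an illustrative sketch under stated assumptions, not the patent's implementation: the class name `CuckooSet` is hypothetical, hashA and hashB are derived here from two slices of a SHA-256 digest, and the eviction limit (`max_kicks`) and table doubling on failure are additions the source does not describe.

```python
import hashlib
import random


class CuckooSet:
    """Set of strings in one table, with two candidate slots per key."""

    def __init__(self, capacity=64, max_kicks=32, seed=0):
        self.slots = [None] * capacity
        self.max_kicks = max_kicks          # assumed bound on evictions
        self.rng = random.Random(seed)

    def _positions(self, key):
        # hashA and hashB: two independent slices of one digest (an assumption).
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        n = len(self.slots)
        return (int.from_bytes(digest[:8], "big") % n,
                int.from_bytes(digest[8:16], "big") % n)

    def __contains__(self, key):
        a, b = self._positions(key)
        return self.slots[a] == key or self.slots[b] == key

    def add(self, key):
        if key in self:
            return
        a, b = self._positions(key)
        for pos in (a, b):                  # insert into an empty slot if any
            if self.slots[pos] is None:
                self.slots[pos] = key
                return
        pos = self.rng.choice((a, b))       # both occupied: pick one at random
        for _ in range(self.max_kicks):
            # Place the key in hand and pick up the key that was stored there.
            key, self.slots[pos] = self.slots[pos], key
            a, b = self._positions(key)
            pos = b if pos == a else a      # the evicted key's other slot
            if self.slots[pos] is None:
                self.slots[pos] = key
                return
        self._grow(key)                     # too many evictions: rebuild larger

    def _grow(self, pending):
        old = [k for k in self.slots if k is not None]
        self.slots = [None] * (2 * len(self.slots))
        for k in old + [pending]:
            self.add(k)


blocked = CuckooSet()
blocked.add("http://bad.example/login")
```

A lookup inspects at most two slots, which is what keeps the query time constant regardless of how many URLs are stored.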
[0060] For highly concurrent URL request data, DPDK is used to create multiple threads on multi-core devices, with each thread bound to a separate core, which reduces thread-scheduling overhead and improves performance. DPDK does not use conventional memory allocation functions such as malloc(); instead, DPDK manages its own memory. More specifically, DPDK allocates huge pages, creates a heap in that memory, and provides it to user applications, which use it for their internal data structures. This gives the end application a performance advantage: because DPDK creates the memory areas the application uses, the application natively benefits from huge pages, NUMA node affinity, access to DMA addresses, IOVA contiguity, and other performance features without any additional development;
[0061] DPDK memory allocations are always aligned on CPU cache-line boundaries, so the starting address of each allocation is a multiple of the system cache-line size. This prevents many common performance problems, such as unaligned accesses and false sharing, in which a single cache line inadvertently holds unrelated data that multiple cores access simultaneously. For use cases that require it, alignment to any other power of two is also supported, provided the value is >= the cache-line size;
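The alignment rule can be illustrated with plain address arithmetic. The Python sketch below does not call DPDK; it assumes a 64-byte cache line (`CACHE_LINE` is an assumed constant, and real hardware may differ), and the helper names are hypothetical.

```python
CACHE_LINE = 64  # assumed cache-line size in bytes


def is_power_of_two(n):
    return n > 0 and n & (n - 1) == 0


def align_up(addr, align=CACHE_LINE):
    """Round addr up to the next multiple of align."""
    if not is_power_of_two(align) or align < CACHE_LINE:
        raise ValueError("alignment must be a power of two >= the cache-line size")
    return (addr + align - 1) & ~(align - 1)


def shares_cache_line(addr_a, addr_b):
    """True if two byte addresses fall in the same cache line."""
    return addr_a // CACHE_LINE == addr_b // CACHE_LINE


# Two 8-byte counters packed back to back land in one cache line (false sharing).
packed = (0, 8)
# Starting each counter on its own aligned boundary separates them.
aligned = (align_up(0), align_up(8))
```

With these values, `shares_cache_line(*packed)` is true and `shares_cache_line(*aligned)` is false, which is the effect the cache-line alignment is meant to achieve for per-core data.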
[0062] DPDK's shared memory is implemented not only by mapping the same resources in different processes, similar to the shmget() mechanism, but also by replicating the address space of the primary process in the other process. Because everything is located at the same addresses in both processes, any pointer to a DPDK memory object works across processes without address translation, which is very important for the performance of passing data between processes. In addition, data packets are processed by polling instead of interrupts: when a packet arrives, the network card driver replaced by DPDK does not notify the CPU through an interrupt but stores the packet directly in memory and hands it to the application-layer software for direct processing through the interfaces DPDK provides.
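The two-stage flow of Step 3 (local lookup first, cloud fallback for unknown URLs, cloud results stored locally) can be sketched as follows. This is a simplified illustration, not the patent's implementation: the class `TwoStageDetector`, the suffix-based rule standing in for the custom fuzzy-matching rules, and the `cloud_check` callable are all hypothetical.

```python
from urllib.parse import urlsplit


class TwoStageDetector:
    """Local lookup first; unknown URLs fall back to a cloud check."""

    def __init__(self, local_malicious, suffix_rules, cloud_check):
        self.local_malicious = set(local_malicious)  # local malicious URL database
        self.suffix_rules = tuple(suffix_rules)      # hypothetical domain-suffix rules
        self.cloud_check = cloud_check               # hypothetical callable: url -> bool

    def is_malicious(self, url):
        # Stage 1a: exact hit in the local database.
        if url in self.local_malicious:
            return True
        # Stage 1b: domain-feature rule (here, a host-name suffix match).
        host = (urlsplit(url).hostname or "").lower()
        if self.suffix_rules and host.endswith(self.suffix_rules):
            return True
        # Stage 2: the local side missed, so ask the cloud model.
        verdict = bool(self.cloud_check(url))
        if verdict:
            # Store the cloud result so the next lookup is answered locally.
            self.local_malicious.add(url)
        return verdict


def fake_cloud(url):
    # Stand-in for the cloud detection model, for illustration only.
    return "phish" in url


detector = TwoStageDetector({"http://bad.test/login"}, [".evil.test"], fake_cloud)
```

After the cloud flags a URL once, the same URL is answered from the local set on later requests, which mirrors the continuous enrichment of the local database described in Step 3.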
[0063] Step 4. Depending on the type of detection result returned by the malicious URL detection module, when the user clicks on a malicious page, a warning window pops up in the current browser window; extending the security warning mode to a page-redirect reminder mode is also supported;
[0064] Step 5. Compile statistics on content that endangers users' personal information security and provide periodic situation analysis reports;
[0065] Step 6. If the system encounters an abnormal condition, such as an attack, data exceeding a warning threshold, or a server fault, information is reported in real time through the reserved message system interfaces, so that the system's operating status is known as soon as possible and the related problems are handled promptly.
[0066] The security 123 management system provides interface-based management and multi-dimensional display of information for the audit library, the unblocked library, malicious URLs, and the blacklist and whitelist.
[0067] (1) Through manual and robot collection and analysis, newly discovered malicious URL keywords can be added to the malicious keyword database to provide search seeds for the malicious keyword search engine; data that the core analysis module judges to be suspicious malicious URLs is placed in the audit library, where it is collected and analyzed by robots; in addition, the audit library supports multi-channel collection and analysis of user reports.
[0068] (2) The unblocked library displays historically unblocked URLs, which derive from the analysis results of the audit library.
[0069] (3) Malicious URL management is an interface function for classifying and querying malicious URLs and for manually managing the malicious URL database. The malicious URLs in the database come mainly from the malicious-URL judgments that the core analysis module and the third-party query interface make on URLs in the domain name library.
[0070] (4) Blacklist and whitelist management filters and cleans malicious URLs against the blacklist and whitelist database, mainly by regularly cleaning the upper-level malicious URL database and filtering the underlying original domain name database.
[0071] (5) The underlying original domain name library is consolidated and divided roughly into a domain name library, a root domain name library, and a URL library, which facilitates processing by the upper-level URL collection and analysis tools, so that targeted processing is more efficient and repeated operations are reduced;
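The split into URL, domain name, and root domain name libraries in item (5) can be sketched as below. This is a rough illustration under a loud assumption: the root domain is taken naively as the last two labels of the host name, which is wrong for multi-part suffixes such as .com.cn; a production system would consult a public suffix list. The function names are hypothetical.

```python
from urllib.parse import urlsplit


def classify(url):
    """Split one URL into its URL, domain, and (naive) root-domain entries."""
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".")
    # Naive assumption: the root domain is the last two labels of the host.
    root = ".".join(labels[-2:]) if len(labels) >= 2 else host
    return {"url": url, "domain": host, "root_domain": root}


def build_libraries(urls):
    """Merge raw records into three de-duplicated libraries."""
    libraries = {"url": set(), "domain": set(), "root_domain": set()}
    for url in urls:
        for name, value in classify(url).items():
            libraries[name].add(value)
    return libraries


libs = build_libraries([
    "http://a.shop.example.com/x",
    "http://b.shop.example.com/y",
    "http://a.shop.example.com/x",   # duplicate record is merged away
])
```

Because each library is a set, repeated records collapse on insertion, so upper-level tools can work on one root domain once instead of on every URL beneath it.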
[0072] The domain name-related information database contains basic elements and composite information about each domain name, including record information, IP, registration information, IP attribution, malicious code characteristics, access status, screenshot snapshots, and webpage source code snapshots. This library mainly provides the basis for the core analysis module's judgments; it regularly checks for changes in the information related to malicious URLs in the malicious URL database and updates it promptly, and it automatically analyzes information about the URLs in the upper-level domain name database.