112 results about "Deterministic finite automaton" patented technology

In the theory of computation, a branch of theoretical computer science, a deterministic finite automaton (DFA)—also known as deterministic finite acceptor (DFA), deterministic finite state machine (DFSM), or deterministic finite state automaton (DFSA)—is a finite-state machine that accepts or rejects a given string of symbols, by running through a state sequence uniquely determined by the string. Deterministic refers to the uniqueness of the computation run. In search of the simplest models to capture finite-state machines, Warren McCulloch and Walter Pitts were among the first researchers to introduce a concept similar to finite automata in 1943.
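The definition above can be illustrated with a minimal sketch of a DFA simulator. The function name `dfa_accepts`, the state labels, and the example language (binary strings ending in "01") are assumptions chosen for illustration and do not come from any of the listed patents.

```python
from typing import Dict, Set, Tuple


def dfa_accepts(
    transitions: Dict[Tuple[str, str], str],
    start: str,
    accepting: Set[str],
    text: str,
) -> bool:
    """Run a DFA over `text`; the state sequence is uniquely determined by the input."""
    state = start
    for symbol in text:
        key = (state, symbol)
        if key not in transitions:
            # No transition defined for this symbol: reject the string.
            return False
        state = transitions[key]
    return state in accepting


# Hypothetical example DFA: accepts binary strings that end in "01".
# q0 = no useful suffix seen, q1 = last symbol was "0", q2 = last two symbols were "01".
ENDS_IN_01 = {
    ("q0", "0"): "q1", ("q0", "1"): "q0",
    ("q1", "0"): "q1", ("q1", "1"): "q2",
    ("q2", "0"): "q1", ("q2", "1"): "q0",
}
```

Because each (state, symbol) pair maps to exactly one next state, the run needs only one table lookup per input symbol.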

Method and system for recognizing machine generated character glyphs and icons in graphic images

A deterministic finite automaton uses binary search (and optionally hashing) method(s) of sparse matrix representation to recognize the graphical representations of characters or icons from a bitmap representation of the computer screen. This recognition can be applied to translate data from an unknown form of original specification (file format) into a known form of representation, such as HTML. Alternatively this recognition can be applied to another process that can “see” what is on the screen, and perform programmed actions, based on what it “sees”.
Owner:OLCOTT PETER L

Method and device for high performance regular expression pattern matching

Disclosed herein is an improved architecture for regular expression pattern matching. Improvements to pattern matching deterministic finite automatons (DFAs) that are described by the inventors include a pipelining strategy that pushes state-dependent feedback to a final pipeline stage to thereby enhance parallelism and throughput, augmented state transitions that track whether a transition is indicative of a pattern match occurring thereby reducing the number of necessary states for the DFA, augmented state transitions that track whether a transition is indicative of a restart to the matching process, compression of the DFA's transition table, alphabet encoding for input symbols to equivalence class identifiers, the use of an indirection table to allow for optimized transition table memory, and enhanced scalability to facilitate the ability of the improved DFA to process multiple input symbols per cycle.
Owner:IP RESERVOIR
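One of the ideas named in this abstract, encoding input symbols as equivalence class identifiers, can be sketched in software as follows. This is an illustrative sketch only, not the patented hardware architecture: input bytes that behave identically in every state are merged into one class, so the transition table needs one column per class rather than one per byte. The function names and the example table (a search for "ab") are assumptions.

```python
from typing import Dict, List, Tuple


def compress_alphabet(table: List[List[int]]) -> Tuple[List[int], List[List[int]]]:
    """Merge symbols whose transition columns are identical in every state.

    Returns (class_of, compact), where class_of[symbol] is the equivalence class
    identifier and compact[state][class_id] is the next state.
    """
    column_ids: Dict[Tuple[int, ...], int] = {}
    class_of: List[int] = []
    for symbol in range(len(table[0])):
        column = tuple(row[symbol] for row in table)
        class_of.append(column_ids.setdefault(column, len(column_ids)))

    compact = [[0] * len(column_ids) for _ in table]
    for symbol, class_id in enumerate(class_of):
        for state, row in enumerate(table):
            compact[state][class_id] = row[symbol]
    return class_of, compact


def run_compact(class_of: List[int], compact: List[List[int]], data: bytes, start: int = 0) -> int:
    """Run the compressed DFA: encode each byte to its class, then look up the next state."""
    state = start
    for byte in data:
        state = compact[state][class_of[byte]]
    return state


def build_search_ab_table() -> List[List[int]]:
    """Hypothetical 3-state DFA over bytes that reaches state 2 once "ab" has been seen."""
    table = [[0] * 256, [0] * 256, [2] * 256]
    table[0][ord("a")] = 1
    table[1][ord("a")] = 1
    table[1][ord("b")] = 2
    return table
```

In this example the 256 byte columns collapse to three classes ("a", "b", and everything else), which is where the memory saving comes from.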

Determining regular expression match lengths

A method and apparatus are disclosed for determining the lengths of one or more substrings of an input string that match a regular expression (regex). The input string is searched for the regex using a non-deterministic finite automaton (NFA), and upon detecting a match state a selected portion of the input string is marked as a match string. The NFA is inverted to create a reverse NFA that embodies the inverse of the regex. For some embodiments, the reverse NFA is created by inverting the NFA such that the match state of the NFA becomes the initial state of the reverse NFA, the initial state of the NFA becomes the match state of the reverse NFA, and the goto transitions of the NFA are inverted to form corresponding goto transitions in the reverse NFA. The match string is reversed and searched for the inverted regex using the reverse NFA, and a counter is incremented for each character processed during the reverse search operation. The current value of the counter each time the match state in the reverse NFA is reached indicates the character length of a corresponding substring that matches the regex.
Owner:AVAGO TECH WIRELESS IP SINGAPORE PTE
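The reverse-search step described in this abstract can be sketched as follows, under simplifying assumptions: the NFA is given explicitly as a list of `(source, symbol, destination)` edges with no ε-transitions, and the match string is assumed to end exactly where the forward search reported the match. The function names and the example NFA for `a+b` are hypothetical.

```python
from collections import defaultdict
from typing import List, Tuple

Edge = Tuple[int, str, int]  # (source state, symbol, destination state)


def invert_nfa(edges: List[Edge], start: int, match: int) -> Tuple[List[Edge], int, int]:
    """Swap the initial and match states and reverse every transition."""
    reversed_edges = [(dst, sym, src) for (src, sym, dst) in edges]
    return reversed_edges, match, start


def match_lengths(edges: List[Edge], start: int, match: int, match_string: str) -> List[int]:
    """Run the reverse NFA over the reversed match string, counting characters.

    Each time the reverse NFA's match state is active, the counter value is the
    length of a substring (ending at the end of match_string) that matches.
    """
    rev_edges, rev_start, rev_match = invert_nfa(edges, start, match)
    step = defaultdict(set)
    for src, sym, dst in rev_edges:
        step[(src, sym)].add(dst)

    active = {rev_start}
    lengths: List[int] = []
    counter = 0
    for ch in reversed(match_string):
        active = {dst for state in active for dst in step.get((state, ch), ())}
        if not active:
            break  # no configuration survives; longer matches are impossible
        counter += 1
        if rev_match in active:
            lengths.append(counter)
    return lengths


# Hypothetical NFA for the regex a+b: state 0 is initial, state 2 is the match state.
A_PLUS_B: List[Edge] = [(0, "a", 1), (1, "a", 1), (1, "b", 2)]
```

For the match string "xaaab", the reverse run reports the lengths of "ab", "aab", and "aaab" in that order.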

Method and device for matching regular expressions

The embodiment of the invention provides a method and device for matching regular expressions. The method comprises the following steps of: firstly, inputting a message to be matched and a DFA (Deterministic Finite Automaton) state table, wherein the DFA state table comprises a state transition table including transition relationships between all state addresses and each state in the matching process of the regular expressions; secondly, judging a data type corresponding to the present state, wherein the data type includes a single-character Char type and a multi-character Str type, and the data corresponding to the Str type are a plurality of continuous characters; if the data type is the Str type, matching a plurality of character values in the current state of the message to be matched and the matching condition, and when matched, shifting to the next state satisfying the matching condition; if the data type is the Char type, matching a single character value at the current state in the message to be matched and the matching condition, and when matched, shifting to the next state satisfying the matching condition; and when the next state is an accept state, finishing the matching process and outputting a success result of matching. The method for matching regular expressions has high matching speed and high efficiency and the space occupied by the DFA items is small.
Owner:杨志杰

Regular expression pattern matching using keyword graphs

Expanding a regular expression set into an expanded expression set that recognizes the same language as the regular expression set and includes more expressions than the regular expression set, with fewer operators per expression, includes: logically connecting the expressions in the regular expression set; parsing the expanded expression set; transforming the parsed expanded expression set into a Glushkov automaton; transforming the Glushkov automaton into a modified deterministic finite automaton in order to maintain fundamental graph properties; combining the modified DFA into a keyword graph using a combining algorithm that preserves the fundamental graph properties; and computing an Aho-Corasick fail function for the keyword graph using a modified algorithm to produce a modified Aho-Corasick graph with a goto and a fail function and added information per state.
Owner:IBM CORP

Method and apparatus for pattern matching for intrusion detection/prevention systems

A packet is compared to a pattern defined by a regular expression with back-references (backref-regex) in a single pass of a non-deterministic finite automaton corresponding to the backref-regex (backref-NFA) that includes representations for all backref-regex's back-references. The packet's characters are sequentially selected and analyzed against the backref-NFA until a match or no-match between the packet and pattern is determined. Upon selecting a character, a corresponding configurations-set is updated, where the set includes configurations associated with respective NFA-states of the backref-NFA and indicating whether the selected character is being matched against a back-reference. With the configurations-set being updated the comparison process proceeds along backref-NFA's NFA-states. The updated configurations-set includes configurations associated with NFA-states reachable from the configurations in the pre-updated set. When the configurations-set includes a final state, a match is determined. When the configurations-set becomes empty, or upon selection of all characters lacks the final state, a no-match is determined.
Owner:RPX CORP

Method for accelerating regular expression matching based on a deterministic finite automaton with memory

The invention discloses a matching and accelerating method of a regular expression based on a deterministic finite automaton with memory, including a rule compiler of the regular expression and a pattern matching engine; the rule compiler of the regular expression firstly transforms the regular expression into an analytic tree, and then transforms the analytic tree into a nondeterministic finite automaton with memory and the deterministic finite automaton with memory respectively; the pattern matching engine can accelerate pattern matching by using the deterministic finite automaton with memory generated by the rule compiler. The invention has the advantages that: 1) by directly supporting repeat operators, the compiler does not need to unfold the repeat expression, thus the difficulty of the development of the compiler is greatly reduced and the memory occupation and the compile time of the compiler are decreased as well; 2) for the same reason, the volume of a rules database generated by the compiler can be reduced, so the cost and complexity of the pattern matching engine can be lowered.
Owner:ZHEJIANG UNIV

SYSTEM, METHOD, AND PROGRAM FOR GENERATING NON-DETERMINISTIC FINITE AUTOMATON NOT INCLUDING e-TRANSITION

An initial setting unit receives from an input device a syntax tree generated from a regular expression, and initializes an NFA and an NFA converting section that applies five conversion patterns to each node of the syntax tree to directly convert the node into an NFA not including ε-transition. When the conversion is finished, the NFA converting section outputs the NFA generated to an output device.
Owner:NEC CORP

Interactive-question semantic understanding method in intelligent customer services

The invention provides an interactive-question semantic understanding method in intelligent customer services. The method includes the following steps: the conversation content between the current intelligent customer service and clients is subjected to co-text language environment expression, wherein co-text language environment expression comprises event expression and language environment expression; according to the co-text language environment expression, a conversation semantic event graph is constructed; according to multiple conversation corpora of the intelligent customer services and the clients, a business logic tree is constructed; according to a deterministic finite state automaton, an order state machine is established; according to the semantic event graph, logic decision branches are selected from the business logic tree; according to the logic decision branches and the order state machine, a semantic processing template is returned to the intelligent customer services, and semantic expression generation is carried out. According to the interactive-question semantic understanding method in the intelligent customer services, interactive questions and answers based on a flow diagram are achieved, the accuracy with which the intelligent customer services understand client inquiries is increased, the consistency between the intelligent customer services and client conversations is guaranteed, and the work efficiency of the intelligent customer services is improved.
Owner:KANGCHENG INVESTMENT CHINA

Sensitive word filtering method based on text content

The invention discloses a sensitive word filtering method based on text content. The method comprises the steps that a Chinese sensitive word bank is constructed, Chinese words in the Chinese sensitive word bank are expanded to be pinyin blend words, and a pinyin blend sensitive word bank is formed; a transition function of a deterministic finite state automaton covering all sensitive words is established through a sensitive word search tree structure, and sensitive words in the pinyin blend sensitive word bank are made into a sensitive word tree; and the sensitive words are retrieved in a text according to the structure of the sensitive word tree, and the retrieved sensitive words are replaced with designated signs to complete sensitive word filtering. The method has a high recall ratio and is easy to implement in practical application.
Owner:杭州言旭网络科技有限公司
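The tree-based retrieval and replacement steps in this abstract can be sketched with a plain trie acting as the automaton's transition function. The pinyin expansion step is omitted here; the word list, the `END` marker, the longest-match policy, and the `*` mask are assumptions made for illustration.

```python
from typing import Dict, Iterable

END = ""  # marker key: a word ends at this node (never collides with a single character)


def build_trie(words: Iterable[str]) -> Dict:
    """Build a nested-dict search tree; each edge is one character of a sensitive word."""
    root: Dict = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def filter_text(text: str, trie: Dict, mask: str = "*") -> str:
    """Replace every sensitive word found in `text` with mask characters (longest match wins)."""
    out = []
    i = 0
    while i < len(text):
        node = trie
        longest = 0
        j = i
        # Walk the tree from position i for as long as the text follows an edge.
        while j < len(text) and text[j] in node:
            node = node[text[j]]
            j += 1
            if END in node:
                longest = j - i
        if longest:
            out.append(mask * longest)
            i += longest
        else:
            out.append(text[i])
            i += 1
    return "".join(out)
```

A call such as `filter_text("a badword and ugly", build_trie(["bad", "badword", "ugly"]))` masks both listed words while leaving the other text unchanged.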

Efficient dfa generation for non-matching characters and character classes in regular expressions

A character class is detected in a regular expression and substituted with a pseudo character. A table is created with a bit vector for each pseudo character inserted into the regular expression. Each bit in the bit-vector represents one character of the alphabet from which the expression is generated. The status of the bits in a bit-vector indicates which characters of the alphabet are included in the character class. The pseudo character in the modified regular expression is used to construct a non-deterministic finite automaton (NFA). The NFA with the pseudo character is then used to construct a deterministic finite automaton (DFA). When constructing the DFA, the bit-vectors are used to determine if a certain transition should be constructed in the DFA.
Owner:QUEST SOFTWARE INC
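The bit-vector table described in this abstract can be sketched as follows, with one bit per alphabet character indicating membership in the character class. The 256-symbol alphabet, the simplified class syntax (ranges and a leading `^` only, no escapes), and the function names are assumptions; the full NFA and DFA construction is not shown.

```python
ALPHABET_SIZE = 256  # assumed alphabet: one bit per byte value


def class_to_bitvector(spec: str) -> int:
    """Turn the body of a bracket class, e.g. 'a-c' or '^0-9', into a bit vector.

    Bit k is set when the character with code k belongs to the class.
    """
    negate = spec.startswith("^")
    body = spec[1:] if negate else spec
    bits = 0
    i = 0
    while i < len(body):
        low = ord(body[i])
        if i + 2 < len(body) and body[i + 1] == "-":
            high = ord(body[i + 2])  # range such as a-c
            i += 3
        else:
            high = low  # single character
            i += 1
        for code in range(low, high + 1):
            bits |= 1 << code
    if negate:
        bits = ~bits & ((1 << ALPHABET_SIZE) - 1)
    return bits


def has_transition(bits: int, symbol: str) -> bool:
    """Decide whether a DFA transition on `symbol` should exist for this pseudo character."""
    return bool((bits >> ord(symbol)) & 1)
```

During DFA construction, a check like `has_transition` would be consulted for each alphabet character instead of expanding the class into one alternative per character.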

Layered memory architecture for deterministic finite automaton based string matching useful in network intrusion detection and prevention systems and apparatuses

The present invention provides a method and apparatus for searching multiple strings within a packet data using deterministic finite automata. The apparatus includes means for updating memory tables stored in a layered memory architecture comprising a BRAM, an SRAM and a DRAM; a mechanism to strategically store the relevant data structure in the three memories based on the characteristics of data, size / capacity of the data structure, and frequency of access. The apparatus intelligently and efficiently places the associated data in different memories based on the observed fact that density of most rule-sets is around 10% for common data in typical network intrusion prevention systems. The methodology and layered memory architecture enable the apparatus implementing the present invention to achieve data processing line rates over 2 Gbps.
Owner:FORTINET

Systems and methods for determining the determinizability of finite-state automata and transducers

Finite-state transducers and weighted finite-state automata may not be determinizable. The twins property can be used to characterize the determinizability of such devices. For a weighted finite-state automaton or transducer, that weighted finite-state automaton or transducer and its inverse are intersected or composed, respectively. The resulting device is checked to determine if it has the cycle-identity property. If not, the original weighted finite-state automaton or transducer is not determinizable. For a weighted or unweighted finite-state transducer, that device is checked to determine if it is functional. If not, that device is not determinizable. That device is then composed with its inverse. The composed device is checked to determine if every edge in the composed device having a cycle-accessible end state meets at least one of a number of conditions. If so, the original device has the twins property. If the original device has the twins property, then it is determinizable.
Owner:AMERICAN TELEPHONE & TELEGRAPH CO

Graphics processing unit (GPU) based method for detecting message content of high-speed network

The invention discloses a graphics processing unit (GPU) based method for detecting the message content of a high-speed network to solve the technical problems of reducing the frequency of branch appearance during GPU matching, optimizing the memory access strategies and improving the performance of message content inspection. The technical scheme is as follows: the method comprises the following steps: firstly preprocessing a pattern set and allocating buffer zones, extending a state transfer table of a deterministic finite automaton (DFA) and allocating the buffer zones for the message and the matching result in a central processing unit (CPU) memory and a GPU global memory respectively; secondly loading the message to be matched to a shared memory by the GPU matching thread; and thirdly realizing a GPU-based regular expression matching engine through designing and controlling the regular expression matching engine to carry out pattern matching. By adopting the method, the parallelism of message buffer and message transmission can be improved, the regular expression matching speed is improved and the performances of message buffer and message content inspection are improved.
Owner:NAT UNIV OF DEFENSE TECH

Method for broadcast authentication of wireless sensor network based on automaton and game of life

The invention discloses a method for the broadcast authentication of a wireless sensor network based on an automaton and a game of life, which aims at the problems of limitations to coverage and special network traffic distribution of a base station in the authentication of the wireless sensor network which has relatively more hidden dangers due to own characteristics. By the method, the coverage of the base station can be expanded to realize the broadcast of nodes of the whole network, and the distributional pattern of the network traffic of the wireless sensor network (WSN) is simulated. The broadcast authentication of the wireless sensor network is realized mainly by combining a plurality of ways such as an improved deterministic finite automaton, a clustering algorithm, a game of life algorithm and the like. A specific technical scheme and specific steps and flows are designed. The method is remarkably distinguished from the conventional broadcast authentication method used for the WSN, and is advanced in the aspects of communication ranges of nodes of the base station, the rational allocation of node energy in the network, the simulation of a network traffic pattern, and the like.
Owner:NANJING UNIV OF POSTS & TELECOMM

Method for inspecting deep packets based on suffix automaton regular engine structure

Status: Inactive. Publication number: CN103259793A. Tags: Rapid intrusion detection; Efficient intrusion detection; Data switching networks; Web service; Suffix automaton.
The invention discloses a method for inspecting deep packets based on a suffix automaton regular engine structure. The method comprises the following steps: S1, in an intrusion detection system, extracting attack features and constructing regular expressions; S2, constructing a suffix nondeterministic finite automaton (NFA) engine and utilizing the suffix NFA engine to conduct multiple-pattern matching; S3, obtaining application layer protocol data packets and Web server log files from a Web server; S4, conducting deep packet inspection on the protocol data packets and the log files and sending the inspection results to a firewall. According to the method, the multiple-regular-expression matching of a deterministic finite automaton (DFA) can be achieved by using a single automaton in an NFA mode; the problems that the NFA cannot achieve the matching of multiple regular expressions and that space explosion occurs when the DFA achieves the matching of multiple regular expressions are solved; the space size of the NFA is effectively reduced; the problems that a traditional NFA engine construction method wastes space and that invalid traversal exists in the process of executing pattern matching are solved; the response time of deep packet inspection is effectively shortened; and the overall performance and efficiency of a system are improved.
Owner:NORTHEASTERN UNIV

Using a tunable finite automaton for regular expression matching

Deterministic Finite Automatons (DFAs) and Nondeterministic Finite Automatons (NFAs) are two typical automatons used in the Network Intrusion Detection System (NIDS). Although they both perform regular expression matching, they have quite different performance and memory usage properties. DFAs provide fast and deterministic matching performance but suffer from the well-known state explosion problem. NFAs are compact, but their matching performance is unpredictable and with no worst case guarantee. A new automaton representation of regular expressions, called Tunable Finite Automaton (TFA), is described. TFAs resolve the DFAs' state explosion problem and the NFAs' unpredictable performance problem. Different from a DFA, which has only one active state, a TFA allows multiple concurrent active states. Thus, the total number of states required by the TFA to track the matching status is much smaller than that required by the DFA. Different from an NFA, a TFA guarantees that the number of concurrent active states is bounded by a bound factor b that can be tuned during the construction of the TFA according to the needs of the application for speed and storage. A TFA can achieve significant reductions in the number of states and memory space.
Owner:POLYTECHNIC INSTITUTE OF NEW YORK UNIVERSITY

Tree complex event processing process-based operator internal processing system

The invention discloses a tree complex event processing process-based operator internal processing system. The operator internal processing system comprises an output stream customization module, an event matching judgment module and an event composite module, wherein the output stream customization module processes through a Hash function after summating according to an operator semantic input event stream and an operator semantic identifier character string, and obtains and outputs a customization-event stream type SJL10; the event matching judgment module sequentially performs type constraint, time constraint and predicate constraint on a customization-event stream type MD10 to obtain and output a matching event stream type SJL20; and the event composite module respectively extracts the current start time, the current end time and all predicates of the received matching event stream type SJL20, and obtains and outputs a composite event stream SJL30. The throughput of a tree model-based complex event processing engine after optimizing is 3 to 6 times that of an open source engine Esper realized based on non-deterministic finite automaton (NFA); and the performance is stable under different event volumes or event sequence complexities.
Owner:BEIHANG UNIV

Data pattern analysis using optimized deterministic finite automaton

Techniques for data pattern analysis using deterministic finite automaton are described herein. In one embodiment, a number of transitions from a current node to one or more subsequent nodes representing one or more sequences of data patterns is determined, where each of the current node and subsequent nodes is associated with a deterministic finite automaton (DFA) state. A data structure is dynamically allocated for each of the subsequent nodes for storing information associated with each of the subsequent nodes, where data structures for the subsequent nodes are allocated in an array maintained by a data structure corresponding to the current node if the number of transitions is greater than a predetermined threshold. Other methods and apparatuses are also described.
Owner:QUEST SOFTWARE INC
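The threshold rule in this abstract can be sketched as a node that chooses its child storage by fan-out. The abstract only states that child structures are allocated in an array when the transition count exceeds a threshold; the threshold value of 8, the sorted pair list used for low fan-out nodes, and the class and method names are assumptions.

```python
from typing import Dict, List, Optional, Tuple


class DfaNode:
    """DFA state whose child storage depends on how many transitions it has."""

    def __init__(self, transitions: Dict[int, "DfaNode"], threshold: int = 8) -> None:
        self.dense: Optional[List[Optional["DfaNode"]]] = None
        self.sparse: Optional[List[Tuple[int, "DfaNode"]]] = None
        if len(transitions) > threshold:
            # Many transitions: keep children in an array indexed directly by byte value.
            self.dense = [None] * 256
            for byte, child in transitions.items():
                self.dense[byte] = child
        else:
            # Few transitions (assumed fallback): keep a short list of (byte, child) pairs.
            self.sparse = sorted(transitions.items(), key=lambda item: item[0])

    def next(self, byte: int) -> Optional["DfaNode"]:
        """Return the child reached on `byte`, or None when no transition exists."""
        if self.dense is not None:
            return self.dense[byte]
        for key, child in self.sparse or []:
            if key == byte:
                return child
        return None
```

The trade-off in this sketch is that the array gives constant-time lookup at the cost of 256 slots per node, while the pair list stays small for nodes with only a few outgoing transitions.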

Out-of-order data packet string matching method and system

The invention relates to an out-of-order data packet string matching method and system. The out-of-order data packet string matching method comprises the following steps of initializing a deterministic finite state automaton (DFA) and a pattern suffix tree (PST); initializing a buffering area and receiving, one by one, character strings transmitted in the network and obtained from data flows, wherein every data flow is formed by at least two character strings in order; obtaining character strings belonging to the same data flow one by one; setting the current state of the deterministic finite state automaton if the current character string has a prefix; adding a finding state to the tail of the current character string and obtaining a combined fragment if the current character string has a suffix; inputting the combined fragment to the deterministic finite state automaton; storing the current character string information and allowing the current character string to pass. According to the out-of-order data packet string matching method, the data packets themselves do not need to be cached; only states are cached, and accordingly string matching over out-of-order data packets is achieved.
Owner:INST OF INFORMATION ENG CAS

Regular expression matching method and device

The invention discloses a regular expression matching method and device that aim to increase the matching speed of regular expressions. The method includes: determining the fingerprint of a regular expression; determining the representative fingerprint of the regular expression; determining a regular expression set according to the representative fingerprint of the regular expression, and determining the representative fingerprint of the regular expression set; and performing regular expression matching on data to be matched according to the correspondence between the representative fingerprint of the regular expression set and a DFA (deterministic finite automaton) compiled from the regular expression set.
Owner:浙江杭海新城控股集团有限公司

Regular expression matching equipment and method on basis of deterministic finite automaton

The invention provides regular expression matching equipment and a method on the basis of a deterministic finite automaton. The regular expression matching equipment comprises a packet dispatcher and a result collecting module. A regular expression matching system comprises a matching unit and a storage unit connected with the matching unit, and the matching unit is connected with the packet dispatcher and the result collecting module respectively. In the method, each state transition table is decomposed into a character substitution table and a simplified state table. Many states have identical character substitution tables, which can be shared after decomposition; furthermore, many other states can share an identical character substitution table once a small number of differing transitions are extracted. By the regular expression matching equipment and the method on the basis of the deterministic finite automaton, the storage space for the DFA (deterministic finite automaton) is greatly reduced, and more regular expressions can be stored in a limited space.
Owner:DAWNING INFORMATION IND BEIJING +1

Method and system for decompression-free inspection of shared dictionary compressed traffic over HTTP

A system and a method for decompression-free inspection of compressed data are provided herein. The method includes the following stages: obtaining a dictionary file comprising a string of symbols, each associated with a respective index; obtaining at least one delta file associated with said dictionary file, wherein said delta file comprises a sequence of instructions that include at least one copy instruction pointing to an index within said dictionary and a length of a copy substring to be copied; scanning said dictionary using a pattern matching algorithm associated with a plurality of patterns and implemented as a Deterministic Finite Automaton (DFA), to yield DFA execution data; scanning said at least one delta file, using said pattern matching algorithm, wherein said DFA execution data is used to skip at least part of the scanning of the copy substrings for at least one of the copy instructions.
Owner:YISSUM RES DEV CO OF THE HEBREW UNIVERSITY OF JERUSALEM LTD +2

Content search system including multiple deterministic finite automaton engines having shared memory resources

A content search system for determining whether an input string matches one or more of a number of patterns embodied by a deterministic finite automaton (DFA) includes a plurality of DFA engines that simultaneously compare sequential overlapping segments of the input string. The overlap region shared by adjacent pairs of input string segments is of a predetermined size. Initially, the first DFA engine is designated as the master engine, and the remaining DFA engines are designated as slave engines whose state results are speculative. Resolution logic compares the state results of the master engine with the state results of the adjacent slave engine to selectively validate the state results of the successor engine, which upon validation becomes the new master engine.
Owner:AVAGO TECH INT SALES PTE LTD

Encoding non-deterministic finite automaton states efficiently in a manner that permits simple and fast union operations

Deterministic Finite Automatons (DFAs) and Nondeterministic Finite Automatons (NFAs) are two typical automatons used in the Network Intrusion Detection System (NIDS). Although they both perform regular expression matching, they have quite different performance and memory usage properties. DFAs provide fast and deterministic matching performance but suffer from the well-known state explosion problem. NFAs are compact, but their matching performance is unpredictable and with no worst case guarantee. A new automaton representation of regular expressions, called Tunable Finite Automaton (TFA), is described. TFAs resolve the DFAs' state explosion problem and the NFAs' unpredictable performance problem. Different from a DFA, which has only one active state, a TFA allows multiple concurrent active states. Thus, the total number of states required by the TFA to track the matching status is much smaller than that required by the DFA. Different from an NFA, a TFA guarantees that the number of concurrent active states is bounded by a bound factor b that can be tuned during the construction of the TFA according to the needs of the application for speed and storage. A TFA can achieve significant reductions in the number of states and memory space.
Owner:POLYTECHNIC INSTITUTE OF NEW YORK UNIVERSITY