System and method for discovering patterns with noise

A technique based on patterns and candidate patterns, applied in character and pattern recognition, biostatistics, data processing applications, etc., that addresses problems such as noise distortion.

Inactive Publication Date: 2004-05-19
IBM CORP

AI Technical Summary

Problems solved by technology

[0006] This problem becomes critical when the pattern is very lon


Embodiment Construction

[0019] The present invention provides a system and method for discovering significant patterns in data while taking the effects of noise into account. It introduces a new metric that accounts for mutations, or naturally occurring changes, in the data in which significant patterns are found. Prior art pattern models typically consider only exact pattern matches in the data; the present invention provides a more flexible model that allows for ambiguity in pattern matching. A consistency matrix is included to represent explicitly the likelihood of symbol substitution. Each entry in this matrix corresponds to a pair of symbols (x, y) and represents the conditional probability that x is the true symbol given that y is observed. The present invention also provides an efficient method of finding patterns that satisfy a certain minimum matching threshold.
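As a rough illustration of how such a matrix could be used, the Python sketch below scores a pattern against noisy sequences. The text above states only that each entry is the conditional probability of a true symbol given an observed one and that patterns must meet a minimum matching threshold. Everything else here is an assumption made for illustration: the matrix values, the names `MATRIX`, `segment_match`, `sequence_match`, `database_match` and `meets_threshold`, the use of a product of entries as the match of a pattern against a segment, the maximum over sliding windows as the match within a sequence, and the average over sequences as the match within a dataset.

```python
from math import prod

# Hypothetical matrix: MATRIX[true][observed] = P(true symbol | observed symbol).
# For each observed symbol, the probabilities over true symbols sum to 1.
MATRIX = {
    "a": {"a": 0.9, "b": 0.1, "c": 0.0},
    "b": {"a": 0.1, "b": 0.8, "c": 0.25},
    "c": {"a": 0.0, "b": 0.1, "c": 0.75},
}


def segment_match(pattern, segment):
    """Assumed score: product of P(pattern symbol | observed symbol)."""
    return prod(MATRIX[p][o] for p, o in zip(pattern, segment))


def sequence_match(pattern, sequence):
    """Assumed score: best segment match over all windows of the sequence."""
    k = len(pattern)
    if len(sequence) < k:
        return 0.0
    return max(
        segment_match(pattern, sequence[i:i + k])
        for i in range(len(sequence) - k + 1)
    )


def database_match(pattern, sequences):
    """Assumed score: average of the per-sequence matches."""
    return sum(sequence_match(pattern, s) for s in sequences) / len(sequences)


def meets_threshold(pattern, sequences, min_match):
    """True if the pattern reaches the minimum matching threshold."""
    return database_match(pattern, sequences) >= min_match
```

Under these assumptions, the pattern `"ab"` still receives a nonzero score in a sequence where `b` was observed as `c`, because the matrix assigns a nonzero probability to that substitution. An exact-match model would score it zero.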

[0020] It should be understood that the various parts shown in Figures 1-4 can be realized in...


Abstract

A system and method are provided for discovering significant patterns from a list of records in a dataset. Each record includes a set of items, and each significant pattern includes a subset of items such that a significance of the pattern exceeds a significance level. A significance is computed for each item in the list of records to determine significant items. The records are randomly sampled to select a sample portion of the records. Ambiguous patterns are identified against the sample portion of the records and verified against the entire list of records in the dataset.
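The sample-then-verify flow described in the abstract can be sketched in Python as follows. This is a minimal illustration, not the patented procedure. The abstract does not define how significance is computed or how a pattern is judged ambiguous, so the sketch assumes that significance is the fraction of records containing the itemset and that a pattern is ambiguous when its sample significance falls within a `margin` of the significance level. The names `significance` and `discover` and the parameters `margin`, `max_len` and `seed` are hypothetical.

```python
import random
from itertools import combinations


def significance(itemset, records):
    """Assumed measure: fraction of records that contain every item."""
    return sum(1 for r in records if itemset <= r) / len(records)


def discover(records, level, sample_size, margin, max_len=2, seed=0):
    records = [frozenset(r) for r in records]

    # Step 1: keep only items that are significant on their own.
    items = sorted({i for r in records for i in r})
    sig_items = [
        i for i in items if significance(frozenset([i]), records) >= level
    ]

    # Step 2: randomly sample a portion of the records.
    rng = random.Random(seed)
    sample = rng.sample(records, min(sample_size, len(records)))

    # Step 3: score candidate patterns against the sample only.
    accepted, ambiguous = set(), set()
    for k in range(1, max_len + 1):
        for combo in combinations(sig_items, k):
            candidate = frozenset(combo)
            s = significance(candidate, sample)
            if s >= level + margin:
                accepted.add(candidate)      # clearly significant on the sample
            elif s > level - margin:
                ambiguous.add(candidate)     # too close to call from the sample

    # Step 4: verify ambiguous patterns against the entire list of records.
    for candidate in ambiguous:
        if significance(candidate, records) >= level:
            accepted.add(candidate)
    return accepted
```

The point of the design is that only the ambiguous candidates require a pass over the full dataset. Candidates that are clearly above or below the level on the sample are decided from the sample alone, which in this simplified version can misclassify a pattern when the sample is unrepresentative.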

Description

Technical field

[0001] The present invention relates to finding significant patterns in long sequences of items, and more particularly to a system and method for identifying significant patterns in sequences with noise.

Background

[0002] As large amounts of data are stored and used, it becomes more important to discover and understand significant patterns in large datasets. The discovery of significant patterns is even more important in many new fields and in many new applications of existing technologies. A measure of significance is described by R. Agrawal et al. in the article "Mining association rules between sets of items in large databases" (Proc. ACM SIGMOD Conf. on Management of Data, 207-216, 1993). As discussed in Agrawal et al., the input is a set of transactions, each transaction containing a set of items. The significance of a set of items is determined by the number of transactions that contain the set of items.

[0003] Due to the presence of noise, one symbol may be misrepresente...


Application Information

IPC(8): G06K9/68, G16B40/00
CPC: G06K9/62, Y10S707/99936, G06F19/24, G16B40/00, G06F18/00
Inventor: Wei Wang, Jiong Yang, P. S.-L. Yu
Owner: IBM CORP