Biclustering mining method based on butterfly network under synchronous programming model Hama BSP
The method applies a butterfly network to the Hama BSP synchronous programming model in order to improve the utilization rate, reduce the amount of data, and reduce the amount of communication.
Examples
Embodiment 1
[0031] Example 1 (implementation of the biclustering mining method on BNHB). Table 1(a) shows an example gene expression data set, and Table 1(b) shows the source data input to the algorithm, in which the data are replaced by the generated column labels. The fragmented data read by the nodes are shown in the first row of Figure 3. The final mining results are shown in Table 1(c); the result threshold for the biclustering columns (attributes) is 0.6.
[0032] Table 1
[0037] The detailed process of Example 1 is as follows. First, each node reads a fragment of the data and then enters the processing of log₂ N supersteps. In the first superstep (step = 1), each node first enters the local computation stage: it compares the data shown in the first row of Figure 3 locally and generates the intermediate results shown in the second row of Figure 3. Next comes the global communication stage. First, the 4 node...
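The superstep loop described above (local computation, then global communication, repeated over the supersteps) can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the patented method: the partner rule `node XOR 2^(step-1)` is the textbook butterfly pattern and is not confirmed by the source, the set union stands in for the actual biclustering comparison, the number of nodes is assumed to be a power of two, and the names `butterfly_partner` and `simulate_butterfly` are hypothetical.

```python
def butterfly_partner(node, step):
    """Partner of `node` in superstep `step` (1-based).

    Assumption: the textbook butterfly rule, which flips bit (step - 1)
    of the node index. The source does not state the pairing rule.
    """
    return node ^ (1 << (step - 1))


def simulate_butterfly(fragments):
    """Simulate log2(N) supersteps over N nodes, one fragment per node.

    The set union below is a placeholder for the real local comparison
    step, which the source describes only through its figure.
    """
    n = len(fragments)
    if n == 0 or n & (n - 1):
        raise ValueError("number of nodes must be a power of two")
    steps = n.bit_length() - 1  # equals log2(n) for a power of two

    state = [set(f) for f in fragments]
    for step in range(1, steps + 1):
        # Global communication: every node receives its partner's state.
        inbox = [state[butterfly_partner(i, step)] for i in range(n)]
        # All messages are collected before any node updates, which
        # mimics the synchronization point between supersteps.
        state = [state[i] | inbox[i] for i in range(n)]
    return state


if __name__ == "__main__":
    # Four nodes, as in Example 1; the fragment labels are made up.
    result = simulate_butterfly([["f0"], ["f1"], ["f2"], ["f3"]])
    for node, data in enumerate(result):
        print(node, sorted(data))
```

With four nodes the loop runs two supersteps, and under the assumed pairing rule every node ends up having seen every fragment while exchanging messages with only one partner per superstep.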
Embodiment 2
[0073] [specific performance analysis]
[0074] We analyze the performance of the method of the present invention. The two most critical factors for judging the quality of the biclustering mining method based on the butterfly network under the synchronous programming model Hama BSP are processing efficiency and scalability. Processing efficiency is usually measured by task processing time, that is, the time from when the user initiates the biclustering mining request to when the user receives the mining result. Scalability is usually measured by steadily increasing the amount of data or the number of processing nodes, and its measurement index is generally also task processing time. The performance metric adopted in our analysis is therefore task processing time.
[0075] We used 6 real gene expression data sets from the Broad Institute website. In each data set, the rows are genes and the columns are experimental conditions, and each cell stores gene expres...