
Methods and systems for sparse vector-based matrix transformations

A sparse vector and matrix technique, applied in fields such as biological systems, genomics, and biostatistics, that addresses problems such as data that is difficult to query or transform and a lack of data.

Pending Publication Date: 2021-04-09
REGENERON PHARM INC

AI Technical Summary

Problems solved by technology

Scalability: The amount of data grows rapidly, which makes it difficult to query or transform the data
Decentralized analytics: lack of a unified engine for big data processing that provides shared APIs and common code bases



Examples


Embodiment 1

[0249] Embodiment 1. A method comprising:

[0250] receiving genotype data and phenotype data for a plurality of individuals from a plurality of cohorts;

[0251] generating a genotype matrix based on the genotype data, wherein the genotype matrix includes a column for each of the plurality of individuals and a plurality of rows for each of the plurality of variants;

[0252] generating a quantitative trait matrix based on the phenotype data, wherein the quantitative trait matrix includes a column for each of a plurality of quantitative traits and a plurality of rows for each of the plurality of individuals;

[0253] generating a binary trait matrix based on the phenotype data, wherein the binary trait matrix includes a column for each of a plurality of binary traits and a plurality of rows for each of the plurality of individuals;

[0254] appending at least a portion of a metadata matrix to each of the genotype matrix, the quantitative trait matrix, and the binary trait ma...
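The matrix orientations recited in steps [0251] to [0253] can be illustrated with a minimal sketch. The input record layout, the function names `build_genotype_matrix` and `build_trait_matrix`, and the toy identifiers (`ind1`, `var1`, `ldl`, `bmi`, `t2d`) are all assumptions made for illustration and do not come from the application; the metadata-appending step of [0254] is not shown because the source text is truncated there.

```python
# Hypothetical toy inputs; this record layout is an assumption, not the patent's format.
genotype_records = [
    # (individual_id, variant_id, alternate_allele_count)
    ("ind1", "var1", 0), ("ind1", "var2", 1),
    ("ind2", "var1", 2), ("ind2", "var2", 0),
    ("ind3", "var1", 0), ("ind3", "var2", 0),
]
phenotype_records = {
    "ind1": {"ldl": 3.1, "bmi": 24.0, "t2d": 0},
    "ind2": {"ldl": 2.4, "bmi": 27.5, "t2d": 1},
    "ind3": {"ldl": 4.0, "bmi": 22.1, "t2d": 0},
}


def build_genotype_matrix(records):
    """Return (variants, individuals, matrix): one row per variant, one column per individual."""
    individuals = sorted({ind for ind, _, _ in records})
    variants = sorted({var for _, var, _ in records})
    col = {ind: j for j, ind in enumerate(individuals)}
    row = {var: i for i, var in enumerate(variants)}
    matrix = [[0] * len(individuals) for _ in variants]
    for ind, var, count in records:
        matrix[row[var]][col[ind]] = count
    return variants, individuals, matrix


def build_trait_matrix(phenotypes, traits):
    """Return (individuals, matrix): one row per individual, one column per trait."""
    individuals = sorted(phenotypes)
    matrix = [[phenotypes[ind][trait] for trait in traits] for ind in individuals]
    return individuals, matrix


variants, individuals, genotype_matrix = build_genotype_matrix(genotype_records)
_, quantitative_matrix = build_trait_matrix(phenotype_records, ["ldl", "bmi"])
_, binary_matrix = build_trait_matrix(phenotype_records, ["t2d"])
```

Sorting the identifiers gives every matrix the same individual ordering, so column `j` of the genotype matrix and row `j` of each trait matrix refer to the same person in this sketch.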

Embodiment 2

[0262] Embodiment 2. The method of embodiment 1, wherein the sparse vector representing one or more values of the genotype matrix comprises a data structure that associates columns for each group identifier.

Embodiment 3

[0263] Embodiment 3. The method of embodiment 1, wherein the sparse vector representing one or more values of the genotype matrix comprises a homozygous reference.
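Embodiment 3 is terse, so the following is only one plausible reading: the homozygous-reference genotype is treated as the implicit default of the sparse vector, and only entries that differ from it are stored. The encoding of homozygous reference as `0` and the names `HOM_REF`, `to_sparse`, and `to_dense` are assumptions made for this sketch.

```python
HOM_REF = 0  # assumed encoding: 0 alternate alleles means homozygous reference


def to_sparse(row, default=HOM_REF):
    """Keep only the entries of a genotype-matrix row that differ from the default."""
    return {
        "size": len(row),
        "default": default,
        "values": {j: v for j, v in enumerate(row) if v != default},
    }


def to_dense(sparse):
    """Rebuild the full row by filling every unstored position with the default."""
    row = [sparse["default"]] * sparse["size"]
    for j, v in sparse["values"].items():
        row[j] = v
    return row


dense_row = [0, 0, 1, 0, 2, 0]       # hypothetical genotypes for six individuals
sparse_row = to_sparse(dense_row)    # stores two entries instead of six
restored_row = to_dense(sparse_row)
```

Because most individuals are homozygous reference at most variant sites, a default-valued encoding of this kind tends to store far fewer entries than the dense row.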


Abstract

Methods and systems are described for converting a matrix to a sparse vector-based matrix utilizing one or more of a global identifier, a cohort identifier, an n-tuple representation, and a sparse vector. Methods and systems are described for partitioning matrices. Methods and systems are described for managing execution of tasks in a distributed computing environment. Methods and systems are described for positioning data within the distributed computing environment.
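The abstract names a global identifier, a cohort identifier, an n-tuple representation, and a sparse vector without detailing how they combine. As a rough sketch under stated assumptions, the code below numbers individuals globally across cohorts and emits one tuple `(variant_id, cohort_id, sparse_vector)` per cohort, with the sparse vector keyed by global identifier. The tuple layout, the default value `0`, and the names `assign_global_ids` and `to_tuples` are hypothetical and are not taken from the application.

```python
from collections import defaultdict


def assign_global_ids(cohorts):
    """Map each (cohort_id, individual_id) pair to a running index across all cohorts."""
    global_ids = {}
    for cohort_id in sorted(cohorts):
        for individual in cohorts[cohort_id]:
            global_ids[(cohort_id, individual)] = len(global_ids)
    return global_ids


def to_tuples(variant_id, calls, global_ids, default=0):
    """Turn one variant's calls into (variant_id, cohort_id, sparse_vector) tuples.

    calls maps (cohort_id, individual_id) to a genotype value; values equal to
    the default are dropped, so cohorts with no stored values yield no tuple.
    """
    per_cohort = defaultdict(dict)
    for (cohort_id, individual), value in calls.items():
        if value != default:
            per_cohort[cohort_id][global_ids[(cohort_id, individual)]] = value
    return [(variant_id, c, per_cohort[c]) for c in sorted(per_cohort)]


cohorts = {"A": ["a1", "a2"], "B": ["b1"]}  # hypothetical cohorts
global_ids = assign_global_ids(cohorts)
calls = {("A", "a1"): 0, ("A", "a2"): 1, ("B", "b1"): 2}
tuples = to_tuples("var1", calls, global_ids)
```

Grouping by cohort also gives a natural unit for partitioning work across a distributed environment, though the partitioning scheme actually described in the application is not visible in this excerpt.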

Description

[0001] Cross-reference to related patent applications

[0002] This application claims priority to U.S. Provisional Application 62/679,517, filed June 1, 2018, and U.S. Provisional Application 62/840,986, filed April 30, 2019, which are hereby incorporated by reference in their entirety.

Background

[0003] The discovery, development, and commercialization of new classes of drugs can require decades and billions in research and development investments. Research has shown that new drug target candidates based on evidence from human genetics have a significantly improved probability of success. In response, a comprehensive genetic database was created to complement the drug development pipeline. This database includes DNA sequence data from more than 250,000 individuals with paired de-identified electronic health records. High-throughput pipelines have been developed for examining associations between all genetic mutations and disease traits. A...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16B5/10, G06F17/10
CPC: G16B5/10, G06F16/2462, G16B50/30, G06F16/221, G16B20/20, G16B30/10, G16B40/20
Inventors: E. Maxwell, L. Barnard, A. Yadav, J. Staples, J. Reid, L. Habegger
Owner: REGENERON PHARM INC