Vector-based intergenic semantic similarity calculation method

A vector-based method for calculating semantic similarity between genes, applicable to computing, genomics, biostatistics and related fields. It addresses the inaccuracy of existing query methods and improves their effectiveness.

Status: Inactive
Publication Date: 2017-06-27
EAST CHINA NORMAL UNIV

AI Technical Summary

Problems solved by technology

Functionally similar or related proteins do not necessarily have strong sequence similarity, so query methods based on sequence comparison are sometimes inaccurate.

Embodiment Construction

[0052] Referring to Figure 1, the present invention proposes a vector-based method for calculating the semantic similarity between genes. Take gene 1 and gene 2 as an example, and assume that the Gene Ontology database contains the following terms in total: a, b, c, d, e, f, g, h, x, y, z, with terms corresponding one-to-one to nodes. The semantic contribution of the parent-child relationship ("is_a") in the Gene Ontology is w_is_a = 1, and the semantic contribution of the inclusion relationship ("part_of") is w_part_of = 0.7. The implementation comprises the following steps:

[0053] A. Initialize the vectors of the two genes

[0054] (1) For each gene, the components corresponding to the terms directly annotated by the gene are initialized to 1, and all other components are initialized to 0. The directly annotated terms can be obtained from the gene annotation file. Assuming that the terms directly anno...
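The initialization in step A(1) can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the term order, the function name `init_gene_vector`, and the annotation sets for gene 1 and gene 2 are assumptions, because the source text is truncated before it lists the actual annotations. The two edge weights are taken from paragraph [0052] and are only recorded here, since the steps that use them are not available.

```python
# Terms of the example ontology, in a fixed order (one vector component per term).
TERMS = ["a", "b", "c", "d", "e", "f", "g", "h", "x", "y", "z"]

# Semantic contributions stated in [0052]; used by later steps not shown here.
W_IS_A = 1.0
W_PART_OF = 0.7


def init_gene_vector(direct_terms, terms=TERMS):
    """Return a vector with 1.0 for directly annotated terms and 0.0 elsewhere."""
    direct = set(direct_terms)
    unknown = direct - set(terms)
    if unknown:
        raise ValueError(f"terms not in ontology: {sorted(unknown)}")
    return [1.0 if term in direct else 0.0 for term in terms]


# Hypothetical annotations, chosen only for illustration.
gene1_vector = init_gene_vector({"d", "g"})
gene2_vector = init_gene_vector({"e", "x"})
```

In practice the annotation sets would be read from the gene annotation file mentioned above rather than written inline.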

Abstract

The invention discloses a vector-based method for calculating semantic similarity between genes. Building on existing vector-based approaches to gene similarity, the method additionally takes the hierarchical structure of the Gene Ontology into account. During vector construction it considers not only the terms directly annotated by a gene but also the child and parent nodes of those terms in the Gene Ontology, so that the resulting vectors reflect the attributes of the genes more comprehensively and in greater detail.
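Once each gene is represented as a vector over ontology terms, two genes can be compared with a vector similarity measure. The sketch below uses cosine similarity, which is an assumption: the text available here does not name the measure the invention uses, and the function name `cosine_similarity` is hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length term vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # A gene with no annotated terms has no direction to compare.
        return 0.0
    return dot / (norm_u * norm_v)
```

With vectors that carry weights on parent and child terms, two genes annotated to different but related terms can still share nonzero components, which is the benefit the abstract describes.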

Description

Technical field

[0001] The invention belongs to the technical field of bioinformatics, and in particular relates to a vector-based method for calculating the semantic similarity between genes.

Background

[0002] Advances in science and technology have led to exponential growth in biological data, and the complexity of that data has increased as well. Because different biologists describe the same biological data differently, the understanding of biological data becomes inconsistent. For example, the same biological term may carry different meanings in different places, or the same meaning may be expressed by different terms. Such semantic confusion forces biologists to spend a great deal of time and energy searching for the biological information they need. It not only makes it difficult for computers to find the desired results, but also makes satisfactory results hard to achieve even with manual processing. Applying ontology to the fiel...

Application Information

IPC(8): G06F19/18, G06F19/24
CPC: G16B20/00, G16B40/00
Inventors: 章炯民, 贾柯
Owner: EAST CHINA NORMAL UNIV