Method of automatically acquiring QTL data from literature

A technique for automatically acquiring data from literature, applied in the field of bioinformatics. It addresses the heavy workload, slow speed, and delayed updates of manual curation, and reduces the labor burden.

Inactive Publication Date: 2018-01-16
武汉古奥基因科技有限公司

AI Technical Summary

Problems solved by technology

However, QTL information is currently acquired mainly by manually reading the literature, which is labor-intensive and slow and hinders timely updates.




Embodiment Construction

[0030] 1. Extract the structure and content of the table from the document in PDF format

[0031] In academic literature in PDF format, tables are usually presented as three-line tables, as shown in Figure 1. Since a PDF page can be treated as a noise-free image, we use image recognition to analyze the three-line table. A three-line table typically uses three horizontal table lines of equal length to delimit the header part and the data part. The length of each table line can be measured as a number of consecutive black pixels. By scanning the page row by row, we can quickly locate the table lines and therefore the table itself. From the positions of the table lines, we can also distinguish the header area from the data area.
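The row-by-row scan described above can be illustrated with a minimal sketch. It is not the patent's implementation: it assumes the page has already been rasterized and binarized into a list of rows where 1 is a black pixel and 0 is white, and the names `longest_black_run`, `find_table_lines`, `split_three_line_table`, and the parameters `min_length` and `tolerance` are hypothetical choices made for this illustration.

```python
from typing import Dict, List, Optional, Tuple

Image = List[List[int]]  # assumed input: 1 = black pixel, 0 = white pixel


def longest_black_run(row: List[int]) -> Tuple[int, int]:
    """Return (start_column, length) of the longest run of black pixels in a row."""
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for x, pixel in enumerate(row):
        if pixel:
            if run_len == 0:
                run_start = x
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0
    return best_start, best_len


def find_table_lines(image: Image, min_length: int) -> List[Tuple[int, int, int, int]]:
    """Scan the page row by row and collect horizontal table lines.

    Each result is (first_row, last_row, start_column, length). Adjacent
    rows that all contain a long black run are merged into one thick line.
    """
    lines: List[Tuple[int, int, int, int]] = []
    current: Optional[List[int]] = None
    for y, row in enumerate(image):
        start, length = longest_black_run(row)
        if length >= min_length:
            if current is None:
                current = [y, y, start, length]
            else:
                current[1] = y  # extend the thick line downwards
        elif current is not None:
            lines.append((current[0], current[1], current[2], current[3]))
            current = None
    if current is not None:
        lines.append((current[0], current[1], current[2], current[3]))
    return lines


def split_three_line_table(
    lines: List[Tuple[int, int, int, int]], tolerance: int = 2
) -> Dict[str, Tuple[int, int]]:
    """Split a three-line table into header and data row ranges.

    Ranges are half-open (first_row, end_row). The three lines must have
    nearly equal length, as the description states for three-line tables.
    """
    if len(lines) != 3:
        raise ValueError("expected exactly three table lines")
    lengths = [line[3] for line in lines]
    if max(lengths) - min(lengths) > tolerance:
        raise ValueError("table lines differ in length")
    top, middle, bottom = lines
    return {
        "header": (top[1] + 1, middle[0]),
        "data": (middle[1] + 1, bottom[0]),
    }
```

In this sketch, `min_length` would be chosen relative to the page width so that ordinary text strokes are not mistaken for table lines. The returned row ranges could then be cropped and passed to whatever step extracts the cell contents.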

[0032] First, we determine the vertical dividing line of the table ( figure 1 All vertical lines except the vertical ...


Abstract

The invention belongs to the field of bioinformatics and relates in particular to a method of automatically acquiring QTL (quantitative trait locus) data from literature. The method uses text mining to automatically extract information such as QTLs and gene functions from the relevant literature, and uses computer data mining techniques to automatically acquire QTL information from literature in PDF format. This addresses the problems of the current manual reading approach, which involves a heavy workload, is slow, and cannot process newly published data in a timely manner. At the same time, the method can greatly reduce the labor burden of database construction.

Description

Technical field

[0001] The invention belongs to the field of bioinformatics and relates in particular to a method for automatically acquiring QTL data from literature.

Background

[0002] Biological researchers publish a large amount of research data in the literature. As the volume of literature grows rapidly, obtaining these data quickly has become a challenge. Manually reading the literature often fails to surface the relevant information in a timely and effective manner. Automatically extracting useful information from massive data has therefore become an urgent problem in bioinformatics, and mining the literature with natural language processing and machine learning will become an important means of solving it.

[0003] QTL (quantitative trait locus) is important genome annotation information. However, at present, the acquisition of QTL information is mainly through manual reading of literature, which has a large workload and slow speed,...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/00, G06F17/27
Inventor: 袁晓辉
Owner: 武汉古奥基因科技有限公司