Result: A data management framework for clinical interpretation of human variations

Title:
A data management framework for clinical interpretation of human variations
Authors:
Ou, M. [区敏]
Publisher Information:
The University of Hong Kong (Pokfulam, Hong Kong)
Publication Year:
2017
Collection:
University of Hong Kong: HKU Scholars Hub
Document Type:
Dissertation doctoral or postdoctoral thesis
Language:
English
Relation:
HKU Theses Online (HKUTO); Ou, M. [区敏]. (2017). A data management framework for clinical interpretation of human variations. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.; b5864196; http://hdl.handle.net/10722/241409
Rights:
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. ; The author retains all proprietary rights, (such as patent rights) and the right to use in future works.
Accession Number:
edsbas.F0C3BDBD
Database:
BASE

More information

The emergence of high-throughput, low-cost next-generation sequencing (NGS) technologies has led to an explosion in genetic information for clinical care. The exploitation of such massive genetic information has the potential to revolutionize disease diagnosis and drug development, but it also reveals an urgent need for efficient and accurate tools to analyze genetic information, in particular to interpret genetic variants for clinical purposes. The challenge of NGS data management and analysis lies not only in managing and analyzing the massive amount of data generated from genetic tests. The diverse sources (databases) of medical knowledge used to annotate genetic variants also complicate the automation of variant analysis; for example, the coordinate system and naming convention vary from one source to another. Integrating these annotations is an important but enormous task: the resulting databases require substantial storage space, and querying can be very slow without proper indexing and pre-processing. Another issue is that, to help users better understand genetics-related annotations, the visualization of different aspects of variant information needs to be handled carefully. Existing software tools have solved some of these problems but lack other features. In this thesis, I present a data management framework for the clinical interpretation of human variations. First, it involves a unified coordinate system in which annotations are categorized according to variants, genes, or proteins. Second, the annotation process can be sped up by pre-processing the data on a supercomputer, and the storage required by the integrated database can be reduced via a unified database representation with compressed fields. Based on this framework, a variant interpretation software tool called database.bio was designed and developed. It combines variant annotation, categorization, and visualization in order to provide clinical doctors or bioinformaticians with insight into individual genetic characteristics. Moreover, the ...