Treffer: Bayesian approach for two model-selection-related bioinformatics problems. ; CUHK electronic theses & dissertations collection

Title:
Bayesian approach for two model-selection-related bioinformatics problems. ; CUHK electronic theses & dissertations collection
Contributors:
Liang, Tong, Chinese University of Hong Kong Graduate School. Division of Information Engineering.
Publication Year:
2013
Collection:
The Chinese University of Hong Kong: CUHK Digital Repository / 香港中文大學數碼典藏
Document Type:
Fachzeitschrift text
File Description:
electronic resource; remote; 1 online resource (xi, 130 leaves) : ill. (some col.)
Language:
English
Chinese
Rights:
Use of this resource is governed by the terms and conditions of the Creative Commons “Attribution-NonCommercial-NoDerivatives 4.0 International” License (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Accession Number:
edsbas.7C5CB104
Database:
BASE

Weitere Informationen

在貝葉斯推理框架下,貝葉斯方法可以通過數據推斷複雜概率模型中的參數和結構。它被廣泛應用於多个領域。對於生物信息學問題,貝葉斯方法同樣也是一個理想的方法。本文通過介紹新的貝葉斯模型和計算方法討論並解決了兩個與模型選擇相關的生物信息學問題。 ; 第一個問題是關於在DNA 序列中的模式識別的相關研究。串聯重複序列片段在DNA 序列中經常出現。它對於基因組進化和人類疾病的研究非常重要。在這一部分,本文主要討論不確定數目的同一模式的串聯重複序列彌散分佈在同一個序列中的情況。我們首先對串聯重複序列片段構建概率模型。然後利用馬爾可夫鏈蒙特卡羅算法探索後驗分佈進而推斷出串聯重複序列的重複片段的模式矩陣和位置。此外,利用RJMCMC 算法解決由不確定數目的重複片段引起的模型選擇問題。 ; 另一個問題是對於生物分子的構象轉換的分析。一組生物分子的構象可被分成幾個不同的亞穩定狀態。由於生物分子的功能和構象之間的固有聯繫,構象轉變在不同的生物分子的生物過程中都扮演者非常重要的角色。一般我們從分子動力學模擬中可以得到構象轉換的數據。基於從分子動力學模擬中得到的微觀狀態水準上的構象轉換資訊,我們利用貝葉斯方法研究從微觀狀態到可變數目的亞穩定狀態的聚合問題。 ; 本文通過對以上兩個問題討論闡釋貝葉斯方法在生物信息學研究的多個方面具備優勢。這包括闡述生物問題的多變性,處理噪聲和失數據,以及解決模型選擇問題。 ; Bayesian approach is a powerful framework for inferring the parameters and structures of complicated probabilistic models from data. It is widely applied in many areas and also ideal for Bioinformatics problems due to their usually high complexity. In this thesis, new Bayesian models and computing methods are introduced to solve two Bioinformatics problems which are both related to model selection. ; The first problem is about the repeat pattern recognition. Tandem repeats occur frequently in DNA sequences. They are important for studying genome evolution and human disease. This thesis focuses on the case that an unknown number of tandem repeat segments of the same pattern are dispersively distributed in a sequence. A probabilistic generative model is introduced for the tandem repeats. Markov chain Monte Carlo algorithms are used to explore the posterior distribution as an effort to infer both the specific pattern of the tandem repeats and the location of repeat segments. Furthermore, reversible jump Markov chain Monte Carlo algorithms are used to address the transdimensional model selection problem raised by the variable number of repeat segments. ; The second part of this thesis is engaged in the conformational transitions of biomolecules. Because the function of a biological biomolecule is inherently related to its variable conformations which can be grouped into a set of metastable or long-live states, conformational ...