Result: Towards understanding genome regulation via high-resolution analysis of chromatin accessibility

Title:
Towards understanding genome regulation via high-resolution analysis of chromatin accessibility
Authors:
Contributors:
Lunter, G, Hughes, J
Publication Year:
2024
Collection:
Oxford University Research Archive (ORA)
Document Type:
Dissertation thesis
Language:
English
DOI:
10.5287/ora-dav75pnkb
Rights:
info:eu-repo/semantics/openAccess
Accession Number:
edsbas.F6DE0652
Database:
BASE

Further information

Next-generation sequencing has been used in functional genomics to represent specific aspects of chromatin structure, DNA-protein interactions and the epigenome, enriching our knowledge of the non-coding genome, which plays significant roles in gene regulation but remains poorly understood. Among the assays for functional genomics, Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq) is a powerful and popular tool, as its simple readout of ‘chromatin accessibility’ provides both the positioning of functional regulatory elements and a general representation of the active genome. Despite its advantages, the analysis of ATAC-seq is challenging due to data sparsity and the sub-optimal use of the data, especially the fragment size of the sequencing reads. In this thesis, I address the use of ATAC-seq fragment size in two different ways with respect to two different goals: prioritising functional variants and peak calling. The first goal is achieved through collaborative work on Avocato, an end-to-end pipeline for single-cell ATAC-seq (scATAC-seq) functional analysis. Avocato employs a simple but useful fragment size filtering strategy, retaining only short fragments from high-quality scATAC-seq data, which successfully prioritises functional SNP variants hidden in the raw signals. Several statistical optimisations in the pre-processing stages and a powerful interactive visualisation platform make Avocato a solid, hands-on tool for high-resolution scATAC-seq analysis. The second goal, peak calling, focuses on both bulk and single-cell ATAC-seq data. I developed EpiCall, a novel ATAC-seq-specific peak caller that shows superior performance compared to current popular peak callers. By modelling ATAC-seq fragments of varying sizes differently, EpiCall makes optimal use of fragment size and coverage information, which is integrated into a mechanistic model of the actual formation of sequencing fragments. 
I applied EpiCall to various datasets and evaluated it extensively with ...
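
The fragment size filtering mentioned in the abstract (retaining only short fragments) can be illustrated with a minimal sketch. This is not Avocato's implementation, which the record does not describe. The tab-separated input layout (chromosome, start, end, cell barcode), the names `Fragment`, `parse_fragments` and `keep_short`, and the 120 bp cut-off are all assumptions chosen for illustration only.

```python
from typing import Iterable, Iterator, NamedTuple


class Fragment(NamedTuple):
    """One ATAC-seq fragment (hypothetical record layout)."""
    chrom: str
    start: int    # 0-based, inclusive
    end: int      # exclusive
    barcode: str  # cell barcode


def parse_fragments(lines: Iterable[str]) -> Iterator[Fragment]:
    """Parse tab-separated fragment lines, skipping comments and blanks."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        chrom, start, end, barcode = line.split("\t")[:4]
        yield Fragment(chrom, int(start), int(end), barcode)


def keep_short(fragments: Iterable[Fragment],
               max_len: int = 120) -> Iterator[Fragment]:
    """Yield only fragments whose length is at most max_len base pairs.

    The default of 120 bp is an illustrative assumption, not a value
    taken from the thesis.
    """
    for frag in fragments:
        if frag.end - frag.start <= max_len:
            yield frag


# Toy input: three fragments of length 80, 200 and 120 bp.
example = [
    "# chrom\tstart\tend\tbarcode",
    "chr1\t1000\t1080\tAAAC",
    "chr1\t2000\t2200\tAAAC",
    "chr2\t500\t620\tGGTT",
]
short = list(keep_short(parse_fragments(example), max_len=120))
for frag in short:
    print(frag.chrom, frag.start, frag.end, frag.end - frag.start)
```

On the toy input, the 200 bp fragment is dropped and the 80 bp and 120 bp fragments are kept. A real analysis would choose the cut-off from the observed fragment length distribution.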