  • Thesis

  • Authors: Walker, Megon Jarmaine (2006)

  • Understanding molecular interactions is at the core of computational biology and includes problems such as characterizing protein-protein, protein-small molecule, protein-DNA, and protein-RNA binding events. These interactions are often elucidated by expensive and time-consuming assays during which candidate binders are screened against a target. The main aim of this dissertation is to improve the speed, cost, and overall efficiency of screening assays in the context of drug design and molecular systems biology. Sequential screening is an iterative process of experimentation and model refinement. Target binding activity is determined for samples of putative binders, results are used to update a classification model, and subsequent binding experiments are performed based on knowledg...

  • Thesis

  • Authors: Vinar, Tomas (2006)

  • In this thesis, we present enhancements of hidden Markov models for the problem of finding genes in DNA sequences. Genes are the parts of DNA that serve as a template for synthesis of proteins. Thus, gene finding is a crucial step in the analysis of DNA sequencing data. Hidden Markov models are a key tool used in gene finding. This thesis presents three methods for extending the capabilities of hidden Markov models to better capture the statistical properties of DNA sequences. In all three, we encounter limiting factors that lead to trade-offs between the model accuracy and those limiting factors. First, we build better models for recognizing biological signals in DNA sequences. Our new models capture non-adjacent dependencies within these signals. In this case, the main limiting ...

  • Thesis

  • Authors: Brejova, Bronislava (2006)

  • This thesis introduces new techniques for finding genes in genomic sequences. Genes are regions of a genome encoding proteins of an organism. Identification of genes in a genome is an important step in the annotation process after a new genome is sequenced. The prediction accuracy of gene finding can be greatly improved by using experimental evidence. This evidence includes homologies between the genome and databases of known proteins, or evolutionary conservation of genomic sequence in different species. We propose a flexible framework to incorporate several different sources of such evidence into a gene finder based on a hidden Markov model. Various sources of evidence are expressed as partial probabilistic statements about the annotation of positions in the sequence, and these a...