Computational Biology: Insights into Hemagglutinin and Polycomb Repressive Complex 2 Function

Date
2012
Abstract

Influenza B virus hemagglutinin (HA) is a major surface glycoprotein with frequent amino-acid substitutions. However, the role of antibody selection in these substitutions is still poorly understood. An analysis of 271 HA1 sequences from influenza B virus strains isolated between 1940 and 2007 found that the positively selected sites are all located in the four major epitopes (120-loop, 150-loop, 160-loop and 190-helix), supporting a predominant role of antibody selection in HA evolution. Of particular significance is the involvement of the 120-loop in positive selection. Influenza B virus HA continues to evolve into new sublineages, within which the four major epitopes were targeted selectively in positive selection. Thus, any newly emerging strain needs to be placed in the context of its evolutionary history in order to understand and predict its epidemic potential.

As key epigenetic regulators, polycomb group (PcG) proteins are responsible for the control of cell proliferation and differentiation as well as stem cell pluripotency and self-renewal. To facilitate experimental identification of PcG target genes, which are poorly understood, we propose a novel computational method, EpiPredictor, which models transcription factor interaction using a non-linear kernel. The resulting targets suggest that networking of multiple transcription factors at the cis-regulatory elements is critical for PcG recruitment, while high GC content and a high conservation level are also important features of PcG target genes. To extend EpiPredictor to human data, we performed a computational study of 22 human genome-wide ChIP data sets to identify DNA motifs and genome features that could potentially specify PRC2, using five motif discovery algorithms, known transcription factor binding motifs from JASPAR, and other whole-genome data. We found multiple motifs, within the various subgroups of experimental categories, that are much more enriched in ChIP-identified gene promoters than in random gene promoters. Specifically, we identified low-CpG-content CpG islands (LCGs) as critical for separating targets identified in cancer cell lines from those identified in embryonic stem (ES) cell lines. Additionally, predictions for human and mouse ES cells differ when the same motifs and features are used, suggesting relevant evolutionary divergence.

Degree
Doctor of Philosophy
Type
Thesis
Keywords
Applied sciences, Biological sciences, Polycomb, Polycomb response element, Hemagglutinin, Motif discovery, Positive selective pressure, Viral evolution, Biomedical engineering, Bioinformatics
Citation

Kirk, Brian David. "Computational Biology: Insights into Hemagglutinin and Polycomb Repressive Complex 2 Function." (2012) Diss., Rice University. https://hdl.handle.net/1911/70293.

Rights
Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder.