The Gupta group uses statistical and computational approaches to find conserved
stochastic patterns, or motifs, in genome sequences. The group is particularly
interested in using these approaches to discover gene regulatory modules and
interaction networks involved in
specific biological processes. Several theoretical questions arise from
these studies: What are the limits beyond which a pattern cannot be
distinguished from the background? How are local and global sequence
composition related to the difficulty of finding patterns?
One observation from the genomes of higher organisms such as the mouse
or human is that true motifs are not always well conserved but often
occur in clusters, or regulatory modules, close to the regulation start
site. In such a scenario, standard motif-searching methods are ineffective
and often produce many false predictions or missed sites. Dr.
Gupta and colleagues have developed a Monte Carlo approach to find
the optimal set of pattern classes. The approach introduces a model that
assumes an underlying Markov structure for pattern-type occurrences and
inter-site distances. Within a Bayesian framework, an appropriate choice
of priors permits a recursive algorithm that evaluates the new likelihood
function exactly, draws posterior samples, and yields improved
parameter estimates. This framework for the module model has the potential
to elucidate gene regulation networks using only the genomic sequence
information. For example, if the binding sites for a large group of
human transcription factors were defined experimentally, one could search
for the best subset of motifs in the non-coding regions of their target
sequences. Transcription
factors that bind the same subset of motifs may be co-regulated and may
therefore be functionally related. These regulatory modules should lead
to experimentally testable models that will shed light on many interesting
biological processes and disease states.

Selected References:

Giresi PG, Gupta M, Lieb JD. (2006) Regulation of nucleosome stability as a mediator of chromatin function. Curr Opin Genet Dev 16:171-6.

Gupta M, Liu JS. (2005) De novo cis-regulatory module elicitation for eukaryotic genomes. Proc Natl Acad Sci U S A 102:7079-84.

Gupta M, Liu JS. (2003) Discovery of conserved sequence patterns using a stochastic dictionary model. J Am Stat Assoc 98:55-66.

Liu JS, Gupta M, Liu XL, Lawrence CL. (2002) Statistical models for motif discovery. In Case Studies in Bayesian Statistics vol 6, Springer-Verlag, New York.