<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "http://dtd.nlm.nih.gov/publishing/2.3/journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
	<front>
		<journal-meta>
			<journal-id journal-id-type="nlm-ta">J Proteomics Bioinform</journal-id>
			<journal-id journal-id-type="publisher-id">opg</journal-id>
            <journal-title>Journal of Proteomics &amp; Bioinformatics</journal-title>
			<issn pub-type="epub">0974-276X</issn>
			<publisher>
				<publisher-name>OMICS Publishing Group</publisher-name>
				<publisher-loc>India, USA</publisher-loc>
			</publisher>
		</journal-meta>
		<article-meta>
		<article-id pub-id-type="doi">10.4172/jpb.1000035</article-id>
		<article-id pub-id-type="publisher-id">000063</article-id>
		<article-categories>
				<subj-group subj-group-type="heading">
					<subject>Research Article</subject>
				</subj-group>
				<subj-group subj-group-type="Discipline">
					<subject>Biochemistry</subject>
				</subj-group>
				<subj-group subj-group-type="System Taxonomy">
					<subject>Proteomics</subject>
					<subject>Bioinformatics</subject>
					<subject>Genomics</subject>
					<subject>Transcriptomics</subject>
					<subject>Biomarkers</subject>
				</subj-group>
			</article-categories>
			<title-group>
				<article-title>Pathway Modeling: New face of Graphical Probabilistic Analysis</article-title>
			</title-group>
			<contrib-group>
				<contrib contrib-type="author">
					<name>
						<surname>Tagore</surname>
						<given-names>Somnath</given-names>
					</name>
					<xref ref-type="aff" rid="A1">1</xref>
				</contrib>
				<contrib contrib-type="author">
					<name>
						<surname>Gomase</surname>
						<given-names>Virendra S.</given-names>
					</name>
					<xref ref-type="aff" rid="A1">1</xref>					
				</contrib>
				<contrib contrib-type="author">
					<name>
						<surname>De</surname>
						<given-names>Rajat K.</given-names>
					</name>
					<xref ref-type="aff" rid="A2">2</xref>
				</contrib>
			</contrib-group>
			<aff id="A1"><label>1</label>Department of Bioinformatics, Padmashree Dr. D.Y. Patil University, Plot No-50, Sector-15, CBD Belapur, Navi Mumbai 400614, India</aff>
			<aff id="A2"><label>2</label>Machine Intelligence Unit, Indian Statistical Institute, Kolkata 700108, India</aff>
			<author-notes>
				<corresp>To whom correspondence should be addressed: Virendra S. Gomase, Department of Bioinformatics, Padmashree Dr. D.Y. Patil University, Plot No-50, Sector-15, CBD Belapur, Navi Mumbai 400614, India, E-mail: <email>virusgene1@yahoo.co.in</email></corresp>
			</author-notes>
			<pub-date pub-type="collection">
				<month>08</month>
				<year>2008</year>
			</pub-date>
			<pub-date pub-type="epub">
				<day>14</day>
				<month>08</month>
				<year>2008</year>
			</pub-date>						
			<volume>1</volume>
			<issue>5</issue>
			<fpage>281</fpage>
			<lpage>286</lpage>
			<history>
			<date date-type="received">
			     <day>11</day>
				 <month>07</month>
				 <year>2008</year>
			</date>
			<date date-type="accepted">
			      <day>02</day>
				  <month>08</month>
				  <year>2008</year>
			</date>
			</history>				
			<permissions>
			<copyright-statement>Copyright: &copy; Somnath T, et al.</copyright-statement>
        <copyright-year>2008</copyright-year>
        <license license-type="open-access">
          <p>This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.</p>
        </license>
      </permissions>	 
			<abstract>
				<p>Pathway analysis is a central topic in Systems Biology. Biological pathways are interesting to model but difficult to optimize. Many disease-related modeling problems can be analyzed by simulating such pathway models. Graphical probabilistic approaches form one family of methodologies for designing and analyzing pathways. We discuss the main graphical approaches used in pathway modeling.</p>
			</abstract> 
			 <kwd-group>
				<kwd>Pathway modeling</kwd>
				<kwd>Pathway analysis</kwd>
				<kwd>Helmholtz machine</kwd>
				<kwd>HMM</kwd>
			</kwd-group>
			<custom-meta-wrap>
				<custom-meta>
					<meta-name>citation</meta-name>
					<meta-value>Somnath T, Virendra SG, Rajat KD (2008) Pathway Modeling: New face of Graphical Probabilistic Analysis. J Proteomics Bioinform 1: 281-286. doi:<ext-link ext-link-type="doi" xlink:href="10.4172/jpb.1000035">10.4172/jpb.1000035</ext-link></meta-value>
				</custom-meta>
			</custom-meta-wrap>
		</article-meta>
	</front>
	<body>
		<sec>
			<title>Introduction</title>
				<p>Biological pathways are modeled to analyze and visualize the sub-steps of a network, to study gene expression profiles and to predict the outcome of alterations made to cells. A major challenge in developing these models is choosing the correct abstraction. Because biological networks are large and diverse, it is essential to balance computational complexity against model fidelity and to move between models at different levels of detail in meaningful ways. Here, graphical probabilistic models are discussed for modeling biochemical pathways. Biological pathways are categorized into metabolic pathways, signal transduction pathways and gene regulatory networks, and we consider all three categories.</p>
		</sec>
		<sec>
			<title>Graphical Probabilistic Models</title>
				<p>Graphical probabilistic models represent multivariate probability densities. Each density is written as a product of terms, each of which involves only a few variables. The structure of this product is represented by a graph that links the variables appearing in a common term. The common types of graphical models are discussed here (<xref ref-type="bibr" rid="r1">Agarwal et al., 2000</xref>; <xref ref-type="bibr" rid="r16">Hall et al., 1999</xref>).</p>
		</sec>
		<sec>
			<title>Types of Graphical Probabilistic Models</title>
				<sec>
					<title>Bayesian Networks</title>
						<p>Bayesian networks are used to model dependence relationships among variables. A Bayesian network is a directed acyclic graph whose nodes represent random variables and whose arcs represent statistical dependence relations among the variables, together with a local probability distribution for each variable given the values of its parents (<xref ref-type="bibr" rid="r21">Levitsky et al., 2007</xref>; <xref ref-type="bibr" rid="r23">Marashi et al., 2007</xref>).</p>
				<fig id="g1">
					<label>Figure 1</label>
					<caption>
						<title>Figure showing a gene regulatory network explained using Bayesian statistics.</title>
					</caption>
					<graphic xlink:href="JPB-01-281-g001.tif"/>
				</fig>
				<p>Thus, for each variable X<sub>i</sub>,
				 <disp-formula id="FD1">
					 <label>[1]</label>
				<mml:math id="M1" display='block'>
 <mml:mrow>
  <mml:mi>i</mml:mi><mml:mtext>&#x2009;</mml:mtext><mml:mo>&#x2208;</mml:mo><mml:mtext>&#x2009;</mml:mtext><mml:mrow><mml:mo>{</mml:mo> <mml:mrow>
   <mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:mi>N</mml:mi>
  </mml:mrow> <mml:mo>}</mml:mo></mml:mrow>
 </mml:mrow>
</mml:math>
</disp-formula>
 </p>
				<p>the set of parent variables is denoted by parents(X<sub>i</sub>), and the joint distribution of the variables is the product of the local distributions: <disp-formula id="FD2">
					 <label>[2]</label>
				<mml:math id="M2" display='block'>
 <mml:mrow>
  <mml:mi>Pr</mml:mi><mml:mrow><mml:mo>(</mml:mo>
   <mml:mrow>
    <mml:msub>
     <mml:mi>X</mml:mi>
     <mml:mn>1</mml:mn>
    </mml:msub>
    <mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:msub>
     <mml:mi>X</mml:mi>
     <mml:mi>N</mml:mi>
    </mml:msub>
   </mml:mrow>
  <mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo>
  <mml:munderover>
   <mml:mo>&#x220F;</mml:mo>
   <mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow>
   <mml:mi>N</mml:mi>
  </mml:munderover>
  <mml:mi>Pr</mml:mi><mml:mrow><mml:mo>(</mml:mo>
   <mml:mrow>
    <mml:msub>
     <mml:mi>X</mml:mi>
     <mml:mi>i</mml:mi>
    </mml:msub>
    <mml:mo>&#x007C;</mml:mo><mml:mi>parents</mml:mi><mml:mrow><mml:mo>(</mml:mo>
     <mml:mrow>
      <mml:msub>
       <mml:mi>X</mml:mi>
       <mml:mi>i</mml:mi>
      </mml:msub>
     </mml:mrow>
    <mml:mo>)</mml:mo></mml:mrow>
   </mml:mrow>
  <mml:mo>)</mml:mo></mml:mrow>
 </mml:mrow>
</mml:math></disp-formula>
 </p>
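				<p>The factorization of the joint distribution into local distributions can be illustrated with a minimal Python sketch. The three-variable network (A is the parent of B and C), the variable names and all probability values are hypothetical and chosen only for illustration; they do not come from any data set discussed here.</p>
				<preformat><![CDATA[
```python
from itertools import product

# Hypothetical binary network: A -> B and A -> C (illustrative names only).
parents = {"A": (), "B": ("A",), "C": ("A",)}

# cpt[var][parent_values] = Pr(var = 1 | parents(var) = parent_values).
# All numbers are made up for the example.
cpt = {
    "A": {(): 0.3},
    "B": {(0,): 0.1, (1,): 0.8},
    "C": {(0,): 0.5, (1,): 0.2},
}


def local_prob(var, value, assignment):
    """Pr(var = value | parents(var)) read from the full assignment."""
    key = tuple(assignment[p] for p in parents[var])
    p_one = cpt[var][key]
    return p_one if value == 1 else 1.0 - p_one


def joint_prob(assignment):
    """Joint probability as the product of the local distributions."""
    result = 1.0
    for var, value in assignment.items():
        result *= local_prob(var, value, assignment)
    return result


variables = list(parents)
# Summing the joint over every assignment should give 1 for a valid network.
total = sum(
    joint_prob(dict(zip(variables, values)))
    for values in product((0, 1), repeat=len(variables))
)
```
]]></preformat>
				<p>In this sketch each variable contributes exactly one factor, so the number of parameters grows with the size of the parent sets rather than with the total number of variables.</p>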
				</sec>
				<sec>
					<title>Gaussian Networks</title>
						<p>A multivariate normal distribution is specified by a mean vector and a covariance matrix. Working with this distribution directly is difficult because the covariance matrix must be positive definite. A Gaussian network instead expresses the joint density as a product of conditional normal densities, one for each variable given its parents, so this constraint need not be imposed explicitly (<xref ref-type="bibr" rid="r25">McKinney, 2006</xref>).</p>
				<fig id="g2">
					<label>Figure 2</label>
					<caption>
						<title>A Gaussian network.</title>
					</caption>
					<graphic xlink:href="JPB-01-281-g002.tif"/>
				</fig>	
				</sec>
				<sec>
					<title>Maximum Likelihood</title>
						<p>Maximum likelihood estimation begins with writing a mathematical expression, called the likelihood function, for the sample data. It is the probability of obtaining that particular set of data, given the chosen probability distribution model. This expression contains the unknown model parameters. The values of these parameters that maximize the sample likelihood are known as the maximum likelihood estimates (MLEs) (<xref ref-type="bibr" rid="r17">Hu et al., 2004</xref>; 
<xref ref-type="bibr" rid="r18">Jin et al., 2008</xref>).</p>
						<p>Given a family M<sub>&#x03B8;</sub> of probability distributions parameterized by &#x03B8; and associated with a known probability density function f<sub>&#x03B8;</sub>, we may draw a sample x<sub>1</sub>, &#x2026;, x<sub>n</sub> of n values from this distribution and then use f<sub>&#x03B8;</sub> to compute the probability density of the observed data (<xref ref-type="bibr" rid="r20">Justenhoven et al., 2008</xref>).<disp-formula id="FD3">
					 <label>[3]</label>
				<mml:math id="M3" display='block'>
 <mml:mrow>
  <mml:msub><mml:mi>f</mml:mi><mml:mi>&#x03B8;</mml:mi></mml:msub>
  <mml:mrow><mml:mo>(</mml:mo>
   <mml:mrow>
    <mml:msub><mml:mi>x</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mtext>&#x2009;</mml:mtext><mml:mo>&#x007C;</mml:mo><mml:mtext>&#x2009;</mml:mtext><mml:mi>&#x03B8;</mml:mi>
   </mml:mrow>
  <mml:mo>)</mml:mo></mml:mrow>
 </mml:mrow>
</mml:math></disp-formula>
</p>
						<p>Viewed as a function of &#x03B8; with the sample held fixed, this density is the likelihood function,<disp-formula id="FD4">
					 <label>[4]</label>
				<mml:math id="M4" display='block'>
 <mml:mrow>
  <mml:mi>L</mml:mi><mml:mrow><mml:mo>(</mml:mo>
   <mml:mi>&#x03B8;</mml:mi>
  <mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo>
  <mml:msub><mml:mi>f</mml:mi><mml:mi>&#x03B8;</mml:mi></mml:msub>
  <mml:mrow><mml:mo>(</mml:mo>
   <mml:mrow>
    <mml:msub><mml:mi>x</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mtext>&#x2009;</mml:mtext><mml:mo>&#x007C;</mml:mo><mml:mtext>&#x2009;</mml:mtext><mml:mi>&#x03B8;</mml:mi>
   </mml:mrow>
  <mml:mo>)</mml:mo></mml:mrow>
 </mml:mrow>
</mml:math></disp-formula>
</p>
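						<p>As a hedged illustration of maximizing a likelihood function, the Python sketch below fits a univariate normal model to a small sample. The choice of the normal family, the function names and the sample values are assumptions made for this example only. For this family the maximizing parameters have a closed form: the sample mean and the sample variance with divisor n.</p>
						<preformat><![CDATA[
```python
import math


def log_likelihood(sample, mu, sigma2):
    """Log of the normal likelihood of an i.i.d. sample for parameters (mu, sigma2)."""
    n = len(sample)
    squared_error = sum((x - mu) ** 2 for x in sample)
    return -0.5 * n * math.log(2.0 * math.pi * sigma2) - squared_error / (2.0 * sigma2)


def normal_mle(sample):
    """Closed-form maximum likelihood estimates for a normal model."""
    n = len(sample)
    mu = sum(sample) / n
    # The MLE of the variance divides by n, not by n - 1.
    sigma2 = sum((x - mu) ** 2 for x in sample) / n
    return mu, sigma2


# Made-up measurements, used only to exercise the functions.
sample = [2.1, 1.9, 2.4, 2.0, 1.6]
mu_hat, sigma2_hat = normal_mle(sample)
best = log_likelihood(sample, mu_hat, sigma2_hat)
```
]]></preformat>
						<p>Evaluating <monospace>log_likelihood</monospace> at any other parameter values gives a result no larger than <monospace>best</monospace>, which is what makes the estimates maximum likelihood estimates. Working with the logarithm leaves the maximizer unchanged and avoids numerical underflow for large samples.</p>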
				</sec>
				<sec>
					<title>Density Estimation</title>
						<p>Density estimation is the construction, from observed data, of an estimate of an unobservable underlying probability density function (<xref ref-type="bibr" rid="r11">Estivill-Castro et al., 2001</xref>).</p>
				</sec>
				<sec>
					<title>Helmholtz Machine (HM)</title>
						<p>Helmholtz machines are neural networks that learn the hidden structure of a set of data by being trained to create a generative model that reproduces the original data. By learning representations of the data, the structure of the generative model comes to approximate the hidden structure of the data set (<xref ref-type="bibr" rid="r11">Estivill-Castro et al., 2001</xref>; <xref ref-type="bibr" rid="r12">Estivill-Castro et al., 2001</xref>). These networks are categorized as autoencoders, deterministic HMs and stochastic HMs. An autoencoder reconstructs its best guess of the input on the basis of the code that it sees, a deterministic HM is inspired by mean-field methods, and a stochastic HM captures the correlations between the activities in different hidden layers (<xref ref-type="bibr" rid="r15">Han et al., 2000</xref>).</p>
				</sec>
				<sec>
					<title>Latent Variable Models (LVM)</title>
						<p>Latent variable models relate a set of manifest (observed) variables to a set of latent variables, and are grouped according to whether the manifest and latent variables are categorical or continuous. They provide a means to parse out measurement error by combining information across observed variables, and they allow the estimation of complex causal models. They are well developed for both metric and discrete observed variables, and they can also account for clustering through random effects (<xref ref-type="bibr" rid="r31">Tonella, 2001</xref>).</p>
				<fig id="g3">
					<label>Figure 3</label>
					<caption>
						<title>A latent variable model (LVM).</title>
					</caption>
					<graphic xlink:href="JPB-01-281-g003.tif"/>
				</fig>		
				</sec>
				<sec>
					<title>Generative Topographic Mapping (GTM)</title>
						<p>In Generative Topographic Mapping (GTM), the training data are assumed to arise by first picking a point probabilistically in a low-dimensional space, then mapping that point through a smooth function to the observed high-dimensional input space, and finally adding noise in that space. The parameters of the low-dimensional probability distribution, the smooth mapping and the noise are learned from the training data using the Expectation-Maximization (EM) algorithm (<xref ref-type="bibr" rid="r6">Cormen et al., 2001</xref>).</p>
				</sec>
				<sec>
					<title>Hidden Markov Model (HMM)</title>
						<p>In a hidden Markov model, the state is not directly visible, but variables influenced by the state are visible. The model consists of a finite set of states, each associated with a probability distribution over the possible output tokens (<xref ref-type="bibr" rid="r9">Demetrescu et al., 2003</xref>). Transitions among the states are governed by a set of probabilities called transition probabilities. In a particular state, an outcome or observation is generated according to the associated probability distribution. The three main problems for HMMs are the evaluation problem, the decoding problem and the learning problem (<xref ref-type="bibr" rid="r9">Demetrescu et al., 2003</xref>).</p>
				<fig id="g4">
					<label>Figure 4</label>
					<caption>
						<title>A hidden Markov model (HMM).</title>
					</caption>
					<graphic xlink:href="JPB-01-281-g004.tif"/>
				</fig>		
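				<p>The evaluation problem, computing the probability of an observed sequence under a given model, is commonly solved with the forward algorithm. The Python sketch below is one minimal way to write it. The two-state promoter model, its state and output names and all probabilities are hypothetical and serve only to make the example runnable.</p>
				<preformat><![CDATA[
```python
from itertools import product


def forward(observations, states, start_p, trans_p, emit_p):
    """Return Pr(observations) by summing over all hidden state paths."""
    # alpha[s] = Pr(observations so far, current hidden state = s)
    alpha = {s: start_p[s] * emit_p[s][observations[0]] for s in states}
    for obs in observations[1:]:
        alpha = {
            s: emit_p[s][obs] * sum(alpha[r] * trans_p[r][s] for r in states)
            for s in states
        }
    return sum(alpha.values())


# Hypothetical model: a promoter that is "on" or "off" (hidden) and an
# expression readout that is "high" or "low" (observed).
states = ("on", "off")
start_p = {"on": 0.5, "off": 0.5}
trans_p = {
    "on": {"on": 0.7, "off": 0.3},
    "off": {"on": 0.4, "off": 0.6},
}
emit_p = {
    "on": {"high": 0.9, "low": 0.1},
    "off": {"high": 0.2, "low": 0.8},
}

likelihood = forward(["high", "low"], states, start_p, trans_p, emit_p)

# Sanity check: the probabilities of all length-2 sequences should sum to 1.
total = sum(
    forward(list(seq), states, start_p, trans_p, emit_p)
    for seq in product(("high", "low"), repeat=2)
)
```
]]></preformat>
				<p>The recursion costs time proportional to the sequence length times the square of the number of states, instead of enumerating every hidden path. Replacing the sum over previous states with a maximum gives the Viterbi recursion used for the decoding problem.</p>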
				</sec>
		</sec>
		<sec>
			<title>Application of Graphical Probabilistic Models</title>
				<sec>
					<title>Application to Metabolic Pathway Modeling</title>
						<p>A machine learning system has been introduced for determining gene function from heterogeneous data sources using a Weighted Naive Bayesian network (WNB). The aim is to infer the functions of putative genes or Open Reading Frames (ORFs) from existing databases using computational methods. The underlying hypothesis is that integrating evidence from multiple complementary sources significantly improves prediction accuracy. The experimental results suggest that this hypothesis is valid and provide guidelines for using the WNB system for data collection, training and prediction. The combined training data sets consist of gene expression results, clustering outputs and sequence homology from public databases. The weight training process is also used to analyze the contribution of each source of information to prediction performance (<xref ref-type="bibr" rid="r10">Deng et al., 2006</xref>).</p>
						<p>Searching for peptide hormones that signal via membrane receptors is often hampered by their small size and lack of sequence similarity. A search tool based on a hidden Markov model has been developed that uses peptide hormone sequence features to estimate the likelihood that a protein contains a processed and secreted peptide of this class. Analysis of the top-scoring hypothetical and poorly annotated human proteins identifies two candidate peptide hormones, both of which are localized to secretory granules in a transfected pancreatic cell line. The findings demonstrate the utility of a bioinformatics approach for identifying novel biologically active peptides (<xref ref-type="bibr" rid="r26">Mirabeau et al., 2007</xref>).</p>
						<p>Multivariate methods are used for the analysis of molecular data, including genotypic data and clinical phenotypes. These methods include latent variable models and joint multivariate modeling techniques. Given the wide variety in the data considered, the objectives of the analyses and the methods applied, a direct comparison of the results is discussed (<xref ref-type="bibr" rid="r3">Beyene et al., 2007</xref>).</p>
						<p>Major stem cell species are studied using a co-clustering latent variable model (LVM). It helps to explain cell type-specific transcription factors, using expression profiles. The LVM-based study also helps to analyze regulatory modules for each stem cell cluster. Furthermore, the identities of the stem cell clusters are revealed by the constituent genes that are directly targeted by the modules (<xref ref-type="bibr" rid="r19">Joung et al., 2006</xref>).</p>
				</sec>
				<sec>
					<title>Application to Signal Transduction Modeling</title>
						<p>A primer on the use of Bayesian networks has been introduced for analyzing the connectivity of signaling networks. Bayesian networks are used to derive causal influences among biological signaling molecules. The primer shows how to derive a Bayesian network model automatically from proteomic data and how to interpret the resulting model (<xref ref-type="bibr" rid="r28">Pe’er, 2005</xref>).</p>
						<p>Stochastic biochemical systems are used for modeling transcriptional regulation in single cells. Transcriptional regulation can be modeled using a hidden Markov model (HMM), which allows it to be studied both mathematically and computationally. Analysis by Monte Carlo simulation is computationally laborious. Several simulations based on a transcriptional regulatory system are employed to show the relative merits and limitations of various approximation techniques (<xref ref-type="bibr" rid="r13">Goutsias, 2006</xref>).</p>
						<p>Graphical models are well suited to analyzing G-protein coupled receptors (GPCRs). Most signaling networks in cells are mediated through the interaction of GPCRs with heterotrimeric GTP-binding proteins (G-proteins). Experimental data suggest that heterotrimeric G-proteins interact with parts of the activated receptor at the transmembrane helix-intracellular loop interface. An exploratory approach has been designed to generate a refined library of hidden Markov models that predict the coupling preference of GPCRs to heterotrimeric G-proteins. It predicts the coupling preferences of GPCRs to the Gs, Gi/o and Gq/11 subfamilies, but not to the G12/13 subfamily (<xref ref-type="bibr" rid="r29">Sgourakis et al., 2005</xref>).</p>
						<p>A hidden Markov model library has been designed for classifying protein kinases into 12 families. The classification achieves a misclassification rate of zero on the characterized kinomes of H. sapiens, M. musculus, D. melanogaster, C. elegans, S. cerevisiae, D. discoideum, and P. falciparum. It is applied to 38 unclassified kinases of yeast, including AGC (5), CAMK (17), CMGC (4), and STE (1). It also facilitates the annotation of kinomes and provides data on the early evolution and subsequent adaptations of the various protein kinase families (<xref ref-type="bibr" rid="r27">Miranda-Saavedra et al., 2007</xref>).</p>
				</sec>
				<sec>
					<title>Application to Gene Regulatory Networks</title>
						<p>Gene regulatory networks are modeled using probabilistic Boolean network methods and dynamic Bayesian network methods. These methods are compared on a biological time-series dataset from the Drosophila Interaction Database in order to construct a Drosophila gene network. A subset of time points and gene samples from the whole dataset is also used to evaluate the performance of the two approaches (<xref ref-type="bibr" rid="r22">Li et al., 2007</xref>).</p>
						<p>A hierarchical hidden Markov regression model has been introduced for determining gene regulatory networks from genomic sequence and gene expression microarray data. A hybrid Monte Carlo methodology is devised to estimate parameters under two classes of latent structure. One arises from the unobservable state identity of genes, and the other from the unknown set of covariates influencing the response within a state (<xref ref-type="bibr" rid="r14">Gupta et al., 2007</xref>).</p>
						<p>A comparative gene predictor called Conrad has been proposed, based on semi-Markov conditional random fields (SMCRFs). It is trained to maximize annotation accuracy. It encodes information as features and treats all features equally in the training and inference algorithms. On Cryptococcus neoformans, configuring Conrad to reproduce the predictions of a two-species phylo-GHMM closely matches the performance of Twinscan. It produces similar results on Aspergillus nidulans when Conrad is compared with Fgenesh (<xref ref-type="bibr" rid="r7">DeCaprio et al., 2007</xref>).</p>
						<p>Hidden Markov models are compared with genotyping to determine the transmission characteristics of sporadic vancomycin-resistant enterococci (VRE). For this purpose, a structured continuous-time hidden Markov model (HMM) is developed. Two parameters are estimated, one to quantify the cross-transmission of VRE and the other to quantify the level of VRE colonization from sporadic sources. Based on model selection criteria, some evidence is found that the cross-transmission parameter changed during the study period. The model estimates that cross-transmission increases at week 120 and declines after week 135, coinciding with environmental decontamination. HMMs are also applied to serial prevalence data to estimate the characteristics of acquisition of nosocomial pathogens and to distinguish between epidemic and sporadic acquisition (<xref ref-type="bibr" rid="r24">McBryde et al., 2007</xref>).</p>
				</sec>
		</sec>
		<sec>
			<title>Current Research</title>
				<p>Bayesian networks are used to predict interaction partners from multiple alignments of interacting protein domain sequences, without the need for any training examples. The method also accurately predicts interaction partners in datasets of polyketide synthases. In addition, analysis of the predicted genome-wide two-component signaling networks shows that interacting kinase/regulator pairs that lie adjacent on the genome and those that lie isolated form two relatively independent components of the signaling network in each genome (<xref ref-type="bibr" rid="r4">Burger et al., 2008</xref>).</p>
				<p>A hidden Markov model is used for predictive modeling of nuclear hormone receptor response elements. Coupled with chromatin microarray technology, it reveals a binding site in the Type I human hepatic 3alpha-hydroxysteroid dehydrogenase (AKR1C4) promoter for the nuclear hormone receptor liver X receptor alpha (LXRalpha). It also suggests that LXRalpha modulates the bile acid biosynthetic pathway at a unique site downstream of CYP7A1 (<xref ref-type="bibr" rid="r30">Stayrook et al., 2008</xref>).</p>
				<p>The most probable state paths of three-nucleotide sequences in the cis-regulatory regions of target genes are identified using a hidden Markov model (HMM). These regions are key elements in the transcriptional regulation of gene expression. The computations are also used to predict C<sub>2</sub>H<sub>2</sub> zinc finger transcription factor binding sites in the cis-regulatory regions of their target genes (<xref ref-type="bibr" rid="r5">Cho et al., 2008</xref>).</p>
				<p>Markov matrix (MMM) values are used to characterize numerically 81 sequences of type III RNases and 133 proteins of a control group. One MMM-QSAR model and one classic hidden Markov model (HMM) are developed from the same data. Without using alignment, the MMM-QSAR discriminates RNases from other proteins with an accuracy of 97.35%, a result as good as that of the known HMM techniques. Furthermore, the MMM-QSAR model predicts the new RNase III with the same accuracy as other classical alignment methods (<xref ref-type="bibr" rid="r2">Agüero-Chapín et al., 2008</xref>).</p>
		</sec>
		<sec>
			<title>Conclusion</title>
				<p>Graphical probabilistic models are of great importance in Systems Biology, especially in analyzing and modeling biological networks. Bayesian networks have applications in almost every field of life science, ranging from gene expression analysis and genetic/metabolic network analysis to pathway modeling. Gaussian networks are applied to analyze interaction networks such as protein-protein, gene-gene and gene-protein networks, and pathway modeling is also done with this method. Maximum likelihood is used in phylogenetic estimation, the study of genetic cross-over, pathway modeling and gene expression analysis. Density estimation is useful for certain immunological or clinical trials, metabolic network analysis and pathway modeling. Helmholtz machines (HM) are used in studying the metabolic activities of the brain and nervous system. Latent variable models (LVM) are used for studying regulatory networks, pathway modeling and gene expression profiles. Generative topographic mapping (GTM) is used in microarray analysis, gene expression level analysis and pathway modeling. Lastly, hidden Markov models (HMM) are used in protein structure analysis, sequence analysis, metabolic pathway analysis, gene expression analysis and promoter region identification.</p>
		</sec>	
	</body>
	<back>
		   <ref-list>
			<title>References</title>
			<ref id="r1">
				<citation citation-type="confproc">
					<person-group>
						<name>
							<surname>Agarwal</surname>
							<given-names>R</given-names>
						</name>
						<name>
							<surname>Bayardo</surname>
							<given-names>RJ</given-names>
						</name>
						<name>
							<surname>Srikant</surname>
							<given-names>R</given-names>
						</name>									
					</person-group>
					<year>2000</year>
					<article-title>Athena: Mining-Based Interactive Management of Text Database</article-title>
					<conf-name>Proceedings of the 7th International Conference on Extending Database Technology</conf-name>
					<source>Advances in Database Technology</source>					
					<fpage>365</fpage>
					<lpage>379</lpage>
				</citation>
			</ref>
			<ref id="r2">
				<citation citation-type="journal">
					<person-group>
						<name>
							<surname>Agüero-Chapín</surname>
							<given-names>G</given-names>
						</name>
						<name>
							<surname>Gonzalez-Díaz</surname>
							<given-names>H</given-names>
						</name>
						<name>
							<surname>de la Riva</surname>
							<given-names>G</given-names>
						</name>
						<name>
							<surname>Rodríguez</surname>
							<given-names>E</given-names>
							</name>
							<name>
							<surname>Sanchez-Rodríguez</surname>
							<given-names>A</given-names>					
						</name><etal/>					
					</person-group>
					<year>2008</year>
					<article-title>MMM-QSAR recognition of ribonucleases without alignment: comparison with an HMM model and isolation from Schizosaccharomyces pombe, prediction, and experimental assay of a new sequence</article-title>
					<source>J Chem Inf Model</source>
					<volume>48</volume>
					<fpage>434</fpage>
					<lpage>448</lpage>
				</citation>
			</ref>
			<ref id="r3">
				<citation citation-type="journal">
					<person-group>
						<name>
							<surname>Beyene</surname>
							<given-names>J</given-names>
						</name>
						<name>
							<surname>Tritchler</surname>
							<given-names>D</given-names>
						</name>
						<name>
							<surname>Bull</surname>
							<given-names>SB</given-names>
						</name>
						<name>
							<surname>Cartier</surname>
							<given-names>KC</given-names>
							</name>
							<name>
							<surname>Jonasdottir</surname>
							<given-names>G</given-names>					
						</name><etal/>					
					</person-group>
					<year>2007</year>
					<article-title>Multivariate analysis of complex gene expression and clinical phenotypes with genetic marker data</article-title>
					<source>Genet Epidemiol</source>
					<volume>31</volume>
					<supplement>Suppl 1</supplement>
					<fpage>S103</fpage>
					<lpage>S109</lpage>
				</citation>
			</ref>
			<ref id="r4">
				<citation citation-type="journal">
					<person-group>
						<name>
							<surname>Burger</surname>
							<given-names>L</given-names>
						</name>
						<name>
							<surname>van Nimwegen</surname>
							<given-names>E</given-names>
						</name>								
					</person-group>
					<year>2008</year>
					<article-title>Accurate prediction of protein-protein interactions from sequence alignments using a Bayesian method</article-title>
					<source>Mol Syst Biol</source>
					<volume>4</volume>
					<fpage>165</fpage>				
				</citation>
			</ref>
			<ref id="r5">
				<citation citation-type="journal">
					<person-group>
						<name>
							<surname>Cho</surname>
							<given-names>SY</given-names>
						</name>
						<name>
							<surname>Chung</surname>
							<given-names>M</given-names>
						</name>
						<name>
							<surname>Park</surname>
							<given-names>M</given-names>
						</name>	
						<name>
							<surname>Park</surname>
							<given-names>S</given-names>
						</name>	
						<name>
							<surname>Lee</surname>
							<given-names>YS</given-names>
						</name>									
					</person-group>
					<year>2008</year>
					<article-title>ZIFIBI: Prediction of DNA binding sites for zinc finger proteins</article-title>
					<source>Biochem Biophys Res Commun</source>
					<volume>369</volume>
					<fpage>845</fpage>
					<lpage>848</lpage>					
				</citation>
			</ref>
			<ref id="r6">
				<citation citation-type="book">
					<person-group>
						<name>
							<surname>Cormen</surname>
							<given-names>TH</given-names>
						</name>
						<name>
							<surname>Leiserson</surname>
							<given-names>CE</given-names>
						</name>
						<name>
							<surname>Rivest</surname>
							<given-names>RL</given-names>
						</name>
						<name>
							<surname>Stein</surname>
							<given-names>C</given-names>
						</name>
					</person-group>
					<year>2001</year>
					<source>Introduction to Algorithms</source>
					<publisher-name>McGraw-Hill</publisher-name>
				</citation>
			</ref>
			<ref id="r7">
				<citation citation-type="journal">
					<person-group>
						<name>
							<surname>DeCaprio</surname>
							<given-names>D</given-names>
						</name>
						<name>
							<surname>Vinson</surname>
							<given-names>JP</given-names>
						</name>
						<name>
							<surname>Pearson</surname>
							<given-names>MD</given-names>
						</name>
						<name>
							<surname>Montgomery</surname>
							<given-names>P</given-names>
						</name>
						<name>
							<surname>Doherty</surname>
							<given-names>M</given-names>
						</name><etal/>
					</person-group>
					<year>2007</year>
					<article-title>Conrad: gene prediction using conditional random fields</article-title>
					<source>Genome Res</source>
					<volume>17</volume>
					<fpage>1389</fpage>
					<lpage>1398</lpage>
				</citation>
			</ref>
			<ref id="r8">
				<citation citation-type="confproc">
					<person-group>
						<name>
							<surname>Demetrescu</surname>
							<given-names>C</given-names>
						</name>
						<name>
							<surname>Emiliozzi</surname>
							<given-names>S</given-names>
						</name>
						<name>
							<surname>Italiano</surname>
							<given-names>GF</given-names>
						</name>									
					</person-group>
					<year>2004</year>
					<article-title>Experimental analysis of dynamic all pairs shortest path algorithms</article-title>
					<conf-name>Proceedings of the 15th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA’04)</conf-name>
				</citation>
			</ref>
			<ref id="r9">
				<citation citation-type="journal">
					<person-group>
						<name>
							<surname>Demetrescu</surname>
							<given-names>C</given-names>
						</name>
						<name>
							<surname>Finocchi</surname>
							<given-names>I</given-names>
						</name>
						<name>
							<surname>Italiano</surname>
							<given-names>GF</given-names>
						</name>									
					</person-group>
					<year>2003</year>
					<article-title>Algorithm engineering</article-title>
					<source>Bulletin of the EATCS</source>
					<volume>79</volume>
					<fpage>48</fpage>
					<lpage>63</lpage>
				</citation>
			</ref>
			<ref id="r10">
				<citation citation-type="journal">
					<person-group>
						<name>
							<surname>Deng</surname>
							<given-names>X</given-names>
						</name>
						<name>
							<surname>Geng</surname>
							<given-names>H</given-names>
						</name>
						<name>
							<surname>Ali</surname>
							<given-names>HH</given-names>
						</name>									
					</person-group>
					<year>2006</year>
					<article-title>Joint learning of gene functions—a Bayesian network model approach</article-title>
					<source>J Bioinform Comput Biol</source>
					<volume>4</volume>
					<fpage>217</fpage>
					<lpage>239</lpage>
				</citation>
			</ref>
			<ref id="r11">
				<citation citation-type="confproc">
					<person-group>
						<name>
							<surname>Estivill-Castro</surname>
							<given-names>V</given-names>
						</name>
						<name>
							<surname>Houle</surname>
							<given-names>ME</given-names>
						</name>													
					</person-group>
					<year>2001</year>
					<article-title>Data Structures for Minimization of Total Within-Group Distance for Spatiotemporal Clustering</article-title>
					<conf-name>Proceedings of the 5th European Conference on Principles of Data Mining and Knowledge Discovery</conf-name>
					<fpage>91</fpage>
					<lpage>102</lpage>
				</citation>
			</ref>
			<ref id="r12">
				<citation citation-type="confproc">
					<person-group>
						<name>
							<surname>Estivill-Castro</surname>
							<given-names>V</given-names>
						</name>
						<name>
							<surname>Houle</surname>
							<given-names>ME</given-names>
						</name>								
					</person-group>					
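					<person-group person-group-type="editor">
						<name>
							<surname>Fong</surname>
							<given-names>J</given-names>
						</name>
						<name>
							<surname>Ng</surname>
							<given-names>M</given-names>
						</name>
					</person-group>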
					<year>2001</year>
					<article-title>Fast minimization of total within-group distance</article-title>
					<conf-name>Proceedings of the International Workshop on Mining Spatial and Temporal Data in conjunction with the fifth Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD-2001)</conf-name>					
					<fpage>72</fpage>
					<lpage>81</lpage>
				</citation>
			</ref>
			<ref id="r13">
				<citation citation-type="journal">
					<person-group>
						<name>
							<surname>Goutsias</surname>
							<given-names>J</given-names>
						</name>							
					</person-group>
					<year>2006</year>
					<article-title>A hidden Markov model for transcriptional regulation in single cells</article-title>
					<source>IEEE/ACM Trans Comput Biol Bioinform</source>
					<volume>3</volume>
					<fpage>57</fpage>
					<lpage>71</lpage>
				</citation>
			</ref>
			<ref id="r14">
				<citation citation-type="journal">
					<person-group>
						<name>
							<surname>Gupta</surname>
							<given-names>M</given-names>
						</name>
						<name>
							<surname>Qu</surname>
							<given-names>P</given-names>
						</name>
						<name>
							<surname>Ibrahim</surname>
							<given-names>JG</given-names>
						</name>							
					</person-group>
					<year>2007</year>
					<article-title>A temporal hidden Markov regression model for the analysis of gene regulatory networks</article-title>
					<source>Biostatistics</source>
					<volume>8</volume>
					<fpage>805</fpage>
					<lpage>820</lpage>
				</citation>
			</ref>
			<ref id="r15">
				<citation citation-type="book">
					<person-group>
						<name>
							<surname>Han</surname>
							<given-names>J</given-names>
						</name>
						<name>
							<surname>Kamber</surname>
							<given-names>M</given-names>
						</name>											
					</person-group>
					<year>2000</year>
					<source>Data mining: concepts and techniques</source>
					<publisher-name>Morgan Kaufmann Publishers Inc</publisher-name>				
				</citation>
			</ref>
			<ref id="r16">
				<citation citation-type="journal">
					<person-group>
						<name>
							<surname>Hall</surname>
							<given-names>LO</given-names>
						</name>
						<name>
							<surname>Özyurt</surname>
							<given-names>IB</given-names>
						</name>
						<name>
							<surname>Bezdek</surname>
							<given-names>JC</given-names>
						</name>								
					</person-group>
					<year>1999</year>
					<article-title>Clustering with a genetically optimized approach</article-title>
					<source>IEEE Transactions on Evolutionary Computation</source>
					<volume>3</volume>
					<fpage>103</fpage>
					<lpage>112</lpage>
				</citation>
			</ref>
			<ref id="r17">
				<citation citation-type="confproc">
					<person-group>
						<name>
							<surname>Hu</surname>
							<given-names>S</given-names>
						</name>											
					</person-group>
					<year>2004</year>
					<article-title>Optimal time points sampling in pathway modeling</article-title>
					<source>Conf Proc IEEE Eng Med Biol Soc</source>
					<volume>1</volume>
					<fpage>671</fpage>
					<lpage>674</lpage>
				</citation>
			</ref>
			<ref id="r18">
				<citation citation-type="journal">
					<person-group>
						<name>
							<surname>Jin</surname>
							<given-names>S</given-names>
						</name>
						<name>
							<surname>Zhang</surname>
							<given-names>Y</given-names>
						</name>
						<name>
							<surname>Yi</surname>
							<given-names>F</given-names>
						</name>
						<name>
							<surname>Li</surname>
							<given-names>PL</given-names>
							</name>										
					</person-group>
					<year>2008</year>
					<article-title>Critical role of lipid raft redox signaling platforms in endostatin-induced coronary endothelial dysfunction</article-title>
					<source>Arterioscler Thromb Vasc Biol</source>
					<volume>28</volume>
					<fpage>485</fpage>
					<lpage>490</lpage>
				</citation>
			</ref>
			<ref id="r19">
				<citation citation-type="journal">
					<person-group>
						<name>
							<surname>Joung</surname>
							<given-names>JG</given-names>
						</name>
						<name>
							<surname>Shin</surname>
							<given-names>D</given-names>
						</name>
						<name>
							<surname>Seong</surname>
							<given-names>RH</given-names>
						</name>
						<name>
							<surname>Zhang</surname>
							<given-names>BT</given-names>
							</name>										
					</person-group>
					<year>2006</year>
					<article-title>Identification of regulatory modules by co-clustering latent variable models: stem cell differentiation</article-title>
					<source>Bioinformatics</source>
					<volume>22</volume>
					<fpage>2005</fpage>
					<lpage>2011</lpage>
				</citation>
			</ref>
			<ref id="r20">
				<citation citation-type="journal">
					<person-group>
						<name>
							<surname>Justenhoven</surname>
							<given-names>C</given-names>
						</name>
						<name>
							<surname>Hamann</surname>
							<given-names>U</given-names>
						</name>
						<name>
							<surname>Schubert</surname>
							<given-names>F</given-names>
						</name>
						<name>
							<surname>Zapatka</surname>
							<given-names>M</given-names>
						</name>
						<name>
							<surname>Pierl</surname>
							<given-names>CB</given-names>
						</name><etal/>
					</person-group>
					<year>2008</year>
					<article-title>Breast cancer: a candidate gene approach across the estrogen metabolic pathway</article-title>
					<source>Breast Cancer Res Treat</source>
					<volume>108</volume>
					<fpage>137</fpage>
					<lpage>149</lpage>
				</citation>
			</ref>
			<ref id="r21">
				<citation citation-type="journal">
					<person-group>
						<name>
							<surname>Levitsky</surname>
							<given-names>VG</given-names>
						</name>
						<name>
							<surname>Ignatieva</surname>
							<given-names>EV</given-names>
						</name>
						<name>
							<surname>Ananko</surname>
							<given-names>EA</given-names>
						</name>
						<name>
							<surname>Turnaev</surname>
							<given-names>II</given-names>													
						</name>	
						<name>
							<surname>Merkulova</surname>
							<given-names>TI</given-names>
						</name><etal/>										
					</person-group>
					<year>2007</year>
					<article-title>Effective transcription factor binding site prediction using a combination of optimization, a genetic algorithm and discriminant analysis to capture distant interactions</article-title>
					<source>BMC Bioinformatics</source>
					<volume>8</volume>
					<fpage>481</fpage>				
				</citation>
			</ref>
			<ref id="r22">
				<citation citation-type="journal">
					<person-group>
						<name>
							<surname>Li</surname>
							<given-names>P</given-names>
						</name>
						<name>
							<surname>Zhang</surname>
							<given-names>C</given-names>
						</name>
						<name>
							<surname>Perkins</surname>
							<given-names>EJ</given-names>
						</name>
						<name>
							<surname>Gong</surname>
							<given-names>P</given-names>
							</name>	
							<name>
							<surname>Deng</surname>
							<given-names>Y</given-names>
							</name>									
					</person-group>
					<year>2007</year>
					<article-title>Comparison of probabilistic Boolean network and dynamic Bayesian network approaches for inferring gene regulatory networks</article-title>
					<source>BMC Bioinformatics</source>
					<volume>8</volume>
					<supplement>Suppl 7</supplement>
					<fpage>S13</fpage>				
				</citation>
			</ref>
			<ref id="r23">
				<citation citation-type="journal">
					<person-group>
						<name>
							<surname>Marashi</surname>
							<given-names>SA</given-names>
						</name>
						<name>
							<surname>Kargar</surname>
							<given-names>M</given-names>
						</name>
						<name>
							<surname>Katanforoush</surname>
							<given-names>A</given-names>
						</name>
						<name>
							<surname>Abolhassani</surname>
							<given-names>H</given-names>
							</name>	
							<name>
							<surname>Sadeghi</surname>
							<given-names>M</given-names>
							</name>									
					</person-group>
					<year>2007</year>
					<article-title>Evolution of ‘ligand-diffusion chreodes’ on protein-surface models: a genetic-algorithm study</article-title>
					<source>Chem Biodivers</source>
					<volume>4</volume>
					<fpage>2766</fpage>	
					<lpage>2771</lpage>				
				</citation>
			</ref>
			<ref id="r24">
				<citation citation-type="journal">
					<person-group>
						<name>
							<surname>McBryde</surname>
							<given-names>ES</given-names>
						</name>
						<name>
							<surname>Pettitt</surname>
							<given-names>AN</given-names>
						</name>
						<name>
							<surname>Cooper</surname>
							<given-names>BS</given-names>
						</name>
						<name>
							<surname>McElwain</surname>
							<given-names>DL</given-names>
							</name>															
					</person-group>
					<year>2007</year>
					<article-title>Characterizing an outbreak of vancomycin-resistant enterococci using hidden Markov models</article-title>
					<source>J R Soc Interface</source>
					<volume>4</volume>
					<fpage>745</fpage>	
					<lpage>754</lpage>				
				</citation>
			</ref>
			<ref id="r25">
				<citation citation-type="journal">
					<person-group>
						<name>
							<surname>McKinney</surname>
							<given-names>BA</given-names>
						</name>
						<name>
							<surname>Crowe</surname>
							<given-names>JE</given-names>
						</name>
						<name>
							<surname>Voss</surname>
							<given-names>HU</given-names>
						</name>
						<name>
							<surname>Crooke</surname>
							<given-names>PS</given-names>
							</name>
						<name>
							<surname>Barney</surname>
							<given-names>N</given-names>
						</name><etal/>																
					</person-group>
					<year>2006</year>
					<article-title>Hybrid grammar-based approach to nonlinear dynamical system identification from biological time series</article-title>
					<source>Phys Rev E Stat Nonlin Soft Matter Phys</source>
					<volume>73</volume>
					<fpage>021912</fpage>							
				</citation>
			</ref>
			<ref id="r26">
				<citation citation-type="journal">
					<person-group>
						<name>
							<surname>Mirabeau</surname>
							<given-names>O</given-names>
						</name>
						<name>
							<surname>Perlas</surname>
							<given-names>E</given-names>
						</name>
						<name>
							<surname>Severini</surname>
							<given-names>C</given-names>
						</name>
						<name>
							<surname>Audero</surname>
							<given-names>E</given-names>
							</name>
						<name>
							<surname>Gascuel</surname>
							<given-names>O</given-names>
						</name><etal/>																
					</person-group>
					<year>2007</year>
					<article-title>Identification of novel peptide hormones in the human proteome by hidden Markov model screening</article-title>
					<source>Genome Res</source>
					<volume>17</volume>
					<fpage>320</fpage>
					<lpage>327</lpage>							
				</citation>
			</ref>
			<ref id="r27">
				<citation citation-type="journal">
					<person-group>
						<name>
							<surname>Miranda-Saavedra</surname>
							<given-names>D</given-names>
						</name>
						<name>
							<surname>Barton</surname>
							<given-names>GJ</given-names>
						</name>																				
					</person-group>
					<year>2007</year>
					<article-title>Classification and functional annotation of eukaryotic protein kinases</article-title>
					<source>Proteins</source>
					<volume>68</volume>
					<fpage>893</fpage>
					<lpage>914</lpage>							
				</citation>
			</ref>
			<ref id="r28">
				<citation citation-type="journal">
					<person-group>
						<name>
							<surname>Pe'er</surname>
							<given-names>D</given-names>
						</name>																							
					</person-group>
					<year>2005</year>
					<article-title>Bayesian network analysis of signaling networks: a primer</article-title>
					<source>Sci STKE</source>
					<volume>2005</volume>
					<fpage>pl4</fpage>
				</citation>
			</ref>
			<ref id="r29">
				<citation citation-type="journal">
					<person-group>
						<name>
							<surname>Sgourakis</surname>
							<given-names>NG</given-names>
						</name>
						<name>
							<surname>Bagos</surname>
							<given-names>PG</given-names>
						</name>
						<name>
							<surname>Papasaikas</surname>
							<given-names>PK</given-names>
						</name>	
						<name>
							<surname>Hamodrakas</surname>
							<given-names>SJ</given-names>
						</name>																									
					</person-group>
					<year>2005</year>
					<article-title>A method for the prediction of GPCRs coupling specificity to G-proteins using refined profile Hidden Markov Models</article-title>
					<source>BMC Bioinformatics</source>
					<volume>6</volume>
					<fpage>104</fpage>										
				</citation>
			</ref>
			<ref id="r30">
				<citation citation-type="journal">
					<person-group>
						<name>
							<surname>Stayrook</surname>
							<given-names>KR</given-names>
						</name>
						<name>
							<surname>Rogers</surname>
							<given-names>PM</given-names>
						</name>
						<name>
							<surname>Savkur</surname>
							<given-names>RS</given-names>
						</name>	
						<name>
							<surname>Wang</surname>
							<given-names>Y</given-names>
						</name>	
						<name>
							<surname>Su</surname>
							<given-names>C</given-names>
						</name><etal/>																									
					</person-group>
					<year>2008</year>
					<article-title>Regulation of human 3 alpha-hydroxysteroid dehydrogenase (AKR1C4) expression by the liver X receptor alpha</article-title>
					<source>Mol Pharmacol</source>
					<volume>73</volume>
					<fpage>607</fpage>	
					<lpage>612</lpage>										
				</citation>
			</ref>
			<ref id="r31">
				<citation citation-type="journal">
					<person-group>
						<name>
							<surname>Tonella</surname>
							<given-names>P</given-names>
						</name>																													
					</person-group>
					<year>2001</year>
					<article-title>Concept analysis for module restructuring</article-title>
					<source>IEEE Transactions on Software Engineering</source>
					<volume>27</volume>
					<fpage>351</fpage>	
					<lpage>363</lpage>										
				</citation>
			</ref>
</ref-list>		 
	</back>
</article>
