A fundamental need for mathematical frameworks that can compare multiple large-scale matrices

For example, comparative analyses of global mRNA expression from multiple model organisms promise to enhance fundamental understanding of the universality and specialization of molecular biological mechanisms, and may prove useful in medical diagnosis, treatment and drug design. Existing algorithms limit analyses to subsets of homologous genes among the different organisms, effectively introducing into the analysis the assumption that sequence and functional similarities are equivalent. However, it is well known that this assumption does not always hold, for example in cases of nonorthologous gene displacement, when nonorthologous proteins in different organisms fulfill the same function.

Sequence-independent comparisons require mathematical frameworks that can distinguish and separate the similar from the dissimilar among multiple large-scale datasets tabulated as matrices with different row dimensions, corresponding to the different sets of genes of the different organisms. The only such framework to date, the generalized singular value decomposition (GSVD), is limited to two matrices. It was shown that the GSVD provides a mathematical framework for sequence-independent comparative modeling of DNA microarray data from two organisms, where the mathematical variables and operations represent biological reality. The variables, significant subspaces that are common to both or exclusive to either one of the datasets, correlate with cellular programs that are conserved in both or unique to either one of the organisms, respectively. The operation of reconstruction in the subspaces common to both datasets outlines the biological similarity in the regulation of the cellular programs that are conserved across the species. Reconstruction in the common and exclusive subspaces of either dataset outlines the differential regulation of the conserved relative to the unique programs in the corresponding organism.
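To make the two-matrix case concrete, here is a minimal NumPy sketch of a GSVD-style factorization of two matrices that share a column dimension but differ in row dimension. It is an illustration under stated assumptions, not the authors' implementation: the function name `gsvd_pair` and the toy data are hypothetical, both matrices are assumed to have full column rank, and the shared basis is obtained from the eigenvectors of a quotient of Gram matrices rather than from a dedicated GSVD routine. The "angular distance" used to label a subspace as common or exclusive is one conventional measure and is likewise an assumption here.

```python
import numpy as np

def gsvd_pair(d1, d2):
    """Factor d1 = u1 @ diag(s1) @ v.T and d2 = u2 @ diag(s2) @ v.T with a shared v.

    Assumes both matrices have the same number of columns and full column rank.
    """
    a1, a2 = d1.T @ d1, d2.T @ d2
    # Columns of v are eigenvectors of A1 A2^{-1}; its eigenvalues are s1^2 / s2^2.
    _, v = np.linalg.eig(a1 @ np.linalg.inv(a2))
    v = np.real(v)  # the eigensystem is real in exact arithmetic
    v_inv_t = np.linalg.inv(v).T
    b1, b2 = d1 @ v_inv_t, d2 @ v_inv_t
    s1 = np.linalg.norm(b1, axis=0)  # significance of each subspace in dataset 1
    s2 = np.linalg.norm(b2, axis=0)  # significance of each subspace in dataset 2
    return b1 / s1, s1, b2 / s2, s2, v

# Hypothetical toy data: 60 "genes" and 80 "genes" measured over 4 shared columns.
rng = np.random.default_rng(0)
u1_true, _ = np.linalg.qr(rng.standard_normal((60, 4)))
u2_true, _ = np.linalg.qr(rng.standard_normal((80, 4)))
v_true = np.eye(4) + 0.2 * rng.standard_normal((4, 4))
s1_true = np.array([5.0, 3.0, 2.0, 1.0])
s2_true = np.array([1.0, 2.0, 3.0, 5.0])
d1 = (u1_true * s1_true) @ v_true.T
d2 = (u2_true * s2_true) @ v_true.T

u1, s1, u2, s2, v = gsvd_pair(d1, d2)

# Angular distance: near 0 means equally significant in both datasets (common),
# near +pi/4 means exclusive to dataset 1, near -pi/4 exclusive to dataset 2.
theta = np.arctan(s1 / s2) - np.pi / 4
print(np.round(theta, 3))
```

Note that no mapping between the rows of `d1` and `d2` is used anywhere: the two datasets are tied together only through the shared right basis `v`.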
Recent experimental results verify a computationally predicted genome-wide mode of regulation that correlates DNA replication origin activity with mRNA expression, demonstrating that GSVD modeling of DNA microarray data can be used to correctly predict previously unknown cellular mechanisms.

Unlike existing algorithms, a mapping among the genes of these disparate organisms is not required. We find that the common higher-order GSVD (HO GSVD) subspace represents the cell-cycle mRNA expression oscillations, which are similar among the datasets. Simultaneous reconstruction in this common subspace therefore removes the experimental artifacts, which are dissimilar, from the datasets. Simultaneous sequence-independent classification of the genes of the three organisms in the common subspace is in agreement with previous classifications into cell-cycle phases. Notably, genes of highly conserved sequences across the three organisms but significantly different cell-cycle peak times, such as genes from the ABC transporter superfamily, phospholipase B-encoding genes and even the B cyclin-encoding genes, are correctly classified.
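The idea of a common subspace and of simultaneous reconstruction in it can be sketched for more than two matrices as follows. This is a hedged illustration, not the published code: the names `ho_gsvd` and `reconstruct` and the synthetic data are hypothetical, and the sketch assumes the construction in which the shared basis comes from the eigenvectors of the average of all pairwise Gram-matrix quotients, with an eigenvalue of 1 marking a subspace of equal significance in every dataset. All matrices are assumed to have full column rank.

```python
import numpy as np
from itertools import combinations

def ho_gsvd(datasets):
    """Factor each d_i as u_i @ diag(sigma_i) @ v.T with one shared v (a sketch)."""
    n = len(datasets)
    grams = [d.T @ d for d in datasets]
    invs = [np.linalg.inv(a) for a in grams]
    cols = datasets[0].shape[1]
    s = np.zeros((cols, cols))
    # Average of A_i A_j^{-1} + A_j A_i^{-1} over all pairs i < j.
    for i, j in combinations(range(n), 2):
        s += grams[i] @ invs[j] + grams[j] @ invs[i]
    s /= n * (n - 1)
    lam, v = np.linalg.eig(s)
    lam, v = np.real(lam), np.real(v)
    order = np.argsort(lam)          # eigenvalues closest to 1 come first
    lam, v = lam[order], v[:, order]
    v_inv_t = np.linalg.inv(v).T
    us, sigmas = [], []
    for d in datasets:
        b = d @ v_inv_t              # b = u * sigma, column by column
        sig = np.linalg.norm(b, axis=0)
        us.append(b / sig)
        sigmas.append(sig)
    return us, sigmas, v, lam

def reconstruct(u, sigma, v, keep):
    """Rebuild one dataset from the selected subspaces only."""
    return (u[:, keep] * sigma[keep]) @ v[:, keep].T

# Hypothetical data: three "organisms" with 50, 70 and 90 genes over 4 shared columns.
rng = np.random.default_rng(1)
v_true = np.eye(4) + 0.2 * rng.standard_normal((4, 4))
sig_true = [np.array([3.0, 1.0, 5.0, 1.0]),
            np.array([3.0, 2.0, 1.0, 1.0]),
            np.array([3.0, 4.0, 1.0, 6.0])]   # first component is equally significant
u_true = [np.linalg.qr(rng.standard_normal((rows, 4)))[0] for rows in (50, 70, 90)]
datasets = [(u * s) @ v_true.T for u, s in zip(u_true, sig_true)]

us, sigmas, v, lam = ho_gsvd(datasets)
common = np.flatnonzero(np.isclose(lam, 1.0, atol=1e-6))
common_parts = [reconstruct(u, s, v, common) for u, s in zip(us, sigmas)]
print("eigenvalues:", np.round(lam, 4))
print("common subspace indices:", common)
```

In this toy setup, keeping only the subspaces whose eigenvalue is numerically 1 retains the pattern shared by all three matrices and discards the components that differ among them, which mirrors the role the text assigns to reconstruction in the common subspace.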
