Source: UNIVERSITY OF ILLINOIS submitted to
INTERPRETING CATTLE GENOMIC DATA: BIOLOGY, APPLICATIONS AND OUTREACH
Sponsoring Institution
National Institute of Food and Agriculture
Project Status
TERMINATED
Funding Source
Reporting Frequency
Annual
Accession No.
0197783
Grant No.
(N/A)
Project No.
ILLU-538-354
Proposal No.
(N/A)
Multistate No.
NC-1010
Program Code
(N/A)
Project Start Date
Oct 1, 2002
Project End Date
Sep 30, 2007
Grant Year
(N/A)
Project Director
Rodriguez-Zas, S. L.
Recipient Organization
UNIVERSITY OF ILLINOIS
2001 S. Lincoln Ave.
URBANA,IL 61801
Performing Department
ANIMAL SCIENCES
Non Technical Summary
The study of gene expression profiles presents technical, biological and analytical challenges. We aim to offer solutions that will enhance the characterization of gene expression patterns.
Animal Health Component
(N/A)
Research Effort Categories
Basic
100%
Applied
(N/A)
Developmental
(N/A)
Classification

Knowledge Area (KA) | Subject of Investigation (SOI) | Field of Science (FOS) | Percent
304 | 3499 | 1080 | 100%
Knowledge Area
304 - Animal Genome;

Subject Of Investigation
3499 - Dairy cattle, general/other;

Field Of Science
1080 - Genetics;
Goals / Objectives
1. Determine the location, structure, function and expression of genes affecting health, reproduction, production, and product quality in cattle. 2. Interpret and apply genomics and proteomics information by developing statistical/bioinformatics methods and utilizing molecular tools in cattle. 3. Develop and deliver educational materials about bovine genomics research to consumers and stakeholders.
Project Methods
Our research addresses NC-1010 objective 1 through the study of the expression of thousands of genes in multiple cattle tissues. Objective 2 is addressed in two ways: we implement statistical methods suited to the data structure observed in gene expression studies, and we use bioinformatic tools to enhance the interpretation of the results. In particular, we recorded transcript levels from rumen, large intestine, small intestine and reference samples on cDNA microarrays including 7,653 cattle and multiple control sequences. Intensity measurements from a total of six arrays in a dye-swap experiment were analyzed. After data filtering and transformation, the remaining variation was analyzed with a linear mixed effects model. A total of 661 sequences were significant at P < 0.0001. The gene expression profiles of rumen, small and large intestine confirmed known differences in function among the tissues and uncovered novel relationships among genes.
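As a rough check on the P < 0.0001 threshold reported above, the following sketch computes how many sequences would be expected to pass by chance alone. It assumes one test per sequence and 7,653 tests in total, which is a simplification (control sequences and any filtered features are ignored); the function and variable names are illustrative and do not come from the project.

```python
def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of tests passing the threshold if every null hypothesis were true."""
    return n_tests * alpha


n_sequences = 7653    # assumption: one test per cattle sequence on the array
alpha = 0.0001        # per-sequence significance threshold quoted in the text
n_significant = 661   # significant sequences reported in the text

expected = expected_false_positives(n_sequences, alpha)
# Approximate share of the reported hits that chance alone could account for.
chance_share = expected / n_significant
print(f"expected by chance: {expected:.2f}; share of reported hits: {chance_share:.4f}")
```

Under these assumptions fewer than one sequence would be expected to pass the threshold by chance, which is small relative to the 661 sequences reported.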

Progress 10/01/02 to 09/30/07

Outputs
OUTPUTS: The outputs of our work can be grouped into two main categories: 1) Determination of the function and expression of genes affecting health and reproduction in cattle, and 2) Statistical and bioinformatics tools that provide accurate and precise results, thus supporting the effective application of genomics and proteomics information to the improvement of U.S. agriculture. Among the outputs associated with the determination of the function and expression of genes affecting cattle health, we identified 218 cDNAs that were differentially expressed across diets (NRC requirements, restricted energy, ad libitum diets during the dry period), 66 cDNAs that were differentially expressed across time-points (-65, -30, -14, 0, 14, 27, 49 days peripartum) and 1,437 cDNAs that were differentially expressed across diet by time-point interaction levels. Among these, multiple cDNAs mapped to genes with transcription factor, metabolic, oxidoreductase activity and inflammatory response function were differentially expressed across time-points and diets. More than 29 cDNAs with nucleic acid or nucleotide (DNA, RNA, ATP, GTP, etc.) binding and transcription factor Gene Ontology function and 8 cDNAs with oxidoreductase activity had significant differences in expression across time-points and diets. When considering the biological processes, 4 cDNAs with innate immune-response, 18 cDNAs that regulate transcription, 9 cDNAs with lipid metabolism and 8 cDNAs with carbohydrate metabolism function were differentially expressed across time-points and diets. Specific bioinformatic outputs include: a) identification of a comprehensive yet parsimonious linear mixed effects model that appropriately described the complex structure of this data set, b) objective evaluation of the performance of bovine cDNA and long-oligonucleotide microarray platforms, c) assessment of the impact of various sources of variation, including subject-to-subject variation, array variation, and treatment differences on microarray experiments.
Both sets of outputs are closely connected because the study of approaches that allow the simultaneous modeling of multiple sources of variation in gene expression microarray studies was key to identifying genes with linear and non-linear expression patterns across lactation stages and diets. We discovered that diet and lactation stage can have both additive and synergistic impact on the transcriptome of cattle. The same bioinformatics tools were also critical to obtaining an objective comparison of the performance of two commonly used types of gene expression microarray platforms, oligo and cDNA two-dye spotted microarrays. Results from another study demonstrated that for the majority of the genes, oligo and cDNA probes offered comparable and correlated measurements of gene expression. The higher level of variation observed in oligo-based microarray measurements is likely to be associated with more variable hybridization of the target to oligo probes. PARTICIPANTS: Sandra Rodriguez-Zas and Harris Lewin. TARGET AUDIENCES: Scientists doing gene expression experiments and researchers responsible for modeling and analysis of microarray data.

Impacts
Microarray technology is a well-established tool to simultaneously measure the expression of thousands of different mRNA molecules. The impact of our research is manifold. First, our study demonstrated that there is good agreement between the cDNA and oligo bovine microarray platforms studied. The consistency of the results can be further enhanced by concentrating on the results from elements with high differential expression between samples. The differences between platforms may be associated with the differences in the actual genomic sequences portrayed. Data from complementary technologies (e.g. RT-PCR) could help identify whether there is a preferred platform across all elements or whether cDNA and oligo reporters have complementary features that can benefit from the simultaneous consideration of results from both platforms. Second, by using complex models we minimized the bias and maximized the accuracy of our results about the association between physiological stage, nutrition level and gene expression in the bovine liver. Third, a better understanding of the genes with differential expression across physiological stages and diets and a characterization of the patterns of gene expression in cattle were obtained. Our results further enhance the understanding of hepatic function and can be used as a starting point for more focused studies of the cattle liver transcriptome, metabolome, and proteome. The results from these studies, in turn, can be used to develop management and nutrition protocols that will optimize the function of the liver and thus bovine production and health. Liver function has a major influence on production, growth, health, and many other traits of importance for the cattle industry. The liver controls food metabolism, toxin filtering, drug processing, and active bio-molecule production.
Multiple factors are known to affect the function of the liver; however, there is still an incomplete understanding of the interplay of these factors and their influence on the level of gene expression. The simultaneous study of the effects of multiple factors on the bovine hepatic transcriptome requires the collection of large and complex microarray data sets. Appropriate analyses of these data sets require complex models that account for the numerous sources of variation.

Publications

  • Loor, J.J., Everts, R.E., Bionaz, M., Dann,H.M., Morin, D.E., Oliveira, R., Rodriguez-Zas, S.L., Drackley, J.K. and Lewin, H.A. 2007. Nutrition-induced ketosis alters metabolic and signaling gene networks in liver of periparturient dairy cows. Physiological Genomics 2007 Oct 9 [Epub ahead of print].
  • Ko, Y., Zhai, C. and Rodriguez-Zas, S.L. 2007b. Inference of gene pathways using Gaussian mixture models. In: Proceedings of the 2007 IEEE International Conference on Bioinformatics and Biomedicine. San Jose, CA, Nov 2-4 2007.
  • Ko, Y., Zhai, C. and Rodriguez-Zas, S.L. 2007a. An efficient mixture model approach to characterize gene pathways using Bayesian networks. In: Proceedings of the Joint Statistical Meetings. July 28 - August 2, 2007. Alexandria, VA. American Statistical Association (In Press).
  • Everts, R.E., Sommers, A., Green, C.A., Oliveira, R., Rodriguez-Zas, S.L., Sung, L.-Y., Du, F., Evans, A.C.O., Boland, M., Fair, T., Lonergan, P., Renard, J.P., Yang, X., Tian, X. and Lewin, H.A. 2007. Major differences in gene expression profiles revealed in day-25 trophoblast but not embryonic discs collected from cattle clones. In: Proceedings of International Meeting on Mammalian Embryo Genomics, October 17-20, Paris, France.
  • Adams, H.A., Rodriguez-Zas, S.L. and Southey, B.R. 2007. Comparison of meta-analytical approaches for gene expression profiling. In: Proceedings of the Joint Statistical Meetings. Salt Lake City, UT. July 28 - August 2, 2007. American Statistical Association (In Press).


Progress 01/01/06 to 12/31/06

Outputs
We identified a comprehensive yet parsimonious linear mixed effects model that appropriately described the complex structure of this data set. Data: Liver samples were obtained at 7 time points pre- and post-calving (-60, -30, -14, 1, 14, 28, 49 days from parturition) from Holstein multiparous cows receiving one of three diets. Five control cows received NRC requirements during the dry period, 4 cows were fed restricted energy and 4 were fed ad libitum. Gene expression measurements were obtained using a bovine double-spotted 7k cDNA microarray (NCBI GEO platform GPL2108). Method: A model including the fixed classification effects of dye, time-point or days from parturition, nutritional plane or diet (control, ad libitum and restricted intake) was used. The variance-covariance structure of the repeated measurements within a subject (cow) was autoregressive of order 1 to model the correlation between successive measurements. The model also incorporated heterogeneity of variance-covariance structure across time-points and among dyes and duplicated features. Gene expression differentials among diets, time-points and interaction levels were considered significant if the raw P-value was < 10^-4, the absolute fold change among factor levels was at least 2, and the test statistics were based on at least 75% of the experimental units. The stringent P-value threshold was used to account for multiple testing among thousands of cDNAs. The minimum fold threshold ensured that there was a substantial change in gene expression across factor levels. The minimum experimental unit count threshold ensured that the estimates were based on a substantial number of units. Hierarchical and k-means clustering were used to identify cDNAs and samples with common expression patterns across diets and time-point levels. Results from selected cDNAs are being confirmed using qPCR.
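The first-order autoregressive structure mentioned above can be illustrated with a small sketch. It builds the covariance matrix for the seven repeated measurements on one cow, assuming a common variance and a correlation that decays with the number of measurement occasions separating two samples. The variance and correlation values are made up, and the heterogeneity across time-points, dyes and duplicated features described in the text is deliberately left out.

```python
def ar1_covariance(n_times: int, sigma2: float, rho: float) -> list[list[float]]:
    """Return the n_times x n_times AR(1) covariance matrix sigma2 * rho**|i - j|."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return [[sigma2 * rho ** abs(i - j) for j in range(n_times)]
            for i in range(n_times)]


# Seven liver samples per cow, as in the text; sigma2 and rho are hypothetical values.
cov = ar1_covariance(n_times=7, sigma2=1.0, rho=0.5)
for row in cov:
    print(" ".join(f"{v:.3f}" for v in row))
```

The structure needs only two parameters regardless of the number of time points, which is one reason it is often a parsimonious choice for repeated measurements.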
A total of 218 cDNAs were differentially expressed across diets, 66 cDNAs were differentially expressed across time-points and 1,437 cDNAs were differentially expressed across diet by time-point interaction levels. There was complete overlap between the cDNAs differentially expressed for the main effects and interaction. Multiple cDNAs corresponding to genes with transcription factor, metabolic, oxidoreductase activity and inflammatory response function were differentially expressed across time-points and diets. More than 29 cDNAs with nucleic acid or nucleotide (DNA, RNA, ATP, GTP, etc.) binding and transcription factor Gene Ontology function and 8 cDNAs with oxidoreductase activity had significant differences in expression across time-points and diets. When considering the biological processes, 4 cDNAs with innate immune-response, 18 cDNAs that regulate transcription, 9 cDNAs with lipid metabolism and 8 cDNAs with carbohydrate metabolism function were differentially expressed across time-points and diets.
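The three-part significance rule used in this report period (raw P-value, minimum fold change, and minimum share of experimental units) could be sketched as follows. The record layout, the cDNA identifiers and the example values are hypothetical, and the sketch assumes that the 13 cows are the experimental units; none of this is taken from the project's actual analysis code.

```python
from dataclasses import dataclass


@dataclass
class ContrastResult:
    cdna_id: str
    p_value: float      # raw P-value for the contrast
    log2_fold: float    # estimated log2 fold change between two factor levels
    n_units_used: int   # experimental units that contributed to the test


def is_significant(res: ContrastResult, n_units_total: int,
                   p_max: float = 1e-4, min_fold: float = 2.0,
                   min_fraction: float = 0.75) -> bool:
    """Apply all three criteria; a contrast must pass every one of them."""
    fold = 2 ** abs(res.log2_fold)              # fold change irrespective of direction
    coverage = res.n_units_used / n_units_total
    return res.p_value < p_max and fold >= min_fold and coverage >= min_fraction


# Made-up results; 13 = 5 control + 4 restricted + 4 ad libitum cows (assumed units).
results = [
    ContrastResult("cDNA_A", 5e-6, 1.3, 12),    # passes all three criteria
    ContrastResult("cDNA_B", 5e-6, 0.6, 13),    # fold change below 2
    ContrastResult("cDNA_C", 3e-3, 2.0, 13),    # P-value too large
    ContrastResult("cDNA_D", 1e-7, -1.5, 9),    # fewer than 75% of the units
]
selected = [r.cdna_id for r in results if is_significant(r, n_units_total=13)]
print(selected)
```

Requiring all three criteria together is stricter than a P-value cut-off alone, since it also screens out small effects and estimates that rest on few units.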

Impacts
Hepatic function has a major influence on production, growth, health, and many other traits of importance for the cattle industry. The liver controls food metabolism, toxin filtering, drug processing, and active bio-molecule production. Multiple factors are known to affect the function of the liver; however, there is still an incomplete understanding of the interplay of these factors and their influence on the level of gene expression. The simultaneous study of the effects of multiple factors on the bovine hepatic transcriptome requires the collection of large and complex microarray data sets. Appropriate analyses of these data sets require complex models that account for the numerous sources of variation. The impact of our research is twofold. First, by using complex models we minimized the bias and maximized the accuracy of our results about the association between physiological stage, nutrition level and gene expression in the bovine liver. Second, a better understanding of the genes with differential expression across physiological stages and diets and a characterization of the patterns of gene expression in cattle were obtained. Our results further enhance the understanding of liver function and can be used as a starting point for more focused studies of the bovine hepatic transcriptome, metabolome, and proteome. The results from these studies, in turn, can be used to develop management and nutrition protocols that will optimize the function of the liver and thus bovine production and health.

Publications

  • Loor J.J., Dann H.M., Janovick Guretzky, N.A., Everts R.E., Oliveira R., Green C.A., Litherland N.B., Rodriguez-Zas S.L., Lewin H.A. and Drackley J.K. 2006. Plane of nutrition pre-partum alters hepatic gene expression and function in dairy cows as assessed by longitudinal transcript and metabolic profiling. Physiol Genomics. 2006 Jun 6; [Epub ahead of print]. http://physiolgenomics.physiology.org/cgi/reprint/00036.2006v1.
  • Rodriguez-Zas, S.L., Southey, B.R., Whitfield, C.W. and Robinson, G.E. 2006. Semiparametric approach to characterize unique gene expression trajectories across time. BMC Genomics 7(1):233 [Epub ahead of print].
  • Loor, J.J. 2006. Gene expression profiling in bovine liver and mammary gland during the production cycle using cattle-specific microarrays. European Association for Animal Production (EAAP) Antalya, Turkey, September 17-20, 2006 Ph7 # 4.
  • Tegge, A.N., Southey, B.R., Andinet, A., Sweedler, J.V. and Rodriguez-Zas, S.L. 2006. Bioinformatics analysis of bovine neuropeptides. J. Anim. Sci. Vol. 84, Suppl. 1 / J. Dairy Sci. Vol. 89, Suppl. 1. T21 168-169pp.
  • Hong F. and Rodriguez-Zas, S.L. 2006. Bayesian Markov Chain Monte Carlo and Restricted Maximum likelihood study of gene expression patterns across time. 2006 Joint Statistical Meetings Seattle, WA. 62 CC 205.


Progress 01/01/05 to 12/31/05

Outputs
The objectives of this study were to compare the performance of bovine cDNA and long-oligonucleotide platforms, and to evaluate the impact of various sources of variation, including subject-to-subject variation, array variation, and treatment differences. The two platforms studied were a double-spotted 7,872-element cDNA array (Everts et al., 2005) and a double-spotted 13,257-element long (70-mer) oligo array (NCBI GEO GPL2853). This study only considered the 4,791 elements with one-to-one matches between platforms. Two experiments were analyzed. In experiment 1, the platforms were compared using a simple experiment comprising 5 cows to minimize the sources of variation. In experiment 2, the platforms were compared using a more complex experiment with 5 ketosis and 5 control cows to evaluate the platform performance in a more realistic scenario. A reference sample was used and data preparation included the removal of dubious features. Loess normalization and scaling were applied and the normalized ratios were analyzed with a mixed effects model including dye, treatment (ketosis and control in experiment 2), cow and array effects. In experiment 1, the correlation between microarrays with reverse-labeled samples ranged from 0.93 to 0.95 for the cDNA microarrays and from 0.83 to 0.85 for the oligo microarrays. The correlation between log2 sample-to-reference ratios between platforms ranged from 0.71 to 0.74. The oligo elements tended to exhibit more extreme ratios, suggesting a higher sensitivity and specificity to report the abundance of mRNA than the cDNA platform. The higher standard error of the oligo estimates suggests that these intensities are more variable. The correlation between cDNA and oligo arrays ranged from 0.76 to 0.80 for all cows for the elements with log2-ratios > 0.25 or < -0.25. This result suggests that the consistency across platforms increases with increased differences in transcript abundance between co-hybridized samples.
In experiment 2, the correlations between log2 sample-to-reference ratios between platforms were 0.70 and 0.75 for the control and ketosis samples, respectively. Only 19 elements (1%) were differentially expressed at P-value < 0.001 in both platforms, and 12% were nonsignificant in one platform. The correlation between platforms was 0.76 and 0.80 for the control-to-reference and ketosis-to-reference log2-ratios, respectively. For log2-ratio thresholds more extreme than 0.5, the correlation between platforms increased to 0.80 to 0.86 for the control and ketosis log2-ratios. The distribution of the differences (P-value < 0.01) between log2-ratios of control and ketosis samples suggests that the detected differences are consistent between platforms. In agreement with previous studies, all except one of the cytochrome oxidases studied were overexpressed in the normal samples and the majority of the NADH dehydrogenases were overexpressed in the normal samples compared to ketosis samples. Independent information about the mRNA levels (e.g. RT-PCR) is required to identify the platform that provides less biased and more consistent results.
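The between-platform comparison described above, including the restriction to elements with more extreme log2 ratios, can be sketched as follows. The log2 ratios are made-up toy values for eight matched elements, not project data. The report does not state on which platform the threshold was applied, so the sketch assumes an element is kept when its absolute log2 ratio exceeds the threshold on either platform.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def platform_correlation(cdna, oligo, threshold=0.0):
    """Correlate matched log2 ratios, keeping elements whose absolute
    log2 ratio exceeds `threshold` on either platform (an assumption)."""
    kept = [(c, o) for c, o in zip(cdna, oligo)
            if abs(c) > threshold or abs(o) > threshold]
    if len(kept) < 3:
        raise ValueError("too few elements left after thresholding")
    xs, ys = zip(*kept)
    return pearson(xs, ys)


# Toy log2 sample-to-reference ratios for eight matched elements (hypothetical).
cdna = [0.05, -0.10, 0.12, -0.02, 0.90, -1.10, 1.50, -0.70]
oligo = [-0.08, 0.06, -0.04, 0.11, 1.20, -1.40, 1.90, -0.60]

r_all = platform_correlation(cdna, oligo)
r_strong = platform_correlation(cdna, oligo, threshold=0.25)
print(f"all elements: {r_all:.3f}; |log2 ratio| > 0.25: {r_strong:.3f}")
```

In this toy example the elements near zero carry mostly noise, so dropping them raises the correlation, which mirrors the direction of the effect reported in the text.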

Impacts
Microarray technology is a well-established tool to simultaneously measure the expression of thousands of different mRNA molecules. Various microarray platforms are available. Our study demonstrated that there is good agreement between the cDNA and oligo bovine microarray platforms. The consistency of the results can be further enhanced by concentrating on the results from elements with high differential expression between samples. The differences between platforms may be associated with the differences in the actual genomic sequences portrayed. Data from complementary technologies (e.g. RT-PCR) could help identify whether there is a preferred platform across all elements or whether cDNA and oligo reporters have complementary features that can benefit from the simultaneous consideration of results from both platforms.

Publications

  • Everts, R.E., Band, M.R., Liu, Z.L., Kumar, C.G., Liu, L., Loor, J.J., Oliveira R. and Lewin, H.A. 2005. A 7872 cDNA microarray and its use in bovine functional genomics. Vet. Immunol. Immunopathol. 105:235-245.
  • Loor J.J., Dann, H.M., Everts, R.E., Oliveira, R., Green, C.A., Guretzky, N.A., Rodriguez-Zas, S.L., Lewin, H.A. and Drackley, J.K. 2005. Temporal gene expression profiling of liver from periparturient dairy cows reveals complex adaptive mechanisms in hepatic function. Physiol. Genomics 23:217-226.
  • Smith, S.L., Everts, R.E., Tian, X.C., Du, F., Sung, L.-Y., Rodriguez-Zas, S.L., Seon Jeong, B., Renard, J.P. Lewin, H.A. and Yang, X. 2005. Global gene expression profiles reveal significant nuclear reprogramming by the blastocyst stage after cloning. PNAS. 102 17582-17587.
  • Everts, R.E., Chavatte-Palmer, P., Razzak, A., Hue, I., Rodriguez-Zas, S., Cindy Tian, X., Yang, X., Renard, J.P. and Lewin, H.A. 2005. Global gene-expression profiling of placentomes of term pregnancies of artificially inseminated, in vitro fertilized and nuclear transfer-derived cattle. Plant and Animal Genome XIII Conference January 15-19, 2005. San Diego, CA P699. http://www.intl-pag.org/13/abstracts/.
  • Mesquita, F.S., Robl, J.M., Kasinathan, P., Rodriguez-Zas, S.L. and Nowak, R.A. 2005. A comparison of the placental gene expression profile in chromatin transfer and in vitro fertilization-derived bovine fetuses during early gestation. The Society for the Study of Reproduction. 38th Annual Meeting, July 24-27, 2005, Quebec City, Quebec, CA. http://abstracts.co.allenpress.com/pweb/ssr2005/program/. #83.
  • Loor, J.J., Everts, R.E., Dann, H.M., Morin, D.E., Rodriguez-Zas, S.L., Lewin, H.A. and Drackley, J.K. 2005. Hepatic gene expression profiling in cows with early postpartum ketosis using a bovine 13,000 oligonucleotide microarray. J. Anim. Sci. 83(Suppl. 1): abs. # 195.
  • Loor, J.J., Piperova, L., Everts, R.E., Rodriguez-Zas, S.L., Drackley, J.K., Erdman, R.A. and Lewin, H.A. 2005. Mammary gene expression profiling in cows fed a milk fat-depressing diet using a bovine 13,000 oligonucleotide microarray. J. Anim. Sci. 83(Suppl. 1): abs. # 196.


Progress 01/01/04 to 12/31/04

Outputs
The influence of different normalization and parametric models in the detection of genes differentially expressed across conditions using cDNA microarrays was evaluated. Fluorescence intensities were recorded in three cattle tissues (rumen, large intestine and small intestine) using cDNA microarrays, each including over 7,000 (double-spotted) cattle ESTs. A total of six microarrays were used in a reference design with reverse labeling. The fluorescence intensity normalization methods used were Log2 transformation, Loess transformation, linlog transformation and the combination of Loess and linlog transformations. Normalized data were then analyzed using three response variable-model combinations, tissue versus reference intensity (RATIO), reference intensity treated as a response variable (ABSOLUTE) and reference intensity included in the model as a covariate (COVARIATE). A linear mixed effects model including the effects of array, dye, gene, and gene by tissue was used. Across all normalization methods, the RATIO and COVARIATE models provided a similar fit and the ABSOLUTE model provided the worst fit. The ABSOLUTE model provided the most significant sequences and the COVARIATE model the fewest significant sequences. The log2 transformation provided the worst fit and the combined transformation provided the best fit across all the models studied. The combination transformation provided the most significant results and the linlog provided the fewest. We also evaluated the potential of random regression models to describe the fluctuations in the gene transcription levels recorded at successive time points. The data consisted of fluorescence intensities on more than 6,000 unique genes recorded using spotted cDNA microarray technology. Liver samples were obtained at -65d, -30d, -14d, +1d, +14d, +28d and +49d relative to calving on 8 Holstein cows. 
A reference design was implemented with each cow-day sample present in two reverse-dye microarrays and each gene double spotted on each microarray. Fluorescence intensity measurements on 106 microarrays were Loess-normalized. The random regression model included linear to quartic polynomials on days and accounted for heteroscedasticity between days. Three percent of the genes had at least one significant (P < 0.0001) regression coefficient. The majority of these genes had significant quadratic trends alone or in combination with a significant quartic trend. Hierarchical and disjoint clustering of the coefficients suggested less than ten clusters. Four of these clusters were approximately characterized by significant (positive and negative) quadratic regression coefficients in combination with significant (positive and negative) quartic regression coefficient within each signed quadratic group. Another cluster was characterized by significant linear and cubic regression coefficients. Complementary analysis of this data including time as a discrete variable identified patterns consistent with those detected by random regression.
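The polynomial part of the random regression model described above can be illustrated with a simplified sketch: an ordinary least-squares fit of a quartic polynomial on days relative to calving for a single expression profile. This is not the random regression model itself (there are no cow-specific random coefficients and no heteroscedasticity between days). Rescaling the days to the interval [-1, 1] is a choice made here for numerical stability, and the profile is simulated from a known quadratic.

```python
def scale_days(days):
    """Map days linearly onto [-1, 1] so that high powers stay well scaled."""
    lo, hi = min(days), max(days)
    mid, half = (lo + hi) / 2, (hi - lo) / 2
    return [(d - mid) / half for d in days]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_polynomial(days, y, degree=4):
    """Least-squares coefficients (intercept first) on the scaled day axis."""
    t = scale_days(days)
    X = [[ti ** p for p in range(degree + 1)] for ti in t]
    k = degree + 1
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


# Sampling days from the text; the expression values are simulated, not real data.
days = [-65, -30, -14, 1, 14, 28, 49]
y = [2.0 + 3.0 * ti ** 2 for ti in scale_days(days)]   # purely quadratic profile
coef = fit_polynomial(days, y)
print([round(c, 6) for c in coef])
```

For the simulated profile only the intercept and the quadratic coefficient are non-zero, which corresponds to the kind of gene the text describes as showing a quadratic trend alone.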

Impacts
cDNA microarrays allow us to profile the expression of thousands of genes simultaneously across multiple conditions. Our study demonstrated that multiple normalization approaches need to be evaluated to identify the most appropriate transformation to remove technical bias within experiments. In addition to normalization of the fluorescence intensity data, different models to describe the transformed data must be considered. Gene expression patterns across continuous conditions including time can be described using regression models in addition to treating time in a discrete manner. The common patterns can be identified using data mining clustering approaches. Results from this study indicate that regression models are flexible enough to accommodate the variation in patterns that can be observed in genomic studies. Both multi-stage approaches provide complementary description of the data, thus enhancing the interpretation of the fluctuation of the levels of gene expression. Different normalizations and models should be considered when analyzing microarray data.

Publications

  • Rodriguez-Zas, S.L., Loor, J.J., Drackley, J.K. and Lewin, H.A. 2004. Application of a random regression model to gene expression profiling. Annual Meeting of the American Society of Animal Sciences. J. Anim. Sci. 82(Suppl. 1): 242, abs. 365.
  • Loor, J.J., Drackley, J.K., Dann, H.M., Everts, R.E., Rodriguez-Zas, S.L. and Lewin, H.A. 2004. Gene expression patterns in liver of dairy cows from dry-off through early lactation using a bovine cDNA microarray. Federation of American Societies for Experimental Biology Experimental Biology April 2004, Abstract # 822.4 (http://select.biosis.org/faseb/eb2004_data/FASEB004847.html).
  • Loor, J.J., Dann, H.M., Everts, R.E., Rodriguez-Zas, S.L., Lewin, H.A. and Drackley, J.K. 2004. Mammary and hepatic gene expression analysis in peripartal dairy cows using a bovine cDNA microarray. Annual Meeting of the American Society of Animal Sciences. J. Anim. Sci. 82(Suppl. 1):103, T134.
  • Loor, J.J., Carlson, D.B., Everts, R.E., Rodriguez-Zas, S.L., Lewin, H.A. and Drackley, J.K. 2004. Gene expression profiles in liver of dairy cows in response to feed restriction using a bovine cDNA microarray. Annual Meeting of the American Society of Dairy Sciences. J. Dairy Sci. 87(Suppl. 1):103, T136.
  • Loor, J.J., Janovick, N.A., Everts, R.E., Rodriguez-Zas, S.L., Lewin, H.A. and Drackley, J.K. 2004. Adipose, mammary, and hepatic gene expression profiling in lactating dairy cows using a bovine cDNA microarray. Annual Meeting of the American Society of Dairy Sciences. J. Dairy Sci. 87(Suppl. 1):103, T135.
  • Loor, J.J., Janovick, N.A., Dann, H.M., Everts, R.E., Rodriguez-Zas, S.L., Lewin, H.A. and Drackley, J.K. 2004. Microarray analysis of hepatic gene expression from dry-off through early lactation in dairy cows fed at two intakes during the dry period. J. Dairy Sci. 87(Suppl. 1):103, T133.
  • Rodriguez-Zas, S.L., Band, M.R., Everts, R.E., Southey, B.R., Liu, Z.L. and Lewin, H.A. 2004. Comparison of normalization and models for the analysis of gene expression data. Annual Meeting of the American Society of Animal Sciences. J. Anim. Sci. 82(Suppl. 1):377, W256.


Progress 01/01/03 to 12/31/03

Outputs
Although most functional genomics studies collect the same or equivalent indicators of the amount of gene transcript in different conditions, there are multiple ways in which these data can be studied. Strategies based on models have been applied to two types of indicators, absolute and relative measurements of fluorescence intensity of genes. No study has directly compared the impact of both strategies on the estimates, hypothesis testing and final conclusions. The objectives of our study were to evaluate two alternative representations of the transcript level and associated models and to use these models to characterize the levels of expression of genes in three key cattle digestive tissues. The expression levels of approximately 6,000 unique sequences in the rumen, large intestine and small intestine were recorded in duplicate in cattle cDNA microarrays. Each array was hybridized to samples from one tissue and a universal control reference. These samples received alternative dyes in pairs of microarrays following a dye-swap design. Intensity measurements from a total of six arrays (two arrays per tissue) were analyzed. Prior to the analysis, the median foreground intensity values were background subtracted, normalized using the linear logarithmic shift transformation and logarithmic (base 2) transformed. The model describing the absolute intensity included the effects of dye, tissue and a random residual. In this model the reference sample was considered as a fourth tissue. The model describing the ratio between tissue and reference sample included the effects of dye, tissue and a random residual. In both models, the adequacy of the assumption of homogeneity versus heterogeneity of variance across tissues was evaluated. Two estimation frameworks were studied, frequentist (i.e. restricted maximum likelihood) and Bayesian with flat priors implemented using an independence chain algorithm to sample from the posterior distribution.
Estimates from the model including heterogeneity of variance did not differ from the homogeneity of variance model. All model adequacy criteria consistently indicated that the absolute intensity model was slightly better supported by the data than the relative intensity model. Revealing posterior distributions of contrasts between gastrointestinal tissues were obtained. Point estimates from the restricted maximum likelihood and Bayesian approaches were consistent. Most of the sequences studied were not differentially expressed across the three tissues considered. The approaches were able to detect sequences with similar pattern and biological implication. For example, four genes coding for Pepsinogen sequences exhibited a common pattern with higher levels of expression in the rumen compared to the intestines.
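The dye-swap design and the background-subtracted log2 ratios described above can be illustrated with a small sketch for one spot measured on a pair of arrays. The intensity values are made up, the linear logarithmic shift normalization is omitted, and the simple averaging shown here is a textbook way to separate the tissue effect from the dye effect rather than the model fitted in the project.

```python
import math


def log2_ratio(red_fg, red_bg, green_fg, green_bg):
    """Background-subtracted log2(red / green); None if either channel is not positive."""
    red = red_fg - red_bg
    green = green_fg - green_bg
    if red <= 0 or green <= 0:
        return None
    return math.log2(red / green)


def dye_swap_estimates(m_forward, m_swapped):
    """Split two dye-swapped log ratios into a tissue effect and a dye effect.

    m_forward: log2(red/green) with tissue in red and reference in green.
    m_swapped: log2(red/green) with reference in red and tissue in green.
    """
    tissue_vs_reference = (m_forward - m_swapped) / 2
    dye_bias = (m_forward + m_swapped) / 2
    return tissue_vs_reference, dye_bias


# Hypothetical spot: tissue signal twice the reference, red channel reading twice as bright.
m_fwd = log2_ratio(red_fg=450, red_bg=50, green_fg=150, green_bg=50)   # tissue labeled red
m_swp = log2_ratio(red_fg=250, red_bg=50, green_fg=250, green_bg=50)   # tissue labeled green
tissue_effect, dye_effect = dye_swap_estimates(m_fwd, m_swp)
print(tissue_effect, dye_effect)
```

Because the dye effect enters both arrays with the same sign while the tissue effect changes sign, the half-difference isolates the tissue effect and the half-sum isolates the dye effect.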

Impacts
A major challenge in functional genomics studies is the analysis of the large volume of gene expression data generated. The most adequate models and methods for a particular gene, experimental design or data structure fail for other cases. We propose that a comprehensive analysis of gene expression data must be conducted to gain a better understanding of the gene expression profiles. The use of absolute or relative measurements of gene expression is an aspect that has not yet been evaluated. These two measurements were evaluated using linear models that included the effect of tissue and dye and likelihood and Bayesian approaches. Even though the results were mostly consistent among models and with prior biological knowledge, differences in tissue contrasts between models highlighted the need to use complementary descriptions when analyzing large and complex data sets like the ones stemming from functional genomic studies.

Publications

  • Band, M.R., Everts, R.E., Liu, Z.L., Morin, D.E., Peled, J.U., Rodriguez-Zas, S.L. and Lewin, H.A. 2003. Gene expression profiling of 17 cattle tissues reveals unique patterns related to tissue function. Plant, Animal and Microbe Genomes XI Conference. Jan. 11-15, 2003. Microarrays section.
  • Rodriguez-Zas, S.L., Band, M.R., Everts, R.E., Southey, B.R., Liu, Z.L. and Lewin, H.A. 2003. Analysis of gene expression patterns in the cattle digestive system. Journal Animal Science, 81(Suppl. 1): 628.
  • Loor, J.J., Drackley, J.K., Dann, H.M., Everts, R.E., Rodriguez-Zas, S.L. and Lewin, H.A. 2003. Mammary gene expression analysis in peripartal dairy cows using a bovine cDNA microarray. Journal Animal Science, 81(Suppl. 1): W8.
  • Loor, J.J., Drackley, J.K., Dann, H.M., Everts, R.E., Rodriguez-Zas, S.L. and Lewin, H.A. 2003. Hepatic gene expression analysis in peripartal dairy cows using a bovine cDNA microarray. Journal Animal Science, 81(Suppl. 1): W9.