Source: MICHIGAN STATE UNIV submitted to
RESOURCES FOR COMPARATIVE GENOME ANALYSES WITHIN THE POACEAE
Sponsoring Institution
National Institute of Food and Agriculture
Project Status
TERMINATED
Funding Source
Reporting Frequency
Annual
Accession No.
0225265
Grant No.
(N/A)
Project No.
MICL02231
Proposal No.
(N/A)
Multistate No.
(N/A)
Program Code
(N/A)
Project Start Date
Apr 1, 2011
Project End Date
Mar 31, 2016
Grant Year
(N/A)
Project Director
Buell, C. R.
Recipient Organization
MICHIGAN STATE UNIV
(N/A)
EAST LANSING,MI 48824
Performing Department
Plant Biology
Non Technical Summary
Maize, wheat, rice and sorghum, all Poaceae species, are significant contributors to world and U.S. agriculture. In addition, other Poaceae species including Miscanthus and switchgrass are being considered for biofuel feedstock production because of the lignocellulosic biomass they produce. Major increases in the yield, quality, and disease/pest resistance of maize, wheat, and rice have occurred in the last 50 years due to conventional breeding, improved agricultural management practices, and transgenic technologies. However, to meet growing worldwide food and potentially biomass needs, production needs to be further increased. One way to improve breeding practices is through accelerated or targeted breeding approaches such as marker-assisted selection, which relies on molecular markers to facilitate selection of improved lines in a breeding program. With the advent of genomics, including next-generation sequencing technologies and bioinformatics, genome sequences and molecular markers can be generated in a manner that is both high-throughput and cost-effective. Providing plant breeders and biologists with data and research tools that enable genomic-based plant breeding will be essential to improved agriculture in the 21st century.
Animal Health Component
(N/A)
Research Effort Categories
Basic
100%
Applied
(N/A)
Developmental
(N/A)
Classification

Knowledge Area (KA)    Subject of Investigation (SOI)    Field of Science (FOS)    Percent
201                    1530                              1080                      60%
201                    1510                              1080                      20%
201                    1520                              1080                      20%
Goals / Objectives
Using rice as the foundation species, we will provide annotation for Poaceae species to permit cross-species analyses and data mining, thereby giving Poaceae researchers a computational resource for mining data on their species of interest, the majority of which lack robust annotation and research tools. Even though genomes can be readily sequenced using next-generation sequencing methods, the annotation and curation of these genomes is minimal. As a consequence, researchers rely heavily on transitive annotation methods, and because of its "gold standard" sequence and annotation, coupled with the high degree of sequence conservation among the Poaceae, the rice genome has had and will continue to have a pivotal role in informing researchers of the function of genes within this agriculturally important clade. We have three main objectives in this project:
Obj 1: Provide rich annotation resources for rice and other Poaceae species through a set of databases and webpages (http://rice.plantbiology.msu.edu/). This will extend the provision of data resources established through an NSF award for which funding ends in Spring 2011.
Obj 2: Expand annotation for rice genes through expression correlation analyses using publicly available expression datasets.
Obj 3: Expand annotation of Poaceae genomes by classifying genes into orthologous/paralogous families and single copy genes, to enable comparative studies across the Poaceae and to extend the rich functional annotation of the rice genome to other Poaceae species for which annotation is limited.
Project Methods
Obj 1: Provision of annotation resources for rice and other Poaceae species. Funding for our current rice annotation efforts ends in Spring 2011. In this project, we will continue to provide the data as well as the computational and bioinformatics tools in place through our NSF Rice Genome Annotation project. The data will be made available through search/query pages, downloads, and graphical interfaces on the well developed MSU Rice Annotation Project site (http://rice.plantbiology.msu.edu). Throughout the project, we will update key tracks on the browser, such as the RNA-seq track and alignments to gene models from sequenced Poaceae species, as data are made publicly available. Furthermore, we will add new annotation tracks, provided through external collaborations, to the Genome Browser (http://rice.plantbiology.msu.edu/cgi-bin/gbrowse/rice/). These will include epigenetic datasets, quantitative trait loci, and other functional annotation datatypes that can be related to sequence features.
Obj 2: Expression correlation analyses of rice. As expression datasets are made available for rice, we will perform Weighted Gene Correlation Network Analyses (Langfelder and Horvath, 2008) and add these new datasets to the project website. Because these analyses are performed on a per-experiment basis, results can be added incrementally as datasets are made available. We will use existing bioinformatics pipelines to download, process, and analyze Affymetrix datasets and add these to the Rice Coexpression Page on the project site (http://rice.plantbiology.msu.edu/coexpression.shtml). We will adapt our computational methods for RNA-seq derived expression data to permit continued incorporation of expression data into our project.
Obj 3: Annotate paralogous and orthologous families within the Poaceae. We will identify paralogous and orthologous families in each of the Poaceae species for which genome sequence is available. As new Poaceae genomes are made available, orthologous families will be constructed between all sequenced Poaceae species using OrthoMCL (Li et al., 2003). Orthologous family composition will be provided through the Rice Genome Annotation project page as well as a track on the MSU Rice Genome Browser. These linkages will be valuable in interpreting gene function within and across Poaceae species. Loci will be linked to the Rice Genome Annotation report page (rice), MaizeGDB (maize), Phytozome (Sorghum and Brachypodium) and the primary annotation of the newly available Poaceae genome. Two new features will be added for the comparison of Poaceae proteomes. First, OrthoMCL (Li et al., 2003) will be used to construct paralogous families within single genomes, and the families will be provided through new pages to be developed on the Rice Genome Annotation project page. Second, single copy genes, as defined by the single species OrthoMCL clustering, will be provided through new pages to be developed on the project site. As with the orthologous family clustering, these data will be updated as new Poaceae genomes and/or updated annotation for the genomes become available.
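The single copy gene definition in Obj 3 (genes left outside every paralogous family in a single-species clustering) can be illustrated with a minimal sketch. It does not run OrthoMCL; it only post-processes cluster assignments. The "family_id: gene gene ..." text layout, the function names, and the gene and family identifiers below are all assumptions made for illustration and are not taken from the project's pipelines.

```python
from typing import Dict, Iterable, List, Set


def parse_groups(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Parse 'family_id: gene gene ...' lines into {family_id: [genes]}.

    The line layout is an assumed, simplified cluster format.
    """
    families: Dict[str, List[str]] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        family_id, _, members = line.partition(":")
        families[family_id.strip()] = members.split()
    return families


def single_copy_genes(all_genes: Set[str],
                      families: Dict[str, List[str]]) -> Set[str]:
    """Return genes that belong to no family with two or more members."""
    in_family = {
        gene
        for members in families.values()
        if len(members) >= 2  # a one-member "family" is not a paralogous family
        for gene in members
    }
    return all_genes - in_family


# Hypothetical toy clustering of one genome (identifiers are made up).
groups = [
    "FAM1: LOC_Os01g01010 LOC_Os05g02020",
    "FAM2: LOC_Os03g03030 LOC_Os07g04040 LOC_Os11g05050",
    "FAM3: LOC_Os09g07070",
]
genes = {
    "LOC_Os01g01010", "LOC_Os05g02020", "LOC_Os03g03030",
    "LOC_Os07g04040", "LOC_Os11g05050", "LOC_Os02g06060",
    "LOC_Os09g07070",
}

if __name__ == "__main__":
    families = parse_groups(groups)
    for gene in sorted(single_copy_genes(genes, families)):
        print(gene)
```

In this toy input, the gene that appears in no cluster line and the gene that sits alone in its cluster are both reported as single copy. Whether one-member clusters should be treated this way in practice depends on how the clustering output is produced, so the threshold is a design choice of this sketch.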

Progress 04/01/11 to 03/31/16

Outputs
Target Audience: The target audience for this project is the international community of researchers involved in cereal genetics, genomics and bioinformatics.
Changes/Problems: Nothing Reported
What opportunities for training and professional development has the project provided? The project provides the community with genome sequence and annotation datasets for the rice genome. The website has been highly accessed, with nearly 2 million page views per year. Several postdoctoral fellows developed datasets and received training in the release of the updated annotation.
How have the results been disseminated to communities of interest? The database is publicly available through the URL http://rice.plantbiology.msu.edu/ and through a publication.
What do you plan to do during the next reporting period to accomplish the goals? Nothing Reported

Impacts
What was accomplished under these goals? Throughout the project, we maintained resources generated from prior National Science Foundation funding to support our well developed, highly accessed web-based resource (http://rice.plantbiology.msu.edu/) for plant genome sequence and annotation focused on rice and other cereal/grass species. We released new annotation for the rice genome to provide users with the most current set of annotations on a revised set of pseudomolecules representing the 12 rice chromosomes. We aligned the rice gene models with gene models from the complete gene complements of maize, Brachypodium, and sorghum. We also aligned transcript assemblies of 24 other Poaceae species to the rice genome to provide insight into the gene function of these homologous sequences. We added new expression datasets to the functional annotation as well as the genome browser. On the project web site, we corrected annotation download files and generated orthologous groups from multiple Poaceae species. During the project, we upgraded the database server that supports the project to increase performance under heavy database access; as a consequence, the speed and reliability of the site improved. We installed an NCBI-BLAST search page to replace our previous WU-BLAST page; the new page provides a more robust set of search results, including the ability to download the BLAST results. Our website was highly accessed throughout the project, averaging over 2 million page views per year from more than 200,000 visits by more than 60,000 visitors. We answered queries from researchers throughout the project, typically 2-6 per week.

Publications

  • Type: Websites Status: Published Year Published: 2011 Citation: http://rice.plantbiology.msu.edu/


Progress 10/01/14 to 09/30/15

Outputs
Target Audience: The target audience for this project is the international community of researchers involved in cereal genetics, genomics and bioinformatics.
Changes/Problems: Nothing Reported
What opportunities for training and professional development has the project provided? The project provides the community with genome sequence and annotation datasets for the rice genome. The use of the website has been strong with nearly 2 million page views.
How have the results been disseminated to communities of interest? The database is publicly available through the URL: http://rice.plantbiology.msu.edu/
What do you plan to do during the next reporting period to accomplish the goals? We will continue to support the overall database and website, and answer queries from the public. We will add new expression data to the database.

Impacts
What was accomplished under these goals? Using rice as the foundation species, we have provided annotation for Poaceae species to permit cross-species analyses and data mining, thereby giving Poaceae researchers a computational resource for mining data on their species of interest, the majority of which lack robust annotation and research tools. Because researchers rely heavily on transitive annotation methods, and because of its "gold standard" sequence and annotation, coupled with the high degree of sequence conservation among the Poaceae, the rice genome has had and will continue to have a pivotal role in informing researchers of the function of genes within this agriculturally important clade. For this project, we support a rich annotation resource for rice and other Poaceae species through a set of databases and webpages (http://rice.plantbiology.msu.edu/). In the last year, we have maintained resources generated from prior National Science Foundation funding to support our well developed, highly accessed web-based resource (http://rice.plantbiology.msu.edu/) for plant genome sequence and annotation focused on rice and other cereal/grass species. Our website remains highly accessed, with nearly 2 million page views from approximately 250,000 visits by approximately 67,500 visitors in the last year. We have answered queries from researchers throughout the year, typically 2-6 per week.

Publications


Progress 10/01/13 to 09/30/14

Outputs
Target Audience: The target audience for this project is the international community of researchers involved in cereal genetics, genomics and bioinformatics.
Changes/Problems: Nothing Reported
What opportunities for training and professional development has the project provided? The project provides the community with genome sequence and annotation datasets for the rice genome. The use of the website has been strong with over 2 million page views. An undergraduate has been trained in programming relevant to bioinformatics.
How have the results been disseminated to communities of interest? The database is publicly available through the URL: http://rice.plantbiology.msu.edu/
What do you plan to do during the next reporting period to accomplish the goals? We will continue to support the overall database and website, answer queries from the public, and implement a new version of GBrowse. We will also add new expression datasets to the genome browser.

Impacts
What was accomplished under these goals? Using rice as the foundation species, we have provided annotation for Poaceae species to permit cross-species analyses and data mining, thereby giving Poaceae researchers a computational resource for mining data on their species of interest, the majority of which lack robust annotation and research tools. Because researchers rely heavily on transitive annotation methods, and because of its “gold standard” sequence and annotation, coupled with the high degree of sequence conservation among the Poaceae, the rice genome has had and will continue to have a pivotal role in informing researchers of the function of genes within this agriculturally important clade. For this project, we support a rich annotation resource for rice and other Poaceae species through a set of databases and webpages (http://rice.plantbiology.msu.edu/). In the last year, we have maintained and continued to build on resources generated from prior National Science Foundation funding to support our well developed, highly accessed web-based resource (http://rice.plantbiology.msu.edu/) for plant genome sequence and annotation focused on rice and other cereal/grass species. Our website remains highly accessed, with over 2 million page views from 228,000 visits by 62,000 visitors in the last year. We have answered queries from researchers throughout the year, typically 2-6 per week. On the technical side, we have upgraded the database server that supports this project to increase performance under heavy database access. As a consequence, the speed and reliability of the site have improved. We are in the final testing stage of a revised sequence search interface (BLAST search tool) for the site that will support a graphical interface. We are also finalizing a major update to the genome browser.

Publications


Progress 01/01/13 to 09/30/13

Outputs
Target Audience: The target audience for this project is the international community of researchers involved in cereal genetics, genomics and bioinformatics.
Changes/Problems: Nothing Reported
What opportunities for training and professional development has the project provided? Nothing Reported
How have the results been disseminated to communities of interest? The database is publicly available and reported in a published journal article.
What do you plan to do during the next reporting period to accomplish the goals? We will continue to support the overall database and website, answer queries from the public, and implement a new version of GBrowse and a new version of NCBI-BLAST.

Impacts
What was accomplished under these goals? We have maintained and continued to build on resources generated from prior National Science Foundation funding to support our well developed, highly accessed web-based resource (http://rice.plantbiology.msu.edu/) for plant genome sequence and annotation focused on rice and other cereal/grass species. Our website remains highly accessed, with nearly 2 million page views from ~230,000 visits by ~57,000 visitors in the last year. We have answered queries from researchers throughout the year, averaging 2-3 responses per week. We have provided custom datasets to researchers to facilitate their research. We have begun to implement GBrowse2 to upgrade our genome browser for the site. We are developing an NCBI-BLAST search page to replace our current WU-BLAST page.

Publications

  • Type: Journal Articles Status: Published Year Published: 2013 Citation: Kawahara, Y., de la Bastide, M., Hamilton, J.P., Kanamori, H., McCombie, W.R., Ouyang, S., Schwartz, D., Tanaka, T., Wu, J., Zhou, S., Childs, K.L., Davidson, R.M., Lin, H., Quesada-Ocampo, L., Vaillancourt, B., Sakai, H., Lee, S.S., Kim, J., Numa, H., Itoh, T., Buell, C.R., and Matsumoto, T. 2013. Improvement of the Oryza sativa Nipponbare reference genome using next generation sequence and optical map data. Rice 6:4. doi:10.1186/1939-8433-6-4


Progress 01/01/12 to 12/31/12

Outputs
OUTPUTS: Maize, wheat, rice and sorghum, all Poaceae species, are significant contributors to world and U.S. agriculture. In addition, other Poaceae species including Miscanthus and switchgrass are being considered for biofuel feedstock production because of the lignocellulosic biomass they produce. Major increases in the yield, quality, and disease/pest resistance of maize, wheat, and rice have occurred in the last 50 years due to conventional breeding, improved agricultural management practices, and transgenic technologies. However, to meet growing worldwide food and potentially biomass needs, production needs to be further increased. One way to improve breeding practices is through accelerated or targeted breeding approaches such as marker-assisted selection, which relies on molecular markers to facilitate selection of improved lines in a breeding program. With the advent of genomics, including next-generation sequencing technologies and bioinformatics, genome sequences and molecular markers can be generated in a manner that is both high-throughput and cost-effective. Providing plant breeders and biologists with data and research tools that enable genomic-based plant breeding will be essential to improved agriculture in the 21st century. Using rice as the foundation species, we have provided annotation for Poaceae species to permit cross-species analyses and data mining, thereby giving Poaceae researchers a computational resource for mining data on their species of interest, the majority of which lack robust annotation and research tools. Even though genomes can be readily sequenced using next-generation sequencing methods, the annotation and curation of these genomes is minimal. As a consequence, researchers rely heavily on transitive annotation methods, and because of its "gold standard" sequence and annotation, coupled with the high degree of sequence conservation among the Poaceae, the rice genome has had and will continue to have a pivotal role in informing researchers of the function of genes within this agriculturally important clade. For this project, we support a rich annotation resource for rice and other Poaceae species through a set of databases and webpages (http://rice.plantbiology.msu.edu/) that were initially developed through funding from the National Science Foundation.
PARTICIPANTS: C Robin Buell is the Principal Investigator for the project.
TARGET AUDIENCES: The target audience for this project is the international community of researchers involved in cereal genetics, genomics and bioinformatics.
PROJECT MODIFICATIONS: Nothing significant to report during this reporting period.

Impacts
We have maintained and continued to build on resources generated from prior National Science Foundation funding to support our well developed, highly accessed web-based resource (http://rice.plantbiology.msu.edu/) for plant genome sequence and annotation focused on rice and other cereal/grass species. On the project web site, we have corrected annotation download files. We have generated orthologous groups from multiple Poaceae species and will add these to the genome browser in the coming months. We have downloaded and processed epigenetic and small RNA datasets and will prepare them for release to the project website following quality assessment. We have answered queries from researchers throughout the year, averaging 2-3 responses per week. We have provided custom datasets to researchers to facilitate their research.

Publications

  • No publications reported this period


Progress 04/01/11 to 12/31/11

Outputs
OUTPUTS: Maize, wheat, rice and sorghum, all Poaceae species, are significant contributors to world and U.S. agriculture. In addition, other Poaceae species including Miscanthus and switchgrass are being considered for biofuel feedstock production because of the lignocellulosic biomass they produce. Major increases in the yield, quality, and disease/pest resistance of maize, wheat, and rice have occurred in the last 50 years due to conventional breeding, improved agricultural management practices, and transgenic technologies. However, to meet growing worldwide food and potentially biomass needs, production needs to be further increased. One way to improve breeding practices is through accelerated or targeted breeding approaches such as marker-assisted selection, which relies on molecular markers to facilitate selection of improved lines in a breeding program. With the advent of genomics, including next-generation sequencing technologies and bioinformatics, genome sequences and molecular markers can be generated in a manner that is both high-throughput and cost-effective. Providing plant breeders and biologists with data and research tools that enable genomic-based plant breeding will be essential to improved agriculture in the 21st century. Using rice as the foundation species, we will provide annotation for Poaceae species to permit cross-species analyses and data mining, thereby giving Poaceae researchers a computational resource for mining data on their species of interest, the majority of which lack robust annotation and research tools. Even though genomes can be readily sequenced using next-generation sequencing methods, the annotation and curation of these genomes is minimal. As a consequence, researchers rely heavily on transitive annotation methods, and because of its "gold standard" sequence and annotation, coupled with the high degree of sequence conservation among the Poaceae, the rice genome has had and will continue to have a pivotal role in informing researchers of the function of genes within this agriculturally important clade. For this project, we will continue to provide the rich annotation resources for rice and other Poaceae species through a set of databases and webpages (http://rice.plantbiology.msu.edu/) that were initially developed through funding from the National Science Foundation.
PARTICIPANTS: Not relevant to this project.
TARGET AUDIENCES: Breeders, geneticists, and biologists are the targets of this work.
PROJECT MODIFICATIONS: Not relevant to this project.

Impacts
We have continued to build on resources generated from prior National Science Foundation funding to support our well developed, highly accessed web-based resource (http://rice.plantbiology.msu.edu/) for plant genome sequence and annotation focused on rice and other cereal/grass species. We have aligned the rice gene models with gene models from the complete gene complements of maize, Brachypodium, and sorghum. We have also aligned transcript assemblies of 24 other Poaceae species to the rice genome to provide insight into the gene function of these homologous sequences. We have classified orthologous/paralogous genes across rice, sorghum, maize, and Brachypodium and included the key model dicot species Arabidopsis for comparative purposes. These data are available through a public website (http://rice.plantbiology.msu.edu/) to facilitate research in not only rice but also other agronomically important cereal/grass species. We will maintain the project website and respond to user queries for the remainder of the project period. As new datasets become available, we will add comparative alignments to maximize the leverage of the high-quality rice genome annotation for other grasses and cereals.

Publications

  • No publications reported this period