Source: AUBURN UNIVERSITY submitted to
DEVELOPING GENOMIC TOOLS FOR THE ENIGMATIC CYPERUS ESCULENTUS- WEED, WILDLIFE FORAGE, AND ORPHANED CROP.
Sponsoring Institution
National Institute of Food and Agriculture
Project Status
NEW
Funding Source
Reporting Frequency
Annual
Accession No.
1011041
Grant No.
(N/A)
Project No.
ALA012-1-16022
Proposal No.
(N/A)
Multistate No.
(N/A)
Program Code
(N/A)
Project Start Date
Oct 14, 2016
Project End Date
Sep 30, 2018
Grant Year
(N/A)
Project Director
McElroy, JO, SC.
Recipient Organization
AUBURN UNIVERSITY
108 M. WHITE SMITH HALL
AUBURN, AL 36849
Performing Department
Agronomy & Soils
Non Technical Summary
Cyperus esculentus exists around the world as a problematic weed (yellow nutsedge), as a forage for wildlife (chufa), and as a burgeoning minor food crop (tigernuts). Depending on whether it is viewed negatively or positively, research takes a different approach to understanding this species. Those who want to control weedy Cyperus esculentus seek to understand how it adapts to weed management practices and how it interacts with desirable plant species. The development of herbicide-resistant populations has made this weed more problematic and has created a need to understand how Cyperus esculentus populations spread to other areas. Those who want to propagate Cyperus esculentus for wildlife or human consumption seek to understand its nutritive value, how best to propagate the species, and which selections have the best nutrition and adaptations for growth. Regardless of the desired use, genomic research tools for Cyperus esculentus would benefit researchers. Our first goal is to develop a near-complete reference transcriptome that incorporates various tissues and growth stages of Cyperus esculentus using RNA sequencing on the Illumina HiSeq platform. Our second goal is to collect both weedy and cultivated selections and, using the reference transcriptome, identify single nucleotide polymorphisms (SNPs) for use in marker-assisted breeding studies or population genetics. The genomic resources we develop could also be used to develop other marker technologies such as simple sequence repeats, to identify quantitative trait loci for desirable traits, and to construct genetic linkage maps. From this research we will develop resources that allow us to expand into the study of the diversification and evolution of Cyperus esculentus and the genus Cyperus. Further, our research can expand into the selection of improved types of Cyperus esculentus with greater benefit for wildlife and human consumption.
This research will serve as a basis for the development of the first Cyperus esculentus breeding program in the US.
Animal Health Component
0%
Research Effort Categories
Basic
80%
Applied
20%
Developmental
(N/A)
Classification

Knowledge Area (KA) | Subject of Investigation (SOI) | Field of Science (FOS) | Percent
201 | 2300 | 1040 | 100%
Knowledge Area
201 - Plant Genome, Genetics, and Genetic Mechanisms;

Subject Of Investigation
2300 - Weeds;

Field Of Science
1040 - Molecular biology;
Goals / Objectives
Objective 1: Develop a reference transcriptome for Cyperus esculentus.
A reference transcriptome is essential for basic molecular biology research and discovery on gene function, expression, and mutation. With the advent of massively parallel sequencing of fragmented polynucleotides, expressed transcripts can now be assembled and annotated for any species of interest. Genomic research tools can now be developed for the study of non-model organisms and minor crop species at relatively low cost. For assembly of Cyperus esculentus transcriptomes, we will use the methodology we developed for sequencing the Eleusine indica leaf transcriptome (Chen et al., 2015). RNA will be extracted from various tissues to maximize the potential assembly of tissue-specific genes: seed, germinating seedlings, developing and mature leaves, roots, tubers, stems, and flowers. All tissues will be extracted three times and pooled into a single sample. RNA will be extracted using the Trizol reagent methodology (Trizol, Invitrogen, Carlsbad, CA) or a similar method, depending on the success of extraction for each tissue. RNA-seq library preparation and sequencing will be conducted by the Genomic Service Laboratory at the HudsonAlpha Institute for Biotechnology (Cummings Research Park, Huntsville, AL). cDNA will be sequenced using an Illumina HiSeq 2000 with the goal of generating 100 million 100 base pair paired-end reads. Raw reads will be processed and evaluated using FastX-toolkit (http://hannonlab.cshl.edu/fastx_toolkit) and FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/).
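As a rough illustration of the read-trimming step, the following Python sketch trims low-quality bases from the 3' end of a single read and then applies a length filter. It is not the FastX-toolkit implementation; the function names, the Phred+33 quality encoding, the quality cutoff of 20, and the minimum length of 50 bases are all assumptions chosen for the example, not values specified for this project.

```python
def phred_scores(quality, offset=33):
    """Convert a FASTQ quality string to integer Phred scores.

    Assumes Phred+33 encoding (an assumption; check the sequencer output).
    """
    return [ord(ch) - offset for ch in quality]


def trim_3prime(seq, quality, min_q=20):
    """Drop bases from the 3' end until the last base has quality >= min_q.

    min_q=20 is a hypothetical cutoff used only for illustration.
    """
    scores = phred_scores(quality)
    end = len(seq)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return seq[:end], quality[:end]


def keep_read(seq, min_len=50):
    """Keep a trimmed read only if it is still at least min_len bases long."""
    return len(seq) >= min_len


# Example: the last two bases have Phred quality 2 ('#') and are removed.
trimmed_seq, trimmed_qual = trim_3prime("ACGTACGT", "IIIIII##")
print(trimmed_seq, trimmed_qual)  # ACGTAC IIIIII
```

In practice both mates of a paired-end read would need to pass the length filter together so that the pairing is preserved for assembly.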
Following read trimming and quality checks, reads will be assembled using three de novo assemblers: Trinity (https://github.com/trinityrnaseq/trinityrnaseq/wiki), SOAPdenovo-Trans (http://soap.genomics.org.cn/soapdenovo.html), and CLC Genomics Workbench (CLC, http://www.clcbio.com/products/clc-genomics-workbench/), using various k-mer sizes and assembly settings to maximize assembly of low-expressed contigs. Assemblies will be pooled into a single merged assembly, and redundancy will be reduced using CD-HIT-EST (http://weizhongli-lab.org/cd-hit/). The redundancy-reduced assembly will be processed with EvidentialGene (http://arthropods.eugenes.org/genes2/about/EvidentialGene_trassembly_pipe.html) to select the best set of assembled transcripts based on coding potential. Final assemblies will be annotated using Blast2GO (https://www.blast2go.com/). The final Cyperus esculentus assemblies will be compared for transcriptome size, gene ontology classifications, and number of expressed genes. Assembled transcriptomes will be made available as a BioProject on NCBI (http://www.ncbi.nlm.nih.gov/bioproject/). The assembled transcriptome will be used as the reference assembly for Objective 2.
Objective 2: Identify SNP markers from leaf transcriptomes of wild and cultivated selections.
Single nucleotide polymorphisms (SNPs) are increasingly used as molecular markers for studies of ecological diversity in natural ecosystems and for marker-assisted plant breeding. Our goal will be to identify high-quality polymorphic contigs using ten different Cyperus esculentus selections. In addition to the reference selection assembled in Objective 1, nine selections, six weedy and three commercially available, will be used. Weedy types will be collected from diverse agricultural and horticultural areas across the southeastern US. Cultivated types will be selected from those used for chufa wildlife plots and those produced for human consumption.
The reference transcriptome described above will be de novo assembled, which requires greater read depth to ensure assembly quality and completeness. Using the reference assembly developed in Objective 1 will allow a reference-guided read-mapping approach to transcriptome assembly for the nine additional selections, which requires less upfront sequencing and thus reduces costs. Therefore, the nine additional selections will be sequenced on the Illumina platform with the goal of generating 25 million 100 bp paired-end reads. Reads will be mapped to the reference transcriptome using Bowtie2 (http://bowtie-bio.sourceforge.net/bowtie2/index.shtml), and assembled transcripts will be extracted using Samtools and BCFtools (http://samtools.sourceforge.net/). SNPs will be identified using QualitySNPng (Nijveen et al., 2013; http://www.bioinformatics.nl/QualitySNPng/). SNP validation will be conducted by sequencing amplicons of selected SNP regions across a diverse panel of Cyperus esculentus collections.
Objective 3: Collection and Evaluation of Yellow Nutsedge Throughout the Southeast
Before a breeding and selection program can begin, it is necessary to develop a germplasm collection from across the Southeast. Such a collection would have the added benefit of quantifying the diversity that already exists in the southeastern United States. Our goal in year one is to collect 50 to 80 populations, primarily from Alabama and surrounding states. Alabama has diverse ecosystems and soil types that could allow for diverse ecotypes of Cyperus esculentus within our state. Further, diverse agroecosystems, including vegetable crops, turf, and row crops, as well as non-crop areas, could have provided the selection pressure needed to drive Cyperus esculentus diversification.
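The idea behind the SNP identification step in Objective 2 can be sketched in Python as follows. This is a deliberately simplified allele-count filter at one reference position, not the haplotype-based method used by QualitySNPng; the function name, the input layout, and the thresholds (minimum pooled depth of 10 reads, minimum allele fraction of 0.2) are hypothetical values chosen only for illustration.

```python
from collections import Counter


def call_snp(base_counts_by_selection, min_depth=10, min_allele_fraction=0.2):
    """Return the sorted alleles at a site if it looks polymorphic, else None.

    base_counts_by_selection maps a selection name to a Counter of the bases
    observed in reads mapped at one reference position. Both thresholds are
    illustrative assumptions, not project settings.
    """
    pooled = Counter()
    for counts in base_counts_by_selection.values():
        pooled.update(counts)

    depth = sum(pooled.values())
    if depth < min_depth:
        return None  # too few reads to make a call

    # Keep alleles supported by a large enough share of the pooled reads,
    # so that isolated sequencing errors are not reported as variants.
    alleles = sorted(
        base for base, n in pooled.items() if n / depth >= min_allele_fraction
    )
    return alleles if len(alleles) >= 2 else None


# Example with two hypothetical selections at one position.
site = {
    "weedy_1": Counter("AAAAAAAA"),
    "chufa_1": Counter("GGGGGGGA"),
}
print(call_snp(site))  # ['A', 'G']
```

Candidate SNPs from a filter of this kind would still need the amplicon-based validation described above before use as markers.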
These populations will be sampled and propagated in both field and greenhouse facilities in year one.
In year two, populations will be planted in a randomized garden plot experiment with three replications per population. Populations will be planted into 20 L pots buried to soil level. Pots will be filled with the surface horizon of Marvyn sandy loam soil collected locally. Three tubers of an individual population will be planted in each pot starting May 1. Populations will be fertilized monthly with a complete fertilizer and watered as needed to limit drought stress. Data will be collected monthly for plant height, leaf number, leaf width, and inflorescence height and number. Populations will be harvested between August 15 and September 1. Total fresh plant biomass will be assessed as above-ground and below-ground weight. Above-ground mass will be air-dried for 72 h at 60 °C and then processed to determine total carbohydrate and digestible protein. Below-ground biomass will first be washed free of soil and then separated into roots/rhizomes and tubers, with each fraction assessed for fresh weight. Tubers will be counted, and a randomly selected subset will be characterized for shape, color, and weight. Tubers will be dried for storage and assessed for total carbohydrate, total fiber, crude protein, and total lipid content. Commercially available types used for wildlife food and human consumption will be included in this research as a comparison.
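One possible way to generate the pot layout for the year-two garden experiment is sketched below in Python. The text specifies only a randomized design with three replications per population; the sketch assumes a randomized complete block arrangement in which each replication is a block containing every population once in random order. The function name, the population labels, and the use of a fixed seed are hypothetical.

```python
import random


def randomized_layout(populations, replications=3, seed=None):
    """Return a list of (block, position, population) pot assignments.

    Assumes a randomized complete block design: each block holds every
    population exactly once, shuffled independently of the other blocks.
    """
    rng = random.Random(seed)  # a seed makes the layout reproducible
    layout = []
    for block in range(1, replications + 1):
        order = list(populations)
        rng.shuffle(order)
        for position, population in enumerate(order, start=1):
            layout.append((block, position, population))
    return layout


# Example with four hypothetical population labels.
for block, position, population in randomized_layout(
    ["P01", "P02", "P03", "P04"], replications=3, seed=2017
):
    print(block, position, population)
```

Recording the seed alongside the field map allows the same layout to be regenerated later when the monthly measurements are matched back to populations.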

Progress 10/14/16 to 09/30/17

Outputs
Target Audience: Producers and stakeholders in agronomic crop production who face problems with herbicide-resistant weeds, and researchers focused on the development of herbicide-resistant weeds and on weed genomics.
Changes/Problems: Nothing Reported
What opportunities for training and professional development has the project provided? Nothing Reported
How have the results been disseminated to communities of interest? The transcriptome is now publicly available at http://www.weedgenomics.com and will be submitted to the NCBI repository prior to publication.
What do you plan to do during the next reporting period to accomplish the goals? Genetic markers (SNPs and SSRs) are being developed from the transcriptome sequencing. Completing this work will finalize Objective 2 of the project.

Impacts
What was accomplished under these goals? The transcriptome has been developed and is now in the process of being published in a scientific journal. The transcriptome is now publicly available at http://www.weedgenomics.com. This completes Objective 1; Objective 2 is to be completed in the coming year.

Publications