Dataset title: Leaf litter, soil, and periphyton gene expression along freshwater to marine gradients in Everglades National Park (FCE LTER), Florida, USA, January 2021 and April 2021
Dataset ID: doi:10.6073/pasta/6957cf577776d39845ffef077a590cfc

Dataset Creator
Name: Kenneth Anderson
Position: Post-doctoral Scholar
Organization: Kent State University
Email: kanderson624@gmail.com

Name: John Kominoski
Position: Professor
Organization: Florida International University
Email: jkominos@fiu.edu

Name: Chang Jae Choi
Position: Post-doctoral Scholar
Organization: University of Florida
Email: changjae.choi@ufl.edu

Name: Ulrich Stingl
Position: Assistant Professor
Organization: University of Florida
Email: ustingl@ufl.edu

Metadata Provider
Organization: Florida Coastal Everglades LTER
Address: Florida International University, 11200 SW 8th Street, OE 148, Miami, FL 33199, USA
Phone: 305-348-6054
Email: fcelter@fiu.edu
URL: https://fcelter.fiu.edu

Dataset Abstract
We collected leaf litter, periphyton, and soil along freshwater to marine gradients at SRS-2, SRS-4, SRS-6, TS/Ph-2, TS/Ph-3, TS/Ph-7a, and TS/Ph-10. Samples were collected in January and April of 2021 to understand how microbial communities respond to and influence the breakdown of organic matter along freshwater to marine transects. Data collection for this project is complete. For each site and litter pair we collected a subsample of 2-3 g wet mass of litter, and at each site we collected a grab sample of soil and a grab sample of periphyton. All subsamples were preserved at -20°C until extraction, which took place up to a year after initial collection. Samples were sent to Novogene (Novogene Co. Ltd., Beijing, China) for total RNA extraction followed by metatranscriptome sequencing.
We selected n = 12 genes/gene families encoding focal enzymes to investigate which are important to the breakdown of organic matter: dioxygenases (associated with aerobic respiration), sulfatases (associated with the release of sulfates from complex molecules), sulfite reductases (associated with sulfite reduction), methyl coenzyme M reductase and formylmethanofuran (associated with methanogenesis), nitrite reductases (associated with nitrite reduction), cellobiosidase, glucosidase, and xylosidase (associated with cellulose breakdown), phenol oxidase (associated with lignin breakdown), acid phosphatase (associated with phosphate acquisition in acidic environments), and alkaline phosphatase (associated with phosphate acquisition in basic environments). For each gene/family of interest, we searched all annotated transcripts for all entries corresponding to that gene/family and summed their values to obtain a total expression. We selected n = 6 monophyletic microbial functional groups associated with sulfur, methane, and nitrogen cycling: sulfate reducers, sulfite oxidizers, methane oxidizers, methanogens, nitrite oxidizers, and ammonia oxidizers. We filtered all annotated transcripts for all species with the following in the name: ‘desulfo’ for sulfate reducers, ‘sulfito’ for sulfite oxidizers, ‘methylo’ for methyl/methane oxidizers, ‘methano’ for methanogens, ‘nitro’ for nitrite oxidizers, and ‘nitroso’ for aerobic ammonia oxidizers.
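
The keyword matching described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' scripts: the example rows, the `function`, `organism`, and `expr` field names, the gene search strings, and the `summed_expression` helper are all hypothetical. Only the six organism-name fragments are taken from the text.

```python
# Hypothetical annotation rows; names and values are invented for illustration
# and are not taken from the dataset.
ROWS = [
    {"function": "alkaline phosphatase", "organism": "Desulfovibrio vulgaris", "expr": 0.25},
    {"function": "acid phosphatase", "organism": "Methanosarcina barkeri", "expr": 0.125},
    {"function": "beta-glucosidase", "organism": "Nitrosomonas europaea", "expr": 0.5},
    {"function": "alkaline phosphatase D", "organism": "Methylococcus capsulatus", "expr": 0.25},
]

# Organism-name fragments quoted in the text.
TAXON_FRAGMENTS = {
    "sulfate reducers": "desulfo",
    "sulfite oxidizers": "sulfito",
    "methyl/methane oxidizers": "methylo",
    "methanogens": "methano",
    "nitrite oxidizers": "nitro",
    "aerobic ammonia oxidizers": "nitroso",
}

# Assumed search strings for three of the focal enzymes; the record does not
# list the exact strings that were searched.
GENE_FRAGMENTS = {
    "Alkaline_phosphatase": "alkaline phosphatase",
    "Acid_Phosphatase": "acid phosphatase",
    "Glucosidase": "glucosidase",
}


def summed_expression(rows, fragments, column):
    """Sum `expr` over rows whose `column` contains each fragment (case-insensitive)."""
    totals = {label: 0.0 for label in fragments}
    for row in rows:
        text = row[column].lower()
        for label, fragment in fragments.items():
            if fragment in text:
                totals[label] += row["expr"]
    return totals


gene_totals = summed_expression(ROWS, GENE_FRAGMENTS, "function")
taxon_totals = summed_expression(ROWS, TAXON_FRAGMENTS, "organism")
print(gene_totals)
print(taxon_totals)
```

With plain substring matching, a name such as Nitrosomonas contains both ‘nitro’ and ‘nitroso’, so this sketch counts it in both groups. The record does not state how the original analysis treated such overlaps.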

Geographic Coverage
Bounding Coordinates

Geographic description: SRS2
West bounding coordinate: -80.78520692
East bounding coordinate: -80.78520692
North bounding coordinate: 25.54972811
South bounding coordinate: 25.54972811

Geographic description: SRS4
West bounding coordinate: -80.96431016
East bounding coordinate: -80.96431016
North bounding coordinate: 25.40976421
South bounding coordinate: 25.40976421

Geographic description: SRS6
West bounding coordinate: -81.07794623
East bounding coordinate: -81.07794623
North bounding coordinate: 25.36462994
South bounding coordinate: 25.36462994

Geographic description: TS/Ph2
West bounding coordinate: -80.60690341
East bounding coordinate: -80.60690341
North bounding coordinate: 25.40357188
South bounding coordinate: 25.40357188

Geographic description: TS/Ph3
West bounding coordinate: -80.66271768
East bounding coordinate: -80.66271768
North bounding coordinate: 25.25240534
South bounding coordinate: 25.25240534

Geographic description: TS/Ph7a
West bounding coordinate: -80.63910514
East bounding coordinate: -80.63910514
North bounding coordinate: 25.19080491
South bounding coordinate: 25.19080491

Geographic description: TS/Ph10
West bounding coordinate: -80.68097374
East bounding coordinate: -80.68097374
North bounding coordinate: 25.02476744
South bounding coordinate: 25.02476744

Temporal Coverage
Start Date: 2021
End Date: 2021

Data Table
Entity Name: FCE1270_FocalGenedata
Entity Description: Transcriptomic gene expression from deployed litter samples, soil samples, and periphyton samples.
Object Name: FCE1270_FocalGenedata.csv

Data Format
Number of Header Lines: 1
Attribute Orientation: column
Field Delimiter: ,
Number of Records:

Attributes

Attribute Name: SITENAME
Attribute Label: SITENAME
Attribute Definition: Name of LTER site
Storage Type: string
Measurement Scale:
  SRS2 = Site SRS-2
  SRS4 = Site SRS-4
  SRS6 = Site SRS-6
  TS/Ph10 = Site TS/Ph-10
  TS/Ph2 = Site TS/Ph-2
  TS/Ph3 = Site TS/Ph-3
  TS/Ph7a = Site TS/Ph-7a
Missing Value Code:

Attribute Name: Species
Attribute Label: Species
Attribute Definition: Species of vegetation that microbes were isolated from
Storage Type: string
Measurement Scale:
  Eleocharis = Eleocharis cellulosa
  Mangrove = Rhizophora mangle
  Periphyton = Mixed periphyton was collected for these samples.
  Sawgrass = Cladium jamaicense
  Soil = Soil from the top 10 cm of the soil was collected for these samples
  Thalassia = Thalassia testudinum
Missing Value Code:

Attribute Name: Month
Attribute Label: Month
Attribute Definition: Month during which samples were collected
Storage Type: string
Measurement Scale:
  1 = January
  4 = April
Missing Value Code:

Attribute Name: Type
Attribute Label: Type
Attribute Definition: This column groups samples by whether they are from deployed leaf litter packs, periphyton, or soil.
Storage Type: string
Measurement Scale:
  Litter = Samples from deployed leaf litter packs
  Peri = Samples where periphyton was collected if present.
  Soil = Samples where soil was collected from the top 10 cm of the soil surface
Missing Value Code:

Attribute Name: Dioxygenases
Attribute Label: Dioxygenases
Attribute Definition: Summed total relative expression of genes associated with dioxygenases
Storage Type: float
Measurement Scale:
  Units: dimensionless
  Number Type: real
Missing Value Code:

Attribute Name: Sulfatases
Attribute Label: Sulfatases
Attribute Definition: Summed total relative expression of genes associated with sulfatases
Storage Type: float
Measurement Scale:
  Units: dimensionless
  Number Type: real
Missing Value Code:

Attribute Name: Sulfite_Reductases
Attribute Label: Sulfite Reductases
Attribute Definition: Summed total relative expression of genes associated with sulfite reductases
Storage Type: float
Measurement Scale:
  Units: dimensionless
  Number Type: real
Missing Value Code:

Attribute Name: Formylmethanofuran
Attribute Label: Formylmethanofuran
Attribute Definition: Summed total relative expression of genes associated with formylmethanofuran
Storage Type: float
Measurement Scale:
  Units: dimensionless
  Number Type: real
Missing Value Code:

Attribute Name: Nitrite_reductases
Attribute Label: Nitrite reductases
Attribute Definition: Summed total relative expression of genes associated with nitrite reductases
Storage Type: float
Measurement Scale:
  Units: dimensionless
  Number Type: real
Missing Value Code:

Attribute Name: Cellobiosidase
Attribute Label: Cellobiosidase
Attribute Definition: Summed total relative expression of genes associated with cellobiosidases
Storage Type: float
Measurement Scale:
  Units: dimensionless
  Number Type: real
Missing Value Code:

Attribute Name: Glucosidase
Attribute Label: Glucosidase
Attribute Definition: Summed total relative expression of genes associated with glucosidases
Storage Type: float
Measurement Scale:
  Units: dimensionless
  Number Type: real
Missing Value Code:

Attribute Name: Xylosidase
Attribute Label: Xylosidase
Attribute Definition: Summed total relative expression of genes associated with xylosidases
Storage Type: float
Measurement Scale:
  Units: dimensionless
  Number Type: real
Missing Value Code:

Attribute Name: Phenol_Oxidase
Attribute Label: Phenol Oxidase
Attribute Definition: Summed total relative expression of genes associated with phenol oxidases
Storage Type: float
Measurement Scale:
  Units: dimensionless
  Number Type: real
Missing Value Code:

Attribute Name: Alkaline_phosphatase
Attribute Label: Alkaline phosphatase
Attribute Definition: Summed total relative expression of genes associated with alkaline phosphatases
Storage Type: float
Measurement Scale:
  Units: dimensionless
  Number Type: real
Missing Value Code:

Attribute Name: Acid_Phosphatase
Attribute Label: Acid Phosphatase
Attribute Definition: Summed total relative expression of genes associated with acid phosphatases
Storage Type: float
Measurement Scale:
  Units: dimensionless
  Number Type: real
Missing Value Code:

Attribute Name: k.dd
Attribute Label: k.dd
Attribute Definition: Leaf litter breakdown rate per degree day
Storage Type: float
Measurement Scale:
  Units: k.dd
  Number Type: real
Missing Value Code: NA (Soil and periphyton samples have no breakdown rate as they were environmental samples and not deployed leaf litter.)

Attribute Name: Salinity
Attribute Label: Salinity
Attribute Definition: Water salinity
Storage Type: float
Measurement Scale:
  Units: PSU
  Precision: 0.1
  Number Type: real
Missing Value Code:

Attribute Name: TP
Attribute Label: TP
Attribute Definition: Total phosphorus in the water column
Storage Type: float
Measurement Scale:
  Units: micromolePerLiter
  Precision: 0.01
  Number Type: real
Missing Value Code:

Methods
Method Step
Description
For each site and litter pair we collected a subsample of 2-3 g wet mass of litter, and at each site we collected a grab sample of soil and a grab sample of periphyton. All subsamples were preserved at -20°C until extraction, which took place up to a year after initial collection.
Samples were sent to Novogene (Novogene Co. Ltd., Beijing, China) for total RNA extraction followed by metatranscriptome sequencing. Briefly, total RNA was extracted using TRIzol reagent (Rio et al. 2010), and the quality and quantity of the RNA were assessed using the Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA) and Nanodrop ND-1000 (ThermoScientific, Waltham, MA, USA), respectively. After the total RNA samples passed the quality check, cDNA libraries were prepared from total RNA using poly(A) enrichment of the mRNA to remove rRNA, resulting in cDNA libraries with 250-300 bp inserts, which were sequenced by paired-end (PE) sequencing (PE 2 × 150 bp) on an Illumina NovaSeq 6000 platform (NovaSeq Reagent Kits, Illumina, Inc., San Diego, CA, USA). Raw reads were processed using the Simple Annotation of Metatranscriptomes by Sequence Analysis 2.0 (SAMSA2) pipeline (Westreich et al. 2018) with slight modifications. Briefly, low-quality bases were trimmed using Trimmomatic v0.39 (Bolger et al. 2014) and overlapping paired-end reads were merged into single sequences using PEAR v0.9.11 (Zhang et al. 2014). Ribosomal RNA reads were removed with SortMeRNA v2.1 (Kopylova et al. 2012), and the cleaned transcripts were annotated with DIAMOND v0.9.36 (Buchfink et al. 2021) against the National Center for Biotechnology Information (NCBI) Reference Sequence (RefSeq) database (O'Leary et al. 2016) for taxonomic and functional characterization. The resulting annotation files were aggregated and merged with custom Python and R scripts included in the SAMSA2 pipeline (Westreich et al. 2018).
We selected n = 12 genes/gene families encoding focal enzymes to investigate which are important to the breakdown of organic matter: dioxygenases (associated with aerobic respiration), sulfatases (associated with the release of sulfates from complex molecules), sulfite reductases (associated with sulfite reduction), methyl coenzyme M reductase and formylmethanofuran (associated with methanogenesis), nitrite reductases (associated with nitrite reduction), cellobiosidase, glucosidase, and xylosidase (associated with cellulose breakdown), phenol oxidase (associated with lignin breakdown), acid phosphatase (associated with phosphate acquisition in acidic environments), and alkaline phosphatase (associated with phosphate acquisition in basic environments; Table 1). For each gene/family of interest, we searched all annotated transcripts for all entries corresponding to that gene/family and summed their values to obtain a total expression. We selected n = 6 monophyletic microbial functional groups associated with sulfur, methane, and nitrogen cycling: sulfate reducers, sulfite oxidizers, methane oxidizers, methanogens, nitrite oxidizers, and ammonia oxidizers. We filtered all annotated transcripts for all species with the following in the name: ‘desulfo’ for sulfate reducers, ‘sulfito’ for sulfite oxidizers, ‘methylo’ for methyl/methane oxidizers, ‘methano’ for methanogens, ‘nitro’ for nitrite oxidizers, and ‘nitroso’ for aerobic ammonia oxidizers.

References:
Bolger, A. M., M. Lohse, and B. Usadel. 2014. Trimmomatic: A flexible trimmer for Illumina sequence data. Bioinformatics 30: 2114-2120.
Buchfink, B., K. Reuter, and H.-G. Drost. 2021. Sensitive protein alignments at tree-of-life scale using DIAMOND. Nature Methods 18: 366-368.
Kopylova, E., L. Noé, and H. Touzet. 2012. SortMeRNA: Fast and accurate filtering of ribosomal RNAs in metatranscriptomic data. Bioinformatics 28: 3211-3217.
O'Leary, N. A. and others. 2016. Reference sequence (RefSeq) database at NCBI: Current status, taxonomic expansion, and functional annotation. Nucleic Acids Research 44: D733-D745.
Rio, D. C., M. Ares, G. J. Hannon, and T. W. Nilsen. 2010. Purification of RNA using TRIzol (TRI reagent). Cold Spring Harbor Protocols 2010: pdb.prot5439.
Westreich, S. T., M. L. Treiber, D. A. Mills, I. Korf, and D. G. Lemay. 2018. SAMSA2: A standalone metatranscriptome analysis pipeline. BMC Bioinformatics 19: 1-11.
Zhang, J., K. Kobert, T. Flouri, and A. Stamatakis. 2014. PEAR: A fast and accurate Illumina Paired-End reAd mergeR. Bioinformatics 30: 614-620.

Distribution
Online distribution: https://pasta.lternet.edu/package/data/eml/knb-lter-fce/1270/1/1cc3807f30fe631a87fecf117bb49b44

Intellectual Rights
This information is released under the Creative Commons license - Attribution - CC BY (https://creativecommons.org/licenses/by/4.0/). The consumer of these data ("Data User" herein) is required to cite it appropriately in any publication that results from its use. The Data User should realize that these data may be actively used by others for ongoing research and that coordination may be necessary to prevent duplicate publication. The Data User is urged to contact the authors of these data if any questions about methodology or results occur. Where appropriate, the Data User is encouraged to consider collaboration or co-authorship with the authors. The Data User should realize that misinterpretation of data may occur if used out of context of the original study. While substantial efforts are made to ensure the accuracy of data and associated documentation, complete accuracy of data sets cannot be guaranteed. All data are made available "as is." The Data User should be aware, however, that data are updated periodically and it is the responsibility of the Data User to check for new versions of the data.
The data authors and the repository where these data were obtained shall not be liable for damages resulting from any use or misinterpretation of the data. Thank you.

Dataset Keywords
FCE LTER LTER Florida Coastal Everglades LTER microbes decomposition litter decomposition disturbance

Maintenance
knb-lter-fce.1270.1: Initial publication of dataset. Data collection is complete.

Dataset Contact
Position: Information Manager
Organization: Florida Coastal Everglades LTER
Address: Florida International University, 11200 SW 8th Street, OE 148, Miami, FL 33199, USA
Email: fcelter@fiu.edu
URL: https://fcelter.fiu.edu

Name: Kenneth Anderson
Position: Post-doctoral Scholar
Organization: Kent State University
Email: kanderson624@gmail.com

Project permits
EVER-2019-SCI-0055
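
Example: reading the data table
The data format described above (one header line, comma-delimited, `NA` as the missing value code for k.dd) could be read along the following lines. This is a sketch under stated assumptions: the `SAMPLE` rows are invented and use only a subset of the documented column names, and `to_float` and `read_table` are hypothetical helper names, not part of the dataset or of any published script.

```python
import csv
import io

# Tiny made-up sample using a subset of the documented column names;
# the values are invented and are not taken from the dataset.
SAMPLE = """SITENAME,Species,Month,Type,Dioxygenases,k.dd,Salinity,TP
SRS2,Sawgrass,1,Litter,0.5,0.25,0.5,0.25
SRS2,Soil,1,Soil,0.25,NA,0.5,0.25
TS/Ph10,Thalassia,4,Litter,0.125,0.5,35.0,0.5
"""


def to_float(value):
    """Convert a field to float; the missing value code 'NA' and blanks become None."""
    value = value.strip()
    return None if value in ("", "NA") else float(value)


def read_table(handle, numeric_columns):
    """Read a comma-delimited table with one header line into a list of dicts."""
    rows = []
    for row in csv.DictReader(handle):
        for column in numeric_columns:
            row[column] = to_float(row[column])
        rows.append(row)
    return rows


NUMERIC = ["Dioxygenases", "k.dd", "Salinity", "TP"]
rows = read_table(io.StringIO(SAMPLE), NUMERIC)

# Breakdown rates exist only for deployed litter, so filter on Type before using k.dd.
litter = [row for row in rows if row["Type"] == "Litter"]
print(len(rows), len(litter))
```

To read the published file instead of the sample, open it with `open("FCE1270_FocalGenedata.csv", newline="")` and pass the handle to `read_table`, extending `NUMERIC` with the remaining float columns listed under Attributes.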