Dataset title: Diatom composition and environmental data from the Greater Everglades, Florida, USA (2013-2020)
Dataset ID: doi:10.6073/pasta/3ed2c57c99b5b866a9ba2090f0d55aa4

Dataset Creator
Name: Kelsey Solomon Organization: Florida International University Email: ksolomon@fiu.edu
Name: R. Stevenson Organization: Michigan State University Email: rjstev@msu.edu
Name: Donatto Surratt Organization: National Park Service Email: donatto_surratt@nps.gov
Name: Kevin Whelan Organization: South Florida / Caribbean Inventory and Monitoring Network Email: kevin_r_whelan@nps.gov
Name: Franco Tobias Organization: Florida International University Email: tobiasf@fiu.edu
Name: Katherine Johnson Organization: Florida International University Email: kajohnso@fiu.edu
Name: Evelyn Gaiser Organization: Florida International University Email: gaisere@fiu.edu

Dataset Abstract
Environmental and diatom data were collected from sites in the Big Cypress National Preserve (BICY) by the South Florida/Caribbean Inventory and Monitoring Network of the National Park Service and from sites in the Everglades Protection Area (EPA) as part of the Monitoring and Assessment Program of the Comprehensive Everglades Restoration Plan. Samples collected between 2013 and 2020 are included in this dataset. Environmental data include driver variables that have been found to influence diatom assemblage composition in the greater Everglades ecosystem: periphyton mat total phosphorus (a proxy for phosphorus in the environment), water column pH, water column conductivity, water depth, days since last dry, and hydroperiod. Diatom data consist of diatom species composition expressed as percent relative abundance. The included code pertains to the methods described in "Robust species optima estimates from non-uniformly sampled environmental gradients" by Solomon et al. 2025, Journal of Paleolimnology.

Geographic Coverage
Bounding Coordinates
Geographic description: This study took place in the Greater Everglades ecosystem of South Florida, USA, within the Big Cypress National Preserve and Everglades Protection Area.
West bounding coordinate: -81.339244
East bounding coordinate: -80.307383
North bounding coordinate: 26.3875456
South bounding coordinate: 25.2787254

Temporal Coverage
Start Date: 2013
End Date: 2020

Data Table
Entity Name: FCE_1276_Solomon_Comparison_Data
Entity Description: CSV file containing diatom composition data and associated environmental data.
Object Name: FCE_1276_Solomon_Comparison_Data.csv
Data Format
Number of Header Lines: 1
Attribute Orientation: column
Field Delimiter: ,
Number of Records: 6709

Attributes
Attribute Name: TAG_ID
  Attribute Label: TAG_ID
  Attribute Definition: Sample tag ID
  Storage Type: string
  Measurement Scale: text
  Missing Value Code:
Attribute Name: BC_HARMONIZEDNAME
  Attribute Label: BC_HARMONIZEDNAME
  Attribute Definition: Harmonized taxon name
  Storage Type: string
  Measurement Scale: text
  Missing Value Code:
Attribute Name: SPPCODEBC
  Attribute Label: SPPCODEBC
  Attribute Definition: Diatom nine-character taxon code
  Storage Type: string
  Measurement Scale: text
  Missing Value Code:
Attribute Name: RELATIVE_PCT_ABUND
  Attribute Label: RELATIVE_PCT_ABUND
  Attribute Definition: Relative percent abundance of the diatom taxon within the sample
  Storage Type: float
  Measurement Scale: Units: percent Number Type: real
  Missing Value Code:
Attribute Name: SURFACE_WATER_PH
  Attribute Label: SURFACE_WATER_PH
  Attribute Definition: pH of water sample
  Storage Type: float
  Measurement Scale: Units: dimensionless Number Type: real
  Missing Value Code: -9999 (pH meter not working)
Attribute Name: PERI_TP
  Attribute Label: PERI_TP
  Attribute Definition: Concentration of total phosphorus per dried gram of periphyton; unit = µg/g mat dry mass
  Storage Type: float
  Measurement Scale: Units: microgramPerGram Precision: 0.01 Number Type: real
  Missing Value Code:
Attribute Name: SURFACE_WATER_COND
  Attribute Label: SURFACE_WATER_COND
  Attribute Definition: Conductivity of water sample; unit = µS cm-1
  Storage Type: float
  Measurement Scale: Units: microsiemensPerCentimeter Precision: 1 Number Type: real
  Missing Value Code: -9999 (conductivity meter not working)
Attribute Name: DEPTH
  Attribute Label: DEPTH
  Attribute Definition: Average water depth of 3 measurements representative of replicate; unit = cm
  Storage Type: float
  Measurement Scale: Units: centimeter Precision: 0.1 Number Type: real
  Missing Value Code:
Attribute Name: DSLDD
  Attribute Label: DSLDD
  Attribute Definition: Days since last dry down (water depth <= 5 cm); unit = days
  Storage Type: integer
  Measurement Scale: Units: nominalDay Precision: 1 Number Type: integer
  Missing Value Code:
Attribute Name: HYDROPERIOD_MEAN
  Attribute Label: HYDROPERIOD_MEAN
  Attribute Definition: Mean hydroperiod length for each site across the entire time series of that site (i.e., mean annual hydroperiod value from 1991 through the year that the site was last sampled); unit = days
  Storage Type: float
  Measurement Scale: Units: nominalDay Precision: 0.1 Number Type: real
  Missing Value Code:
Attribute Name: PROJECT
  Attribute Label: PROJECT
  Attribute Definition: BICY (Big Cypress National Preserve) or EVER (Everglades Protection Area) dataset
  Storage Type: string
  Measurement Scale: BICY = Big Cypress National Preserve; EVER = Everglades Protection Area
  Missing Value Code:
Attribute Name: MANUSCRIPT_TAXON_NAME
  Attribute Label: MANUSCRIPT_TAXON_NAME
  Attribute Definition: Harmonized taxon name used in the associated manuscript, which reflects the most current taxonomic nomenclature
  Storage Type: string
  Measurement Scale: text
  Missing Value Code:

Data Table
Entity Name: FCE_1276_Solomon_Comparison_GPS
Entity Description: CSV data file containing GPS locations for each site that was sampled
Object Name: FCE_1276_Solomon_Comparison_GPS.csv
Data Format
Number of Header Lines: 1
Attribute Orientation: column
Field Delimiter: ,
Number of Records: 545

Attributes
Attribute Name: SITE_ID
  Attribute Label: SITE_ID
  Attribute Definition: Site ID
  Storage Type: string
  Measurement Scale: text
  Missing Value Code:
Attribute Name: EASTING_UTM
  Attribute Label: EASTING_UTM
  Attribute Definition: Easting (x) coordinate of site in UTM
  Storage Type: float
  Measurement Scale: Units: meter Number Type: real
  Missing Value Code:
Attribute Name: NORTHING_UTM
  Attribute Label: NORTHING_UTM
  Attribute Definition: Northing (y) coordinate of site in UTM
  Storage Type: float
  Measurement Scale: Units: meter Number Type: real
  Missing Value Code:
Attribute Name: PROJECT
  Attribute Label: PROJECT
  Attribute Definition: BICY (Big Cypress National Preserve) or EVER (Everglades Protection Area) dataset
  Storage Type: string
  Measurement Scale: BICY = Big Cypress National Preserve; EVER = Everglades Protection Area
  Missing Value Code:
Attribute Name: ZONE_UTM
  Attribute Label: ZONE_UTM
  Attribute Definition: UTM Zone
  Storage Type: string
  Measurement Scale: 17N = 17 North
  Missing Value Code:
Attribute Name: LONGITUDE
  Attribute Label: LONGITUDE
  Attribute Definition: Longitude coordinates of site in decimal degrees (WGS84)
  Storage Type: float
  Measurement Scale: Units: degree Number Type: real
  Missing Value Code:
Attribute Name: LATITUDE
  Attribute Label: LATITUDE
  Attribute Definition: Latitude coordinates of site in decimal degrees (WGS84)
  Storage Type: float
  Measurement Scale: Units: degree Number Type: real
  Missing Value Code:

Methods
Method Step Description
The EPA periphyton samples were collected annually as part of the Monitoring and Assessment Program of the Comprehensive Everglades Restoration Plan (CERP 2020). Generalized random-tessellation stratification (Stevens and Olsen 2004) was used to determine the location of three unique, sampleable Primary Sampling Units (PSUs) each year within 800 × 800 m grid cells (Philippi 2005).
To be considered sampleable habitat, a PSU had to be less than 1 m deep and contain vegetation that was not too dense for the sampling device to enclose 1 m³ of the water column; such habitat consisted primarily of wet prairies and sloughs. Samples were collected from each PSU during a four-month window (September–December) representing the peak wet season in the EPA. At each PSU, a 1-m³ enclosure with mesh sides and an open top and bottom was used to collect samples. All periphyton within the enclosure was collected, and its biovolume was measured using a graduated cylinder. If no calcareous benthic, epiphytic, or metaphytic periphyton was present, flocculent detritus or filamentous green algae were collected. A 120-mL homogenized subsample of the periphyton was returned to the lab on ice, frozen, and then thawed before processing.

The BICY periphyton samples were collected annually by the South Florida/Caribbean Inventory and Monitoring Network (SFCN) of the National Park Service in wetlands in the northwest corner of the Preserve, because surface water monitoring in that area indicated high concentrations of P (Urgelles et al. 2019). The monitoring area was delineated into seven hydrologically distinct basins separated by natural and artificial barriers. Some of these basins had anthropogenically elevated surface water P concentrations, while other basins had ambient surface water P concentrations. A restricted stratified sampling design was used to establish six to seven permanent, random sampling sites per basin, targeting freshwater broadleaf marsh habitat. Site selection was constrained by the following criteria for sampleable habitat: the site had to be located within 250 m of freshwater broadleaf marsh habitat, at least 1000 m from other sites within the same basin, and more than 100 m from a road or trail; it also had to be accessible by road or helicopter and free of obvious human disturbance and of vegetation too dense to sample.
Samples were collected from each site during October–March. For samples collected between hydrologic years 2013 and 2019, the preferred substrate collection order for calcareous periphyton was (1) floating mat, (2) epiphytic, (3) benthic, and (4) epidendric. At each site, a minimum of five grab samples of periphyton were collected within a 5 m radius and composited into two 120-mL homogenized samples. For samples collected in hydrologic year 2020, there was no preferred substrate collection order; instead, periphyton was collected in proportion to the substrates represented at the site. Each sample collected was divided into two 125-mL Nalgene (Thermo Fisher Scientific Inc., Waltham, Massachusetts) bottles, placed on ice, and returned to the lab. For each sample, one subsample (one 125-mL Nalgene bottle) was preserved in a 3% formalin solution for diatom assemblage analysis, and the other subsample (one 125-mL Nalgene bottle) was frozen and thawed before further processing. Water depth (cm) was measured at each site using a 1-m measuring stick, and surface water conductivity (µS cm-1), temperature (°C), and pH were measured using a multimeter probe. Days since the last dry down (the number of days since flooding of the marsh surface after the latest drying event, when water levels were <5 cm) and hydroperiod (days flooded) of the sample sites were estimated by calibrating measured water depths to nearby continuous water level gauges using digital elevation models provided by the Everglades Depth Estimation Network (“EDEN”; https://sofia.usgs.gov/eden/stationlist.php).

Sample processing
In the lab, animals, plant matter, and other debris were removed from the periphyton, and subsamples were taken for the measurement of dry mass, periphyton mat total P (TP) concentrations, and diatom taxonomic composition analysis. To obtain periphyton dry mass (g), the biomass subsample was dried at 80°C for >48 h and then weighed.
To obtain mat TP, the TP subsample was dried at 80°C, pulverized with a mortar and pestle, and then processed using colorimetric analysis to estimate TP concentration, expressed as µg/g mat dry mass (Solórzano and Sharp 1980). We used mat TP as a proxy for TP in the environment because mat TP has been shown to correlate more strongly with P loading than traditional water column P measurements do (Gaiser et al. 2004). Excess P delivered to the ultraoligotrophic Everglades is rapidly assimilated by periphyton and vegetation, making TP almost undetectable in the water column even when the system has been exposed to enriched inputs for years (Gaiser et al. 2004; Gaiser 2009). Diatom samples were cleaned of mineral debris, calcite, and organic matter using sulfuric acid oxidation methods following Hasle and Fryxell (1970), and a known volume was then permanently affixed to a glass slide using Naphrax (PhycoTech Inc., St. Joseph, Michigan) mounting medium. A minimum of 500 valves was counted and identified per slide (Weber 1980) using a compound light microscope at 1000× magnification under oil immersion. Identifications were made to the lowest taxonomic level possible using Diatoms of North America (diatoms.org), a database of South Florida diatom taxa (https://fce-lter.fiu.edu/data/database/diatom/), and other regional references (Slate and Stevenson 2007; Lee et al. 2014). Raw diatom counts were converted to percent relative abundance by dividing the number of valves counted for each taxon by the total number of valves counted in the sample.

Taxonomic harmonization
While a consistent photo-documented voucher flora was generated for both the BICY and EPA datasets to guide taxonomic decisions, different taxonomic sources and conventions were used for the two wetlands, requiring a taxonomic harmonization step prior to analysis.
Frequent discussions between taxonomists responsible for the BICY and EPA floras enabled the harmonization of distinct morphological taxonomic units (MOTUs) for the most common taxa, and a taxonomic assignment for these MOTUs was agreed upon. When agreement on a taxonomic assignment for a MOTU could not be achieved, taxa were lumped into a new composite MOTU (Table S1). Instead of dropping difficult taxa from an analysis, inclusion through lumping can improve transfer functions involving multiple analysts, provided that the lumping decision is consistent throughout the combined datasets (Lee et al. 2019). Harmonization resulted in consistent assignments of 97.2% and 98.8% of diatom MOTUs in the BICY and EPA datasets, respectively; the remaining non-harmonized taxa and those not identified to the species level or below were removed from the analysis. Since the presence of rare species in data sets can create noise and reduce the clarity of underlying patterns, and because we were interested in comparing the mat TP optima of taxa that were present in both wetlands, we removed taxa occurring in < 1% of the samples in each of the regional datasets (BICY and EPA) and taxa with a mean relative abundance of < 0.5% in each of the regional datasets. When harmonization and screening were complete, we merged the two taxonomic and environmental datasets into a common combined dataset.

References:
CERP (Comprehensive Everglades Restoration Plan) (2020) 2020 CERP Report to Congress. Department of the Army, Washington DC
Gaiser EE, Scinto LJ, Richards JH, Jayachandran K, Childers DL, Trexler JC, Jones RD (2004) Phosphorus in periphyton mats provides the best metric for detecting low-level P enrichment in an oligotrophic wetland. Wat Res 38:507–516
Gaiser EE (2009) Periphyton as an indicator of restoration in the Florida Everglades. Ecol Indic 9:S37–S45
Hasle GR, Fryxell GA (1970) Diatoms: Cleaning and Mounting for Light and Electron Microscopy. T Am Microsc Soc 89:469–474
Lee SS, Gaiser EE, Van De Vijver B, Edlund MB, Spaulding SA (2014) Morphology and typification of Mastogloia smithii and M. lacustris, with descriptions of two new species from the Florida Everglades and the Caribbean region. Diatom Res 29:325–350
Lee SS, Bishop IW, Spaulding SA, Mitchell RM, Yuan LL (2019) Taxonomic harmonization may reveal a stronger association between diatom assemblages and total phosphorus in large datasets. Ecol Indic 102:166–174
Philippi T (2005) Adaptive Cluster Sampling for Estimation of Abundances Within Local Populations of Low-Abundance Plants. Ecol 86:1091–1100
Slate JE, Stevenson RJ (2007) The diatom flora of phosphorus-enriched and unenriched sites in an Everglades marsh. Diatom Res 22:355–386
Solórzano L, Sharp JH (1980) Determination of total dissolved phosphorus and particulate phosphorus in natural waters. Limnol Oceanogr 25:754–758
Stevens DL, Olsen AR (2004) Spatially Balanced Sampling of Natural Resources. J Am Stat Assoc 99:262–278
Urgelles R, Whelan RT, Muxo R, Shamblin RB, Patterson JM (2019) Periphyton Monitoring in Big Cypress National Preserve: Protocol Narrative. Natural Resource Report, National Park Service.
Weber CI (1980) Biological Field and Laboratory Methods for Measuring the Quality of Surface Waters and Effluents. US Environmental Protection Agency, Cincinnati

Distribution
Online distribution: https://pasta.lternet.edu/package/data/eml/knb-lter-fce/1276/2/ce5791c0b85a6028f5a5f6c569f6c2a9

Intellectual Rights

Dataset Keywords
Big Cypress National Preserve
Weighted average
Everglades National Park
FCE LTER
Florida Coastal Everglades LTER
diatoms
freshwater
wetlands
periphyton
populations

Dataset Contact
Name: Kelsey Solomon Organization: Florida International University Email: ksolomon@fiu.edu
Name: Evelyn Gaiser Organization: Florida International University Email: gaisere@fiu.edu
Position: Information Manager Organization: Florida Coastal Everglades LTER Address: Florida International University 11200 SW 8th Street, OE 148 Miami, FL 33199 USA Email: fcelter@fiu.edu URL: https://fcelter.fiu.edu

Data Table and Format
Data Table: CSV file containing diatom composition data and associated environmental data.
Entity Name: FCE_1276_Solomon_Comparison_Data
Entity Description: CSV file containing diatom composition data and associated environmental data.
Object Name: FCE_1276_Solomon_Comparison_Data.csv
Number of Header Lines: 1
Attribute Orientation: column
Field Delimiter: ,
Number of Records: 6709

Data Table: CSV data file containing GPS locations for each site that was sampled
Entity Name: FCE_1276_Solomon_Comparison_GPS
Entity Description: CSV data file containing GPS locations for each site that was sampled
Object Name: FCE_1276_Solomon_Comparison_GPS.csv
Number of Header Lines: 1
Attribute Orientation: column
Field Delimiter: ,
Number of Records: 545

Metadata Provider
Organization: Florida Coastal Everglades LTER
Address: Florida International University 11200 SW 8th Street, OE 148 Miami, FL 33199 USA
Phone: 305-348-6054
Email: fcelter@fiu.edu
URL: https://fcelter.fiu.edu

Award(s)
Project award(s):
Award title: Establishing a Protective Phosphorus Criterion for the Big Cypress National Preserve Funder name: National Park Service Award number: P22AC00276-00
Award title: Aquatic fauna and periphyton production data collection Funder name: United States Army Corps of Engineers Award number: W912HZ-11-2-0048
Award title: Aquatic fauna and periphyton production data collection Funder name: United States Army Corps of Engineers Award number: W912HZ-16-2-0008
Award title: Aquatic fauna and periphyton production data collection Funder name: United States Army Corps of Engineers Award number: W912HZ-20-2-0018
Award title: George M. Barley Jr. Eminent Scholars Chair Endowment Funder name: Florida International University
Award title: Periphyton Monitoring in Big Cypress National Preserve Funder name: South Florida Inventory and Monitoring Network, Big Cypress National Preserve Resource and Management and Fire Aviation branch
Related project award(s):
Award title: LTER: Coastal Oligotrophic Ecosystem Research Funder name: National Science Foundation Award number: 2025954 Award URL: https://www.nsf.gov/awardsearch/showAward?AWD_ID=2025954&HistoricalAwards=false

Project permits
National Park Service scientific research and collecting permit EVER-2016-SCI-0003
National Park Service scientific research and collecting permit EVER-2018-SCI-0054
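
Example: reading the data table and reproducing the screening arithmetic
The following is a minimal sketch, not the authors' code, of three steps described in this record: converting valve counts to percent relative abundance, treating the -9999 missing value code as missing when reading the data table, and applying the rare-taxon thresholds from the Methods (< 1% of samples, mean relative abundance < 0.5%). It uses only the Python standard library and a small invented table. The taxon codes, sample IDs, and all values in TOY_CSV are hypothetical, as are the function names. Two interpretations are assumptions: a taxon is retained only if it meets both thresholds in every regional dataset (PROJECT), and samples in which a taxon is absent count as 0% toward its mean.

```python
import csv
import io
from collections import defaultdict

MISSING_CODE = -9999.0  # code listed for SURFACE_WATER_PH and SURFACE_WATER_COND

# Hypothetical rows using a subset of the column names from the data table.
TOY_CSV = """TAG_ID,SPPCODEBC,RELATIVE_PCT_ABUND,SURFACE_WATER_PH,PROJECT
B1,TAXON_A,60.0,7.4,BICY
B1,TAXON_B,39.8,7.4,BICY
B1,TAXON_C,0.2,7.4,BICY
B2,TAXON_A,100.0,-9999,BICY
E1,TAXON_A,50.0,7.9,EVER
E1,TAXON_B,50.0,7.9,EVER
"""


def percent_abundance(valve_counts):
    """Convert {taxon: valves counted} for one slide to percent relative abundance."""
    total = sum(valve_counts.values())
    return {taxon: 100.0 * n / total for taxon, n in valve_counts.items()}


def read_table(text):
    """Parse CSV text with one header line; map the missing value code to None."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        ph = float(rec["SURFACE_WATER_PH"])
        rows.append({
            "TAG_ID": rec["TAG_ID"],
            "PROJECT": rec["PROJECT"],
            "SPPCODEBC": rec["SPPCODEBC"],
            "RELATIVE_PCT_ABUND": float(rec["RELATIVE_PCT_ABUND"]),
            "SURFACE_WATER_PH": None if ph == MISSING_CODE else ph,
        })
    return rows


def screen_taxa(rows, min_occurrence_pct=1.0, min_mean_pct=0.5):
    """Return taxa meeting both thresholds in every regional dataset (assumption)."""
    samples = defaultdict(set)                      # project -> sample IDs
    abund = defaultdict(lambda: defaultdict(dict))  # project -> taxon -> {sample: pct}
    for r in rows:
        samples[r["PROJECT"]].add(r["TAG_ID"])
        if r["RELATIVE_PCT_ABUND"] > 0:
            abund[r["PROJECT"]][r["SPPCODEBC"]][r["TAG_ID"]] = r["RELATIVE_PCT_ABUND"]
    taxa = {t for per_project in abund.values() for t in per_project}
    kept = set()
    for taxon in taxa:
        passes = True
        for project, ids in samples.items():
            present = abund[project].get(taxon, {})
            occurrence = 100.0 * len(present) / len(ids)  # % of samples with the taxon
            mean_pct = sum(present.values()) / len(ids)   # absent samples count as 0
            if occurrence < min_occurrence_pct or mean_pct < min_mean_pct:
                passes = False
        if passes:
            kept.add(taxon)
    return kept


rows = read_table(TOY_CSV)
print(percent_abundance({"TAXON_A": 300, "TAXON_B": 200}))
print(sorted(screen_taxa(rows)))
```

In the toy table, TAXON_C is dropped because its mean relative abundance in BICY is 0.1% and it does not occur in EVER. For the published file, the same reader could be pointed at FCE_1276_Solomon_Comparison_Data.csv, extending the missing-code check to SURFACE_WATER_COND.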