Additional Info

Field Value
Title MicrobiomeMilkMap 2021-2022: Data underlying the publication "Seasonal and geographical impact on the Irish raw milk microbiota correlates with chemical composition and climatic variables"
License CC-BY-NC
Teagasc Department Food Biosciences Research
Theme Food
Description
Language English
Principal Investigator (PI) Dr Orla O'Sullivan
Principal Investigator (PI) email orla.osullivan@teagasc.ie
Principal Investigator (PI) ORCID https://orcid.org/0000-0002-4332-1109
Data creator(s)
  1. Min Yap
Geographic coverage Ireland
Digital Object Identifier (DOI) doi.org/10.82253/ZYQK-V069
Citation O'Sullivan, O. (2025). MicrobiomeMilkMap 2021-2022: Data underlying the publication "Seasonal and geographical impact on the Irish raw milk microbiota correlates with chemical composition and climatic variables" [Data set]. Teagasc - The Irish Agriculture and Food Development Authority. https://doi.org/10.82253/ZYQK-V069
Rights notes This license enables reusers to distribute, remix, adapt, and build upon the material in any medium or format for noncommercial purposes only, and only so long as attribution is given to the creator. CC BY-NC includes the following elements: BY: credit must be given to the creator. NC: Only noncommercial uses of the work are permitted.
Related resources
  1. Yap, M., O'Sullivan, O., O'Toole, P.W., Sheehan, J.J., Fenelon, M.A. and Cotter, P.D., (2024) Seasonal and geographical impact on the Irish raw milk microbiota correlates with chemical composition and climatic variables. mSystems, 9, e01290-23. https://doi.org/10.1128/msystems.01290-23
Equipment used
- Sample collection and preparation: Raw bovine milk samples (200 ml) were collected weekly from silos at 9 locations across Ireland from March 2021 to March 2022 (n=241). The samples were collected over 2 days, transported under refrigeration and stored at 4 degrees C, to mimic conditions of their storage in bulk tanks or silos, for a maximum of 48 h before all samples were processed together. Samples were prepared as follows: 30 ml of the bovine milk sample was centrifuged at 4,500 x g for 20 min at 4 degrees C. After centrifugation, the cream and supernatant were discarded, and the pellets were subjected to two washing steps, in which the pellets were resuspended in sterile PBS and centrifuged at 13,000 x g for 1 min, after which the supernatant was discarded. The pellet was stored at -20 degrees C before DNA extraction.
- DNA extraction: Samples were subjected to DNA extraction using the MolYsis complete5 kit (Molzym GmbH & Co. KG, Bremen, Germany), with 50 microlitres of DNA eluted for downstream sequencing. The MolYsis kit was used to improve microbiota characterization by significantly enhancing the microbial sequencing depth of milk samples. gDNA was quantified using the Qubit dsDNA HS assay kit (Invitrogen) and stored at -20 degrees C before library preparation.
- Shotgun metagenomic sequencing: 248 samples (241 samples and 7 controls) were prepared for shotgun metagenomic sequencing according to Illumina Nextera XT library preparation kit guidelines, using unique dual indexes for multiplexing with the Nextera XT index kit (Illumina). Following indexing and clean-up, samples were pooled to an equimolar concentration of 1 nM.
Samples were sequenced in two pools at the Teagasc DNA Sequencing Facility using standard Illumina sequencing protocols: the first pool contained 98 samples and was run on an Illumina NextSeq 550 sequencing platform with a V2 kit, and the second contained 150 samples and was run on an Illumina NextSeq 2000 sequencing platform with a P3 chip.
- Bioinformatic processing: Default parameters were applied for all the bioinformatic tools unless otherwise specified. Quality checks and adapter trimming were performed with FastQC (0.11.8) and cutadapt (2.6). Host reads were aligned to the bovine genome (Bos taurus) and removed with Bowtie2 (2.4.4). Taxonomic classification was performed with Kraken2 (2.0.7) (32) using the Genome Taxonomy Database (release 89), which contains Bacteria and Archaea. SUPER-FOCUS was used to predict the microbiological functional potential of shotgun reads, through the alignment of reads against a reduced SEED database using DIAMOND, with results classified into subsystems (sets of protein families with similar function). Resistome analysis was done using Resistance Gene Identifier (RGI 4.2.2), with the strict cut-off. Assembly of Metagenome Assembled Genomes (MAGs) was done using metaSPAdes (3.13), followed by binning with MetaBAT2 (2.12.1) and quality assessment with CheckM (1.0.18). High-quality MAGs, of at least 90% completeness and less than 5% contamination, were assigned taxonomy with GTDB-Tk (2.1.1).
- Chemical analysis: The chemical composition of 100 ml milk samples was determined by DPTC analytical staff at the Technical Services lab at the Teagasc Food Research Centre. Kjeldahl analysis was used to determine protein and nonprotein nitrogen (NPN) contents. The Rose-Gottlieb method was used to determine fat content, and the CEM SMART Trac II (CEM, Matthews, NC, USA) was used to measure the total solids content. Polarimetry was used to determine the lactose content, and titration was used to determine titratable acidity (TA) in raw milk samples.
- Climatic data: Monthly climate data for the sampling locations relating to mean temperature (degrees C), total rainfall (mm), grass minimum temperature (degrees C), mean wind speed (knots) and sunshine duration (daily hours of sun) were retrieved from the Irish Meteorological Service website (www.met.ie). March, April and May were classified as Spring; June, July and August as Summer; September, October and November as Autumn; and December, January and February as Winter.
- Statistical analysis: Statistical analysis and data visualisation were performed in R (4.1.2). All data were cleaned, analysed and visualised in R with the ggplot2, tidyverse and ggpubr packages (44, 45). Kruskal-Wallis and pairwise Wilcoxon rank sum tests with Benjamini-Hochberg P-value correction were used to compare sampling seasons and locations. Microbiota diversity analysis was performed with the vegan package (46), and beta diversity was calculated as Bray-Curtis metrics, visualised in a principal coordinate analysis plot. The adonis function from the vegan package was used to calculate the permutational analysis of variance (PERMANOVA) to determine differences in composition of the community between groups of samples (number of permutations=999). Redundancy analysis was also done with vegan and visualised using the ggord package. The multipatt function from the indicspecies package was used to identify taxa that were significantly associated with particular seasons and sampling locations, by calculating Pearson's phi coefficient of association and correcting for unequal group sizes using the parameter r.g. Pearson's correlation was measured with the R base function, cor, and visualised using ggcorrplot.
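
For reusers who want to reproduce the season labels described under the climatic data, the month-to-season rule can be sketched as below. This is a minimal illustration in Python, not the authors' code (the original analysis was done in R). The names SEASON_BY_MONTH, season_for_month and season_for_date are hypothetical and do not come from the dataset.

```python
from datetime import date

# Month number -> season, following the classification stated in the record:
# Mar-May = Spring, Jun-Aug = Summer, Sep-Nov = Autumn, Dec-Feb = Winter.
# (Hypothetical helper names; not part of the dataset or the publication.)
SEASON_BY_MONTH = {
    3: "Spring", 4: "Spring", 5: "Spring",
    6: "Summer", 7: "Summer", 8: "Summer",
    9: "Autumn", 10: "Autumn", 11: "Autumn",
    12: "Winter", 1: "Winter", 2: "Winter",
}


def season_for_month(month: int) -> str:
    """Return the season label for a calendar month number (1-12)."""
    if month not in SEASON_BY_MONTH:
        raise ValueError(f"month must be in 1..12, got {month!r}")
    return SEASON_BY_MONTH[month]


def season_for_date(sampling_date: date) -> str:
    """Return the season label for a sampling date."""
    return season_for_month(sampling_date.month)
```

A sampling date such as date(2021, 9, 15) would be labelled "Autumn" under this rule. Whether the dataset files store dates or month numbers is not stated here, so both entry points are provided.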
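
The high-quality MAG criterion given under bioinformatic processing (at least 90% completeness and less than 5% contamination) can likewise be expressed as a simple filter. The sketch below assumes the completeness and contamination percentages have already been read from a quality report; the MagQuality record, its field names and the function names are hypothetical, and no particular CheckM output format is assumed.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class MagQuality:
    """Quality estimates for one genome bin (hypothetical record layout)."""
    bin_id: str
    completeness: float   # percent, 0-100
    contamination: float  # percent


def is_high_quality(
    mag: MagQuality,
    min_completeness: float = 90.0,
    max_contamination: float = 5.0,
) -> bool:
    """True if completeness >= 90% and contamination < 5% (thresholds from the record)."""
    return (
        mag.completeness >= min_completeness
        and mag.contamination < max_contamination
    )


def filter_high_quality(mags: Iterable[MagQuality]) -> List[MagQuality]:
    """Keep only the bins that meet the high-quality thresholds."""
    return [mag for mag in mags if is_high_quality(mag)]
```

Note the asymmetry in the stated thresholds: completeness is inclusive ("at least 90%"), while contamination is exclusive ("less than 5%"), so a bin at exactly 5% contamination would be excluded.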