Data Collection and Analysis Methods for Datacenter Environmental Impact Study
This research utilizes publicly available environmental reporting data from Meta (formerly Facebook), Google, and Microsoft, spanning the period from 2014 to 2024. The data was extracted from official corporate sustainability reports and environmental disclosures.
Source: Meta Environmental Data Index (2015-2024)
Coverage: Datacenter-specific water and energy consumption metrics
Availability: Data available for years 2015, 2016, 2019-2024
Data Gaps: Years 2017 and 2018 are missing from Meta's public reporting
Data Quality: Direct reporting from official sustainability reports (High confidence)
Source: Google Environmental Reports (2018-2024)
Coverage: Water withdrawal data from 2020 onwards; energy consumption from 2018 onwards
Availability: Comprehensive datacenter energy data from 2020; datacenter-specific water data only from 2024
Notable Change: Google began separating datacenter water usage from office water usage in their 2024 report, representing a significant improvement in reporting transparency
Data Quality: 2024 water data is actual datacenter-only (High confidence); 2020-2023 estimated using 70.7% ratio (Medium-High confidence)
Source: Microsoft Environmental Sustainability Reports (2020-2024)
Coverage: Total operational water and energy consumption (datacenter + offices combined)
Availability: Data available for years 2020-2024 (FY2020-FY2024)
Data Limitation: Microsoft does NOT separate datacenter-specific consumption from office operations. All reported figures represent total operational consumption.
Data Quality: Official sustainability reports with audited figures (High confidence for totals), but datacenter-specific breakdown unavailable
Environmental data was manually extracted from PDF sustainability reports and corporate environmental disclosures. Key metrics collected include:
Reported values were cross-referenced across multiple years to check that reporting methodologies were consistent. Any changes in measurement or reporting standards between years were identified and documented.
A critical challenge in analyzing Google's environmental impact is that the company did not separate datacenter water usage from office water usage in their public reports prior to 2024. To estimate historical datacenter-specific water consumption, this research employs a ratio-based estimation method derived from Google's 2024 reporting.
In 2024, Google reported for the first time separate water withdrawal figures for datacenters and offices:
The datacenter proportion is calculated as: (29.482 / 41.678) × 100 = 70.7%
This 70.7% ratio was applied retrospectively to Google's total operational water consumption figures from 2020-2023 to estimate datacenter-specific water usage:
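The ratio calculation and its retrospective application can be sketched as follows. This is a minimal illustration, not the study's actual tooling: the function names are hypothetical, the 2024 figures are taken from the calculation above in whatever unit the report uses, and the yearly totals in the example are placeholders rather than values from Google's reports.

```python
# 2024 figures quoted in the text above (unit as given in the source report).
DATACENTER_WITHDRAWAL_2024 = 29.482
TOTAL_WITHDRAWAL_2024 = 41.678


def datacenter_ratio(datacenter: float, total: float) -> float:
    """Return the share of total operational water attributed to datacenters."""
    if total <= 0:
        raise ValueError("total must be positive")
    return datacenter / total


def estimate_datacenter_water(totals_by_year: dict, ratio: float) -> dict:
    """Apply one fixed ratio to each year's total operational water figure."""
    return {year: total * ratio for year, total in totals_by_year.items()}


ratio = datacenter_ratio(DATACENTER_WITHDRAWAL_2024, TOTAL_WITHDRAWAL_2024)
print(f"Datacenter share: {ratio:.1%}")

# Placeholder totals for illustration only; NOT figures from any report.
hypothetical_totals = {2020: 100.0, 2021: 200.0}
estimates = estimate_datacenter_water(hypothetical_totals, ratio)
for year, value in sorted(estimates.items()):
    print(year, round(value, 3))
```

Because a single ratio is applied to every year, the estimates inherit any year-to-year variation in the true datacenter share as unmodeled error.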
This estimation approach carries several important caveats:
While this limitation must be acknowledged, several factors support the reasonableness of this estimation:
Meta: Years 2017 and 2018 are absent from public environmental reporting. This gap covers two of the eleven years (approximately 18%) of the 2014-2024 study period and limits trend analysis for Meta during a critical growth phase.
Google: Water withdrawal data only begins in 2020. Years 2018-2019 show energy consumption but no water data, preventing comprehensive analysis of water usage efficiency (WUE) metrics for those years.
Microsoft: Data are available only for fiscal years 2020-2024. No data exist for 2015-2019, which limits historical comparison with Meta. Additionally, Microsoft reports total operational consumption without datacenter-specific breakdowns.
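The coverage gaps described above can be checked mechanically. The sketch below assumes the availability years stated in this section; the helper name `missing_years` and the variable names are hypothetical, and the comparison window is a parameter rather than a fixed property of the study.

```python
def missing_years(reported: set, start: int, end: int) -> list:
    """Return the years in [start, end] that have no reported data."""
    return [year for year in range(start, end + 1) if year not in reported]


# Availability as stated in this section.
meta_years = {2015, 2016, *range(2019, 2025)}
google_energy_years = set(range(2018, 2025))
google_water_years = set(range(2020, 2025))
microsoft_years = set(range(2020, 2025))

# Gaps inside Meta's own reporting span.
meta_gaps = missing_years(meta_years, 2015, 2024)

# Years where Google reports energy but not water, so no WUE can be computed.
google_no_wue = sorted(google_energy_years - google_water_years)

# Years in which Meta reports but Microsoft does not.
microsoft_vs_meta = sorted(meta_years - microsoft_years)

print(meta_gaps, google_no_wue, microsoft_vs_meta)
```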
The most significant limitation is Google's change in water reporting methodology in 2024. Prior to 2024, Google reported combined operational water (datacenters plus offices), while from 2024 onwards, datacenter water is reported separately. This methodological shift necessitates the estimation approach described in Section 3.
Individual datacenter-level data is only available for Google in 2024, with 40+ locations detailed. Historical data and all Meta and Microsoft data are reported only at the corporate aggregate level, limiting geographic and facility-specific analysis.
This study incorporates data of varying quality levels, which must be considered when interpreting results:
The data primarily reflects water withdrawal (water taken from sources) rather than water consumption (water not returned to the source). Google's 2024 report indicates that approximately 75% of withdrawn water is consumed, but this distinction is not available for earlier years or for Meta's and Microsoft's reporting.
Meta, Google, and Microsoft report water usage in different units in their original reports. All values have been standardized to liters for consistency:
Conversion factors:
1 U.S. gallon = 3.78541 liters
1 million cubic meters = 1 billion liters
Example calculations:
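The conversion factors above can be applied as in the following sketch. The function and dictionary names are hypothetical, and the input quantities are round illustrative numbers rather than reported figures.

```python
# Conversion factors stated above, expressed as liters per source unit.
LITERS_PER_UNIT = {
    "liter": 1.0,
    "us_gallon": 3.78541,
    "cubic_meter": 1_000.0,  # so 1 million cubic meters = 1 billion liters
}


def to_liters(value: float, unit: str) -> float:
    """Convert a water quantity in the given unit to liters."""
    try:
        return value * LITERS_PER_UNIT[unit]
    except KeyError:
        raise ValueError(f"unsupported unit: {unit!r}") from None


# Illustrative inputs only.
one_million_gallons = to_liters(1_000_000, "us_gallon")
one_million_cubic_meters = to_liters(1_000_000, "cubic_meter")
print(one_million_gallons, one_million_cubic_meters)
```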
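Energy consumption is reported in megawatt-hours (MWh) or terawatt-hours (TWh). All values are standardized to MWh in the dataset (1 TWh = 1,000,000 MWh), and all energy figures represent annual totals.
Water usage efficiency (WUE) is calculated here as liters of water per megawatt-hour of energy, using the reported totals for each company and year (primarily water withdrawal rather than consumption):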
WUE = Total Water (liters) / Total Energy (MWh)
This metric allows for comparison of water intensity relative to computational workload, providing insight into both datacenter efficiency improvements and the water cost of AI infrastructure growth.
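A minimal sketch of the WUE calculation, including the TWh-to-MWh standardization, is shown below. The function names are hypothetical and the inputs are made-up round numbers chosen only to make the arithmetic easy to follow; they are not reported figures.

```python
MWH_PER_TWH = 1_000_000


def twh_to_mwh(twh: float) -> float:
    """Standardize an energy figure from terawatt-hours to megawatt-hours."""
    return twh * MWH_PER_TWH


def wue_liters_per_mwh(total_water_liters: float, total_energy_mwh: float) -> float:
    """WUE = Total Water (liters) / Total Energy (MWh)."""
    if total_energy_mwh <= 0:
        raise ValueError("total_energy_mwh must be positive")
    return total_water_liters / total_energy_mwh


# Hypothetical inputs: 1 billion liters of water and 2 TWh of energy.
example_energy_mwh = twh_to_mwh(2)
example_wue = wue_liters_per_mwh(1_000_000_000, example_energy_mwh)
print(example_wue)  # 500.0 liters per MWh
```

Note that a value computed this way is only comparable across companies when the water and energy totals share the same scope, for example datacenter-only versus combined datacenter and office operations.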
Each data point in the analysis has been classified according to its reliability:
High Confidence:
- Meta all reported years (2015, 2016, 2019-2024): Direct datacenter-specific reporting
- Google 2024 water: Actual datacenter-separated data
- Google 2020-2024 energy: Direct datacenter reporting
- Microsoft 2021-2023: Official audited sustainability reports (total operational)
Medium-High Confidence:
- Google 2020-2023 water: Estimated using 70.7% ratio with supporting evidence
- Microsoft 2020 & 2024: Calculated/estimated using partial data
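The classification above can be encoded as a simple lookup so that each data point carries its confidence label through the analysis. This is one possible encoding, not the study's own implementation: the function name, the label strings, and the choice to return `None` for unclassified points are all assumptions, and Meta is restricted to the years for which data is available.

```python
from typing import Optional

# Years with Meta data, per the availability noted earlier in this section.
META_REPORTED_YEARS = {2015, 2016, 2019, 2020, 2021, 2022, 2023, 2024}


def confidence(company: str, metric: str, year: int) -> Optional[str]:
    """Return the confidence label for a data point, or None if unclassified."""
    if company == "Meta" and year in META_REPORTED_YEARS:
        return "high"
    if company == "Google":
        if metric == "water" and year == 2024:
            return "high"
        if metric == "water" and 2020 <= year <= 2023:
            return "medium-high"  # estimated with the 70.7% ratio
        if metric == "energy" and 2020 <= year <= 2024:
            return "high"
    if company == "Microsoft":
        if 2021 <= year <= 2023:
            return "high"  # totals only, no datacenter breakdown
        if year in (2020, 2024):
            return "medium-high"
    return None


print(confidence("Google", "water", 2022))
print(confidence("Meta", "energy", 2017))
```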
The analysis focuses on the period 2015-2024, which captures the transition from traditional cloud computing to the AI-intensive infrastructure era, particularly the rapid growth following 2020 with the emergence of large language models and generative AI systems.
Direct comparisons between Meta and Google must account for fundamental differences in business models:
As corporate environmental reporting standards continue to evolve, future analyses would benefit from: