Training GPT-4 Used as Much Energy as 4,900 US Homes Use in a Year. Let’s Talk About That

The electricity bill for training a single AI model now rivals a small town’s annual power draw. We break down the numbers behind GPT-4’s energy appetite, compare it to everyday benchmarks, and examine what the industry’s explosive growth means for global carbon budgets.

The Number That Started the Conversation

Somewhere between late 2022 and early 2023, roughly 25,000 Nvidia A100 GPUs hummed around the clock inside a data center for about 90 to 100 days. When they finally stopped, GPT-4 existed. The process consumed an estimated 51,000 to 62,000 megawatt-hours of electricity, according to analysis published in Towards Data Science.
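A rough bottom-up check shows how an estimate in this range can be assembled. The GPU count and run length come from the paragraph above; the eight-GPU server size, the 6.5 kW per-server draw, and the 1.18 overhead factor (PUE) are assumptions made for this sketch, not disclosed values.

```python
# Bottom-up sketch of GPT-4 training energy.
# GPU count and run length come from the article; everything else is assumed.
GPUS = 25_000
GPUS_PER_SERVER = 8      # assumption: 8-GPU A100 servers
SERVER_POWER_KW = 6.5    # assumption: peak draw of one 8-GPU server
PUE = 1.18               # assumption: data center overhead factor


def training_energy_mwh(days: float) -> float:
    """Facility-level energy in MWh for a training run of the given length."""
    servers = GPUS / GPUS_PER_SERVER
    it_power_kw = servers * SERVER_POWER_KW
    hours = days * 24
    return it_power_kw * hours * PUE / 1000  # kWh -> MWh


low = training_energy_mwh(90)
high = training_energy_mwh(100)
print(f"{low:,.0f} to {high:,.0f} MWh")
```

Under these assumptions the result is about 51,800 to 57,500 MWh, the lower part of the cited range. A higher overhead factor or a longer run would push it toward 62,000 MWh.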

That range is wide, but even the lower bound is staggering. The average American household uses roughly 10,500 kWh per year. Divide 51,000 MWh by 10.5 MWh and you land at about 4,857 homes powered for an entire year. Viewed as power rather than energy, 51,000 MWh spread over a 90-day run is an average draw of roughly 24 megawatts, about what 20,000 households pull from the grid at the same moment.
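The household comparison can be worked two ways: as energy (how many homes the same electricity would run for a year) and as power (how many homes draw the same load at once). The sketch below uses only the figures quoted above; the variable names are illustrative.

```python
# Convert GPT-4's estimated training energy into household equivalents.
TRAINING_MWH = 51_000        # lower bound cited in the article
HOME_KWH_PER_YEAR = 10_500   # average US household, per the article
TRAINING_DAYS = 90
HOURS_PER_YEAR = 8_760

# Energy view: homes that could run for a full year on the same electricity.
home_years = TRAINING_MWH * 1000 / HOME_KWH_PER_YEAR

# Power view: average draw during training vs. the average draw of one home.
avg_training_kw = TRAINING_MWH * 1000 / (TRAINING_DAYS * 24)
avg_home_kw = HOME_KWH_PER_YEAR / HOURS_PER_YEAR
homes_at_once = avg_training_kw / avg_home_kw

print(f"{home_years:,.0f} home-years of electricity")
print(f"{avg_training_kw / 1000:.1f} MW average draw during training")
print(f"{homes_at_once:,.0f} homes drawing power at the same time")
```

The two views differ by a factor of about four because the run lasted roughly a quarter of a year.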

One training run. One model. Before a single user ever typed a prompt.

These numbers are estimates built on leaked details about GPT-4’s architecture and training infrastructure. OpenAI has never published an official energy audit. That silence is itself part of the problem, but more on that later. What matters right now is the scale: training a single frontier language model consumes about as much electricity as a town of several thousand households uses in a year.

Putting the Watts in Context

Numbers in the tens of thousands of megawatt-hours are hard to feel. They belong to utility filings and grid operator dashboards, not dinner-table conversations. So here is a table that translates GPT-4’s training energy into objects and activities most people encounter in ordinary life.

| Comparison | Equivalent |
| --- | --- |
| US households powered for 1 year | ~4,900 homes |
| Electric vehicles charged (60 kWh battery) | ~850,000 full charges |
| Flights from New York to London (per passenger) | ~24,000 round trips |
| Years of Netflix streaming (per user, ~36 kWh/yr) | ~1.4 million user-years |
| Olympic swimming pools of water for cooling | Estimated 4 to 6 pools |
| Compared to GPT-3 training | 40 to 48 times more energy |
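Several of these rows are straight divisions of the training estimate by a per-unit figure. The sketch below reproduces three of them from numbers quoted in this article, using the 51,000 MWh lower bound except for the GPT-3 ratio, which uses both ends of the range. The flight and cooling-water rows depend on factors not given here and are left out.

```python
# Reproduce three table rows from the article's own figures.
TRAINING_KWH = 51_000 * 1000   # lower-bound training estimate, in kWh
GPT3_MWH = 1_287               # GPT-3 training energy cited in the article

ev_charges = TRAINING_KWH / 60           # 60 kWh per full EV charge
netflix_user_years = TRAINING_KWH / 36   # ~36 kWh per streaming user per year
gpt3_ratio_low = 51_000 / GPT3_MWH
gpt3_ratio_high = 62_000 / GPT3_MWH

print(f"EV charges: {ev_charges:,.0f}")
print(f"Netflix user-years: {netflix_user_years:,.0f}")
print(f"Multiple of GPT-3: {gpt3_ratio_low:.0f}x to {gpt3_ratio_high:.0f}x")
```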

The leap from GPT-3 to GPT-4 is perhaps the most revealing row. GPT-3 was already considered energy-intensive when it debuted in 2020, consuming roughly 1,287 MWh and emitting about 500 metric tons of CO2, the equivalent of driving from New York to San Francisco roughly 438 times. GPT-4’s energy requirement grew by a factor of 40 or more, while its parameter count is estimated to be only about 10 times larger.

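Bigger models do not scale linearly in energy cost. Training compute grows roughly with the product of parameter count and the number of training tokens, and larger models are typically trained on more data. Double the parameters and double the data to match, and the compute roughly quadruples. At a given hardware efficiency, compute maps directly to electricity.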
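One way to see where a quadrupling can come from is a widely used rule of thumb that approximates training compute as about 6 floating-point operations per parameter per training token. That rule and the model sizes below are assumptions for illustration, not figures from OpenAI.

```python
def training_flops(params: float, tokens: float) -> float:
    """Rule-of-thumb training compute: about 6 FLOPs per parameter per token."""
    return 6 * params * tokens


# Hypothetical baseline model: 100 billion parameters, 2 trillion tokens.
base = training_flops(params=1e11, tokens=2e12)

# Doubling parameters alone vs. doubling parameters and training data together.
params_only = training_flops(params=2e11, tokens=2e12)
params_and_data = training_flops(params=2e11, tokens=4e12)

print(f"Parameters doubled:          {params_only / base:.1f}x compute")
print(f"Parameters and data doubled: {params_and_data / base:.1f}x compute")
```

Under this approximation, doubling parameters alone doubles compute; the fourfold jump appears only when the training data grows in step with the model.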

It Doesn’t Stop at Training

Training is a one-time event. Inference, the process of actually answering your questions, runs every second of every day. Epoch AI estimates that ChatGPT’s operational electricity demand reached roughly 391,000 to 463,000 MWh per year as of 2024, driven by an estimated 700 million daily queries routed through GPT-4o alone. That figure does not include queries handled by GPT-3.5, GPT-4 Turbo, or the newer GPT-4.5 and reasoning models.

To put that annual inference figure in perspective, it is six to nine times larger than the one-time training cost. At that rate, ChatGPT burns through the equivalent of GPT-4’s 51,000 MWh training run roughly every six to seven weeks just answering questions.
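The comparison between inference and training follows directly from the two ranges quoted above. This sketch pairs the extremes to bound the ratio and works out how long inference takes to match the lower-bound training figure; the function name is illustrative.

```python
# Compare annual inference energy with the one-time training estimate.
TRAINING_MWH = (51_000, 62_000)               # training range cited in the article
INFERENCE_MWH_PER_YEAR = (391_000, 463_000)   # annual inference range cited

# Smallest ratio pairs low inference with high training, and vice versa.
ratio_min = INFERENCE_MWH_PER_YEAR[0] / TRAINING_MWH[1]
ratio_max = INFERENCE_MWH_PER_YEAR[1] / TRAINING_MWH[0]


def weeks_to_match(training_mwh: float, inference_mwh_per_year: float) -> float:
    """Weeks of inference needed to use as much energy as one training run."""
    return training_mwh / inference_mwh_per_year * 365 / 7


weeks_slow = weeks_to_match(TRAINING_MWH[0], INFERENCE_MWH_PER_YEAR[0])
weeks_fast = weeks_to_match(TRAINING_MWH[0], INFERENCE_MWH_PER_YEAR[1])

print(f"Inference is {ratio_min:.1f}x to {ratio_max:.1f}x training per year")
print(f"51,000 MWh is matched every {weeks_fast:.1f} to {weeks_slow:.1f} weeks")
```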

And the next generation is hungrier still. Early benchmarks suggest that a single 1,000-token response from GPT-5 consumes around 18 Wh on average, roughly 8.5 times the 2.12 Wh GPT-4 required for the same task. Even if query volume stayed flat, moving traffic onto the newer model would multiply inference energy several times over. Query volume, of course, is not staying flat.

This creates a compounding problem. Each new model generation demands more power to train and more power to serve. The industry is scaling both axes simultaneously. Meanwhile, users are shifting from short Q&A prompts to extended conversations, document analysis, and agentic workflows that require dozens of sequential model calls per task. The energy cost per user session is rising even faster than the per-query average suggests.

The Corporate Carbon Reckoning

For years, the largest technology companies anchored their public image partly on ambitious climate pledges. Google declared carbon neutrality in 2007. Microsoft promised to be carbon-negative by 2030. Then the AI race accelerated, and the math stopped working.

Google’s greenhouse gas emissions grew nearly 50 percent over five years, climbing from 5.8 million metric tons of CO2 in 2020 to over 11.2 million in 2024. The company quietly stopped claiming operational carbon neutrality in 2023. Microsoft’s location-based Scope 2 emissions more than doubled over the same period, from 4.3 million metric tons to nearly 10 million, pushing total emissions 23.4 percent above its 2020 baseline, according to NPR’s reporting on the companies’ sustainability filings.

The International Energy Agency projects that global data center electricity demand will more than double by 2030, reaching around 945 terawatt-hours. Goldman Sachs Research forecasts that roughly 60 percent of that increased demand will be met by burning fossil fuels, adding an estimated 220 million tons of carbon emissions globally. In 2025 alone, global investment in AI-focused data center infrastructure reached an estimated $580 billion.

Ireland already devotes 21 percent of its national electricity supply to data centers, a figure that could reach 32 percent by 2026. In the United States, data centers consumed more than 4 percent of total electricity in 2024, approximately 183 terawatt-hours, and projections suggest this could reach 7 to 12 percent by 2028. The U.S. Energy Information Administration expects data center electricity use to grow by 133 percent by 2030.

These are not hypothetical future costs. They are happening now, visible in every annual sustainability report that Big Tech would rather you skim than read.

Key figures: 51,000 to 62,000 MWh of electricity estimated for training GPT-4 · ~4,900 US homes powered for one year · 40 to 48 times more than GPT-3 · 25,000 A100 GPUs used

What Efficiency Gains Actually Look Like

The picture is not entirely bleak. Hardware efficiency is improving. Google reported a 33-fold reduction in energy per unit of AI inference over a single year, alongside a 44-fold reduction in associated carbon emissions, largely through custom TPU chips and more efficient model architectures. Techniques like quantization, distillation, and mixture-of-experts routing mean that newer models can deliver comparable quality with far fewer active parameters per query.

Google’s own measurements show that a median text prompt on Gemini uses just 0.24 watt-hours of energy, emits 0.03 grams of CO2 equivalent, and consumes 0.26 milliliters of water. At the per-query level, these are genuinely small numbers. The problem is volume. Multiply 0.24 Wh by hundreds of millions of daily queries across multiple providers, and the aggregate is enormous.

This is Jevons’ paradox in real time: making AI cheaper to run makes it attractive for more use cases, which drives total consumption upward even as per-unit consumption falls. When every smartphone, search engine, email client, and enterprise workflow begins calling an LLM on every interaction, aggregate demand can overwhelm any per-query savings.
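A toy calculation makes the paradox concrete. The 0.24 Wh per-query figure is Google’s published number from above; the daily query volumes and the halving of per-query energy are hypothetical values chosen only to show the effect.

```python
def annual_mwh(wh_per_query: float, queries_per_day: float) -> float:
    """Yearly energy in MWh for a given per-query cost and daily volume."""
    return wh_per_query * queries_per_day * 365 / 1e6  # Wh -> MWh


# 0.24 Wh is Google's reported median; 500 million queries/day is hypothetical.
before = annual_mwh(wh_per_query=0.24, queries_per_day=500e6)

# Hypothetical future: per-query energy halves while volume triples.
after = annual_mwh(wh_per_query=0.12, queries_per_day=1_500e6)

print(f"Before: {before:,.0f} MWh/yr")
print(f"After:  {after:,.0f} MWh/yr ({after / before:.1f}x)")
```

In this scenario each query costs half as much, yet total consumption rises by half, because volume grew faster than efficiency improved.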

The only way to break the paradox is to pair efficiency with absolute caps or with a genuinely clean energy supply. Some companies are pursuing the latter. Microsoft signed a power purchase agreement with Constellation Energy to restart the undamaged Unit 1 reactor at Three Mile Island. Amazon invested in small modular reactors through partnerships with X-energy and Talen Energy. Google contracted for geothermal capacity from Fervo Energy.

Whether these investments will scale fast enough to match AI’s growth trajectory is, at this point, an open question without a comfortable answer. Nuclear projects take years to permit and build. Geothermal is geographically constrained. Solar and wind are intermittent and require massive battery storage to provide the 24/7 baseload that data centers demand.

The Transparency Problem

One of the most frustrating aspects of this debate is how little we actually know. OpenAI has never published the energy consumption of GPT-4’s training. The figures cited throughout this article come from independent researchers working with leaked architectural details. Anthropic, Google DeepMind, and Meta have been similarly opaque about the full energy costs of their frontier models.

Public reporting is largely voluntary. No government requires AI companies to publish the carbon footprint of training runs or the cumulative energy cost of inference at scale. The European Union’s AI Act focuses on safety and bias; it asks providers of general-purpose models to document their known or estimated energy consumption, but that documentation goes to regulators rather than the public. The United States has no federal reporting mandate for AI energy use.

This opacity matters because it prevents informed public debate. Citizens cannot weigh the benefits of AI against its environmental costs if the costs are hidden behind corporate non-disclosure. Researchers cannot develop accurate climate models for the technology sector without reliable data. And investors cannot properly assess climate risk in their AI portfolios without standardized reporting.

Until mandatory disclosure becomes the norm, every number in every article about AI energy consumption, including this one, carries an asterisk. We are estimating in the dark, and the estimates keep getting larger.

Frequently Asked Questions

How much CO2 did training GPT-4 produce?

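Estimates vary with the carbon intensity of the grid that powered the run. Published analyses place GPT-4’s training emissions at roughly 5,000 to 8,000 metric tons of CO2, comparable to the annual emissions of about 1,100 to 1,700 passenger cars at roughly 4.6 metric tons per car. That range implies a fairly clean supply of around 0.1 metric tons of CO2 per MWh. At the intensity implied by the GPT-3 figures cited earlier (about 500 tons from 1,287 MWh, or roughly 0.39 tons per MWh), 51,000 to 62,000 MWh would emit closer to 20,000 to 24,000 metric tons. The actual figure depends heavily on whether the data center sourced renewable energy; in a region powered primarily by coal, it could be higher still.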
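Emissions are energy multiplied by the grid’s carbon intensity, so the answer swings widely with the assumed supply. The sketch below uses two illustrative intensities: one implied by this article’s GPT-3 figures (500 tons from 1,287 MWh) and a hypothetical low-carbon value of 0.1 tons per MWh. Neither is a measured figure for GPT-4’s data center.

```python
def emissions_tonnes(energy_mwh: float, tonnes_co2_per_mwh: float) -> float:
    """CO2 emissions in metric tons for a given energy use and grid intensity."""
    return energy_mwh * tonnes_co2_per_mwh


TRAINING_MWH = (51_000, 62_000)        # training range cited in the article
GPT3_IMPLIED_INTENSITY = 500 / 1_287   # tons per MWh, from the GPT-3 figures
LOW_CARBON_INTENSITY = 0.1             # hypothetical cleaner supply

scenarios = [
    ("GPT-3-implied grid", GPT3_IMPLIED_INTENSITY),
    ("Low-carbon supply", LOW_CARBON_INTENSITY),
]
for label, intensity in scenarios:
    low, high = (emissions_tonnes(e, intensity) for e in TRAINING_MWH)
    print(f"{label}: {low:,.0f} to {high:,.0f} t CO2")
```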

Is inference or training the bigger energy concern?

Training is a one-time spike, but inference accumulates continuously. For widely used models like GPT-4o, annual inference energy (391,000+ MWh) already dwarfs the original training cost by a factor of six to nine. As user bases grow and queries get longer and more complex, inference becomes the dominant long-term energy driver by an ever-widening margin.

Can renewable energy fully offset AI’s power demand?

In principle, yes. In practice, current renewable deployment is not keeping pace with AI data center construction. Goldman Sachs estimates 60 percent of new data center demand will be met by fossil fuels through the end of the decade. Corporate power purchase agreements for wind, solar, and nuclear are growing, but matching 24/7 demand with intermittent renewables remains an unsolved engineering and economic challenge that requires massive battery storage infrastructure.
