Artificial Intelligence Index Report 2026
Contents
Introduction
Top Takeaways
1 Research and Development
2 Technical Performance
3 Responsible AI
4 Economy
5 Science
6 Medicine
7 Education
8 Policy and Governance
9 Public Opinion
Appendix
AI INDEX REPORT 2026
The AI Index is an independent initiative at the Stanford Institute for Human-
Centered Artificial Intelligence (HAI).
The AI Index was conceived within the One Hundred Year Study on Artificial
Intelligence (AI100).
Welcome to the ninth edition of the AI Index report. As AI continues to advance rapidly, the question
becomes whether the systems built around it can keep up. Governance frameworks, evaluation methods,
education systems, and the data infrastructure needed to track AI’s impact are struggling to match the pace
of the technology itself. That gap—between what AI can do and how prepared we are to manage it—runs
through every chapter of this year’s report. New in this edition, the report tracks how AI is being tested
more ambitiously across reasoning, safety, and real-world task execution, and why those measurements are
increasingly difficult to rely on. It also features new estimates of generative AI’s economic value alongside
emerging evidence of its labor market effects, an analytical framework on AI sovereignty, and a science
chapter developed in collaboration with Schmidt Sciences. For the first time, the report features standalone
chapters on AI in science and AI in medicine, reflecting AI’s growing impact across these two domains.
For close to a decade, the AI Index has worked to bring reliable global data to a field that is evolving faster
than most efforts to measure it. The report equips policymakers, researchers, executives, journalists, and the
public with the necessary evidence to make informed decisions about AI. As the technology moves deeper
into classrooms, clinics, and legislatures—and reshapes how people work, learn, and govern—the cost of
incomplete data continues to rise.
In a field where much data is produced by organizations with a stake in the technology’s success, the demand
for neutral and rigorous measurement continues to grow. The AI Index remains independent and focused on
revealing the long-term patterns underneath the headlines. The report is relied on by governments, research
institutions, and companies around the world, and referenced by media outlets and in academic papers.
The pages that follow offer the most comprehensive, independently sourced picture of AI’s trajectory that
is available. They also make clear where that picture remains incomplete—because what we cannot yet
measure matters just as much as what we can.
Introduction
Message from the Co-chairs
A year ago, this report documented AI’s arrival as a mainstream force. This year’s data shows what
happens after arrival.
This is a technology that has reached mass adoption faster than the personal computer or the internet.
Generative AI hit nearly 53% population-level adoption within three years. Leading AI companies are
reaching meaningful revenue scale in a fraction of the time it took previous technology generations,
and global corporate investment more than doubled in 2025. Organizational adoption rose to 88%, and
early estimates suggest the consumer value of generative AI has grown substantially within a year.
At the technical frontier, leading models are now
nearly indistinguishable from one another.
Open-weight models are more competitive than
ever. But as models converge, the tools used to
evaluate them are struggling to stay relevant.
Benchmarks are saturating, frontier labs are
disclosing less, and independent testing does not
always confirm what developers report.
The chapters that follow trace what this scale
of activity and capability means in practice. In
science, AI shifted from accelerating individual
research steps to attempting full replacement
of entire workflows. In medicine, clinical AI
tools moved from pilot programs to broader
deployment, with systems like ambient AI scribes
scaling across health systems.
Governments around the world acted on AI in 2025, but not in the same direction. The EU AI Act’s first
prohibitions took effect, while the United States shifted toward deregulation. Japan, South Korea, and
Italy each passed national AI laws, and more than half of newly adopted national AI strategies came
from developing countries entering the policy landscape for the first time. AI sovereignty emerged as a
central organizing principle across all of these efforts. The public is also navigating competing signals.
Global optimism about AI rose in 2025, but so did nervousness.
The data does not point in a single direction. It reveals a field that is scaling faster than the systems
around it can adapt. We encourage you to explore and decide for yourself.
Yolanda Gil and Raymond Perrault
Co-chairs, AI Index Report
Steering Committee
Chair and Co-chair
Raymond Perrault (SRI International)
Yolanda Gil (University of Southern California, Information Sciences Institute)
Members
Jack Clark (Anthropic, OECD)
Virginia Dignum (Umeå University)
Russ Altman (Stanford University)
Carla Brodley (Northeastern University)
Erik Brynjolfsson (Stanford University)
Terah Lyons (JPMorgan Chase & Co.)
Vipin Kumar (University of Minnesota)
James Landay (Stanford University)
James Manyika (Google, University of Oxford)
Juan Carlos Niebles (Stanford University, Salesforce)
Vanessa Parli (Stanford University)
Yoav Shoham (Stanford University, AI21 Labs)
Elham Tabassi (Brookings)
Russell Wald (Stanford University)
Dan Weld (University of Washington)
Toby Walsh (UNSW Sydney)
How to Cite This Report
Sha Sajadieh, Loredana Fattorini, Raymond Perrault, Yolanda Gil, Vanessa Parli, Lapo Santarlasci, Juan Pava,
Nestor Maslej, Russ Altman, Erik Brynjolfsson, Carla Brodley, Jack Clark, Virginia Dignum, Vipin Kumar,
James Landay, Terah Lyons, James Manyika, Juan Carlos Niebles, Yoav Shoham, Elham Tabassi, Russell
Wald, Toby Walsh, Dan Weld. “The AI Index 2026 Annual Report,” AI Index Steering Committee, Institute for
Human-Centered AI, Stanford University, Stanford, CA, April 2026.
The AI Index 2026 Annual Report by Stanford University is licensed under
Attribution-NoDerivatives International.
Public Data and Tools
The AI Index 2026 Report is supplemented by raw data and an interactive tool. We invite each reader to use
the data and the tool in a way most relevant to their work and interests.
• Raw data and charts: The public data and high-resolution images of all the charts in the report are
available on Google Drive.
• Global AI Vibrancy Tool: Compare the AI ecosystems of 36 countries. The Global AI Vibrancy tool will be
updated by the end of 2026.
Staff and Researchers
Lead and Editor-in-Chief: Sha Sajadieh (Stanford University)
Research Manager: Loredana Fattorini (Stanford University)
Affiliated Researchers: Nestor Maslej (Stanford University), Juan Nicolas Pava (Stanford University),
Lapo Santarlasci (IMT School for Advanced Studies Lucca)
Graduate Researcher: Sukrut Oak (Stanford University)
Undergraduate Researcher: Henry Zhang (Stanford University)
Supporting Partners
Analytics and
Research Partners
The AI Index welcomes feedback and new ideas for next year. Contact us at shahed@.
The AI Index is written by a team of human researchers. The authors used ChatGPT and Claude to help refine
and copy edit drafts. All images in this publication were generated with AI by Johanna Friedman (2026),
Gemini (W-Nanobanana2), Gemini 3 (W-Nanobanana Pro).
Contributors
The AI Index would like to acknowledge the following individuals by chapter and section for their
contributions of data, analysis, advice, and expert commentary included in the AI Index Report 2026:
Introduction
Loredana Fattorini, Yolanda Gil, Vanessa Parli, Ray Perrault, Sha Sajadieh
Chapter 1: Research and Development
Usman Anwar, Theo Burn, Emily Chen, Rachel Cook, Jean-Stanislas Denain, Meredith Ellison, Loredana
Fattorini, Nicole Finn, Isabella Florez, Yolanda Gil, Tom Hurd, Nabeel Khan, James Landay, Shayne Longpre,
Nestor Maslej, Magdalena Ortiz, Khalifa Oyebanji, Orestis Papakyriakopoulos, Vanessa Parli, Ray Perrault,
Tom Piercey, Jennifer Rachford, Thomas Richadson, Vesna Sabljakovic-Fritz, Sha Sajadieh, Lapo Santarlasci,
Sebastian Sardina, Andrew Shi, Yoav Shoham, Seth Polsley, Daniel Weld, Kevin Xu, Meg Young
Chapter 2: Technical Performance
Erik Brynjolfsson, Loredana Fattorini, Yolanda Gil, Tasha Kim, Sanmi Koyejo, Nestor Maslej, Juan Carlos
Niebles, Sukrut Oak, Vanessa Parli, Ray Perrault, Sha Sajadieh, Yoav Shoham, Toby Walsh, Daniel Weld,
Henry Zhang
Chapter 3: Responsible AI
Gabriel Morgan Asaftei, Rishi Bommasani, Virginia Dignum, Loredana Fattorini, Yolanda Gil, Nestor Maslej,
Katherine Ottenbreit, Vanessa Parli, Juan Nicolas Pava, Ray Perrault, Brittany Presten, Cécile Prinsen,
Roger Roberts, Sha Sajadieh, Lapo Santarlasci, Abby Sticha, Elham Tabassi, Yuanhao Zou
Chapter 4: Economy
Tara Balakrishnan, Bharat Chandar, Erik Brynjolfsson, Ruyu Chen, Michael Chui, Heather English, Loredana
Fattorini, Yolanda Gil, Bryce Hall, Heather Hanselman, Rosie Hood, Akash Kaura, Elena Magrini, Nestor
Maslej, James Manyika, Rebecca Milde, David Nguyen, Katherine Ottenbreit, Vanessa Parli, Ray Perrault,
Courtney Prabhakar, Brittany Presten, Roger Roberts, Sha Sajadieh, Lapo Santarlasci, Alex Singla, Alex
Sukharevsky, Casey Weston, Henry Zhang
Chapter 5: Science
Michael Clear, Steven Dillmann, Loredana Fattorini, Yolanda Gil, James Manyika, Vipin Kumar, Minjoon Kouh,
Suhas Mahesh, Vanessa Parli, Ray Perrault, Sha Sajadieh
Chapter 6: Medicine
Russ Altman, Peter Brodeur, Akshay Chaudhari, Jonathan Chen, Matthew DeVerna, Abdoul Jalil Djiberou
Mahamadou, Loredana Fattorini, Ethan Goh, Yolanda Gil, Jeff Hancock, Tina Hernandez-Boussard, Yeon Mi
Hwang, Arman Koul, Rohan Koodli, Alejandro Lozano, Danielle Luz, David Magnus, Stephen P. Ma, Bethel
Mieso, Fateme Nateghi Haredasht, Natalie Pageler, Ayush Pandit, Vanessa Parli, Ray Perrault, Sean Riordan,
Ronald Robertson, Austin Schoeffler, Sha Sajadieh, Kotoha Togami, Dennis Wall, David Wu
Chapter 7: Education
Carla Brodley, Joshua Childs, Lisa Cruz Novohatski, Loredana Fattorini, Yolanda Gil, Rachel Goins, Laura
Hinton, Sonia Koshy, James Landay, Kirsten Lundgren, Jacqueline McCune, Vanessa Parli, Ray Perrault, Sha
Sajadieh, Bryan Twarek, Rebecca Zarch
Chapter 8: Policy and Governance
Virginia Dignum, Loredana Fattorini, Johannes Fritz, Yolanda Gil, Nestor Maslej, Vanessa Parli, Juan
Nicolas Pava, Ray Perrault, Sha Sajadieh, Lapo Santarlasci, Kamran Sattary, Tyler Lenox Smith, Elham
Tabassi, Russell Wald
Chapter 9: Public Opinion
Erik Brynjolfsson, Matt Carmichael, Zack Devlin-Foltz, Loredana Fattorini, Nadja Flechner, Yolanda Gil,
Connacher Murphy, Vanessa Parli, Juan Nicolas Pava, Ray Perrault, Matt Reynolds, Sha Sajadieh, Russell
Wald, Henry Zhang
The AI Index thanks the following organizations and individuals who provided data for inclusion in this
year’s report:
• Epoch AI: Jean-Stanislas Denain
• GitHub: Kevin Xu
• Lightcast: Elena Magrini, Rebecca Milde
• LinkedIn: Rosie Hood, Akash Kaura, Casey Weston
• Quid: Heather English
• Zeki: Tom Hurd
• McKinsey & Company: Heather Hanselman, Katherine Ottenbreit, Brittany Presten, Cécile Prinsen, Roger
Roberts, Abby Sticha
The AI Index also thanks Jeanina Matias, Nancy King, Carolyn Lehman, Shana Lynch, Jonathan Mindes, and
Johanna Friedman for their help in preparing this report; Christopher Ellis and Madeleine Wright for their
help in maintaining the AI Index website; and Annie Benisch, Marc Gough, Caroline Meinhardt, Drew Spence,
Casey Weston, and Daniel Zhang for their work in helping promote the report.
Top Takeaways
1 AI capability is not plateauing. It is accelerating and reaching more people than
ever. Industry produced over 90% of notable frontier models in 2025, and several of those models
now meet or exceed human baselines on PhD-level science questions, multimodal reasoning, and
competition mathematics. On a key coding benchmark, SWE-bench Verified, performance rose
from 60% to nearly 100% of the human baseline in a single year. Organizational adoption
reached 88%, and 4 in 5 university students now use generative AI.
2 The U.S.-China AI model performance gap has effectively closed. U.S. and Chinese
models have traded the lead multiple times since early 2025. In February 2025, DeepSeek-R1
briefly matched the top U.S. model, and as of March 2026 Anthropic’s top model leads by just
%. The U.S. still produces more top-tier AI models and higher-impact patents, while China leads
in publication volume, citations, patent output, and industrial robot installations. South Korea
stands out for its innovation density, leading the world in AI patents per capita.
3 The United States hosts the most AI data centers, with the majority of their chips
fabricated by one Taiwanese foundry. The United States hosts 5,427 data centers, more
than 10 times as many as any other country, and it consumes more energy than any other country.
A single company, TSMC, fabricates almost every leading AI chip, making the global AI hardware
supply chain dependent on one foundry in Taiwan, though a U.S. expansion began operations in
2025.
4 AI models can win a gold medal at the International Mathematical Olympiad
but cannot reliably tell time, an example of what researchers call the jagged
frontier of AI. Gemini Deep Think earned a gold medal at the IMO, yet the top model reads analog
clocks correctly just % of the time. AI agents made a leap from 12% to ~66% task success on
OSWorld, which tests agents on real computer tasks across operating systems, though they still
fail roughly 1 in 3 attempts on structured benchmarks.
5 Robots still fail at most household tasks, even as they excel in controlled
environments. Robots succeed in only 12% of household tasks, highlighting how far AI is from
mastering the physical world. On RLBench, robotic manipulation in software-based simulations
has reached % success, but the gap between predictable lab settings and unpredictable
household environments is wide.
6 Responsible AI is not keeping pace with AI capability, with safety benchmarks
lagging and incidents rising sharply. Almost all leading frontier AI model developers
report results on capability benchmarks, but reporting on responsible AI benchmarks remains
spotty. Documented AI incidents rose to 362, up from 233 in 2024. Adding to the challenge, recent
research found that improving one responsible AI dimension, such as safety, can degrade another,
such as accuracy.
7 The United States leads in AI investment, but its ability to attract global talent
is declining. U.S. private AI investment reached $ billion in 2025, more than 23 times
the $ billion invested in China, though private investment figures alone likely
understate China’s total AI spending, given its government guidance funds. The U.S. also led in
entrepreneurial activity with 1,953 newly funded AI companies in 2025, more than 10 times the
next closest country. However, the number of AI researchers and developers moving to the U.S.
has dropped 89% since 2017, with an 80% decline in the last year alone.
8 AI adoption is spreading at historic speed, and consumers are deriving
substantial value from tools they often access for free. Generative AI reached 53%
population adoption within three years, faster than the PC or the internet, though the pace
varies by country and correlates strongly with GDP per capita. Some countries show
higher-than-expected adoption, such as Singapore (61%) and the United Arab Emirates (54%),
while the U.S. ranks 24th at
%. The estimated value of generative AI tools to U.S. consumers reached $172 billion annually
by early 2026, with the median value per user tripling between 2025 and 2026.
9 Productivity gains from AI are appearing in many of the same fields where entry-
level employment is starting to decline. Studies show productivity gains of 14% to 26% in
customer support and software development, with weaker or negative effects in tasks requiring
more judgment. AI agent deployment remains in single digits across nearly all business functions.
In software development, where AI’s measured productivity gains are clearest, U.S. developers
ages 22 to 25 saw employment fall nearly 20% from 2024, even as the headcount for older
developers continues to grow.
10 AI’s environmental footprint is expanding alongside its capabilities. Grok 4’s
estimated training emissions reached 72,816 tons of CO2 equivalent. AI data center power
capacity rose to GW, comparable to New York state at peak demand, and annual GPT-4o
inference water use alone may exceed the drinking water needs of 12 million people.
11 AI models for science can outperform human scientists, though bigger models
do not always perform better. Frontier models outperform human chemists on average
on ChemBench, yet they score below 20% on replication in astrophysics and 33% on Earth
observation questions. A 111-million-parameter protein language model, MSAPairformer, beat
previous leading methods on ProteinGym, and a 200-million-parameter genomics model, GPN-
Star, outperformed a model nearly 200 times larger. Most AI foundation models for science come
from cross-sector collaborations, in contrast with the industry-dominated landscape of general-
purpose AI.
12 AI is transforming clinical care, but rigorous evidence remains limited. AI tools
that automatically generate clinical notes from patient visits saw substantial adoption in 2025.
Across multiple hospital systems, physicians reported up to 83% less time spent writing notes and
significant reductions in burnout. Beyond certain tools, however, the evidence base for clinical AI
remains thin. A review of more than 500 clinical AI studies found that nearly half relied on exam-
style questions rather than real patient data, with only 5% using real clinical data.
13 Formal education is lagging behind AI, but people are learning AI skills at every
stage of life. Over 80% of U.S. high school and college students now use AI for school-related
tasks, but only half of middle and high schools have AI policies in place, and just 6% of teachers
say those policies are clear. Outside the classroom, AI engineering skills are accelerating fastest
in the United Arab Emirates, Chile, and South Africa. The number of new AI PhDs in the U.S.
and Canada increased 22% from 2022 to 2024; the PhDs that make up that increase took jobs in
academia, not in industry.
14 AI sovereignty is becoming a defining feature of national policy, but capabilities
remain uneven, even as open-source development helps to redistribute who
participates. National AI strategies are expanding, particularly among developing economies,
and state-backed investments in AI supercomputing are rising in parallel, a sign of growing
ambitions for domestic control over AI ecosystems. Yet model production remains concentrated
in the U.S. and China. Open-source development is starting to redistribute participation, with
contributions from the rest of the world now outpacing Europe and approaching the United States
on GitHub, fueling more linguistically diverse models and benchmarks.
15 AI experts and the public have very different perspectives on the technology’s
future, and global trust in institutions to manage AI is fragmented. When it comes
to how people do their jobs, 73% of experts expect a positive impact, compared with just 23%
of the public, a 50-point gap. Similar divides appear for AI’s impact on the economy and medical
care. Globally, trust in governments to regulate AI varies. Among surveyed countries, the United
States reported the lowest level of trust in its own government to regulate AI, at 31%. Globally, the
EU is trusted more than the United States or China to regulate AI effectively.
1 Research and Development
Overview
The resources powering AI development continued to grow
in 2025, but fewer notable models were released than the
year before, and the systems at the frontier are increasingly
concentrated among a small number of organizations.
Industry now accounts for over 90% of notable AI models,
and the most capable systems are also the least transparent,
with training code, dataset sizes, and parameter counts
increasingly withheld. The computing power behind these
models has grown roughly times per year since 2022, yet
almost all of it flows through a single chip foundry in Taiwan,
making the global hardware supply chain fragile. Open-
source development and AI publications continued to grow,
and the research landscape is becoming more geographically
distributed. China now leads in publication volume, citation
share, and patent grants, while smaller countries like
Switzerland and Singapore lead in AI researchers per capita.
Yet some dimensions of the field have not changed at all.
Gender gaps in AI talent remain deeply entrenched, with no
meaningful progress in any country since 2010. This chapter
covers the research and development pipeline, from the
landscape of AI models through the compute, data centers,
energy, and open-source software that support them, to the
broader research ecosystem of publications, patents, and talent.
Contents
Chapter Highlights
Notable AI Models
By National Affiliation
By Sector and Organization
Model Release
Parameter and Compute Trends
Highlight: Will Models Run Out of Data?
Compute and Infrastructure
Performance and Efficiency
Hardware for Notable Models
Global Computing Capacity
Data Center Power Capacity
Data Centers
AI Infrastructure Beyond GPUs
Geographic Distribution
Energy and Environmental Impact
Training
Inference
Data Center Usage
Open-Source AI Software
AI Development Activity Overview
Projects
Stars
Model and Dataset Ecosystem
Publications
Total Number of AI Publications
By Venue
Conference Attendance
By National Affiliation
By Sector
By Topic
Top 100 Publications
By National Affiliation
By Sector and Organization
Patents
Global Trends
Forward Citations Flow
Speed of Knowledge Diffusion
Technological Proximity
Highlight: AI Patent Examples
AI Authors and Inventors
Geographic Distribution
By Education Level
By Gender
By Specialization
Mobility
Chapter Highlights
1 Industry produced over 90% of notable AI models in 2025, but the most capable models are
now the least transparent. Training code, parameter counts, dataset sizes, and training duration
are no longer disclosed for several of the most resource-intensive systems, including those from
OpenAI, Anthropic, and Google.
2 China leads in research, while the U.S. leads in notable model development. China leads in
publication volume, citations, and patent grants, while the U.S. retains higher-impact patents and
produced 50 notable models in 2025 to China’s 30. South Korea leads in AI patents per capita, and
China’s share of the top 100 most-cited AI papers grew from 33 in 2021 to 41 in 2024.
3 Reported parameters held in the trillions as disclosure dropped. Parameter counts have stayed
near 1 trillion for three years, though reporting from frontier labs has stopped. Training compute,
which can be estimated independently, has continued to rise.
4 Synthetic data is still not replacing real data in pre-training, but data quality and post-training
techniques are showing promise. OLMo Think 32B, with nearly 90 times fewer parameters
than Grok 4, achieves comparable results on several benchmarks through pruning, deduplication,
and curation alone.
5 Global AI compute capacity grew per year since 2022, reaching million H100-equivalents.
Nvidia accounts for over 60% of total compute, with Google and Amazon supplying much of the
remainder and Huawei holding a small but growing share. The buildout is being driven by hyperscaler
data center expansion and sustained demand for frontier model training and inference.
6 The United States leads in AI data centers, and one Taiwanese foundry fabricates the majority
of chips inside them. The United States hosts 5,427 data centers, more than ten times as many as
any other country, consuming more energy than any other region. A single company, TSMC, fabricates
almost every leading AI chip and makes the global AI hardware supply chain dependent on one
foundry in Taiwan, though a U.S. expansion began operating in 2025.
7 AI’s environmental footprint increases across power, water, and emissions. In 2025, Grok 4’s
estimated training emissions reached 72,816 tons of CO₂ equivalent. AI data center power capacity
rose to GW, comparable to New York state at peak demand, and annual GPT-4o inference
water use alone may exceed the drinking water needs of 12 million people.
8 Open-source AI development continues to scale, with million projects on GitHub and
Hugging Face uploads tripling since 2023. U.S.-based projects still attract the most engagement,
with 30 million cumulative GitHub stars across projects that have crossed the 10-star threshold.
9 The number of AI researchers and developers moving to the United States has dropped 89% since
2017. The decline is accelerating, down 80% in the last year alone. The U.S. is still home to more AI
talent than any other country, but it is attracting new talent at the lowest rate in over a decade.
10 The AI talent map is shifting, but gender gaps remain deeply entrenched. Switzerland and
Singapore lead the world in AI researchers and developers per capita, and some countries show
relatively higher female representation, including Saudi Arabia (%), Canada (%), and
Australia (%), though no country approaches gender parity.
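The two talent-migration figures in the highlights (an 89% drop since 2017 and an 80% drop in the last year alone) can be combined with a little arithmetic. The sketch below is illustrative only: it assumes both percentages describe the same inflow series and that the 80% decline is the final one-year step of the cumulative 89% decline. The index value of 100 for 2017 is an arbitrary normalization, not a figure from the report.

```python
# Illustrative arithmetic only; assumes both percentages refer to the same series.
baseline_2017 = 100.0          # arbitrary index value for the 2017 inflow (assumption)
cumulative_decline = 0.89      # "dropped 89% since 2017"
last_year_decline = 0.80       # "down 80% in the last year alone"

# Level after the full decline, on the 2017 = 100 index.
latest_level = baseline_2017 * (1 - cumulative_decline)

# Level one year earlier, implied by undoing the final 80% drop.
previous_level = latest_level / (1 - last_year_decline)

# Share of the 2017 inflow already lost before the final year.
decline_before_last_year = 1 - previous_level / baseline_2017

print(f"Latest level: {latest_level:.1f}")
print(f"Implied level one year earlier: {previous_level:.1f}")
print(f"Implied decline before the last year: {decline_before_last_year:.0%}")
```

Under these assumptions, most of the cumulative decline would have occurred in the final year, which is consistent with the report's statement that the decline is accelerating.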
Notable AI Models
This section starts with the models themselves. Using Epoch AI’s curated dataset of notable models, this
section examines where frontier AI models are coming from, how they are deployed, and what it takes
to build them. Epoch AI designates models as noteworthy based on criteria such as state-of-the-art
advancements, historical significance, or high citation rates. Because the dataset is manually curated, it is not
a census of all AI models or a full map of all model development. Trends should be read as patterns
within the domain. The sections that follow track the infrastructure and inputs behind these systems,
including compute, data centers, energy costs, and open-source software, before looking at the broader
research ecosystem through publications, patents, and talent.
This chapter focuses on the research and development pipeline and its inputs. The next chapter, Technical
Performance, reviews model capabilities and benchmark performance in detail.
By National Affiliation²
Notable model production remains
concentrated within a small number of
countries (Figures –). Historically,
the United States has produced the
most notable models in total, followed
by China. This pattern continued in 2025:
the United States led with the release
of 50 notable AI models, followed by China
with 30 and South Korea with 5. The number of
new model releases declined year over
year across all major geographic areas.
1 New and historic models are continually added to the Epoch AI database, so the total year-by-year counts of models included in this year’s AI Index
might not exactly match those published in last year’s report. The data is based on a snapshot taken on February 12, 2026.
2 A machine learning model is associated with a specific country if at least one author of the paper introducing it is affiliated with an institution
based in that country. In cases where a model’s authors come from several countries, double-counting can occur.
3 This chart highlights model releases from a select group of geographic areas. More comprehensive data on model releases by country will be
available in the upcoming AI Index Global Vibrancy Tool.
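The attribution rule in footnote 2 can be sketched in a few lines. This is a minimal illustration, not Epoch AI's or the AI Index's actual pipeline: the record layout, the `author_countries` field, and the three sample models are hypothetical and invented for the example.

```python
from collections import Counter

def count_models_by_country(models):
    """Credit a model to every distinct country among its authors' affiliations."""
    counts = Counter()
    for model in models:
        # A set credits each country once per model, however many authors share it.
        for country in set(model["author_countries"]):
            counts[country] += 1
    return counts

# Hypothetical records for illustration only (not Epoch AI data).
models = [
    {"name": "model-a", "author_countries": ["United States", "United States"]},
    {"name": "model-b", "author_countries": ["United States", "China"]},
    {"name": "model-c", "author_countries": ["South Korea"]},
]

counts = count_models_by_country(models)
# The per-country credits can exceed the number of models: this is the double counting.
total_credits = sum(counts.values())
print(dict(counts), "credits:", total_credits, "models:", len(models))
```

Here "model-b" is credited to both the United States and China, so the country totals sum to more than the number of models, which is why per-country counts should not be added together to obtain a global total.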
Figure: Number of notable AI models by select geographic areas, 2025
United States: 50; China: 30; South Korea: 5; Canada: 1; France: 1; Hong Kong: 1; United Kingdom: 1
Source: Epoch AI, 2026 | Chart: 2026 AI Index report
Figure: Number of notable AI models by select geographic areas, 2003–25
2025 values: United States: 50; China: 30; Europe: 2
Source: Epoch AI, 2026 | Chart: 2026 AI Index report
Figure: Number of notable AI models by geographic area, 2003–25 (sum)
Map legend bins: 1–10, 11–20, 21–60, 61–180, 181–630
Source: Epoch AI, 2026 | Chart: 2026 AI Index report
By Sector and Organization
The development of notable AI models continues to be predominantly concentrated in industry (Figures
and ). Over the past decade, the share produced by industry has grown steadily and now represents the
largest share by a wide margin (%). In 2025, Epoch AI identified one notable AI model originating from
academia, compared to 87 from industry.
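As a quick check on the "over 90%" industry share, the sketch below recomputes sector shares from the 2025 counts given in this section: 87 industry and 1 academia from the text, plus 5 industry-academia collaborations and 2 "other" as labeled in the sector chart. It is plain arithmetic on those counts and does not reproduce any unpublished figure.

```python
# 2025 notable-model counts by sector, as reported in this section.
sector_counts = {
    "Industry": 87,
    "Industry-academia collaboration": 5,
    "Other": 2,
    "Academia": 1,
}

total = sum(sector_counts.values())

# Percentage share of each sector in the 2025 total.
shares = {sector: 100 * n / total for sector, n in sector_counts.items()}

for sector, share in shares.items():
    print(f"{sector}: {share:.1f}%")
```

The counts sum to 95, which matches the total number of notable models cited in the Model Release discussion.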
Within industry, a small set of organizations accounts for a large share of releases (Figures and ). In
2025, the top contributors were OpenAI (19), Google (12), and Alibaba (11). Since 2014, Google has produced
the largest number of notable models, followed by Meta and OpenAI. Within academia, Tsinghua University
(26), Stanford University (26), and Carnegie Mellon University (25) have been the most prolific over the past
decade.
Figure: Number of notable AI models by sector, 2003–25
2025 values: Industry: 87; Industry-academia collaboration: 5; Other: 2; Academia: 1
Source: Epoch AI, 2026 | Chart: 2026 AI Index report
Figure: Notable AI models (% of total) by sector, 2003–25
Series: Industry; Industry-academia collaboration; Other; Academia
Source: Epoch AI, 2026 | Chart: 2026 AI Index report
Figure: Number of notable AI models by organization, 2025
OpenAI: 19; Google: 12; Alibaba: 11; Anthropic: 7; xAI: 5; DeepSeek: 4; LG AI Research: 4;
Tsinghua University: 4; ByteDance: 3; Meta: 3; Moonshot: 3; University of Illinois: 3; (Zhipu AI): 3;
MiniMax: 2; Nvidia: 2; Allen Institute for AI (Ai2): 1; Ant Group: 1; Baidu: 1;
CUHK Shenzhen Research Institute: 1; Hong Kong Polytechnic University: 1
Legend: Academia, Industry, Nonprofit
Source: Epoch AI, 2026 | Chart: 2026 AI Index report
4 In the organizational tally figures, research published by DeepMind is classified under Google.
Figure: Number of notable AI models by organization, 2014–25 (sum)
Google: 191; Meta: 86; OpenAI: 59; Microsoft: 42; Nvidia: 29; Stanford University: 26;
Tsinghua University: 26; Alibaba: 25; Carnegie Mellon University: 25; UC Berkeley: 20;
University of Washington: 19; University of Oxford: 17; MIT: 15; Anthropic: 13; Baidu: 13;
Allen Institute for AI (Ai2): 12; Salesforce: 12; New York University: 11; ByteDance: 10;
Chinese University of Hong Kong: 10
Legend: Academia, Industry, Nonprofit
Source: Epoch AI, 2026 | Chart: 2026 AI Index report
Model Release
Release patterns for notable AI models have continued to shift toward controlled access (Figure ). In
2025, API access was the most common release type, with 45 of 95 models made available this way, and
API-only releases have steadily increased since 2020. The second most common release type was “open
weights (unrestricted),” meaning the models are fully available for use, modification, and redistribution. The
remaining models were released in a mix of access types, including “hosted access (no API),”5 “open weights
(restricted use),”6 and “open weights (noncommercial).” The “unknown” designation refers to models that
have unclear or undisclosed access types, and “unreleased” models remain proprietary, accessible only to
their developers or select partners.
Training code is becoming even less accessible than model code overall (Figure ). In 2025, 80 of 95
notable models were released without their corresponding training code, compared to 4 that made their
code “open source.” In 2020, models with open source and unreleased training code were about the same in
number, but by 2023, the majority were unreleased and the gap has continued to widen. This growing opacity
limits the ability of external researchers to reproduce results, audit development, and validate safety claims.
These challenges are central to the responsible AI and governance discussions in Chapter 3 and Chapter 8.
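The shares implied by the counts above can be checked with a few lines of arithmetic. The sketch below uses only the 2025 counts quoted in this section (45 API-access models, 80 with unreleased training code, and 4 with open source training code, each out of 95); the variable and function names are illustrative.

```python
# Counts quoted in the text for 2025 (95 notable models categorized by access type).
TOTAL_MODELS_2025 = 95
API_ACCESS = 45
TRAINING_CODE_UNRELEASED = 80
TRAINING_CODE_OPEN_SOURCE = 4


def share(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    return 100.0 * count / total


api_share = share(API_ACCESS, TOTAL_MODELS_2025)
unreleased_share = share(TRAINING_CODE_UNRELEASED, TOTAL_MODELS_2025)
open_source_share = share(TRAINING_CODE_OPEN_SOURCE, TOTAL_MODELS_2025)

print(f"API access: {api_share:.1f}% of 2025 notable models")
print(f"Training code unreleased: {unreleased_share:.1f}%")
print(f"Training code open source: {open_source_share:.1f}%")
```

By this arithmetic, just under half of the categorized 2025 models were API-only, and roughly five in six were released without training code.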
5 Hosted access refers to using computing resources or services (such as software, hardware, or storage) provided remotely by a third party, rather
than personally owning or managing them. Instead of running software or infrastructure locally, hosted access involves accessing these resources via
the cloud or another remote service, typically over the internet. For example, using GPUs through platforms like AWS, Google Cloud, or Microsoft
Azure—rather than running them on one’s own hardware—is considered hosted access.
6 Open weights models share their architecture at varying levels of restriction: “noncommercial” limits use to research purposes, “restricted use”
permits broader use with some conditions, and “unrestricted” places no limitations on use, modification, or redistribution.
Figure: Number of notable AI models by access type, 2014–25
Y-axis: number of notable AI models. Categories: API access, Hosted access (no API), Open weights (noncommercial), Open weights (restricted use), Open weights (unrestricted), Unreleased, Unknown.
Bar totals by year, 2014–25: 33, 29, 49, 55, 37, 68, 50, 78, 91, 119, 98, 95.
Source: Epoch AI, 2026 | Chart: 2026 AI Index report
7 Not all models in the Epoch database are categorized by access type, so the totals in Figures and may not fully align with those reported
elsewhere in the chapter.
Figure: Number of notable AI models by training code access type, 2014–25
Y-axis: number of notable AI models. Categories: Open source, Open (restricted use), Open (noncommercial), Unreleased, Unknown.
Bar totals by year, 2014–25: 33, 29, 49, 55, 37, 68, 50, 78, 91, 119, 98, 95.
Source: Epoch AI, 2026 | Chart: 2026 AI Index report
Parameter and Compute Trends
Parameter counts for notable AI models increased significantly from the early 2010s through 2022,
driven by the growing complexity of model architecture, greater data availability, improvements in hardware,
and the proven efficacy of larger models (Figures –). Since then, growth in reported parameter counts
has flattened, but this likely understates actual growth because some data points are missing. Several
of the most resource-intensive models released in recent years, including those from OpenAI, Anthropic, and
Google, have not publicly disclosed parameter counts, training dataset sizes, or training duration.
Similarly, training dataset sizes and training duration increased through the early 2020s, with leading models
training on tens of trillions of tokens over periods exceeding 100 days. Again, due to limited disclosure from
major frontier labs, the more recent data is incomplete.
8 Several of the figures in this section use a log scale to reflect the exponential growth in AI model parameters and compute in recent years.
Figure: Number of parameters of notable AI models by sector, 2003–25
X-axis: publication date. Y-axis: number of parameters (log scale). Series: Academia, Industry, Industry-academia collaboration, Other.
Source: Epoch AI, 2026 | Chart: 2026 AI Index report
Figure: Training dataset size of notable AI models, 2010–25
X-axis: publication date. Y-axis: training dataset size (tokens, log scale).
Labeled models include: AlexNet, Transformer, GPT-3 175B (davinci), PaLM (540B), GPT-4 (Mar 2023), Llama 3.1-405B, DeepSeek-V3, Qwen3-Max, Olmo 3.
Source: Epoch AI, 2026 | Chart: 2026 AI Index report
Figure: Training time of notable AI models, 2010–25
X-axis: publication date. Y-axis: training time (days, log scale).
Labeled models include: AlexNet, Transformer, BERT-Large, RoBERTa Large, GPT-3 175B (davinci), Megatron-Turing NLG 530B, PaLM (540B), GPT-4 (Mar 2023), Llama 3.1-405B, Grok 3, Olmo 3.
Source: Epoch AI, 2026 | Chart: 2026 AI Index report
Since compute can be estimated even when not directly reported, training compute trends for notable
models show clear growth over the same period (Figures and ). Compute requirements for
notable models have risen by several orders of magnitude, with industry accounting for the highest values.
When comparing the two countries with the highest model output, U.S. models continue to be more
computationally intensive than Chinese models. However, the comparison in recent years cannot be
fully substantiated because U.S. models have not directly reported their training compute.
Figure: Training compute of notable AI models by sector, 2003–25
X-axis: publication date. Y-axis: training compute (petaFLOP, log scale). Series: Academia, Industry, Industry-academia collaboration, Other.
Source: Epoch AI, 2026 | Chart: 2026 AI Index report
9 Estimating training compute is an important aspect of AI model analysis, yet it often requires indirect measurement. When direct reporting is
unavailable, Epoch estimates compute by using hardware specifications and usage patterns or by counting arithmetic operations based on model
architecture and training data. In cases where neither approach is feasible, benchmark performance can serve as a proxy to infer training compute by
comparing models with known compute values. Full details of Epoch’s methodology can be found in the documentation section of their website.
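The two estimation routes described in footnote 9 can be sketched as follows. This is a minimal illustration, not Epoch's implementation: the operation-counting function uses the widely cited approximation of about 6 FLOP per parameter per training token for dense transformers, and the hardware function multiplies chip count, peak throughput, utilization, and training time. The GPT-3 token count (about 300 billion) comes from its original paper rather than from this report, and all hardware inputs are hypothetical.

```python
PETA = 1e15  # FLOP per petaFLOP


def compute_from_operations(n_params: float, n_tokens: float) -> float:
    """Approximate training FLOP for a dense transformer as 6 * N * D."""
    return 6.0 * n_params * n_tokens


def compute_from_hardware(n_chips: int, peak_flop_per_s: float,
                          utilization: float, training_days: float) -> float:
    """Approximate training FLOP from hardware specifications and usage."""
    seconds = training_days * 24 * 3600
    return n_chips * peak_flop_per_s * utilization * seconds


# GPT-3: 175B parameters; ~300B training tokens is an outside assumption.
gpt3_flop = compute_from_operations(175e9, 300e9)
gpt3_petaflop = gpt3_flop / PETA

# Hypothetical cluster: 1,000 chips at 1e15 FLOP/s peak, 40% utilization, 30 days.
example_flop = compute_from_hardware(1000, 1e15, 0.4, 30)

print(f"GPT-3 (operation counting): {gpt3_flop:.2e} FLOP = {gpt3_petaflop:.2e} petaFLOP")
print(f"Hypothetical cluster (hardware method): {example_flop:.2e} FLOP")
```

Both routes return total floating-point operations, so either result can be placed on the petaFLOP axis used in the compute figures by dividing by 1e15.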
Figure: Training compute of select notable AI models in the United States and China, 2018–25
X-axis: publication date. Y-axis: training compute (petaFLOP, log scale). Series: United States, China.
Labeled models include: GPT-3 175B (davinci), GPT-4 (Mar 2023), Grok-2, Claude Sonnet, Grok 4, DeepSeek-V3, Doubao-pro, ERNIE Titan, Qwen3-Max.
Source: Epoch AI, 2026 | Chart: 2026 AI Index report
Highlight: Will Models Run Out of Data?
Last year, the AI Index highlighted concerns around data bottlenecks and the sustainability of the scaling
approach as it relates to training data. Leading AI researchers have publicly claimed that the available pool
of high-quality human text and web data for training large models has been exhausted, a state often referred
to as “peak data.” This has continued to raise industry-wide concerns about the sustainability of scaling laws,
which have historically depended on ever-larger datasets. One set of projections from Epoch AI suggests
that, under certain assumptions, the estimated depletion date could fall between 2026 and 2032.
Synthetic Data in Pre-training
Limits on the availability of real-world data may be less consequential if synthetic data (data generated by AI
systems) can be used to improve the performance of subsequent models. Previous editions of the AI Index
found no definitive evidence that synthetic data improves model performance during the pre-training phase.
The 2024 report referenced research suggesting that model performance can collapse when real training
data is replaced with synthetic data. The 2025 report noted more recent findings that such collapse can be
avoided if real data remains part of the training set, but that simply adding more data does not necessarily
lead to performance gains.
The consensus remains largely unchanged. There is still no definitive evidence that synthetic data can fully
offset real-data depletion in pre-training contexts. However, recent research suggests that synthetic data
may offer value in more limited settings. Hybrid training approaches, which combine real and synthetic data,
can significantly accelerate training, sometimes by a factor of five to 10 at scale, without surpassing real
data in final model performance. Training on purely synthetic data has shown promise for smaller models or
narrowly defined tasks, such as classification, code generation, or work in low-resource languages, but these
gains have not generalized to large, general-purpose language models. Where synthetic-only training has
achieved performance comparable to real data, it has typically involved substantially smaller models that are
not directly comparable to current state-of-the-art systems. For example, the SYNTHLLM family of models,
trained entirely on synthetic data, achieves strong results yet still lags behind leading models on major
benchmarks (Figure ).
10 Pre-training refers to the initial phase of model development in which a model is trained (typically via self-supervised learning) on large, general-
purpose datasets to acquire broad linguistic or multimodal representations. Post-training refers to subsequent refinement of the base model, through
techniques such as supervised fine-tuning or reinforcement learning, to specialize behavior, improve alignment, or optimize performance on particular
tasks.
Figure
Source: Qin et al., 2025
Data-centric Methods
Discussions on data availability often overlook an important shift in recent AI research. Performance gains
are increasingly driven by improving the quality of existing datasets, not by acquiring more. Rather than
scaling data indiscriminately, researchers are spending more effort in pruning, curating, and refining training
inputs. Data-centric methods emphasize performance improvements through practices such as cleaning
labels, deduplicating samples, and constructing higher-quality datasets. A growing body of research shows
that training models on low-quality or polluted data can significantly degrade performance. Likewise, recent
evidence illustrates that data pruning, selecting the most informative training inputs, often outperforms
approaches that train on all available data indiscriminately.
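One of the data-centric practices named above, deduplicating samples, can be illustrated with a minimal sketch. The example below performs exact-match deduplication after light text normalization; the function names and the toy corpus are hypothetical, and production pipelines typically add near-duplicate detection and quality filtering that this sketch does not attempt.

```python
import hashlib


def normalize(text: str) -> str:
    """Lowercase the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def deduplicate(samples: list[str]) -> list[str]:
    """Keep the first occurrence of each sample, comparing normalized hashes."""
    seen: set[str] = set()
    unique: list[str] = []
    for sample in samples:
        digest = hashlib.sha256(normalize(sample).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(sample)
    return unique


# Toy corpus (hypothetical): two of the four samples duplicate the first one.
corpus = ["The cat sat.", "the  cat sat.", "A different sentence.", "The cat sat."]
deduped = deduplicate(corpus)
print(f"Kept {len(deduped)} of {len(corpus)} samples")
```

Hashing normalized text keeps memory use proportional to the number of distinct samples, which is why this style of filter is a common first pass before more expensive curation steps.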
Recent large-scale model development illustrates this paradigm in practice. Olmo 3 researchers prioritized
large-scale deduplication, quality-aware data selection, and stage-specific training curricula rather than
indiscriminate data scaling. These interventions, combined with iterative feedback loops to evaluate and refine
candidate data mixes, allowed their models to achieve competitive performance despite training on substantially
fewer tokens than other leading state-of-the-art models (Figure ). Their Think 32B model, for example,
contains roughly 32 billion parameters, roughly 90 times fewer than Grok 4’s 3 trillion, yet it achieves
comparable performance on several benchmarks, including the American Invitational Mathematics Examination
(AIME)11 2025.
Figure: Model performance on AIME 2025
X-axis: model. Y-axis: score (0%–100%). Models shown: OLMo 3, Claude Opus 4.5, Grok-4, GPT-5 (high), Gemini 1.5 Pro.
Source: Artificial Analysis, 2026; Ai2, 2025 | Chart: 2026 AI Index report
11 The American Invitational Mathematics Examination (AIME) is an annual high school math competition widely used as a benchmark for AI
mathematical reasoning, with each year’s exam providing a fresh test set.
Synthetic Data in Post-training
Recent research shows that synthetically generated data can be effective for improving model performance
in post-training settings, including fine-tuning, alignment, instruction tuning, and reinforcement learning.
A growing body of research released in 2025 supports this finding. Evidence suggests that synthetic
post-training data is effective in few-shot generation settings, for improving long-context capabilities, for
optimizing reinforcement learning workflows, and for strengthening reasoning more broadly.
Prevalence of Synthetic Content
Since the launch of ChatGPT in November 2022, there have been predictions that the internet would soon
become overrun by AI-generated content. Recent research from Graphite suggests that beginning in January
2025, over 50% of newly published online content was generated by AI (Figure ). Others have projected
that the share in 2026 could be even higher.
Given growing concerns about the suitability of synthetic data for training AI systems, this trend raises
questions about the long-term reliability of current scaling trajectories. In response, many firms that depend
on high-quality training data have increasingly turned to proprietary sources. In May 2025, the New York
Times entered into a licensing agreement with Amazon to allow its content to be used for training purposes.
By mid-2025, Meta was reportedly engaged in similar discussions with news organizations, while health and life
sciences companies such as Bristol Myers Squibb have pursued comparable strategies. These developments suggest
that firms training frontier AI systems are adjusting their data acquisition strategies as the volume of openly
available training data continues to decline.
Figure: AI-generated content vs. human content
X-axis: January 2020 to January 2025. Y-axis: % of generated content (0%–100%). Series: Human, AI.
Source: Graphite, 2025 | Chart: 2026 AI Index report
Compute and Infrastructure
The development of AI models requires significant infrastructure investment. As training processes have
expanded in scale and complexity, the underlying hardware has also improved in both speed and efficiency.
In turn, these gains shape what kinds of models researchers and labs can realistically build. The growth in
training compute discussed in the previous section would not have been possible without corresponding
improvements in hardware capabilities. This section leverages data from Epoch AI to track hardware
performance, adoption, and aggregate computing capacity over time.
Performance and Efficiency
Peak computational performance of machine learning hardware has increased exponentially across releases
between 2008 and 2025 (Figure ). The gains are especially visible at lower precision types, where
precision refers to the number of bits used to represent numerical values. Lower precision formats such as
FP16 and Tensor-FP16/BF16 now show the highest performance levels and have become standard in many
training and inference settings.
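The trade-off behind lower precision formats can be seen by storing the same value in 32 and 16 bits. The sketch below uses Python's standard `struct` module, whose `f` and `e` format codes correspond to IEEE 754 single and half precision; BF16 and TF32 are not available in `struct` and are left out. The helper name `round_trip` is illustrative.

```python
import struct


def round_trip(value: float, fmt: str) -> float:
    """Store value in the given IEEE 754 format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


x = 1.0 / 3.0

as_fp32 = round_trip(x, "<f")  # 32-bit single precision
as_fp16 = round_trip(x, "<e")  # 16-bit half precision

err_fp32 = abs(as_fp32 - x)
err_fp16 = abs(as_fp16 - x)
bytes_fp32 = struct.calcsize("<f")
bytes_fp16 = struct.calcsize("<e")

print(f"FP32: {bytes_fp32} bytes, stored as {as_fp32!r}, error {err_fp32:.2e}")
print(f"FP16: {bytes_fp16} bytes, stored as {as_fp16!r}, error {err_fp16:.2e}")
```

Halving the bits per value halves memory traffic and lets hardware perform more operations per cycle, at the cost of a larger rounding error per stored number.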
Figure: Peak computational performance of ML hardware for different precisions, 2008–25
X-axis: publication date. Y-axis: performance (FLOP/s, log scale). Series: FP32, FP16, TF32 (19-bit), Tensor-FP16/BF16.
Source: Epoch AI, 2026 | Chart: 2026 AI Index report
Hardware for Notable Models
Hardware adoption patterns among notable AI models reflect the gains in performance and efficiency (Figure ).
Since 2017, the cumulative number of notable models trained on A100-class hardware has increased,
reaching 84 by 2025. The previous generation, V100, still accounts for a sizable share (69
models). Newer hardware, such as the H100, has seen rapid early adoption (28), while other categories, such
as TPU v3 and TPU v4, show stable curves.
Figure: Cumulative number of notable AI models trained by accelerator, 2017–25
Y-axis: cumulative number of notable AI models. Values for 2025: A100, 84; V100, 69; Other, 54; TPU v3, 44; H100, 28; TPU v4, 28; P100, 6; H800, 4.
Source: Epoch AI, 2026 | Chart: 2026 AI Index report
Global Computing Capacity
The supply of AI computing capacity from major chip designers has continued to increase (Figure ). Total
capacity has increased by an estimated per year since 2022, reaching approximately million H100-equivalents (H100e).
Nvidia AI chips currently account for over 60% of total compute, with Google and Amazon
supplying much of the remainder and Huawei holding a small but growing share. The growth in compute
capacity tracks closely with investment patterns described in Chapter 4, where leading AI companies have
increased their capital expenditure and infrastructure has become the fastest growing focus area of private
AI funding.
12 Since these estimates are inferred from revenue data, financial disclosures and analyst reports, they reflect broader trends rather than exact
counts. Data coverage also varies by manufacturer; Nvidia and Google data starts in 2022, while others start in 2024.
Figure: Global computing capacity from AI chips across major designers, 2022–25
X-axis: quarters, 2022 Q1 to 2025 Q4. Y-axis: cumulative compute capacity (H100e), 0M–18M. Series: Nvidia, Google, Amazon, AMD, Huawei.
Source: Epoch AI, 2026 | Chart: 2026 AI Index report
Data Center Power Capacity
The expansion of computing capacity carries a direct energy cost. Total AI data center power capacity
reached approximately GW by Q4 2025, enough to power all of New York state at peak demand (Figure
). AI chip power, measured by thermal design power, accounted for roughly GW of the total, with the
remainder attributed to cooling, networking, and other data center infrastructure. This estimate is based on
the rated power capacity of leading AI chips sold over time, with a multiplier of approximately applied to
account for the additional requirements of powering infrastructure.
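The estimation approach described above, rated chip power scaled by an overhead multiplier, reduces to one multiplication. The sketch below shows that calculation with hypothetical inputs; the chip power and multiplier used here are placeholders for illustration, not the report's values. The New York state reference of about 31 GW is taken from the accompanying figure.

```python
NY_STATE_PEAK_GW = 31.0  # reference line shown in the report's figure


def total_power_gw(chip_tdp_gw: float, overhead_multiplier: float) -> float:
    """Scale rated chip power (TDP) to total data center power."""
    return chip_tdp_gw * overhead_multiplier


# Hypothetical inputs for illustration only.
example_chip_tdp_gw = 10.0
example_multiplier = 2.0

example_total_gw = total_power_gw(example_chip_tdp_gw, example_multiplier)
example_overhead_gw = example_total_gw - example_chip_tdp_gw
share_of_ny_peak = example_total_gw / NY_STATE_PEAK_GW

print(f"Total: {example_total_gw:.1f} GW "
      f"(chips {example_chip_tdp_gw:.1f} GW, other infrastructure {example_overhead_gw:.1f} GW)")
print(f"Share of New York state peak demand: {share_of_ny_peak:.0%}")
```

Because the multiplier applies to every chip sold, small changes in the assumed overhead shift the total by gigawatts, which is one reason such figures are best read as estimates.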
Figure: Global AI data center power capacity, 2022–25
X-axis: quarters, 2022 Q1 to 2025 Q4. Y-axis: cumulative power capacity (GW), 0–40. Series: AI chip power (TDP); other AI data center power (cooling, networking, etc.).
Reference lines: peak usage in New York state ≈ 31 GW; Netherlands ≈ 19 GW; New Zealand ≈ 7 GW.
Source: Epoch AI, 2026 | Chart: 2026 AI Index report
Data Centers
The physical infrastructure underlying AI development extends beyond models and compute described in the
previous section. Data centers are where compute is housed, and their capacity, geographic distribution,
and underlying supply chains shape what AI systems can be built and where. This section draws on data
from Cloudscene to track the global distribution of data centers and introduces an overview of the broader
AI infrastructure ecosystem to provide context for the geographic and supply chain dynamics.
AI Infrastructure: Beyond GPUs
Modern AI data centers depend on a combination of compute, storage, communications, and specialized
hardware that enables AI systems to run at large scale. GPUs and custom accelerators such as Tensor
Processing Units (TPUs) are the most widely discussed, but they are only one layer of a broader infrastructure
stack. All data processed by these chips is held in high-bandwidth memory (HBM), which supports moving
large volumes of data in and out efficiently. The leading manufacturers of HBM are SK Hynix (South Korea),
Samsung (South Korea), and Micron (United States). During training, GPUs must continuously share data with one
another, which requires fast, high-throughput network connectivity achieved with fiber-optic cables running
high-bandwidth networking architectures such as InfiniBand.
The supply chain behind this hardware adds another dimension. Companies like Nvidia and AMD
design but do not manufacture chips. Instead, they provide designs to specialized semiconductor foundries,
primarily the Taiwan Semiconductor Manufacturing Company (TSMC) and Samsung Foundry, which fabricate
the chips at the nanometer scales modern AI hardware requires. The fabricated chips are then packaged and
tested by assembly companies such as ASE Group (Taiwan) and Amkor Technology (United States). TSMC
is a single point of dependency in the global AI supply chain, as it fabricates virtually every leading AI chip,
including Nvidia’s Blackwell GPUs and AMD’s MI300X. Barriers to entry are high at every layer: overcoming
them requires decades of accumulated expertise, specialized equipment, and significant capital investment.
The infrastructure ecosystem is relevant beyond AI capabilities, as it shapes education priorities and
workforce development. Chapter 7 (Education) distinguishes between AI software-related and AI hardware-
related degrees. That distinction is also relevant here, where different countries play different roles across the
hardware supply chain.
Geographic Distribution
Most of the world’s data center infrastructure is located in a small number of countries (Figures and
). In 2025, the United States led by a wide margin, with 5,427 data centers, more than 10 times the
count of any other country. Germany (529), the United Kingdom (523), and China (449) followed, while the
majority of the remaining countries each had fewer than 300 facilities. The U.S. may show a clear lead, but
the other country rankings should be assessed with the understanding that data center counts do not capture
differences in facility size, computing capacity, or utilization.
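The size of the U.S. lead can be checked directly from the counts quoted above. The sketch below uses only the four figures cited in the text; the dictionary and variable names are illustrative.

```python
# Data center counts for 2025 as cited in the text (source: Cloudscene).
data_centers_2025 = {
    "United States": 5427,
    "Germany": 529,
    "United Kingdom": 523,
    "China": 449,
}

us_count = data_centers_2025["United States"]

# Ratio of the U.S. count to each of the next three countries.
us_ratio = {
    country: us_count / count
    for country, count in data_centers_2025.items()
    if country != "United States"
}

for country, ratio in us_ratio.items():
    print(f"United States vs. {country}: {ratio:.1f}x")
```

As the surrounding text cautions, these ratios compare facility counts only and say nothing about facility size, computing capacity, or utilization.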
Figure: Global distribution of data centers, 2025
Map legend (number of data centers): 1–9, 10–19, 20–39, 40–59, 60–99, 100–149, 150–249, 250–349, 350–529, 530+, no available data.
Source: Cloudscene, 2025 | Chart: 2026 AI Index report
Figure: Number of data centers by geographic area, 2025
United States, 5,427
Germany, 529
United Kingdom, 523
China, 449
Canada, 337
France, 322
Australia, 314
Netherlands, 298
Russia, 251
Japan, 222
Brazil, 197
Mexico, 173
Italy, 168
India, 153
Poland, 144
Source: Cloudscene, 2025 | Chart: 2026 AI Index report
Energy and Environmental Impact
As AI systems have scaled and become more widely deployed, their energy consumption and environmental
footprint have become increasingly visible. The compute and infrastructure trends described in the preceding
sections translate into heavy energy and water demands and higher carbon emissions. This section examines
those costs across three areas of AI development: training, inference, and data center energy usage. The
analysis draws from Epoch AI’s model-level data, recent academic benchmarking research (Jegham et al., 2025),
the International Energy Agency’s reporting on data centers (IEA, 2025), and de Vries and Gao (2025).
Training
Leading machine learning hardware has grown more efficient since 2016, as measured in FLOP/s per watt
(Figure ). Leading chips deliver about 10 times more computation per watt than those available a decade
ago, with Nvidia B200 and Google TPU v5e among the most efficient. However, models have scaled faster
than efficiency has improved, so total power required to train frontier systems has continued to increase.
Total power draw for training models has grown by several orders of magnitude since the early 2010s (Figure
). The most compute-intensive models in the data set, such as Grok 3 and Llama 4 Behemoth, required
upward of 100 million watts during training. Due to limited disclosure by their developers, power draw
information is not available for many of the newest models that have been released.
Carbon emissions from training have increased even more sharply (Figure ). Training AlexNet in 2012
produced an estimated tons of CO2 equivalent, while training Grok 4 in 2025 produced about 72,816
tons. To put this into context, that is more than 1,000 times the lifetime carbon emissions of an average car (63 tons).
Larger models generally produce more emissions but not always, as it can also depend on hardware
efficiency, training duration, and the carbon intensity of the energy sources used. DeepSeek v3, for example,
produced approximately 597 tons, which is much less than models of comparable size (Figure ).
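The comparisons in this paragraph follow from simple ratios, and the dependence on energy source can be sketched with a basic formula. In the example below, the three emission figures are the ones quoted in the text; the function `emissions_tons` and its inputs (average power draw, training days, and grid carbon intensity) are a simplified, hypothetical model rather than the method used to produce the report's estimates.

```python
# Estimates quoted in the text, in tons of CO2 equivalent.
GROK_4_TONS = 72_816
DEEPSEEK_V3_TONS = 597
CAR_LIFETIME_TONS = 63

car_lifetimes = GROK_4_TONS / CAR_LIFETIME_TONS    # about 1,156 car lifetimes
grok_vs_deepseek = GROK_4_TONS / DEEPSEEK_V3_TONS  # about 122 times


def emissions_tons(power_mw: float, days: float, kg_co2e_per_kwh: float) -> float:
    """Simplified estimate: energy used (kWh) times grid carbon intensity."""
    energy_kwh = power_mw * 1000.0 * days * 24.0
    return energy_kwh * kg_co2e_per_kwh / 1000.0


# Hypothetical run: 10 MW average draw for 30 days on a 0.4 kg CO2e/kWh grid.
example_tons = emissions_tons(10.0, 30.0, 0.4)

print(f"Grok 4 vs. one car lifetime: {car_lifetimes:,.0f}x")
print(f"Grok 4 vs. DeepSeek v3: {grok_vs_deepseek:,.0f}x")
print(f"Hypothetical run: {example_tons:,.0f} tons CO2e")
```

Under this simplified model, halving the grid's carbon intensity halves emissions for the same training run, which is consistent with the point that model size alone does not determine the footprint.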
Figure: Energy efficiency of leading machine learning hardware, 2016–25
X-axis: publication date. Y-axis: energy efficiency (FLOP/s per watt, log scale). Series: leading hardware, non-leading hardware.
Labeled hardware: NVIDIA P100, Google TPU v2, Google TPU v3, Google TPU v4, NVIDIA Tesla V100 SXM2 32 GB, Google TPU v4i, NVIDIA A100, Google TPU v5e, NVIDIA B100, NVIDIA H100 SXM5 80GB, NVIDIA B200, NVIDIA GB200.
Source: Epoch AI, 2026 | Chart: 2026 AI Index report
Figure: Total power draw required to train frontier models, 2011–25
X-axis: publication date. Y-axis: total power draw required (watts, log scale).
Labeled models: GPT-3 175B (davinci), PaLM (540B), GPT-4, Llama 3.1-405B, Grok 3, Llama 4 Behemoth (preview).
Source: Epoch AI, 2026 | Chart: 2026 AI Index report
Figure: Estimated carbon emissions from training select AI models and real-life activities, 2012–25
Y-axis: carbon emissions (tons of CO₂ equivalent). Models shown: AlexNet, VGG16, BERT-Large, RoBERTa Large, GPT-3, Megatron-Turing NLG, GLM-130B, Falcon-180B, GPT-4, DeepSeek v3, Llama 3.1 405B, Grok 3, Grok 4.
Labeled values (tons): GPT-3, 588; Megatron-Turing NLG, 1,432; GLM-130B, 301; Falcon-180B, 2,973; GPT-4, 5,184; DeepSeek v3, 597; Llama 3.1 405B, 8,930; Grok 3, 59,200; Grok 4, 72,816.
Reference activities: air travel (1 passenger, NY↔SF); human life (avg., 1 year); American life (avg., 1 year); car usage (avg., incl. fuel, 1 lifetime), 63.
Source: AI Index, 2026; Strubell et al., 2019 | Chart: 2026 AI Index report
Figure: Estimated carbon emissions and number of parameters by select AI models
X-axis: carbon emissions (tons of CO₂ equivalent, log scale). Y-axis: number of parameters (log scale).
Models shown: AlexNet, VGG16, BERT-Large, RoBERTa Large, GPT-3, Megatron-Turing NLG, GLM-130B, Falcon-180B, GPT-4, DeepSeek v3, Llama 3.1 405B, Grok 3, Grok 4.
Source: AI Index, 2026 | Chart: 2026 AI Index report
Inference
Training costs have typically received the most attention, but inference represents a growing share of AI’s
total energy footprint. Once a model is deployed at scale, the cumulative energy required to serve queries
can exceed the one-time cost of training within months.
Recent benchmarking by Jegham et al. (2025) provides per-model estimates of inference energy consumption
and carbon emissions for medium-length prompts (defined as approximately 1,000 input tokens and 1,000
output tokens). Among the top 15 models by energy consumption in 2025, DeepSeek V3.2 Exp and DeepSeek V3.2
consumed the most per query (23 Wh), followed by GPT-5 (high) at Wh (Figure ). Models
such as Claude 4 Opus and GPT-5 mini (medium) sit at the lower end, consuming between 5 and 6 Wh.
When ranked by carbon emissions, the models follow a similar pattern (Figure ). DeepSeek V3.2
Exp and DeepSeek V3.2 produced the highest emissions per medium-length prompt, approximately 14 grams of CO2
equivalent each. For comparison, Claude 4 Opus and Mistral Medium 3 were the lowest at and grams,
respectively. There is a wide spread even among models released in the same year, showing not only that
inference efficiency varies but that higher capability is not necessarily proportional to the environmental cost.
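The energy and emissions figures for the same model can be related through an implied carbon intensity. The sketch below divides the per-prompt emissions quoted above (about 14 g CO2e) by the per-prompt energy (about 23 Wh); because both inputs are rounded, the result is a rough back-of-envelope value, and it is not a description of how Jegham et al. compute their estimates. The function name is illustrative.

```python
def implied_intensity_g_per_kwh(emissions_g: float, energy_wh: float) -> float:
    """Grams of CO2e per kWh implied by a per-prompt emissions/energy pair."""
    return emissions_g / (energy_wh / 1000.0)


# Rounded per-prompt values quoted in the text for the highest-consuming models.
ENERGY_WH = 23.0
EMISSIONS_G = 14.0

implied_intensity = implied_intensity_g_per_kwh(EMISSIONS_G, ENERGY_WH)
print(f"Implied carbon intensity: {implied_intensity:.0f} g CO2e per kWh")
```

Comparing this ratio across models hosted in different regions is one way to separate differences in energy use from differences in the carbon intensity of the electricity supplying the data center.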
Figure: Model energy consumption for medium-length prompts
Y-axis: energy consumption (avg., Wh), 0–25. X-axis groups: 2023, 2024, 2025.
Models shown, in chart order: GPT-4; GPT-4 Turbo; GPT-3.5 Turbo; DeepSeek V3 (DeepSeek); Llama 3.1 405B Standard; Mistral Large 2 (AWS); Mistral Large 2 (Azure); o1; Claude 3.5 Haiku; Llama 3.1 70B Standard; Llama 3.2 90B (Vision); Llama 3.1 405B Latency Optimized; Llama 3 70B; GPT-4o (May); GPT-4o mini; Claude 3 Haiku; GPT-4o (Nov); GPT-4o (Aug); DeepSeek V3.2 Exp; DeepSeek V3.2; GPT-5 (high); o3-pro; GPT-5 mini (high); GPT-5 (medium); Grok 4; GPT-5 (low); Kimi K2 Thinking; GPT-5 nano (high); o3-mini (high); o4-mini (high); Grok 3 Fast; Claude 4 Opus; GPT-5 mini (medium).
Source: Jegham et al., 2025 | Chart: 2026 AI Index report
13 This figure shows the top 15 models by energy consumption for 2024 and 2025. The full set of models is available through the source dashboard.
Figure: Model carbon emissions for medium-length prompts
Y-axis: carbon emissions (avg., g CO₂e), 0–18. X-axis groups: 2023, 2024, 2025.
Models shown, in chart order: GPT-4; GPT-4 Turbo; DeepSeek V3 (DeepSeek); Llama 3.1 405B Standard; Mistral Large 2 (AWS); Grok 3 Fast; Mistral Large 2 (Azure); o1; Claude 3.5 Haiku; Llama 3.1 70B Standard; Llama 3.2 90B (Vision); Llama 3.1 405B Latency Optimized; Llama 3 70B; GPT-4o (May); DeepSeek V3 (Azure); GPT-4o mini; GPT-4o (Nov); DeepSeek V3.2 Exp; DeepSeek V3.2; GPT-5 (high); o3-pro; GPT-5 mini (high); Grok 4; GPT-5 (medium); GPT-5 (low); GPT-5 nano (high); Kimi K2 Thinking; o3-mini (high); o4-mini (high); GPT-5 mini (medium); Claude 4 Opus; Mistral Medium 3.
Source: Jegham et al., 2025 | Chart: 2026 AI Index report
14 This figure shows the top 15 models by energy consumption for 2024 and 2025. The full set of models is available through the source dashboard.
At the level of a single query, the numbers seem
more modest. A short GPT-4o query consumes
approximately Wh, which is 40% more
than a Google search at Wh (Figure ).
A daily session of eight medium-length queries
uses energy comparable to charging two
smartphones ( Wh). But across hundreds of
millions of daily queries, consumption scales
into something much larger.
The same scaling dynamic is true for water
consumption (Figure ). Annual estimates
for GPT-4o inference range from about to
kiloliters, which, at the high end, exceeds the
annual drinking water needs of 12 million people.
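The step from a single query to aggregate consumption is simple multiplication, which the following minimal sketch makes explicit. The function name and both input values are hypothetical placeholders chosen for illustration; they are not figures from this report or from Jegham et al.

```python
def aggregate_energy_mwh(wh_per_query: float, queries_per_day: float, days: int = 1) -> float:
    """Total energy in MWh for a per-query cost (Wh) and a daily query volume."""
    if min(wh_per_query, queries_per_day, days) < 0:
        raise ValueError("inputs must be non-negative")
    total_wh = wh_per_query * queries_per_day * days
    return total_wh / 1_000_000  # 1 MWh = 1,000,000 Wh


# Hypothetical inputs for illustration only (not values from the report).
PLACEHOLDER_WH_PER_QUERY = 1.0
PLACEHOLDER_QUERIES_PER_DAY = 500_000_000

daily_mwh = aggregate_energy_mwh(PLACEHOLDER_WH_PER_QUERY, PLACEHOLDER_QUERIES_PER_DAY)
yearly_gwh = aggregate_energy_mwh(
    PLACEHOLDER_WH_PER_QUERY, PLACEHOLDER_QUERIES_PER_DAY, days=365
) / 1_000  # 1 GWh = 1,000 MWh
print(f"{daily_mwh:,.0f} MWh per day, {yearly_gwh:,.1f} GWh per year")
```

Substituting a measured per-query value and an observed query volume for the placeholders gives the corresponding aggregate estimate; the same structure applies to water if the per-query unit is liters.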
[Figure: Per-query and daily energy consumption: GPT-4o vs. common activities. Y-axis: energy consumption (Wh). Bars: 1 Google search; GPT-4o (Mar 2025) short, medium, and long queries; daily session of 8 short queries; daily session of 8 medium queries; charging 2 phones.]
Source: AI Index, 2025 | Chart: 2026 AI Index report
[Figure: Annual water consumption: GPT-4o vs. real-world baselines. Y-axis: annual water consumption (kL). Bars, in order: 1 person annual drinking water (avg.); 1 Olympic swimming pool; 500 Olympic swimming pools (aggregate); 12 million people annual drinking water (aggregate); GPT-4o inference (minimum estimate); GPT-4o inference (maximum estimate). Data labels as extracted, in order: 1; 2,500; 1,250,000; 1,314,000; 1,334,991; 1,579,680.]
Source: AI Index, 2025 | Chart: 2026 AI Index report
Data Center Usage
The power demands of models and queries add up to a much larger infrastructure footprint. The estimated
power demand from AI accelerator modules reached approximately 5,200 MW cumulatively through 2024
(Figure ). Nvidia accounted for the largest share, which is consistent with the company’s leading position
in global AI chip capacity (as discussed in Section ). When including the full systems supporting those
accelerators (servers, cooling, networking), estimated demand reached approximately 9,400 MW (Figure
). However, these figures from de Vries-Gao (2025) carry uncertainty from variation in utilization
rates and facility-level efficiency, as reflected in the error bars on the charts.
To put that scale in perspective, the cumulative power demand of all-in AI systems is comparable to the
national electricity consumption of Switzerland or Austria, and roughly half that of Bitcoin mining (Figure
). Excluding crypto, global data centers accounted for the highest estimated power demand at around
47,000 MW, with AI hardware making up a growing share of that total.
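The figures above are stated as power demand (MW), while the regional data center chart later in this section is stated as annual electricity consumption (TWh). A minimal sketch of the conversion follows. It assumes the stated demand is drawn continuously for a full year, which is a simplifying assumption of this example rather than a claim from de Vries-Gao; the `utilization` parameter and the function name are illustrative.

```python
HOURS_PER_YEAR = 8760  # 365 days * 24 hours; leap years ignored


def mw_to_twh_per_year(power_mw: float, utilization: float = 1.0) -> float:
    """Convert an average power draw in MW to annual energy in TWh."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    mwh_per_year = power_mw * HOURS_PER_YEAR * utilization
    return mwh_per_year / 1_000_000  # 1 TWh = 1,000,000 MWh


# Power demand estimates quoted in the text above, in MW.
estimates_mw = {
    "AI accelerator modules": 5_200,
    "All-in AI systems": 9_400,
    "Data centers (excl. crypto)": 47_000,
}

annual_twh = {name: mw_to_twh_per_year(mw) for name, mw in estimates_mw.items()}
for name, twh in annual_twh.items():
    print(f"{name}: about {twh:.0f} TWh per year at continuous draw")
```

Lowering `utilization` below 1.0 shows how sensitive the annual total is to the utilization uncertainty the text mentions.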
[Figure: Estimated power demand of all-in AI systems. Y-axis: megawatts (MW). Bars: 2023, 2024, and cumulative, split into Nvidia, AMD, and other AI systems.]
Source: de Vries-Gao, 2025 | Chart: 2026 AI Index report
[Figure: Estimated power demand of AI accelerator modules. Y-axis: megawatts (MW). Bars: 2023, 2024, and cumulative, split into Nvidia, AMD, and other AI accelerator modules.]
Source: de Vries-Gao, 2025 | Chart: 2026 AI Index report
[Figure: Estimated power demand: AI hardware vs. national consumption, bitcoin mining, and global data centers. X-axis: power demand (thousands of megawatts). Bars: Ireland; AI accelerator modules; Switzerland; Austria; Finland; all-in AI systems; the Netherlands; Bitcoin mining; United Kingdom; France; data centers (excl. crypto).]
Source: de Vries-Gao, 2025 | Chart: 2026 AI Index report
Cost, however, has been moving in the opposite direction. Since 2006, the cost of GPU computation has
fallen by more than 99% (Figure ). This decline has been key to enabling the scaling trends described
throughout this chapter, making it economically feasible to train and deploy models at levels that would
have been cost-prohibitive even a decade ago. At the regional level, data center electricity consumption has
increased across all major regions, and it is projected to continue to rise through 2030 (Figure ). The
United States accounts for the largest share, followed by China, Europe, and the rest of Asia.
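To make the GPU cost decline concrete, the following sketch converts a cumulative change into the constant yearly rate that would produce it. It treats the decline as smooth compounding over the 18 years from 2006 to 2024, which is an assumption of this example and not how the IEA index necessarily behaved year to year; the function name is illustrative.

```python
def implied_annual_change(cumulative_change: float, years: int) -> float:
    """Constant yearly rate that compounds to `cumulative_change` over `years`.

    Changes are fractions: -0.99 means a 99% decline.
    """
    if years <= 0:
        raise ValueError("years must be positive")
    if cumulative_change <= -1.0:
        raise ValueError("a decline of 100% or more has no finite annual rate")
    return (1.0 + cumulative_change) ** (1.0 / years) - 1.0


# A 99% fall between 2006 and 2024, the lower bound stated in the text.
rate = implied_annual_change(-0.99, 2024 - 2006)
print(f"Implied average change: {rate:.1%} per year")
```

Under that assumption, a 99% cumulative decline corresponds to costs falling by roughly 22 to 23 percent per year; because the text says "more than 99%", the true average rate would be at least that steep.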
[Figure: GPU computation cost, 2006–24. Y-axis: GPU cost index (2006 = 1).]
Source: International Energy Agency (2025), Energy and AI, IEA, Paris | Chart: 2026 AI Index report
[Figure: Data center electricity consumption by region, 2020–30. Y-axis: electricity consumption (TWh). Series: United States, China, Europe, Asia excluding China, rest of the world.]
Source: International Energy Agency (2025), Energy and AI, IEA, Paris | Chart: 2026 AI Index report
15 Data in this chart reflects IEA projections rather than observed consumption.
1 RESEARCH AND DEVELOPMENT | AI INDEX REPORT 2026
Open-Source AI Software
The preceding sections have focused on notable frontier models and the infrastructure required to build and
maintain them. Open-source platforms like GitHub and Hugging Face offer a different view that captures the
developer ecosystem experimenting with and building on AI models. Much of this activity is not reflected in
academic publications or frontier model releases. The AI Index analyzes data from both platforms16 to better
understand how open-source AI development is evolving over time.
16 Chinese researchers often use alternatives to GitHub, such as Gitee and GitCode, for code sharing, but the data from those sites is not included in
this report. A full methodological description is available in the Appendix.
AI Development Activity Overview
Projects
The scale of open-source development has grown steadily. The number of AI-related GitHub projects
increased from 1,549 in 2011 to approximately million in 2025, with year-over-year growth accelerating
% from 2024 (Figure ). However, most repositories consist of personal or experimental work
and receive minimal attention. When filtering for projects with at least 10 stars, a rough proxy for community
engagement, the count drops to 206,880 in 2025 (Figure ). The growth trajectory is similar for both
measures.
[Figure: Number of GitHub AI projects, 2011–25. Y-axis: number of AI projects (in millions).]
Source: GitHub, 2025 | Chart: 2026 AI Index report
OPEN-SOURCE AI SOFTWARE | RESEARCH AND DEVELOPMENT | AI INDEX REPORT 2026
[Figure: Number of GitHub AI projects with at least 10 stars, 2011–25. Y-axis: number of AI projects (in thousands).]
Source: GitHub, 2025 | Chart: 2026 AI Index report
The geographic distribution of more visible
open-source AI projects has shifted over time
(Figure ). Among projects with at least
10 stars, the United States accounted for the
largest share in 2025 (%), though that has
declined steadily from nearly 80% in 2011 as
developers in other regions have increased
their presence on the platform. Europe and
the rest of the world have grown in number
of projects, while China’s share has leveled
off since 2019. India remains a growing
contributor, representing % of projects
with at least 10 stars. Because GitHub data
does not capture Chinese developers who use
domestic platforms such as Gitee or GitCode,
China’s share of global open-source AI activity
is likely understated. The existing geographic
attribution for China uses self-reported
location rather than IP-based geolocation.
[Figure: GitHub AI projects with at least 10 stars (% of total) by geographic area, 2011–25. Y-axis: AI projects (% of total). Series: United States, rest of the world, Europe, China, India.]
Source: GitHub, 2025 | Chart: 2026 AI Index report
Stars
Beyond project counts, GitHub stars provide another signal of developer interest and engagement in
open-source communities (Figure ). The total number of stars for AI projects increased from 14 million in 2023
to million in All major geographic regions saw year-over-year increases. However, the geographic
pattern for stars differs from the project share data above. Despite its declining share of projects, the United
States accumulated the highest number of stars at 30 million cumulatively (Figure ). So while open-source
activity becomes more geographically distributed, the projects with the most engagement remain
disproportionately U.S.-based.
17 In previous AI Index reports, project locations for China and Hong Kong were determined using IP-based geolocation. Due to frequent VPN usage
in these regions that resulted in systematic misclassification, self-reported profile locations are now used for China and Hong Kong, while IP-based
geolocation continues to be applied for all other countries.
18 Figure shows new stars given to GitHub projects within a year, not the total accumulated over time.
[Figure: Number of GitHub stars in AI projects, 2011–25. Y-axis: number of GitHub stars (in millions).]
Source: GitHub, 2025 | Chart: 2026 AI Index report
[Figure: Number of GitHub stars by geographic area, 2011–25. Y-axis: number of cumulative GitHub stars (in millions). Series: United States, rest of the world, Europe, China, India.]
Source: GitHub, 2025 | Chart: 2026 AI Index report
Model and Dataset Ecosystem
To complement the GitHub view, this section uses metadata from Hugging Face, a widely used community
platform and open repository for AI models and datasets. The analysis focuses on assets created or uploaded
between 2022 and 2025 to understand recent activity and adoption trends (Figures and ). Upload
activity has continued to rise over the last few years, with a marked increase after the second quarter of
2024. From 2023 to 2025, model uploads more than tripled, while dataset uploads grew fourfold. Download
distribution also shifted after 2023. Geographically,19 U.S.-developed models lost share to unaffiliated users.
On the developer side, major private actors such as Google and Meta have shifted from being the principal
authors to accounting for a relatively small share of downloads, while communities such as Sentence
Transformers and the BERT community have grown (Figure ). A large share of total model downloads
fell into an “Others” category, reflecting the wider distribution of development activity even as the most
downloaded models were tied to a small number of sources.
19 Data was obtained in collaboration with researchers from Longpre et al. (2025). Their dataset provides Hugging Face model download data that
the authors describe as consistent and relatively complete. It was validated with the Hugging Face team, is reported to be less noisy than raw counts,
and includes cleaned and imputed missing metadata. It is released as a weekly panel rather than an all-time-downloads cross-section. Coverage spans
March 2020 to August 2025 and includes the top 200 most-downloaded Hugging Face models per week. These models account for % of total
normalized, filtered downloads. This restriction focuses the analysis on models with higher observed download volume, reduces long-tail variation,
and may support more stable estimates.
[Figure: Number of models and datasets on Hugging Face, 2022–25. Y-axis: number of models and datasets (in thousands). X-axis: quarters, 2022 Q1 to 2025 Q4. Series: models, datasets.]
Source: Hugging Face, 2025 | Chart: 2026 AI Index report
[Figure: Global distribution of downloads among top Hugging Face models, Q2 2020–Q3 2025. Y-axis: downloads share. X-axis: quarters, 2020 Q2 to 2025 Q3. Series: United States, unaffiliated user, international/online, China, United Kingdom, Germany, Finland, France, India, Japan, others.]
Source: Longpre et al., 2025; Hugging Face, 2025 | Chart: 2026 AI Index report
20 The data shown in this chart comes from the publicly accessible Hugging Face repository. For more details, refer to the Appendix.
21 Data source: Longpre et al. (2025). For more details, refer to the Appendix.
[Figure: Download share by developer among top Hugging Face models, Q2 2020–Q3 2025. Y-axis: downloads share. X-axis: quarters, 2020 Q2 to 2025 Q3. Series: Google, OpenAI, timm, stable-diffusion, sd-concepts-library, Bingsu, Deepseek, Meta, lllyasviel, lmstudio-community, Other Top 20 (inclusive).]
Source: Longpre et al., 2025; Hugging Face, 2025 | Chart: 2026 AI Index report
22 Data source: Longpre et al. (2025). For more details, refer to the Appendix.
23 Data source: Longpre et al. (2025). For more details, refer to the Appendix.
The most popular model types have shifted over the last three years. Text embedders, classifiers, and audio
models, which together accounted for nearly 70% of downloads in 2022, fell to less than 6% in 2025 (Figure
). Text generation, multimodal, and video generation models have grown in their place. Text generation
led in 2025, accounting for more than 42% of total downloads. Image generation models also increased
steadily, remaining the second most downloaded category. Despite these shifts, downloads remain highly
concentrated, with nearly 80% associated with the top three categories.
[Figure: Download share by modality among top Hugging Face models, Q3 2022–Q3 2025. X-axis: 2022 to 2025. Categories: text generation, image generation, multimodal generation, video generation, undocumented, audio models, text embed/class, multimodal embedding, image embedding, tabular models.]
Source: Longpre et al., 2025; Hugging Face, 2025 | Chart: 2026 AI Index report
Publications
The first half of this chapter tracked the models, infrastructure, and energy behind AI development. This
section shifts to research output, specifically English-language AI publications and citations. Publications
offer a longitudinal signal of AI research activity at scale, and the AI Index has tracked them consistently over
time. While publication volume is not a measure of research quality, and not all research appears in indexed
databases, this approach offers a consistent method for tracking the research frontier year over year. The
analysis draws from OpenAlex, a bibliographic database24 the AI Index has used since 2025, and considers
both publication volume and downstream influence through citation patterns.
24 OpenAlex is a fully open catalog of scholarly metadata, including scientific papers, authors, institutions, and more. The AI Index used OpenAlex
as a bibliographic database and automatically classified AI-related research using the latest version of the CSO Classifier. The CSO Classifier () is
an automated text classification system designed to categorize research papers in computer science using a comprehensive ontology of 15,000 topics
and 166,000 relationships, including emerging fields like GenAI, large language models (LLMs), and prompt engineering. It processes metadata (such
as title and abstract) through three modules: a syntactic module for exact topic matches, a semantic module leveraging word embeddings to infer
related topics, and a post-processing module that refines results by filtering outliers and adding relevant higher-level areas.
Total Number of AI Publications
Total AI publication output continues to rise. AI publications more than doubled between 2013 and 2024,
increasing from roughly 102,000 to about 258,000 (Figure ). Growth continued in 2024, though at a
slower rate, with publications increasing % from 2023. AI research now makes up a substantial portion
of the broader computer science ecosystem, accounting for % of all computer science publications in
OpenAlex.
[Figure: AI publications in CS worldwide, 2013–24. Left y-axis: number of AI publications in CS (in thousands). Right y-axis: AI publications in CS (% of total).]
Source: AI Index, 2025 | Chart: 2026 AI Index report
PUBLICATIONS | RESEARCH AND DEVELOPMENT | AI INDEX REPORT 2026
By Venue
In 2024, journals accounted for the largest share of AI publications (%), followed by conferences (%)
(Figure ). Since 2013, both journal and conference publications have increased in absolute numbers,
though their relative shares have shifted. The proportion of AI publications appearing in conferences has
steadily declined from % in 2013 to its current level. The most recent year’s results, however, may also
reflect a lag in venue assignment, as papers often appear first in repositories25 like arXiv before being formally
published in a journal or conference.
25 In this context, ‘repository’ refers to preprint platforms such as arXiv, where researchers post papers prior to or independent of formal
peer-reviewed publication in a journal or conference.
[Figure: Number of AI publications in CS by venue type, 2013–24. Y-axis: number of AI publications in CS (in thousands). Series: journal, repository, conference, book, other, dissertation.]
Source: AI Index, 2026 | Chart: 2026 AI Index report
Conference Attendance
Publication venue patterns capture where AI research is formally published, while conference attendance
offers a complementary view of research community engagement. Across the 16 major conferences tracked
by the AI Index—AAAI, AAMAS, CVPR, EMNLP, FAccT, ICAPS, ICCV, ICLR, ICML, ICRA, IJCAI, IROS, KR,
NeurIPS, UAI, and IUI—total attendance increased in 2024 from the previous year (Figure ). The largest
conferences, including NeurIPS, CVPR, and ICML, continued to draw the highest attendance, while smaller
ones such as ICAPS, KR, and UAI maintained stable participation levels (Figures and ). This data
should be interpreted with caution, as many conferences have recently switched to virtual or hybrid formats.
Conference organizers report that measuring the exact attendance numbers at virtual conferences is difficult,
as virtual conferences allow for higher attendance by researchers from around the world. The AI Index
reports total attendance figures, encompassing virtual, hybrid, and in-person participation.
[Figure: Attendance at select AI conferences, 2010–25. Y-axis: number of attendees (in thousands).]
Source: AI Index, 2025 | Chart: 2026 AI Index report
[Figure: Attendance at larger conferences, 2010–25. Y-axis: number of attendees (in thousands). Series: NeurIPS, ICLR, CVPR, IROS, ICML, ICCV, ICRA, ACL, AAAI, EMNLP, IJCAI.]
Source: AI Index, 2025 | Chart: 2026 AI Index report
26 The significant spike in ICML attendance in 2021 was likely due to the conference being held virtually that year.
[Figure: Attendance at smaller conferences, 2010–25. Y-axis: number of attendees (in thousands). Series: FAccT, AAMAS, UAI, IUI, ICAPS, KR.]
Source: AI Index, 2025 | Chart: 2026 AI Index report
By National Affiliation28
In 2024, China accounted for % of AI publications, compared to % from Europe and % from
India (Figure ). Chinese AI publications also accounted for % of all AI citations in 2024, followed
closely by Europe at % and the United States at % (Figure ). The United States saw a decline of
3 percentage points in publication share, though its citation share remained relatively unchanged (% in
2024 vs. % in 2023). The “unknown” share in publication data rose to % in 2024, a spike that likely
reflects changes in metadata coverage. The geographic distribution of publications and citations adds
context to the notable model trends discussed earlier in the chapter, where a relatively small number of countries
account for a disproportionate share of activity.
27 IUI 2021 and 2022 were held exclusively virtually.
28 Regions in this chapter are classified according to the World Bank analytical grouping. The AI Index determines an author’s country affiliation
using the “countries” field from the authorship data. This field lists all the countries with which an author is affiliated, as retrieved from OpenAlex
based on institutional affiliations. These affiliations can be explicitly stated in the paper or inferred from the author’s most recent publications. When
counting publications by country, the AI Index assigns one count to each country linked to the publication. For example, if a paper has three authors,
two affiliated with institutions in the United States and one in China, the publication is counted once for the United States and once for China.
29 A publication may have an “unknown” country affiliation when the author’s institutional affiliation is missing or incomplete. This issue arises due
to various factors, including unstructured or omitted institution names, platform functional deficiencies, group authorship practices, unstandardized
affiliation labeling, document type inconsistencies, or the author’s limited publication record. The problem as it relates to OpenAlex is addressed in
this paper; however, the issue of missing institutions pertains to other bibliographic databases as well.
30 For the sake of brevity, the AI Index visualized results for a select group of countries. However, complete results for all countries will be available
on the AI Index’s Global Vibrancy Tool by the end of 2026. For immediate access to country-specific research and development data, please contact
the AI Index team.
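The country counting rule described in footnote 28 can be sketched as follows: each country linked to a publication receives one count, regardless of how many of its authors are affiliated there. The input layout (one list of country codes per author) and the function name are assumptions made for this example; they are not the AI Index pipeline or the OpenAlex schema.

```python
from collections import Counter


def count_publications_by_country(publications):
    """Credit each country once per publication it is linked to.

    `publications` is a list of papers; each paper is a list of authors;
    each author is a list of country codes (an author may have several).
    """
    counts = Counter()
    for authors in publications:
        countries_on_paper = set()
        for author_countries in authors:
            countries_on_paper.update(author_countries)
        counts.update(countries_on_paper)  # one count per distinct country
    return counts


# Hypothetical papers for illustration. The first mirrors the footnote's
# example: three authors, two in the United States and one in China.
papers = [
    [["US"], ["US"], ["CN"]],
    [["CN"], ["CN", "AU"]],
]

counts = count_publications_by_country(papers)
print(dict(counts))
print("papers:", len(papers), "country credits:", sum(counts.values()))
```

Because a multi-country paper is credited to every country on it, the credits sum to more than the number of papers, which is why country shares and top-100 country totals in this chapter can exceed 100% or 100.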
[Figure: AI publications in CS (% of total) by select geographic areas, 2013–24. Y-axis: AI publications in CS (% of total). Series: China, rest of the world, Europe, India, United States, unknown.]
Source: AI Index, 2026 | Chart: 2026 AI Index report
[Figure: AI publications in CS (% of total) by region, 2013–24. Y-axis: AI publications in CS (% of total). Series: East Asia and Pacific, Europe and Central Asia, South Asia, North America, Middle East and North Africa, Latin America and the Caribbean, Sub-Saharan Africa, unknown.]
Source: AI Index, 2026 | Chart: 2026 AI Index report
By Sector
Academia produced the majority of AI publications in 2024 (%), followed by government institutions
(%), industry (%), and nonprofit organizations (%) (Figure ). The sector breakdown varies
by region (Figure ). In the United States, a higher share of AI publications came from industry (%)
compared to China (18%), where government institutions were more meaningful contributors (%). Europe
had the highest percentage of AI publications originating from academia (%).
[Figure: AI publications in CS (% of total) by sector, 2013–24. Y-axis: AI publications in CS (% of total). Series: academia, government, industry, nonprofit, other.]
Source: AI Index, 2026 | Chart: 2026 AI Index report
31 For Figure and Figure , publications with unknown affiliations were excluded.
[Figure: AI publications in CS (% of total) by sector and geographic area, 2024. X-axis: AI publications (% of total). Sectors: academia, industry, nonprofit, government. Areas: United States, Europe, China.]
Source: AI Index, 2026 | Chart: 2026 AI Index report
By Topic
AI research in 2024 remained concentrated in a small set of core topics, though the breadth of areas
continued to expand. Similar to the previous year, the most prevalent research topic was machine learning
(37%), followed by computer vision (%), pattern recognition (%), and natural language processing
(10%) (Figure ). Publications on generative AI continued to show sharp growth, extending the trend from
previous years. It is also worth noting that the AI Index topic classifier can assign multiple topic labels to a
single publication, so topic totals can be seen as overlapping categories rather than mutually exclusive.
32 The AI Index categorized papers using its own topic classifier. It is possible for a single publication to be assigned multiple topic labels.
[Figure: Number of AI publications by select top topics, 2013–24. Y-axis: number of AI publications (in thousands). Series: machine learning, computer vision, pattern recognition, natural language processing, generative AI, knowledge based systems, evolutionary computation, multi-agent systems, logic and reasoning, robotics.]
Source: AI Index, 2026 | Chart: 2026 AI Index report
Top 100 Publications
The AI Index identified the 100 most-cited AI publications from 2021 to 2024 using citation data from
Due to citation lag, this set can shift as citations accumulate over The publication volume
data above captures the scale of research activity, while the top 100 offers a more selective view on which
work is gaining the most recognition and influence.
33 The full methodological guide can be accessed in the Appendix, along with the list of the top 100 articles.
34 A publication can have multiple authors from different countries or organizations. If a paper includes authors from multiple countries, each
country is credited once. As a result, some of the totals in this section exceed 100.
By National Affiliation
The geographic distribution of the top 100 has shifted over time (Figure ). The United States still ranks
highest in top-cited publications each year, though its count has gradually declined from 64 in 2021 to 46
in 2024. China’s count increased to 41 in 2024, from 34 in 2023, and Australia’s rose to 14 highly cited
publications, up from 2 in 2023 and 6 in 2021.
46
41
15
14
9
8
7
6
5
4
50
34
7
7
6
5
4
2
2
2
58
34
7
6
6
5
4
3
3
2
64
33