MidnightAI.org

An academic research initiative tracking humanity's progress toward superintelligent AI

Attribution

Inspired by the Bulletin of the Atomic Scientists

AI-Assisted Analysis

How to Cite

MidnightAI.org (2026). AI Progress Tracker: Minutes to Midnight. Retrieved from https://midnightai.org

© 2026 MidnightAI.org. For research and educational purposes only.

Data updated continuously from 47+ sources
Created by Beckham Labs

AI Milestones

A comprehensive timeline of major AI achievements and predictions for future developments in artificial intelligence

Total Milestones: 50 (tracked events)
Achieved: 47 (historical milestones)
Predicted: 3 (future milestones)

Humanoid Robot Mass Production

June 1, 2027
Predicted | High Impact

First mass production of AI-powered humanoid robots for commercial use

Confidence: 40%

Autonomous Research Agent

December 1, 2026
Predicted | High Impact

AI systems capable of conducting independent scientific research

Confidence: 50%

AGI-Level Reasoning

June 1, 2026
Predicted | High Impact

Models achieve human-expert level on complex multi-step reasoning benchmarks

Confidence: 40%

Berkeley Exposes Critical AI Agent Benchmark Flaws

April 11, 2026
Achieved

Berkeley researchers demonstrate systematic ways to break top AI agent benchmarks, highlighting fundamental evaluation methodology issues.

MolmoWeb: Open Visual Web Agent Framework

April 9, 2026
Achieved

Open-source visual web agent with transparent training data and methodology for autonomous web navigation tasks.

ClawBench: Real-World AI Agent Evaluation Framework

April 9, 2026
Achieved | Anthropic

Anthropic introduces ClawBench, a comprehensive evaluation framework testing AI agents on 153 everyday online tasks across 144 live platforms.

Act Wisely: Meta-Cognitive Tool Use Framework

April 9, 2026
Achieved

Research breakthrough addressing agents' meta-cognitive deficits in arbitrating between internal knowledge and external tool usage.

Meta Announces Muse Spark Personal Superintelligence

April 8, 2026
Achieved | Meta AI

Meta introduces Muse Spark, positioning it as a step toward personal superintelligence capabilities for individual users.

MegaTrain Enables 100B+ Parameter Training on Single GPU

April 8, 2026
Achieved

Research breakthrough allows full-precision training of 100+ billion parameter language models on a single GPU, dramatically reducing training costs.

Claude Mythos Preview for Cybersecurity Released

April 7, 2026
Achieved | Anthropic

Anthropic releases a specialized Claude model variant focused on advanced cybersecurity capabilities, with detailed system card documentation.

Google Gemma-4 Multimodal Model Series Released

April 2, 2026
Achieved | Google DeepMind

Google releases Gemma-4 series with any-to-any and image-text-to-text capabilities across multiple parameter sizes (4B-31B).

Claude Demonstrates Full OS Kernel Exploit Generation

April 1, 2026
Achieved | Anthropic

Claude successfully wrote a complete FreeBSD remote kernel RCE exploit with root shell, demonstrating advanced cybersecurity capabilities.

Former Qwen Lead's Agentic Thinking Manifesto

March 26, 2026
Achieved | Alibaba (Qwen)

Original Alibaba Qwen technical lead publishes influential essay on transitioning from reasoning to agentic thinking paradigms.

ARC-AGI-3 Benchmark Released

March 25, 2026
Achieved

New benchmark designed to measure artificial general intelligence through novel reasoning tasks, addressing limitations of previous AI evaluation methods.

VTAM: Video-Tactile-Action Models for Robotics

March 24, 2026
Achieved

First multimodal framework combining video, tactile sensing, and action prediction for contact-rich physical interactions.

SpecEyes: Speculative Acceleration for Agentic AI

March 24, 2026
Achieved | OpenAI

OpenAI introduces framework to accelerate multimodal agent reasoning through speculative perception and planning.

GPT-5.4 Pro Solves Frontier Math Open Problem

March 24, 2026
Achieved | OpenAI

First AI system confirmed to solve an open mathematical research problem, marking a breakthrough in AI mathematical reasoning capabilities.

iPhone 17 Pro Runs 400B Parameter Model

March 23, 2026
Achieved

First demonstration of a 400 billion parameter language model running natively on a mobile device, showcasing dramatic advances in on-device AI.

Reasoning Circuits Discovery in Transformers

March 18, 2026
Achieved

Researchers discover discrete 3-4 layer 'reasoning circuits' in transformers that can be duplicated to dramatically improve logical deduction performance without training.

Online Experiential Learning Framework

March 17, 2026
Achieved

Research introduces framework enabling language models to continuously improve from real-world deployment experience rather than offline training only.

Nvidia Launches Vera CPU for Agentic AI

March 16, 2026
Achieved

Nvidia introduces purpose-built CPU architecture specifically designed for agentic AI workloads, marking hardware specialization for autonomous agents.

Morgan Stanley Predicts Major AI Breakthrough in H1 2026

March 14, 2026
Achieved

Investment bank warns of imminent AI breakthrough driven by rapid computing expansion that could strain power grids and disrupt jobs globally.

John Carmack Challenges AGI Timeline Predictions

March 14, 2026
Achieved

Legendary programmer John Carmack publicly disputes the aggressive AGI timelines of OpenAI and other labs, stating 'We Are Not on the Brink of AGI', a position with significant implications for industry investment.

Claude Opus/Sonnet 4.6 Achieves 1M Context Window

March 13, 2026
Achieved | Anthropic

Anthropic's Claude models now support 1 million token context windows in general availability, enabling processing of extremely long documents.

Understudy: Teach-by-Demonstration Desktop Agent

March 12, 2026
Achieved

First desktop agent that learns tasks from single demonstrations across GUI apps, browsers, terminals, and messaging tools in unified sessions.

Nvidia Invests $26B in Open-Source AI Development

March 12, 2026
Achieved

Nvidia announces major strategic shift with $26 billion investment in open-source AI models over five years, competing directly with OpenAI and other closed-source providers.

Feature Correlations Shape Neural Superposition

March 10, 2026
Achieved

Research reveals how data correlations determine feature geometry in neural networks, extending beyond sparse uncorrelated settings.

RunAnywhere MetalRT Optimizes Apple Silicon Inference

March 10, 2026
Achieved

Open-source inference engine achieves faster performance than llama.cpp, MLX, and Ollama on Apple Silicon using custom Metal shaders.

Reasoning Unlocks Hidden Parametric Knowledge

March 10, 2026
Achieved

Research demonstrates that chain-of-thought reasoning substantially expands LLMs' ability to recall factual knowledge from parameters.

Neural Debugger for Python Code Execution

March 10, 2026
Achieved

LLM trained on Python execution traces can predict line-by-line execution and function as a neural interpreter with debugging capabilities.

Age Verification AI Systems Achieve Sub-2 Year Accuracy

March 9, 2026
Achieved

AI-powered age verification systems now achieve 1-2 year accuracy in determining user ages, enabling widespread implementation of child safety laws across multiple jurisdictions.

Evo 2: AI Models Genetic Code Across All Life

March 7, 2026
Achieved

DNA foundation model trained on 100,000+ species can identify genetic patterns across entire tree of life, published in Nature.

SageBwd: Trainable Low-bit Attention for Training

March 2, 2026
Achieved

First trainable INT8 attention system that quantizes six of seven attention operations while preserving training performance.

Frontier Models Demonstrate Low-Probability Actions

March 2, 2026
Achieved | OpenAI

Research shows GPT-5, Claude-4.5, and Qwen-3 can execute rare strategic actions while maintaining calibration, raising safety concerns.

Chinese AI Token Usage Surpasses US for First Time

February 28, 2026
Achieved

China's AI model usage reached 4.12 trillion tokens in one week, versus 2.94 trillion tokens for the US, marking a historic shift.

DeepRare: First Traceable AI Rare Disease Diagnosis

February 28, 2026
Achieved

Shanghai hospital launches world's first traceable AI agent system for rare disease diagnosis, published in Nature.

Pentagon Designates Anthropic Supply-Chain Risk

February 27, 2026
Achieved | Anthropic

Department of Defense designates Anthropic as a supply-chain risk amid a clash over military AI partnerships, marking an escalation in AI governance conflicts.

OpenAI Raises $110B at $730B Valuation

February 27, 2026
Achieved | OpenAI

OpenAI secures record-breaking $110B funding round with major investors including SoftBank, Nvidia, and Amazon, highlighting massive AI investment scale.

Anthropic Drops Flagship Safety Pledge

February 25, 2026
Achieved | Anthropic

Anthropic abandons a major safety commitment, marking a significant shift in AI safety policy approach from one of the leading safety-focused AI companies.

Chinese AI Labs Reverse-Engineer Claude Models

February 24, 2026
Achieved | Anthropic

Anthropic alleges 16 Chinese AI entities systematically distilled Claude through API harvesting, raising IP protection concerns.

Aletheia Solves 6/10 FirstProof Challenge Problems

February 24, 2026
Achieved | Google DeepMind

Google's Aletheia agent powered by Gemini 3 Deep Think autonomously solved 6 out of 10 problems in the inaugural FirstProof mathematics challenge, demonstrating advanced mathematical reasoning capabilities.

NVMe-to-GPU Direct Loading for Large Models

February 21, 2026
Achieved

Novel architecture enables running Llama 3.1 70B on single RTX 3090 by bypassing CPU/RAM bottlenecks.

ASTERIS: Self-Supervised Astronomical Imaging

February 21, 2026
Achieved

Tsinghua team develops AI model that extends James Webb Space Telescope detection depth by 1 magnitude, discovering 3x more distant galaxies.

GLM-5: Agentic Engineering Foundation Model

February 17, 2026
Achieved

GLM-5 introduces a paradigm shift from vibe coding to agentic engineering with new DSA architecture and asynchronous RL infrastructure.

Claude Sonnet 4.6 Release

February 17, 2026
Achieved | Anthropic

Anthropic releases Claude Sonnet 4.6, its next-generation flagship language model with enhanced capabilities.

GPT-5.2 Derives New Result in Theoretical Physics

February 13, 2026
Achieved | OpenAI

GPT-5.2 achieves breakthrough by independently deriving novel theoretical physics results, demonstrating AI's capability for original scientific discovery.

GPT-5.3-Codex-Spark Specialized Coding Model

February 12, 2026
Achieved | OpenAI

OpenAI releases GPT-5.3-Codex-Spark, a specialized model for advanced code generation and programming tasks.

Gemini 3 Deep Think Release

February 12, 2026
Achieved | Google DeepMind

Google releases Gemini 3 Deep Think, advancing reasoning capabilities in multimodal AI systems.

Anthropic Raises $30B Series G at $380B Valuation

February 12, 2026
Achieved | Anthropic

Anthropic closes a massive funding round, establishing it as one of the most valuable AI companies globally.

GPT-5 Outperforms Federal Judges in Legal Reasoning

February 11, 2026
Achieved | OpenAI

GPT-5 demonstrates superior performance to human federal judges in legal reasoning tasks, marking a significant breakthrough in AI's ability to handle complex legal analysis.

How We Track Milestones

Milestones are identified through analysis of research publications, product announcements, and expert assessments. Predictions are based on current progress trajectories and capability assessments.
