What is the Unified Power Score?

The Unified Power Score (max 200) ranks AI models using the formula: Intelligence × (1 + Value/100). Intelligence is the unweighted average of 11 frontier benchmarks (max 100). Value is a log-normalized cost efficiency score (max 100).

What benchmarks are used to measure AI intelligence?

The Intelligence score averages 11 frontier benchmarks: AIME 2025 (Math Olympiad), HMMT 2025 (Math Tournament), GPQA Diamond (PhD Science), ARC-AGI (Reasoning), BrowseComp (Web Agents), ARC-AGI v2 (Advanced Reasoning), HLE (Humanity's Last Exam), MMLU-Pro (Knowledge), LiveCodeBench (Coding), SWE-Bench Verified (Software Engineering), and CodeForces (Competitive Programming).

How is the Value Index calculated?

The Value Index is a log-normalized efficiency score based on a model's blended cost per 1M tokens. The formula uses $0.25 per 1M tokens as the floor (best value = 100) and $60.00 per 1M tokens as the ceiling (worst value = 0).
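The page gives the two endpoints but not the exact formula, so the following is only a minimal sketch. It assumes straight-line interpolation in log space between the floor and the ceiling, with costs outside that range clamped to the nearest endpoint. The function name value_index and both of those choices are assumptions, not the site's confirmed implementation. How the blended cost itself is computed is not stated here, so the sketch takes it as an input.

```python
import math

FLOOR_USD = 0.25    # blended $ per 1M tokens that maps to Value = 100 (from the text)
CEILING_USD = 60.0  # blended $ per 1M tokens that maps to Value = 0 (from the text)


def value_index(blended_cost_usd: float) -> float:
    """Assumed form: linear in log(cost), clamped to the range [0, 100]."""
    if blended_cost_usd <= 0:
        raise ValueError("blended cost must be positive")
    # Assumption: costs below the floor or above the ceiling are clamped.
    cost = min(max(blended_cost_usd, FLOOR_USD), CEILING_USD)
    span = math.log(CEILING_USD) - math.log(FLOOR_USD)
    return 100.0 * (math.log(CEILING_USD) - math.log(cost)) / span
```

Under these assumptions a model priced at the geometric mean of the two endpoints (about $3.87 per 1M tokens) lands at a Value of 50, which is the practical effect of normalizing on a log scale rather than a linear one.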

How is the National Score calculated?

A nation's National Score is the sum of the Unified Power Scores of its models that appear in the Global Top 10. Models ranked outside the Top 10 contribute nothing to their nation's score.

US vs CHINA AI | About - Methodology & Scoring Explained

What this site is

A leaderboard that ranks the world's best AI models by both intelligence and cost efficiency—then shows you which country is winning.

Use it to compare models side by side, find the best value for your use case, and see how the US-China AI race is unfolding in real time.

Why we built it

AI is a race, one that increasingly looks like a two-way contest between the United States and China. In a fast-moving landscape, debates tend to collapse into benchmark cherry-picking or price-only arguments. We wanted a simple, repeatable way to answer two questions: “How strong is a model?” and “How much leverage do you get per dollar?”

This site turns that race into something you can track: who has the most capable models, who has the best economics, and who can place more models into the global Top 10.

🇺🇸 USA

Frontier capability, research leadership, and premium performance.

🇨🇳 China

Scaling, iteration speed, and cost-efficient deployment at volume.

The dashboard stays explicit about its assumptions. If you disagree with them, you can still use the raw IQ and Value components directly.

Scoring model

Unified Power Score (Max 200)

We use Intelligence-Gated Value: intelligence is the foundation, and value is a multiplier.

Unified = IQ × (1 + Value / 100)

A model with zero intelligence scores zero—no matter how cheap.
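As a quick illustration of the formula above, here is a minimal sketch. The function name unified_power_score is hypothetical, and the range check assumes both inputs are on the 0 to 100 scales described on this page.

```python
def unified_power_score(iq: float, value: float) -> float:
    """Unified = IQ x (1 + Value / 100), with IQ and Value each in [0, 100]."""
    if not (0 <= iq <= 100 and 0 <= value <= 100):
        raise ValueError("iq and value are expected to lie in [0, 100]")
    return iq * (1 + value / 100)


# Illustrative inputs only, not real model data.
print(unified_power_score(80, 50))    # 120.0
print(unified_power_score(0, 100))    # 0.0: zero intelligence scores zero
print(unified_power_score(100, 100))  # 200.0: the stated maximum
```

Because Value enters only through the multiplier (1 + Value / 100), it can at most double the IQ score and can never lift a model above one with more than twice its IQ.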

National score (USA vs China)

National totals are computed from the Global Top 10 only (by Unified score). This rewards both peak performance and depth of top-tier presence.

Eligibility rule

Only models that appear in the global Top 10 contribute to a country’s national total.
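The eligibility rule can be sketched as follows. The (name, country, unified_score) tuple layout and the function name national_scores are hypothetical, and the page does not say how ties at the tenth place are broken, so this sketch simply keeps the input order for equal scores.

```python
from collections import defaultdict


def national_scores(models, top_n=10):
    """Sum Unified scores per country over the global top_n models only.

    models: iterable of (name, country, unified_score) tuples (assumed layout).
    """
    # Rank all models globally by Unified score, highest first.
    ranked = sorted(models, key=lambda m: m[2], reverse=True)
    totals = defaultdict(float)
    for _name, country, score in ranked[:top_n]:
        totals[country] += score
    return dict(totals)


# Illustrative data only, not real model scores.
example = [("model_a", "USA", 150.0), ("model_b", "China", 140.0), ("model_c", "USA", 130.0)]
print(national_scores(example, top_n=2))  # {'USA': 150.0, 'China': 140.0}
```

In the example, model_c falls outside the cut-off and adds nothing to the USA total, which is how the rule rewards depth in the top tier rather than the sheer number of models.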

Key assumptions

1) Intelligence is non-negotiable

Cheap output is not “value” if the model cannot solve hard problems.

2) Value is a multiplier, not a substitute

A strong model becomes more powerful when it’s also affordable to deploy at scale.

3) Benchmarks are averaged unweighted

We don’t hand-tune weights. This keeps the IQ index simple and auditable.

4) Pricing is log-normalized

Costs vary by orders of magnitude, so value uses a log scale to keep the cheapest and most expensive models from distorting the range.

5) “All” views are for context

The headline national scores are based on the Top 10 filter, by design.

6) This is a snapshot, not a truth oracle

Benchmarks and pricing shift. Treat rankings as a periodic audit, not a permanent leaderboard.

Benchmarks included

The IQ index is an unweighted average across 11 frontier benchmarks:

AIME 2025, HMMT 2025, GPQA Diamond, ARC-AGI, BrowseComp, ARC-AGI v2, HLE, MMLU-Pro, LiveCodeBench, SWE-Bench Verified, CodeForces
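A minimal sketch of the unweighted average follows. It assumes every benchmark result has already been expressed on a 0 to 100 scale and that a model has a score for all 11 benchmarks. The page does not say how missing results are handled, so this sketch raises an error instead of guessing. The function name iq_index is hypothetical.

```python
BENCHMARKS = [
    "AIME 2025", "HMMT 2025", "GPQA Diamond", "ARC-AGI", "BrowseComp",
    "ARC-AGI v2", "HLE", "MMLU-Pro", "LiveCodeBench", "SWE-Bench Verified",
    "CodeForces",
]


def iq_index(scores: dict) -> float:
    """Unweighted mean over the 11 benchmarks; each score assumed in [0, 100]."""
    missing = [name for name in BENCHMARKS if name not in scores]
    if missing:
        # Assumption: no imputation; a complete set of scores is required.
        raise ValueError(f"missing benchmark scores: {missing}")
    return sum(scores[name] for name in BENCHMARKS) / len(BENCHMARKS)
```

Each benchmark carries the same weight of 1/11, so a single weak result pulls the index down by the same amount regardless of which benchmark it comes from.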

Data notes

Scores and pricing are sourced from each model's linked reference pages. Vendor pricing can change quickly, which is why the dashboard is presented as a point-in-time audit.

Not affiliated with any model provider. No guarantees are made about correctness, completeness, or fitness for any purpose.