The AIght Score

The one number on this site that matters.

Every tool in the archive carries a single 0–100 score. It’s the average of five axes I rate by hand, after spending real time with the tool. No algorithm. No vendor input. No paid placements.

The score isn’t generated. There’s no API call behind it. The five axes have stayed the same since the archive started: utility, privacy, speed, cost, transparency. I write a number on each one after I’ve used the tool enough to mean it.

Some tools sit in the archive for weeks before they get a score. That’s on purpose. A number this small needs to carry weight, so I’d rather mark a tool as “scoring in review” than publish a guess.

When a tool changes — new pricing, new privacy posture, a model swap — the score moves with it. The last-updated date on each tool page is the last day I sat with it.

The five axes

Each 0–100. The published score is the mean.
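
The averaging is the only mechanical part of the score. Here is a minimal Python sketch of that arithmetic, assuming each axis is rated as a whole number and the mean is rounded to the nearest whole number (the rounding rule is my assumption; it isn't stated here). The names `AXES` and `aight_score` are illustrative, not something the site runs.

```python
# Hypothetical helper: averages five hand-written axis ratings.
# The ratings themselves still come from a person, not from code.
AXES = ("utility", "privacy", "speed", "cost", "transparency")


def aight_score(ratings: dict[str, int]) -> int:
    """Return the mean of the five axis ratings, rounded to a whole number."""
    missing = [axis for axis in AXES if axis not in ratings]
    if missing:
        raise ValueError(f"missing axes: {missing}")
    for axis in AXES:
        if not 0 <= ratings[axis] <= 100:
            raise ValueError(f"{axis} must be between 0 and 100")
    # Assumption: round to the nearest whole number for display.
    return round(sum(ratings[axis] for axis in AXES) / len(AXES))


example = {
    "utility": 90,
    "privacy": 70,
    "speed": 85,
    "cost": 60,
    "transparency": 80,
}
print(aight_score(example))  # 385 / 5 = 77
```

Because every axis carries equal weight, a tool that is excellent on four axes and poor on one still loses real points on the headline number.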

Utility

Does it actually do the thing it claims, well enough that you'd choose it over the alternative?

What it’s asking

Real, repeatable usefulness on the job it's marketed for. Demos that survive contact with messy inputs.

Red flags

Polished landing page, broken in week-two use
Works on the demo prompt, fails on yours
Requires a specific prompt incantation to behave

Privacy

What happens to your data once it's in the box?

What it’s asking

A clear policy on training, retention, residency, and deletion. EU data-residency options when relevant. Self-hosting is a bonus.

Red flags

Defaults to training on your prompts unless you opt out
Buried retention terms or none at all
Vague "we may share with partners" clauses

Speed

How long do you actually wait for a useful output, in a realistic session?

What it’s asking

Time-to-first-useful-result. Latency at common context lengths. Streaming where it matters.

Red flags

Headline benchmarks measured on toy prompts
Hidden queue times under load
"Fast" tier locked behind enterprise pricing

Cost

What does it really cost in a normal month, including the things they don't put on the pricing page?

What it’s asking

Honest monthly spend for a representative workload. Overages, throughput caps, paywall cliffs disclosed.

Red flags

Free tier that resets daily, not monthly
Token pricing with no visible usage meter
Surprise per-seat minimums on the upgrade path

Transparency

How honest is the team about what the tool can and can't do?

What it’s asking

Public changelogs. Acknowledged failure modes. Real model names, not marketing names. Open weights or open source where claimed.

Red flags

Model identity hidden behind a custom brand name
Quiet feature removals
Marketing speed numbers that don't match the docs

What the bands mean

Plain English for the headline number.

90–100: Reach for it.

I keep coming back to this without thinking about it. The reason to use it outweighs the cost of switching to anything else.

75–89: Worth trying.

Strong on most axes, with a real weakness on one. Worth your time if that weakness doesn't hit your specific case.

60–74: Situational.

A serious answer for a narrow problem. Wrong tool for most people; right tool for some.

45–59: Compromise.

There's something fundamentally awkward about it. Use it because the alternative is worse, not because it's good.

0–44: Skip it.

Listed in the archive only so I can explain why I don't recommend it.
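
The band boundaries above amount to a simple lookup. Here is a rough Python sketch, assuming the headline score is a whole number from 0 to 100; `BANDS` and `band_for` are hypothetical names used only for illustration.

```python
# Hypothetical lookup from a headline score to its band label.
# Each entry is (lowest score in the band, label), highest band first.
BANDS = [
    (90, "Reach for it"),
    (75, "Worth trying"),
    (60, "Situational"),
    (45, "Compromise"),
    (0, "Skip it"),
]


def band_for(score: int) -> str:
    """Return the band label for a whole-number score between 0 and 100."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for floor, label in BANDS:
        if score >= floor:
            return label
    raise AssertionError("unreachable: the last band starts at 0")


print(band_for(77))  # Worth trying
print(band_for(44))  # Skip it
```

The label is a reading aid for the number, not a separate judgment: one point either side of a boundary changes the words, not the tool.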

The score is an opinion, slowly formed. If you disagree, that’s the point — disagree with a person, not a vendor’s landing page.
