
The AIght Score

The one number on this site that matters.

Every tool in the archive carries a single 0–100 score. It’s the average of five axes I rate by hand, after spending real time with the tool. No algorithm. No vendor input. No paid placements.

The score isn’t generated. There’s no API call behind it. I use the same five axes the archive started with (utility, privacy, speed, cost, transparency), and I write a number on each one after I’ve used the tool enough to mean it.

Some tools sit in the archive for weeks before they get a score. That’s on purpose. A number this small needs to carry weight, so I’d rather show “scoring in review” than publish a guess.

When a tool changes — new pricing, new privacy posture, a model swap — the score moves with it. The last-updated date on each tool page is the last day I sat with it.

The five axes

Each 0–100. The published score is the mean.
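
As a rough illustration of the arithmetic only (the scores themselves are written by hand), here is a minimal sketch of the mean. The function name, the example ratings, and the rounding to a whole number are assumptions; the page says only that the published score is the mean of the five axes.

```python
# The five axes named on this page.
AXES = ("utility", "privacy", "speed", "cost", "transparency")


def published_score(ratings: dict[str, int]) -> int:
    """Return the mean of the five axis ratings.

    Rounding to the nearest whole number is an assumption made for this
    sketch; the page does not say how fractional means are handled.
    """
    for axis in AXES:
        if axis not in ratings:
            raise ValueError(f"missing rating for axis: {axis}")
        if not 0 <= ratings[axis] <= 100:
            raise ValueError(f"{axis} must be between 0 and 100")
    return round(sum(ratings[axis] for axis in AXES) / len(AXES))


# Hypothetical ratings, not taken from any tool in the archive.
example = {
    "utility": 82,
    "privacy": 70,
    "speed": 90,
    "cost": 65,
    "transparency": 75,
}
print(published_score(example))  # (82 + 70 + 90 + 65 + 75) / 5 = 76.4, rounds to 76
```

Every axis carries equal weight here, which matches the plain mean described above.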

01

Utility

Does it actually do the thing it claims, well enough that you'd choose it over the alternative?

What it’s asking

Real, repeatable usefulness on the job it's marketed for. Demos that survive contact with messy inputs.

Red flags

  • Polished landing page, broken in week-two use
  • Works on the demo prompt, fails on yours
  • Requires a specific prompt incantation to behave
02

Privacy

What happens to your data once it's in the box?

What it’s asking

Clear policy on training, retention, residency, deletion. EU options when relevant. Self-host as a bonus.

Red flags

  • Defaults to training on your prompts unless you opt out
  • Buried retention terms or none at all
  • Vague "we may share with partners" clauses
03

Speed

How long do you actually wait for a useful output, in a realistic session?

What it’s asking

Time-to-first-useful-result. Latency at common context lengths. Streaming where it matters.

Red flags

  • Headline benchmarks measured on toy prompts
  • Hidden queue times under load
  • "Fast" tier locked behind enterprise pricing
04

Cost

What does it really cost in a normal month, including the things they don't put on the pricing page?

What it’s asking

Honest monthly spend for a representative workload. Overages, throughput caps, paywall cliffs disclosed.

Red flags

  • Free tier that resets daily, not monthly
  • Token pricing with no visible usage meter
  • Surprise per-seat minimums on the upgrade path
05

Transparency

How honest is the team about what the tool can and can't do?

What it’s asking

Public changelogs. Acknowledged failure modes. Real model names, not marketing names. Open weights or open source where claimed.

Red flags

  • Model identity hidden behind a custom brand name
  • Quiet feature removals
  • Marketing speed numbers that don't match the docs

What the bands mean

Plain English for the headline number.

90–100 Reach for it.

I keep coming back to this without thinking about it. The reason to use it outweighs the cost of switching to anything else.

75–89 Worth trying.

Strong on most axes, real weaknesses on one. Worth your time if the weaknesses don't hit your specific case.

60–74 Situational.

A serious answer for a narrow problem. Wrong tool for most people; right tool for some.

45–59 Compromise.

There's something fundamentally awkward about it. Use it because the alternative is worse, not because it's good.

0–44 Skip it.

Listed in the archive only so I can explain why I don't recommend it.
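
The band boundaries above amount to a simple lookup. This is a minimal sketch, assuming the headline score is a whole number from 0 to 100; the names `BANDS` and `band_for` are hypothetical and only restate the ranges listed here.

```python
# Lower bound of each band and its label, highest band first.
BANDS = [
    (90, "Reach for it."),
    (75, "Worth trying."),
    (60, "Situational."),
    (45, "Compromise."),
    (0, "Skip it."),
]


def band_for(score: int) -> str:
    """Return the band label for a whole-number score between 0 and 100."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for floor, label in BANDS:
        if score >= floor:
            return label
    raise AssertionError("unreachable: the lowest band starts at 0")


print(band_for(76))  # Worth trying.
print(band_for(44))  # Skip it.
```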

The score is an opinion, slowly formed. If you disagree, that’s the point — disagree with a person, not a vendor’s landing page.
