Model Cards — AIght

A model card is the document a lab publishes alongside a model release. The format was proposed in 2018 (Mitchell et al.) and has become standard practice — though the rigour varies wildly.

Intended use

General-purpose assistant for text, vision, and audio across consumer and API uses.

Training data

Web text, books, code, and multimodal data up to April 2023; exact sources undisclosed.

Known limitations

· May generate plausible but factually incorrect information (hallucination).
· Uneven performance on low-resource languages and specialised domains.
· Cannot reliably perform real-time reasoning on current events.

Bias evaluation results

Gender: 73%
Political: 61%
Profession: 68%

Higher = lower bias on internal eval set. Methodology varies by lab.

Last updated: 2024-05-13

A model card is the cover letter a lab writes for its model. The interesting parts are usually buried.

What a good card includes

Identity. Architecture, parameter count, training data description, training compute.
Intended use. Where the model is meant to be deployed.
Limitations. Failure modes the lab has documented.
Evaluations. Benchmark results — usually selective.
Ethical considerations. Bias evaluations, safety testing.
Training data. Sources, filtering criteria, known issues.
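To make the checklist above concrete, here is a minimal sketch of how it could be turned into a completeness check. The `ModelCard` class, its field names, and `missing_sections` are hypothetical names chosen for illustration; they are not part of any standard model card format.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass
class ModelCard:
    """Hypothetical container mirroring the six sections listed above."""
    identity: str = ""
    intended_use: str = ""
    limitations: str = ""
    evaluations: str = ""
    ethical_considerations: str = ""
    training_data: str = ""


def missing_sections(card: ModelCard) -> list[str]:
    """Return the names of sections that are empty or whitespace-only."""
    return [f.name for f in fields(card) if not getattr(card, f.name).strip()]


# Example card loosely based on the sample shown earlier on this page.
card = ModelCard(
    intended_use="General-purpose assistant for text, vision, and audio.",
    limitations="May hallucinate; uneven on low-resource languages.",
    training_data="Web text, books, code, multimodal data; exact sources undisclosed.",
)

print(missing_sections(card))
# ['identity', 'evaluations', 'ethical_considerations']
```

A check like this only says whether a section exists, not whether it is informative; a one-line "Web text and books" would still pass.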

What's usually missing

Full training data. "Web text and books" is the typical disclosure. The specifics — which books, which web sources, what was excluded — are commercially sensitive.
Negative results. Benchmarks the model did poorly on rarely appear.
Behavioural quirks. The model's idiosyncratic voice, refusal patterns, sycophancy tendencies — usually discovered post-launch by users.

How to read one

Look for the bullet points the lab spent the most words on; those are usually where the card's genuine value lies. Then run your own eval.
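"Run your own eval" can start very small. The sketch below scores a model on a handful of your own prompt and answer pairs using exact-match accuracy. It assumes the model is reachable as a plain Python callable; `toy_model` is a hypothetical stand-in for a real API call, and exact match is only one of many possible scoring rules.

```python
from typing import Callable, Iterable, Tuple


def exact_match_accuracy(
    model_fn: Callable[[str], str],
    examples: Iterable[Tuple[str, str]],
) -> float:
    """Fraction of examples where the model output equals the expected
    answer, ignoring case and surrounding whitespace."""
    examples = list(examples)
    if not examples:
        raise ValueError("need at least one example")
    correct = sum(
        model_fn(prompt).strip().lower() == expected.strip().lower()
        for prompt, expected in examples
    )
    return correct / len(examples)


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model call."""
    answers = {"2 + 2 =": "4", "Capital of France?": "Paris"}
    return answers.get(prompt, "I don't know")


# Replace these with examples drawn from your own task.
my_examples = [
    ("2 + 2 =", "4"),
    ("Capital of France?", "paris"),
    ("Capital of Peru?", "Lima"),
]

score = exact_match_accuracy(toy_model, my_examples)
print(f"{score:.2f}")  # 0.67
```

Swapping `toy_model` for a function that calls the model under review lets you compare its score on your task with the benchmark numbers the card reports.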

What to read next

Evals are how you verify the card's claims on your task. Alignment covers the ethical-considerations section.
