Methodology

How we score jobs

Every listing gets an AI-Agency Score (0–100) and two employer-openness flags. This page is that rubric, published in full.

TL;DR

Claude reads every job description, scores two axes (Build + Use), computes a blended AI-Agency Score, and flags whether the employer affirmatively welcomes AI-assisted workflows on the job and AI tools in interviews. The split mirrors the dichotomy in the Lightcast Open Skills taxonomy between "AI engineering" (builds AI) and "AI literacy" (uses AI).

The two axes

AI-Build Score

Does this role BUILD AI systems?

High scores for ML research, LLM infra, training/inference engineering, foundation model work, and agentic architecture. Low for sales and ops roles that barely touch AI.

AI-Use Score

Does this role USE AI tools day-to-day?

High scores for roles explicitly centered on Claude / Cursor / Copilot / agentic workflows. Low for legacy / process-heavy roles with minimal AI exposure.

Blending formula

ai_agency_score = round(0.6 × ai_build_score + 0.4 × ai_use_score)

Build is weighted more heavily because in 2026 "using AI tools" is becoming universal, which makes it a weaker differentiator; "building AI" is still the rarer, more distinctive skill. That said, both matter, so we don't zero out Use. The 60/40 split lets a high-Use / low-Build candidate still land in AI-fluent (e.g. a growth marketer who lives in Claude + n8n scoring ~65).
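As a minimal sketch in code (variable names mirror the formula above; the example inputs are illustrative, not real listing scores):

```python
def ai_agency_score(ai_build_score: int, ai_use_score: int) -> int:
    """Blend the two axes: Build carries 60% of the weight, Use 40%."""
    return round(0.6 * ai_build_score + 0.4 * ai_use_score)

# The growth marketer who lives in Claude + n8n: low Build, high Use.
print(ai_agency_score(45, 95))  # -> 65, lands in AI-fluent (50-74)
```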

Build Score rubric (0–100)

Range: what scores here

90–100: ML research · foundation model training · LLM infra engineering · training & inference optimization · core AI engineering
70–89: Applied ML · AI product engineering · RAG / agent infra · ML eval systems · data platform for ML
50–69: Engineering directly contributing to an AI product's core value — LLM serving infra, eval systems, ML training pipelines, observability for LLM products
30–49: ML-adjacent work without direct contribution — data platform feeding ML, auth/billing for an ML product
10–29: Generalist SWE at a tech company (AI or not) · engineering role silent on AI work
0–9: Non-engineering role · no AI-building component

Use Score rubric (0–100)

Range: what scores here

90–100: Role explicitly centers on AI tooling — prompt engineer · AI-powered growth marketer using agents daily · AI-first product designer · AI DevRel
70–89: Description lists specific AI tools (Claude, Cursor, Copilot, n8n, LangChain, agentic workflows) the hire will use daily
50–69: Description explicitly names AI tools or LLM workflows as part of the candidate's day-to-day work
30–49: Generalist role, silent on AI tools · a "tech-forward vibe" alone does not qualify for 50+
10–29: Legacy / process-heavy role · minimal AI exposure
0–9: No plausible day-to-day AI use

Tier labels

AI-core (75–100), e.g. a score of 85: The role IS about AI. You build it or live inside it.
AI-fluent (50–74), e.g. a score of 62: The role expects AI fluency day-to-day, without being exclusively about AI.

The default public view shows only AI-fluent (50+) and AI-core (75+). Jobs scoring 20–49 ("AI-touching") still exist at /ai-touching or via ?min_agency=20, but aren't promoted — they're tech jobs adjacent to AI, not AI jobs, and the site's promise is the latter.
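A minimal sketch of the tier mapping as a function (the function name and the below-20 fallback label are ours, not the site's):

```python
def tier_label(score: int) -> str:
    """Map a blended AI-Agency Score (0-100) to its public tier."""
    if score >= 75:
        return "AI-core"
    if score >= 50:
        return "AI-fluent"
    if score >= 20:
        return "AI-touching"  # reachable at /ai-touching or ?min_agency=20, not promoted
    return "below-floor"      # assumption: scores under 20 get no tier label

assert tier_label(85) == "AI-core"
assert tier_label(62) == "AI-fluent"
```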

Employer-openness flags (orthogonal to the score)

Research surfaced a critical nuance: on-the-job AI policy and interview AI policy diverge in practice. Anthropic welcomes AI on the job but says "applicants should not use AI assistants". Amazon forbids AI in interviews but tolerates it on the job. A single "AI-friendly" boolean would mislead users, so we extract two flags:

🛠 AI tools welcome at work
On-the-job AI workflows welcomed.
Signals: names specific tools (Cursor, Claude, Copilot), "agent leverage", "AI-assisted development", "ship fast with LLMs".
🤝 AI-OK in interviews
AI tools allowed during interviews.
Signals: "bring your AI tools to the interview", "agentic IDE assessment", "take-home with your usual stack".

Safety belts on scoring

How this relates to industry frameworks

The Build-vs-Use split is not original to us — it mirrors the dichotomy in the Lightcast Open Skills taxonomy (the de facto standard used by LinkedIn, Indeed, and most ATS vendors), which separates AI engineering (builds AI) from AI literacy (uses AI). Our contribution is applying it per listing, with a numeric score, which is novel at the consumer-facing level.

Academic frameworks that informed this design include AIOE (AI Occupational Exposure), SML (Suitability for Machine Learning), and the OECD's occupational AI-exposure measures, all of which score at the occupation or sector level rather than per listing (see "What's novel here" below).

What's novel here

No existing job board produces a per-listing 0–100 AI score. AIOE/SML/OECD all score at the occupation or sector level. Consumer-facing boards (aijobs.net, aijobs.ai, Wellfound) operate as binary "AI bucket" filters without granularity. Our contribution: applying the industry-standard Build/Use dichotomy to individual listings, transparently, using an LLM with a published rubric.

Model & cost

Scoring is done by Claude through a structured extraction pipeline. One merged prompt extracts all structured fields, categorization, both scores, and both employer-openness flags in a single call. Content-hash deduplication ensures unchanged jobs skip the LLM entirely on re-runs.
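A sketch of the content-hash gate, assuming a `score_with_claude` helper that wraps the single merged-prompt call (both the helper and the cache shape are illustrative, not the production code):

```python
import hashlib

def score_with_claude(description: str) -> dict:
    """Stand-in for the single merged-prompt call: fields, both scores, both flags."""
    raise NotImplementedError  # extraction details omitted

_seen: dict[str, dict] = {}  # content hash -> previously extracted result

def score_listing(description: str) -> dict:
    """Byte-identical job text never reaches the LLM twice across re-runs."""
    key = hashlib.sha256(description.encode("utf-8")).hexdigest()
    if key not in _seen:
        _seen[key] = score_with_claude(description)
    return _seen[key]
```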

If you disagree with a score

Email hello@ai-jobs.careers with the job URL and your reasoning. We re-score on a rolling basis and refine the rubric as the market evolves.

Browse scored jobs →