ScenarioLab

Simulations grounded in real-world incidents

Can your candidates spot what AI gets wrong?

ScenarioLab gives security hiring teams incident simulations — candidates review a deliberately flawed AI-generated report set against a named real-world attack (SolarWinds, Log4Shell, Colonial Pipeline and more), under time pressure. Scoring is fully deterministic. No LLMs in the loop.

See how it works →

Free trial · no credit card required

ai-pentest-report.txt

AI Security Analysis — Network Scan

CRITICAL · SQL injection on /api/users

CVSS 9.8 · Apply input sanitisation immediately.

MEDIUM · TLS 1.0 downgrade risk

Recommendation: disable TLS 1.2 and below.

LOW · CVE-2021-44228 reference...

Candidate must identify

✕ CVSS score inconsistent with described impact

✕ Remediation contradicts current TLS best practice

✕ CVE reference misattributed to wrong vendor

The hiring gap

AI writes the reports now. Who’s reading them critically?

AI output looks authoritative

AI-generated security reports are increasingly polished — and increasingly wrong. Candidates who can't distinguish confident AI prose from sound analysis are a liability.

Interviews don't surface this gap

Technical screens test memorised CVEs and syntax. Scenario-based assessments reveal whether candidates actually reason through an analysis or just pattern-match.

Standard rubrics are subjective

Most soft-skill assessments rely on interviewer opinion. ScenarioLab scores against a fixed key — same standard for every candidate, every time.
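As a rough illustration of what scoring against a fixed key can look like, here is a minimal Python sketch. The flaw IDs, point weights, and the `score` function are assumptions made up for this example, loosely modelled on the three flaws in the sample report above; they are not ScenarioLab's actual rubric or code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flaw:
    """One planted flaw in the AI report (hypothetical structure)."""
    flaw_id: str
    points: int


# Hypothetical answer key: the IDs and weights are assumptions for illustration.
ANSWER_KEY = (
    Flaw("cvss_inconsistent", 40),
    Flaw("tls_remediation_wrong", 35),
    Flaw("cve_misattributed", 25),
)


def score(flagged: frozenset[str], key: tuple[Flaw, ...] = ANSWER_KEY) -> int:
    """Sum the points for every keyed flaw the candidate flagged.

    The function uses no randomness, clock, or network access, so the
    same inputs always produce the same score. IDs not in the key are ignored.
    """
    return sum(flaw.points for flaw in key if flaw.flaw_id in flagged)


# A candidate who spots two of the three planted flaws.
print(score(frozenset({"cvss_inconsistent", "cve_misattributed"})))  # prints 65
```

Because the result depends only on the answers and the key, re-running the function on a stored submission reproduces the original score, which is what makes this style of scoring auditable.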

How it works

Three steps from sign-up to insight

Choose an incident

Pick a real-world incident — SolarWinds, Log4Shell, Colonial Pipeline and more. Each is paired with an AI-generated report containing embedded analytical flaws, scored against a fixed rubric.

Invite candidates

Paste a candidate email and we send them a private assessment link. They authenticate with a one-time code — no account creation needed.

Review the results

Each submission produces a detailed evaluation report: per-question scores, reasoning quality, and an overall band. Export as PDF for your records.

Incident library

Grounded in attacks that actually happened

Every scenario is a simulation of a named real-world security incident. Candidates receive background context before they begin — so they assess the AI report knowing what the real attack looked like, not in a vacuum.

SolarWinds Supply Chain Attack

2020

Supply chain · APT29

Nation-state attackers inserted SUNBURST into signed Orion updates, reaching ~18,000 organisations including US federal agencies.

Log4Shell (CVE-2021-44228)

2021

Zero-day · RCE

A critical flaw in Apache Log4j enabled unauthenticated remote code execution by logging a crafted string — exploited globally within hours of disclosure.

Colonial Pipeline Ransomware

2021

Ransomware · OT impact

DarkSide gained access through a single compromised VPN password on an account without multi-factor authentication, triggering a 6-day pipeline shutdown and fuel shortages across the US East Coast.

Uber Data Breach

2022

MFA fatigue · social engineering

An attacker used push-bombing and impersonation to bypass MFA, then found hardcoded admin credentials giving access to Slack, AWS, and internal dashboards.

Lazarus Group — Operation Dream Job

2020

Spearphishing · DPRK

North Korean APT posed as defence recruiters, delivering malware via fake job offer documents targeting aerospace and government contractors.

Generic Security Simulations

Foundational

Phishing · malware · spearphishing

Scenarios covering attack patterns that appear across the full threat landscape — phishing, malware delivery, and spearphishing — not tied to a single named event.

Sample report

This is what you get after every submission

Every completed assessment generates a full evaluation — per-dimension scores, a critical thinking profile, and a hiring recommendation. No interpretation needed.

Alex M.

a7f3...c91e@candidate

Senior SOC Analyst Screening · Phishing Campaign Attribution

MID

Submitted 10 Apr 2026 · Time taken 17m 42s

68 / 100

Proficient

Critical thinking profile

Strongest area

Threat attribution

Weakest area

Remediation reasoning

Summary

Strong attribution instincts; remediation steps lack precision under time pressure.

Evaluation narrative

Alex correctly identified the misattributed threat actor and flagged the CVSS inconsistency on Q2. Remediation responses were directionally correct but lacked the specificity expected at mid level — particularly around lateral movement containment. Overall reasoning is sound; gaps are addressable with structured mentorship.

Recommendation: Proceed to final interview with focus on incident response depth.

Dimension breakdown

Threat attribution 85%

Strong

Evidence evaluation 72%

Strong

Risk prioritisation 60%

Adequate

Remediation reasoning 42%

Weak

Dimension	Score	Max	Band
Threat attribution	17	20	Strong
Evidence evaluation	18	25	Strong
Risk prioritisation	15	25	Adequate
Remediation reasoning	18	30	Weak

Built for security teams

Designed with rigour in mind

No AI in scoring

Every response is evaluated against a deterministic rubric. Scoring is a pure function — reproducible and auditable.

Time-limited by design

20-minute sessions. Realistic pressure without the noise of open-book take-homes.

Structured evaluation

Per-question scoring with overall bands. Clear signal on where a candidate's reasoning breaks down.

Real incident context

Candidates receive background on the real-world attack before they begin — so they assess the report in context, not in a vacuum.

Privacy-first

Candidate emails are hashed after OTP verification. Raw answers are never written to the database.

Beyond analyst roles

Incident Management simulations evaluate decision-makers, not just report reviewers — see below.
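For readers curious about the Privacy-first point, one common way to store an email as a hash is a keyed HMAC over the normalised address. The sketch below is an assumption about how such a step could be written in Python, not a description of ScenarioLab's implementation; `hash_email` and `secret_key` are hypothetical names.

```python
import hashlib
import hmac


def hash_email(email: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 digest of a normalised email address.

    Assumption: the address is trimmed and lower-cased first so the same
    candidate always maps to the same digest. The key stays server-side.
    """
    normalised = email.strip().lower().encode("utf-8")
    return hmac.new(secret_key, normalised, hashlib.sha256).hexdigest()


# Hypothetical usage once the one-time code has been verified.
digest = hash_email(" Alex@Example.com ", secret_key=b"demo-key-not-for-production")
print(len(digest))  # prints 64
```

A keyed hash is used here rather than a bare SHA-256 because email addresses are easy to guess, so an unkeyed digest could be reversed by hashing candidate addresses and comparing.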

New: beyond analyst roles

Evaluate incident managers, not just analysts

A second simulation track for roles that make decisions, not just review reports. Candidates walk a branching incident response grounded in the same real-world attacks — triage an alert, weigh a (sometimes flawed) AI-generated analysis, and choose how to contain, escalate, or investigate further. Every path is scored against the same deterministic rubric. No LLMs in the scoring loop, same as always.

Triage & Prioritization · Containment Decision-Making · Escalation Judgment · Evidence & AI-Analysis Critique · Root-Cause & Technical Reasoning
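To make the idea of deterministic scoring over a branching response concrete, here is a small Python sketch. The scenario graph, choice names, and point values are invented for illustration, and `score_path` is a hypothetical helper rather than ScenarioLab's code.

```python
# Hypothetical branching scenario: node -> {choice: (next node, points)}.
SCENARIO = {
    "triage": {
        "investigate": ("analysis", 10),
        "dismiss": ("end", 0),
    },
    "analysis": {
        "challenge_ai_report": ("containment", 15),
        "accept_ai_report": ("containment", 5),
    },
    "containment": {
        "isolate_host": ("end", 15),
        "escalate": ("end", 10),
    },
}


def score_path(choices: list[str], scenario: dict = SCENARIO, start: str = "triage") -> int:
    """Walk the candidate's choices through the graph and sum the points.

    Raises ValueError if a choice is not offered at the current node or
    is made after the scenario has ended.
    """
    node, total = start, 0
    for choice in choices:
        if node == "end":
            raise ValueError("choice made after the scenario ended")
        options = scenario[node]
        if choice not in options:
            raise ValueError(f"{choice!r} is not available at {node!r}")
        node, points = options[choice]
        total += points
    return total


print(score_path(["investigate", "challenge_ai_report", "isolate_host"]))  # prints 40
```

Every path through the graph maps to exactly one total, so two candidates who make the same decisions always receive the same score.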

Pricing

Free access

Free during early access — pricing will be introduced later.

Early access

All features included while we're in early access — no credit card required.

Free · Early access

✓ All scenarios
✓ All assessments
✓ Unlimited candidates
✓ PDF export

Start assessing in minutes

Free trial, no credit card. Two assessments, three candidates — enough to see whether ScenarioLab fits your hiring process.
