
The Analysis Engine

The analysis engine is the heart of REQQA. It is what turns a requirement or a user story from text you hope is clear into text you have checked is clear. Every analysis run applies the DeFOSPAM technique — REQQA's method for finding faults in requirements and Gherkin stories — through a set of focused analysers, each of which examines the artefact through one analytical lens and reports the problems it finds.

This page explains what the engine does, the families of analysers it runs, how a run is tracked from start to finish, and where the results and the AI usage end up. For the step-by-step task of running an analysis, see How to analyse a requirement.

What DeFOSPAM is

DeFOSPAM is a structured fault-finding technique. Rather than asking an AI a single open question — "is this requirement any good?" — REQQA breaks the question into a series of narrow, named checks. Each check has a step code, a prompt tuned to that one concern, and a defined output shape. Running them in sequence builds up a thorough, repeatable picture of where an artefact is weak.

This decomposition is deliberate. A focused analyser that only looks for ambiguity finds more ambiguity than a generalist asked to find everything at once, and its findings are easier to act on because every issue is already tagged with the lens that found it.

Note: REQQA applies DeFOSPAM and the same analyser codes to its own requirements as well as to yours. The codebase still uses the spelling DEFOSPAM internally (the project's original name), but the technique is unchanged.

Analyser code families

Analysers are grouped by what they analyse. The R- family analyses requirements; the D- family analyses stories (Gherkin/user stories). Each code is a two-part identifier — the family letter, a hyphen, then the analyser letter.
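
As a minimal sketch of that two-part shape, a code can be split into its family letter and analyser letter as follows. The `parse_code` helper and the `AnalyserCode` type are hypothetical names used for illustration only; they are not taken from the REQQA codebase.

```python
from typing import NamedTuple

# Families described on this page: R- for requirements, D- for stories.
KNOWN_FAMILIES = {"R": "requirement", "D": "story"}


class AnalyserCode(NamedTuple):
    family: str    # family letter, e.g. "R"
    analyser: str  # analyser letter, e.g. "D"


def parse_code(code: str) -> AnalyserCode:
    """Split a code such as 'R-D' into (family letter, analyser letter)."""
    family, sep, analyser = code.strip().partition("-")
    valid = (
        sep == "-"
        and family in KNOWN_FAMILIES
        and len(analyser) == 1
        and analyser.isalpha()
    )
    if not valid:
        raise ValueError(f"not a recognised analyser code: {code!r}")
    return AnalyserCode(family, analyser)


print(parse_code("R-D"))  # family "R", analyser "D"
print(parse_code("D-S"))  # family "D", analyser "S"
```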

Requirement analysers (R-*)

These run against a single requirement's text (and, for most checks, the composed context of its mission and parent requirements):

| Code | Name | What it looks for |
|------|------|-------------------|
| R-D | Requirement Definitions | Extracts terminology from the requirement text to build the glossary (the terms that need defining). |
| R-G | Goals, Context and Users | Assesses business rationale, the user roles involved, scope, and traceability: does the requirement say why and for whom? |
| R-F | Requirement Features | Identifies the implementable features implied by the requirement text. |
| R-I | Interfaces | Identifies and assesses every interface the requirement implies or requires. |
| R-R | Rules | Extracts and evaluates business rules, constraints, and calculations. |
| R-E | Entities and Data | Identifies data entities, their attributes, relationships, and lifecycle. |
| R-C | Conditions and Decisions | Maps the conditional logic and assesses whether every branch is covered. |
| R-B | Boundaries | Assesses precision and completeness at edge cases and extreme values. |
| R-Q | Quality Attributes | Identifies non-functional obligations (performance, security, compliance) and assesses them. |
| R-A | Ambiguity | Detects vague language, contradictions, and implicit assumptions. |
| R-M | Missing Elements | A final completeness sweep across all the analytical dimensions above. |

Story analysers (D-*)

These run against a user story and its Gherkin scenarios:

| Code | Name | What it looks for |
|------|------|-------------------|
| D-D | Definitions | Validates the terminology and definitions the story relies on. |
| D-F | Features | Analyses feature identification, granularity, and boundaries. |
| D-O | Outcomes | Analyses outcome completeness, observability, and consistency. |
| D-S | Scenarios | Analyses scenario coverage, boundaries, and missing test cases. |
| D-P | Prediction | Analyses the mapping from scenario to outcome: is the behaviour predictable? |
| D-A | Ambiguity | Detects duplicate scenarios, conflicts, and inconsistencies. |
| D-M | Missing | A comprehensive final check for missing elements. |

Tip: You rarely run a single analyser. A typical requirement run combines several codes, and the engine runs each in turn against the same artefact. R-D,R-F (terminology plus implementable features) is a common starting pair.

A third family for cross-artefact consistency checks (codes beginning C-) is reserved in the engine but not yet active.

Shared rules every analyser obeys

All analysers inherit a common set of classification rules, so findings stay consistent across the families. Every issue is classified into one of three categories:

  • defect — a genuine fault: ambiguity, contradiction, missing precision, an undefined term, an incomplete rule — anything that would cause trouble during implementation or testing.
  • suggestion — a recommendation for non-text content (a diagram, a wireframe, an API schema) or a process improvement. The requirement is not faulty for lacking these.
  • scope_query — a question about whether a piece of detail belongs in this document at all, or should live elsewhere (an interface spec, an NFR document, a design document).

This taxonomy keeps the analyser honest: it stops the engine from flagging "no diagram here" as a defect, and it keeps implementation-level detail (database schemas, deployment topology) out of functional requirements by routing it to a scope_query instead. Terms that need defining are always referred to the glossary rather than defined inline.

How a run is tracked

When you start an analysis, REQQA creates a row in the analyses table and runs it as a background job so the interface stays responsive. The run record carries everything you need to follow its progress and read its result:

  • stepCode — the codes that were requested, stored comma-separated in a single field. A run of definitions plus features is stored as the literal string R-D,R-F, not as separate rows. (This is a frequent source of confusion: there is one analyses row per run, not one per step.)
  • status — the lifecycle state of the run (for example Active, Partial, Superseded, Failed).
  • progress_percent, current_step, steps_completed, total_steps — live progress fields the engine updates as it works through the codes, so the job queue can show how far a run has got.
  • readiness — the engine's overall verdict on the artefact, one of READY, NEEDS_MINOR_REVISION, or NEEDS_MAJOR_REVISION.
  • issuesCount and criticalIssuesCount — totals across the run; the critical count is the number of HIGH-severity findings.
  • analysisSummary and fullResults — a short prose summary and the complete result payload (held as JSON) for the run.
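
The single-row, comma-separated layout can be illustrated with a small sketch. The `AnalysisRun` class below is a hypothetical in-memory view, not REQQA's actual model, and the percentage calculation is an assumption about how the progress fields relate to each other; the engine stores progress_percent itself.

```python
from dataclasses import dataclass


@dataclass
class AnalysisRun:
    """Hypothetical in-memory view of one analyses row."""
    stepCode: str          # comma-separated codes, e.g. "R-D,R-F"
    steps_completed: int
    total_steps: int

    def step_codes(self) -> list[str]:
        """Split the single stepCode field into individual codes."""
        return [c.strip() for c in self.stepCode.split(",") if c.strip()]

    def percent_done(self) -> int:
        """Assumed derivation from the step counters (illustrative only)."""
        if self.total_steps == 0:
            return 0
        return round(100 * self.steps_completed / self.total_steps)


run = AnalysisRun(stepCode="R-D,R-F", steps_completed=1, total_steps=2)
print(run.step_codes())    # ['R-D', 'R-F']
print(run.percent_done())  # 50
```

Anything that reports per step needs to split the field first, because there is only ever one row per run.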

When the same artefact is re-analysed, the newer run can mark an earlier one as superseded (via supersededBy/status), so the latest analysis is always the one that counts while the history remains.
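
One way to picture superseding is the sketch below. It assumes each run has a numeric id and an artefact identifier, and that only runs still marked Active are superseded; `Run`, `artefact_id`, and `supersede_earlier` are hypothetical names, and the real engine may apply different rules.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Run:
    """Hypothetical run record with the two fields involved in superseding."""
    id: int
    artefact_id: int                    # assumed link to the analysed artefact
    status: str = "Active"
    supersededBy: Optional[int] = None  # id of the newer run, if any


def supersede_earlier(history: list[Run], new_run: Run) -> None:
    """Mark earlier Active runs on the same artefact as superseded by new_run."""
    for run in history:
        same_artefact = run.artefact_id == new_run.artefact_id
        if same_artefact and run.id != new_run.id and run.status == "Active":
            run.status = "Superseded"
            run.supersededBy = new_run.id


history = [Run(id=1, artefact_id=10), Run(id=2, artefact_id=11)]
latest = Run(id=3, artefact_id=10)
supersede_earlier(history, latest)
history.append(latest)

for run in history:
    print(run.id, run.status, run.supersededBy)
```

The earlier run stays in the history with a pointer to its replacement, which is what keeps the audit trail intact.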

Where issues land

Every finding becomes a row in the analysis_issues table, linked back to its run by analysisid and tagged with the step_code that found it. Each issue carries:

  • a title, description, impact, and recommendation — what is wrong, why it matters, and what to do about it;
  • a location — where in the artefact the problem sits;
  • a severity — one of HIGH, MEDIUM, or LOW;
  • an aiseverity — the severity the AI originally assigned, kept alongside the working severity so an expert override is auditable;
  • a confidence — HIGH (clearly evidenced in the text), MEDIUM (reasonably inferred), or LOW (possible but speculative);
  • an issue_category: defect, suggestion, or scope_query, as described above;
  • a status (for example Open) you and your team can move as you work through the findings.
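
The fields above are enough to derive the run totals and to spot expert overrides. The sketch below uses a hypothetical `Issue` class carrying a subset of those fields, and it assumes the critical count is taken from the working severity rather than from aiseverity.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    """Hypothetical subset of an analysis_issues row."""
    title: str
    step_code: str
    severity: str    # working severity: HIGH, MEDIUM or LOW
    aiseverity: str  # severity the AI originally assigned
    confidence: str  # HIGH, MEDIUM or LOW


def run_totals(issues: list[Issue]) -> dict[str, int]:
    """Totals in the shape of issuesCount / criticalIssuesCount."""
    return {
        "issuesCount": len(issues),
        # Assumption: "critical" means working severity HIGH.
        "criticalIssuesCount": sum(1 for i in issues if i.severity == "HIGH"),
    }


def overridden(issues: list[Issue]) -> list[Issue]:
    """Issues whose working severity differs from the AI's original rating."""
    return [i for i in issues if i.severity != i.aiseverity]


issues = [
    Issue("Undefined term", "R-D", "HIGH", "MEDIUM", "HIGH"),
    Issue("Vague timing", "R-A", "MEDIUM", "MEDIUM", "MEDIUM"),
    Issue("No upper bound", "R-B", "HIGH", "HIGH", "LOW"),
]
print(run_totals(issues))
print([i.title for i in overridden(issues)])
```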

Severity matters beyond presentation: each application carries minimum severity thresholds (for missions, requirements, and stories), and a run can apply a per-run override severity so an expert can dial the floor up or down for a single analysis without changing the application's defaults.
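
A minimal sketch of that floor follows. It assumes the threshold filters out findings below it and that a per-run override simply replaces the application default for that run; the function names are hypothetical, and the engine's actual handling of below-threshold findings is not described on this page.

```python
from typing import Optional

SEVERITY_RANK = {"LOW": 1, "MEDIUM": 2, "HIGH": 3}


def effective_floor(app_minimum: str, run_override: Optional[str] = None) -> str:
    """Use the per-run override when given, else the application default."""
    return run_override if run_override is not None else app_minimum


def at_or_above(severities: list[str], floor: str) -> list[str]:
    """Keep only severities that meet the floor (assumed filtering behaviour)."""
    return [s for s in severities if SEVERITY_RANK[s] >= SEVERITY_RANK[floor]]


found = ["HIGH", "LOW", "MEDIUM", "LOW"]

# Application default of MEDIUM, no override.
print(at_or_above(found, effective_floor("MEDIUM")))
# Same default, but the run dials the floor down to LOW.
print(at_or_above(found, effective_floor("MEDIUM", "LOW")))
```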

Caution: Severity is not the same as confidence. A HIGH-severity issue that the analyser has only LOW confidence in is still worth reading: it flags a serious problem that the analyser is not certain exists. Use both fields together when triaging.
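
One possible triage order, sketched below, sorts by severity first and uses confidence only to break ties, so a HIGH-severity, LOW-confidence finding still appears ahead of every MEDIUM one. The ordering rule and the sample issues are illustrative assumptions, not REQQA behaviour.

```python
# Lower rank sorts first; the same scale serves severity and confidence.
RANK = {"HIGH": 0, "MEDIUM": 1, "LOW": 2}

# Made-up sample findings for illustration.
issues = [
    {"title": "Undefined term", "severity": "MEDIUM", "confidence": "HIGH"},
    {"title": "Possible conflicting rule", "severity": "HIGH", "confidence": "LOW"},
    {"title": "Vague wording", "severity": "LOW", "confidence": "HIGH"},
    {"title": "Missing upper bound", "severity": "HIGH", "confidence": "HIGH"},
]


def triage_order(items: list[dict]) -> list[dict]:
    """Sort by severity, then by confidence within the same severity."""
    return sorted(items, key=lambda i: (RANK[i["severity"]], RANK[i["confidence"]]))


for issue in triage_order(issues):
    print(issue["severity"], issue["confidence"], issue["title"])
```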

Where the AI usage is logged

Behind each analyser is one or more calls to the configured AI model. Every call is recorded in the gptlog table for audit and cost tracking, capturing the model name, the prompt and response, the elapsed time, and the token counts — prompt tokens, completion tokens, and total tokens — along with the organisation, the artefact, and the user. Because the model is configured per organisation, the same analyser run might be logged against, say, a Claude or a GPT model depending on the application's settings; the token figures let you see exactly what each run cost in usage terms. See AI models and cost for how the model choice is made.
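
To show how the token figures might be rolled up, the sketch below totals usage per model from a list of log rows. The dictionary keys (`model`, `prompt_tokens`, `completion_tokens`, `total_tokens`) and the model names are assumptions for illustration; the real gptlog column names may differ.

```python
from collections import defaultdict

# Hypothetical rows standing in for gptlog records.
rows = [
    {"model": "model-a", "prompt_tokens": 1200, "completion_tokens": 300, "total_tokens": 1500},
    {"model": "model-a", "prompt_tokens": 800, "completion_tokens": 200, "total_tokens": 1000},
    {"model": "model-b", "prompt_tokens": 500, "completion_tokens": 100, "total_tokens": 600},
]


def tokens_by_model(log_rows: list[dict]) -> dict[str, int]:
    """Sum total_tokens for each model name."""
    totals: dict[str, int] = defaultdict(int)
    for row in log_rows:
        totals[row["model"]] += row["total_tokens"]
    return dict(totals)


print(tokens_by_model(rows))
```

Turning these totals into a monetary cost would need per-model pricing, which is outside the scope of this page.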

How this fits the workflow

Analysis is not the end of the line — it is the feedback that drives improvement. The issues an analysis surfaces feed the synthesis and cleanup work that rewrites a requirement, the terms it extracts feed the glossary, and the features and outcomes it identifies inform stories and features. The better your mission and requirements, the sharper every analyser's findings become — which is why the loop tightens the more you use it.