BLUF Generator: Critical Ops Hackathon

Live demo: BLUF Generator

Tools used: Codex, Claude Code, Cursor, Gemini, Groq, Python, Next.js, Supabase, Vercel, Tavily

TL;DR: DC Critical Ops was a three-day national security hackathon in Washington, DC, from April 17-19, 2026, organized by Johns Hopkins University, Georgetown University and George Washington University.

Project

I built BLUF Generator for analysts who need to turn open-source reporting into structured intelligence-style reports. BLUF means Bottom Line Up Front, a writing style used in military and intelligence reporting where the main judgment comes first.

The project is an agentic summarization pipeline for publicly available information. It ingests reporting, extracts evidence, estimates source quality, cross-references claims, and drafts a BLUF-style analytic product for human review.

The interesting question was not “can an LLM summarize an article?” The harder problem was whether an AI system could preserve the shape of serious analysis: source quality, evidence, uncertainty, confidence, alternative explanations and review before release.

Workflow and Design

The live app is a Next.js dashboard for OSINT analysis. The interface starts with a topic and routes it through a backend pipeline:

  1. Collect or retrieve sources for the topic.
  2. Run source intake and extract evidence.
  3. Assign source and information quality signals using Admiralty-style reliability and credibility ratings.
  4. Generate a draft BLUF and key judgments.
  5. Run a tradecraft gate against analytic-quality standards.
  6. Persist sources, evidence, draft products, judgments, and trace events in Supabase.
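
Step 3 can be illustrated with a small sketch. The labels below are the standard Admiralty (NATO) scale: a letter for source reliability and a digit for information credibility. The class name `AdmiraltyRating` and its methods are hypothetical names chosen for illustration, and the sketch says nothing about how the app actually assigns the ratings.

```python
from dataclasses import dataclass

# Standard Admiralty scale labels. How the app derives a rating is not shown here.
RELIABILITY = {
    "A": "Completely reliable",
    "B": "Usually reliable",
    "C": "Fairly reliable",
    "D": "Not usually reliable",
    "E": "Unreliable",
    "F": "Reliability cannot be judged",
}
CREDIBILITY = {
    1: "Confirmed by other sources",
    2: "Probably true",
    3: "Possibly true",
    4: "Doubtful",
    5: "Improbable",
    6: "Truth cannot be judged",
}


@dataclass(frozen=True)
class AdmiraltyRating:
    """Hypothetical container pairing source reliability with information credibility."""

    reliability: str  # "A".."F", a property of the source
    credibility: int  # 1..6, a property of the individual claim

    def __post_init__(self):
        # Reject values outside the scale instead of storing them silently.
        if self.reliability not in RELIABILITY:
            raise ValueError(f"unknown reliability grade: {self.reliability!r}")
        if self.credibility not in CREDIBILITY:
            raise ValueError(f"unknown credibility grade: {self.credibility!r}")

    @property
    def code(self) -> str:
        # Conventional short form, e.g. "B2".
        return f"{self.reliability}{self.credibility}"

    def describe(self) -> str:
        return f"{self.code}: {RELIABILITY[self.reliability]}; {CREDIBILITY[self.credibility]}"
```

Keeping the two axes separate matters: a usually reliable outlet can still carry a doubtful claim, and the rating should be able to say so.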

The deployed version has quick-select topics like China-Taiwan, Strait of Hormuz, and NATO expansion. I chose these because they are messy national security questions where confident-sounding summaries can be dangerous if they hide uncertainty.

Backend

The backend is a Vercel-compatible Next.js App Router application. The intended flow is:

topic -> source intake -> evidence -> draft -> tradecraft gate -> product
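
As a rough illustration of that flow, here is a minimal Python sketch that chains the stages and records a trace entry for each one. The stage functions are placeholders with hypothetical names and return dummy data. The deployed backend is a Next.js application, so this is a language-neutral picture of the control flow, not the project's code.

```python
from typing import Callable

State = dict
Stage = Callable[[State], State]


def source_intake(state: State) -> State:
    # Placeholder: a real stage would retrieve or ingest reporting for the topic.
    return {**state, "sources": [f"placeholder source for {state['topic']}"]}


def extract_evidence(state: State) -> State:
    # Placeholder: one dummy claim per source.
    return {**state, "evidence": [f"claim from {s}" for s in state["sources"]]}


def draft(state: State) -> State:
    # Placeholder: a real stage would call a model to write the BLUF.
    n = len(state["evidence"])
    return {**state, "draft": f"BLUF draft on {state['topic']} from {n} evidence item(s)"}


def tradecraft_gate(state: State) -> State:
    # Placeholder check: the draft must rest on at least one evidence item.
    return {**state, "gate_passed": bool(state["evidence"])}


PIPELINE: list[tuple[str, Stage]] = [
    ("source_intake", source_intake),
    ("evidence", extract_evidence),
    ("draft", draft),
    ("tradecraft_gate", tradecraft_gate),
]


def run_pipeline(topic: str) -> State:
    """Run every stage in order and record which stages ran."""
    state: State = {"topic": topic, "trace": []}
    for name, stage in PIPELINE:
        state = stage(state)
        state["trace"].append(name)
    return state


product = run_pipeline("Strait of Hormuz")
print(product["trace"])
```

Because every stage takes and returns the same state dictionary, a stage can be swapped or inspected on its own without touching the others.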

Supabase stores the important objects: sources, evidence, products, judgments, and trace events. The UI is designed to show not only the final answer, but also the path the answer took.
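
One way to picture how stored objects make that path visible is a set of records linked by identifiers. The sketch below is an assumption for illustration only: the field names (`source_id`, `evidence_ids`) and the helper `sources_for` are hypothetical and do not describe the project's actual Supabase schema.

```python
from dataclasses import dataclass, field


@dataclass
class Source:
    id: str
    url: str


@dataclass
class Evidence:
    id: str
    source_id: str  # hypothetical link back to the Source record
    claim: str


@dataclass
class Judgment:
    id: str
    text: str
    evidence_ids: list[str] = field(default_factory=list)  # hypothetical links


def sources_for(judgment: Judgment, evidence: list[Evidence], sources: list[Source]) -> list[Source]:
    """Walk judgment -> evidence -> source, returning each source once, in order."""
    evidence_by_id = {e.id: e for e in evidence}
    source_by_id = {s.id: s for s in sources}
    found: list[Source] = []
    for evidence_id in judgment.evidence_ids:
        source = source_by_id[evidence_by_id[evidence_id].source_id]
        if source not in found:
            found.append(source)
    return found
```

With links like these, a reviewer can start from any key judgment and walk back to the reporting behind it.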

Gemini and Groq are used in the agent layer. Tavily supports live web search when real-time OSINT is enabled. The architecture also includes a doctrine layer.

What Worked

The strongest part of the prototype is the workflow framing. The app does not just ask a model for a response. It separates collection, source evaluation, drafting and review, which makes the final result easier to inspect and criticize.

The trace view is important because it shows that the system did not jump from topic to answer. It routed the request, found or ingested material, generated a product and checked that product against a standard.
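
To make "checked that product against a standard" concrete, here is a minimal sketch of what such a gate could look like. The specific checks and the product fields (`bluf`, `judgments`, `confidence`, `evidence_ids`, `alternatives`) are assumptions drawn from the qualities listed earlier (evidence, confidence, alternative explanations). They are not the app's actual rules.

```python
ALLOWED_CONFIDENCE = {"low", "moderate", "high"}


def tradecraft_gate(product: dict) -> list[str]:
    """Return a list of failures; an empty list means the draft passes this sketch."""
    failures: list[str] = []

    # The bottom line must actually be present.
    if not (product.get("bluf") or "").strip():
        failures.append("missing BLUF")

    # Each judgment needs a stated confidence level and supporting evidence.
    for i, judgment in enumerate(product.get("judgments", []), start=1):
        if judgment.get("confidence") not in ALLOWED_CONFIDENCE:
            failures.append(f"judgment {i}: no confidence level")
        if not judgment.get("evidence_ids"):
            failures.append(f"judgment {i}: no supporting evidence")

    # At least one alternative explanation should be on record.
    if not product.get("alternatives"):
        failures.append("no alternative explanations considered")

    return failures
```

Returning the list of failures instead of a bare pass/fail keeps the gate inspectable: the reviewer sees why a draft was held back.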

Limitations

The app is currently affected by an agent orchestration issue, so the public demo may not always represent the intended workflow. I will update the project once the pipeline is stable.

Why It Matters

The real-world use case is an analyst workspace, not a replacement for analysts. A tool like this can help analysts move faster through source triage, evidence organization and first-draft generation, while making uncertainty visible.

The important work was scoping the workflow, keeping the product grounded in tradecraft, and deciding what not to claim before the system actually works.

Thanks

Thanks to Johns Hopkins University, Georgetown University and George Washington University for organizing the hackathon. Thanks also to the sponsors across defense technology, AI, and national security.
