Research Papers

In-depth technical analysis to help engineers evaluate frameworks, tools, and practices.


Autonomous AI Agents: Execution Loops vs Interactive Assistance

Evidence synthesis comparing autonomous AI agent execution loops against interactive human-in-the-loop assistance, covering SWE-bench benchmarks, the METR RCT, industry telemetry, and adjacent domain evidence.

Topics: Autonomous agents, execution loops, SWE-bench, developer productivity, multi-agent systems, scaling laws


Agentic Development Tools and Execution Architectures

Architectural comparison of Claude Code, Goose, Cursor, and GitHub Copilot: how they differ in execution model, context management, and governance.

Topics: Execution architectures, multi-context systems, tool comparison, failure modes


Spec-Driven Development Framework Patterns

Analysis of the BMAD, SpecKit, and OpenSpec frameworks: when to use each, how they integrate with existing workflows, and practical adoption guidance.

Topics: Specification-driven development, BDD patterns, contract testing, CI/CD integration


Companion Articles

These papers are supported by practitioner-focused articles in the Articles section.

About

These papers are:

  • Practically focused - real-world implementation over theory
  • Based on primary sources - official docs, repos, and hands-on experience
  • Open - free to use, share, and contribute to

Contributing

Found an error or have a suggestion? Open an issue or submit a PR.

Released under the MIT License.