Dev Espresso #3 – GitHub Copilot: From Modes to Multi‑Agents (What's Coming Next)

Dariusz Luber


📺 Prefer video content? Jump to video.

For the best experience, use both — watch the video and read the article; together they give you the full picture.

TL;DR

Copilot already gives you three autonomy levels (Chat → Edits → Agent Mode) plus Custom Modes (personas) inside the IDE. By chaining a few lightweight practices (shared markdown context + deliberate role switching), you can simulate a multi‑agent workflow today. Coming soon: Agent‑to‑Agent (A2A) and richer MCP toolchains will make those hand‑offs automatic.

Action now: Use Custom Modes for thinking/planning/review; let Agent Mode implement small, reviewable tasks; capture decisions in a markdown log so other modes (QA / Cloud / Architect) can reference them.


1. The Shift: Autocomplete → Collaborative Autonomy

Most developers still treat Copilot like a supercharged autocomplete. Reality: it now spans a spectrum:

| Mode | Primary Use | Autonomy Level | Sweet Spot |
|------|-------------|----------------|------------|
| Chat | Q&A, exploration, explaining code | Reactive | Understanding & small snippets |
| Edits | Local + multi‑file transformations | Guided | Refactors, restructuring |
| Agent Mode | Goal‑oriented iterative work | Proactive | Small scoped tasks (tests, config, feature slice) |

Mental model: each step adds initiative. You move from giving line‑by‑line instructions → describing goals and letting Copilot drive constrained progress.


2. Custom Modes/Custom Agents: Operationalized Personas

Custom Modes/Custom Agents let you save persona instructions (tone, constraints, priorities, tool access). Examples:

  • Architect: upfront design & trade‑offs
  • QA: test strategy, coverage, edge cases
  • Cloud / DevOps: infra, CI/CD, security & observability
  • Language Specialists: Node, Python, etc.

Why they matter:

  1. Repeatability – no re‑prompting boilerplate.
  2. Cognitive mode switching – you think differently when the persona enforces discipline (e.g., “propose risks before coding”).
  3. Team alignment – share .github/chatmodes/*.chatmode.md in the repo (agents/*.agent.md in newer VS Code versions).

Custom Agents have recently become available in the GitHub cloud coding agent (https://github.com/copilot).

Minimal Example Structure

```text
.github/
  chatmodes/
    architect.chatmode.md
    qa.chatmode.md
    cloud-devops.chatmode.md
    js-ts-node.chatmode.md
    python.chatmode.md
```

Frontmatter keys you can use: description, tools, model. Body = persona contract.
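
For illustration, a minimal architect.chatmode.md could look like this (the tool and model names are placeholders — substitute whatever your VS Code build exposes):

```markdown
---
description: Upfront design and trade-off analysis; surfaces risks before any code.
tools: ['codebase', 'search']
model: GPT-4.1
---

You are the Architect persona.
- Restate goals and constraints before proposing anything.
- Sketch components, interfaces, and at least two alternatives with trade-offs.
- List risks and open questions explicitly.
- Never write implementation code; hand milestones to Agent Mode.
```

The body below the frontmatter is the persona contract: it is prepended to every conversation in that mode, which is what makes the discipline repeatable.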


3. A Practical Multi‑Persona Flow (Today)

Basic implementation

Until native inter‑agent collaboration arrives, stitch personas manually:

  1. Architect Mode → produce: goals, constraints, components, interface sketch, risks, milestones.
  2. Save that as docs/architecture-session-YYYY-MM-DD.md (or append to an ADR).
  3. QA Mode → reads that file → proposes test strategy + skeleton test files.
  4. Cloud / DevOps Mode → adds deployment/IaC + CI adjustments referencing the same doc.
  5. Agent Mode → implements one milestone at a time (never “build the whole feature”).
  6. Repeat: after each Agent task, update the shared doc (decisions / deviations / open questions).

Glue Principle: shared markdown context = temporary team memory.
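
Step 6 lends itself to a tiny helper. A minimal Python sketch (the file name pattern follows step 2 above; nothing here is a Copilot API — it just maintains the shared markdown log):

```python
# Append a timestamped decision entry to the shared session doc,
# so later persona passes (QA / Cloud / Architect) can reference it.
from datetime import date, datetime, timezone
from pathlib import Path

def log_decision(doc_dir: Path, decision: str, rationale: str) -> Path:
    doc = doc_dir / f"architecture-session-{date.today().isoformat()}.md"
    stamp = datetime.now(timezone.utc).isoformat(timespec="minutes")
    entry = f"\n## Decision ({stamp})\n- {decision}\n- Rationale: {rationale}\n"
    doc.parent.mkdir(parents=True, exist_ok=True)
    with doc.open("a", encoding="utf-8") as f:  # append-only: never rewrite history
        f.write(entry)
    return doc
```

Run it after each Agent task; because it only ever appends, the log doubles as a chronological decision history.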

Advanced implementation - Persona Memory (Topic‑Scoped Shared State)

Flat, global persona memory files work, but the pattern scales better when you scope memory to a topic (feature, refactor, migration) and separate decisions from persona perspectives.

```text
docs/
  topics/
    feature-{slug}/
      source/                   # authoritative raw artifacts (stable-ish)
        business-requirements.md
        design.md
        tests.md
        infra-notes.md          # optional if produced
      decisions/                # atomic, append-only decisions (ADR-lite); later evolve into proper project ADRs
        2025-09-29-decide-database.md
        2025-09-29-choose-queue.md
      memory/                   # per-persona rolling synopses (summaries / deltas referencing source/)
        business.memory.md
        architect.memory.md
        qa.memory.md
        cloud-devops.memory.md
        agent.memory.md         # (when delegating PoC / milestone execution)
      topic.summary.md          # regenerated consolidation across memory + decisions
      analytics/                # derived or generated signals from PoC(s) / CI
        coverage.json
        perf-snapshot-2025-09-29.md
```

Conceptual layers:

  • source: stable, reference documents (business-requirements, design, tests, infra notes)
  • decisions: immutable(ish) atomic rationales (one concern per file)
  • memory: succinct evolving state per persona (delta + intent referencing source/)
  • topic.summary.md: consolidated digest (state, milestones, cross-persona open questions)
  • analytics: machine or script generated inputs (metrics, coverage, perf)

Guiding rules:

  1. One topic folder per significant feature or initiative; global concerns can live under topics/platform/ etc.
  2. Each persona updates only its own memory file; other personas propose via quoted blocks or PR comments.
  3. Decisions graduate from memory to decisions/ when stabilized (gives them a permalink and diff history).
  4. Memory entries are timestamped, capped in size, and reference artifacts instead of duplicating them.
  5. topic.summary.md is regenerated (manual or scripted) after each milestone to keep onboarding cost near zero.

Why this works:

  • Topic scoping prevents global memory bloat.
  • Clear separation: decisions vs transient reasoning.
  • Enables selective context injection (only mount the relevant topic folder for prompts / agents).
  • Future A2A orchestration can diff memory snapshots to derive new tasks.
  • Analytics stay external, reducing noise in human‑authored summaries.
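
Selective context injection can be as simple as concatenating one topic's summary and decisions into a prompt preamble. A bounded Python sketch, assuming the folder layout above (the size cap is an arbitrary example value):

```python
# Build a bounded prompt context from one topic folder only:
# the consolidated summary plus atomic decisions, skipping bulky analytics.
from pathlib import Path

def topic_context(topic_dir: Path, max_chars: int = 8000) -> str:
    parts = []
    summary = topic_dir / "topic.summary.md"
    if summary.exists():
        parts.append(summary.read_text(encoding="utf-8"))
    decisions = topic_dir / "decisions"
    if decisions.is_dir():
        for dec in sorted(decisions.glob("*.md")):
            parts.append(f"<!-- {dec.name} -->\n{dec.read_text(encoding='utf-8')}")
    context = "\n\n".join(parts)
    return context[:max_chars]  # hard cap keeps the injection bounded
```

Mounting only this string (rather than the whole docs/ tree) is what keeps persona prompts cheap and focused.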

Memory File Template (memory/architect.memory.md)

```markdown
---
persona: architect
topic: feature-{slug}
updated: 2025-09-29T09:00:00.000Z
---

## Current Intent
<concise statement>

## Key Assumptions
- ...

## Open Questions
| ID | Question | Owner | Added |
|----|----------|-------|-------|

## Decision Candidates (Unratified)
- [ ] Q1: <short label> – pending perf data

## Recently Ratified (Moved to decisions/ soon)
| Date | Decision | Rationale | Link |
|------|----------|-----------|------|

## Risks / Watch
- (<timestamp>) <risk statement>

## Signals (Imported)
- Test coverage delta: <value> (source: analytics/coverage.json)
- p95 latency: <value> (perf-snapshot-2025-09-29.md)

## Upcoming Milestones
- M<n>: <goal> (ETA, dependency)
```

topic.summary.md (optional aggregator):

```markdown
---
topic: feature-{slug}
phase: milestone-2
updated: 2025-09-29T11:30:00.000Z
---

## Snapshot
Status: Implementing milestone 2 (API gateway integration)
Next Critical Decision: choose message queue (see decisions/2025-09-29-choose-queue.md)

## Milestones
| ID | Description | State | ETA |
|----|-------------|-------|-----|

## Open Questions (Cross-Persona)
| ID | Question | Owner | Added |
|----|----------|-------|-------|

## Recent Decisions
| Date | Decision | Impact |
|------|----------|--------|

## Key Risks
- <risk>
```

Update cadence suggestions:

  • Architect memory: after milestone planning/close or assumption shift.
  • QA memory: after test suite evolution or coverage anomaly.
  • Cloud DevOps memory: after infra change or new SLO signal.
  • Agent memory (if used): terse execution notes; often you can omit and rely on architect + summary.
  • topic.summary.md: regenerate post‑milestone (scriptable via future automation).
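
The "scriptable" regeneration mentioned above can start as small as this Python sketch, which pulls each persona's "## Current Intent" section (a section name from the memory template; the folder layout is the one described earlier):

```python
# Regenerate a snapshot for topic.summary.md from per-persona memory files
# by extracting each file's "## Current Intent" section.
from pathlib import Path
import re

def extract_section(text: str, heading: str) -> str:
    """Return the body of a '## <heading>' section, or '' if absent."""
    m = re.search(rf"^## {re.escape(heading)}\s*\n(.*?)(?=^## |\Z)",
                  text, re.MULTILINE | re.DOTALL)
    return m.group(1).strip() if m else ""

def build_summary(topic_dir: Path) -> str:
    lines = [f"## Snapshot ({topic_dir.name})", ""]
    for mem in sorted((topic_dir / "memory").glob("*.memory.md")):
        persona = mem.name.split(".")[0]
        intent = extract_section(mem.read_text(encoding="utf-8"), "Current Intent")
        lines.append(f"- {persona}: {intent or '(no intent recorded)'}")
    return "\n".join(lines) + "\n"
```

Wire it into a post-milestone script (or a CI step) and the onboarding cost of a topic stays near zero, as rule 5 above intends.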

Prompt example (topic‑scoped):

"Architect Mode: Read docs/topics/feature-{slug}/memory/architect.memory.md plus topic.summary.md; reconcile differences with decisions/*. List any stale assumption candidates and refresh 'Upcoming Milestones'. Keep under 80 lines."

This pattern creates a bounded, queryable state capsule per feature—easy for humans now, and machine agents later.


4. Common Anti‑Patterns

| Anti‑Pattern | Why It Hurts | Fix |
|--------------|--------------|-----|
| One giant Agent request ("build full feature") | Bloated diffs, hallucinated structure | Slice into PR‑sized milestones |
| Skipping design mode | Leads to rework and fragile architecture | Force Architect Mode first for non‑trivial tasks |
| Letting QA mode modify prod code | Blurs responsibility & risk | QA suggests tests; Agent applies with tests |
| No decision log | Personas lose continuity | Maintain docs/decisions.md or ADRs |
| Overwriting human judgment | False confidence | Treat outputs as drafts; review everything |

5. Looking Forward: Agent‑to‑Agent (A2A) & MCP

Today: Modes are sandboxed – no shared persistent memory.

Emerging Layers:

  • MCP (Model Context Protocol): standardized way to expose tools (Jira, build systems, orchestrators) to Copilot.
  • A2A (Vision): agents exchanging structured tasks + artifacts. Architect persona could emit a task graph; QA subscribes to design changes; Cloud persona auto‑adds observability hooks.
  • Agent Factory / Orchestrators: define, spawn, and supervise persona agents with access boundaries + explicit capabilities.

Benefits When Mature (Deliberative A2A Loop):

  • Multi‑scenario synthesis: architect/infra/test personas co‑generate 2–4 plausible solution stacks (trade‑offs explicit) before any code.
  • Structured rejection memory: discarded approaches logged (with reason + blocking constraint) to prevent cycling.
  • Automatic clarification pass: if critical unknowns detected (risk, data volume, latency expectation), a Question Backlog artifact is generated before proposing final architecture.
  • Consensus scoring: agents independently score scenarios (cost, complexity, resilience, change surface) → orchestrator aggregates + highlights divergence.
  • Human gating: no scenario is promoted to “Plan” state without an explicit human accept (agent output stays draft until promoted).
  • Selective knowledge consultation: personas pull only relevant slices of source/ + memory (bounded context windows) to reduce hallucinated constraints.
  • Policy / compliance hooks: security / governance agents inject veto notes or required controls into scenario diffs, not after the fact.
  • Stepwise delegation: implementation agents receive a signed-off micro‑plan (milestone slice) not an open‑ended feature brief.
  • Traceable deliberation: every accepted element links back to scenario comparison or question resolution entry.

Prepare now by:

  • Defining evaluation criteria rubric (e.g., cost, blast radius, MTTR impact, testability) in a reusable markdown snippet.
  • Capturing rejected ideas with rationale (so future agents can short‑circuit repeats).
  • Maintaining an explicit Question Backlog section per topic (questions.md or in summary) with status (open|answered|assumption).
  • Normalizing decision records (timestamp, driver, alternatives, rationale, consequences).
  • Tagging artifacts with stable IDs for future machine cross‑referencing.

6. GitHub Cloud Coding Agent vs IDE Modes

| Aspect | IDE Custom Modes | Cloud Coding Agent |
|--------|------------------|--------------------|
| Persona customization | Yes (.chatmode.md) | No (repo instructions only) |
| Execution environment | Your local dev setup | Ephemeral GitHub VM |
| Can open PRs autonomously | Indirect (you drive) | Yes |
| Multi‑file editing loops | Yes | Yes |
| Extend via MCP | Emerging | Emerging |
| Agent Mode availability | Yes | N/A (different model) |

Use the cloud agent for repo‑contained, reviewable automation (config tweaks, dependency bumps, small feature slices, small refactors), especially when local environment context isn’t critical.


7. Iterative Persona Workflow (Diagram)

Instead of a linear checklist, the real loop cycles through: ingest sources → persona-specific memory synthesis → plan slicing → execution → feedback → refinement → finalization (ADR / test scenarios / infra hardening). The diagram shows how raw inputs feed structured reasoning before code changes.

```mermaid
%%{init: {"flowchart": {"htmlLabels": true, "curve": "basis"}} }%%
flowchart TD
  subgraph SOURCES["Raw Source Inputs"]
    transcripts["Meeting<br/>Transcripts"]
    clientDocs["Client Docs<br/>/ Briefs"]
    domainNotes["Domain<br/>Notes"]
  end
  sourcesAggregate["source/<br/>curated excerpts"]
  transcripts --> sourcesAggregate
  clientDocs --> sourcesAggregate
  domainNotes --> sourcesAggregate

  subgraph PERSONA_LOOP["Persona Pass Iteration"]
    BA["Business Analyst<br/>(business.memory.md)"]
    ARCH["Architect<br/>(architect.memory.md)"]
    QA["QA<br/>(qa.memory.md)"]
    DEVOPS["Cloud DevOps<br/>(cloud-devops.memory.md)"]
  end

  sourcesAggregate --> BA --> ARCH --> QA --> DEVOPS --> PLAN["Implementation Plan<br/>(IMPLEMENTATION_PLAN.md)"] --> SUMMARY["topic.summary.md<br/>(Consolidated Snapshot)"] --> AGENT["Agent Execution<br/>(step i)"] --> REVIEW{"Review &<br/>Decision?"}
  REVIEW -->|Approve| NEXT["Next Plan Step"]
  NEXT --> AGENT
  REVIEW -->|Info Gaps| FEEDBACK["Feedback /<br/>New Questions"]
  FEEDBACK -.-> sourcesAggregate
  SUMMARY --> FINALIZE["Finalize Artifacts<br/>(ADR · Test Scenarios)"] --> DONE["Ready /<br/>Reduced Unknowns"]

  classDef phase fill:#f0f6ff,stroke:#3d7ac4,stroke-width:1px,color:#1a1a1a
  classDef memory fill:#f7f2ff,stroke:#7d3fb2,stroke-width:1px,color:#1a1a1a
  classDef plan fill:#fff4e0,stroke:#e09600,stroke-width:1px,color:#1a1a1a
  classDef summary fill:#e8f9f0,stroke:#25a25a,stroke-width:1px,color:#1a1a1a
  classDef exec fill:#f2f2f2,stroke:#666,stroke-width:1px,color:#1a1a1a
  classDef review fill:#ffe7e7,stroke:#d93c3c,stroke-width:1px,color:#1a1a1a
  classDef finalize fill:#e6f3ff,stroke:#2d8fd5,stroke-width:1px,color:#1a1a1a
  classDef done fill:#d9f7e6,stroke:#1f8150,stroke-width:1px,color:#1a1a1a

  class transcripts,clientDocs,domainNotes,sourcesAggregate phase
  class BA,ARCH,QA,DEVOPS memory
  class PLAN plan
  class SUMMARY summary
  class AGENT exec
  class REVIEW,FEEDBACK review
  class FINALIZE finalize
  class DONE done
```

Operational notes:

  1. Sources are never edited—curate excerpts into source/ (traceable back to originals or transcript timestamps).
  2. Each persona updates only its memory file with deltas (intent, risks, open questions) referencing source/.
  3. The implementation plan is produced only after core personas converge (BA → Architect → QA → DevOps).
  4. Summaries (topic.summary.md) are regenerated after plan approval and after each executed step.
  5. Execution halts after every plan step for explicit review; deviations feed back into architect + business memory.
  6. Finalization: stable design decisions become ADRs; QA formalizes test scenario specs; infra notes crystallize into IaC / pipeline PRs.
  7. Loop restarts whenever new external input (client feedback, risk discovery) changes assumptions.

Fallback linear prompts (if you prefer the earlier list) can still be derived from this diagram; treat each edge as a discrete, reviewable persona hand‑off.


8. Lightweight MCP Setup (Optional Starter)

Place in .vscode/mcp.json and keep secrets in .vscode/.env (gitignored):

```json
{
  "servers": {
    "jira": {
      "type": "http",
      "url": "https://jira.example.com/mcp",
      "envFile": ".vscode/.env",
      "headers": { "Authorization": "Bearer ${JIRA_TOKEN}" },
      "tools": ["search_issues", "create_issue"]
    },
    "a2a-orchestrator": {
      "type": "http",
      "url": "https://agents.example.com/a2a/mcp-bridge",
      "envFile": ".vscode/.env",
      "headers": { "X-Org": "${ORG_ID}", "Authorization": "Bearer ${A2A_TOKEN}" },
      "tools": ["delegate_task", "get_agent_status", "collect_results"]
    }
  }
}
```

.gitignore addition:

.vscode/.env

9. Actionable Takeaways

| Goal | Do This Now |
|------|-------------|
| Faster, safer design | Add Architect + QA chat modes |
| Consistent infra & security | Add Cloud DevOps mode with least‑privilege checklist |
| Reduce rework | Write decision markdown after each milestone |
| Prepare for A2A future | Keep persona contracts + structured artifacts |
| Smaller PRs | Constrain Agent tasks to single milestone with acceptance checks |

10. Call to Action

Experiment with this workflow for your next feature. Share with your team. Leave feedback: Do you want deeper dives into MCP, security agents, or automated test generation? Your input will shape future episodes.


11. Resources


Found this helpful? Power up my next content!

Buy me a coffee