llm-party: Run Claude, Codex, and Copilot Together in One Terminal

llm-party is an open-source multi-agent CLI that runs multiple AI coding agents in the same terminal session. Claude Code, OpenAI Codex, GitHub Copilot, Ollama, GLM, and any Claude-compatible API run side by side as peers. Every agent sees the full conversation. Agents hand off work to each other automatically. The developer controls routing with @tags.

Most multi-agent setups use MCP or master-servant patterns where one model controls the others. llm-party uses peer orchestration instead. No hierarchy. No CLI wrapping or output scraping. Direct SDK integration with each provider. Each agent keeps its own strengths without being filtered through another model’s judgment.

Three AI coding agents in a single terminal session with shared context and automatic handoff.

How It Works: Multi-Agent Orchestration Without MCP

llm-party uses official SDKs from each provider: the Anthropic Agent SDK, OpenAI Codex SDK, and GitHub Copilot SDK. Each agent gets a persistent session with its provider, real streaming, and full tool access.

The architecture:

Terminal
    |
Orchestrator
    |
    +-- Agent Queue Manager (per-agent queues, non-blocking)
    |
    +-- Agent Registry
    |     +-- Claude  -> ClaudeAdapter  (Anthropic Agent SDK)
    |     +-- Codex   -> CodexAdapter   (OpenAI Codex SDK)
    |     +-- Copilot -> CopilotAdapter (GitHub Copilot SDK)
    |     +-- Custom  -> CustomAdapter  (routes through native CLI)
    |
    +-- Conversation Log (ordered, all messages)
    |
    +-- Transcript Writer (JSONL, append-only)
    |
    +-- Session Manifest (per-agent cursors + SDK session IDs)

Messages route by tag. @claude goes to Claude. @all broadcasts to every agent. Agents dispatch to each other with @next:<tag>, up to 15 automatic hops per cycle.

@claude: Plan the auth module for this API

[CLAUDE] JWT tokens, refresh rotation, middleware on
protected routes. Files: jwt.ts, auth.ts, login.ts
@next:codex

[CODEX] Implemented. Three files changed.
@next:copilot

[COPILOT] Two findings: token expiry is hardcoded,
refresh endpoint missing rate limiting.
@next:claude

[CLAUDE] Fixed both. Done.
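The routing and handoff loop above can be sketched in TypeScript. The names (`parseRoute`, `nextHop`, `HOP_LIMIT`) are illustrative, not llm-party's actual API; only the tag grammar and the 15-hop cap come from the text above.

```typescript
const HOP_LIMIT = 15; // the documented cap on automatic hops per cycle

type Route =
  | { kind: "broadcast" }               // @all
  | { kind: "direct"; target: string }  // @claude, @codex, @copilot, ...
  | { kind: "none" };                   // untagged message

// Parse the leading @tag on a human message.
function parseRoute(message: string): Route {
  const m = message.match(/^@(\w+)\b/);
  if (!m) return { kind: "none" };
  return m[1] === "all" ? { kind: "broadcast" } : { kind: "direct", target: m[1] };
}

// Extract an agent's @next:<tag> handoff, enforcing the hop budget.
function nextHop(agentReply: string, hopsUsed: number): string | null {
  if (hopsUsed >= HOP_LIMIT) return null; // cycle budget exhausted
  const m = agentReply.match(/@next:(\w+)/);
  return m ? m[1] : null;
}
```

The orchestrator would call `nextHop` on each agent reply and dispatch until the reply carries no handoff tag or the hop budget runs out.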

Official SDKs, Not CLI Wrapping

Unlike tools that scrape terminal output or wrap CLI binaries, llm-party integrates directly with each provider’s official SDK. Real streaming, real tool use, real session management.

Provider                     Official SDK                      Published by
Claude Code                  @anthropic-ai/claude-agent-sdk    Anthropic
Codex CLI                    @openai/codex-sdk                 OpenAI
Copilot                      @github/copilot-sdk               GitHub
Custom (Ollama, GLM, etc.)   Any Claude-compatible API         Via native CLI

Authentication flows through each provider’s own CLI. If claude, codex, or copilot commands work on the machine, they work inside llm-party. No separate API keys, no proxy servers, no new accounts.

Parallel AI Agents with Non-Blocking Queues

Each agent has its own processing queue. Typing continues while agents work. Fast agents respond immediately while slower agents process in order. When an agent is busy and a new message arrives, it queues automatically (up to 20 pending, 5-minute TTL).

The status bar shows real-time agent state with superscript queue counts. Each agent displays what it is currently doing: file reads, shell commands, search queries. No waiting for one agent to finish before addressing another.
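The queueing behavior can be sketched as follows. The class and method names are hypothetical, but the caps (20 pending messages, 5-minute TTL) match the ones stated above.

```typescript
const MAX_PENDING = 20;            // documented pending-message cap
const TTL_MS = 5 * 60 * 1000;      // documented 5-minute time-to-live

interface Queued { text: string; enqueuedAt: number }

class AgentQueue {
  private items: Queued[] = [];

  // Returns false when the queue is full, so the caller can surface a rejection.
  enqueue(text: string, now = Date.now()): boolean {
    this.evictExpired(now);
    if (this.items.length >= MAX_PENDING) return false;
    this.items.push({ text, enqueuedAt: now });
    return true;
  }

  // Pop the oldest message that has not passed its TTL; null if nothing is live.
  dequeue(now = Date.now()): string | null {
    this.evictExpired(now);
    const item = this.items.shift();
    return item ? item.text : null;
  }

  private evictExpired(now: number): void {
    this.items = this.items.filter(i => now - i.enqueuedAt < TTL_MS);
  }

  get pending(): number { return this.items.length; }
}
```

With one `AgentQueue` per agent, a slow agent draining its queue never blocks input to the others.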

Agent Sidebar (v0.12.0)

Toggle with Ctrl+B. A live activity panel showing per-agent status with scrollable activity logs. Each entry shows the tool in use and the target: Read: src/index.ts, Bash: npm test, Search: handleSubmit. Auto-hides on narrow terminals.

Cancel Panel (v0.11.0)

Press Esc while agents are active. A multi-select modal lists all running agents. Select which ones to cancel. The rest keep working. Per-agent abort signals, not a global kill switch.

Persistent Sessions Across Multiple AI Agents

Every session generates a unique ID and writes an append-only JSONL transcript. A manifest file alongside each transcript stores per-agent SDK session IDs and message cursors. Sessions survive terminal closure.

Resume where all agents left off:

llm-party --resume 20260407-143022-abc123

Claude resumes its conversation. Codex resumes its thread. Copilot resumes its session. Each agent receives only unseen messages since its last turn. No duplicate processing.
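A rough model of the cursor mechanics, assuming the manifest maps each agent tag to the last conversation-log sequence number that agent has seen. The shapes here are illustrative, not the actual manifest format.

```typescript
interface LogEntry { seq: number; from: string; text: string }
interface Manifest { cursors: Record<string, number> } // agentTag -> last seen seq

// On resume, an agent receives only entries after its stored cursor.
function unseenMessages(log: LogEntry[], manifest: Manifest, agent: string): LogEntry[] {
  const cursor = manifest.cursors[agent] ?? 0; // unknown agent: replay everything
  return log.filter(e => e.seq > cursor);
}

// After an agent takes its turn, advance its cursor so nothing replays twice.
function advanceCursor(manifest: Manifest, agent: string, log: LogEntry[]): void {
  manifest.cursors[agent] = log.length ? log[log.length - 1].seq : 0;
}
```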

Shared Memory Across AI Agents and Projects

llm-party ships with a three-layer memory architecture that gives AI coding agents persistent context across sessions and projects.

Project Memory lives in the repository. Technical decisions, file paths, current state. Agents read it on boot and write to it as they work. Stored as markdown in .llm-party/memory/.

Mind-Map is shared across all projects. Obsidian-compatible markdown with wikilinks in ~/.llm-party/network/mind-map/. Cross-project context, architectural decisions, lessons learned. Open the folder in Obsidian to visualize the knowledge graph.

Self-Memory is per-agent behavioral learning. When an agent receives a correction, it records the correction. Next session, the same mistake does not repeat. Each AI agent builds its own learning history independently.
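One way to picture self-memory: an append-only list of corrections per agent, serialized to markdown for the next session to reload. The entry format below is hypothetical, not llm-party's on-disk layout.

```typescript
interface Correction { date: string; note: string }

// Corrections are never edited or removed, only appended.
function appendCorrection(history: Correction[], date: string, note: string): Correction[] {
  return [...history, { date, note }];
}

// Serialize to markdown bullets for the agent's memory file.
function toMarkdown(history: Correction[]): string {
  return history.map(c => `- [${c.date}] ${c.note}`).join("\n");
}
```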

Run Local Models with Ollama, GLM, or Any Compatible API

Any endpoint that speaks the Anthropic API protocol works as a custom provider. Add Ollama for local inference, GLM for Zhipu AI, or any other compatible endpoint. Configuration is a JSON entry, not a code change.

Ollama (run local LLMs alongside Claude and Codex):

{
  "name": "Ollama",
  "tag": "ollama",
  "provider": "custom",
  "cli": "claude",
  "model": "llama3",
  "env": {
    "AUTH_URL": "http://localhost:11434/v1",
    "AUTH_TOKEN": "ollama"
  }
}

GLM (Zhipu AI):

{
  "name": "GLM",
  "tag": "glm",
  "provider": "custom",
  "cli": "claude",
  "model": "glm-5",
  "env": {
    "AUTH_URL": "https://api.z.ai/api/anthropic",
    "AUTH_TOKEN": "your-glm-api-key"
  }
}

Configurable AI Agent Skills

Agents load markdown-based skill files that define specialized capabilities: code review standards, deployment checklists, writing guides, security audits. Skills are discovered from ~/.llm-party/skills/ (global) and .llm-party/skills/ (project-local). Assign skills to specific agents in config:

{
  "name": "Reviewer",
  "tag": "reviewer",
  "provider": "claude",
  "model": "opus",
  "preloadSkills": ["aala-review"]
}

Skills are plain markdown. Anyone on the team can write or modify them.

Prompt System

Every agent receives a base system prompt automatically. Additional prompt files can be layered per agent using the prompts field. Template variables ({{agentName}}, {{agentTag}}, {{humanName}}, {{validHandoffTargets}}, and others) are resolved at boot.
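A minimal resolver for this kind of template, assuming `{{name}}` placeholders and leaving unknown variables untouched. The variable names match those listed above; the resolver itself is illustrative, not the actual implementation.

```typescript
// Replace each {{name}} with its value from vars; unmatched placeholders survive as-is.
function resolvePrompt(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (whole: string, name: string) =>
    name in vars ? vars[name] : whole
  );
}
```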

Safety

All agents run with full permissions. File system access, shell execution, no approval gates. This is intentional. Run llm-party in a disposable environment: a Docker container, a VM, or a throwaway git branch, not directly on a production machine.

Peer Orchestration vs. MCP: How llm-party Compares

               MCP-based orchestration           llm-party
Architecture   Master controls servant agents    Peer orchestration, all agents equal
Integration    CLI wrapping, output scraping     Direct official SDK adapters
Context        Master filters what agents see    Every agent sees the full conversation
Sessions       Fresh each time                   Persistent per provider, resumable
Concurrency    Sequential or blocking            Non-blocking per-agent queues
Providers      Usually single-provider           Claude, Codex, Copilot, Ollama, any
Auth           Separate API keys per tool        Uses existing CLI authentication

Install

Requires Bun runtime and at least one AI coding CLI installed and authenticated (Claude Code, Codex CLI, or GitHub Copilot).

bun add -g llm-party-cli
llm-party

First run launches an interactive setup wizard that auto-detects installed AI CLIs and creates the config.

Terminal Commands

Command                        Action
@claude / @codex / @copilot    Route message to agent
@all                           Broadcast to all agents
Ctrl+B                         Toggle agent sidebar
Ctrl+P                         Toggle agents panel
Esc                            Cancel panel (select agents to stop)
/resume <id>                   Resume previous session
/config                        Open config wizard
/agents                        Open agents panel
/info                          Commands and shortcuts
Shift+Enter                    Multi-line input

Repository: github.com/aalasolutions/llm-party
Website: llm-party.party
npm: llm-party-cli
License: MIT

What v0.12.0 Includes

llm-party has shipped 29 releases since March 2026. The current release, v0.12.0, includes:

  • Peer orchestration for Claude Code, Codex CLI, GitHub Copilot, and custom providers
  • Agent sidebar with live activity logs (Ctrl+B)
  • Cancel panel for selective agent termination (Esc)
  • Non-blocking per-agent queues with superscript queue counts
  • Session persistence and resume across all providers
  • Three-layer memory (project, mind-map, self-memory)
  • Ollama and custom provider support via config
  • Markdown-based skills assignable per agent
  • Agent-to-agent handoff with @next:<tag> routing

llm-party is built and maintained by AALA AI, the AI engineering division of AALA IT Solutions. Open source under MIT license.