My AI tool use has evolved since my last post on the subject. The question has shifted from “should I use these tools?” to “how do I use several of them at once without losing my mind?” The goal isn’t raw output — it’s reducing the cost of context switching. Jumping between tasks is expensive for me mentally, and it’s expensive for an agent that has to rebuild context each time. The setup below keeps each task in its own lane.

The Workspace: tmux as the Foundation

I run everything inside tmux. My base window has three vertically stacked panes: one dedicated to the Android emulator, and two general-purpose panes I use for things like taking notes in Neovim or running quick commands. That window stays open and mostly stable while I work.

From there, I open additional windows — one per task I want to run in parallel. Each task window is split into two panes: a large one (roughly 80%) running an agent, and a smaller one (20%) for terminal work like git commands or launching an editor directly.
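The layout above can be sketched as a small bootstrap script. The session and window names here ("dev", "base", "task-1") are placeholders, and the split percentage is approximate:

```shell
# Write a tmux bootstrap script for the layout described above.
# Session/window names ("dev", "base", "task-1") are placeholders.
cat > tmux-layout.sh <<'EOF'
#!/bin/sh
# Base window: three vertically stacked panes
# (emulator on top, two general-purpose panes below).
tmux new-session -d -s dev -n base
tmux split-window -v -t dev:base
tmux split-window -v -t dev:base

# One window per parallel task: a large agent pane (~80%)
# over a small terminal pane (~20%).
tmux new-window -t dev -n task-1
tmux split-window -v -p 20 -t dev:task-1

tmux attach -t dev
EOF
chmod +x tmux-layout.sh
```

Running the script once at the start of the day rebuilds the whole workspace, which is part of what keeps the environment spatially consistent.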

On my corporate Mac I also use a tiling app to snap windows into fixed positions. It sounds like a small thing, but keeping your environment spatially consistent reduces cognitive overhead. You know where things are without thinking about it.

Git Worktrees: The Enabling Piece

Running multiple agents in parallel only works cleanly if they’re not stepping on each other. That’s where git worktree comes in. Each worktree is a separate checkout of the same repository, so agents can work on different branches simultaneously without file conflicts.
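A minimal sketch of that flow, using a throwaway repo and branch names ("task-a", "task-b") that are illustrative rather than from any real project:

```shell
# Create a throwaway repo, then add one worktree per parallel task.
# Paths and branch names are placeholders.
base=$(mktemp -d)
git init -q "$base/repo"
git -C "$base/repo" -c user.name=demo -c user.email=demo@example.com \
  commit -q --allow-empty -m "initial commit"

# Each worktree is a full checkout on its own branch, so agents
# working in task-a/ and task-b/ never touch the same files.
git -C "$base/repo" worktree add -q -b task-a "$base/task-a"
git -C "$base/repo" worktree add -q -b task-b "$base/task-b"

# Lists the main checkout plus both task directories.
git -C "$base/repo" worktree list
```

In practice each tmux task window maps one-to-one onto a worktree: the agent pane runs inside it, and the terminal pane is already cd'd into the right branch.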

The constraint is system resources. Each worktree with a running agent consumes memory and CPU, so you have to be deliberate about how many you spin up. It's a practical ceiling, not a theoretical one.

The Agents: Wibey and Code-Puppy

I’ve been using two agents — Wibey and Code-Puppy — with different Claude models depending on the task. For complex work that benefits from deeper reasoning I’ll use a model with extended thinking; for more straightforward tasks I switch to a standard model to keep token usage in check. Treating tokens as a real resource rather than an afterthought matters when you have multiple agents running concurrently.

Wibey has better integration with our corporate tooling, so it handles the work that would otherwise interrupt my flow: updating Jira tickets, writing Confluence documentation, and creating Pull Requests on our corporate GitHub. That’s not a small thing. That administrative overhead is exactly the kind of task that pulls a senior developer out of deep work at the worst moment. Having an agent that can handle it reliably — tasked from a tmux pane without switching context — is where I’ve seen the most tangible productivity gain.

More recently I’ve been experimenting with using Wibey as an executive agent, with a defined role and skill set written in a markdown file imported into its settings, directing subordinate agents running in their own worktrees. In practice it’s marginally useful at this point. The limiting factor isn’t orchestration — it’s visibility. You can’t always tell what a sub-agent is actually doing, and when something goes wrong, diagnosing it is harder than it would be with a human. I haven’t found a clean solution to this yet; structured output and keeping tasks tightly scoped help at the margins, but it’s still the weakest part of the setup. Sometimes the right move is just to open a terminal pane and edit the code directly.
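For concreteness, the role file is just structured markdown. Everything below — the headings, the delegation rules — is a hypothetical sketch of the shape such a file can take, not Wibey's actual configuration:

```markdown
# Role: Executive Agent

## Scope
- Break incoming tasks into small, independent units.
- Assign each unit to one subordinate agent in its own worktree.

## Rules
- Never edit code directly; delegate and review.
- Require each subordinate to report: branch, files touched, test results.
- Escalate to the human when a task is entangled or ambiguous.
```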

Android Development: A Note on Tooling

Android Studio has a habit of getting sluggish when you switch between projects frequently. For this kind of multi-project, multi-agent workflow, that’s a real problem. I’ve had better results using VSCode with Kotlin extensions. It’s lighter, faster to context-switch, and the terminal integration means I can launch it from a tmux pane without breaking my flow.

One thing that’s non-negotiable with this setup: configure a local Gradle cache. With multiple worktrees running builds — for tests, for manual validation, for whatever an agent just did that you need to verify — you will be compiling often. On a large codebase, that pain adds up fast. Without a shared local cache, each worktree risks triggering a full Gradle build, and on a project of any real size that’s the kind of friction that makes the whole parallel workflow feel not worth it. Get the cache right early and you’ll barely notice the builds. Skip it and you’ll be watching progress bars when you should be working.
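A one-time setup sketch, assuming a standard Gradle user home: enabling the build cache in the user-level gradle.properties turns it on for every worktree at once, and Gradle's local cache (under ~/.gradle/caches by default) is shared machine-wide, so the worktrees reuse each other's task outputs automatically.

```shell
# Enable Gradle's build cache for every project on this machine.
# The fallback assumes the default ~/.gradle user home.
gradle_home="${GRADLE_USER_HOME:-$HOME/.gradle}"
mkdir -p "$gradle_home"

# Idempotent: only append the flag if it is not already set.
grep -qs '^org.gradle.caching=true' "$gradle_home/gradle.properties" \
  || echo 'org.gradle.caching=true' >> "$gradle_home/gradle.properties"
```

With that in place, a build in one worktree warms the cache for all the others, which is the difference between incremental rebuilds and watching full builds in every checkout.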

What Actually Works

The setup earns its complexity when tasks are well-defined and fit cleanly into separate worktrees. Routine work — adding tests, small features with clear scope, isolated refactors, and the steady stream of Jira and documentation updates — is where you see real throughput gains from parallelism.

It breaks down when tasks are entangled, when you need to understand what an agent did before you can continue, or when the business logic is subtle enough that reviewing the agent’s output takes longer than writing it yourself. None of that is surprising. It’s the same lesson from before: know your domain, review everything, and don’t architect yourself into a corner you can’t get out of without the tools.

Still Figuring It Out

This isn’t a finished system. It’s a snapshot of what’s working right now, and that changes as the tools change. I share what I’m finding with my team — what’s useful, what’s a dead end, what’s worth trying. That kind of lateral knowledge sharing matters more than usual right now because these tools are moving fast enough that no one person has the full picture. Something I wrote off three months ago might be worth revisiting. Something that worked last month might have a better answer today.

The honest position is that we’re all still learning what these tools are actually good at. The capabilities are shifting, the models are improving, and the workflows that make sense today might look different in six months. That’s fine. Staying curious and staying skeptical at the same time is the right posture. Share what you find, stay open to being wrong about it, and keep the fundamentals close — because those aren’t changing.