In the first part, we covered what Claude Code does well, where it fails, and how a single developer can deliver the output of what is essentially a small engineering team. This part is about how our developer tried to build a semi-autonomous, multi-agent development pipeline within an existing infrastructure.
Claude Code Integration in Practice
1. At night
Claude Code connects to the developer’s GitHub repository and collects all tickets that need to be completed. During the night, it works through these tasks, pushes changes to GitHub, opens pull requests (PRs), and assigns Copilot as a reviewer. Copilot performs a code review, after which Claude Code resumes execution to address the feedback. By the time the developer wakes up, a batch of iterated and reviewed PRs is already waiting.
2. In the morning
The developer checks the PRs, merges the correct code, and leaves comments on those that need corrections.
3. During the day
Claude Code runs on an hourly loop, checking open PRs, resolving merge conflicts, and addressing comments. If something marked as “critical” appears, it tries to solve it first. At this stage, the developer’s role is closer to that of a Tech Lead, because neither Claude Code nor Copilot can reliably detect system-level issues that are not visible at the code level. The problem isn’t that the AI can’t “see” the code, but that it doesn’t execute the application in a real user context, track database state over time, or analyze production logs. As a result, it lacks full situational awareness of how the system behaves end-to-end.
4. In the evening
The developer receives a daily summary via email and plans the next day’s tasks. On the next day, the cycle repeats.
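The hourly pass from step 3 could be sketched roughly as follows. This is only an illustration of the ordering rule ("critical" first), not the developer's actual setup: the `PullRequest` class, its fields, and the function names are hypothetical, and no real GitHub or Claude Code API is called.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PullRequest:
    """Hypothetical, simplified view of an open PR (not a GitHub API object)."""
    number: int
    labels: List[str] = field(default_factory=list)
    has_merge_conflict: bool = False
    unresolved_comments: int = 0


def needs_attention(pr: PullRequest) -> bool:
    """A PR needs work if it has a merge conflict or unanswered comments."""
    return pr.has_merge_conflict or pr.unresolved_comments > 0


def plan_hourly_pass(open_prs: List[PullRequest]) -> List[PullRequest]:
    """Return the PRs to work on this hour, with "critical" ones first.

    sorted() is stable, so PRs keep their original relative order
    inside the critical and non-critical groups.
    """
    actionable = [pr for pr in open_prs if needs_attention(pr)]
    # False sorts before True, so PRs labeled "critical" come first.
    return sorted(actionable, key=lambda pr: "critical" not in pr.labels)


if __name__ == "__main__":
    prs = [
        PullRequest(101, unresolved_comments=2),
        PullRequest(102),  # nothing to do, so it is skipped
        PullRequest(103, labels=["critical"], has_merge_conflict=True),
    ]
    for pr in plan_hourly_pass(prs):
        print(pr.number)
```

In this sketch, a scheduler would call `plan_hourly_pass` once per hour and hand the resulting list to the agent. How the PR data is fetched is deliberately left out.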
Components of the Project
Beyond the code itself, the system also includes several additional layers:
- CI/CD – This pipeline works well: every time code is merged into master, it is automatically tested and deployed to production.
- TDD – Each bug fix begins with a failing test, and each feature starts with test design before implementation. Currently, 44% of the codebase is covered by automated tests, and that share is growing over time.
- QA – The developer gave Claude Code browser access, allowing it to launch the application, log in, and perform basic user flows much as a QA engineer would. It does not do complex scenario testing, though; it only runs smoke tests, which can save the developer 15 minutes in the morning.
- Documentation – This is one of the strongest components of the system. The project includes a diverse documentation set that not many projects have: architecture overviews, user guides, process documentation, test specifications, and functional requirements.
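To make the TDD item above concrete, here is a minimal sketch of the "failing test first" pattern for a bug fix. The function `parse_quantity` and the bug it had are invented for illustration and do not come from the project. The regression test is written first and fails against the buggy version, and only then is the fix applied.

```python
def parse_quantity(text: str) -> int:
    """Hypothetical function under repair: parse a quantity from user input.

    Imagined bug: the old version called text.isdigit() without stripping,
    so input such as " 3 " was rejected. The fix strips whitespace first.
    """
    cleaned = text.strip()
    if not cleaned.isdigit():
        raise ValueError(f"not a quantity: {text!r}")
    return int(cleaned)


def test_accepts_surrounding_whitespace():
    # Written before the fix: this failed while the bug existed.
    assert parse_quantity(" 3 ") == 3


def test_still_rejects_non_numeric_input():
    # Guards against "fixing" the bug by accepting everything.
    try:
        parse_quantity("three")
    except ValueError:
        return
    raise AssertionError("expected ValueError for non-numeric input")


if __name__ == "__main__":
    test_accepts_surrounding_whitespace()
    test_still_rejects_non_numeric_input()
    print("all regression tests passed")
```

The test functions follow the naming convention that runners such as pytest discover automatically, but the sketch itself needs only the standard library.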
Bottlenecks of the Project
Besides the release cycle, which is still unsatisfying and needs more work, two more things need to change:
1. Writing tickets is too slow
Formulating requirements takes longer than the AI needs to implement them. To address this, our developer is planning to have briefing sessions with Gemini Live, since its voice interface is solid and it tends to be more analytical in discussion.
2. Reviewing PRs is exhausting
Although the coding process itself has become significantly faster, reviewing pull requests remains mentally exhausting and time-consuming for the developer. To ease this, there are two optimization strategies:
- Low-impact PRs (documentation updates, test fixes, minor refactoring, and trivial bug fixes) may be merged automatically if both Claude Code and Copilot approve them.
- A new review stage using Gemini may be added. It would focus specifically on architectural critique and higher-level technical validation.
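The auto-merge rule from the first strategy can be written down as a small policy check. This is a sketch under assumptions: the category names and the `ReviewedPR` structure are hypothetical, and how a PR gets classified as low-impact is left open, since the text does not specify it.

```python
from dataclasses import dataclass
from typing import FrozenSet

# Assumed category names, mirroring the low-impact types listed above.
LOW_IMPACT_CATEGORIES: FrozenSet[str] = frozenset(
    {"documentation", "test-fix", "minor-refactoring", "trivial-bug-fix"}
)


@dataclass(frozen=True)
class ReviewedPR:
    """Hypothetical summary of a PR after both automated reviews."""
    category: str
    approved_by_claude_code: bool
    approved_by_copilot: bool


def can_auto_merge(pr: ReviewedPR) -> bool:
    """Allow auto-merge only for low-impact PRs that both reviewers approved."""
    return (
        pr.category in LOW_IMPACT_CATEGORIES
        and pr.approved_by_claude_code
        and pr.approved_by_copilot
    )


if __name__ == "__main__":
    docs_pr = ReviewedPR("documentation", True, True)
    feature_pr = ReviewedPR("feature", True, True)
    print(can_auto_merge(docs_pr))     # low-impact and doubly approved
    print(can_auto_merge(feature_pr))  # not low-impact, needs a human
```

Anything that fails the check simply stays in the developer's morning review queue, so the rule can only reduce the review load, never bypass the human for higher-impact changes.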
Final Thoughts
Admittedly, no part of this system works perfectly, which is why our developer is still working toward an autonomous multi-agent pipeline. Many parts require manual correction and guidance, at both the organizational and the technical level.
While both LLM-based systems function as a kind of IDE, they still lack true reasoning. They can implement solutions efficiently, but strategic thinking and final responsibility remain firmly with the human.