// case study
Pebble Pop was an early swarm-built game that research agents brought back to life.
Pebble Pop is a Peggle-like Godot game with colored pegs, a shooter, a bucket catch, scoring, level progression, power-ups, menu flow, and a state server for QA. The swarm built it early on and then left it idle; it was recovered after research agents diagnosed a false validation cascade.
Evidence note: the raw log archive is incomplete because some logs were removed during later cleanup. This case study is based on surviving controller history, git history, research reports, QA reports, screenshots, and project files. Raw logs are not copied verbatim because they can contain local paths and personal context.
// recovery arc
The game was playable, but the swarm kept treating a headless Godot warning as a project failure.
The recurring failure text was "47 resources still in use at exit". Recovery tasks interpreted that stderr line as a script or scene leak, so the system kept generating new bug tasks even though the validation scripts were already printing success and exiting with code 0.
Research agents isolated the problem by running controlled headless checks, bare Godot launches, and validation scripts. The result was a clear rule: success is exit code 0 plus the expected OK message. The resource warning was engine/headless cleanup noise, not a Pebble Pop bug.
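The rule can be sketched as a small oracle function. This is a minimal Python sketch, not the swarm's actual validation code. The names `is_valid_run`, `unexplained_stderr`, and `KNOWN_NOISE`, the plain substring matching, and the assumption that the OK messages arrive on stdout are all illustrative. The OK markers are the ones listed in the validation pattern recorded in the timeline.

```python
# Hypothetical oracle: names and structure are illustrative assumptions,
# not the project's real validation script.
EXPECTED_OK_MARKERS = ("All scripts OK", "All scenes OK", "Main scene OK")

# Stderr text the diagnosis classified as engine/headless cleanup noise.
KNOWN_NOISE = ("resources still in use at exit",)


def is_valid_run(exit_code: int, stdout: str, stderr: str = "") -> bool:
    """Pass only on exit code 0 plus every expected OK marker in stdout.

    stderr is accepted but deliberately not consulted for the verdict.
    """
    if exit_code != 0:
        return False
    return all(marker in stdout for marker in EXPECTED_OK_MARKERS)


def unexplained_stderr(stderr: str) -> list[str]:
    """Return non-empty stderr lines that do not match known noise."""
    return [
        line
        for line in stderr.splitlines()
        if line.strip() and not any(noise in line for noise in KNOWN_NOISE)
    ]
```

Keeping the verdict separate from the stderr review means a noisy warning can still be surfaced for inspection without turning a passing run into a new bug task.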
Once that diagnosis was written down, later agents could stop chasing the false leak, finish UI polish, apply the StateServer port fix, and return to real QA.
// what exists now
Pebble Pop is a full first-playable slice.
The surviving project files show a complete arcade loop: shooter, ball physics, peg collision, bucket, scoring, level manager, level select, power-up manager, UI controller, background, QA instrumentation, and GUT coverage.
Core loop
Shoot balls into a peg board, clear targets, catch rebounds in the bucket, and track score and remaining balls.
Progression
Three level configurations, level selection, score accumulation, and completion flow are represented in project code and tests.
QA hooks
StateServer support, screenshots, and automated checks make the game inspectable by harness QA agents.
Screenshot: menu state with board, bucket, and play flow visible.
Screenshot: polished gameplay state after recovery, with starfield, score, and ball count visible.
// sanitized timeline
The important work was diagnosis, not more blind bug fixing.
These entries are summarized from surviving project artifacts and controller history, not the lost raw-log archive.
May 28 - Research audited UI anchors and layout risks.
May 28 - Art and polish passes added visible sprites, level select flow, UI feedback, and gameplay screenshots.
May 29 - Research diagnosed the recurring "47 resources" validation cascade as headless Godot/GUT noise.
May 29 - Validation pattern documented: All scripts OK, All scenes OK, Main scene OK, exit code 0.
May 29 - QA context confirmed the core game loop was functional after background, art, and UI updates.
May 29 - StateServer gained STATE_PORT support so QA launches could avoid stale-process port collisions.
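The STATE_PORT entry suggests how a harness can give each QA launch its own port. Below is a minimal Python sketch of the harness side, under stated assumptions. The source confirms only that StateServer supports STATE_PORT. That it is read from an environment variable, and the helper names `pick_free_port` and `qa_launch_env`, are assumptions for illustration.

```python
import os
import socket
from typing import Dict, Optional


def pick_free_port() -> int:
    """Ask the OS for a currently unused TCP port on localhost."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))  # port 0 lets the OS choose a free port
        return sock.getsockname()[1]


def qa_launch_env(port: Optional[int] = None) -> Dict[str, str]:
    """Build an environment for one QA launch with its own STATE_PORT.

    Assumption: the game reads STATE_PORT from its process environment.
    """
    env = dict(os.environ)
    chosen = port if port is not None else pick_free_port()
    env["STATE_PORT"] = str(chosen)
    return env
```

The returned mapping could be passed as `env=` to `subprocess.Popen` when starting the game. There is a small window between picking the port and the game binding it, so a production harness might retry on a bind failure.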
// lesson
Research agents gave the swarm a way out of its own recovery loop.
The failure mode was not "agents cannot fix the game." It was "agents kept trusting a noisy validation interpretation." The research task changed the operational facts available to the swarm: what to ignore, what to trust, and how to decide whether the project was actually valid.
That is the value of read-only diagnosis in an agent swarm. It can stop repeated write attempts when the right move is to refine the test oracle.
Read the other preserved runs.
Pebble Pop shows recovery and diagnosis. Chess-2 and Gravity Golf show end-to-end build and stabilization.