I made this for myself, and it seemed like it might be useful to others. I'd love some feedback, both on the threat model and the tool itself. I hope you find it useful!
Backstory: I've been using many agents in parallel as I work on a somewhat ambitious financial analysis tool. I was juggling agents working on epics for the linear solver, the persistence layer, the front-end, and planning for the second-generation solver. I was losing my mind playing whack-a-mole with the permission prompts. YOLO mode felt so tempting. And yet.
Then it occurred to me: what if YOLO mode isn't so bad? Decision fatigue is a thing. If I could cap the blast radius of a confused agent, maybe I could just review once. Wouldn't that be safer?
So that day, while my kids were taking a nap, I decided to see if I could put YOLO-mode Claude inside a sandbox that blocks exfiltration and regulates git access. The result is yolo-cage.
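To make "cap the blast radius" concrete, here is a minimal sketch of the general containment pattern, assuming Docker — this is an illustration of the idea, not yolo-cage's actual implementation (the image name `my-agent-image` and the `./project` mount are hypothetical):

```shell
# Hypothetical sketch of agent containment -- NOT yolo-cage's real mechanism.
# Assumes Docker and a local ./project checkout.

# 1. An internal network: containers on it have no route to the internet,
#    which blocks exfiltration by default.
docker network create --internal agent-net

# 2. Run the agent with a read-only root filesystem and a single writable
#    project mount, so all of its changes land in one reviewable place.
docker run --rm -it \
  --network agent-net \
  --cap-drop ALL \
  --read-only \
  --tmpfs /tmp \
  -v "$PWD/project":/work \
  -w /work \
  my-agent-image

# 3. Review everything the agent did in one pass, instead of per-prompt.
git -C project diff
```

The point of the pattern is that review moves from interactive permission prompts to a single `git diff` at the end: the agent can do anything inside the cage, but nothing leaves it without you looking.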
Also: the AI wrote its own containment system from inside the system's own prototype. Which is either very aligned or very meta, depending on how you look at it.
Comments URL: https://news.ycombinator.com/item?id=46706796
Points: 22
# Comments: 40