I made this for myself, and it seemed like it might be useful to others. I'd love some feedback, both on the threat model and the tool itself. I hope you find it useful!
Backstory: I've been using many agents in parallel as I work on a somewhat ambitious financial analysis tool. I was juggling agents working on epics for the linear solver, the persistence layer, the front-end, and planning for the second-generation solver. I was losing my mind playing whack-a-mole with the permission prompts. YOLO mode felt so tempting. And yet.
Then it occurred to me: what if YOLO mode isn't so bad? Decision fatigue is a thing. If I could cap the blast radius of a confused agent, maybe I could just review once. Wouldn't that be safer?
So that day, while my kids were taking a nap, I decided to see if I could put YOLO-mode Claude inside a sandbox that blocks exfiltration and regulates git access. The result is yolo-cage.
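To make "cap the blast radius" concrete, here is a minimal sketch of the general containment pattern, assuming Docker — this is an illustration of the idea, not yolo-cage's actual implementation (the image name `my-agent-image` and the `./project` mount are hypothetical):

```shell
# Hypothetical sketch of agent containment -- NOT yolo-cage's real mechanism.
# Assumes Docker and a local ./project checkout.

# 1. An internal network: containers on it have no route to the internet,
#    which blocks exfiltration by default.
docker network create --internal agent-net

# 2. Run the agent with a read-only root filesystem and a single writable
#    project mount, so all of its changes land in one reviewable place.
docker run --rm -it \
  --network agent-net \
  --cap-drop ALL \
  --read-only \
  --tmpfs /tmp \
  -v "$PWD/project":/work \
  -w /work \
  my-agent-image

# 3. Review everything the agent did in one pass, instead of per-prompt.
git -C project diff
```

The point of the pattern is that review moves from interactive permission prompts to a single `git diff` at the end: the agent can do anything inside the cage, but nothing leaves it without you looking.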
Also: the AI wrote its own containment system from inside the system's own prototype. Which is either very aligned or very meta, depending on how you look at it.
Comments URL: https://news.ycombinator.com/item?id=46706796
Points: 22
# Comments: 40