Different bets, I think. Devin is autonomous in the cloud and grades its own tests. PaellaDoc runs locally, is model-agnostic, and decides "done" with a gate that runs your criteria. It also serves the non-coder case that Devin doesn’t.
For people who’ve tried both: where does each actually win for you?
Comparison: PaellaDoc vs Devin: two different bets on autonomous AI coding · PAELLADOC
I have real respect for Devin. For a senior engineer handing off well-scoped work to an autonomous agent in the cloud, it is further along than anything I am shipping.
It just makes two bets I could not make. One, it is a single stack in a single cloud. I wanted the model to be my choice, task by task, with a local option for when the code cannot leave the machine. Two, it grades its own tests. That is fine when an engineer reviews the PR and catches the green-but-broken build. The person I built PaellaDoc for has no such safety net: they cannot read the diff, so if the agent says done and the build is green, whatever is broken ships.
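To make the difference concrete, here is a minimal sketch of what a gate that is independent of the agent could look like. It is an illustration only, not PaellaDoc's actual code: `Criterion`, `run_gate`, and the sample checks are hypothetical names, and the lambdas stand in for real checks you would define yourself.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Criterion:
    """A user-defined acceptance check (hypothetical structure)."""
    name: str
    check: Callable[[], bool]  # returns True when the criterion is met


def run_gate(criteria: List[Criterion], agent_says_done: bool) -> Tuple[bool, List[str]]:
    """Decide 'done' from the user's criteria, not from the agent's claim."""
    failures: List[str] = []
    for criterion in criteria:
        try:
            ok = criterion.check()
        except Exception:
            ok = False  # a check that crashes counts as a failure
        if not ok:
            failures.append(criterion.name)
    # The agent's verdict is required but never sufficient on its own.
    done = agent_says_done and not failures
    return done, failures


# Placeholder checks: in practice each one would exercise the built product.
criteria = [
    Criterion("signup form saves a user", lambda: True),
    Criterion("checkout total includes tax", lambda: False),
]

done, failures = run_gate(criteria, agent_says_done=True)
print(done, failures)
```

In this sketch the agent reports done, yet the gate still says not done because one of your criteria fails. The agent's claim never decides the outcome by itself.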
So PaellaDoc is local, model-agnostic, and puts an independent gate where Devin puts trust. Different bet, for a different person. If you have run both, where does each one actually win for you?