The review burden scales with the agent's decision space (which almost nobody is doing anything to limit today) and it's a variable you can control. When an agent has a discrete state, a subset of tools it can work with and constraints on the quality of the result the surface area of what could go wrong shrinks in proportion. The article treats this shift as inherent. It's more like we're all surprised pikachu that we gave the agents access to everything and expected it not to happen ever.