Inference compute is vastly different from training compute; the model also has to stay hot in VRAM, which probably accounts for most of it. There's limited use for THAT much compute as well — they're running things like Claude Code, and even then they're barely scratching the surface of the compute they have.
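As a rough sketch of why VRAM dominates inference serving — all the numbers here are illustrative assumptions (a hypothetical 70B-parameter model in fp16, a made-up KV-cache budget), not any vendor's actual figures:

```python
# Back-of-envelope VRAM estimate for serving an LLM.
# All numbers are illustrative assumptions, not real vendor specs.

def serving_vram_gb(params_billion, bytes_per_param=2, kv_cache_gb=40):
    """Weights held resident ('hot') in VRAM, plus a KV-cache budget
    for in-flight requests."""
    weights_gb = params_billion * bytes_per_param  # 1e9 params * bytes / 1e9
    return weights_gb + kv_cache_gb

# Hypothetical 70B model in fp16: 140 GB of weights alone must sit in
# VRAM before a single token is generated, plus cache on top.
print(serving_vram_gb(70))  # 180.0
```

The point being: serving capacity is mostly bounded by how many model copies fit in memory, not by raw FLOPs.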
Training currently requires Nvidia's latest and greatest for the best models (they also use Google TPUs now, which are also technically the latest and greatest? However, TPUs are more dual-purpose than anything AFAIK, so that would be a correct assessment in that case)
Inference can run on a hot potato if you really put your mind to it
I think I've heard multiple times that a large % of the training compute for SoTA models is actually inference, used to generate training tokens — this is bound to happen with RL training.
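A rough way to see why, using the standard FLOP approximations (~2·P FLOPs per token for a forward pass, ~6·P per token trained on). The rollout-to-trained-token ratio here is a made-up assumption purely for illustration:

```python
# Toy estimate of the inference share of an RL training run's compute.
# Parameter count and token counts are assumptions, not real figures.
P = 70e9              # hypothetical parameter count
gen_tokens = 100e9    # tokens generated as RL rollouts (assumed)
train_tokens = 10e9   # tokens actually trained on (assumed)

inference_flops = 2 * P * gen_tokens   # ~2P FLOPs per generated token
training_flops = 6 * P * train_tokens  # ~6P FLOPs per trained token
frac = inference_flops / (inference_flops + training_flops)
print(f"{frac:.0%}")  # inference share of the run's total compute
```

With these (assumed) ratios, generation dominates even though each generated token is cheaper than a trained one.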
Electricity is charged whether you use it or not, so very unlikely — but sure, they can find uses for it. Although they're not going to make that much money compared to Claude Code subscriptions.
The datacenter has a fixed cost for power; industrial power is not consumer power, especially at large scale. Scale really kicks in if you own your own power plant (e.g. hydro, wind, solar).
For example, even if you have a fixed power budget at the datacenter level, you still have opportunity costs: if you turn some unused GPUs off, you can run the others hotter.
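A toy illustration of that opportunity cost — every number here is made up, including the idea that an idle GPU still draws meaningful power (which is true in general, but the wattages are invented):

```python
# Fixed site power budget: watts spent keeping idle GPUs on are watts
# you can't give to the busy ones. All figures are assumptions.
BUDGET_W = 1200   # fixed datacenter power budget (assumed)
IDLE_W = 100      # draw of an idle-but-powered GPU (assumed)
gpus_total = 8
gpus_busy = 4

# Idle GPUs left on: their draw comes out of the shared budget.
power_per_busy_on = (BUDGET_W - (gpus_total - gpus_busy) * IDLE_W) / gpus_busy

# Idle GPUs powered off: busy GPUs can split the full budget.
power_per_busy_off = BUDGET_W / gpus_busy

print(power_per_busy_on, power_per_busy_off)  # 200.0 300.0
```

Same budget, but powering down the unused cards lets each busy GPU run 50% hotter in this toy setup.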