2024-04-19 10:27am
Mark Zuckerberg - Llama 3, $10B Models, Caesar Augustus, & 1 GW Datacenters - fascinating interview - the least robot-like I’ve ever seen Zuck, he’s getting that billion-dollar media training. Highlights include:
- They got an edge in the GPU race because in 2022 they realised they were short on GPUs for training their Reels recommendation system, so they purchased double what they needed.
- They foresee the bottleneck being energy production (not chips), both in the regulatory lead time and the tech required to generate enough power for the chips.
- They have their own chips now, so they can lessen their reliance on more expensive Nvidia chips - they won't train Llama 4 on their own silicon, but might train Llama 5 on it.