llama
Links
Llama 3.3 is a text-only 70B instruction-tuned model that Meta says delivers performance approaching the 405B model at a fraction of the serving cost.
Meta introduces Llama 3.1, including a 405B model. Zuck restates their commitment to open source. Models are up on Hugging Face, with the 405B having a 200 GB+ VRAM requirement.
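A quick back-of-the-envelope check on that 200 GB+ figure. This is a sketch that only counts weight storage at different precisions; a real deployment also needs memory for the KV cache and activations, so actual requirements are higher:

```python
def weight_vram_gb(n_params: float, bytes_per_param: float) -> float:
    """Approximate GB needed just to hold the model weights."""
    return n_params * bytes_per_param / 1e9

N = 405e9  # Llama 3.1 405B parameter count

# fp16/bf16 is the native precision; int8 and 4-bit are common quantizations.
for name, bpp in [("fp16/bf16", 2.0), ("int8", 1.0), ("4-bit", 0.5)]:
    print(f"{name}: ~{weight_vram_gb(N, bpp):.0f} GB")
```

At full fp16 precision the weights alone come to ~810 GB; the 200 GB+ figure corresponds roughly to a 4-bit quantized copy of the weights.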
Mark Zuckerberg - Llama 3, $10B Models, Caesar Augustus, & 1 GW Datacenters - fascinating interview - the least robot-like I’ve ever seen Zuck, he’s getting that billion-dollar media training. Highlights include:
- They got the edge in the GPU race because in 2022 they realised they were short on GPUs for training their Reels recommendation system, so they purchased double what they needed.
- They foresee the bottleneck being energy production (not chips), both in the regulatory lead time and in the technology required to produce enough energy to power the chips.
- They have their own chips now, so they can lessen their reliance on more expensive Nvidia chips - they won't train Llama 4 with their own silicon but might train Llama 5.