Zuck's new Llama is a beast
Take a first look at Meta's LLaMA 3.1 405B large language model. Find out how it stacks up to other AI products like ChatGPT, Mistral, and Claude.

When Mark Zuckerberg isn't wake surfing, wearing a tuxedo and a puka shell necklace at his Lake Tahoe mansion, crushing cold ones and waving the American flag, he’s at work battling Google and OpenAI for AI supremacy. Yesterday, Meta released its biggest and baddest large language model yet, LLaMA 3.1, which is free and arguably open-source.

This powerhouse took months to train on 16,000 Nvidia H100 GPUs, costing hundreds of millions of dollars and using enough electricity to power a small country. The result? A massive 405 billion parameter model with a 128,000-token context length that reportedly outperforms OpenAI’s GPT-4 and even beats Claude 3.5 Sonnet on some key benchmarks.

However, benchmarks can be deceptive. The only way to truly gauge a model's performance is by using it. In today's update, we’ll dive into LLaMA 3.1 and see if it lives up to the hype.
Key Highlights:
- LLaMA 3.1 comes in three sizes: 8B, 70B, and 405B, where B refers to billions of parameters.
- More parameters let a model capture more complex patterns, but a bigger model isn't automatically a better one; GPT-4, for instance, is rumored to have over 1 trillion parameters.
- LLaMA 3.1 is kind of open source: you can use it freely unless your app has 700 million monthly active users, in which case, you’ll need a license from Meta.
- The training data includes diverse sources like blogs, GitHub repos, and even Facebook posts and WhatsApp messages.
- The training code is remarkably simple: just 300 lines of Python and PyTorch, using FairScale to distribute training across multiple GPUs.
- The model weights are open, a significant advantage for developers wanting to build AI-powered apps without relying on GPT-4’s API (see the loading sketch after this list).
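
Since the weights are open, you can load them directly rather than going through an API. Here's a minimal sketch using Hugging Face transformers to run the small 8B checkpoint; the repo ID, dtype, and generation settings are my assumptions (the checkpoints are gated behind Meta's license), not details from the release itself.

```python
# Minimal sketch: load the 8B Llama 3.1 checkpoint with Hugging Face transformers.
# The repo ID below is an assumption -- the checkpoints are gated, so you must accept
# Meta's license on the Hub before from_pretrained can download them.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"  # assumed Hub repo ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision so the 8B model fits on one big GPU
    device_map="auto",           # requires `accelerate`; spreads layers across devices
)

prompt = "Summarize why a 128,000-token context window matters."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=120)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The same code would, in principle, scale up to the 70B and 405B checkpoints, but as noted below, the 405B weights alone are 230GB, so you'd need a rack of GPUs rather than a gaming card.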
While self-hosting this model is expensive (the weights are 230GB and even an RTX 4090 struggles to run it), platforms like Meta, Groq, or Nvidia's Playground let you try it out for free.
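
If you'd rather not download anything, Groq exposes an OpenAI-compatible endpoint, so the stock openai client works. A rough sketch, assuming the base URL and model ID below (check Groq's current model list, since names change):

```python
# Rough sketch: query Llama 3.1 on Groq's hosted API instead of self-hosting.
# The base URL and model name are assumptions based on Groq's OpenAI-compatible API;
# verify both against Groq's docs before relying on them.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_GROQ_API_KEY",                # from the Groq console
    base_url="https://api.groq.com/openai/v1",  # assumed OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="llama-3.1-70b-versatile",  # assumed model ID; swap in the 405B variant if offered
    messages=[{"role": "user", "content": "Give me a one-sentence take on Llama 3.1."}],
)
print(response.choices[0].message.content)
```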
Initial feedback suggests the smaller versions of LLaMA are more impressive than the massive 405B model. However, the real power lies in its ability to be fine-tuned with custom data, potentially leading to some incredible uncensored models in the near future.
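
Fine-tuning the full 405B model is out of reach for most people, but the smaller checkpoints can be adapted cheaply with LoRA, which freezes the base weights and trains small adapter matrices instead. Here's a minimal sketch using the peft library; the model ID and target modules are assumptions, and this is one common recipe rather than Meta's own training setup.

```python
# Minimal sketch: attach LoRA adapters to the 8B model so it can be fine-tuned on custom
# data without updating all of the base weights. Model ID and target modules are assumptions.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3.1-8B")  # assumed repo ID

lora_config = LoraConfig(
    r=16,                                 # rank of the low-rank adapter matrices
    lora_alpha=32,                        # scaling factor applied to the adapter output
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only a tiny fraction of the parameters are trainable
# From here, train `model` with your dataset and trainer of choice.
```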
Despite LLaMA 3.1's strengths, it still lags behind Claude in some tasks. For example, it struggled to build a Svelte 5 web application with runes, a new feature that only Claude 3.5 Sonnet handled correctly in a single shot. Nonetheless, LLaMA 3.1 excels in other areas like creative writing and poetry, though it isn't the best I've seen.
It's fascinating that despite multiple companies training massive models on massive computers, they seem to be plateauing at similar levels of capability. OpenAI made a significant leap from GPT-3 to GPT-4, but subsequent advancements have been incremental.
Reflecting on the current state of AI, it seems we're far from achieving artificial superintelligence, which remains a concept largely confined to Silicon Valley imaginations. Meta, however, is keeping it real in the AI space, and LLaMA 3.1 represents a small step for man but a giant leap for Zuckerberg's redemption arc.
Thank you for reading, and stay tuned for more updates.
What did you think of this week's issue? We take your feedback seriously.