stafford williams

Anthropic releases Claude Sonnet 4.5 claiming it’s

the best coding model in the world.

API price remains the same as earlier versions at

$3/$15 per million tokens.

Also, ccusage deprecates blocks --live as the official way to track claude usage is via claude.ai/settings/usage.

ai, claude, anthropic, model • 2025-09-30 • 7:45am

Pricing remains consistent with previous Opus and Sonnet models: Opus 4 at $15/$75 per million tokens (input/output) and Sonnet 4 at $3/$15.

Claude Opus 4 is our most powerful model yet and the best coding model in the world. GitHub says Claude Sonnet 4 soars in agentic scenarios and will introduce it as the model powering the new coding agent in GitHub Copilot

Claude Code, now generally available, brings the power of Claude to more of your development workflow - new beta extensions for VS Code and JetBrains integrate Claude Code directly into your IDE. Beyond the IDE, we’re releasing an extensible Claude Code SDK, so you can build your own agents and applications using the same core agent as Claude Code.

anthropic, ai, claude, model • 2025-05-23 • 9:17am

Google releases Gemini 2.5 Flash allowing developers to switch between reasoning/non-reasoning at the API level.

$0.15/1mil tokens input
$0.60/1mil tokens output (no reasoning)
$3.50/1mil tokens output (reasoning)

gemini, ai, google, model • 2025-04-20 • 7:42pm

OpenAI has released o3 and o4-mini, their most powerful reasoning models. These models outperform their predecessors while delivering answers in under a minute. OpenAI also launched Codex CLI, an open-source terminal tool, with a $1 million fund to support related projects.

o3 $10/mil tokens input, $40/mil tokens output
o4-mini $1.1/mil tokens input, $4.4/mil tokens output

openai, ai, model • 2025-04-17 • 6:58pm

GPT 4.1 is announced as 3x developer-centric model releases available specifically via API. An internal instruction following eval was used to improve the model’s ability to follow instructions. Context size is 1m tokens. Pricing appears competitive if the new models have similar performance to Gemini/Sonnet:

gpt-4.1 $2/mil tokens input, $8/mil tokens output
gpt-4.1-mini $0.4/mil tokens input, $1.60/mil tokens output
gpt-4.1-nano $0.1/mil tokens input, $0.4/mil tokens output

GPT 4.5 will be deprecated over API in a few months

ai, openai, model • 2025-04-15 • 11:30am

Microsoft make their world model able capable of generating 10fps and release a Quake II inspired demo. John Carmack weighs in on the haters.

ai, model, microsoft • 2025-04-08 • 1:39pm

Meta releases first Llama 4 models:

Llama 4 Scout, 109B parameters, 10M context length
Llama 4 Maverick, 400B parameters, 1M context length

The above are distilled from an unreleased, still in training, larger model:

Llama 4 Behemoth, 2T parameters

ai, llm, model, llama, meta • 2025-04-06 • 9:12am

Gemini 2.5 Pro pricing is out. For prompts less than 200k tokens:

Input: $1.25/mil tokens
Output: $10/mil tokens

For prompts over 200k tokens:

Input: $2.50/mil tokens
Output: $15/mil tokens

Importantly this bumps the rate limit to at least 150RPM & 1,000RPD on Tier 1.

ai, llm, model, gemini, google • 2025-04-05 • 10:38am

Google Deepmind launch Gemini 2.5 Pro, their latest SOTA model, which debuts at #1 on the LLM Leaderboard. No pricing yet, though it’s available for free via Google AI Studio and OpenRouter.

ai, llm, model, gemini, google • 2025-03-26 • 8:30am

OpenAI release o1-pro and it costs $150 per million token input and $600 per million token output.

Currently, it’s only available to select developers — those who’ve spent at least $5 on OpenAI API services

ai, model, openai, llm • 2025-03-21 • 6:47am

open llama - A permissively licensed open source reproduction of Meta AI’s LLaMA with 3B, 7B and 13B models trained on RedPajama dataset and other sources, providing a drop-in replacement for LLaMA.

ai, llm, model • 2025-02-28 • 9:15am

Grok3 set to launch though after the “launch” it appears that:

Not all the models and related features of Grok 3 are available yet (some are in beta), but they began rolling out on Monday.

ai, llm, model, grok • 2025-02-18 • 11:14am

OpenAI o3-mini released.

This model continues our track record of driving down the cost of intelligence—reducing per-token pricing by 95% since launching GPT‑4—while maintaining top-tier reasoning capabilities.

ai, model, openai • 2025-02-01 • 9:06am

OpenAI announces new o3 model

the company unveiled o3, the successor to the o1 “reasoning” model it released earlier in the year. Neither o3 nor o3-mini are widely available yet, but safety researchers can sign up for a preview for o3-mini starting today.

ai, model, openai • 2024-12-23 • 10:28am

Gemini 2.0 Flash Thinking

a new experimental model that unlocks stronger reasoning capabilities and shows its thoughts.

ai, model, gemini • 2024-12-20 • 6:12am

Meta releases Llama 3.3

Llama 3.3 is a text-only 70B instruction-tuned model that provides enhanced performance

ai, model, llama, meta • 2024-12-17 • 9:49am

Introducing Stable Diffusion 3.5 - A nice surprise considering the flop of sd3, the emergence of flux models and the non-commercial license on flux-pro. That first image is next level considering the gimped sd3 (censored) and the prompt “woman lying in grass” drama

ai, stable-diffusion, model • 2024-10-23 • 6:28am

Claude 3.5 Sonnet (new)

Early customer feedback suggests the upgraded Claude 3.5 Sonnet represents a significant leap for AI-powered coding.

ai, model, anthropic, claude • 2024-10-23 • 6:18am

Nvidia releases a 72b multimodal LLM. The article claims it’s open source, but it appears to only have open weights and is otherwise commercially restricted.

ai, model, llm, nvidia • 2024-10-02 • 8:08pm

Introducing OpenAI o1-preview, a thinking/reasoning model.

As an early model, it doesn’t yet have many of the features that make ChatGPT useful, like browsing the web for information and uploading files and images. For many common cases GPT‑4o will be more capable in the near term.

ai, llm, model, openai • 2024-09-13 • 9:10am

FLUX dropped and it’s blows Stable Diffusion 3 out of the water, though has very high resource requirements. I’m running the schnell version locally. Prompt adherence is great, text capability is incredible.

ai, image-gen, flux, model • 2024-08-05 • 10:58am

Mistral announce Mistral Large 2

Mistral Large 2 has a 128k context window and supports dozens of languages

ai, llm, model, mistral • 2024-07-25 • 8:44am

Meta introduces Llama 3.1 including a 405B model. Zuck restates their commitment to open source. Models are up on hugging face, with 405b having a 200gb+ vram requirement.

ai, model, llama, meta • 2024-07-24 • 8:58am

SD3 weights dropped last night. I gave it a shot last night myself with their supplied comfyui workflows, as a base model it looks extremely promising, details are next level, though it still doesn’t appear to know jack about hands, faces still need hires fix. Very promising for a base model.

ai, image-gen, stable-diffusion, stability-ai, model • 2024-06-13 • 11:30am

Microsoft releases Phi-3 vision

a 4.2B parameter multimodal model with language and vision capabilities.

ai, llm, model, microsoft • 2024-05-22 • 8:38am

Hello GPT-4o

We’re announcing GPT‑4o, our new flagship model that can reason across audio, vision, and text in real time.

openai, llm, model, ai • 2024-05-14 • 8:22am

Introducing the next generation of Claude

The family includes three state-of-the-art models in ascending order of capability: Claude 3 Haiku, Claude 3 Sonnet, and Claude 3 Opus.

ai, llm, claude, anthropic, model • 2024-03-06 • 6:23pm

links#