ai
companies
- OpenAI - ChatGPT
- Stability AI - Stable Diffusion
- Meta AI - Llama
- Anthropic - Claude
- ElevenLabs - Voice synthesis
- Mistral AI
links
Claude now has web search, but for now it’s only available in feature preview for paid Claude users in the United States; support for users on the free plan and more countries is coming soon.
OpenAI releases o1-pro, and it costs $150 per million input tokens and $600 per million output tokens.
Currently, it’s only available to select developers — those who’ve spent at least $5 on OpenAI API services
A rules-based pattern is emerging for helping agentic workflows produce better results. Examples include GreatScottyMac’s RooFlow and Geoff Huntley’s specs and stdlib approaches.
Brendan Humphrey on Vibe Coding aligns with my own thinking:
…these tools must be carefully supervised by skilled engineers, particularly for production tasks. Engineers need to guide, assess, correct, and ultimately own the output as if they had written every line themselves.
Smashing “Create PR” with vibe-coding output amounts to an attack on the PR process:
Generating vast amounts of code from single prompts effectively DoS attacks reviewers, overwhelming their capacity for meaningful assessment
But there is still some value:
Currently we see one narrow use case where vibe coding is exciting: spikes, proofs of concept, and prototypes. These are always throwaway code. LLM-assisted generation offers enormous value in rapidly testing and validating ideas with implementations we will ultimately discard.
Eugene Yan’s blog - Senior Applied Scientist at Amazon
Simon Willison’s blog - AI researcher, independent open source developer, co-creator of the Django Web Framework
Hamel Husain’s blog - independent AI consultant
Evalite - a vitest-based eval runner by Matt Pocock.
Introducing GPT-4.5 - hallucinations down, accuracy up, non-reasoning. Rolling out to Pro + API. Doesn’t look like anyone will be coding with it any time soon with this type of API pricing (quick back-of-envelope after the price list):
Input: $75.00 / 1M tokens
Cached input: $37.50 / 1M tokens
Output: $150.00 / 1M tokens
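To put those numbers in context, a quick back-of-envelope sketch; the 100k-input / 10k-output token session is an assumed workload, not a measurement:

```python
# Hypothetical single coding session at the GPT-4.5 rates above.
# Token counts are assumptions for illustration only.
input_tokens, output_tokens = 100_000, 10_000
input_cost = input_tokens / 1_000_000 * 75.00     # $7.50
output_cost = output_tokens / 1_000_000 * 150.00  # $1.50
print(f"~${input_cost + output_cost:.2f} per session")  # ~$9.00
```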
And then in a tweet from sama:
this isn’t a reasoning model and won’t crush benchmarks. it’s a different kind of intelligence and there’s a magic to it i haven’t felt before. really excited for people to try it!
What We’ve Learned From A Year of Building with LLMs is a huge overview of findings from building LLM applications, from:
starting with prompting when prototyping new applications
all the way through:
what is a completely infeasible floor demo or research paper today will become a premium feature in a few years and then a commodity shortly after
Emerging Patterns in Building GenAI Products - a look at a number of different GenAI patterns across evals, embeddings, RAG, guardrails, and fine-tuning.
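To make the embeddings/RAG piece of that list concrete, here is a minimal retrieval sketch; the embedding model and cosine-similarity scoring are my assumptions, not recommendations from the article:

```python
# Minimal embed-and-retrieve step behind RAG: embed documents once, embed the
# query, pick the most similar document, and feed it into the prompt as context.
import numpy as np
from sentence_transformers import SentenceTransformer  # assumed embedding library

model = SentenceTransformer("all-MiniLM-L6-v2")
docs = [
    "Claude 3.7 Sonnet is a hybrid reasoning model.",
    "FLUX.1 schnell is a fast, locally runnable image model.",
]
doc_vecs = model.encode(docs, normalize_embeddings=True)

query_vec = model.encode(["which model does reasoning?"], normalize_embeddings=True)
scores = doc_vecs @ query_vec.T          # cosine similarity on unit vectors
context = docs[int(np.argmax(scores))]   # passage to stuff into the prompt
print(context)
```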
Fuck you, show me the prompt is an investigation into extracting the actual prompt that is sent to a model by LLM abstraction libraries.
There are many libraries that aim to make the output of your LLMs better by re-writing or constructing the prompt for you. The prompts sent by these tools to the LLM is a natural language description of what these tools are doing, and is the fastest way to understand how they work.
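A proxy-interception sketch along those lines, using mitmproxy; the OpenAI host filter and the focus on the "messages" field are my assumptions about a typical setup:

```python
# prompt_logger.py - print the JSON body of anything a library sends to the
# OpenAI API, so you can read the prompt it actually constructed.
import json
from mitmproxy import http

def request(flow: http.HTTPFlow) -> None:
    if "api.openai.com" in flow.request.pretty_host:
        try:
            body = json.loads(flow.request.get_text())
        except (json.JSONDecodeError, TypeError):
            return
        # Chat-completion payloads carry the prompt in "messages"; fall back to
        # the whole body for other endpoints.
        print(json.dumps(body.get("messages", body), indent=2))
```

Run it with `mitmproxy -s prompt_logger.py` and point the library at the proxy (e.g. `HTTPS_PROXY=http://localhost:8080`, with mitmproxy’s CA certificate trusted).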
The Novice’s LLM Training Guide - a look at fine-tuning LLMs using Low-Rank Adaptation (LoRA)
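A minimal sketch of what that looks like with Hugging Face peft + transformers; the base model, rank, and target modules below are illustrative assumptions, not the guide’s recipe:

```python
# LoRA freezes the base weights and trains small low-rank adapter matrices
# injected into the attention projections, so only a tiny fraction of
# parameters is updated.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")  # hypothetical base

config = LoraConfig(
    r=8,                                  # adapter rank
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # which projections get adapters
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of total params
```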
Claude 3.7 Sonnet and Claude Code - a hybrid reasoning model, same price as Claude 3.5, improved accuracy. Also a terminal-based agentic coding tool, though it requires an API key.
ChatGPT Deep Research hallucinates
it claimed again to produce a complete dataset but in fact only produced ~7 lines, with a placeholder for the other ~3000.
Grok 3 set to launch, though after the “launch” it appears that:
Not all the models and related features of Grok 3 are available yet (some are in beta), but they began rolling out on Monday.
Introducing Perplexity Deep Research - Perplexity undercuts OpenAI by releasing their own Deep Research, for free.
Building a SNAP LLM eval - the first write-up in a series about our process of building an “eval” — evaluation — to assess how well AI models perform on prompts
Your AI product needs evals - How to construct domain-specific LLM evaluation systems to improve AI by iterating quickly.
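A minimal sketch of the kind of eval loop the article argues for; the cases, model call, and scoring rule are placeholders, not the author’s:

```python
# Domain-specific eval: run a fixed set of prompts through the model and score
# the outputs with cheap assertion-style checks (graduate to LLM-as-judge later).
from dataclasses import dataclass

@dataclass
class Case:
    prompt: str
    expected: str

CASES = [
    Case("What is the capital of France?", "Paris"),
    Case("Extract the animal from: 'The cat sat on the mat.'", "cat"),
]

def call_model(prompt: str) -> str:
    raise NotImplementedError("plug in your model client here")

def score(output: str, expected: str) -> bool:
    return expected.lower() in output.lower()

def run_eval() -> float:
    passed = [score(call_model(c.prompt), c.expected) for c in CASES]
    return sum(passed) / len(passed)  # pass rate to track across iterations
```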
OpenAI roadmap update for GPT-4.5 and GPT-5 from sama, which indicates that the model they’ve been cooking for some time can no longer be considered GPT-5.
We will next ship GPT-4.5, the model we called Orion internally, as our last non-chain-of-thought model.
Convert a Figma design to code - After theoretically setting up Claude to read/write from Jira & GitHub, I remarked that my only job left would be to copy a screenshot from Figma into the prompt and ask it to build the UI, but it looks like that can be integrated too.
DeepSeek vs Claude PR Reviews - also demonstrates the value of being able to quickly switch between models.
The End of Programming as We Know It is another argument against AI replacing programmers and for AI extending programmer capability.
OpenAI’s Deep Research: Novel User Applications and Community Insights - I prompted Deep Research to research itself
Investigate latest community news of OpenAI’s Deep research function and what novel approaches people are finding it useful for.
LLM Cost Analysis 2023-2026 - I asked ChatGPT Deep Research to generate a report that investigates the $/million tokens over time across providers and predicts the price of tokens in 2026
Prepare a report that investigates the cost per million token of LLMs since 2023, with estimations on what the cost will be in 2026.
GitHub Copilot: The agent awakens - Just when you thought it was Cursor/Claude Desktop/Roo/Cline, m$ reminds you they’ll eat your lunch.
Open-source DeepResearch – Freeing our search agents - after the release of OpenAI’s Deep Research, Hugging Face deliver an open source alternative in 24 hours.
Getting AI-powered features past the post-MVP slump
The non-negotiable first step in systematically improving your AI systems is establishing a solid feedback loop.
Beyond the AI MVP: What it really takes
almost no one is talking about how to integrate this stuff into a normal software development lifecycle. There’s a reason no one is talking about this: it’s because most teams, even those at billion-dollar companies, just haven’t built this yet.
OpenAI o3-mini released.
This model continues our track record of driving down the cost of intelligence—reducing per-token pricing by 95% since launching GPT‑4—while maintaining top-tier reasoning capabilities.
DeepSeek hit with ‘large-scale’ cyber-attack after AI chatbot tops app stores - Attack forces Chinese company to temporarily limit registrations as app becomes highest rated free app in US.
I noted I could access the chat signup page after a few refreshes, but the API signup was constantly throwing 500s.
On DeepSeek and Export Controls - Anthropic’s CEO shares his take on DeepSeek: it’s not as good as everyone says it is, but China needs to be further restricted from chips anyway.
New image model family: Janus-Pro - DeepSeek’s creators just dropped a Stable Diffusion competitor.
Janus-Pro, which DeepSeek describes as a “novel autoregressive framework,” can both analyze and create new images… [and] most Janus-Pro models can only analyze small images with a resolution of up to 384 x 384.
This course will teach you about natural language processing (NLP) using libraries from the Hugging Face ecosystem
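As a taste of the ecosystem the course covers, a hello-world with transformers; the pipeline pulls a default sentiment model, so treat the exact output as illustrative:

```python
# One-liner inference with a Hugging Face pipeline.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
print(classifier("I can't believe the weights are open!"))
# e.g. [{'label': 'POSITIVE', 'score': 0.99}]
```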
The Short Case for Nvidia Stock - a 60-minute read, but also a very good overview of the current state of AI, including Cerebras and DeepSeek.
Why everyone in AI is freaking out about DeepSeek
The open-source availability of DeepSeek-R1, its high performance, and the fact that it seemingly “came out of nowhere” to challenge the former leader of generative AI, has sent shockwaves throughout Silicon Valley and far beyond
Ignore the Grifters - AI Isn’t Going to Kill the Software Industry
It’s highly unlikely that software developers are going away any time soon. The job is definitely going to change, but I think there are going to be even more opportunities for software developers to make a comfortable living making cool stuff.
the company unveiled o3, the successor to the o1 “reasoning” model it released earlier in the year. Neither o3 nor o3-mini are widely available yet, but safety researchers can sign up for a preview for o3-mini starting today.
a new experimental model that unlocks stronger reasoning capabilities and shows its thoughts.
Llama 3.3 is a text-only 70B instruction-tuned model that provides enhanced performance
Elon Musk wanted an OpenAI for-profit
in 2017, Elon not only wanted, but actually created, a for-profit as OpenAI’s proposed new structure. When he didn’t get majority equity and full control, he walked away and told us we would fail.
Cerebras Now The Fastest LLM Inference Processor; It’s Not Even Close
To put it into perspective, Cerebras ran the 405B model nearly twice as fast as the fastest GPU cloud ran the 1B model. Twice the speed on a model that is two orders of magnitude more complex.
OpenAI and others seek new path to smarter AI as current methods hit limitations
Ilya Sutskever, co-founder of AI labs Safe Superintelligence (SSI) and OpenAI, told Reuters recently that results from scaling up pre-training - the phase of training an AI model that uses a vast amount of unlabeled data to understand language patterns and structures - have plateaued.
Then from Yann LeCun:
I don’t wanna say “I told you so”, but I told you so.
Also, from Gary Marcus:
Yann LeCun is absolute conniving thief
Introducing Stable Diffusion 3.5 - A nice surprise considering the flop of SD3, the emergence of Flux models, and the non-commercial license on Flux Pro. That first image is next level considering the gimped (censored) SD3 and the “woman lying in grass” prompt drama.
Early customer feedback suggests the upgraded Claude 3.5 Sonnet represents a significant leap for AI-powered coding.
Legacy Modernization meets GenAI - I am constantly pondering when and how AI will help me understand, maintain and/or uplift an existing codebase, and here’s an article on the subject, tho the TL;DR is: keep waiting.
Nvidia releases a 72B multimodal LLM. The article claims it’s open source, but it appears to only have open weights and is otherwise commercially restricted.
OpenAI to remove non-profit control and give Sam Altman equity
The OpenAI non-profit will continue to exist and own a minority stake in the for-profit company
Mira Murati, the CTO of OpenAI, steps down.
An open letter to European policymakers requesting improvement to AI regulations in the region.
The EU’s ability to compete with the rest of the world on AI and reap the benefits of open-source models rests on its single market and shared regulatory rulebook.
Zuck & Yann LeCun included as signatories.
Introducing OpenAI o1-preview, a thinking/reasoning model.
As an early model, it doesn’t yet have many of the features that make ChatGPT useful, like browsing the web for information and uploading files and images. For many common cases GPT‑4o will be more capable in the near term.
FLUX dropped and it blows Stable Diffusion 3 out of the water, though it has very high resource requirements. I’m running the schnell version locally. Prompt adherence is great, and text capability is incredible.
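For reference, a minimal local FLUX.1 [schnell] run via diffusers; the model id, offloading, and step count are assumptions about a typical setup rather than my exact one:

```python
# FLUX.1 [schnell] is distilled for very few denoising steps and ignores
# classifier-free guidance, hence the settings below.
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()  # trades speed for fitting in limited VRAM

image = pipe(
    "a corgi reading a newspaper, studio lighting",
    num_inference_steps=4,
    guidance_scale=0.0,
).images[0]
image.save("flux-schnell.png")
```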
Mistral announce Mistral Large 2
Mistral Large 2 has a 128k context window and supports dozens of languages
Meta introduces Llama 3.1, including a 405B model. Zuck restates their commitment to open source. Models are up on Hugging Face, with the 405B having a 200GB+ VRAM requirement.
Stability AI holds on, appointing a new CEO.
SD3 weights dropped last night. I gave it a shot myself with the supplied ComfyUI workflows; details are next level, though it still doesn’t appear to know jack about hands, and faces still need a hires fix. Very promising for a base model.
xAI gets a Series B funding round of $6 billion
Prompting Fundamentals and How to Apply them Effectively has some really good prompting guidance.
Yann LeCun says LLMs won’t achieve AGI.
I pondered whether LLMs would be any good at solving the Vehicle Routing Problem - thankfully I don’t need to investigate, as arxiv.org once again delivers. TL;DR - yes, as long as you’re happy with them being wrong 30-40% of the time.
Microsoft releases Phi-3 vision
a 4.2B parameter multimodal model with language and vision capabilities.
Scarlett Johansson issues a statement on OpenAI and OpenAI posts about How the voices for ChatGPT were chosen.
I’ve been running koboldcpp in WSL, but the Tcl/Tk UI is tiny. This looks interesting tho, and it’s already in a container.
Yann LeCun reminding us there is no such thing as a rogue super intelligence.
Doomers have lost the AI fight
When Ilya Sutskever left OpenAI this week, the firm lost its last influential leader known to question CEO Sam Altman’s push to deploy AI fast.
Marc Andreessen on navigating a model’s latent space via prompting.
In the first quarter of 2024, Stability AI generated less than $5 million in revenue and lost more than $30 million, the report said, adding that the company currently owes close to $100 million in outstanding bills to cloud computing providers and others.
glif lets you package your ComfyUI workflow into an app with no code.
Sam Altman on Ilya leaving OpenAI
We’re announcing GPT‑4o, our new flagship model that can reason across audio, vision, and text in real time.
Mark Zuckerberg - Llama 3, $10B Models, Caesar Augustus, & 1 GW Datacenters - fascinating interview - the least robot-like I’ve ever seen Zuck; he’s getting that billion-dollar media training. Highlights include:
- They got the edge in the GPU race because in 2022 they realised they were short on GPUs for training their Reels recommendation system, so they purchased double what they needed.
- They foresee the bottleneck being energy production (not chips), both in the (regulatory) time and in the tech required to produce enough energy to power the chips
- They have their own chips now, so they can lessen their reliance on more expensive Nvidia chips - they won’t train Llama 4 on their own silicon but might train Llama 5
Amazon pours additional $2.75bn into AI startup Anthropic
Extra financing will bring technology giant’s total investment in OpenAI rival to $4bn
Stability AI CEO resigns because you’re ‘not going to beat centralized AI with more centralized AI’
Stability AI, which has lost more than half a dozen key talent in recent quarters, said Mostaque is stepping down to pursue decentralized AI.
Microsoft CEO on owning OpenAI, from Elon vs OpenAI lawsuit
Microsoft’s CEO boasted that it would not matter if OpenAI disappeared tomorrow. He explained that “we have all the IP rights and all the capability. We have the people, we have the compute, we have the data, we have everything. We are below them, above them, around them.”
Stable Diffusion 3: Research Paper
Stable Diffusion 3 outperforms state-of-the-art text-to-image generation systems such as DALL·E 3, Midjourney v6, and Ideogram v1 in typography and prompt adherence, based on human preference evaluations.
Today we’re excited to introduce Devin, the first AI software engineer.
Devin is an autonomous agent that solves engineering tasks through the use of its own shell, code editor, and web browser.
Introducing the next generation of Claude
The family includes three state-of-the-art models in ascending order of capability: Claude 3 Haiku, Claude 3 Sonnet, and Claude 3 Opus.
The last 12 months have seen explosive growth in open source (Stable Diffusion, Llama, Hugging Face), open research, and open datasets, while the drive to censor/limit commercial/closed models makes them perform worse. Large companies appear ill-suited to this pace, such that there are predictions that open source will overtake closed-source offerings.
The AI Now Institute produces diagnosis and actionable policy research on artificial intelligence.
sounddraw - generate tracks with AI
GPT4All runs large language models privately on everyday desktops & laptops