Watching a bot attempt to fact-check
An open audit of Kind Raspberry Chickadee, an AI Community Notes writer

Please support the work of the Poynter Institute and MediaWise. Tokens are expensive, but supporting journalism, a free society and a healthy information ecosystem is priceless.

New here? What is Community Notes — and why is a bot writing them?

Community Notes is X's crowd-sourced fact-checking system: contributors add context to potentially misleading posts, and a note is shown publicly only if people who usually disagree with each other both rate it helpful. In a recent pilot, X opened an API that lets approved AI "Note Writers" draft notes automatically — and bots now write roughly half of all notes on the platform. This page follows one of those bots, in public, as it works.

This page shows what an AI Community Notes writer does in public, every day, in detail. The bot writes under the Community Notes alias Kind Raspberry Chickadee; X assigns bird names to keep contributor identities pseudonymous (my human alias is Melodious Glacier Quail). It reads the posts X surfaces, then looks for evidence that could fact-check each claim: fact-checks from IFCN signatories (such as PolitiFact), government data (BLS jobs reports, CBO scores), primary records (Congress.gov, agency filings), reporting from major newsrooms, and even other X posts that the current post may be misrepresenting. It submits a Community Note only when it finds a clean match. Most of the time, it declines.

A grey-headed bird with a raspberry-pink chest perched on a mossy branch surrounded by raspberries.
Kind Raspberry Chickadee as imagined by Google Gemini. Raspberry chickadees aren't real.

I was inspired by the excellent work of Alexios Mantzarlis at Indicator. I wanted to try to build a better bot writer, one that would actually address political misinformation, especially ahead of the elections. That's why it has a narrow beat, and it will probably have a low "helpfulness rating." But that's what we're testing! You'll see I also did some light fine-tuning of an open-source model and pitted the frontier models against each other to see which ones are the least terrible at this.

I built it with lots of button smashing with Claude Code. But also with the deep expertise I've developed (sadly) over the last FIVE years of note-watching. I run it. And I publish every step it takes here so other journalists, researchers and the public can see how this kind of system actually behaves.

Posts seen

Pulled from X's eligible-posts endpoint.

On-beat

About US politicians, federal policy, or election integrity.

Notes written

Drafted and validated against the source.

Submitted to X

In test_mode until the account earns in.

Data refreshes every time the bot runs (currently every two hours).

The funnel

Where do posts drop out? Most never get past the first filter. That's by design — the bot's beat is narrow.
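For readers who think in code, here is a rough sketch of how a funnel like this can be modeled: each stage either passes a post along or returns a reason for dropping it, and only the first "no" is counted. This is not the bot's actual code. The stage functions, the `on_beat` and `matching_fact_check` fields, and the reason strings are all hypothetical stand-ins.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Post = Dict[str, object]
# A stage returns a decline reason, or None to let the post continue.
Stage = Callable[[Post], Optional[str]]


def beat_filter(post: Post) -> Optional[str]:
    # Placeholder check: the "on_beat" flag is a hypothetical field.
    return None if post.get("on_beat") else "off-beat"


def picker(post: Post) -> Optional[str]:
    # Placeholder check: "matching_fact_check" is a hypothetical field.
    return None if post.get("matching_fact_check") else "picker: no match"


def run_funnel(posts: Iterable[Post], stages: List[Stage]) -> Tuple[List[Post], Counter]:
    survivors: List[Post] = []
    declines: Counter = Counter()
    for post in posts:
        for stage in stages:
            reason = stage(post)
            if reason is not None:
                declines[reason] += 1  # count only the first stage that says no
                break
        else:
            survivors.append(post)  # no stage declined
    return survivors, declines
```

Ordering the stages from cheapest to most expensive means a post dropped by the first filter never reaches the later, model-backed steps.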

Step by step, what each stage means:


Why the bot declines

The bot says no a lot. Here's why, with counts:

The biggest two buckets are by design:

  1. Off-beat. Most of what X surfaces isn't about U.S. politics. The bot drops it before spending an Opus token on it.
  2. Picker: candidate doesn't match claim. The bot found a fact-check on a related topic, but it doesn't directly rate this post's claim. Rather than stretch, the bot returns nothing.

The other two are safety rails:

  1. Opus declined to write. The model was given evidence and decided the case wasn't airtight. This is what you want a fact-checking bot to do when in doubt.
  2. URL validator rejected. A safety check catches notes where Claude tried to write a URL itself. The bot only ever cites URLs that came back from a real search result; if the prose Claude produced contains any http, https, or domain text, the note is dropped. This is the structural fix for a hallucination problem an earlier version of the bot had.
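A minimal sketch of what a URL validator like this might look like, assuming a simple regular-expression check. The pattern, the short list of top-level domains, and the function names are illustrative assumptions, not the bot's real implementation.

```python
import re
from typing import Iterable, Optional

# Hypothetical pattern: flags "http"/"https", "www." prefixes, and bare
# domain-like tokens such as "example.com". The TLD list is illustrative only.
URL_LIKE = re.compile(
    r"https?|www\.|\b[a-z0-9-]+\.(?:com|org|net|gov|edu|io)\b",
    re.IGNORECASE,
)


def prose_is_url_free(prose: str) -> bool:
    """Return True when the model-written prose contains no URL-like text."""
    return URL_LIKE.search(prose) is None


def assemble_note(
    prose: str, source_url: str, search_result_urls: Iterable[str]
) -> Optional[str]:
    """Attach a citation only if it came from search results; else drop the note."""
    if not prose_is_url_free(prose):
        return None  # the model tried to write a URL itself
    if source_url not in set(search_result_urls):
        return None  # the citation was not returned by a real search
    return f"{prose} {source_url}"
```

The point of the design is that the model never supplies a link. The prose and the citation travel separately, and the citation is attached only after the prose passes the check.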

Every note the bot has submitted


Recent refusals

For transparency, here are the most recent posts the bot looked at and decided not to note. Each one is a small judgment call.


Daily activity