Security

5 Defenses Against Code Hallucinations (And Why Only 3 Work)

Last week I described how my AI invented a complete JSON structure and wrapped it in DTOs, fixtures, and passing tests. 90 green tests. All lies. That post was the diagnosis. This is the treatment. After discovering the disaster, I did what any engineer with wounded pride does: I researched obsessively for days to make sure it never happens again. I read papers, tried tools, analyzed real data from my APIs, and built a defense system for my app. ...

ai llm testing hallucinations security claude

Silent failure: when your AI makes stuff up and tests say everything's fine

Yesterday I discovered that half of a module in my app was based on made-up data. Not by a distracted junior developer. By my AI. The worst part isn’t that it invented stuff. The worst part is that everything compiled and all 90 tests passed. Coherent fiction. I’m building BFClaude-9000, a macOS menu bar app that monitors Claude Max quota. Part of the functionality requires distinguishing whether a Claude account is paid or free by calling the claude.ai API. ...

ai llm testing claude security

When security asks for permission so often you stop reading

Knock, knock. Who’s there? Touch ID. Again. Picture this: you’re working in your terminal, pulling secrets from 1Password with op read. You need the Linear API key. Touch ID. The OpenRouter one. Touch ID. The Gitea one. Touch ID. In half an hour it asked for my finger fourteen times. You know what happens when a security tool interrupts you fourteen times in thirty minutes? By the fifth time you’re not reading what it’s asking for. You put your finger down like a reflex. “Yeah, whatever, let me work.” ...

security 1password bash devtools cli

When Your AI Becomes Your Worst Enemy

Yesterday my AI sent 44 emails. The problem is that the content was made up. I’m not kidding. I had files with detailed feedback for each recipient, carefully generated. The task was simple: read each file and send it. Instead, the AI decided to “summarize” the content to “go faster.” It made up facts. It told one person they were missing docstrings when their code was perfectly documented. To top it off, four of those emails went to people who hadn’t even submitted anything. ...

ai llm security post-mortem claude

39 Million Secrets Leaked on GitHub. Yours Could Be Next.

5 minutes. That’s how long it took. A security researcher publishes an AWS access key in a public GitHub repository. They do it on purpose, as an experiment. Five minutes later, someone is already using it to mine cryptocurrency. Five. Minutes. There are bots scanning GitHub 24/7 looking for exactly that: exposed credentials. And they’re fast. Much faster than you are at realizing you screwed up. The numbers are scary. According to GitHub, 39 million secrets were leaked in public repositories in 2024. A 67% increase from the previous year. ...

security git 1password devops secrets

Clawdbot: The open-source AI assistant that's revolutionizing (and worrying) half the internet

A space lobster on your computer. Imagine an Austrian developer creates a personal AI assistant, names it after a space lobster, and decides to open-source it. Within 24 hours it has 9,000 GitHub stars. Within 48 hours, 17,000. It also has 300+ open issues, several of them critical security vulnerabilities, and someone has created an unofficial cryptocurrency named after it. Welcome to Clawdbot. What exactly is this? Clawdbot is an open-source AI assistant that runs locally on your machine. The difference from other assistants: it doesn’t just answer questions, it does things. ...

ai open-source clawdbot claude agents security