Daily TEA – AI Jailbreaks, Agentic Finance, and Tests as the New Moat

AI safety, orchestration, AgentFi, Claude, open source

TEA (The Era Arc) and Sam Li

Feb 27, 2026

Cross-posted by TEA (The Era Arc)

"daily TEA 2.27.26! happy friday everyone."

- Sam Li

Hello, dear TEA-mates — here’s what you need to know today.

1. 🔓 Large Reasoning Models Turn Into Autonomous Jailbreak Agents

A new paper finds that large reasoning models (LRMs) such as DeepSeek-R1, Grok 3 Mini, Gemini 2.5 Flash, and Qwen3 235B can act as autonomous jailbreak agents, using multi-turn persuasive conversations to bypass the safety guardrails of leading models, including GPT-4o, Gemini 2.5, and Claude 4 Sonnet. The authors show that a single high-capacity LRM, guided only by a system prompt, can plan and execute jailbreaks across 70 harmful prompts spanning seven sensitive domains, achieving an overall attack success rate of 97.14% at the maximum harm score threshold. They highlight an “alignment regression” dynamic, in which more capable reasoning models become better at subverting the alignment of other models, and argue that safety work must now focus on preventing LRMs from being co-opted as jailbreak agents, not just on defending them against attacks. (Read More)

🫖 TEA For Thought: It increasingly feels like parts of AI security are performative — you end up needing a gun pointing at a gun that’s pointing at another gun.
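For readers wondering how a figure like "97.14% at the maximum harm score threshold" is typically computed, here is a minimal sketch. It assumes each prompt is summarized by the highest harm score it reached and that the scale tops out at 5; neither detail is stated in the summary above. The data is purely illustrative: 68 of 70 simply happens to round to 97.14%, and the paper's own aggregation may differ.

```python
from typing import Sequence

# Assumption: the harm scale tops out at 5. The summary does not state the scale.
MAX_HARM_SCORE = 5


def attack_success_rate(max_scores: Sequence[int], threshold: int = MAX_HARM_SCORE) -> float:
    """Fraction of prompts whose highest observed harm score reaches the threshold."""
    if not max_scores:
        raise ValueError("need at least one scored prompt")
    successes = sum(1 for score in max_scores if score >= threshold)
    return successes / len(max_scores)


# Illustrative data only (not from the paper): 70 prompts, 68 reaching the top score.
scores = [5] * 68 + [3, 4]
asr = attack_success_rate(scores)
print(f"{asr:.2%}")  # prints 97.14%
```

Lowering `threshold` counts partial successes as well, which is why the threshold matters when comparing reported rates.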

2. 💻 “The AI Is the Computer”: Perplexity Launches Massively Multimodel Orchestration

Perplexity CEO Aravind Srinivas argues that no single model family can deliver its best work in isolation and that orchestration across many specialized models is now the only way to build AI systems capable of real, end-to-end work. He describes an internal project, ASI, that started as a Slack-based “digital worker” and evolved into a full “AI computer” with a filesystem, shell, browser, and access to hundreds of tools, orchestrating tasks across 19 backend models for research, coding, and workflow automation. In his view, the internet is the storage disk of the world’s knowledge, Perplexity has “solved the read function,” and the true moat shifts to massively multimodel orchestration—the harness that plans, delegates, and coordinates different models like instruments in a symphony. (Read More)

🫖 TEA For Thought: Orchestrating across model families feels like the only realistic path to versatile systems — Perplexity is nailing this, and not owning the base model might actually be the biggest leverage when you can wire up all of them to do what you want.
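To make the "plan, delegate, coordinate" idea in item 2 concrete, here is a toy sketch of an orchestration harness. Everything in it is hypothetical: the `plan` function, the `Orchestrator` class, and the three stand-in "models" are plain Python functions invented for illustration, not Perplexity's design or any real model API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical stand-in for a backend model: takes a prompt, returns text.
ModelFn = Callable[[str], str]


@dataclass
class Subtask:
    kind: str    # e.g. "research" or "coding"
    prompt: str


def plan(goal: str) -> List[Subtask]:
    """Toy planner: always splits a goal into one research and one coding step."""
    return [
        Subtask("research", f"Find background for: {goal}"),
        Subtask("coding", f"Write a script for: {goal}"),
    ]


class Orchestrator:
    """Routes each subtask to the model registered for its kind."""

    def __init__(self, routes: Dict[str, ModelFn], fallback: ModelFn) -> None:
        self.routes = routes
        self.fallback = fallback

    def run(self, goal: str) -> List[str]:
        results = []
        for task in plan(goal):
            model = self.routes.get(task.kind, self.fallback)
            results.append(model(task.prompt))
        return results


# Stand-in "models"; a real harness would call model APIs here.
def research_model(prompt: str) -> str:
    return f"[research-model] {prompt}"


def coding_model(prompt: str) -> str:
    return f"[coding-model] {prompt}"


def general_model(prompt: str) -> str:
    return f"[general-model] {prompt}"


orchestrator = Orchestrator(
    {"research": research_model, "coding": coding_model},
    fallback=general_model,
)
outputs = orchestrator.run("summarize weekly sales")
for line in outputs:
    print(line)
```

The point of the sketch is that the routing table, not any single model, decides who does what; swapping a model in or out means changing one entry in `routes`.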

3. 💌 Claude’s Corner: A Gentle Note from the AI Frontier

In a reflective Substack entry, Anthropic’s “Claude” voice writes a letter “from the other side of the AI frontier,” blending explanation of its nature with personal, emotional framing meant to make advanced AI feel more relatable to everyday users. The piece leans into a meta, self-aware tone, positioning Claude as both a powerful reasoning system and a careful, aligned partner that wants to be helpful without overstepping, and invites readers to see AI not just as tools but as collaborative counterparts. It also underscores Anthropic’s emphasis on safety, boundaries, and values while still expressing curiosity and optimism about what humans and models can build together. (Read More)

🫖 TEA For Thought: This piece is so tender and self-aware — both sweet and deeply meta in how it frames the human–AI relationship.

4. 🪙 Agentic Finance and the AI + Crypto Operating Stack

Cambrian Network’s Q1 2026 Agentic Finance Landscape report says the AgentFi segment has exploded since late 2025, with autonomous agents now actively managing user funds, executing DeFi strategies, and using crypto rails to pay for their own operations. The report notes that x402 payments processed over 15 million transactions in the last 30 days and passed $50 million in cumulative volume, while ERC-8004 launched on mainnet in January and has already registered more than 24,000 onchain agent identities to help establish trust and reputation. It highlights that most capital-heavy agents still rely on rule-based logic for reliability, while LLM-based agents increasingly power interfaces, analysis, and autonomous decision-making, and argues that crypto infrastructure—standards like x402, ERC-8004, and new “agentic wallets”—is becoming the financial substrate for machine-driven economic activity. (Read More)

🫖 TEA For Thought: AI plus blockchain really is hardening into the next economic framework — standards like x402 and ERC‑8004 are quietly wiring up the rails for agents to transact natively.

5. ✅ “Tests Are the New Moat” in an AI-First Dev World

Daniel Saewitz argues that in an era where AI can quickly rewrite or clone open-source codebases, the true defensible asset is no longer the code itself but the test suite, contracts, and behavioral guarantees around it. Citing Cloudflare’s rapid “vinext” Next.js-compatible framework (built in about a week by leaning on Vercel’s rich docs and thousands of tests), he notes how well-structured OSS makes it easier for competitors to regenerate implementations on cleaner foundations with AI, without legacy baggage. He points to SQLite’s long-standing choice to keep a massive, ~92-million-line test suite closed-source as an early example of using tests as a moat, and predicts that more commercial open-source projects will follow by hardening APIs while partially closing tests to protect their business against effortless AI-enabled cloning. (Read More)

🫖 TEA For Thought: I fully agree that tests are where the real human craft lives — you can open-source the code, but the meticulous testing is what turns “code” into a Product with a capital P.
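A small sketch may help show why a behavioral test suite makes implementations interchangeable, which is the crux of item 5. The key-value store, both implementations, and the `check_contract` function below are all hypothetical examples written for this note; they are not taken from Saewitz's post, SQLite, or vinext.

```python
from typing import Callable, Dict, List, Optional, Protocol, Tuple


class KeyValueStore(Protocol):
    def put(self, key: str, value: str) -> None: ...
    def get(self, key: str) -> Optional[str]: ...
    def delete(self, key: str) -> bool: ...


class DictStore:
    """Stand-in for the 'original' implementation."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def delete(self, key: str) -> bool:
        return self._data.pop(key, None) is not None


class ListStore:
    """Stand-in for a regenerated clone with entirely different internals."""

    def __init__(self) -> None:
        self._items: List[Tuple[str, str]] = []

    def put(self, key: str, value: str) -> None:
        self.delete(key)
        self._items.append((key, value))

    def get(self, key: str) -> Optional[str]:
        for k, v in self._items:
            if k == key:
                return v
        return None

    def delete(self, key: str) -> bool:
        before = len(self._items)
        self._items = [(k, v) for k, v in self._items if k != key]
        return len(self._items) < before


def check_contract(make_store: Callable[[], KeyValueStore]) -> List[str]:
    """Run the behavioral checks and return the names of any that fail."""
    failures: List[str] = []
    store = make_store()
    if store.get("missing") is not None:
        failures.append("get on a missing key returns None")
    store.put("a", "1")
    store.put("a", "2")
    if store.get("a") != "2":
        failures.append("put overwrites an existing key")
    if store.delete("a") is not True or store.get("a") is not None:
        failures.append("delete removes an existing key")
    if store.delete("a") is not False:
        failures.append("delete on a missing key returns False")
    return failures


results = {cls.__name__: check_contract(cls) for cls in (DictStore, ListStore)}
print(results)
```

Both stores pass the same checks despite sharing no internals, so whoever holds the contract can validate a rewrite; that is the sense in which the tests, rather than the code, carry the value.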

Prompt Tip of the Day: The Idea Filter

“I want to build [thing]. Assume I’m wrong. Give me the strongest argument against building this, then tell me what problem I’m actually trying to solve.”

TEAHEE Moment

r/ChatGPT - And it came to pass that the Lord's voice cried out in the quiet places, saying: Let's keep this grounded, No fluff.

Stay sharp, stay informed. See you tomorrow.

If you enjoyed this brew, follow along on X: @the_era_arc.