Every model failed this benchmark

Plus, ✨Google's TurboQuant compresses AI memory 6x with zero accuracy loss, how to run tasks on your computer from your phone with Claude, and more!

Hola Decoder😎

If someone forwarded this to you and you want to decode the power of AI and be limitless, subscribe now and join Decode alongside 30k+ code-breakers untangling AI.

🧪 Every Frontier AI Model Just Scored Under 1% on a New Intelligence Test

The ARC Prize Foundation launched ARC-AGI-3, a benchmark that tests whether AI can learn on the fly by solving puzzles in unfamiliar environments. Every major model failed. Every human passed.

The Decode:

1. All Top Models Scored Near Zero - Gemini 3.1 Pro scored 0.37%. GPT-5.4 hit 0.26%. Claude Opus 4.6 managed 0.25%. Grok 4.2 scored 0%. Meanwhile, 100% of human testers solved every environment on their first attempt with no instructions.

2. This Isn't the Same as Previous Benchmarks - On ARC-AGI-2, these same models score between 65% and 85%. ARC-AGI-3 is fundamentally different. It uses 135 novel interactive environments in which agents must adapt in real time, not pattern-match from training data.

3. The Methodology Is Controversial - Scoring uses a squared efficiency penalty that critics say is designed to produce low numbers. Extended-thinking models were excluded. Some argue the benchmark is unfair by design.

ARC founder François Chollet's core argument: today's models only perform well when humans build scaffolding around them. If it's truly AGI, there should be no human in the loop. With OpenAI renaming its division "AGI Deployment" and a $2M Kaggle prize now live, this benchmark is a direct challenge to the industry's biggest claims. You can play it here
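
To see why critics say a squared efficiency penalty pushes scores down, here is a minimal sketch. The formula is an assumption for illustration, not the ARC Prize Foundation's published scoring rule: it takes the ratio of a hypothetical human baseline action count to the agent's action count, caps it at 1, and squares it. The function name and all numbers are made up.

```python
def squared_efficiency_score(human_actions: int, agent_actions: int, solved: bool) -> float:
    """Illustrative only: an assumed formula, NOT the official ARC-AGI-3 scoring."""
    if not solved or agent_actions <= 0:
        return 0.0
    # Efficiency is 1.0 when the agent matches (or beats) the human baseline.
    efficiency = min(1.0, human_actions / agent_actions)
    # Squaring punishes inefficiency much harder than a linear penalty would.
    return efficiency ** 2


# Hypothetical environment where a person needs 100 actions.
for agent_actions in (100, 200, 1000):
    score = squared_efficiency_score(human_actions=100, agent_actions=agent_actions, solved=True)
    print(f"{agent_actions} actions -> {score:.2%}")
```

Under this assumed rule, an agent that solves a puzzle but needs ten times as many actions as a person keeps only 1% of the credit, which is how solved tasks could still add up to a near-zero score.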

Together with Grapevine

📉 Consumers Are More Media-Savvy Than Ever. Here's What Still Converts.

Today's buyer knows the difference between a brand speaking for itself and a credible voice speaking independently. Ads from branded handles are easy to discount. 

A clinical pharmacist reviewing scrubs. A veterinarian recommending a supplement. A makeup artist breaking down an ingredient. That's harder to scroll past.

Grapevine works across some of the most trust-dependent categories, including GLP-1s (Futurhealth), telehealth (Alloy), finserv (Better), and DTC (Fabletics, Particle for Men, Arrae), precisely because expert creator voices and publisher advertorials move audiences that branded creative can't.

  • Just Food for Dogs scaled Grapevine assets from 15% to 45% of paid media in 6 months
  • Madison Reed unlocked 20% efficiencies over Target CPA and 50% higher LTV
  • Mathnasium cut Meta CPL by 33% in under 30 days

The brands winning right now are running both creator whitelisting and publisher advertorial whitelisting at the same shop, as one fully managed service. 

No platform juggling. No separate agency relationships. Brief to launch, handled.

👉 Book a free strategy call to plan your first campaign - no commitment required.

✨How to Run Tasks on Your Computer From Your Phone With Claude

Text a task. Leave your desk. Come back to finished work. Here's how to set it up.

Step 1: Enable Claude Computer - Download the Claude desktop app at claude.ai/download. Go to Settings > Desktop App > General, turn on Browser Use, then turn on Computer Use. A Pro ($20) or Max ($100) plan is required.

Step 2: Set up Dispatch - Open the Claude desktop app, click Dispatch in the left sidebar, and connect your phone (iOS or Android). This links your phone to your desktop session.

Step 3: Send a task from your phone - Text Claude exactly what you need. Be specific: which file, which app, which meeting. Vague instructions produce vague results.

The magic prompt: "Hey Claude, [do this task]. I'm away from my desk."

If Claude can't use a connector like Slack or Calendar, add: "Use my computer directly."

Step 4: Let Claude work - Claude opens apps, finds files, switches tools, and completes the task using your actual machine, not copies. It asks permission before touching anything new.

Step 5: Come back to finished work - Add "Text me when it's done" to get a notification. Review the output on your Mac when you're back.

Things worth trying: export a deck as PDF and attach it to a calendar invite, pull metrics into a weekly report, organize your Downloads folder, or update your calendar from Slack messages. You can try Claude Computer here.

⚡ Google's TurboQuant Compresses AI Memory 6x With Zero Accuracy Loss

Google Research introduced TurboQuant, an algorithm that shrinks the memory AI models use during long conversations by over 6x while losing almost no accuracy. It also speeds up processing by up to 8x on Nvidia H100 chips.

The Decode:

1. Why This Problem Matters - AI models keep a running log of every conversation, known as the key-value (KV) cache. As chats get longer, that storage balloons, slowing responses and driving up costs. TurboQuant compresses that log down to 3 bits per stored value without any retraining or fine-tuning.

2. It Scored Perfectly on the Hardest Tests - On needle-in-haystack benchmarks, which test whether a model can find one detail buried in massive text, TurboQuant lost zero accuracy. It also delivered up to 8x faster processing with no extra runtime cost.

3. It Beats Rivals in Search Too - TurboQuant outperformed existing methods in vector search, the technology behind semantic matching in search engines. It achieved better recall without the dataset-specific tuning that competitors rely on.

The paper, set for ICLR 2026 in April, has implications for anything running on large-scale vector infrastructure. Faster search, cheaper inference, and longer conversations without degradation. For Google-scale systems, this is foundational efficiency work.
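
To make the "storage balloons" point concrete, here is a rough back-of-the-envelope sketch of how the conversation log (the KV cache) grows with context length, and what storing each value in 3 bits instead of 16 would save. The model shape is hypothetical, not taken from any Google model, and the 16-bit baseline is an assumption since the precision being compared against isn't stated here, so the ratio it prints will not necessarily match the 6x figure. This is plain bit counting, not TurboQuant itself, and it ignores any overhead a real quantizer stores alongside the values.

```python
def kv_cache_bytes(tokens: int, layers: int, kv_heads: int, head_dim: int,
                   bits_per_value: float) -> float:
    """Approximate KV-cache size in bytes for one conversation."""
    # Factor of 2: every token stores both a key and a value vector per layer.
    values = 2 * layers * kv_heads * head_dim * tokens
    return values * bits_per_value / 8


# Hypothetical model shape, chosen only for illustration.
LAYERS, KV_HEADS, HEAD_DIM = 32, 8, 128

for tokens in (1_000, 32_000, 128_000):
    full = kv_cache_bytes(tokens, LAYERS, KV_HEADS, HEAD_DIM, bits_per_value=16)
    small = kv_cache_bytes(tokens, LAYERS, KV_HEADS, HEAD_DIM, bits_per_value=3)
    print(f"{tokens:>7} tokens: {full / 2**30:.2f} GiB at 16-bit, "
          f"{small / 2**30:.2f} GiB at 3-bit")
```

The cache grows linearly with the number of tokens, so any per-value saving is multiplied across the whole conversation.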

Together with Playbook Pro

Unlock Hidden Cash to Add Months of Runway

Growth isn’t about chasing new funding; it’s about making every dollar move faster.

Leading DTC brands have gained months of extra runway by simply reshaping how cash enters and exits their business.

Financial Alchemy is your practical guide packed with real operator insights and ready-to-use templates. Inside, you’ll learn how to:

💰 Flip your cash cycle so revenue arrives well before expenses. Real examples show brands securing payment weeks ahead of supplier deadlines.

📊 Stress-test growth plans with the Cash Flow Simulator to uncover an average of $80K in trapped working capital each quarter.

🧮 Speed payback with the Unit Economics Analyzer, cutting CAC recovery to as little as five weeks.

🤝 Negotiate supplier terms using the Supplier Toolkit, with tactics that added 12% operating margin in a single contract cycle.

Get your ebook and turn everyday cash flow into your most dependable growth engine.

🏆 Tools You Cannot Miss:

💡 Painkiller Ideas - Discover real pain points, validate them with AI research, and follow a proven playbook from idea to revenue.

🏠 Pedra - Instantly stage empty properties with AI in seconds to create realistic, buyer-ready visuals.

🔍 Parse - Deploy autonomous agents that audit your tools like Stripe and GitHub to uncover hidden risks your team missed.

🗺️ Funizy - Plan your perfect day in any city with personalized itineraries tailored to your style.

🎬 Plot Party - Generate high-quality storyboards with consistent characters, styles, and scenes in under five minutes.

🚀 Quick Hits

🔦 Kimberly-Clark, LG, and Burberry are not smarter than your team; they just stopped relying on tools that cannot hear. Syncly Social catches spoken brand mentions, untagged creator placements, and competitor trend signals inside video audio across TikTok, IG, and YouTube. See what you have been missing and get started free.

🏢 Meta is laying off hundreds of employees across multiple teams as it ramps up massive investments in AI infrastructure, shifting focus away from metaverse initiatives toward its long-term AI strategy.

🎬 Disney’s major bets on AI and the metaverse are facing setbacks, as its Sora partnership collapses and Epic’s metaverse plans stall, raising doubts about its future in emerging digital experiences.

🍎 Apple’s deal with Google gives it access to Gemini for training smaller, efficient AI models, allowing Apple to build optimized “student” models tailored for on-device performance.

🤖 Reddit will require accounts with suspicious, bot-like behavior to verify they’re human through methods like biometrics or passkeys, alongside new labels to identify registered automated profiles.

🔊 Mistral, a French AI company, launched Voxtral TTS, an open-source speech model that supports nine languages, enables realistic voice cloning from short samples, and delivers fast, real-time performance for enterprise voice applications.

🧩 Prompt of the Day

Form Field Reduction Strategy

Simplify forms to reduce friction, increase completion rates, and improve conversions.

Paste the prompt: Drop this into ChatGPT and fill in your form details.

Prompt to paste

Create a form field reduction strategy for [Insert signup or checkout flow]. Include:

Current Fields: [List all existing form fields]
Necessary vs Optional: [Mark each field as required or optional]
Reduction Plan: [Suggest fewer fields and simplifications for faster completion]
Use Case: Improve form completion rate, reduce friction, and increase conversions.

Thanks for Decoding with us🥳

Your feedback is the key to our code! Help us elevate your Decode experience by hitting reply and sharing your input on our content and style.

Keep deciphering the AI enigma, and we'll be back with more coded mysteries unraveled just for you!