[Header image: the words "AI NEWS" in black sans serif with a large red X over them, on a white background]

AI Models Are Weirdly Bad at Knowing About AI: The VideoPoet Anecdote

Someone shared a link to VideoPoet, a Google Research project from December 2023. The response? “lol this one is old I remember it.” And then the real observation came out: people try to find AI news by asking AI, and the AI consistently recommends outdated stuff.

This is the actual story. Not VideoPoet itself. The meta-failure: modern AI systems are weirdly bad at staying current on their own ecosystem. GPT-5 thinking it’s based on GPT-4. Gemini not knowing what Nano Banana is. AI knows nothing about AI, and that’s strange when you think about it. You’d expect these companies to at least put information about their own models in the system prompt.

What VideoPoet Actually Was: An LLM for Video

Since someone’s AI assistant surfaced it as “news,” let’s cover what VideoPoet actually was. Google Research launched it in December 2023 as an LLM-style approach to zero-shot video generation, not diffusion-based like most other video models at the time. This was a significant technical move: it showed that the autoregressive transformer architecture that had been so successful for text could be adapted to generate video with high-fidelity motion.

The core idea is straightforward: take any autoregressive language model and turn it into a video and audio generator by feeding it discrete tokens instead of text. The team used MAGVIT-v2 as the video and image tokenizer and SoundStream as the audio tokenizer. All modalities (text, image, video, and audio) become sequences of tokens in one unified vocabulary that the LLM can predict. This single-model, multi-modal approach was designed to handle a wide range of tasks without needing separate models bolted together.
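The unified-vocabulary idea can be sketched in a few lines. This is a toy illustration, not VideoPoet's actual code: the vocabulary sizes and hash-free "tokenizers" are invented, and the real system uses MAGVIT-v2 and SoundStream to produce the modality-local token ids. The only mechanism shown is the one described above: each modality gets a disjoint slice of one shared vocabulary, so a single model can predict tokens for all of them.

```python
# Toy sketch of a unified multimodal vocabulary (all sizes invented).
# Each tokenizer emits integer codes in its own local range; offsets
# map those codes into disjoint slices of one shared vocabulary.

TEXT_VOCAB = 32_000      # assumed text vocabulary size
VIDEO_VOCAB = 262_144    # assumed video codebook size
AUDIO_VOCAB = 4_096      # assumed audio codebook size

# Disjoint offsets: text first, then video, then audio.
TEXT_OFFSET = 0
VIDEO_OFFSET = TEXT_VOCAB
AUDIO_OFFSET = TEXT_VOCAB + VIDEO_VOCAB

def to_unified(tokens, offset):
    """Shift modality-local token ids into the shared vocabulary."""
    return [t + offset for t in tokens]

def build_sequence(text_toks, video_toks, audio_toks):
    """Flatten all modalities into one sequence an LLM can predict."""
    return (
        to_unified(text_toks, TEXT_OFFSET)
        + to_unified(video_toks, VIDEO_OFFSET)
        + to_unified(audio_toks, AUDIO_OFFSET)
    )

# Every id is unique across modalities, so one output softmax over the
# unified vocabulary covers text, video, and audio at once.
seq = build_sequence([5, 17], [900, 901], [3])
```

The design choice this illustrates: because the id ranges never overlap, the model needs no special machinery to know which modality a token belongs to; the id itself encodes it.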

VideoPoet trains a single autoregressive LLM on a mixture of multimodal objectives, making it a multi-task tool right out of the box:

  • Text-to-video and Text-to-image
  • Image-to-video animation
  • Video frame continuation (to extend clips)
  • Video inpainting and outpainting (editing)
  • Video stylization (using text prompts to change visual style)
  • Video-to-audio synthesis

Because everything is just tokens, you can compose tasks for zero-shot behaviors like generating audio from text alone. The default output is a short, high-quality 2-second clip, but for longer videos it uses a simple autoregressive trick: condition on the last 1 second of video, predict the next 1 second, and loop. This method preserved subject identity across extensions better than prior work.
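The extension trick described above is a plain sliding-window loop. Here is a minimal sketch of that loop with a stand-in for the model: `generate_next_second` below is a hypothetical placeholder (the real model would sample video tokens conditioned on the context), and frames are represented as integers just to keep the example runnable.

```python
# Sketch of sliding-window video extension: generate a clip, then
# repeatedly condition on the last second and predict the next second.

def generate_next_second(context_frames):
    # Placeholder "model": derives new frames deterministically from
    # the conditioning window. A real model samples video tokens here.
    last = context_frames[-1]
    return [last + i + 1 for i in range(len(context_frames))]

def extend_video(initial_clip, fps, total_seconds):
    """Grow a clip one second at a time by conditioning on its tail."""
    frames = list(initial_clip)
    while len(frames) < total_seconds * fps:
        context = frames[-fps:]                       # last 1 second
        frames.extend(generate_next_second(context))  # next 1 second
    return frames[: total_seconds * fps]

# Start from a 1-second (8-frame) clip and extend it to 3 seconds.
clip = extend_video(initial_clip=list(range(8)), fps=8, total_seconds=3)
```

Note how the loop never re-reads the whole video: each step sees only the trailing window, which is what makes identity drift across extensions a real risk that the model has to handle.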

The Meta Problem: AI Still Doesn’t Know AI

VideoPoet is now 18 months old. The video generation space has moved on significantly. We’ve had Runway Gen-4.5, Kling, and multiple other major releases. Recommending VideoPoet as current news is like recommending GPT-3.5 as the latest from OpenAI.

The real issue is that AI models have a blind spot for AI news. This makes sense if you think about it from a training data perspective. Knowledge cutoffs mean models don’t know about recent releases. But the problem goes deeper than that. Models often don’t even know accurate information about products from their own companies. This is where the irony hits hardest: Google’s own model surfacing an outdated Google research project.

It feels like a solvable problem. If you’re OpenAI, put information about your current model lineup in the system prompt. If you’re Google, make sure Gemini knows what Google products exist. This isn’t rocket science. It’s basic product awareness. The fact that it remains a persistent issue suggests one of two things: either the labs haven’t prioritized keeping their models current on their own ecosystem, or the internal complexity of managing real-time data integration is worse than we think.
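To make the suggestion concrete, here is a toy sketch of what that baseline product awareness could look like baked into a system prompt. Everything here is invented for illustration: the model names, the structure, and the wording are not any vendor's actual configuration.

```python
# Toy sketch: injecting a current-model lineup and a staleness caveat
# into a system prompt. All model names below are invented examples.

from datetime import date

MODEL_LINEUP = {
    "current_flagship": "example-model-4",               # hypothetical
    "outdated": ["example-model-3", "example-model-2"],  # hypothetical
}

def build_system_prompt(today: date) -> str:
    """Compose a prompt fragment with lineup facts and a date caveat."""
    outdated = ", ".join(MODEL_LINEUP["outdated"])
    return (
        f"Today's date is {today.isoformat()}. "
        f"Your vendor's current flagship model is "
        f"{MODEL_LINEUP['current_flagship']}; {outdated} are outdated. "
        "When asked about AI news, note that your training data may be "
        "stale and recommend checking the date of any source."
    )

prompt = build_system_prompt(date(2025, 6, 1))
```

The point isn't the string formatting; it's that a few dozen tokens of maintained fact would prevent exactly the failure mode this post describes.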

ChatGPT’s tendency to surface outdated AI news is especially bad if you don’t have personalization hyper-tuned for that use case. Most users don’t. They ask a question, get an answer that sounds confident, and assume it’s current. This is a trust problem. If I ask ChatGPT about the latest developments in video generation and it tells me about a project from December 2023 without mentioning that it’s old, I’m getting actively misled. The model doesn’t know what it doesn’t know, and it doesn’t know to caveat its recommendations with “this might be outdated.”

The Necessity of Human Verification

For staying current on AI, you really can’t rely on AI itself. You need human-curated sources, newsletters, or following the right people on social media. The models are perpetually behind, and they don’t tell you that they’re behind. This limitation is a crucial lesson for anyone relying on LLMs for competitive intelligence or technical research.

This is a known limitation that the major labs haven’t prioritized fixing. They could. Real-time information retrieval exists. Web search integration exists. But the integration is often clunky, and models frequently default to their training data instead of searching for current information.

Until this gets fixed, treat AI recommendations about AI with skepticism. Check dates. Verify with other sources. And definitely don’t assume that because a model sounds confident about a recommendation, it’s actually current. The irony of AI being bad at AI is funny, but it’s also a real limitation that affects how useful these tools are for people trying to stay informed about the space. I’ve written before about the importance of good systems for AI content, and this meta-failure of AI-on-AI knowledge is another example of why you can’t just trust raw model outputs without verification.