Luma AI just released Ray 2, their latest video AI model, and the quality matches Google's Veo 2. I've been hunting for examples, and while Veo 2 still has the edge in video length and resolution options, Ray 2 is fast and will be more accessible.
The model stands out for a few key reasons:
First, it handles both text and image inputs, while Veo 2 is limited to text-to-video (or text-to-image-to-video) right now. That means you can start from an existing image of your own and extend it into video, opening up different creative possibilities.
Second, the multimodal transformer architecture shows impressive understanding of how objects and people should move and interact. The motion looks natural rather than the jerky or distorted movement common in other AI video models.
Third, Ray 2 is actually available now through Amazon Bedrock, while Veo 2 remains waitlist-only. Luma AI has also shown strong customer service: they just issued full refunds to users affected by a recent Photon API issue, demonstrating good accountability.
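If you want to kick the tires on Bedrock yourself, here's a minimal sketch of what an invocation could look like. The `start_async_invoke`/`get_async_invoke` calls are real boto3 methods (Bedrock runs video models as async jobs that write to S3), but the model ID `luma.ray-v2:0`, the region, the prompt payload keys, and the bucket name are my assumptions; check the current Ray 2 request schema in the Bedrock docs before relying on this.

```python
import time

import boto3

# Bedrock video models run asynchronously: start a job, then poll it.
client = boto3.client("bedrock-runtime", region_name="us-west-2")

# Kick off a Ray 2 generation job; the finished video lands in your S3 bucket.
job = client.start_async_invoke(
    modelId="luma.ray-v2:0",  # assumed model ID; verify in the Bedrock catalog
    modelInput={
        # Payload keys are assumptions; Ray 2's Bedrock schema may differ.
        "prompt": "A slow dolly shot through a neon-lit street at night",
    },
    outputDataConfig={
        "s3OutputDataConfig": {"s3Uri": "s3://your-bucket/ray2-output/"}
    },
)

# Poll until the job finishes; the video is then waiting at the S3 URI above.
arn = job["invocationArn"]
while True:
    status = client.get_async_invoke(invocationArn=arn)["status"]
    if status != "InProgress":
        print(f"Job finished with status: {status}")
        break
    time.sleep(10)
```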
The main limitations right now are the 5-10 second video length cap and the 1080p resolution ceiling. But for quick social media content, product demos, and creative experiments, Ray 2 delivers professional-quality results fast.
I’ll keep testing and comparing as both models develop. But right now, Ray 2 is the most accessible option for high-end AI video generation, with quality rivaling Google’s offering.
For more AI analysis and testing, check out my recent post on NVIDIA’s COSMOS model for physical AI simulation: https://adam.holter.com/nvidia-cosmos-a-7b-model-built-for-training-physical-ai/