Lucy Edit is Decart's instruction-guided text-to-video editor. It takes a source clip, follows a natural-language prompt, and applies edits while keeping motion and composition. Community tests point to two options: a dev build at roughly $0.03 per second at 720p, and a Pro tier around $0.15–$0.30 per second for 480p–720p. The standout use today is clothing and character edits that hold up across frames. Right now, outputs are limited to 16:9.
What Lucy Edit actually does well
Instruction guidance is the core idea: you describe the change, and the model applies it. You do not need to hand-draw masks or fine-tune on your clip. The model tracks motion and tries to apply the change consistently across time.
- Clothing and accessory changes are the strongest path. Think outfit swaps, adding glasses or a hat, or replacing a jacket with something more formal.
- Character replacement is viable when the replacement fits the same rough scale and pose. Person-to-person swaps, or swaps to animals and stylized characters, can work.
- Object swaps can work when the item matches size and context. A mug swapped to a bottle, a phone swapped to a camera, etc.
- Global scene edits are possible, but push the model harder. Expect more identity drift when you change backgrounds or environments for the whole clip.
On prompt style, 20–30 descriptive words tend to be stable. Verbs like "change," "add," "replace," and "transform to" help. You are guiding a change, not rewriting the entire scene from scratch.
Pricing, tiers, and what to expect
There are two clear paths to try:
- Dev build: about $0.03 per second at 720p. Open-weight for non-commercial use, with 5000 free credits to get started. Good for trying ideas and tooling.
- Pro tier: about $0.15–$0.30 per second at 480p–720p. Paid API aimed at production use with better reliability and priority. Inputs are base64 video plus a text prompt; outputs are H.264 MP4.
Aspect ratio is currently fixed at 16:9 across deployments. If you need square or vertical outputs, you will be cropping or padding after the fact. For temporal consistency, 81-frame generations are a good default.
Dev is budget-friendly for tests. Pro is priced for production time savings.
Cost planning for real clip lengths
Here is the quick math for the short clip lengths you would typically edit for marketing or social; a small cost-calculator sketch follows the list:
- 10-second clip: Dev $0.30, Pro $1.50–$3.00
- 20-second clip: Dev $0.60, Pro $3.00–$6.00
- 30-second clip: Dev $0.90, Pro $4.50–$9.00
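If you are budgeting batches rather than single clips, the per-second rates are easy to script. Here is a minimal sketch using the community-reported rates quoted above; treat the numbers as placeholders, since pricing can change.

```python
# Rough cost estimator for Lucy Edit runs, using the per-second rates quoted
# in this post (community-reported; check current pricing before relying on it).
RATES = {
    "dev_720p": 0.03,   # $/second, Dev build at 720p
    "pro_low": 0.15,    # $/second, Pro tier, lower bound
    "pro_high": 0.30,   # $/second, Pro tier, upper bound
}

def estimate(seconds: float, variants: int = 1) -> dict:
    """Estimated cost per tier for a clip of `seconds`, run `variants` times."""
    return {tier: round(rate * seconds * variants, 2) for tier, rate in RATES.items()}

if __name__ == "__main__":
    # e.g. ten 20-second prompt variants on Dev, then one final 20-second pass on Pro
    print(estimate(20, variants=10))  # {'dev_720p': 6.0, 'pro_low': 30.0, 'pro_high': 60.0}
    print(estimate(20))               # {'dev_720p': 0.6, 'pro_low': 3.0, 'pro_high': 6.0}
```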
The pattern is clear: use Dev for prompt discovery and batch ideation. When you have the look locked, move the selected shots to Pro for production reliability.
Where Lucy Edit fits next to generators
If you need a fresh clip from an image or text, use a generator. If you need an edit that keeps the original motion, use an editor. That split matters. I covered Decart's Lucy-14B for fast image-to-video drafts here: Lucy-14B on Fal.ai: Ultra-Fast Image-to-Video for Drafts, Not Finals. Lucy-14B is about speed for quick ideation. Lucy Edit is about surgically changing a shot you already like without losing motion. If you produce social or marketing content in volume, you will likely use both patterns in the same pipeline.
If you want a pure generator at higher fidelity or vertical 1080p, see my notes on Google's launch here: Google Veo 3 Goes General Availability. Different tool, different job.
What the model is built on
Lucy Edit Dev runs on Wan2.2 5B with a high-compression VAE and a DiT stack. That gives you a compact latent space for the video and a diffusion transformer driving the edit. The Pro tier is exposed through a hosted API and returns MP4 with H.264, which fits standard video pipelines.
Dev is open-weight for non-commercial use. You can test locally where you have the right hardware, or call a hosted endpoint with the free credits to validate your workflow. The team plans Diffusers and ComfyUI support, local inference nodes, and fine-tuning scripts. The obvious target is letting teams wire this into their existing nodes and schedulers rather than waiting on a vendor UI.
Prompts, masks, and workflow notes
You can skip manual masks for the common cases. The model does its own tracking and applies the change through the clip. When you need precision, start by tightening the language before you reach for masks. The model tends to respond to direct verbs and concrete nouns.
Prompting cheat sheet
- Length: 20–30 words. Enough detail to anchor the edit without drifting.
- Verbs that work: change, add, replace, transform to.
- Be specific on garment type, color, material, and fit. Example: replace the red hoodie with a black leather jacket, slim fit, minimal shine. (The example prompts in this list are collected in a small sketch after it.)
- Character swaps: describe the new subject clearly. Example: replace the man with a golden retriever wearing a blue bandana, same position and size.
- Objects: match scale. Example: replace the ceramic mug with a tall stainless bottle, matte, same size.
- Global scenes: expect more identity drift. Keep a backup of the original for safety.
- Temporal settings: target 81 frames for consistent motion.
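If you script batches of edits, it helps to keep starting prompts per edit category in one place. Here is a minimal sketch that simply collects the example prompts from the cheat sheet above; the strings are plain natural language, and nothing about this structure is required by the model or API.

```python
# Example edit prompts, grouped by category, taken from the cheat sheet above.
# These are ordinary natural-language strings; there is no required schema.
EDIT_PROMPTS = {
    "wardrobe": "replace the red hoodie with a black leather jacket, slim fit, minimal shine",
    "character": "replace the man with a golden retriever wearing a blue bandana, same position and size",
    "object": "replace the ceramic mug with a tall stainless bottle, matte, same size",
    "background": "replace background with a soft-focus city street at dusk, warm lights, shallow depth of field",
}

def pick_prompt(category: str) -> str:
    """Look up a starting prompt for an edit category; tweak the wording per shot."""
    return EDIT_PROMPTS[category]
```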
Shot selection and setup that help
- Keep the subject reasonably centered with clear separation from the background. That reduces the chance of identity drift during wardrobe swaps.
- Avoid extreme motion blur. The model can preserve motion, but heavy blur makes the target ambiguous.
- Hold exposure steady. Rapid exposure swings complicate consistency across frames.
- For character replacement, match the incoming character to the source subject's scale and rough posture.
Strengths and limits to plan around
From public tests and docs, here is how to plan in practice:
- Best use case: wardrobe and character changes that must stick through motion. Short ads, social clips, creative mockups, A/B variants for outfits.
- Good fit: moderate object swaps where size and context match the source.
- Works, with caveats: whole-scene transformations. You can change the background, but identity may drift.
- Mixed: subtle color corrections on the same object. These can be hit or miss if you only change hue and leave everything else the same.
- Known quirk: add-on props may cling to the subject more than intended. Plan to refine prompts or try a slightly different composition.
- Framing: locked at 16:9 for now. If your workflow is vertical, plan to letterbox, crop, or run a separate post-process.
Dev vs Pro: when to pick each
Dev is the practical start for anyone building a pipeline or testing use cases. It is cheap, open-weight for non-commercial use, and you have enough free credits to see if your prompts and clips work. Pro is for paid production across teams, where a hosted API, priority, and stability matter. Here is a simple way to decide:
- Prototyping prompts, building a UI, or validating edit categories: use Dev.
- Delivering a client cut this week: start with Dev to lock the look, then run the final on Pro.
- Batch editing for social inventory: Pro if uptime and queue priority matter to your team.
On cost, the math is straightforward. At $0.03 per second, a 12-second edit on Dev is about $0.36. The same on Pro sits in the $1.80 to $3.60 range depending on resolution and tier. If you are pushing many variants, Dev is the draft engine. If you are approving a final, Pro is the steady path.
Integration notes
Lucy Edit Dev is available as open weights for non-commercial testing, and it is also hosted with a simple API. The Pro API expects base64-encoded video plus a text prompt, and returns an H.264 MP4. That means it fits into a typical encode pipeline, and you can keep output handling in your current stack. The model is already available through common inference platforms such as Fal.ai endpoints.
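To make that request/response shape concrete, here is a minimal sketch. The endpoint URL, JSON field names, and auth header are placeholders I am assuming for illustration, not the documented contract; check the docs of whichever host you use (Decart's API or a platform endpoint like Fal.ai) for the real schema.

```python
import base64

import requests  # third-party: pip install requests

# Hypothetical endpoint and field names -- stand-ins, not the documented API.
ENDPOINT = "https://example.com/v1/lucy-edit"  # placeholder URL
API_KEY = "YOUR_API_KEY"

def edit_clip(video_path: str, prompt: str, out_path: str) -> None:
    """Send a base64-encoded source clip plus a text prompt; save the returned MP4."""
    with open(video_path, "rb") as f:
        video_b64 = base64.b64encode(f.read()).decode("ascii")

    resp = requests.post(
        ENDPOINT,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"video": video_b64, "prompt": prompt},  # field names are assumptions
        timeout=600,
    )
    resp.raise_for_status()

    # The Pro tier returns H.264 MP4; this assumes the response body is the raw file.
    # Some hosts return JSON with a download URL instead -- adjust accordingly.
    with open(out_path, "wb") as f:
        f.write(resp.content)

edit_clip("source.mp4", "replace the red hoodie with a black leather jacket, slim fit", "edited.mp4")
```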
Once Diffusers and ComfyUI nodes land, local scheduling becomes straightforward for teams that prefer on-prem or hybrid setups. If you want a separate low-latency sandbox for motion ideas and personalized style training, see my notes on Krea's Realtime Sculpted Video. That is a different tool, but it pairs well with editing passes after you commit to a base take.
Workflow recipes
- Fast ideation to edit: generate a rough clip with an image-to-video model, shortlist takes, then run Lucy Edit for wardrobe consistency across the best shots. See the Lucy-14B writeup for draft speed: Lucy-14B.
- Live-action cleanup: take a well-shot base clip, run a few prompt variants to test outfits or character swaps, then pick a winner and run Pro for your final export.
- Vertical deliverables: accept the 16:9 output, then crop center-weighted to 9:16 (see the ffmpeg sketch after this list). If critical action is off-center, consider letterboxing and dynamic reframing in your NLE.
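For that vertical recipe, the simplest post-process is a centered crop with ffmpeg. A minimal sketch, assuming ffmpeg is installed and on your PATH; if the action sits off-center, adjust the crop offset or reframe in your NLE instead.

```python
import subprocess

def center_crop_9x16(src: str, dst: str) -> None:
    """Center-crop a 16:9 MP4 to 9:16 using ffmpeg's crop filter."""
    subprocess.run(
        [
            "ffmpeg", "-y", "-i", src,
            # Target width = height * 9/16, rounded down to an even number so the
            # H.264 encoder accepts it; crop is centered when no x:y offset is given.
            "-vf", "crop=trunc(ih*9/16/2)*2:ih",
            "-c:a", "copy",
            dst,
        ],
        check=True,
    )

center_crop_9x16("edited_16x9.mp4", "edited_9x16.mp4")
```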
Quality guardrails
- Run multiple seeds for the same prompt; pick the take with the least artifacting on edges and hair.
- Keep edits semantically close to the source. A leather jacket for a hoodie is stable. Turning a hoodie into a reflective armor suit is riskier.
- If an accessory sticks to the subject unnaturally, reformulate the instruction with position hints like near left ear, small, thin metal ring, low gloss.
- For global backgrounds, describe lighting and depth. Example: replace background with a soft-focus city street at dusk, warm lights, shallow depth of field.
Model design and why it matters for editors
Wan2.2 5B with a high-compression VAE and a DiT backbone is a sensible choice for an edit model. The VAE pushes frames into a compact latent space, so the edit is operating on compressed features rather than raw pixels. The DiT handles the temporal sequence, which is where naive editors usually fall apart. That is why the model preserves motion better than most inference-time edits. Keeping the subject identity while changing outfits is where the training signal shows up.
Licensing and where to start
Lucy Edit Dev ships under a Non-Commercial License v1.0. That fits academic work, weekend projects, or internal prototyping. Commercial work belongs on the Pro tier. If you want to kick the tires, the fastest path is the dev model card on Hugging Face: decart-ai/Lucy-Edit-Dev. From there, you can grab the weights for local runs or point a test harness at a hosted endpoint and try the 5000 free credits.
Bottom line
Lucy Edit is a practical text-to-video editor with clear strengths. It is strongest on wardrobe and character changes that keep motion intact. It is affordable in dev, priced for production in Pro, and it avoids the usual friction of masks and manual tracking. The main constraint is the fixed 16:9 output and the limits around global edits. If your workflow is heavy on variant creation and you already have well-shot base clips, this is worth slotting in immediately. If you need vertical, 1080p, or new shots from stills, pair it with a generator like Lucy-14B for drafts or a higher-fidelity option covered here: Veo 3 GA. Use the right tool for each step: generate ideas upstream, then use Lucy Edit to keep motion and identity while you change what matters.