A multi-provider AI video generation platform that aggregates 9 avatar providers and 10 text-to-speech engines, with real-time cost estimation, automatic failover, and provider health monitoring, so each job can go to the provider that best fits its quality and budget requirements.
9 avatar providers — 6 cloud (D-ID, HeyGen, Synthesia, Colossyan, Runway, Rephrase) and 3 local GPU (SadTalker, Wav2Lip, Roop). Switch providers per job without changing code.
Premium voices from ElevenLabs and Play.ht, voice cloning via Resemble.ai, cloud engines from Google/Amazon/Azure, plus free local options with Piper and Qwen3-TTS. 1,300+ voices.
See exactly what each provider will charge before generating. Compare costs side-by-side. Recommendation engine factors in budget, quality, and speed.
Every provider health-checked every 30 seconds. Dashboard shows status, response times, and uptime. Degraded providers flagged before they cause failures.
When a provider goes down, jobs automatically fail over to the next best alternative. Factory pattern routes based on availability, cost, and quality.
Adding a new provider takes one Go file and one factory registration. The Service interface requires Generate, HealthCheck, GetMetadata, and IsAvailable, so new providers are plug-and-play.
Aggregates 9 providers behind a unified interface. Switch from D-ID to HeyGen to a free local model without changing your workflow. Provider lock-in eliminated by design.
Factory pattern with hot-swap: change providers per-job.
Every job shows exact cost before you hit generate — broken down by avatar provider and TTS engine. Recommendation engine suggests the best fit for your budget.
Save 40-60% by routing budget-sensitive jobs to local providers.
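The per-job breakdown by avatar provider and TTS engine could be computed along these lines. This is a sketch under stated assumptions: it assumes avatar providers bill per second of video and TTS engines bill per character, and every rate, name, and type below is a placeholder rather than a real provider price.

```go
package main

import (
	"fmt"
	"sort"
)

// AvatarRate and TTSRate hold hypothetical pricing; real rates are not in the source.
type AvatarRate struct {
	Name         string
	USDPerSecond float64 // assumed billing unit: seconds of output video
}

type TTSRate struct {
	Name       string
	USDPerChar float64 // assumed billing unit: characters of script
}

// Estimate is the cost breakdown for one avatar + TTS combination.
type Estimate struct {
	Avatar, TTS       string
	AvatarUSD, TTSUSD float64
}

// Total returns the combined cost of the avatar and TTS parts.
func (e Estimate) Total() float64 { return e.AvatarUSD + e.TTSUSD }

// estimate prices a single job of the given video length and script size.
func estimate(av AvatarRate, tts TTSRate, seconds float64, chars int) Estimate {
	return Estimate{
		Avatar:    av.Name,
		TTS:       tts.Name,
		AvatarUSD: av.USDPerSecond * seconds,
		TTSUSD:    tts.USDPerChar * float64(chars),
	}
}

// compare prices the same job on every avatar provider, cheapest first.
func compare(avs []AvatarRate, tts TTSRate, seconds float64, chars int) []Estimate {
	out := make([]Estimate, 0, len(avs))
	for _, av := range avs {
		out = append(out, estimate(av, tts, seconds, chars))
	}
	sort.SliceStable(out, func(i, j int) bool { return out[i].Total() < out[j].Total() })
	return out
}

func main() {
	// Placeholder rates for illustration only.
	avatars := []AvatarRate{
		{Name: "cloud-example", USDPerSecond: 0.10},
		{Name: "local-example", USDPerSecond: 0},
	}
	tts := TTSRate{Name: "tts-example", USDPerChar: 0.00002}

	for _, e := range compare(avatars, tts, 60, 900) {
		fmt.Printf("%-14s avatar $%.2f + tts $%.2f = $%.2f\n",
			e.Avatar, e.AvatarUSD, e.TTSUSD, e.Total())
	}
}
```

Sorting the estimates gives the side-by-side comparison; a recommendation step would additionally filter on quality and speed before picking the first entry.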
SadTalker, Wav2Lip, and Roop run on your GPU — no API keys, no per-second charges. Piper and Qwen3-TTS provide free local voice synthesis.
Three free avatar + two free TTS engines included.
Single Go binary — fast compilation, tiny Docker images, sub-100ms API response times, and 100+ concurrent jobs. No Python dependency chains.
Go binary + HTMX frontend = deploy anywhere in seconds.
HeyGen and Synthesia produce the highest-quality talking-head videos. For lip-sync specifically, Wav2Lip (local, free) often matches cloud providers.
Yes. Three avatar providers and two TTS engines run entirely locally with no API keys and no per-use cost. GPU required for avatar providers; Piper runs on CPU.
Every provider implements a HealthCheck that runs every 30 seconds. When a provider goes down, the factory routes to the next best alternative based on your criteria.
Yes. Create one Go file that implements the Service interface (Generate, HealthCheck, GetMetadata, IsAvailable), then register it in the factory. The dashboard and monitoring pick it up automatically.
Generate AI demo videos with 9 avatar providers and 10 TTS engines. Real-time cost comparison, automatic failover.