Twitter/X

@dr_cintas:

The Gemini Omni API looks insane 🤯

Agents can finally take video as an input, reason over it, and hand back an edited scene where everything you didn’t touch stays exactly where it was.

Here are 3 things built with it👇

→ Landscaping proposal from a customer’s lot video. They submit footage, the agent renders the full transformation.
→ Animated professor that explains your dashboards after running the analysis.
→ 8-bit morning briefing. Calendar, email, and market intel turned into a sidescroller of the goals you’ll clear today.

Video was the input agents ignored. That just ended.

hyperagentpartner

Video

Hyperagent (@hyperagentapp)

We got early access to the Gemini Omni API

Google is calling this model "Nano Banana for video"

3 things we built with it👇

  1. Landscaping proposal from customer video. Customer submits a video of the current lot. Hyperagent designs and renders the transformation. Perfect realism, no surreal changes to the surroundings.

  2. Animated professor who explains your dashboards. Hyperagent runs analysis on a business question, then generates an explainer video to walk through the findings.

  3. 8-bit morning briefing. Hyperagent builds morning briefings based on calendar, email/chat, and market intel. Then generates a sidescrolling platformer video showing the goals you'll clear today.

Our take:
> Video has been a vastly underused input for agents. This is the first model to make video directly malleable to our agents.
> Get more imaginative with your outputs. How could a video artifact make the work more memorable or playful?

coming soon to Hyperagent

Video

— https://nitter.net/hyperagentapp/status/2067631028328419492#m