Hermes Agent Now Does Reference-Image Editing Using Your Codex Login

Hermes Agent now supports reference-image editing using your Codex or ChatGPT login. Drop in a source image plus up to 16 reference images and it transforms them directly. No separate API key needed.

Here is a feature I have been waiting for. Hermes Agent now supports reference-image editing using your Codex or ChatGPT login.

You drop in a source image, add up to 16 reference images for style or composition guidance, and Hermes edits the source directly. This is no longer just text-to-image. It is actual image-to-image editing.

Kshitij Kapoor, a Nous Research contributor, posted a demo of the feature on July 2. The response was immediate: 180+ likes and 14 replies in the first day, most of them from Hermes users who had been asking for this exact workflow.

What Hermes Agent Added

Before this, Hermes Agent could generate images from text prompts using the FAL.ai backend. That worked fine for one-off generation. But if you wanted to edit an existing image or use a reference for style guidance, you were stuck sending the image as a URL and hoping the provider handled it.

Now the flow is straightforward.

Put an image into Hermes and tell it what you want to change. Reference images get sent along with the request, and the backend (either your Codex/OpenAI connection or the native FAL pipeline) runs the edit. The result comes back as a generated image you can use immediately.
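To make that flow concrete, here is a minimal Python sketch of how a source image, a prompt, and a set of reference images could be bundled into one request. This is not Hermes' actual code. The EditRequest class, its field names, and the payload keys ("prompt", "image", "references") are all names I made up for illustration, and the base64 encoding is simply a common way to ship image bytes inside JSON.

```python
import base64
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class EditRequest:
    """Hypothetical container for one image-to-image edit (not a Hermes type)."""

    prompt: str
    source_image: bytes
    reference_images: list[bytes] = field(default_factory=list)

    def to_payload(self) -> dict:
        """Base64-encode the images so the payload can be serialized as JSON."""

        def encode(raw: bytes) -> str:
            return base64.b64encode(raw).decode("ascii")

        # The key names below are assumptions, not a documented provider schema.
        return {
            "prompt": self.prompt,
            "image": encode(self.source_image),
            "references": [encode(ref) for ref in self.reference_images],
        }


def load_edit_request(prompt: str, source: Path, references: list[Path]) -> EditRequest:
    """Read the source and reference files from disk into an EditRequest."""
    return EditRequest(
        prompt=prompt,
        source_image=source.read_bytes(),
        reference_images=[path.read_bytes() for path in references],
    )
```

In this sketch you would call load_edit_request with the edit instruction and your file paths, then hand to_payload() to whichever backend client you use. How the real tool serializes and forwards images is up to the Codex and FAL code paths.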

The feature is backed by two pull requests that have been in development since May: PR #21570 adds end-to-end reference image forwarding through both the Codex and FAL paths, and PR #35942 originally added the image-to-image capability to the image_generate tool.

No More Separate API Key

This is the part I like most. Previously, if you wanted image generation in Hermes, you needed a FAL.ai API key. Separate account, separate dashboard, separate billing.

Now your existing Codex or ChatGPT login doubles as your image editing backend. If you already have OpenAI credentials set up for Hermes (which most users do), you do not need anything else. The feature works with whatever image provider you already configured.

That matters for anyone running Hermes in production or as a daily driver. One less credential to manage, one less rate limit to watch, one less API bill to track.
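The credential story above boils down to a simple decision: use the login you already have, and only fall back to a separate key if you must. Here is a rough Python sketch of that logic. It is an assumption on my part, not how Hermes resolves providers internally. The function name, the parameter names, and the "Codex first, FAL second" priority order are all hypothetical.

```python
from typing import Optional


def pick_image_backend(codex_logged_in: bool, fal_api_key: Optional[str]) -> str:
    """Return the name of the backend to use for image editing.

    Hypothetical priority: prefer the existing Codex/ChatGPT login,
    and only fall back to FAL when a key for it is present.
    """
    if codex_logged_in:
        return "codex"
    if fal_api_key:
        return "fal"
    raise RuntimeError("no image backend configured: log in to Codex or set a FAL key")
```

The point of the sketch is the shape of the decision: with an existing login, the FAL key becomes optional rather than required.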

How Many References

Kapoor confirmed the tool supports up to 16 reference images in a single request. That is a lot of guidance for one generation. In my testing with the underlying image_generate tool, I found the practical sweet spot is around 3 to 5 reference images for style consistency, but the upper limit means you can batch a full mood board if your project calls for it.

The number of reference images matters because it changes what you can do with the feature. Three references give you style, color, and composition. Sixteen let you define an entire design language in a single edit.
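If you script your own batches, it is worth checking the reference count before you send anything. The sketch below enforces the 16-image cap mentioned above. The constant and function names are mine, and I am assuming the sensible behavior is to fail loudly rather than silently drop extra references. I have not verified what the real tool does when you exceed the limit.

```python
MAX_REFERENCE_IMAGES = 16  # cap reported for a single request


def check_reference_count(
    references: list[str], limit: int = MAX_REFERENCE_IMAGES
) -> list[str]:
    """Return a copy of the reference list, or raise if it exceeds the cap."""
    if len(references) > limit:
        raise ValueError(
            f"got {len(references)} reference images, but the limit is {limit}"
        )
    return list(references)
```

Raising an error here is a design choice: trimming the list quietly would change the result without telling you which references were ignored.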

The Security Piece

This feature landed alongside some serious security hardening. PR #57698, merged on July 3, consolidated the credential guards across all image providers (OpenAI, OpenRouter, xAI, and FAL) into a single shared chokepoint. The fix ensures that when Hermes reads a local image file to send to the provider, it cannot accidentally leak credential data from other files.

It is the kind of infrastructure work that does not show up in feature announcements but makes the feature safe to use. I flagged this in my coverage of the v0.18.0 Judgement Release. The team has been closing security issues aggressively, and this continues that pattern.
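To show what a single shared chokepoint for local image reads can look like, here is a minimal Python sketch. I have not read the PR's code, so treat every detail as my assumption: the function names, the allowed-directory rule, the extension allowlist, and the magic-byte check are illustrative techniques, not the actual Hermes implementation.

```python
from pathlib import Path

# Hypothetical allowlist of image extensions.
ALLOWED_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def looks_like_image(header: bytes) -> bool:
    """Check the leading bytes against the PNG, JPEG, and WebP signatures."""
    return (
        header.startswith(b"\x89PNG\r\n\x1a\n")
        or header.startswith(b"\xff\xd8\xff")
        or (header[:4] == b"RIFF" and header[8:12] == b"WEBP")
    )


def read_image_safely(path: str, allowed_root: str) -> bytes:
    """Read a local image only if it passes every guard; otherwise refuse.

    Every provider would call this one function instead of opening files
    itself, so the checks live in exactly one place.
    """
    root = Path(allowed_root).resolve()
    target = Path(path).resolve()  # resolves symlinks and ".." segments

    # Guard 1: the file must live under the allowed directory (Python 3.9+).
    if not target.is_relative_to(root):
        raise PermissionError(f"{target} is outside {root}")

    # Guard 2: the extension must be on the allowlist (this rejects ".env").
    if target.suffix.lower() not in ALLOWED_SUFFIXES:
        raise PermissionError(f"{target.name} does not have an image extension")

    # Guard 3: the content must actually start like an image file.
    data = target.read_bytes()
    if not looks_like_image(data[:12]):
        raise PermissionError(f"{target.name} does not look like an image")
    return data
```

The third guard is the one that matters for leaks: a credentials file renamed to end in .png still fails, because its bytes do not match any image signature.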

What This Means for Daily Use

For anyone running Hermes as their main AI interface, this fills a real gap. The previous text-only image generation was useful for quick visualizations, but it could not do the thing users actually need: take an existing image and make it better.

Now you can:

  • Pass a product photo and 4 reference images showing a specific aesthetic for consistent branding.
  • Drop in a wireframe screenshot with style references to generate polished UI mockups.
  • Use an initial AI generation as the source image for a second pass with more specific reference guidance.

I will note the same caveat that applies to any AI image tool: the quality depends on the backend model. With the Codex/OpenAI path, you get DALL-E 3 generation quality. With FAL, you get FLUX 2 Klein 9B, which has different strengths in realism vs. stylization. Your mileage will vary by provider.

Bottom Line

Hermes Agent adding reference-image editing through your existing Codex or ChatGPT login is the kind of feature that turns a tool from something you try into something you keep. The credential consolidation, multi-provider support, and high reference count make it practical for real workflows, not just demos.

The feature is live now. If you use Hermes, your image tool already supports it. Give it a prompt with a source image attached and see what comes back.

Reviewed & Written By

Tony Simons

Independent tech reviewer and creator of Tony Reviews Things. 14 years of hands-on testing, software auditing, and workflow automation. I test the gear so you don't waste your money on junk.
