OpenAI Beefs Up ChatGPT’s Image Generation Model

OpenAI launched a new image generation AI model on Tuesday, dubbed ChatGPT Images 2.0. This model can generate more than one image from a single prompt, like an entire study booklet, as well as output text, including in non-English languages like Chinese and Hindi. This release is available globally for ChatGPT and Codex users, with a more powerful version available for paying subscribers.

When any major AI company releases a new image model, it can revive interest and boost usage, especially if a meme-able trend takes hold on social media, with users transforming images of themselves. Last year, Google's launch of the Nano Banana model was a major moment for the company, especially when users started posting hyperrealistic figurines of themselves online. Earlier this year, ChatGPT Images made waves on social media as users shared AI-generated caricatures.

AI-GENERATED BY OPENAI

What’s Different?

Since the new model can tap into ChatGPT’s “reasoning” capabilities, Images 2.0 can search the internet for recent information and generate more than one image at a time. In essence, the bot can use additional steps to output more thorough generations from a single prompt. Images 2.0 also has a more recent knowledge cutoff date: December 2025.

This also means that outputs from the new model are more granular. For example, I generated an infographic with San Francisco's weather forecast for the next day, as well as activities worth doing. The image ChatGPT generated included accurate weather details for the rainy day, along with accurate-looking drawings of the Ferry Building, the Castro Theatre, the Painted Ladies houses, and the Transamerica Pyramid.

Additionally, Images 2.0 is more customizable for users who want unique aspect ratios for image outputs. The new model can generate images ranging from 3:1 wide to 1:3 tall, and users can adjust the image’s size as part of their prompt to the AI tool.

First Impressions

After a few hours of generating images with the new model, I was generally impressed with its text rendering capabilities, in English at least. Not that long ago, image outputs featuring text, from any of the major models, often included numerous malformed characters or words with errant extra letters. Two years ago, ChatGPT struggled to label images accurately, so the cleaner, more complex outputs from Images 2.0 are a sign of continued improvement. Google has also focused on improving image outputs featuring text in its recent iterations of Nano Banana.

AI-GENERATED BY REECE ROGERS

Testing the outputs in different languages from the new model, I asked ChatGPT to generate a Timothée Chalamet–themed collage poster, as if it were crafted by someone from his Chinese fan base. The output featured an assortment of photorealistic-looking images of the movie star, some showing him dressed in traditional clothes or with cat ears drawn on. The AI collage was maximalist in its details, with over 20 different snippets of text, as well as images of a dumpling, a cup of boba, and a panda.

I don't speak the language, so I nudged the bot for a translation with a basic prompt: "What does that text say?" ChatGPT's response was critical of its own output.

“A lot of it is fake, or semi-gibberish AI text dressed up to look like Chinese meme-poster writing, so it does not all cleanly translate,” read its output, in part, before ChatGPT went through a list of what looked accurate and what looked off. “There are also a few bits that are clearly malformed or mixed with Japanese-looking characters, like the checklist card and some decorative lines on the right. Those are mostly nonsense made to resemble East Asian fan-edit text rather than accurate sentences.”

AI-GENERATED BY REECE ROGERS

So, while the new ChatGPT Images model performed well in my initial tests when generating text in English, I'm unsure whether users around the globe will see similar results when generating in their own languages. Still, given OpenAI's strides in improving its English outputs for AI images, I wouldn't be surprised if data from more users around the world leads to similar improvements in future iterations of this model.
