Google on Monday introduced a new experimental artificial intelligence (AI) tool that can fuse images to generate unique outputs. Dubbed Whisk, it is a fun tool with no major application beyond its intended function. The Mountain View-based tech giant has recently released a number of interesting AI tools, such as GenChess, which uses the Imagen 3 AI model to generate unique chess pieces. With Whisk, the company is demonstrating how AI can use just images as a prompt to generate unique art.
Google’s Whisk can ‘remix’ input images
In a blog post, the tech giant introduced the new AI tool. Whisk is currently available only in the US and can be accessed through Google Labs, the company's platform for releasing experimental tools built on its in-house AI models. Like the other tools on the platform, Whisk is experimental, and Google notes that it may not always perform the way users want.
AI image generators are quite common, but most of them accept either only text or a mix of text and images as input. In short, image generation models usually need some natural language cues to understand what to create. Whisk differs from such models because users can simply add images to inspire the model to generate outputs.
Whisk asks users to add three images: one each for subject, scene, and style. Once they are added, the AI tool automatically processes the visual information to generate a unique image that combines all three inputs. Users can also add just two images, one for the subject and one for the scene, to generate an output.
Google explained that, behind the scenes, the Gemini model processes the images and writes a detailed natural language prompt, which is then fed into the Imagen 3 model. The prompt is intended to capture the essence of the images and does not attempt to produce an exact mix of the input images.
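The flow Google describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Google's implementation: `WhiskInputs`, `build_prompt`, `remix`, `fake_caption`, and `fake_generate` are hypothetical names, and the two placeholder functions merely stand in for the Gemini captioning step and the Imagen 3 generation step, which are not called here.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class WhiskInputs:
    """Reference images by role (hypothetical type); style is optional."""
    subject: str
    scene: str
    style: Optional[str] = None


def build_prompt(inputs: WhiskInputs, caption: Callable[[str], str]) -> str:
    """Caption each supplied image and merge the captions into one text prompt."""
    parts = [
        f"Subject: {caption(inputs.subject)}",
        f"Scene: {caption(inputs.scene)}",
    ]
    if inputs.style is not None:
        parts.append(f"Style: {caption(inputs.style)}")
    return ". ".join(parts) + "."


def remix(
    inputs: WhiskInputs,
    caption: Callable[[str], str],
    generate: Callable[[str], bytes],
) -> Tuple[str, bytes]:
    """Run the two stages: images -> text prompt -> generated image."""
    prompt = build_prompt(inputs, caption)
    return prompt, generate(prompt)


def fake_caption(image_path: str) -> str:
    # Assumption: placeholder for a multimodal model that describes an image.
    return f"description of {image_path}"


def fake_generate(prompt: str) -> bytes:
    # Assumption: placeholder for a text-to-image model; returns dummy bytes.
    return prompt.encode("utf-8")


if __name__ == "__main__":
    prompt, image = remix(
        WhiskInputs(subject="cat.png", scene="beach.png", style="ink.png"),
        fake_caption,
        fake_generate,
    )
    print(prompt)
```

Because the only thing passed to the second stage is text, the output reflects what the captions capture rather than the exact pixels of the inputs, which matches Google's point that the result is not an exact mix of the images.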
Since Whisk is an experimental tool, the generated images may differ from what users expect. To give users more control over the output, Whisk lets them refine and edit images after generation. Users can view the underlying prompt written by Gemini and change it or add more detail to get the desired result.
“We built it for intense visual exploration, not pixel-perfect editing. It’s about exploring ideas in new and creative ways, allowing you to work through dozens of options and download what you like,” Google said.