Apple has released MGIE, an AI image editing tool that can make detailed edits based on text prompts
Apple’s MGIE AI tool for image editing is guided by multimodal large language models (MLLMs).
Apple researchers have released an artificial intelligence (AI)-powered image editing tool called MGIE, which edits images based on simple text prompts. MGIE, which stands for MLLM-Guided Image Editing, can perform Photoshop-style edits, global optimisation, and local edits. The tool was released just a few days after Apple said in its quarterly earnings call that it has been spending a “tremendous amount of time and effort” on generative AI. The image editing model shows an improvement over existing AI editing tools.
Researchers from Apple and the University of California, Santa Barbara collaborated to develop the tool. VentureBeat reports that the paper was accepted at the International Conference on Learning Representations (ICLR) 2024. A preprint of the research paper is also hosted on arXiv.
The AI tool can make Photoshop-style edits, including cropping, resizing, rotating, adding filters, and more. It can also apply global optimisation, altering the brightness, contrast, sharpness, and colour balance, and even adding generative elements to the image. Additionally, it can perform local edits, in which it adds, removes, or alters a particular object or element in the image.
To make an edit, users simply write a plain text prompt such as “make the sky brighter” or “make the house bigger”. The prompt is interpreted as an image command, for example increasing the brightness by a certain percentage or increasing the size of the house by a certain amount. Users can also request more complicated and granular edits such as “adjust between the dark and light areas to bring out the details of the leaves and the tree trunk.” The more detailed the prompt, the closer the output will be to the desired result.
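To illustrate the idea of turning a prompt into an explicit image command, here is a minimal Python sketch. It is not MGIE's method: a hypothetical keyword table stands in for the multimodal language model, the names `EditCommand`, `KEYWORD_RULES`, `interpret`, and `apply_command` are invented for this example, and the 20 percent figures are arbitrary assumptions.

```python
from dataclasses import dataclass


@dataclass
class EditCommand:
    """An explicit, machine-readable edit (hypothetical structure)."""
    operation: str  # only "brightness" is supported in this sketch
    percent: int    # signed change, e.g. 20 means 20% brighter


# Hypothetical lookup table standing in for the language model.
# The percentages are arbitrary values chosen for illustration.
KEYWORD_RULES = {
    "brighter": EditCommand("brightness", 20),
    "darker": EditCommand("brightness", -20),
}


def interpret(prompt: str) -> EditCommand:
    """Turn a plain-text prompt into an explicit command (toy keyword match)."""
    lowered = prompt.lower()
    for keyword, command in KEYWORD_RULES.items():
        if keyword in lowered:
            return command
    raise ValueError(f"No rule matches prompt: {prompt!r}")


def apply_command(command: EditCommand, pixels: list[list[int]]) -> list[list[int]]:
    """Apply the command to a greyscale image given as rows of 0-255 values."""
    if command.operation != "brightness":
        raise ValueError(f"Unsupported operation: {command.operation}")
    scale = 100 + command.percent
    # Scale each pixel with integer arithmetic and clamp to the valid range.
    return [[max(0, min(255, value * scale // 100)) for value in row]
            for row in pixels]


if __name__ == "__main__":
    command = interpret("make the sky brighter")
    print(apply_command(command, [[100, 250]]))  # prints [[120, 255]]
```

A real system would replace the keyword table with a model that reads both the prompt and the image, which is what lets it handle open-ended requests that no fixed rule list could cover.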
While AI-based photo editing tools such as Photoshop’s Generative Fill and Firefly (which is still in testing), Canva’s Magic Design, and Luminar Neo already exist, they all require the user to interact with the software, either to mark the edit location or to make granular changes. Apple’s MGIE, on the other hand, can do the editing entirely on its own. It uses “instruction-based image editing”, also called “text-guided image editing”, which is made possible by its choice of AI architecture.
Instead of relying on the Generative Adversarial Network (GAN) framework, the AI model uses a diffusion model, a more advanced architecture for realistic photo generation and instruction adherence. Next, the researchers turned to a multimodal large language model so that the tool could translate natural language instructions into image edits that show the desired effect. Human evaluators were also used during the process to rank the edits, and their feedback was used to further improve the model.
The tech giant has made the MGIE AI image editing tool available to download as an open-source project on GitHub. At present, it remains uncertain whether Apple intends to incorporate this technology into its devices. However, Apple CEO Tim Cook has hinted that, later this year, the company will unveil generative AI features it has been developing. Additionally, reports suggest that Apple is actively working on new AI-powered features for the iOS 18 update, which is also slated to arrive later this year.