Articles

Google Rolls Out Gemini Live with Camera and Screen Sharing to All Android Devices

Google has officially expanded Gemini Live's camera and screen-sharing features to all compatible Android devices. Initially introduced last week on select models such as the Google Pixel 9 and Samsung Galaxy S25 series, the functionality is now available on any Android device that supports the Gemini app. Access still requires a Gemini Advanced subscription, however, so the features are not free for all users.

The expansion was announced via the official Google Gemini app account on X (formerly Twitter), where the company noted that the Gemini Live features had received positive feedback from users. Google emphasized that the rollout is gradual and will eventually reach every device capable of running the Gemini app, giving more users access to the new tools.

The Gemini Live features, including real-time camera assistance and screen sharing, were first previewed at Google I/O last year. After nearly a year of development, the features were shown again at the 2025 Mobile World Congress (MWC), where they garnered attention for their advanced capabilities. Developed by Google DeepMind as part of Project Astra, these tools enable the Gemini AI chatbot to provide live, contextual support through a user’s device camera feed or screen capture, allowing for more dynamic and interactive assistance.

These upgrades mark a significant step in Google’s push to enhance its AI offerings. By integrating real-time visual and screen-based interactions, Gemini Live aims to revolutionize how users interact with AI, providing hands-on, personalized help directly on their mobile devices. As the rollout continues, more Android users will be able to explore how these cutting-edge features can improve their experience with the Gemini platform.

OpenAI Unveils o3 and o4-mini Models Featuring Advanced Visual Reasoning

OpenAI has unveiled two new AI models, o3 and o4-mini, designed to push the boundaries of machine reasoning and visual understanding. The models succeed the earlier o1 and o3-mini versions and are available to paid ChatGPT users. Highlighted for their visible chain-of-thought (CoT) capabilities, the new models are built to process complex queries involving both text and visual inputs. Their release follows closely on the heels of the GPT-4.1 model series, marking a busy week for the San Francisco-based AI research company.

Announced via a post on X (formerly Twitter), OpenAI described o3 and o4-mini as its "smartest and most capable" models to date. One of their standout features is enhanced visual reasoning: the ability to interpret and draw inferences from images. This advancement allows the models to extract detailed context, understand spatial relationships, and interpret ambiguous visual data more effectively than their predecessors.

OpenAI also revealed that these are the first models capable of autonomously using all the tools integrated into ChatGPT, such as Python coding, web browsing, file analysis, and image generation. This multi-tool synergy enables the models to handle more dynamic tasks, such as manipulating images (cropping, zooming, flipping), running analytical scripts, or retrieving information even from flawed or low-quality visuals. The potential applications range from reading difficult handwriting to identifying obscure details in images.

In terms of performance, OpenAI claims that both o3 and o4-mini outperform previous versions, including GPT-4o and o1, on benchmarks such as MMMU, MathVista, "VLMs are Blind," and CharXiv. While no comparisons were made with third-party models, these internal benchmarks suggest a notable leap in reasoning and image-based comprehension. As OpenAI continues to iterate, these releases underscore its ongoing focus on building increasingly versatile and intelligent AI systems.

ChatGPT Introduces Library Feature for Easy Access to AI-Generated Images

OpenAI has introduced a new library feature within ChatGPT that provides users with a centralized space to view all their AI-generated images. Announced on Wednesday, the feature is now available across all ChatGPT platforms — web, desktop, and mobile — for registered users. The library is designed to help users easily browse, revisit, and reuse their previously created images without digging through old chat histories. In addition to viewing, the update also offers editing capabilities directly from the library interface.

The feature was officially revealed via OpenAI’s post on X (formerly Twitter), highlighting its broad availability to both free users and those subscribed to the Plus and Pro plans. Accessible via the left-hand sidebar on web and mobile apps, the library displays only images generated using GPT-4o’s image creation capabilities. Images created with earlier models like DALL-E are not included in this view, according to OpenAI’s support documentation.

Inside the library, users will find a new “Make Image” button at the bottom, offering a quick way to jump back into generating fresh visuals. When a user taps and holds on an existing image, it enlarges in a separate window where four new options appear: Edit, Select, Save, and Share. Saving allows users to download the image locally, while sharing integrates with third-party apps to send images to friends and social media.

The editing tools add even more flexibility. Selecting “Edit” opens a new chat where the image is attached, allowing users to apply further text-based prompts for significant modifications or to generate related creations. The “Select” tool provides more granular control, letting users highlight and modify specific parts of an image. An adjustable slider refines selection sizes, and Undo/Redo options streamline the editing process. Additionally, a Copy button lets users quickly add images to their clipboard for use elsewhere. Together, these new features mark a major step forward in making image generation within ChatGPT more organized and interactive.