Gemini Unleashed: Your Phone’s AI Now Books Rides and Orders Dinner

Remember the promises of a decade ago? Voice assistants like Siri and Google Assistant were touted as your personal digital concierges, capable of booking rides or ordering coffee with a simple command. Yet for many users those visions went largely unfulfilled: the experience was clunky, or the assistant simply opened an app without completing the task. Fast forward to today, and the landscape has shifted dramatically. With the advent of large language models (LLMs), those ambitious promises are not just back on the table; they are becoming a reality.

At Samsung’s recent Galaxy Unpacked event, Google and Samsung unveiled a groundbreaking integration: the Gemini voice assistant can now seamlessly complete tasks across a selection of popular third-party apps. Imagine asking your phone to book an Uber or order dinner from Uber Eats, DoorDash, or Grubhub, and watching it happen. This isn’t just a demo; it’s an early preview of a powerful new capability.

A New Era of Automation: What Gemini Can Do

This innovative feature, initially rolling out in the US and South Korea, will debut with the Galaxy S26 smartphones on March 11, followed by a software update for the Google Pixel 10 series. More app support is anticipated later this year with the arrival of Android 17. So, how exactly does this next-generation automation work?

Seamless Ride-Hailing and Food Delivery

The process is designed for efficiency and minimal user intervention. For instance, if you ask Gemini to “Get me an Uber to the airport,” the assistant will open the Uber app within a virtual window. While Gemini diligently executes the action in the background, you can monitor its progress via a live notification. The goal is for you to delegate these ‘digital laundry’ tasks – those necessary but unexciting chores – and focus on more important things.

Should Gemini require further clarification, such as distinguishing between multiple airports in a region, it will intelligently prompt you for the necessary information. Crucially, Gemini doesn’t make the final decision; once it has navigated to the booking stage, you retain control to select your ride type (e.g., UberX or UberXL), confirm the fare, and finalize the booking with a tap.
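The clarify-then-confirm flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `RideDraft`, `prepare_ride`, and `confirm_ride` are hypothetical names, not part of any Gemini or Uber API, and the real system is certainly more involved. The point it shows is that the assistant either asks a question or returns an unconfirmed draft, and only an explicit user step marks the booking as confirmed.

```python
from __future__ import annotations

from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class RideDraft:
    """Hypothetical booking draft; nothing is booked until confirmed is True."""
    destination: str
    ride_type: Optional[str] = None
    confirmed: bool = False


def prepare_ride(
    query: str, matches: list[str]
) -> tuple[Optional[RideDraft], Optional[str]]:
    """Return (draft, None) when the destination is unambiguous,
    or (None, question) when the assistant must ask the user first."""
    if not matches:
        return None, f"I could not find '{query}'. Where should the ride go?"
    if len(matches) > 1:
        return None, "Which one did you mean: " + ", ".join(matches) + "?"
    return RideDraft(destination=matches[0]), None


def confirm_ride(draft: RideDraft, ride_type: str) -> RideDraft:
    """Stands in for the user's final tap: pick a ride type and confirm."""
    return replace(draft, ride_type=ride_type, confirmed=True)


# Ambiguous request: the assistant asks instead of guessing.
draft, question = prepare_ride("the airport", ["Airport A", "Airport B"])

# Unambiguous request: a draft is prepared but left unconfirmed.
draft2, question2 = prepare_ride("the airport", ["Airport A"])
booked = confirm_ride(draft2, "UberX")
```

Keeping `confirmed` false until `confirm_ride` runs mirrors the article's point that the final decision stays with the user.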

Beyond Simple Commands: Gemini’s Intelligent Problem-Solving

Google’s Sameer Samat, President of the Android Ecosystem, highlights Gemini’s ability to handle more complex scenarios, demonstrating its prowess with a Grubhub order. Picture this: a group chat discussing pizza preferences for a board game night. Samat simply asked Gemini to “Figure out the order.” The AI intelligently parsed the conversation, organized everyone’s specific pizza requests, and then, with a command to “order this on Grubhub for home delivery,” initiated the automation. Gemini opened Grubhub in a virtual window, added all items to the cart, and presented it for final verification and order placement.

Adapting to the Unexpected

What truly sets Gemini apart is its resilience. Unlike rigid, script-based assistants, Gemini is designed to resolve the issues it encounters on its own. Samat recounted instances where Gemini, faced with a restaurant limiting large pizzas during a busy service, suggested two medium pizzas instead. It tries to find a workaround before it needs to consult you.

More Than Just a Map: True AI Reasoning

Samat emphasized that Gemini doesn’t merely follow a memorized “map” of an app, a limitation seen in earlier AI agents. Instead, it leverages its reasoning capabilities to formulate a plan, visually interprets the screen much like a human would, and navigates accordingly. This means Gemini can adapt to visual changes or updates within an app, ensuring its functionality remains robust and reliable.
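The difference between a memorized "map" and reading the screen can be illustrated with a toy example. The sketch below is a heavy simplification and an assumption throughout: `Button`, `tap_by_position`, and `tap_by_label` are hypothetical, and matching a text label stands in for the far richer visual interpretation the article attributes to Gemini. It only shows why a script tied to fixed positions breaks when an app's layout changes, while a lookup based on what is actually on screen does not.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Button:
    """Hypothetical on-screen element with a label and a grid position."""
    label: str
    x: int
    y: int


def tap_by_position(screen: List[Button], x: int, y: int) -> Optional[str]:
    """Memorized-map style: tap whatever sits at a fixed position."""
    for button in screen:
        if (button.x, button.y) == (x, y):
            return button.label
    return None


def tap_by_label(screen: List[Button], wanted: str) -> Optional[str]:
    """Screen-reading style: find the element by what it says."""
    for button in screen:
        if button.label.lower() == wanted.lower():
            return button.label
    return None


# The same app before and after an update that swaps two buttons.
old_layout = [Button("Search", 0, 0), Button("Checkout", 0, 1)]
new_layout = [Button("Checkout", 0, 0), Button("Search", 0, 1)]

# A script that memorized "Checkout is at (0, 1)" hits the wrong button
# after the update; the label-based lookup still finds the right one.
scripted_before = tap_by_position(old_layout, 0, 1)
scripted_after = tap_by_position(new_layout, 0, 1)
adaptive_after = tap_by_label(new_layout, "checkout")
```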

From Shopping Lists to Smart Carts

Another compelling example involved a Google Keep note detailing an RSVP list for a barbecue, complete with dietary restrictions such as which guests are vegan. Gemini was tasked with calculating the required hot dogs and buns. After it performed the calculation, Samat asked Gemini to add these items to his Safeway cart on DoorDash. Within minutes, the virtual cart was populated and ready for checkout.

The Future of AI Vision: Three Paths to Task Completion

Gemini’s task automation takes more than one form. The virtual window demonstration showcases its ability to navigate apps directly, but that is just one of three approaches. For apps with a Model Context Protocol (MCP) integration, Gemini can execute tasks entirely in the backend, presenting only the final checkout step; MCP is an open standard that lets LLMs communicate with third-party apps and tools. Additionally, developers can build “App Functions” that allow Gemini to interface with their apps in a structured, optimized manner. The direct app navigation seen in the examples serves as a powerful fallback when these deeper integrations aren’t yet available.
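One way to picture the three approaches is as a selection step that prefers a deeper integration when one exists and falls back to on-screen navigation otherwise. The Python sketch below is illustrative only: `Path`, `AppCapabilities`, and `choose_path` are hypothetical names, and the relative priority of MCP over App Functions is an assumption, since the article states only that direct navigation is the fallback.

```python
from dataclasses import dataclass
from enum import Enum


class Path(Enum):
    MCP = "mcp"                      # task runs in the backend
    APP_FUNCTIONS = "app_functions"  # structured, developer-built interface
    UI_NAVIGATION = "ui_navigation"  # app driven in a virtual window


@dataclass(frozen=True)
class AppCapabilities:
    """Hypothetical description of what integrations an app offers."""
    name: str
    has_mcp: bool = False
    has_app_functions: bool = False


def choose_path(app: AppCapabilities) -> Path:
    """Pick the deepest available integration.

    Assumption: MCP is tried before App Functions. The article only
    confirms that UI navigation is used when neither is available.
    """
    if app.has_mcp:
        return Path.MCP
    if app.has_app_functions:
        return Path.APP_FUNCTIONS
    return Path.UI_NAVIGATION


# Made-up apps used purely for illustration.
apps = [
    AppCapabilities("ExampleRides", has_mcp=True),
    AppCapabilities("ExampleEats", has_app_functions=True),
    AppCapabilities("ExampleGrocer"),
]
chosen = {app.name: choose_path(app) for app in apps}
```

Under this reading, an app gains a faster or more reliable path as soon as its developer ships an integration, without any change to how the user phrases the request.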

This marks a significant leap forward in personal AI assistants, moving beyond simple queries to genuine, intelligent task execution. The era of truly smart, proactive digital helpers is finally here.

