The improved long context window can even pull information from multiple documents when responding to a single request. In the sidebar in Docs, I asked for help writing a sample letter to a potential job candidate, linking in the prompt to the applicant's job description document and PDF portfolio, both of which were in My Drive, and I immediately received a draft email that took relevant details from both documents into account.
Gemini 1.5 Pro isn’t the only shiny new model, though: I also got to try out the newly launched Imagen 3, our highest-quality text-to-image model yet. One of the new capabilities I was excited about is its ability to generate text and decorative fonts, so I put it through its paces. I started by asking for a stylized alphabet – letters written in jam on toast, or spelled out with silver balloons floating in the sky. Imagen 3 generated a complete alphabet, which I could then use to print my (delicious) menus.
After my Imagen 3 interlude, I continued with more Gemini demos. In one of them, I pulled up the Gemini overlay on an Android phone and asked questions about whatever was on the screen. This really showed how we’re not only expanding what you can ask Gemini, but also making Gemini context-aware so it can anticipate your needs and offer helpful suggestions.
The use case here was a lengthy oven manual. Whether it’s a demo or real life, this is not something I would happily read. Instead of poring over the document, I pulled up Gemini and immediately got an “Ask this PDF” suggestion. I tried questions like “how do I update the clock?” and quickly got correct answers. It worked just as well with YouTube videos: instead of watching a 20-minute training video, I asked a quick question about how to edit boards, got an answer, and was on my way to the next demo, where I tested out a new chat mode called Gemini Live that lets you talk with Gemini in the app, no typing required.
Talking to Gemini was a different experience from using a traditional chatbot interface: Gemini’s responses are much more conversational than the paragraphs of text and bulleted lists you might normally get. In my demo, I learned that you can interrupt Gemini in the middle of an answer. After asking for a list of kids’ activities for summer vacation, I was able to cut into the list of suggestions and dive deeper into what materials I would need for a shirt.
The Project Astra demo – an advanced agent that can see and talk – took things a step further, showing the cutting edge of where our conversational AI projects are going.
Image Source: blog.google