Practical Android AI

First Edition · Android 13 · Kotlin 2.0 · Android Studio Otter

8. Building an Interactive App with Gemini Live
Written by Zahidur Rahman Faisal


Have you checked out the Gemini Live API? It’s a total game-changer for building real-time, interactive experiences in Android. Forget managing a whole backend just to stream audio or video to an LLM — Gemini Live makes it effortless.

Imagine building an app where the user can talk to a chatbot and get instant responses, just like a real conversation. That’s what the Live API enables.

What Makes an App ‘Interactive’?

When we talk about an interactive app in this context, especially with the Gemini Live API, we’re talking about an application that doesn’t just listen and reply — it actually acts on the user’s instructions. It goes beyond a simple question-and-answer chatbot.

Think of it this way:

  • Standard Chatbot App: You say, “What’s the weather like?” The model figures out the answer and replies. That’s a back-and-forth conversation.

  • Interactive App (with Function Calling): You say, “Please add coffee to my shopping list.”

    • The model doesn’t just say, “Okay, I’ve added coffee.”

    • It recognizes that “add to shopping list” is an action this app can perform.

    • It executes the function call that triggers the addListItem function in the actual Android code.

  • The app’s internal state (the shopping list) actually changes.

    • Then, the model gets confirmation and tells you: “Done. I’ve added coffee to your shopping list.”

The key is that the user’s voice prompt is translated directly into app-logic execution. The app is no longer just a passive interface; it’s an agent that can manipulate its own data and features based on a natural language command. It creates a seamless, hands-free experience where the AI is integrated directly into the core functionality of the app — that’s what makes it truly ‘interactive’ in the most powerful sense.
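To make this concrete, the shopping-list example above could be described to the model as a function declaration. This is a hypothetical sketch for illustration only: the `addListItem` name and its schema are assumptions, not part of the sample app. It mirrors the `FunctionDeclaration` API you'll use for real later in this chapter.

```kotlin
// Hypothetical declaration for the shopping-list example.
// The model sees this schema and can decide to call "addListItem"
// when the user says something like "add coffee to my shopping list".
val addListItemDeclaration = FunctionDeclaration(
  name = "addListItem",
  description = "Adds an item to the user's shopping list",
  parameters = mapOf(
    "item" to Schema.string(description = "The item to add, e.g. coffee")
  )
)
```

The declaration carries no app logic itself; it's purely a description that lets the model decide when, and with which arguments, to request the call.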

The Gemini Live API

When I first worked with the Gemini Live API, I realized it’s a major leap for mobile generative AI. Instead of the old request–response model, it now supports real-time, two-way streaming. That means the client and model can send and receive data simultaneously — creating a live conversation rather than a sequence of turns.

Hands-On With Gemini Live

Let’s extend the Firebase AI Logic app from the previous chapter with Gemini Live bidirectional streaming.

Project Setup and Dependencies

First things first, ensure you’re targeting Android API level 23+ and the app is connected to Firebase.
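If your module isn't already on API level 23 or higher, a minimal sketch of the relevant `build.gradle.kts` fragment might look like this (a config illustration, assuming the Kotlin DSL):

```kotlin
// Module-level build.gradle.kts
android {
  defaultConfig {
    minSdk = 23 // Gemini Live requires API level 23+
  }
}
```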

// Firebase AI Logic: Gemini Live dependency (module-level build.gradle.kts)
val firebaseAiLogicVersion = "17.6.0"
implementation("com.google.firebase:firebase-ai:$firebaseAiLogicVersion")

You'll also need the microphone permission declared in AndroidManifest.xml:

<uses-permission android:name="android.permission.RECORD_AUDIO"/>
[Screenshot: Before Initialization]

Model Initialization and Configuration

The first step in using Gemini Live is initializing the backend service and creating a LiveGenerativeModel instance. The Live API configuration is handled through the liveGenerationConfig object, which determines the model’s behavior and the nature of the streaming output.

// The core Gemini Live model instance.
private lateinit var liveModel: LiveGenerativeModel

// Mutable state flow holding the current state of the live session.
private val _liveSessionState = MutableStateFlow<LiveSessionState>(LiveSessionState.Unknown())
val liveSessionState = _liveSessionState.asStateFlow()
fun initializeGeminiLive(activity: Activity) {
  requestAudioPermissionIfNeeded(activity)

  coroutineScope.launch {
    try {
      val liveGenerationConfig = liveGenerationConfig {
        speechConfig = SpeechConfig(voice = Voice("FENRIR"))
        responseModality = ResponseModality.AUDIO
      }

      liveModel = Firebase.ai(backend = googleAI()).liveModel(
        modelName = "gemini-live-2.5-flash-preview",
        generationConfig = liveGenerationConfig,
      )

      _liveSessionState.value = LiveSessionState.Ready()
    } catch (e: Exception) {
      _liveSessionState.value = LiveSessionState.Error(message = e.localizedMessage)
    }
  }
}
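The `requestAudioPermissionIfNeeded` helper called in `initializeGeminiLive` isn't shown above. A minimal sketch, assuming the standard AndroidX permission APIs (the implementation details are an assumption, not the book's exact code), could look like this:

```kotlin
// Hypothetical helper: requests RECORD_AUDIO at runtime if not yet granted.
private fun requestAudioPermissionIfNeeded(activity: Activity) {
  val granted = ContextCompat.checkSelfPermission(
    activity, Manifest.permission.RECORD_AUDIO
  ) == PackageManager.PERMISSION_GRANTED

  if (!granted) {
    ActivityCompat.requestPermissions(
      activity, arrayOf(Manifest.permission.RECORD_AUDIO), 0 // arbitrary request code
    )
  }
}
```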
sealed interface LiveSessionState {
  data class Unknown(val message: String = "UNKNOWN: Gemini Live Not Initialized") : LiveSessionState

  data class Ready(val message: String = "READY: Ask Gemini Live") : LiveSessionState

  data class Running(val message: String = "RUNNING: Gemini Live Speaking...") : LiveSessionState

  data class Error(val message: String = "ERROR: Failed to initiate Gemini Live") : LiveSessionState
}
// In the ViewModel, delegate to LiveModelManager:
private val liveModelManager = LiveModelManager(
  context = application,
  coroutineScope = viewModelScope,
)
val liveSessionState = liveModelManager.liveSessionState

fun initializeGeminiLive(activity: Activity) {
  liveModelManager.initializeGeminiLive(activity)
}

// In MainActivity, kick off initialization:
viewModel.initializeGeminiLive(activity = this@MainActivity)
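One way the UI might surface this state is to collect `liveSessionState` in a composable and display each state's message. This is a sketch assuming Jetpack Compose; the composable name is illustrative, not from the sample app:

```kotlin
// Illustrative: shows the current live-session status on screen.
@Composable
fun LiveStatusLabel(viewModel: MainViewModel) {
  val state by viewModel.liveSessionState.collectAsState()

  val label = when (val s = state) {
    is LiveSessionState.Unknown -> s.message
    is LiveSessionState.Ready -> s.message
    is LiveSessionState.Running -> s.message
    is LiveSessionState.Error -> s.message
  }
  Text(text = label)
}
```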
[Screenshot: Audio Permission]

[Screenshot: Model Initialized]

Real-Time Connection: Starting The Live Session

At this point, the app can connect to Gemini and start the live session. You need to use LiveModelManager for that.

private var session: LiveSession? = null
@RequiresPermission(Manifest.permission.RECORD_AUDIO)
fun startSessionFromText(catBreed: String) {
  val text = "Tell me about $catBreed cats in maximum 80 words."

  coroutineScope.launch(Dispatchers.IO) {
    try {
      // Start the conversation
      session = liveModel.connect()
      session?.send(text)
      session?.startAudioConversation()
        
      // Update State
      _liveSessionState.value = LiveSessionState.Running()
    } catch (e: Exception) {
      _liveSessionState.value = LiveSessionState.Error(message = e.localizedMessage)
    }
  }
}

Lifecycle Management: Toggling Session Start/Stop

You learned how to start a session, but you also need to know how to stop one. The session should be explicitly closed when the microphone is deactivated or when the user navigates away from the screen. Even when starting a fresh session, the right approach is to stop any ongoing session first.

fun stopSession() {
  session?.apply {
    stopAudioConversation()
    stopReceiving()
  }
  _liveSessionState.value = LiveSessionState.Ready()
}
@RequiresPermission(Manifest.permission.RECORD_AUDIO)
fun askAbout(catBreed: String) {
  when (val state = liveSessionState.value) {
    is LiveSessionState.Ready -> {
      liveModelManager.startSessionFromText(catBreed)
    }

    is LiveSessionState.Running -> {
      liveModelManager.stopSession()
    }

    else -> {
      Log.d(TAG, "Live session state: $state")
    }
  }
}
[Screenshot: Model Running]

Function Calling: Making Gemini Your App’s Agent

Now you know how to turn your app into a voice assistant using the Gemini Live API. The next big step is Function Calling: the capability that lets the model actually interact with the logic and functionality of an Android app. It’s what makes the voice assistant an agent for your app.

Step 1: Define the App Function and its Declaration

First, you need the actual function in your app that you want the model to be able to call. In the sample app, you may want the user to ask for pictures of a specific cat breed, which means opening a Google Image search.

fun showPicture(catBreed: String) {
  coroutineScope.launch(Dispatchers.Default) {
    val query = Uri.encode("$catBreed cat pictures")
    val url = "https://www.google.com/search?q=$query&tbm=isch"

    val intent = Intent(Intent.ACTION_VIEW)
    intent.data = Uri.parse(url)
    intent.addFlags(Intent.FLAG_ACTIVITY_NEW_TASK)

    try {
      context.startActivity(intent)
    } catch (e: Exception) {
      Log.e(TAG, "Error opening Google Images", e)
    }
  }
}
// The FunctionDeclaration for the model
val showPictureFunctionDeclaration = FunctionDeclaration(
    name = "showPicture",
    description = "Function to show picture of cat breed",
    parameters = mapOf(
        "catBreed" to Schema.string(
            description = "A short string describing the cat breed to show picture"
        )
    )
)

Step 2: Pass the Tool to the LiveModel

The Gemini model needs to know what tools (functions) it has available before the conversation even starts. You need to package the FunctionDeclaration into a Tool object and pass it to the liveModel initialization.

// Packaging the declaration into a Tool
val functionHandlerTool = Tool.functionDeclarations(listOf(showPictureFunctionDeclaration))
// Initializing the LiveGenerativeModel
liveModel = Firebase.ai(backend = googleAI()).liveModel(
    modelName = "gemini-live-2.5-flash-preview",
    generationConfig = liveGenerationConfig,
    tools = listOf(functionHandlerTool), // Passing the tool here!
)

Step 3: Implement the Handler Function

When the user says something that triggers the function, the model sends a FunctionCallPart to the app. You need a special function, a handler, to intercept this call, execute the app logic, and send the result back to the model.

fun functionCallHandler(functionCall: FunctionCallPart): FunctionResponsePart {
  return when (functionCall.name) {
    "showPicture" -> {
      val catBreed = functionCall.args["catBreed"]!!.jsonPrimitive.content
      showPicture(catBreed = catBreed)
      val response = JsonObject(
        mapOf(
          "success" to JsonPrimitive(true),
          "message" to JsonPrimitive("Showing pictures of $catBreed")
        )
      )
      FunctionResponsePart(functionCall.name, response)
    }

    else -> {
      val response = JsonObject(
        mapOf(
          "error" to JsonPrimitive("Unknown function: ${functionCall.name}")
        )
      )
      FunctionResponsePart(functionCall.name, response)
    }
  }
}

Step 4: Start the Conversation with a Function Handler

Finally, when you start or continue the live session, pass the handler function to the startAudioConversation() call. This tells the LiveSession which function to invoke when the model decides to use a tool.

// Start the conversation
session = liveModel.connect()
session?.send(text)
session?.startAudioConversation(::functionCallHandler) // Pass the handler here!
[Screenshot: Function Calling]

Conclusion

To wrap this up, what you’ve done with the Gemini Live API and Function Calling isn’t just an evolutionary step; it’s a massive leap forward in how we build mobile AI experiences.

Have a technical question? Want to report a bug? You can ask questions and report bugs to the book authors in our official book forum here.
© 2026 Kodeco Inc.
