Practical Android AI

First Edition · Android 13 · Kotlin 2.0 · Android Studio Otter

7. Optimizing AI Performance & Deployment with Play for On-device AI
Written by Zahidur Rahman Faisal


Hello reader! You’ve explored most of Android’s AI/ML solutions on your way to this chapter. You’ve learned ML Kit, MediaPipe, Gemini, and Firebase AI Logic, and built some genuinely cool apps that are smart, fast, private, and in most cases work offline.

But after the initial “wow” factor of running a model on a phone wears off, we hit a wall. It’s what I call the “last mile” problem of on-device AI.

The proof-of-concept works great on your high-end test devices. But then the questions start rolling in:

“This new generative model is 500MB. We can’t just stick that in the APK, can we?”

“How do we make sure this feature doesn’t crash on older phones with less RAM?”

“Our data science team has a new, better version of the model. How do we ship it to users without forcing a full app update?”

Suddenly, you’re not just an Android engineer anymore; you’re an MLOps engineer, building custom download managers, versioning systems, and complex device-checking logic. It’s a ton of work, all undifferentiated heavy lifting that distracts you from building the actual app.

For a while, that was just the cost of doing business on the cutting edge. But that’s changing: Google recently launched Play for On-device AI, and it’s designed to solve exactly these problems. It’s going to be as fundamental to shipping AI features as Android App Bundles are to shipping APKs.

Remember building the On-device LLM app in Chapter 5? You had to download and copy the TinyLlama model manually, all 1.25 GB of it! Nobody wants to download an APK of that size, and the Play for On-device AI pack is the solution to that.

What Is a Play for On-device AI Pack?

Think about the problems that App Bundles solved. Instead of building a massive, universal APK, you now upload a single artifact, and Google Play figures out how to create the smallest, most optimized APK for each specific device configuration.

Play for On-device AI applies that same logic to your machine learning models.

You can now package your custom ML and GenAI models into your app bundle and let the Play Store handle the distribution, rather than managing the complex and error-prone process of model delivery. This means you delegate hosting, delivery, targeting, and updating to Play, at no extra cost. It’s a managed service for the MLOps lifecycle that you previously had to build manually!

The best part is that you control when the model is delivered to the user. You can configure model downloads as:

  • Install-time – delivered with the app during installation.

  • Fast-follow – downloaded immediately after the app is installed.

  • On-demand – downloaded only when your app explicitly requests it.
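If it helps to see how these modes map to configuration, each one corresponds to a deliveryType value in a pack module’s build.gradle. Here’s a sketch with placeholder pack names (the real setup for this chapter’s on-demand pack follows in Step 1):

```groovy
// Hypothetical pack modules, one per delivery mode -- names are placeholders.

// In installTimePack/build.gradle
aiPack {
  packName = "installTimePack"
  dynamicDelivery { deliveryType = "install-time" }
}

// In fastFollowPack/build.gradle
aiPack {
  packName = "fastFollowPack"
  dynamicDelivery { deliveryType = "fast-follow" }
}

// In onDemandPack/build.gradle
aiPack {
  packName = "onDemandPack"
  dynamicDelivery { deliveryType = "on-demand" }
}
```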

Since these AI packs are part of your app bundle, you also get all the usual Play Console benefits for free: test tracks, staged rollouts, and all the normal release-management tools now apply to your ML models too!

Updates are another nice touch: when you release a new version of your app, Play is smart about AI packs. If a particular pack hasn’t changed, users won’t need to download it again. Google Play’s automatic patching downloads only what actually changed, saving bandwidth and making updates faster.

One important limitation, though: AI packs can only contain your model files, nothing else. You can’t put Java, Kotlin, or native libraries in there. If you need to ship code to run your ML model, it has to go in your base module or a feature module. The good news is you can configure feature modules with the same delivery and targeting settings as your AI packs, so they work together seamlessly.
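As a sketch of that pairing, a feature module meant to arrive on demand, just like an on-demand AI pack, declares its delivery in the module’s AndroidManifest.xml using standard Play Feature Delivery markup (the string resource name below is a placeholder):

```xml
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:dist="http://schemas.android.com/apk/distribution">
    <!-- Deliver this module on demand, matching the AI pack it supports. -->
    <dist:module
        dist:instant="false"
        dist:title="@string/inference_feature_title">
        <dist:delivery>
            <dist:on-demand />
        </dist:delivery>
        <!-- Exclude the module from fused multi-APKs for older devices. -->
        <dist:fusing dist:include="false" />
    </dist:module>
</manifest>
```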

Solving the Deployment Puzzle: AI Packs and Delivery Modes

The core concept is the AI pack. It’s a special container within your app bundle that holds your models. The magic is in how these packs get delivered to the user. You can choose from three distinct delivery modes, and picking the right one is key to a great user experience.

An Overview of the Play for On-device AI Workflow

Using Play for On-device AI to package and deploy AI Packs is pretty straightforward. Let me break down the process for you:

Step 1: Configuring Your On-Demand AI Pack

Project Structure

// In onDemandAiPack/build.gradle
plugins {
  id 'com.android.ai-pack'
}

aiPack {
  packName = "onDemandAiPack"
  dynamicDelivery {
    deliveryType = "on-demand"
  }
}

// In app/build.gradle
android {
  //...
  assetPacks = [":onDemandAiPack"]
}

// In settings.gradle
include ':app'
include ':onDemandAiPack'
AI Pack Structure

Step 2: Integrating and Using Play AI Delivery

To download AI packs with fast-follow or on-demand delivery, you need the Play AI Delivery Library. This is where you get hands-on with its APIs, from requesting downloads to accessing the downloaded packs.

Adding the Dependency

Get started by adding this dependency in the app-level build.gradle file:

// In app/build.gradle
dependencies {
  ...
  // TUTORIAL DEPENDENCIES
  def aiDeliveryVersion = "0.1.1-alpha01"
  implementation "com.google.android.play:ai-delivery:$aiDeliveryVersion"
}
Checking AI Model

Checking the Status of AI Packs

Before using assets or models from an AI pack, you should check whether the pack has already been downloaded.

fun checkPackStatus(packName: String) {
  // If the pack is already on the device, getPackLocation() returns its location.
  val packLocation = aiPackManager.getPackLocation(packName)
  if (packLocation != null) {
    _aiPackStatus.value = AiPackStatus.Installed(packLocation.toString())
    return
  }

  // Otherwise, ask Play for the current state of the pack.
  aiPackManager.getPackStates(listOf(packName))
    .addOnSuccessListener { states ->
      val state = states.packStates()[packName]
      _aiPackStatus.value = mapAiPackStatus(state = state)
    }.addOnFailureListener { e ->
      Log.e(TAG, "Failed to get pack states", e)
      _aiPackStatus.value = mapAiPackStatus(state = null)
    }
}
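The snippets in this chapter post values into _aiPackStatus, a flow of the sample project’s own AiPackStatus sealed class, which isn’t shown above. A minimal sketch consistent with how the snippets use it (the names are inferred from those calls, not taken from the library) might look like this:

```kotlin
// Sketch of the sample's UI-facing status model -- names inferred from
// how the chapter's snippets use it; not a type from the Play library.
sealed class AiPackStatus {
  object Unknown : AiPackStatus()
  object NotInstalled : AiPackStatus()
  object RequestConfirmation : AiPackStatus()
  data class Downloading(val progressPercentage: Int) : AiPackStatus()
  data class Installed(val location: String) : AiPackStatus()
  data class Failed(val errorCode: Int) : AiPackStatus()
}
```

Modeling the status as a sealed class lets the UI exhaustively handle every state in a single when expression.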

Fetching AI Packs

The aiPackManager handles the heavy lifting. To download an AI pack, simply call fetch() and provide the pack names:

fun fetchAiPacks() {
  aiPackManager.fetch(AI_PACKS)
}

Cancelling Requests

If you want to cancel a fetch request, cancelling is just as simple as fetching. Adding this function to the MainViewModel lets you do that:

fun cancelRequests() {
  aiPackManager.cancel(AI_PACKS)
}

Displaying Status Updates

The Play AI Delivery Library reports an AiPackState whenever you fetch AI packs or check their status. Each state exposes a status code, defined by the library’s AiPackStatus annotation:

@Retention(RetentionPolicy.CLASS)
public @interface AiPackStatus {
  int UNKNOWN = 0;
  int NOT_INSTALLED = 8;
  int PENDING = 1;
  int WAITING_FOR_WIFI = 7;
  int REQUIRES_USER_CONFIRMATION = 9;
  int DOWNLOADING = 2;
  int TRANSFERRING = 3;
  int COMPLETED = 4;
  int FAILED = 5;
  int CANCELED = 6;
}
You can map these library states to the UI-friendly statuses your app uses like this:

private fun mapAiPackStatus(state: AiPackState?): AiPackStatus {
  return when (state?.status()) {
    AssetPackStatus.NOT_INSTALLED -> {
      AiPackStatus.NotInstalled
    }

    AssetPackStatus.WAITING_FOR_WIFI,
    AssetPackStatus.REQUIRES_USER_CONFIRMATION -> {
      AiPackStatus.RequestConfirmation
    }

    AssetPackStatus.PENDING,
    AssetPackStatus.DOWNLOADING,
    AssetPackStatus.TRANSFERRING -> {
      AiPackStatus.Downloading(state.transferProgressPercentage())
    }

    AssetPackStatus.COMPLETED -> {
      val assetLocation = getAssetFromAiPack(
        packName = AI_PACK_NAME,
        assetName = Model.TINYLLAMA_1_1B_CHAT_V1_0.modelName
      )
      if (assetLocation == null) {
        AiPackStatus.NotInstalled
      } else {
        AiPackStatus.Installed(location = assetLocation)
      }
    }

    AssetPackStatus.CANCELED,
    AssetPackStatus.FAILED -> {
      AiPackStatus.Failed(errorCode = state.errorCode())
    }

    else -> {
      AiPackStatus.Unknown
    }
  }
}

Listening to Status Updates

Whenever you fetch an AI pack or check its status, the Play AI Delivery Library makes listening to these status updates easy: implement the AiPackStateUpdateListener interface from the library and register it with the manager.

init {
  // The ViewModel implements AiPackStateUpdateListener, so register it here.
  aiPackManager.registerListener(this)
}

override fun onStateUpdate(state: AiPackState) {
  Log.d(TAG, "onStateUpdate: $state")
  _aiPackStatus.value = mapAiPackStatus(state)
}

Accessing AI Packs

Once an AiPackState reaches the COMPLETED status, you can access the AI pack from the file system. Calling aiPackManager.getPackLocation() returns the root folder of the downloaded AI pack.

private fun getAssetFromAiPack(packName: String, assetName: String): String? {
  val aiPackLocation = aiPackManager.getPackLocation(packName)
  val assetsFolderPath = aiPackLocation?.assetsPath()
  val assetFile = File(assetsFolderPath, assetName)
  Log.d(TAG, "Asset path: ${assetFile.absolutePath}")

  return if (assetFile.exists()) assetFile.absolutePath else null
}
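With the asset path in hand, you can point your inference runtime at the delivered model. As a sketch, assuming the MediaPipe LLM Inference API used for the TinyLlama app in Chapter 5 (verify the exact option names against the MediaPipe version you’re using):

```kotlin
import android.content.Context
import com.google.mediapipe.tasks.genai.llminference.LlmInference

// Sketch: load the model delivered by the AI pack into MediaPipe's
// LLM Inference task, instead of a manually copied model file.
fun createLlm(context: Context, modelPath: String): LlmInference {
  val options = LlmInference.LlmInferenceOptions.builder()
    .setModelPath(modelPath) // absolute path returned by getAssetFromAiPack()
    .build()
  return LlmInference.createFromOptions(context, options)
}
```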

[Optional] Step 3: Slaying the Fragmentation Dragon with Device Targeting

Remember the “last mile” problem from the beginning of this chapter? The other half of it is the sheer diversity of Android devices. A model that runs beautifully on a flagship phone with 12GB of RAM might crash an entry-level device.

Device targeting lets Play deliver different AI packs to different classes of hardware. First, enable the experimental device targeting API in your gradle.properties file:

android.experimental.enableDeviceTargetingConfigApi=true
Next, define your device groups in a device_targeting_config.xml file. Here, a gemini_on_device group matches devices with a Google Tensor G5 system-on-chip:

<config:device-targeting-config xmlns:config="http://schemas.android.com/apk/config">

  <config:device-group name="gemini_on_device">
      <config:device-selector>
          <config:system-on-chip manufacturer="Google" model="TensorG5"/>
      </config:device-selector>
  </config:device-group>

</config:device-targeting-config>
Finally, point the bundle configuration in your app-level build.gradle at that file, and enable splitting by device group:

android {
  ...
  bundle {
    deviceTargetingConfig = file('device_targeting_config.xml')
    deviceGroup {
      enableSplit = true   // split bundle by #group
      defaultGroup = "other"  // group used for standalone APKs
    }
  }
  ...
}

Step 4: Testing On-Device AI Packs Locally

Testing on-demand features (such as AI packs with models) traditionally requires uploading builds to the Google Play Console, a process that introduces significant latency into the development cycle and can be the trickiest part of the workflow.

What is Bundletool?

Bundletool is the underlying tool that Android Studio, the Android Gradle plugin, and Google Play use to build an Android App Bundle. It’s also available as a command-line tool.

On macOS, you can install it with Homebrew:

brew install bundletool

The Core Workflow: Using Bundletool for On-Device Deployment

In this section, you’ll generate signed App Bundles or APKs, which require your keystore information – just like preparing an app for deployment to the Google Play Store.

Existing Keystore

Create New Keystore

First, build a signed release App Bundle:

./gradlew :app:bundleRelease

Then use bundletool to generate a set of APKs from the bundle. The --local-testing flag makes on-demand packs available for testing without going through the Play Store, and --connected-device builds only the APKs needed for the attached device:

bundletool build-apks \
--bundle=app/build/outputs/bundle/release/app-release.aab \
--output=app/build/outputs/bundle/release/app-release.apks \
--local-testing \
--connected-device
Upon successful execution, the outputs folder will look like this:
Build Folder

Finally, install the generated APKs on the connected device:

bundletool install-apks --apks=app/build/outputs/bundle/release/app-release.apks
Ready for Testing

Conclusion

Shipping on-device AI has always been a battle on two fronts: building the feature and building the infrastructure to deploy and manage it. Play for On-device AI effectively eliminates that second front.  

Have a technical question? Want to report a bug? You can ask questions and report bugs to the book authors in our official book forum here.
© 2026 Kodeco Inc.
