Practical Android AI

First Edition · Android 13 · Kotlin 2.0 · Android Studio Otter

7. Optimizing AI Performance & Deployment with Play for On-device AI
Written by Zahidur Rahman Faisal


Hello reader! You’ve explored most of Android’s AI/ML solutions on your way to this chapter. You’ve learned ML Kit, MediaPipe, Gemini, and Firebase AI Logic, and built some genuinely cool apps that are smart, fast, private, and in most cases work offline.

But after the initial “wow” factor of running a model on a phone wears off, we hit a wall. It’s what I call the “last mile” problem of on-device AI.

The proof-of-concept works great on your high-end test devices. But then the questions start rolling in:

“This new generative model is 500MB. We can’t just stick that in the APK, can we?”

“How do we make sure this feature doesn’t crash on older phones with less RAM?”

“Our data science team has a new, better version of the model. How do we ship it to users without forcing a full app update?”

Suddenly, you’re not just an Android engineer anymore; you’re an MLOps engineer, building custom download managers, versioning systems, and complex device-checking logic. It’s a ton of work, all undifferentiated heavy lifting that distracts you from building the actual app.

For a while, that was just the cost of doing business on the cutting edge. But that’s changing: Google recently launched Play for On-device AI, and it’s designed to solve exactly these problems. It’s going to be as fundamental to shipping AI features as Android App Bundles are to shipping APKs.

Remember building the On-device LLM app in Chapter 5? You had to download and copy the TinyLlama model manually, all 1.25 GB of it! Nobody wants to download an APK of that size, and the Play for On-device AI pack is the solution to that.

What Is a Play for On-device AI Pack?

Think about the problems that App Bundles solved. Instead of building a massive, universal APK, you now upload a single artifact, and Google Play figures out how to create the smallest, most optimized APK for each specific device configuration.

Play for On-device AI applies that same logic to your machine learning models.

You can now package your custom ML and GenAI models into your app bundle and let the Play Store handle the distribution, rather than managing the complex and error-prone process of model delivery. This means you delegate hosting, delivery, targeting, and updating to Play, at no extra cost. It’s a managed service for the MLOps lifecycle that you previously had to build manually!

The best part is that you control when the model is delivered to the user. You can configure model downloads as:

  • Install-time – delivered with the app during installation.

  • Fast-follow – downloaded immediately after the app is installed.

  • On-demand – downloaded only when your app explicitly requests it.
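If it helps to see how these modes map to configuration, each one corresponds to a deliveryType value in a pack module’s build.gradle. Here’s a sketch with placeholder pack names (the real setup for this chapter’s on-demand pack follows in Step 1):

```groovy
// Hypothetical pack modules, one per delivery mode -- names are placeholders.

// In installTimePack/build.gradle
aiPack {
  packName = "installTimePack"
  dynamicDelivery { deliveryType = "install-time" }
}

// In fastFollowPack/build.gradle
aiPack {
  packName = "fastFollowPack"
  dynamicDelivery { deliveryType = "fast-follow" }
}

// In onDemandPack/build.gradle
aiPack {
  packName = "onDemandPack"
  dynamicDelivery { deliveryType = "on-demand" }
}
```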

Since these AI packs are part of your app bundle, you also get all the usual Play Console benefits for free: test tracks, staged rollouts, and all the normal release-management tools now apply to your ML models too!

Updates are another nice touch: when you release a new version of your app, Play is smart about AI packs. If a particular pack hasn’t changed, users won’t need to download it again. Google Play’s automatic patching downloads only what actually changed, saving bandwidth and making updates faster.

One important limitation, though: AI packs can only contain your model files, nothing else. You can’t put Java, Kotlin, or native libraries in there. If you need to ship code to run your ML model, it has to go in your base module or a feature module. The good news is you can configure feature modules with the same delivery and targeting settings as your AI packs, so they work together seamlessly.
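As a sketch of that pairing, a feature module meant to arrive on demand, just like an on-demand AI pack, declares its delivery in the module’s AndroidManifest.xml using standard Play Feature Delivery markup (the string resource name below is a placeholder):

```xml
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:dist="http://schemas.android.com/apk/distribution">
    <!-- Deliver this module on demand, matching the AI pack it supports. -->
    <dist:module
        dist:instant="false"
        dist:title="@string/inference_feature_title">
        <dist:delivery>
            <dist:on-demand />
        </dist:delivery>
        <!-- Exclude the module from fused multi-APKs for older devices. -->
        <dist:fusing dist:include="false" />
    </dist:module>
</manifest>
```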

Solving the Deployment Puzzle: AI Packs and Delivery Modes

The core concept is the AI pack. It’s a special container within your app bundle that holds your models. The magic is in how these packs get delivered to the user. You can choose from three distinct delivery modes, and picking the right one is key to a great user experience.

An Overview of the Play for On-device AI Workflow

Using Play for On-device AI to package and deploy AI Packs is pretty straightforward. Let me break down the process for you:

Step 1: Configuring Your On-Demand AI Pack

Project Structure

// In onDemandAiPack/build.gradle
plugins {
  id 'com.android.ai-pack'
}

aiPack {
  packName = "onDemandAiPack"
  dynamicDelivery {
    deliveryType = "on-demand"
  }
}

// In app/build.gradle
android {
  //...
  assetPacks = [":onDemandAiPack"]
}

// In settings.gradle
include ':app'
include ':onDemandAiPack'
AI Pack Structure

Step 2: Integrating and Using Play AI Delivery

To download AI packs with fast-follow or on-demand delivery, you need the Play AI Delivery Library. This is where you get hands-on with its APIs, from requesting downloads to accessing the downloaded packs.

Adding the Dependency

Get started by adding this dependency in the app-level build.gradle file:

// In app/build.gradle
dependencies {
  ...
  // TUTORIAL DEPENDENCIES
  def aiDeliveryVersion = "0.1.1-alpha01"
  implementation "com.google.android.play:ai-delivery:$aiDeliveryVersion"
}
Checking AI Model

Checking the Status of AI Packs

Before using assets or models from an AI pack, you should check whether the pack has already been downloaded.

fun checkPackStatus(packName: String) {
  // If the pack is already on the device, getPackLocation() returns its location.
  val packLocation = aiPackManager.getPackLocation(packName)
  if (packLocation != null) {
    _aiPackStatus.value = AiPackStatus.Installed(packLocation.toString())
    return
  }

  // Otherwise, ask Play for the current state of the pack.
  aiPackManager.getPackStates(listOf(packName))
    .addOnSuccessListener { states ->
      val state = states.packStates()[packName]
      _aiPackStatus.value = mapAiPackStatus(state = state)
    }.addOnFailureListener { e ->
      Log.e(TAG, "Failed to get pack states", e)
      _aiPackStatus.value = mapAiPackStatus(state = null)
    }
}
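The snippets in this chapter post values into _aiPackStatus, a flow of the sample project’s own AiPackStatus sealed class, which isn’t shown above. A minimal sketch consistent with how the snippets use it (the names are inferred from those calls, not taken from the library) might look like this:

```kotlin
// Sketch of the sample's UI-facing status model -- names inferred from
// how the chapter's snippets use it; not a type from the Play library.
sealed class AiPackStatus {
  object Unknown : AiPackStatus()
  object NotInstalled : AiPackStatus()
  object RequestConfirmation : AiPackStatus()
  data class Downloading(val progressPercentage: Int) : AiPackStatus()
  data class Installed(val location: String) : AiPackStatus()
  data class Failed(val errorCode: Int) : AiPackStatus()
}
```

Modeling the status as a sealed class lets the UI exhaustively handle every state in a single when expression.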

Fetching AI Packs

The aiPackManager handles the heavy lifting. To download an AI pack, simply call fetch() and provide the pack names:

fun fetchAiPacks() {
  aiPackManager.fetch(AI_PACKS)
}

Cancelling Requests

If you want to cancel a fetch request, cancelling is just as simple as fetching. Adding this function to the MainViewModel lets you do that:

fun cancelRequests() {
  aiPackManager.cancel(AI_PACKS)
}

Displaying Status Updates

The Play AI Delivery Library reports an AiPackState whenever you fetch AI packs or check their status. Each state exposes a status code, defined by the library’s AiPackStatus annotation:

@Retention(RetentionPolicy.CLASS)
public @interface AiPackStatus {
  int UNKNOWN = 0;
  int NOT_INSTALLED = 8;
  int PENDING = 1;
  int WAITING_FOR_WIFI = 7;
  int REQUIRES_USER_CONFIRMATION = 9;
  int DOWNLOADING = 2;
  int TRANSFERRING = 3;
  int COMPLETED = 4;
  int FAILED = 5;
  int CANCELED = 6;
}
You can map these library states to the UI-friendly statuses your app uses like this:

private fun mapAiPackStatus(state: AiPackState?): AiPackStatus {
  return when (state?.status()) {
    AssetPackStatus.NOT_INSTALLED -> {
      AiPackStatus.NotInstalled
    }

    AssetPackStatus.WAITING_FOR_WIFI,
    AssetPackStatus.REQUIRES_USER_CONFIRMATION -> {
      AiPackStatus.RequestConfirmation
    }

    AssetPackStatus.PENDING,
    AssetPackStatus.DOWNLOADING,
    AssetPackStatus.TRANSFERRING -> {
      AiPackStatus.Downloading(state.transferProgressPercentage())
    }

    AssetPackStatus.COMPLETED -> {
      val assetLocation = getAssetFromAiPack(
        packName = AI_PACK_NAME,
        assetName = Model.TINYLLAMA_1_1B_CHAT_V1_0.modelName
      )
      if (assetLocation == null) {
        AiPackStatus.NotInstalled
      } else {
        AiPackStatus.Installed(location = assetLocation)
      }
    }

    AssetPackStatus.CANCELED,
    AssetPackStatus.FAILED -> {
      AiPackStatus.Failed(errorCode = state.errorCode())
    }

    else -> {
      AiPackStatus.Unknown
    }
  }
}

Listening to Status Updates

Whenever you fetch an AI pack or check its status, the Play AI Delivery Library makes listening to these status updates easy: implement the AiPackStateUpdateListener interface from the library and register it with the manager.

init {
  // The ViewModel implements AiPackStateUpdateListener, so register it here.
  aiPackManager.registerListener(this)
}

override fun onStateUpdate(state: AiPackState) {
  Log.d(TAG, "onStateUpdate: $state")
  _aiPackStatus.value = mapAiPackStatus(state)
}

Accessing AI Packs

Once an AiPackState reaches the COMPLETED status, you can access the AI pack from the file system. Calling aiPackManager.getPackLocation() returns the root folder of the downloaded AI pack.

private fun getAssetFromAiPack(packName: String, assetName: String): String? {
  val aiPackLocation = aiPackManager.getPackLocation(packName)
  val assetsFolderPath = aiPackLocation?.assetsPath()
  val assetFile = File(assetsFolderPath, assetName)
  Log.d(TAG, "Asset path: ${assetFile.absolutePath}")

  return if (assetFile.exists()) assetFile.absolutePath else null
}
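With the asset path in hand, you can point your inference runtime at the delivered model. As a sketch, assuming the MediaPipe LLM Inference API used for the TinyLlama app in Chapter 5 (verify the exact option names against the MediaPipe version you’re using):

```kotlin
import android.content.Context
import com.google.mediapipe.tasks.genai.llminference.LlmInference

// Sketch: load the model delivered by the AI pack into MediaPipe's
// LLM Inference task, instead of a manually copied model file.
fun createLlm(context: Context, modelPath: String): LlmInference {
  val options = LlmInference.LlmInferenceOptions.builder()
    .setModelPath(modelPath) // absolute path returned by getAssetFromAiPack()
    .build()
  return LlmInference.createFromOptions(context, options)
}
```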

[Optional] Step 3: Slaying the Fragmentation Dragon with Device Targeting

Remember the “last mile” problem from the beginning of this chapter? The other half of it is the sheer diversity of Android devices. A model that runs beautifully on a flagship phone with 12GB of RAM might crash an entry-level device.

Device targeting lets Play deliver different AI packs to different classes of hardware. First, enable the experimental device targeting API in your gradle.properties file:

android.experimental.enableDeviceTargetingConfigApi=true
Next, define your device groups in a device_targeting_config.xml file. Here, a gemini_on_device group matches devices with a Google Tensor G5 system-on-chip:

<config:device-targeting-config xmlns:config="http://schemas.android.com/apk/config">

  <config:device-group name="gemini_on_device">
      <config:device-selector>
          <config:system-on-chip manufacturer="Google" model="TensorG5"/>
      </config:device-selector>
  </config:device-group>

</config:device-targeting-config>
Finally, point the bundle configuration in your app-level build.gradle at that file, and enable splitting by device group:

android {
  ...
  bundle {
    deviceTargetingConfig = file('device_targeting_config.xml')
    deviceGroup {
      enableSplit = true   // split bundle by #group
      defaultGroup = "other"  // group used for standalone APKs
    }
  }
  ...
}

Step 4: Testing On-Device AI Packs Locally

Testing on-demand features (such as AI packs with models) traditionally requires uploading builds to the Google Play Console, a process that introduces significant latency into the development cycle and can be the trickiest part of the workflow.

What is Bundletool?

Bundletool is the underlying tool that Android Studio, the Android Gradle plugin, and Google Play use to build an Android App Bundle. It’s also available as a command-line tool.

On macOS, you can install it with Homebrew:

brew install bundletool

The Core Workflow: Using Bundletool for On-Device Deployment

In this section, you’ll generate signed App Bundles or APKs, which require your keystore information – just like preparing an app for deployment to the Google Play Store.

Existing Keystore

Create New Keystore

First, build a signed release App Bundle:

./gradlew :app:bundleRelease

Then use bundletool to generate a set of APKs from the bundle. The --local-testing flag makes on-demand packs available for testing without going through the Play Store, and --connected-device builds only the APKs needed for the attached device:

bundletool build-apks \
--bundle=app/build/outputs/bundle/release/app-release.aab \
--output=app/build/outputs/bundle/release/app-release.apks \
--local-testing \
--connected-device
Upon successful execution, the outputs folder will look like this:
Build Folder

Finally, install the generated APKs on the connected device:

bundletool install-apks --apks=app/build/outputs/bundle/release/app-release.apks
Ready for Testing

Conclusion

Shipping on-device AI has always been a battle on two fronts: building the feature and building the infrastructure to deploy and manage it. Play for On-device AI effectively eliminates that second front.  

Have a technical question? Want to report a bug? You can ask questions and report bugs to the book authors in our official book forum here.
© 2026 Kodeco Inc.
