Using Google’s MLKit and CameraX for Lightweight Barcode Scanning

What is Google’s MLKit?

MLKit is a powerful Machine Learning library optimized for mobile applications. Previously known as ML Kit for Firebase, it is now a standalone library. It includes a variety of APIs — from barcode scanning to text recognition and translation. The Barcode Scanning API is easy to use, runs on the device (no network connection required), and decodes most standard barcode formats, including Linear formats such as UPC-A and UPC-E, as well as 2D formats such as QR Codes.

More documentation here

What is CameraX?

CameraX is a Jetpack library that removes many of the frustrations Android developers have traditionally faced when working with the camera. Things like aspect ratio and orientation are handled automatically, leaving the developer free to focus on building robust user experiences rather than worrying about camera configuration.

Similar to MLKit, CameraX is very easy to use. The functionality of CameraX is divided into different “use cases”. At the time of writing there are three: Preview, Image analysis, and Image capture. The Preview use case displays a live view of the camera, the Image analysis use case gives us access to the camera frames so we can analyze them, and the Image capture use case takes and saves a photo. To implement barcode scanning, we will use the Preview use case to display the camera to the user and Image analysis to send image buffers to MLKit to decode the barcodes.

More documentation here

Let’s Build!

1. Setting Up

First, add the needed MLKit and CameraX dependencies to your build.gradle file:
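A minimal sketch of what that dependency block might look like. The artifact coordinates are the published ML Kit and CameraX ones, but the version numbers are assumptions; check the release notes for the current stable versions before copying them.

```groovy
dependencies {
    // ML Kit barcode scanning. This "play-services" artifact fetches the model
    // through Google Play Services instead of bundling it in the APK.
    // Version is an assumption; use the latest release.
    implementation 'com.google.android.gms:play-services-mlkit-barcode-scanning:18.3.0'

    // CameraX. Version is an assumption; use the latest stable release.
    def camerax_version = "1.3.4"
    implementation "androidx.camera:camera-camera2:$camerax_version"
    implementation "androidx.camera:camera-lifecycle:$camerax_version"
    // Provides PreviewView, used in the layout later on.
    implementation "androidx.camera:camera-view:$camerax_version"
}
```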

Note: This MLKit dependency will dynamically download the barcode scanning model from Google Play Services. Alternatively, the model can be bundled with your app. More info here.

Dynamically downloading the model has pros and cons. The model is not packaged in our APK, so the APK stays small, but users can end up in a less-than-ideal state if the model fails to download when needed (more on this later).
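One way to reduce that risk is to ask Google Play to fetch the model when the app is installed rather than on first use. This is an optional extra and an assumption about how you might configure your app, not part of the sample described here. It is done with a manifest entry along these lines:

```xml
<application>
    <!-- Existing attributes and children of <application> omitted. -->

    <!-- Ask Google Play to download the barcode model at install time. -->
    <meta-data
        android:name="com.google.mlkit.vision.DEPENDENCIES"
        android:value="barcode" />
</application>
```

Even with this entry, the first scan can still fail if the download has not finished, so the failure case still needs handling.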

Next, add the necessary permission to your Manifest so we can get access to the camera:
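A sketch of the relevant part of AndroidManifest.xml; the rest of the manifest is omitted.

```xml
<manifest xmlns:android="http://schemas.android.com/apk/res/android">

    <!-- Required to open the camera. On Android 6.0+ it must also be
         granted by the user at runtime (handled in step 3). -->
    <uses-permission android:name="android.permission.CAMERA" />

    <!-- <application> element omitted. -->
</manifest>
```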

2. Create the View
We’ll need an XML file that contains a PreviewView to display the camera feed and a TextView to display the results of decoding our barcode.

PreviewView comes from the camera-view dependency we defined above and is the recommended view to use when defining a CameraX Preview use case. The PreviewView is the View that represents our live camera view.
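A minimal layout sketch. The file name (activity_main.xml), the FrameLayout container, and the styling are assumptions; the ids cameraView and textView are chosen to match the names used in the rest of this walkthrough.

```xml
<?xml version="1.0" encoding="utf-8"?>
<FrameLayout xmlns:android="http://schemas.android.com/apk/res/android"
    android:layout_width="match_parent"
    android:layout_height="match_parent">

    <!-- Live camera feed; connected to the Preview use case in step 4. -->
    <androidx.camera.view.PreviewView
        android:id="@+id/cameraView"
        android:layout_width="match_parent"
        android:layout_height="match_parent" />

    <!-- Shows the decoded barcode value, overlaid at the bottom of the feed. -->
    <TextView
        android:id="@+id/textView"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:layout_gravity="bottom"
        android:background="#80000000"
        android:gravity="center"
        android:padding="16dp"
        android:textColor="@android:color/white"
        android:textSize="18sp" />
</FrameLayout>
```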

3. Request permissions
Now let’s work on our Activity. First, we need to make sure the user has granted us permission to use the camera. This part is pretty standard, so I will refrain from explaining in depth. You can read the official documentation here for any additional context needed.

Note: We’re using ViewBinding here:
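A sketch of what the Activity might look like at this point. It assumes the layout file is named activity_main.xml (so the generated binding class is ActivityMainBinding) and uses a hypothetical package name. It requests the permission with the AndroidX Activity Result API, which is one of several valid approaches and may differ from the one in the full sample.

```kotlin
package com.example.barcodescanner // hypothetical package name

import android.Manifest
import android.content.pm.PackageManager
import android.os.Bundle
import android.widget.Toast
import androidx.activity.result.contract.ActivityResultContracts
import androidx.appcompat.app.AppCompatActivity
import androidx.core.content.ContextCompat
import com.example.barcodescanner.databinding.ActivityMainBinding

class MainActivity : AppCompatActivity() {

    private lateinit var binding: ActivityMainBinding

    // Shows the system permission dialog and reports the user's choice.
    private val requestCameraPermission =
        registerForActivityResult(ActivityResultContracts.RequestPermission()) { granted ->
            if (granted) {
                bindCameraUseCases()
            } else {
                Toast.makeText(
                    this,
                    "Camera permission is required to scan barcodes",
                    Toast.LENGTH_LONG
                ).show()
            }
        }

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        binding = ActivityMainBinding.inflate(layoutInflater)
        setContentView(binding.root)

        // Start the camera right away if we already have permission,
        // otherwise ask for it first.
        if (hasCameraPermission()) {
            bindCameraUseCases()
        } else {
            requestCameraPermission.launch(Manifest.permission.CAMERA)
        }
    }

    private fun hasCameraPermission(): Boolean =
        ContextCompat.checkSelfPermission(this, Manifest.permission.CAMERA) ==
            PackageManager.PERMISSION_GRANTED

    private fun bindCameraUseCases() {
        // Filled in over the next steps.
    }
}
```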

To use ViewBinding, make sure it is enabled in your app level build.gradle file:
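The setting lives in the android block of the module's build.gradle:

```groovy
android {
    buildFeatures {
        // Generates a binding class for each XML layout file.
        viewBinding true
    }
}
```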

4. Set up the Preview Use Case
Now, let’s flesh out our bindCameraUseCases() method.
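A minimal sketch of the method, written as a member of the Activity, where binding is its ViewBinding instance. The choice of the back camera, the log tag, and the error handling are assumptions.

```kotlin
// Imports to add at the top of the Activity file:
import android.util.Log
import androidx.camera.core.CameraSelector
import androidx.camera.core.Preview
import androidx.camera.lifecycle.ProcessCameraProvider
import androidx.core.content.ContextCompat

// Member of the Activity; `binding` is its ViewBinding instance.
private fun bindCameraUseCases() {
    val cameraProviderFuture = ProcessCameraProvider.getInstance(this)

    // The provider is delivered asynchronously; run the setup on the main thread.
    cameraProviderFuture.addListener({
        val cameraProvider = cameraProviderFuture.get()

        // Preview use case, connected to the PreviewView from the layout.
        val previewUseCase = Preview.Builder()
            .build()
            .also { it.setSurfaceProvider(binding.cameraView.surfaceProvider) }

        // Assumption: barcodes are scanned with the back camera.
        val cameraSelector = CameraSelector.DEFAULT_BACK_CAMERA

        try {
            // Clear any earlier bindings before binding again.
            cameraProvider.unbindAll()
            cameraProvider.bindToLifecycle(this, cameraSelector, previewUseCase)
        } catch (e: Exception) {
            // bindToLifecycle() throws if the use cases or selector cannot be resolved.
            Log.e("BarcodeScanner", "Could not bind camera use cases", e)
        }
    }, ContextCompat.getMainExecutor(this))
}
```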

The most important parts here are our previewUseCase object and our call to cameraProvider.bindToLifecycle(). Remember, the Preview use case allows us to show the live camera feed to the user. We make an instance of this use case (Preview.Builder().build()), connect it to the PreviewView we defined in our XML (it.setSurfaceProvider(binding.cameraView.surfaceProvider)), and bind it to the lifecycle of our Activity (cameraProvider.bindToLifecycle(this, cameraSelector, previewUseCase)). Calling bindToLifecycle() completes the setup and allows the camera preview to actually render on the screen.

Your app should now show the live camera preview!

5. Set up ML Kit’s BarcodeScanner and the Image Analysis Use Case
In the same block of code where we’re defining our Preview use case, we can add an Image Analysis use case. This is what will allow us to take the live preview and perform logic with its contents.

We also need to update our call to cameraProvider.bindToLifecycle() to include the new analysisUseCase:
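Putting both changes together, the updated method might look like the sketch below. The list of formats is illustrative only, and the executor handling is an assumption: a real app would keep a reference to the executor and shut it down in onDestroy(). The Barcode import path shown applies to recent ML Kit releases; older releases use com.google.mlkit.vision.barcode.Barcode.

```kotlin
// Imports to add at the top of the Activity file:
import android.util.Log
import androidx.camera.core.CameraSelector
import androidx.camera.core.ImageAnalysis
import androidx.camera.core.Preview
import androidx.camera.lifecycle.ProcessCameraProvider
import androidx.core.content.ContextCompat
import com.google.mlkit.vision.barcode.BarcodeScannerOptions
import com.google.mlkit.vision.barcode.BarcodeScanning
import com.google.mlkit.vision.barcode.common.Barcode
import java.util.concurrent.Executors

// Member of the Activity; `binding` is its ViewBinding instance.
private fun bindCameraUseCases() {
    val cameraProviderFuture = ProcessCameraProvider.getInstance(this)

    cameraProviderFuture.addListener({
        val cameraProvider = cameraProviderFuture.get()

        val previewUseCase = Preview.Builder()
            .build()
            .also { it.setSurfaceProvider(binding.cameraView.surfaceProvider) }

        // Restrict the scanner to the formats the app needs (illustrative set).
        val options = BarcodeScannerOptions.Builder()
            .setBarcodeFormats(
                Barcode.FORMAT_UPC_A,
                Barcode.FORMAT_UPC_E,
                Barcode.FORMAT_EAN_13,
                Barcode.FORMAT_QR_CODE
            )
            .build()
        val barcodeScanner = BarcodeScanning.getClient(options)

        // Image analysis use case: each camera frame is handed to
        // processImageProxy() (step 6) on a background thread.
        val analysisUseCase = ImageAnalysis.Builder().build()
        analysisUseCase.setAnalyzer(Executors.newSingleThreadExecutor()) { imageProxy ->
            processImageProxy(barcodeScanner, imageProxy)
        }

        val cameraSelector = CameraSelector.DEFAULT_BACK_CAMERA

        try {
            cameraProvider.unbindAll()
            // Bind both use cases to the Activity's lifecycle.
            cameraProvider.bindToLifecycle(
                this, cameraSelector, previewUseCase, analysisUseCase
            )
        } catch (e: Exception) {
            Log.e("BarcodeScanner", "Could not bind camera use cases", e)
        }
    }, ContextCompat.getMainExecutor(this))
}
```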

Within setBarcodeFormats(), I’m configuring my BarcodeScanner to accept a specific set of barcode formats, including QR codes. The formats you list will depend on which types of barcodes you need to support. Specify only the formats you need: limiting the set lets MLKit skip formats it would otherwise have to look for, which speeds up detection.

6. Decoding the barcodes
Our processImageProxy() function that we pass into analysisUseCase.setAnalyzer() will contain our actual meaningful logic. In this method, we will pass our image over to MLKit’s process method in order to decode the barcode:
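A sketch of what this function might look like as a member of the Activity. The parameter names, the log tag, and the error message are assumptions. ImageProxy.image is marked experimental in CameraX, hence the opt-in annotation.

```kotlin
// Imports to add at the top of the Activity file:
import android.util.Log
import androidx.camera.core.ExperimentalGetImage
import androidx.camera.core.ImageProxy
import com.google.mlkit.vision.barcode.BarcodeScanner
import com.google.mlkit.vision.common.InputImage

// Member of the Activity; `binding` is its ViewBinding instance.
@androidx.annotation.OptIn(markerClass = [ExperimentalGetImage::class])
private fun processImageProxy(barcodeScanner: BarcodeScanner, imageProxy: ImageProxy) {
    // The underlying media image can be null; close the frame and bail out
    // so the analyzer keeps receiving new frames.
    val mediaImage = imageProxy.image
    if (mediaImage == null) {
        imageProxy.close()
        return
    }

    // Wrap the frame in the type ML Kit expects, including its rotation.
    val inputImage = InputImage.fromMediaImage(
        mediaImage,
        imageProxy.imageInfo.rotationDegrees
    )

    // These listeners are invoked on the main thread by default,
    // so it is safe to touch views from them.
    barcodeScanner.process(inputImage)
        .addOnSuccessListener { barcodes ->
            // Show the first barcode found in the frame, if any.
            barcodes.firstOrNull()?.rawValue?.let { rawValue ->
                binding.textView.text = rawValue
            }
        }
        .addOnFailureListener { e ->
            // For example, the model has not been downloaded yet.
            Log.e("BarcodeScanner", "Barcode detection failed", e)
            binding.textView.text = "Scanner unavailable, please try again"
        }
        .addOnCompleteListener {
            // Always release the frame, whether decoding succeeded or failed.
            imageProxy.close()
        }
}
```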

Here, we’re taking our ImageProxy object that we received from the analysisUseCase Analyzer and using it to create an InputImage object — the type that MLKit is expecting. We’re then passing that InputImage into barcodeScanner.process() which instructs MLKit to decode any barcode present in the image. We’ll take the first barcode detected in the frame, pull out the rawValue (in most cases, a string of numbers, but in some cases, such as QR codes, plain text or URLs), and set that rawValue to be the text of our textView. In this basic example, doing so will allow us to ensure the decoding is working properly. For a real app we’d likely want to use this rawValue to pass to an API call and get more information to display, such as product images and titles.

Once this is complete, we clean up by closing the imageProxy. This is important: CameraX will not deliver the next frame to the analyzer until the current ImageProxy is closed, so forgetting to close it leaves the scanner stalled on a single frame.

Equally important is the failure case. A failure can happen if the device cannot reach Google Play Services to download the needed barcode scanning model. If this happens, no analysis will take place and your app will do nothing. This is a good spot to show an error to the user and give them steps to resolve it: finding a better network connection, making sure they’re logged into the Play Store, or upgrading their Play Store version. These problems are rare and show up the first time the app is used; once the model has been downloaded successfully, it does not need to be downloaded again.

Voilà!

Now, running the app, you should see a live camera preview. When a barcode is presented in that preview, MLKit should decode the barcode’s value and our TextView should show the results!

Check out the entire code sample on GitHub here
