The rise of Machine Learning on mobile platforms

The time for Mobile ML is here, and the possibilities are many. If you’ve not yet given much thought to how Machine Learning technology can make your mobile software better, now is the time!

Machine Learning has long been a big part of our lives (even if we don’t often think about it). Estimating a customer’s likelihood to pay a bill or ranking pages in a web search result are common ML implementations we use often but rarely think about.

In part due to the expense of processing power (CPU/GPU) and data storage requirements, ML has for decades been the domain of darkened data centers rarely seen by end-users. This is rapidly changing, and mobile developers now have a plethora of new tools and platforms to choose from to make their current mobile solutions more valuable and open up new solution possibilities.

We’re in a golden era where all platform mega-vendors providing mobile infrastructure are rolling out mobile-accessible tools for mobile developers. For example:

  • Apple Core ML
  • Amazon Machine Learning for Android & iOS
  • Google ML Kit for Firebase
  • Microsoft Custom Vision export to Core ML
  • IBM Watson Services for Core ML

All of these are excellent offerings. In future posts I’ll be reviewing many of them, highlighting their relative strengths and exploring use cases — so stay tuned!

What’s Machine Learning, anyway?

Machine Learning is an idea with deep roots in computer science, dating back to Arthur Samuel’s coining of the term in 1959. But what is Machine Learning, and why is it now coming to mobile computing platforms?

Machine learning (ML) is a field of artificial intelligence that uses statistical techniques to give computer systems the ability to “learn” (e.g., progressively improve performance on a specific task) from data, without being explicitly programmed. — Wikipedia

The definition of ML brings to mind SkyNet in the Terminator movie series, where ML-enabled mobile devices run amok and plot to destroy humankind. But for the most part, ML is about using computational techniques to improve the effectiveness of computer software that addresses everyday computing problems for which conventional techniques have often failed.

Why is Machine Learning landing everywhere now?

ML has been around for decades, but the cost of computing power and data storage has kept it mostly locked in data centers with 7-digit budgets. Today, the CPUs and GPUs shipping with mobile devices have more computing power than web browsers and e-mail clients really need. Using that extra capacity to drive ML techniques on mobile is now possible. The answer to “Why now?” is simply: “Because now we can.”

From a programming point of view, ML is all about applying our current abundance of CPU/GPU power to solve problems that aren’t efficient, or even possible, to solve with traditional procedural programming techniques.

The rapidly increasing computing capacity of today’s mobile devices creates an opportunity to bring ML right down to the device level. The mobile platform providers mentioned above (and others) are responding with compelling, cost-effective solutions we can use at any layer of our tech stack, from server to mobile phone.

How Machine Learning Works (essentially)

ML is a broad, broad topic — much broader than I can cover in a single blog post. But from a software engineering point of view, ML is really about using statistical likelihood rather than deterministic procedural code to calculate answers given some input data.

ML without ML

Let’s say we didn’t have ML, and just wanted to classify images by writing some code. Our initial pass might look something like this:

import UIKit

// Hypothetical helpers (imageHasLotsOf, imageHasSun, and so on) stand
// in for hand-written image-analysis routines.
func identifyObjectInPhoto(image: UIImage) -> String {
   if imageHasLotsOf(UIColor.blue) {
      if imageHasSun() {
         if imageHasDiagonalLines() {
            return "Mountain"
         } else {
            return "Sky"
         }
      }
   } else if imageHasLotsOf(UIColor.green) {
      if imageHasVerticalBrownLine() {
         return "Tree"
      } else {
         return "Grass"
      }
   }
   return "I have no idea"
}

This code might actually work — sometimes. But it would fail too often to be reliable. Trying to replicate the human brain’s ability to recognize an image using procedural code alone is doomed to failure. Even if we could do it, our project sponsors couldn’t afford to pay us to develop this type of solution.

ML with ML

The alternative Machine Learning approach uses statistics to build a mathematical model (typically with an artificial neural network algorithm). Essentially, the network “trains itself” to classify images using a large “training data set” of example images.

At the most basic level, the neural network is actually kind of like the coding solution. It’s not code, but it would still develop a sort of evaluation algorithm that finds correlations between image traits it observes and “the right answer” — e.g. “Tree”, “Mountain”, “Grass”, etc.

The biggest difference is that the machine (the computer) uses training data to learn which traits should be paid attention to in order to classify the image.

The machine learning process that contrasts with the above code works like this: a training process examines a large set of labeled example images and builds a model that can classify new images.

There could be literally millions of branches in the logic that correctly classifies an image — more than a human programmer could ever produce by writing code. But the training process might take only a few minutes to build the model.

When building the model, the Machine Learning process follows a somewhat brute-force, trial-and-error process to find a set of tests that accurately predicts what the image is.

Note that predicts is a key word here, since ML models that are 100% accurate are actually rare.
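
To make this concrete, here’s a minimal sketch of what consuming such a trained model looks like on iOS, using Apple’s Core ML and Vision frameworks. The LandscapeClassifier name is a hypothetical stand-in: Xcode generates a Swift class like it for any trained .mlmodel file added to a project.

import UIKit
import Vision
import CoreML

// A minimal sketch: classify a UIImage with a bundled Core ML model.
// "LandscapeClassifier" is a hypothetical model name; Xcode generates
// a class like it for any .mlmodel file added to the project.
func identifyObjectInPhoto(image: UIImage, completion: @escaping (String) -> Void) {
   guard let cgImage = image.cgImage,
         let model = try? VNCoreMLModel(for: LandscapeClassifier().model) else {
      return completion("I have no idea")
   }
   let request = VNCoreMLRequest(model: model) { request, _ in
      // The model returns ranked predictions, not certainties.
      guard let top = (request.results as? [VNClassificationObservation])?.first else {
         return completion("I have no idea")
      }
      completion("\(top.identifier) (\(Int(top.confidence * 100))% confident)")
   }
   try? VNImageRequestHandler(cgImage: cgImage).perform([request])
}

Notice that the brittle if/else tree is replaced by a few lines that delegate all the “branching” to the trained model, and the answer comes back as a prediction with a confidence score rather than a certainty.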

Not Just for Images

Most mobile ML examples deal with the image classification domain — like the example above, determining whether a landscape photo shows a tree or a mountain. And this is an important domain for mobile: image data is notoriously difficult to work with as a data processing source, and mobile devices have cameras that serve as excellent data collection devices.

But the same training and model deployment strategy can work for all kinds of data. For example, text recognition is essentially accomplished in exactly the same way:

Text recognition begins by allowing a Machine Learning training process to examine lots and lots of letters in lots and lots of fonts (and even hand-written text), building a model that can predict what symbol (a letter, in Latin text) each image of a letter actually is.

As with images, text recognition isn’t perfect — we’ve probably all seen OCR output that misspells words! But ML models can be improved over time by feeding new data into subsequent training iterations, along with continued evolution in ML research and increases in CPU/GPU power.
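
To show what this looks like in practice, here’s a minimal sketch using Apple’s Vision framework, whose built-in text recognizer (available on iOS 13 and later) was trained in essentially this way:

import UIKit
import Vision

// A sketch of on-device text recognition with Apple's Vision
// framework (iOS 13+). The underlying model was trained on huge
// numbers of letter images, as described above.
func recognizeText(in image: UIImage, completion: @escaping ([String]) -> Void) {
   guard let cgImage = image.cgImage else {
      return completion([])
   }
   let request = VNRecognizeTextRequest { request, _ in
      // Each observation carries ranked candidate strings; take the
      // top candidate for each detected line of text.
      let lines = (request.results as? [VNRecognizedTextObservation])?
         .compactMap { $0.topCandidates(1).first?.string } ?? []
      completion(lines)
   }
   request.recognitionLevel = .accurate  // trade speed for accuracy
   try? VNImageRequestHandler(cgImage: cgImage).perform([request])
}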

Does ML Live on the Mobile Device or in the Cloud?

This article started with the idea that ML historically has lived in the data center due to high CPU/GPU and high data storage requirements. Does all this move onto mobile now that devices are so much more powerful than before?

This question really has an “it depends” answer, or maybe a “yes and no” answer. In practice, many ML models — once trained — are not large or processor-intensive, and can easily live “on the device”. In other words, it’s very common now to take the output asset of the ML training process and embed it into a mobile application.

Why put the model “on the device”?

Several good reasons:

  • If the model is on the device, it doesn’t need an Internet connection to be used. This can be critically important in many applications (especially B2B and commercial deployments where work is done without Internet connectivity).
  • Hosting ML models in the cloud isn’t free — so if the model can be embedded, both the hosting cost and cost of systems administration are eliminated.

Apple’s Core ML architecture, for example, supports (at the time of this writing) only models that are deployed to the device, i.e. embedded with the application (though the models themselves can be created in a variety of ways).

Why put the model “in the cloud”?

While ML models can, and often should, be embedded on device, there are some scenarios where it would make more sense for them to live in the cloud:

  • Typically, models that are deployed on device can’t be trained after they’re deployed. If your ML model needs frequent retraining, letting the mobile app consume the model as a remote resource behind a cloud API may make more sense.
  • Though many (maybe most) ML models are relatively compact and “fit” within a modern mobile device’s storage, some may be very large — and others may require more CPU/GPU power than exists on an end-user mobile device.

Can I have it both ways?

Yes, of course. Platform suppliers are already iterating their architectures to allow models to be used on device and/or in cloud data centers. Apple, for example, provides tools (such as its coremltools converters) to translate server-based models to Core ML for iOS distribution, while the original model can still be used on the server.

Others are developing architectures where local models can be used as a “fallback” when cloud models are unavailable. In this scenario, the most recently trained server-side model is used whenever it’s reachable; if it isn’t, an older local iteration of the model serves as a backup, as sketched below.
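
Here’s a minimal sketch of that fallback pattern. The endpoint URL is a hypothetical stand-in, and classifyLocally would wrap an embedded model like the Core ML example shown earlier.

import UIKit

// A sketch of the "cloud first, local fallback" pattern. The endpoint
// URL is a hypothetical stand-in; classifyLocally would wrap an
// embedded Core ML model like the one shown earlier.
func classify(image: UIImage,
              classifyLocally: @escaping (UIImage) -> String,
              completion: @escaping (String) -> Void) {
   guard let data = image.jpegData(compressionQuality: 0.8),
         let url = URL(string: "https://api.example.com/classify") else {
      return completion(classifyLocally(image))
   }
   var request = URLRequest(url: url)
   request.httpMethod = "POST"
   request.httpBody = data
   request.timeoutInterval = 5  // fail fast when offline
   URLSession.shared.dataTask(with: request) { body, _, error in
      if error == nil, let body = body,
         let label = String(data: body, encoding: .utf8) {
         completion(label)                   // freshest server-side model
      } else {
         completion(classifyLocally(image))  // older embedded model
      }
   }.resume()
}

The short timeout is the key design choice here: when the network is slow or absent, the user still gets an answer from the embedded model almost immediately instead of waiting on a dead connection.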

The best solution, of course, depends on the application’s needs and a variety of other factors.

Summary and Call to Action

If your company develops mobile apps for in-house or customer use, and you’ve not yet given much thought to how Machine Learning technology can make your software better, now is the time! The tools, platforms, and tech for integrating ML into your app have never been better, more affordable, or more accessible than they are today.

How could you use ML to make your app better? Start brainstorming with these:

  • Use text recognition to allow users to enter data with the camera rather than a virtual keyboard.
  • Add barcode scanning to your application to remove barriers to data entry.
  • Use a conversational bot to enable resolution and/or intelligent routing of customer service requests right from your mobile app.
  • Use image classification to recognize products visually (even products that don’t have barcodes).
  • Use image landmark detection to provide context-specific information to mobile users.
  • Train an ML model to recognize product failures (or possible failures) based on installed product photos, allowing customers to self-diagnose product failures and identify preventative maintenance oversights.
  • And many more!

The time for Mobile ML is here, and the possibilities are many. With ML being addressed in complementary ways on the platform side (IBM, Microsoft, AWS) and on the device side (Apple, Google), the stars are truly aligning for ML on mobile.

Share your thoughts! What are you using Mobile ML for, or what would you like it to do for your app?
