Machine Learning Deployment – Part I: Awareness

One of the most critical pieces of a successful data science project is taking a model from prototype to production. This series is primarily geared toward automated re-training and re-deployment, with an ultimate goal of online learning. There are many tools to consider, and it’s important to recognize how many “unknown unknowns” still exist in this field. Best practices and industry standards aren’t fully defined yet and are just beginning to emerge. We hope to provide insights and links to various articles we have found through research over the past year. We believe this can give you exposure to the model deployment landscape and guide your journey.

Sample Problem Statement

A data scientist has just finished building a predictive model, and it is time for deployment. The model needs to score data in real time when a user lands on a certain page of your company’s website, with the goal of providing that user with a better experience. In this example, we know the company is targeting new customer segments in the coming months, each with different preferences. We would like the model to be automatically re-trained and deployed once a day in order to provide these new customers with an optimal experience, as well as adjust to shifts in existing customers’ preferences. This project presents an exciting challenge without an immediately obvious solution.
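To make the daily re-training requirement concrete, here is a minimal sketch of what a retrain-and-redeploy job might look like, meant to be kicked off by a scheduler such as cron. The synthetic data, the logistic regression model, the AUC threshold, and the artifact naming are all illustrative assumptions rather than a prescribed stack.

```python
# Minimal sketch of a daily retrain-and-redeploy job, intended to be run
# once a day by a scheduler such as cron. The data source and the model
# here are stand-ins; in practice you would query recent data from your
# warehouse and publish the artifact to your serving layer.
import datetime

import joblib
from sklearn.datasets import make_classification  # stand-in for real data
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

MIN_AUC = 0.75  # acceptance threshold; tune for your problem


def retrain_and_deploy():
    # In production, replace this with a pull of recent labeled data.
    X, y = make_classification(n_samples=10_000, n_features=20, random_state=0)
    X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2)

    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    auc = roc_auc_score(y_val, model.predict_proba(X_val)[:, 1])

    # Gate the deployment: only publish if the new model clears the bar,
    # otherwise keep serving yesterday's model.
    if auc < MIN_AUC:
        raise RuntimeError(f"New model AUC {auc:.3f} below {MIN_AUC}; keeping current model")

    # Date-stamped artifact so any prior day's model can be restored.
    stamp = datetime.date.today().isoformat()
    joblib.dump(model, f"model_{stamp}.joblib")
    print(f"Deployed model_{stamp}.joblib with validation AUC {auc:.3f}")


if __name__ == "__main__":
    retrain_and_deploy()
```

The acceptance gate matters as much as the training itself: an automated redeploy should refuse to ship a model that fails its checks, since no human is reviewing each day’s run.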

It’s important to note that this model will drive business decisions and therefore should be well documented and maintainable going forward, so we can continue providing good customer service in the years to come. In short, those who originally built and deployed the model may not be the ones who maintain it long term, as they could move roles or leave the company. As with other software, training and deployment should be done in a way that someone looking at the code with fresh eyes can understand what is going on quickly enough to apply hot fixes and any necessary updates. After all, if these models become a critical part of your business, then downtime means potential lost revenue – and the more impact the model has, the more revenue is at risk. There are a number of ways to mitigate that risk, which we will discuss in this series.

Since data scientists enjoy experimentation, we should be able to easily roll back to previous versions of the model, and create a deployment process that lets data scientists add and remove features and supports A/B testing. Undoubtedly, there are a number of possible pitfalls that should be automatically tested for.
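As an illustration of the rollback requirement, here is a minimal sketch of a model registry that tracks which version is live and can revert to the previous one. The JSON file, function names, and version strings are hypothetical; a real deployment might back this with a database or a tool such as MLflow.

```python
# Minimal sketch of a model registry supporting promotion and rollback.
# The storage layout (a local JSON file) is an illustrative assumption.
import json
from pathlib import Path

REGISTRY = Path("registry.json")


def _load():
    if REGISTRY.exists():
        return json.loads(REGISTRY.read_text())
    return {"history": [], "current": None}


def promote(version: str):
    """Make `version` the live model, remembering the one it replaces."""
    state = _load()
    if state["current"]:
        state["history"].append(state["current"])
    state["current"] = version
    REGISTRY.write_text(json.dumps(state, indent=2))


def rollback():
    """Revert to the most recently promoted prior version."""
    state = _load()
    if not state["history"]:
        raise RuntimeError("No previous version to roll back to")
    state["current"] = state["history"].pop()
    REGISTRY.write_text(json.dumps(state, indent=2))


promote("model_2019-07-02.joblib")
promote("model_2019-07-03.joblib")
rollback()                     # the 2019-07-02 model is live again
print(_load()["current"])      # -> model_2019-07-02.joblib
```

Because every promotion records its predecessor, a bad daily retrain can be undone with a single call instead of an emergency code change – and the same registry gives an A/B test a clean way to name the two variants being compared.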

Having partaken in data mining competitions in the past, I had always wondered how automated re-training and deployment of models would work when real systems depend on them. As an avid learner, I found this particularly interesting. There are a number of risks to consider when we deploy a model, including: choosing the combination of metrics, and their acceptable thresholds, that the model should be tested against; foreseeing potential upstream changes that could impact the model; and monitoring the model’s features and output to signal an out-of-control process.
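For the last of these risks, a minimal sketch of one monitoring approach follows: capture per-feature statistics at training time and flag live features whose means drift too far from that baseline. The three-sigma rule here is our own illustrative choice; population stability index or Kolmogorov–Smirnov tests are common alternatives.

```python
# Minimal sketch of out-of-control monitoring: compare the live feature
# distribution against a baseline captured at training time.
import numpy as np


def fit_baseline(X_train: np.ndarray) -> dict:
    """Record per-feature mean and standard deviation at training time."""
    return {"mean": X_train.mean(axis=0), "std": X_train.std(axis=0)}


def drifted_features(baseline: dict, X_live: np.ndarray, sigmas: float = 3.0) -> np.ndarray:
    """Return indices of features whose live mean sits more than `sigmas`
    standard errors from the baseline mean -- a signal to alert and investigate."""
    n = len(X_live)
    stderr = baseline["std"] / np.sqrt(n)
    z = np.abs(X_live.mean(axis=0) - baseline["mean"]) / np.maximum(stderr, 1e-12)
    return np.where(z > sigmas)[0]


rng = np.random.default_rng(0)
baseline = fit_baseline(rng.normal(0, 1, size=(10_000, 5)))

live = rng.normal(0, 1, size=(1_000, 5))
live[:, 2] += 0.5  # simulate an upstream change shifting feature 2
print(drifted_features(baseline, live))  # -> [2]
```

The same idea applies to the model’s output: a sudden shift in the score distribution is often the first visible symptom of an upstream change.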

A common way to refer to what we’ve been discussing is “technical debt”. It is important to appreciate technical debt and understand when it makes sense to take it on, along with the pros and cons of doing so. Look for a future post in which we’ll discuss when and why we choose to take on technical debt at The General®. A resource we have found particularly helpful in guiding our decisions is a white paper Google published, Hidden Technical Debt in Machine Learning Systems. Specifically, we appreciated how Google breaks out technical debt at the system level versus the code level. Especially when working with data scientists, it is important to distinguish the technical debt accumulated in the pipelines engineers build from the debt incurred by the re-training requirements of the predictive model itself.

Part II of this series has been posted.

Mike Kehayes