Scratching the surface of ML Deployment

Pooja Mahajan
Aug 14, 2021

In this article we will discuss some common deployment considerations and patterns. Deployment, the last step of the machine learning cycle, has gained a lot of traction recently, so let’s try to get the hang of some of the concepts.

We will cover some of the common deployment patterns, i.e. how to start consuming a new algorithm in production. Apart from this, we will touch on some concepts related to changing data distributions that affect model predictions.

Image Courtesy — Unsplash

Deployment Patterns

By deployment pattern we mean how to start consuming your new algorithm in production. The new algorithm may replace an older algorithm, or replace existing methods (manual or conventional), etc.

A) Shadow deployment:

  • The aim of shadow deployment is to deploy the new algorithm and evaluate its performance, but not to use it for real-time predictions.
  • The currently used method keeps serving the actual predictions, and its outputs are compared against the new algorithm (e.g. comparison with another deployed model, or with conventional methods using human predictions).
  • It’s a decent, risk-free way to judge a new model.
  • There is no impact on current production, and the new algorithm can be tested under production load (a minimal sketch follows this list).
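
For illustration, here is a minimal Python sketch of the shadow pattern, assuming two model objects with a `predict` method; `current_model` and `shadow_model` are placeholder names, not from any particular framework:

```python
import logging

logger = logging.getLogger("shadow")

def predict(features, current_model, shadow_model):
    # The model already in production serves the live prediction.
    live_prediction = current_model.predict(features)

    # The new model scores the same input, but its output is only logged
    # for offline comparison and is never returned to the caller.
    try:
        shadow_prediction = shadow_model.predict(features)
        logger.info("shadow comparison: live=%s shadow=%s",
                    live_prediction, shadow_prediction)
    except Exception:
        # A failure in the shadow model must not affect production traffic.
        logger.exception("shadow model failed")

    return live_prediction
```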

B) Canary deployment:

  • The main idea of canary deployment is to roll out the new algorithm to a small proportion of traffic and evaluate its performance.
  • It helps to monitor the new model and spot problems early, if any.
  • The rollout can be ramped up gradually, i.e. the traffic proportion served by the newer algorithm is increased incrementally (see the sketch after this list).
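
A toy sketch of canary routing in Python, assuming a purely random split and placeholder model objects; in practice the split is often keyed on a stable attribute such as a hashed user id so each user consistently hits one model:

```python
import random

CANARY_FRACTION = 0.05  # illustrative: start with ~5% of traffic, ramp up gradually

def predict(features, current_model, new_model, canary_fraction=CANARY_FRACTION):
    # Send a small, configurable slice of traffic to the new model.
    if random.random() < canary_fraction:
        return new_model.predict(features), "canary"
    return current_model.predict(features), "stable"
```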

C) Blue-green deployment:

  • Here blue refers to the current prediction service and green to the new prediction service.
  • We can set up the new prediction service separately, without stopping the current one, and then shift traffic to it. If it doesn’t go well, we can point back to blue.
  • Rollback is easy and there is no downtime, at the cost of running two services and the associated operational overhead (a toy switch is sketched after the image below).

Image — by Author
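
A toy blue-green switch in Python, assuming the two prediction services sit behind illustrative internal URLs; cutover and rollback amount to flipping one pointer:

```python
SERVICES = {
    "blue": "http://blue-predictor.internal/predict",    # current service (placeholder URL)
    "green": "http://green-predictor.internal/predict",  # new service (placeholder URL)
}

ACTIVE = "blue"  # flip to "green" to cut over, back to "blue" to roll back

def active_endpoint():
    # All callers resolve the prediction endpoint through this single switch,
    # so switching between services requires no downtime on either side.
    return SERVICES[ACTIVE]
```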

Deployment considerations

Once the model is deployed, it is important to watch for changes in the statistical distribution of the data it sees. There are two aspects to consider.

A) Data drift

  • It arises when the distribution of the underlying independent variables changes, which degrades model predictions.
  • This drift can come from changes in underlying business logic, data quality issues, etc. For example, in image recognition data drift can arise from setup changes such as lighting or camera device; in speech recognition it can come from a microphone upgrade. A minimal drift check is sketched below.
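
As a minimal sketch of a data-drift check, assuming we kept a reference sample of a feature from training time and can pull a recent sample from production; the significance threshold is an illustrative choice, not a universal rule:

```python
import numpy as np
from scipy.stats import ks_2samp

def feature_drifted(reference_sample, production_sample, alpha=0.05):
    # Two-sample Kolmogorov-Smirnov test: a small p-value suggests the two
    # samples come from different distributions, i.e. the feature has drifted.
    statistic, p_value = ks_2samp(reference_sample, production_sample)
    return p_value < alpha, statistic, p_value

# Synthetic example: production values shifted relative to training.
reference = np.random.normal(loc=0.0, scale=1.0, size=5000)
production = np.random.normal(loc=0.4, scale=1.0, size=5000)
print(feature_drifted(reference, production))
```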

B) Concept drift

  • It arises when the relationship between the predictor and target variables has changed, degrading model predictions; simply put, the relation x -> y changes (x being the independent variables and y the dependent variable).
  • A common example of concept drift is online shopping patterns before and after COVID, since consumer buying behaviour changed substantially. Another example is a price prediction model where the target variable (price) has shifted due to inflation. A rough monitoring sketch follows.
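
A rough sketch of concept-drift monitoring, assuming ground-truth labels eventually arrive (e.g. actual sale prices); the baseline error and tolerance factor are illustrative values that would be chosen offline:

```python
import numpy as np

def rolling_error_alerts(y_true, y_pred, window=500, baseline_mae=10.0, tolerance=1.5):
    """Flag windows where live error drifts well above the offline baseline."""
    alerts = []
    for start in range(0, len(y_true) - window + 1, window):
        window_mae = np.mean(np.abs(y_true[start:start + window] -
                                    y_pred[start:start + window]))
        if window_mae > tolerance * baseline_mae:
            # The learned x -> y relation may no longer hold; consider retraining.
            alerts.append((start, window_mae))
    return alerts
```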

So that’s it, we have made it to the end of this article. It’s just the tip of the deployment iceberg; there is a lot underneath!
