Let’s look at how this practice helped Google Play improve app install rate:
By comparing the statistics of serving logs and training data from the same day, Google Play discovered a few features that were always missing from the serving logs but always present in the training data. An online A/B experiment showed that removing this skew improved the app install rate on the main landing page of the app store by 2%.
Thus, one of the most important MLOps lessons Google has learned is: continuously monitor model input data for changes. For a production ML application, this is just as important as writing unit tests.
Let’s take a look at how skew detection works in Vertex AI.
How skew is identified
Vertex AI enables skew detection for numerical and categorical features. For each feature that is monitored, first the statistical distribution of the feature’s values in the training data is computed. Let’s call this the “baseline” distribution.
The production (i.e., serving) feature inputs are logged and analyzed at a user-determined time interval. This interval is set to 24 hours by default, and can be set to any value greater than 1 hour. For each time window, the statistical distribution of each monitored feature’s values is computed and compared against the training baseline. A statistical distance score is then computed between the serving feature distribution and the baseline distribution: Jensen-Shannon (JS) divergence is used for numerical features, and L-infinity distance is used for categorical features. When this distance score exceeds a user-configurable threshold, it indicates skew between the training and production feature values.
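To make the two distance scores concrete, here is a minimal Python sketch of the comparison described above. It is an illustration only, not Vertex AI's implementation: the function names, the fixed histogram bin edges, the base-2 logarithm, the sample data, and the 0.25 threshold are all assumptions chosen for the example.

```python
import math
from collections import Counter


def categorical_distribution(values):
    """Return {category: relative frequency} for a list of categorical values."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}


def histogram_distribution(values, bin_edges):
    """Bin numerical values using shared bin edges and return relative frequencies.

    Values outside the edges are clamped into the first or last bin.
    """
    counts = [0] * (len(bin_edges) - 1)
    last = len(counts) - 1
    for v in values:
        for i in range(len(counts)):
            if v < bin_edges[i + 1] or i == last:
                counts[i] += 1
                break
    total = sum(counts)
    return [c / total for c in counts]


def l_infinity_distance(p, q):
    """Largest absolute difference in frequency across all categories."""
    categories = set(p) | set(q)
    return max(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two aligned probability lists.

    Uses log base 2 (an assumption), so the score lies between 0 and 1.
    """
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical threshold and data, chosen only for illustration.
SKEW_THRESHOLD = 0.25

# Categorical feature: baseline from training data vs. one serving window.
train_country = categorical_distribution(["US"] * 50 + ["IN"] * 30 + ["BR"] * 20)
serve_country = categorical_distribution(["US"] * 20 + ["IN"] * 30 + ["BR"] * 50)
country_score = l_infinity_distance(train_country, serve_country)

# Numerical feature: both distributions must use the same bins.
age_bins = [0, 20, 40, 60, 80]
train_age = histogram_distribution([10, 25, 30, 35, 45, 50, 55, 70], age_bins)
serve_age = histogram_distribution([65, 70, 75, 70, 72, 68, 77, 79], age_bins)
age_score = js_divergence(train_age, serve_age)

for name, score in [("country", country_score), ("age", age_score)]:
    status = "skew detected" if score > SKEW_THRESHOLD else "ok"
    print(f"{name}: distance={score:.3f} ({status})")
```

Note that the numerical comparison only makes sense when the training and serving histograms share the same bins, which is why the bin edges are passed in explicitly. How a real monitoring system chooses those bins is a design decision this sketch does not try to reproduce.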