“Deploying something useless into production, as soon as you can, is the right way to start a new project. It pulls unknown risk forward, opens up parallel streams of work, and establishes good habits.”
This is a quote from Pete Hodgson’s article ‘Hello, production’. In a nutshell, it explains the benefits of taking deployment pains early on in a software development project, and then using the initial deployment skeleton as the basis for rapidly delivering useful functionality into production.
The idea of making an initial ‘Hello, production’ release has had a big influence on how we think about the development of machine learning systems. We’ve mapped ‘Hello, production’ into the machine learning space as follows:
Train the simplest model conceivable and deploy it into production, as soon as you can.
A reasonable ‘Hello, production’ model could be one that returns the most frequent class (for classification tasks), or the mean value (for regression tasks). Scikit-Learn provides models for precisely this situation, in the `sklearn.dummy` sub-module. If the end goal is to serve predictions via a web API, then the next step is to develop the server and deploy it into a production environment. Alternatively, if the model is going to be used as part of a batch job, then the next step is to develop the job and deploy that into production.
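For instance, baseline models for both task types can be built in a few lines (the data here is purely illustrative):

```python
import numpy as np
from sklearn.dummy import DummyClassifier, DummyRegressor

X = np.array([[1.0], [2.0], [3.0], [4.0]])
y_regression = np.array([10.0, 20.0, 30.0, 40.0])
y_classification = np.array([0, 1, 1, 1])

# DummyRegressor(strategy="mean") ignores the features and always
# predicts the mean of the training labels
regressor = DummyRegressor(strategy="mean").fit(X, y_regression)
print(regressor.predict([[100.0]]))  # -> [25.]

# DummyClassifier(strategy="most_frequent") always predicts the modal class
classifier = DummyClassifier(strategy="most_frequent").fit(X, y_classification)
print(classifier.predict([[100.0]]))  # -> [1]
```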
The advantage of following this process is that it forces you to confront the following issues early on:
Each one of these issues is likely to involve input from people in other teams and is critical to overall success. Failure on any one of these can signal the end for a machine learning project, regardless of how well the models are performing. Success also demonstrates an ability to deliver functional software, which in our experience creates trust in a project, and often leads to more time being made available to experiment with training more complex model types.
Bodywork is laser-focused on making the deployment of machine learning projects to Kubernetes quick and easy. In what follows, we are going to show you how to use Bodywork to deploy a ‘Hello, production’ release for a hypothetical prediction service, using Scikit-Learn and FastAPI. We claim that it will take you under 15 minutes to work through the steps below, including setting up a local Kubernetes cluster for testing.
Deploying machine learning projects using Bodywork requires you to have a GitHub account, Python 3.8 installed on your local machine, and access to a Kubernetes cluster. If you already have access to Kubernetes, then skip to Step 1; otherwise, read on to set up a single-node Kubernetes cluster on your local machine, using Minikube.
If you don’t have access to a Kubernetes cluster, then an easy way to get started is with Minikube. If you are running on macOS with the Homebrew package manager available, then installing Minikube is as simple as running,
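```shell
brew install minikube
```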
If you’re running on Windows or Linux, then see the appropriate installation instructions. Once you have Minikube installed, start a cluster using the latest version of Kubernetes that Bodywork supports,
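Bodywork supports a specific range of Kubernetes versions, so check the Bodywork documentation for the exact version to pin - we leave it as a placeholder here:

```shell
# substitute vX.Y.Z with the latest Kubernetes version that the
# Bodywork documentation lists as supported
minikube start --kubernetes-version=vX.Y.Z
```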
And then enable ingress, so we can route HTTP requests to services deployed using Bodywork.
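```shell
minikube addons enable ingress
```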
You’ll also need the cluster’s IP address, which you can get using,
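```shell
minikube ip
```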
When you’re done with this tutorial, the cluster can be powered down using,
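```shell
minikube stop    # halt the cluster, preserving its state
minikube delete  # or tear it down completely
```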
Head over to GitHub and create a new public repository for this project - we called ours bodywork-scikit-fastapi-project. If you want to use Bodywork with private repos, you’ll have to configure Bodywork to authenticate with GitHub via SSH. The Bodywork User Guide contains details on how to do this, but we recommend that you come back to this at a later date and continue with a public repository for now.
Next, clone your new repository locally,
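Substituting your own GitHub username (and repository name, if you chose a different one):

```shell
git clone https://github.com/YOUR_USERNAME/bodywork-scikit-fastapi-project.git
cd bodywork-scikit-fastapi-project
```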
Create a dedicated Python 3.8 virtual environment in the root directory, and then activate it,
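For example (the environment name `.venv` is our own convention):

```shell
python3.8 -m venv .venv
source .venv/bin/activate
```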
Finally, install the packages required for this project, as shown below,
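The exact package list below is an assumption based on the tools used in this tutorial - pin versions as appropriate for your project:

```shell
pip install bodywork scikit-learn fastapi uvicorn
```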
Then open-up an IDE to continue developing the service.
We want to demonstrate a ‘Hello, production’ release, so we’ll train a Scikit-Learn `DummyRegressor`, configured to return the mean value of the labels in a training dataset, regardless of the feature data passed to it. This will still require you to acquire some data, one way or another.
For the purposes of this article, we have opted to create a synthetic one-dimensional regression dataset, where the only feature, `X`, has a 42% correlation with the labels, `y`, and both features and labels are distributed normally. We have added this step to our training script, `train_model.py`, reproduced below. When you run the training script, it will train a `DummyRegressor` and save it in the project’s root directory as `dummy_model.joblib`.
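A sketch of what `train_model.py` might look like - the random seed, sample size and choice of metrics are illustrative assumptions:

```python
# train_model.py - train a 'Hello, production' baseline model
import numpy as np
from joblib import dump
from sklearn.dummy import DummyRegressor
from sklearn.metrics import mean_absolute_error, r2_score

CORRELATION = 0.42
N_SAMPLES = 1000

# create a synthetic one-dimensional regression dataset, where both X and y
# are normally distributed and corr(X, y) is 0.42 (in expectation)
rng = np.random.default_rng(42)
X = rng.standard_normal(N_SAMPLES)
noise = rng.standard_normal(N_SAMPLES)
y = CORRELATION * X + np.sqrt(1 - CORRELATION ** 2) * noise

# a DummyRegressor with strategy='mean' ignores the features and always
# predicts the mean of the training labels
model = DummyRegressor(strategy="mean")
model.fit(X.reshape(-1, 1), y)

# persist the trained model for the web service to load
dump(model, "dummy_model.joblib")

# persist baseline metrics, for comparison with future model iterations
y_pred = model.predict(X.reshape(-1, 1))
metrics = (
    f"MAE: {mean_absolute_error(y, y_pred):.4f}\n"
    f"R^2: {r2_score(y, y_pred):.4f}\n"
)
with open("dummy_model_metrics.txt", "w") as f:
    f.write(metrics)

print(metrics)
```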
Beyond their use in ‘Hello, production’ releases, models such as this represent the most basic benchmark that any more sophisticated model must outperform - which is why the script also persists the model metrics in `dummy_model_metrics.txt`, for comparison with future iterations.
The ultimate aim for our chosen machine learning system is to serve predictions via a web API. Consequently, our initial ‘Hello, production’ release needs a skeleton web service that exposes the dummy model trained in Step 2. This is achieved in a Python module we’ve named `serve_model.py`, reproduced below, which you should also add to your project.
This module loads the trained model created in Step 2 and then configures FastAPI to start a server with an HTTP endpoint at `/api/v1/`. Instances of data, serialised as JSON, can be sent to this endpoint as HTTP POST requests. The schema for the JSON data payload is defined by the `FeatureDataInstance` class, which for our example expects only a single `float` field named `X`. For more information on defining JSON schemas using Pydantic and FastAPI, see the FastAPI docs.
Test the service locally by running `serve_model.py`,
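```shell
python serve_model.py
```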
And then in a new terminal, send the endpoint some data using `curl`,
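For example, assuming the service is listening on port 8000, the response should be a JSON object containing the mean of the training labels:

```shell
curl http://localhost:8000/api/v1/ \
    --request POST \
    --header "Content-Type: application/json" \
    --data '{"X": 0.5}'
```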
Which confirms that the service is working as expected.
All configuration for Bodywork deployments must be kept in a YAML file, named `bodywork.yaml` and stored in the project’s root directory. The `bodywork.yaml` required to deploy our ‘Hello, production’ release is reproduced below - add this file to your project.
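A sketch of the sort of configuration involved - the field names below follow the Bodywork documentation current at the time of writing, while the stage name `serve`, namespace-agnostic package list, and resource requests are our own choices, so check them against the docs for your version of Bodywork:

```yaml
version: "1.0"
project:
  name: bodywork-scikit-fastapi-project
  docker_image: bodyworkml/bodywork-core:latest
  DAG: serve
stages:
  serve:
    executable_module_path: serve_model.py
    requirements:
      - fastapi
      - uvicorn
      - scikit-learn
      - joblib
    cpu_request: 0.5
    memory_mb_request: 250
    service:
      max_startup_time_seconds: 30
      replicas: 1
      port: 8000
      ingress: true
logging:
  log_level: INFO
```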
Bodywork will interpret this file as follows:
Refer to the Bodywork User Guide for a complete discussion of all the options available for deploying machine learning systems using Bodywork.
The project is now ready to deploy, so the files must be committed and pushed to the remote repository we created on GitHub.
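```shell
git add -A
git commit -m "Hello, production release"
git push origin master
```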
When triggered, Bodywork will clone the remote repository directly from GitHub, analyse the configuration in `bodywork.yaml` and then execute the deployment plan contained within it.
The first thing we need to do is create and set up a Kubernetes namespace for our deployment. A namespace can be thought of as a virtual cluster (within the cluster), where related resources can be grouped together. Use the Bodywork CLI to do this,
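The namespace name `ml-pipeline` is our own choice - use whatever suits your project:

```shell
bodywork setup-namespace ml-pipeline
```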
The easiest way to run your first deployment is to execute the Bodywork workflow-controller locally,
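Assuming the `ml-pipeline` namespace from the previous step - substitute the URL of your own GitHub repository:

```shell
bodywork workflow --namespace=ml-pipeline \
    https://github.com/YOUR_USERNAME/bodywork-scikit-fastapi-project master
```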
This will orchestrate deployment on your cluster and stream the logs to your terminal. Refer to the Bodywork User Guide to run the workflow-controller remotely.
Once the deployment has completed, the prediction service will be ready for testing. Bodywork will create ingress routes to your endpoint using the following scheme: `/K8S_NAMESPACE/PROJECT_NAME--STAGE_NAME/`, so we can make a request for a prediction using,
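Assuming the namespace `ml-pipeline` and a stage named `serve` (as configured in `bodywork.yaml`), the request looks like this:

```shell
# CLUSTER_IP is the address returned by `minikube ip` in Step 0
curl http://CLUSTER_IP/ml-pipeline/bodywork-scikit-fastapi-project--serve/api/v1/ \
    --request POST \
    --header "Content-Type: application/json" \
    --data '{"X": 0.5}'
```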
Returning the same value we got when testing the service earlier on. Congratulations, you have just deployed your ‘Hello, production’ release!
If you used Minikube to test Bodywork locally, then the next logical step would be to deploy to a remote Kubernetes cluster. There are many options for creating managed Kubernetes clusters in the cloud - see our recommendations.
If a web service isn’t a suitable ‘Hello, production’ release for your project, then check out the Deployment Templates for other project types that may be a better fit - e.g. batch jobs or Jupyter notebook pipelines.
When your ‘Hello, production’ release is operational and available within your organisation, it’s then time to start thinking about monitoring your service and collecting data to enable the training of the next iteration. Godspeed!
If you run into any trouble, then don't hesitate to ask a question on our discussion board and we'll step in to help you out.