dmlflow is a repository containing infrastructure as code and application code to deploy an MLflow server to AWS. The infrastructure as code is written in Terraform. MLflow is a framework for systematically recording experiments that also acts as a model registry. AWS Fargate hosts the MLflow server remotely, AWS S3 serves as the artifact store (model weights, ids-to-tokens, etc.), and AWS Aurora serves as the database management service for the MLflow backend store (model metrics, version info, etc.).
Clone the repository. Then, to ensure you have all the required CLI tools, run the script:
./check_cli_tools.sh
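For orientation, a prerequisite check of this kind can be sketched as follows. This is a minimal illustration, not the contents of check_cli_tools.sh: the function name missing_tools and the tool list (terraform, aws, docker) are assumptions based on the Terraform and AWS stack described above.

```shell
#!/bin/sh
# Hypothetical sketch of a CLI prerequisite check.
# The real check_cli_tools.sh may test a different set of tools.

# Print the name of every argument that is not available on PATH.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
    fi
  done
}

# Assumed tool list for a Terraform-on-AWS deployment.
missing="$(missing_tools terraform aws docker)"
if [ -n "$missing" ]; then
  echo "Missing CLI tools: $(echo "$missing" | tr '\n' ' ')" >&2
else
  echo "All required CLI tools found."
fi
```

The sketch only reports what is missing and does not exit, so it can be sourced or extended without aborting the calling shell.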
To remotely deploy the service, run the script:
./deploy.sh
Incoming requests come through the internet gateway and are sent to the application load balancer, which forwards each request to the Fargate task serving the MLflow tracking server. Any responses from the server are routed through the network address translation gateway between the private and public subnets, and back out the internet gateway to the tracking service user.
Note: No authentication is set up yet. The plan is to add basic single-user authentication via an Nginx proxy.
Note: When configuring the tracking server as below, there is a conflict between the flags --default-artifact-root and --artifacts-destination.
Option 'default-artifact-root' is required, when backend store is not local file based
The default-artifact-root flag specifies where artifacts are stored. S3 is used as the artifact root location in this deployment.
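One way to satisfy the requirement in the error message is to pass --default-artifact-root together with a database backend store URI and leave --artifacts-destination out. The sketch below only prints such a command. The database URI, bucket name, host, and port are placeholder assumptions, not values from this repository, and the helper mlflow_server_cmd is hypothetical.

```shell
#!/bin/sh
# Sketch only: every value below is a placeholder assumption.
BACKEND_STORE_URI="${BACKEND_STORE_URI:-mysql+pymysql://user:password@aurora-endpoint:3306/mlflow}"
ARTIFACT_ROOT="${ARTIFACT_ROOT:-s3://example-mlflow-artifacts}"

# Build the server command line from a backend store URI ($1) and an
# artifact root ($2). --artifacts-destination is deliberately not passed.
mlflow_server_cmd() {
  echo "mlflow server --backend-store-uri $1 --default-artifact-root $2 --host 0.0.0.0 --port 5000"
}

# Print the command instead of running it, so the sketch works without
# MLflow installed.
mlflow_server_cmd "$BACKEND_STORE_URI" "$ARTIFACT_ROOT"
```

To start the server for real, run the printed command in an environment where MLflow and a MySQL driver are installed and the placeholders have been replaced.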
To restrict public access to the artifact and backend store, a remote host is used as a proxy to interact with the storage services.
Note: The backend store uses MySQL, not PostgreSQL as the diagram suggests.
- Write deploy.sh and remove.sh to easily deploy and remove the service
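As a rough idea of what such wrappers might do, the sketch below maps a deploy or remove action to Terraform commands and prints them as a dry run. The helper names terraform_cmd_for and run_action are hypothetical, and the repository's actual scripts may run additional steps (for example, building and pushing a container image).

```shell
#!/bin/sh
# Hypothetical wrapper sketch; the repository's deploy.sh and remove.sh
# may differ.

# Map an action name to the Terraform command it would run.
terraform_cmd_for() {
  case "$1" in
    deploy) echo "terraform apply -auto-approve" ;;
    remove) echo "terraform destroy -auto-approve" ;;
    *) echo "unknown action: $1" >&2; return 1 ;;
  esac
}

# Dry run: print the commands for an action instead of executing them.
run_action() {
  cmd="$(terraform_cmd_for "$1")" || return 1
  echo "terraform init"
  echo "$cmd"
}

run_action deploy
```

Keeping both actions behind one mapping makes it harder for the deploy and remove paths to drift apart. Note that -auto-approve skips Terraform's interactive confirmation, so it is best reserved for scripted use.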