Streamlining MLflow Deployment on Kubernetes with Local Storage
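Kubeflow and MLflow complement each other: Kubeflow orchestrates machine learning workflows on Kubernetes, while MLflow tracks experiments and manages models. This guide deploys MLflow on Kubernetes, backs it with node-local storage, and exposes it through Kubeflow's central dashboard.
Configuring Persistent Storage
MLflow needs somewhere to keep its data that survives pod restarts. The manifests below define a 10Gi hostPath PersistentVolume at /mnt/data/mlflow and a PersistentVolumeClaim in the mlflow namespace that binds to it through the shared storage class name mlflow-storage. Local storage keeps the setup simple, but the data lives on a single node's disk, so it is only as durable as that node.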
Persistent Storage Configuration
apiVersion: v1
kind: PersistentVolume
metadata:
  # PersistentVolumes are cluster-scoped, so no namespace is set here
  name: mlflow-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  # Keep the data on disk even if the claim is deleted
  persistentVolumeReclaimPolicy: Retain
  storageClassName: mlflow-storage
  hostPath:
    path: /mnt/data/mlflow
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mlflow-pvc
  namespace: mlflow
spec:
  accessModes:
    - ReadWriteOnce
  # Must match the PersistentVolume's storageClassName for the claim to bind
  storageClassName: mlflow-storage
  resources:
    requests:
      storage: 10Gi
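A hostPath volume is not tied to a particular node, so on a multi-node cluster a rescheduled pod could land on a different node and find an empty directory. One possible alternative, sketched here under assumptions, is a local PersistentVolume pinned to a single node with nodeAffinity. The node name worker-node-1 is a hypothetical placeholder, and the directory is assumed to exist on that node already.

```yaml
# Sketch only: a "local" PersistentVolume pinned to one node.
# worker-node-1 is a hypothetical node name; replace it with a real one
# from `kubectl get nodes`.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mlflow-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: mlflow-storage
  local:
    path: /mnt/data/mlflow   # assumed to exist on the node beforehand
  nodeAffinity:              # required for local volumes
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - worker-node-1
```

The PersistentVolumeClaim stays unchanged; the scheduler then places any pod using the claim on the node named in the affinity rule.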
Deploying MLflow Service
With storage in place, we can deploy the MLflow tracking server. The Deployment below runs a single replica of the server, listening on port 5000, and mounts the mlflow-pvc claim at /mnt/mlflow.
MLflow Service Deployment Configuration
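apiVersion: apps/v1
kind: Deployment
metadata:
  name: mlflow-deployment
  namespace: mlflow
  labels:
    app: mlflow
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mlflow
  template:
    metadata:
      labels:
        app: mlflow
    spec:
      # The server does not call the Kubernetes API, so no token is mounted
      automountServiceAccountToken: false
      serviceAccountName: mlflow-sa
      containers:
        - name: mlflow
          image: ghcr.io/mlflow/mlflow:v2.2.1
          imagePullPolicy: Always
          # Assumption: run metadata and artifacts are kept under the mounted
          # volume. Without these flags the server writes to ./mlruns inside
          # the container and the PVC goes unused.
          command:
            - /bin/bash
            - -c
            - >-
              mlflow server --host 0.0.0.0 --port 5000
              --backend-store-uri /mnt/mlflow/mlruns
              --artifacts-destination /mnt/mlflow/artifacts
          ports:
            - containerPort: 5000
          resources:
            requests:
              memory: "512Mi"
              cpu: "100m"
            limits:
              memory: "1Gi"
              cpu: "1"
          volumeMounts:
            - name: mlflow-storage
              mountPath: /mnt/mlflow
      volumes:
        - name: mlflow-storage
          persistentVolumeClaim:
            claimName: mlflow-pvc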
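The Deployment refers to a service account named mlflow-sa, and the VirtualService in the next section routes to a host named mlflow-service, but neither object is defined in this guide. A minimal sketch of both follows, assuming a plain ClusterIP Service on port 5000 and a service account with no extra permissions; adjust it if your cluster already provides them.

```yaml
# ServiceAccount referenced by serviceAccountName in the Deployment.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: mlflow-sa
  namespace: mlflow
---
# ClusterIP Service in front of the MLflow pod. Its name and namespace give
# the DNS name mlflow-service.mlflow.svc.cluster.local.
apiVersion: v1
kind: Service
metadata:
  name: mlflow-service
  namespace: mlflow
  labels:
    app: mlflow
spec:
  type: ClusterIP
  selector:
    app: mlflow          # matches the pod labels in the Deployment
  ports:
    - name: http         # the "http" name lets Istio detect the protocol
      port: 5000
      targetPort: 5000
      protocol: TCP
```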
Integrating with Kubeflow's Central Dashboard
To expose MLflow through Kubeflow, we create an Istio VirtualService that routes requests under /mlflow/ on the Kubeflow gateway to the MLflow service, and then add an MLflow entry to the central dashboard's menu. Users can then open the MLflow UI directly from the Kubeflow interface.
VirtualService Configuration
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: mlflow
  namespace: mlflow
spec:
  gateways:
    - kubeflow/kubeflow-gateway
  hosts:
    - "*"
  http:
    - match:
        - uri:
            prefix: /mlflow/
      rewrite:
        uri: /
      route:
        - destination:
            host: mlflow-service.mlflow.svc.cluster.local
            port:
              number: 5000
Modifying Kubeflow Central Dashboard ConfigMap
Next, edit the centraldashboard-config ConfigMap in the kubeflow namespace and add an MLflow entry to the dashboard's menu items. The link must match the /mlflow/ prefix used in the VirtualService.
kubectl edit cm centraldashboard-config -n kubeflow
# add this under the other menu items
{
  "type": "item",
  "link": "/mlflow/",
  "text": "MLflow",
  "icon": "icons:cached"
}
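For orientation, here is a rough sketch of where that entry might sit inside the ConfigMap. It assumes the menu is stored as a JSON string under the links key, in an array named menuLinks. That layout is an assumption and may differ between Kubeflow releases, so check your own ConfigMap before editing. Existing menu entries are omitted here and should be left in place, with the new item appended after them.

```yaml
# Fragment only, not a complete ConfigMap: do not apply it as-is.
# Assumes menu items are JSON under data.links -> "menuLinks".
data:
  links: |-
    {
      "menuLinks": [
        {
          "type": "item",
          "link": "/mlflow/",
          "text": "MLflow",
          "icon": "icons:cached"
        }
      ]
    }
```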
With these configurations in place, MLflow appears as a tab in Kubeflow's central dashboard, giving users experiment tracking and model management without leaving the Kubeflow interface.