Configuring Persistent Storage
Kubeflow and MLflow complement each other in an MLOps stack: Kubeflow manages machine learning workflows on Kubernetes, while MLflow handles experiment tracking. This guide deploys MLflow on Kubernetes and connects it to Kubeflow, starting with node-local persistent storage so that MLflow's data survives pod restarts.
Persistent Storage Configuration
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mlflow-pv
  # PersistentVolumes are cluster-scoped, so no namespace is set here
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: mlflow-storage
  # hostPath stores the data on a single node's filesystem
  hostPath:
    path: /mnt/data/mlflow
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mlflow-pvc
  namespace: mlflow
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: mlflow-storage
  resources:
    requests:
      storage: 10Gi
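The claim above, like the manifests that follow, sets namespace: mlflow, so that namespace has to exist before anything is applied. The original does not show how it is created; a minimal sketch, assuming nothing beyond the name already used in the manifests:

```yaml
# Assumed prerequisite (not part of the original manifests):
# the namespace referenced by the PVC, Deployment, and VirtualService.
apiVersion: v1
kind: Namespace
metadata:
  name: mlflow
```

Applying this first avoids "namespace not found" errors when the PVC is created.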
Deploying MLflow Service
With storage in place, we deploy the MLflow tracking server. The Deployment below runs a single replica and mounts the PVC defined above at /mnt/mlflow for data storage.
MLflow Service Deployment Configuration
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mlflow-deployment
  namespace: mlflow
  labels:
    app: mlflow
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mlflow
  template:
    metadata:
      labels:
        app: mlflow
    spec:
      automountServiceAccountToken: false
      serviceAccountName: mlflow-sa
      containers:
        - name: mlflow
          image: ghcr.io/mlflow/mlflow:v2.2.1
          imagePullPolicy: Always
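          # Point the backend store and artifact storage at the PVC mount;
          # without these flags the server writes to its working directory.
          command: ["/bin/bash", "-c", "mlflow server --host 0.0.0.0 --port 5000 --backend-store-uri /mnt/mlflow/mlruns --artifacts-destination /mnt/mlflow/artifacts"]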
          ports:
            - containerPort: 5000
          resources:
            requests:
              memory: "512Mi"
              cpu: "100m"
            limits:
              memory: "1Gi"
              cpu: "1"
          volumeMounts:
            - name: mlflow-storage
              mountPath: /mnt/mlflow
      volumes:
        - name: mlflow-storage
          persistentVolumeClaim:
            claimName: mlflow-pvc
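The Deployment references a service account named mlflow-sa, and the VirtualService in the next section routes to mlflow-service.mlflow.svc.cluster.local on port 5000, but neither object is defined in this guide. The following is a minimal sketch of both, assuming a service account with no extra permissions and a ClusterIP Service; only the names, namespace, label, and port are taken from the manifests in this guide, everything else is an assumption.

```yaml
# Assumption: a plain ServiceAccount with no additional RBAC bindings.
# The Deployment refers to it by name.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: mlflow-sa
  namespace: mlflow
---
# Assumption: a ClusterIP Service whose name matches the VirtualService
# destination host mlflow-service.mlflow.svc.cluster.local.
apiVersion: v1
kind: Service
metadata:
  name: mlflow-service
  namespace: mlflow
  labels:
    app: mlflow
spec:
  type: ClusterIP
  selector:
    app: mlflow          # matches the pod labels in the Deployment
  ports:
    - name: http         # Istio reads the protocol from the port name
      port: 5000         # port the VirtualService routes to
      targetPort: 5000   # containerPort of the mlflow container
```

ClusterIP is enough here because traffic is expected to reach MLflow through the Istio gateway rather than directly from outside the cluster.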
Integrating with Kubeflow's Central Dashboard
To expose MLflow through Kubeflow, we configure an Istio VirtualService that routes requests under /mlflow/ to the MLflow service, and we add an MLflow tab to the central dashboard. Users can then open the MLflow UI directly from Kubeflow's interface.
VirtualService Configuration
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: mlflow
  namespace: mlflow
spec:
  gateways:
    - kubeflow/kubeflow-gateway
  hosts:
    - "*"
  http:
    - match:
        - uri:
            prefix: /mlflow/
      rewrite:
        uri: /
      route:
        - destination:
            host: mlflow-service.mlflow.svc.cluster.local
            port:
              number: 5000
Modifying Kubeflow Central Dashboard ConfigMap
The central dashboard reads its menu from the centraldashboard-config ConfigMap in the kubeflow namespace. Edit that ConfigMap and add an entry for the MLflow tab:
kubectl edit cm centraldashboard-config -n kubeflow
# add this under the other menu items
{
  "type": "item",
  "link": "/mlflow/",
  "text": "MLflow",
  "icon": "icons:cached"
}
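For orientation, the excerpt below sketches where the new entry sits inside the ConfigMap. It assumes the layout used by recent Kubeflow manifests, where the menu is a JSON document stored under the links key with a menuLinks array; the Notebooks entry stands in for whatever items already exist, and the other keys in that JSON document are left out. Treat it as an illustration of placement rather than a manifest to apply as-is, since applying it would overwrite the omitted keys.

```yaml
# Excerpt only: layout assumed from recent Kubeflow releases and may differ
# in your version. Other keys inside "links" are omitted here.
apiVersion: v1
kind: ConfigMap
metadata:
  name: centraldashboard-config
  namespace: kubeflow
data:
  links: |-
    {
      "menuLinks": [
        {
          "type": "item",
          "link": "/jupyter/",
          "text": "Notebooks",
          "icon": "book"
        },
        {
          "type": "item",
          "link": "/mlflow/",
          "text": "MLflow",
          "icon": "icons:cached"
        }
      ]
    }
```

The link value matches the /mlflow/ prefix matched by the VirtualService, which is what lets the dashboard tab resolve to the MLflow UI.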
With these configurations in place, MLflow runs on Kubernetes with persistent storage and is reachable from Kubeflow's central dashboard, adding experiment tracking and management to the platform without leaving the Kubeflow interface.
