Using Incremental Learning Job In Mnist

This document describes how to use the inference part of an incremental learning job with Mnist. With an incremental learning job, the application can automatically retrain, evaluate, and update models based on the data generated at the edge.

Mnist Experiment

Prepare Model

Link: https://pan.baidu.com/s/1Gi5BJ_NQzqj66R8N5OXPzA
Extraction code: OSPP

Prepare dataset

Link: https://pan.baidu.com/s/1Gi5BJ_NQzqj66R8N5OXPzA
Extraction code: OSPP

Prepare Image

This example uses the image:

ymh13383894400/mnist-new:v1

This image was built with the script used to create the training, evaluation, and inference workers.

Project creation and running

Create a Mnist project

├─flowunit: # Functional unit (flowunit) directory
│  ├─mnist_preprocess: # Preprocessing functional unit
│  ├─mnist_infer: # TensorFlow inference functional unit
│  └─mnist_response: # Functional unit that constructs the HTTP response
└─graph: # Graph directory
   ├─mnist.toml # Inference graph
   └─test_mnist.py # Inference Python file
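The three functional units above form a preprocess, infer, respond chain. The following is a minimal plain-Python sketch of that data flow, not the project's ModelBox code: it does not use the ModelBox API, and the function names, the `dummy_model` stand-in, and the `predict_result` JSON key are all hypothetical.

```python
import json


def preprocess(pixels):
    """Scale a flattened 28x28 grayscale image (values 0-255) to floats in [0, 1]."""
    if len(pixels) != 28 * 28:
        raise ValueError("expected a flattened 28x28 image")
    return [p / 255.0 for p in pixels]


def infer(features, model):
    """Run a model that maps 784 features to 10 class scores."""
    scores = model(features)
    if len(scores) != 10:
        raise ValueError("expected one score per digit")
    return scores


def build_response(scores):
    """Pick the highest-scoring digit and wrap it in a JSON body."""
    digit = max(range(len(scores)), key=lambda i: scores[i])
    return json.dumps({"predict_result": digit})  # hypothetical key name


def dummy_model(features):
    """Stand-in for the real model: always favors digit 3 (illustration only)."""
    return [1.0 if i == 3 else 0.0 for i in range(10)]


body = build_response(infer(preprocess([0] * 784), dummy_model))
print(body)
```

In the real project each stage is a separate functional unit wired together by mnist.toml; the sketch only shows what each stage is responsible for.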

Create the job

WORKER_NODE="edge-node1"
INFER_NODE="edge-node2"

Create Dataset

kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: Dataset
metadata:
  name: incremental-dataset
spec:
  url: "/data/train_data.txt"
  format: "txt"
  nodeName: $WORKER_NODE
EOF

Create the initial model, which simulates the starting model in an incremental learning scenario.

kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: Model
metadata:
  name: initial-model
spec:
  url: "/models/base_model"
  format: "ckpt"
EOF

Create Deploy Model

kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: Model
metadata:
  name: deploy-model
spec:
  url: "/models/deploy_model/saved_model.pb"
  format: "pb"
EOF

Start The Incremental Learning Job

The inference worker pod runs from the ModelBox image.

IMAGE=ymh13383894400/mnist-new:v1

kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: IncrementalLearningJob
metadata:
  name: Mnist-demo
spec:
  initialModel:
    name: "initial-model"
  dataset:
    name: "incremental-dataset"
    trainProb: 0.8
  trainSpec:
    template:
      spec:
        nodeName: $WORKER_NODE
        containers:
          - image: $IMAGE
            name:  train-worker
            imagePullPolicy: IfNotPresent
            args: ["train.py"]
    trigger:
      checkPeriodSeconds: 60
      timer:
        start: 02:00
        end: 20:00
      condition:
        operator: ">"
        threshold: 500
        metric: num_of_samples
  evalSpec:
    template:
      spec:
        nodeName: $WORKER_NODE
        containers:
          - image: $IMAGE
            name:  eval-worker
            imagePullPolicy: IfNotPresent
            args: ["eval.py"]
  deploySpec:
    model:
      name: "deploy-model"
      hotUpdateEnabled: true
      pollPeriodSeconds: 60
    trigger:
      condition:
        operator: ">"
        threshold: 0.1
        metric: precision_delta
    hardExampleMining:
      name: "IBT"
      parameters:
        - key: "threshold_img"
          value: "0.9"
        - key: "threshold_box"
          value: "0.9"
    template:
      spec:
        nodeName: $INFER_NODE
        containers:
        - image: $IMAGE
          name:  infer-worker
          imagePullPolicy: IfNotPresent
          args: ["test_mnist.py"]
          volumeMounts:
          - name: localvideo
            mountPath: /video/
          - name: hedir
            mountPath: /he_saved_url
          resources:  # user defined resources
            limits:
              memory: 2Gi
        volumes:   # user defined volumes
          - name: localvideo
            hostPath:
              path: /incremental_learning/video/
              type: DirectoryOrCreate
          - name: hedir
            hostPath:
              path:  /incremental_learning/he/
              type: DirectoryOrCreate
  outputDir: "/output"
EOF
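The trigger and split settings in the manifest above can be read as simple predicates. The sketch below is an illustration of that reading, not Sedna's implementation. It assumes that training starts only inside the 02:00 to 20:00 timer window once `num_of_samples` exceeds 500, that a new model is deployed when its precision beats the current one by more than 0.1 (`precision_delta`), and that `trainProb` is the probability a sample lands in the training split. All function names are hypothetical.

```python
import random
from datetime import time


def in_timer_window(now, start=time(2, 0), end=time(20, 0)):
    """True when `now` falls inside the trigger's timer window."""
    return start <= now <= end


def should_train(now, num_of_samples, threshold=500):
    """Train trigger: inside the window and more than `threshold` new samples."""
    return in_timer_window(now) and num_of_samples > threshold


def should_deploy(old_precision, new_precision, threshold=0.1):
    """Deploy trigger: precision gain strictly greater than `threshold`."""
    return (new_precision - old_precision) > threshold


def split_samples(samples, train_prob=0.8, seed=0):
    """Assign each sample to the training split with probability `train_prob`."""
    rng = random.Random(seed)
    train, evaluation = [], []
    for sample in samples:
        (train if rng.random() < train_prob else evaluation).append(sample)
    return train, evaluation


train_set, eval_set = split_samples(list(range(1000)))
print(should_train(time(3, 0), 501), should_deploy(0.80, 0.95))
print(len(train_set), len(eval_set))
```

With `checkPeriodSeconds: 60`, a check of this kind would be evaluated roughly once a minute.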

Check Incremental Learning Job

Query the job status, using the name set in metadata.name when the job was created:

kubectl get incrementallearningjob Mnist-demo
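For scripted checks, the same query can be wrapped in Python. This is a small sketch that assumes `kubectl` is on the PATH and configured for the cluster. The helper names and the `default` namespace are assumptions, and the layout of the returned `status` object is not described here, so the sketch returns it unparsed.

```python
import json
import subprocess


def get_job_command(name, namespace="default"):
    """Build the kubectl command that returns the job as JSON."""
    return ["kubectl", "get", "incrementallearningjob", name,
            "-n", namespace, "-o", "json"]


def fetch_job_status(name, namespace="default"):
    """Run kubectl and return the job's `status` field (empty dict if absent)."""
    result = subprocess.run(get_job_command(name, namespace),
                            check=True, capture_output=True, text=True)
    return json.loads(result.stdout).get("status", {})


# Example (needs a reachable cluster), using the job's metadata.name:
# print(fetch_job_status("Mnist-demo"))
```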
