Tutorial

In this tutorial, we will learn how to deploy RiD-kit with dflow and Kubernetes and run a simple case of alanine dipeptide.

Installation of `dflow` and `rid-kit`

With the power of dflow, users can easily monitor the whole workflow of RiD tasks and dispatch their tasks to various computational resources. Before you use it, you should have dflow installed on your host computer (your PC or a remote server).

It is important to emphasize that the computational nodes and the monitor nodes are separated. With dflow, you can deploy dflow and rid on your PC and run expensive computations on other resources (such as Slurm clusters and cloud platforms) without any further effort.

Detailed instructions for installing dflow are provided on its GitHub page. dflow requires Docker and Kubernetes; their main pages (Docker & Kubernetes) explain how to install them. In addition, the dflow repo provides easy-install shell scripts in dflow/scripts that install Docker, Kubernetes and dflow and set up port-forwarding.

Here, we use the easy-install scripts provided by dflow to install these dependencies. Download the scripts from dflow/scripts and run them as a regular (non-root) user:

Note:

Don’t try to run minikube with root privileges; otherwise the following error may occur:

Exiting due to DRV_AS_ROOT: The "docker" driver should not be used with root privileges.

[1]:

# for users in China, please use `-cn.sh` version to accelerate the installation process.
! chmod 755 install-linux-cn.sh
! ./install-linux-cn.sh

[INFO] Found docker executable at /usr/bin/docker
[INFO] Found minikube binary at /usr/local/bin/minikube
[INFO] Minikube has been started
--2022-08-05 21:05:16--  https://raw.githubusercontent.com/deepmodeling/dflow/master/manifests/quick-start-postgres-stable-cn.yaml
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.110.133, 185.199.108.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... ^C

Instead of using the easy-install scripts, it is recommended to configure your own minikube environment as follows:

[ ]:

# choose the location of the minikube environment
export MINIKUBE_HOME=~/.minikube
# allocate enough memory in case of high parallelism
minikube start --cpus 8 --memory 8192mb --kubernetes-version=1.23.9 --image-mirror-country='cn'
# if you are using a machine with a shared storage system, mount the storage path instead; change the minio host path accordingly
minikube start --cpus 8 --memory 8192mb --kubernetes-version=1.23.9 --mount --mount-string="/mnt/vepfs/jiahao/data:/data2" --image-mirror-country='cn'

Next, configure the argo service by running:

[ ]:

! kubectl create ns argo
! kubectl apply -n argo -f https://raw.githubusercontent.com/deepmodeling/dflow/master/manifests/quick-start-postgres.yaml

Now you should have installed Docker and minikube properly. Run the following command to check their status. For minikube, you should wait until all services are running. This may take a couple of minutes.

[1]:

! minikube status

minikube
type: Control Plane
host: Running
kubelet: Running
apiserver: Running
kubeconfig: Configured

Installation of RiD-kit

Now we install rid-kit on the host machine. To meet the minimum requirements, the following third-party Python packages should be installed:

tensorflow-cpu or gpu
mdtraj
numpy
scikit-learn

These are also listed in rid-kit/requirements.txt. Then change to the rid-kit repo directory and run:

[9]:

# the rid-kit repo path
! cd .. && pip install .

Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Processing /mnt/vepfs/yanze/dflow_project/rid-kit
  Preparing metadata (setup.py) ... done
Requirement already satisfied: numpy in /home/yanze/miniconda3/lib/python3.9/site-packages (from rid==1.1.dev195+gb743cd0) (1.19.5)
Requirement already satisfied: scikit-learn in /home/yanze/miniconda3/lib/python3.9/site-packages (from rid==1.1.dev195+gb743cd0) (1.1.1)
Requirement already satisfied: pydflow in /home/yanze/miniconda3/lib/python3.9/site-packages (from rid==1.1.dev195+gb743cd0) (1.2.1)
Requirement already satisfied: tensorflow in /home/yanze/miniconda3/lib/python3.9/site-packages (from rid==1.1.dev195+gb743cd0) (2.4.1)
Requirement already satisfied: google in /home/yanze/miniconda3/lib/python3.9/site-packages/google-3.0.0-py3.9.egg (from rid==1.1.dev195+gb743cd0) (3.0.0)
Requirement already satisfied: mdtraj in /home/yanze/miniconda3/lib/python3.9/site-packages (from rid==1.1.dev195+gb743cd0) (1.9.7)
Requirement already satisfied: beautifulsoup4 in /home/yanze/miniconda3/lib/python3.9/site-packages (from google->rid==1.1.dev195+gb743cd0) (4.11.1)
Requirement already satisfied: astunparse in /home/yanze/miniconda3/lib/python3.9/site-packages (from mdtraj->rid==1.1.dev195+gb743cd0) (1.6.3)
Requirement already satisfied: scipy in /home/yanze/miniconda3/lib/python3.9/site-packages (from mdtraj->rid==1.1.dev195+gb743cd0) (1.8.1)
Requirement already satisfied: pyparsing in /home/yanze/miniconda3/lib/python3.9/site-packages (from mdtraj->rid==1.1.dev195+gb743cd0) (3.0.9)
Requirement already satisfied: pyyaml in /home/yanze/miniconda3/lib/python3.9/site-packages (from pydflow->rid==1.1.dev195+gb743cd0) (6.0)
Requirement already satisfied: python-dateutil in /home/yanze/miniconda3/lib/python3.9/site-packages (from pydflow->rid==1.1.dev195+gb743cd0) (2.8.2)
Requirement already satisfied: certifi in /home/yanze/miniconda3/lib/python3.9/site-packages (from pydflow->rid==1.1.dev195+gb743cd0) (2022.6.15)
Requirement already satisfied: argo-workflows==5.0.0 in /home/yanze/miniconda3/lib/python3.9/site-packages (from pydflow->rid==1.1.dev195+gb743cd0) (5.0.0)
Requirement already satisfied: six in /home/yanze/miniconda3/lib/python3.9/site-packages (from pydflow->rid==1.1.dev195+gb743cd0) (1.15.0)
Requirement already satisfied: typeguard in /home/yanze/miniconda3/lib/python3.9/site-packages (from pydflow->rid==1.1.dev195+gb743cd0) (2.13.3)
Requirement already satisfied: urllib3 in /home/yanze/miniconda3/lib/python3.9/site-packages (from pydflow->rid==1.1.dev195+gb743cd0) (1.26.7)
Requirement already satisfied: kubernetes in /home/yanze/miniconda3/lib/python3.9/site-packages (from pydflow->rid==1.1.dev195+gb743cd0) (24.2.0)
Requirement already satisfied: cloudpickle in /home/yanze/miniconda3/lib/python3.9/site-packages (from pydflow->rid==1.1.dev195+gb743cd0) (2.1.0)
Requirement already satisfied: jsonpickle in /home/yanze/miniconda3/lib/python3.9/site-packages (from pydflow->rid==1.1.dev195+gb743cd0) (2.2.0)
Requirement already satisfied: minio in /home/yanze/miniconda3/lib/python3.9/site-packages (from pydflow->rid==1.1.dev195+gb743cd0) (7.1.9)
Requirement already satisfied: threadpoolctl>=2.0.0 in /home/yanze/miniconda3/lib/python3.9/site-packages (from scikit-learn->rid==1.1.dev195+gb743cd0) (3.1.0)
Requirement already satisfied: joblib>=1.0.0 in /home/yanze/miniconda3/lib/python3.9/site-packages (from scikit-learn->rid==1.1.dev195+gb743cd0) (1.1.0)
Requirement already satisfied: termcolor~=1.1.0 in /home/yanze/miniconda3/lib/python3.9/site-packages (from tensorflow->rid==1.1.dev195+gb743cd0) (1.1.0)
Requirement already satisfied: h5py~=2.10.0 in /home/yanze/miniconda3/lib/python3.9/site-packages (from tensorflow->rid==1.1.dev195+gb743cd0) (2.10.0)
Requirement already satisfied: grpcio~=1.32.0 in /home/yanze/miniconda3/lib/python3.9/site-packages (from tensorflow->rid==1.1.dev195+gb743cd0) (1.32.0)
Requirement already satisfied: protobuf>=3.9.2 in /home/yanze/miniconda3/lib/python3.9/site-packages (from tensorflow->rid==1.1.dev195+gb743cd0) (3.19.4)
Requirement already satisfied: tensorflow-estimator<2.5.0,>=2.4.0 in /home/yanze/miniconda3/lib/python3.9/site-packages (from tensorflow->rid==1.1.dev195+gb743cd0) (2.4.0)
Requirement already satisfied: opt-einsum~=3.3.0 in /home/yanze/miniconda3/lib/python3.9/site-packages (from tensorflow->rid==1.1.dev195+gb743cd0) (3.3.0)
Requirement already satisfied: wrapt~=1.12.1 in /home/yanze/miniconda3/lib/python3.9/site-packages (from tensorflow->rid==1.1.dev195+gb743cd0) (1.12.1)
Requirement already satisfied: typing-extensions~=3.7.4 in /home/yanze/miniconda3/lib/python3.9/site-packages (from tensorflow->rid==1.1.dev195+gb743cd0) (3.7.4.3)
Requirement already satisfied: wheel~=0.35 in /home/yanze/miniconda3/lib/python3.9/site-packages (from tensorflow->rid==1.1.dev195+gb743cd0) (0.37.1)
Requirement already satisfied: keras-preprocessing~=1.1.2 in /home/yanze/miniconda3/lib/python3.9/site-packages (from tensorflow->rid==1.1.dev195+gb743cd0) (1.1.2)
Requirement already satisfied: absl-py~=0.10 in /home/yanze/miniconda3/lib/python3.9/site-packages (from tensorflow->rid==1.1.dev195+gb743cd0) (0.15.0)
Requirement already satisfied: gast==0.3.3 in /home/yanze/miniconda3/lib/python3.9/site-packages (from tensorflow->rid==1.1.dev195+gb743cd0) (0.3.3)
Requirement already satisfied: tensorboard~=2.4 in /home/yanze/miniconda3/lib/python3.9/site-packages (from tensorflow->rid==1.1.dev195+gb743cd0) (2.6.0)
Requirement already satisfied: google-pasta~=0.2 in /home/yanze/miniconda3/lib/python3.9/site-packages (from tensorflow->rid==1.1.dev195+gb743cd0) (0.2.0)
Requirement already satisfied: flatbuffers~=1.12.0 in /home/yanze/miniconda3/lib/python3.9/site-packages (from tensorflow->rid==1.1.dev195+gb743cd0) (1.12)
Requirement already satisfied: markdown>=2.6.8 in /home/yanze/miniconda3/lib/python3.9/site-packages (from tensorboard~=2.4->tensorflow->rid==1.1.dev195+gb743cd0) (3.3.7)
Requirement already satisfied: werkzeug>=0.11.15 in /home/yanze/miniconda3/lib/python3.9/site-packages (from tensorboard~=2.4->tensorflow->rid==1.1.dev195+gb743cd0) (2.1.2)
Requirement already satisfied: tensorboard-plugin-wit>=1.6.0 in /home/yanze/miniconda3/lib/python3.9/site-packages (from tensorboard~=2.4->tensorflow->rid==1.1.dev195+gb743cd0) (1.8.1)
Requirement already satisfied: requests<3,>=2.21.0 in /home/yanze/miniconda3/lib/python3.9/site-packages (from tensorboard~=2.4->tensorflow->rid==1.1.dev195+gb743cd0) (2.27.1)
Requirement already satisfied: tensorboard-data-server<0.7.0,>=0.6.0 in /home/yanze/miniconda3/lib/python3.9/site-packages (from tensorboard~=2.4->tensorflow->rid==1.1.dev195+gb743cd0) (0.6.0)
Requirement already satisfied: google-auth<2,>=1.6.3 in /home/yanze/miniconda3/lib/python3.9/site-packages (from tensorboard~=2.4->tensorflow->rid==1.1.dev195+gb743cd0) (1.35.0)
Requirement already satisfied: google-auth-oauthlib<0.5,>=0.4.1 in /home/yanze/miniconda3/lib/python3.9/site-packages (from tensorboard~=2.4->tensorflow->rid==1.1.dev195+gb743cd0) (0.4.6)
Requirement already satisfied: setuptools>=41.0.0 in /home/yanze/miniconda3/lib/python3.9/site-packages (from tensorboard~=2.4->tensorflow->rid==1.1.dev195+gb743cd0) (61.2.0)
Requirement already satisfied: soupsieve>1.2 in /home/yanze/miniconda3/lib/python3.9/site-packages (from beautifulsoup4->google->rid==1.1.dev195+gb743cd0) (2.3.1)
Requirement already satisfied: requests-oauthlib in /home/yanze/miniconda3/lib/python3.9/site-packages (from kubernetes->pydflow->rid==1.1.dev195+gb743cd0) (1.3.1)
Requirement already satisfied: websocket-client!=0.40.0,!=0.41.*,!=0.42.*,>=0.32.0 in /home/yanze/miniconda3/lib/python3.9/site-packages (from kubernetes->pydflow->rid==1.1.dev195+gb743cd0) (1.3.3)
Requirement already satisfied: cachetools<5.0,>=2.0.0 in /home/yanze/miniconda3/lib/python3.9/site-packages (from google-auth<2,>=1.6.3->tensorboard~=2.4->tensorflow->rid==1.1.dev195+gb743cd0) (4.2.4)
Requirement already satisfied: rsa<5,>=3.1.4 in /home/yanze/miniconda3/lib/python3.9/site-packages (from google-auth<2,>=1.6.3->tensorboard~=2.4->tensorflow->rid==1.1.dev195+gb743cd0) (4.8)
Requirement already satisfied: pyasn1-modules>=0.2.1 in /home/yanze/miniconda3/lib/python3.9/site-packages (from google-auth<2,>=1.6.3->tensorboard~=2.4->tensorflow->rid==1.1.dev195+gb743cd0) (0.2.8)
Requirement already satisfied: importlib-metadata>=4.4 in /home/yanze/miniconda3/lib/python3.9/site-packages (from markdown>=2.6.8->tensorboard~=2.4->tensorflow->rid==1.1.dev195+gb743cd0) (4.11.4)
Requirement already satisfied: idna<4,>=2.5 in /home/yanze/miniconda3/lib/python3.9/site-packages (from requests<3,>=2.21.0->tensorboard~=2.4->tensorflow->rid==1.1.dev195+gb743cd0) (3.3)
Requirement already satisfied: charset-normalizer~=2.0.0 in /home/yanze/miniconda3/lib/python3.9/site-packages (from requests<3,>=2.21.0->tensorboard~=2.4->tensorflow->rid==1.1.dev195+gb743cd0) (2.0.4)
Requirement already satisfied: oauthlib>=3.0.0 in /home/yanze/miniconda3/lib/python3.9/site-packages (from requests-oauthlib->kubernetes->pydflow->rid==1.1.dev195+gb743cd0) (3.2.0)
Requirement already satisfied: zipp>=0.5 in /home/yanze/miniconda3/lib/python3.9/site-packages (from importlib-metadata>=4.4->markdown>=2.6.8->tensorboard~=2.4->tensorflow->rid==1.1.dev195+gb743cd0) (3.8.0)
Requirement already satisfied: pyasn1<0.5.0,>=0.4.6 in /home/yanze/miniconda3/lib/python3.9/site-packages (from pyasn1-modules>=0.2.1->google-auth<2,>=1.6.3->tensorboard~=2.4->tensorflow->rid==1.1.dev195+gb743cd0) (0.4.8)
Building wheels for collected packages: rid
  Building wheel for rid (setup.py) ... done
  Created wheel for rid: filename=rid-1.1.dev195+gb743cd0-py3-none-any.whl size=77609 sha256=9f325b928e7c8f74cab0bda043a026949c5a427e0a7db1832043b42667f31707
  Stored in directory: /home/yanze/.cache/pip/wheels/ec/59/93/934a1323bb160c606bf1000cac32eae183a198305cb06cb1b0
Successfully built rid
Installing collected packages: rid
  Attempting uninstall: rid
    Found existing installation: rid 1.1.dev195+gb743cd0
    Uninstalling rid-1.1.dev195+gb743cd0:
      Successfully uninstalled rid-1.1.dev195+gb743cd0
Successfully installed rid-1.1.dev195+gb743cd0

Configuration of Computational Environment

In the RiD workflow, dflow helps send computation tasks to resources with the proper environment configured.

There are four main modules and several workflow steps in the RiD procedure, and each module or step needs a different environment:

Exploration/Sampling: Gromacs, PLUMED2 modified by DeepFE.cpp, TensorFlow C++ interface. (prefer GPU)
Selection: TensorFlow Python interface.
Labeling: Gromacs, PLUMED2. (prefer GPU)
Training: TensorFlow Python interface. (prefer GPU)
Workflow steps: Python.

dflow supports different resources including Slurm clusters, K8S local machines and Cloud Server.

For Slurm, configure the computational environments on your Slurm cluster following the instructions in Environment settings on the front page of the RiD documentation. With dflow, rid-kit sends tasks to Slurm nodes from the host machine remotely, without manually logging in to the cluster.
For local resources, just use the Docker images we have built. No further manual configuration is needed. We also provide the Dockerfiles of our images to enable flexible modification.
For Cloud Server, like Lebesgue, use public images; no further manual configuration is needed.

Prepare machine configuration JSON.

rid-kit uses a JSON file to manage resources. In machine.json, define your own resources and dispatch tasks to them.

Generally, we would like to run low-cost tasks on CPU nodes or locally and submit high-cost tasks to Slurm or cloud resources. If submitting to a Slurm environment, machine.json may look like:

{
    "resources": {
        "local_k8s": {
            "template_config" : {
                "image": "pkufjhdocker/rid-tf-cpu:latest",
                "image_pull_policy" : "IfNotPresent"
            }
        },
        "remote_slurm": {
            "executor": {
                "type": "slurm",
                "host": "",
                "port": 22,
                "username": "",
                "password": "",
                "header": [
                    "#!/bin/bash",
                    "#SBATCH --partition GPU_2080Ti",
                    "#SBATCH -N 1",
                    "#SBATCH --ntasks-per-node 8",
                    "#SBATCH -t 120:0:0",
                    "#SBATCH --gres=gpu:1",
                    "conda init bash",
                    "source ~/.bashrc",
                    "source /path/to/rid-kit.env"
                ]
            }
        }
    },

    "tasks": {
        "prep_exploration_config": "local_k8s",
        "run_exploration_config": "remote_slurm",
        "prep_label_config": "local_k8s",
        "run_label_config": "remote_slurm",
        "prep_select_config": "local_k8s",
        "run_select_config": "local_k8s",
        "prep_data_config": "local_k8s",
        "run_train_config": "remote_slurm",
        "workflow_steps_config": "local_k8s"
    }
}
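
A machine.json like the one above can be sanity-checked before submission. The sketch below is only an illustration and not a rid-kit feature: check_machine_config is a hypothetical helper, and EXPECTED_TASKS is simply copied from the example, on the assumption that the example lists every task name. The resource name slurm_gpu in the sample is a deliberate mistake used to show the check.

```python
import json

# Task names copied from the example above (assumed to be the complete list).
EXPECTED_TASKS = {
    "prep_exploration_config", "run_exploration_config",
    "prep_label_config", "run_label_config",
    "prep_select_config", "run_select_config",
    "prep_data_config", "run_train_config",
    "workflow_steps_config",
}


def check_machine_config(config):
    """Return a list of human-readable problems found in a machine.json dict."""
    problems = []
    resources = config.get("resources", {})
    tasks = config.get("tasks", {})
    for name in sorted(EXPECTED_TASKS - set(tasks)):
        problems.append(f"missing task: {name}")
    for name in sorted(set(tasks) - EXPECTED_TASKS):
        problems.append(f"unknown task: {name}")
    for task, resource in sorted(tasks.items()):
        if resource not in resources:
            problems.append(f"task {task} uses undefined resource {resource}")
    return problems


# A shortened config in which run_train_config points at an undefined resource.
example = json.loads("""
{
    "resources": {"local_k8s": {}, "remote_slurm": {}},
    "tasks": {
        "prep_exploration_config": "local_k8s",
        "run_exploration_config": "remote_slurm",
        "prep_label_config": "local_k8s",
        "run_label_config": "remote_slurm",
        "prep_select_config": "local_k8s",
        "run_select_config": "local_k8s",
        "prep_data_config": "local_k8s",
        "run_train_config": "slurm_gpu",
        "workflow_steps_config": "local_k8s"
    }
}
""")

for problem in check_machine_config(example):
    print(problem)
```

For a real file, load it with json.load and pass the resulting dictionary to the same function; an empty list means that every task points at a defined resource.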

Under the key resources, you define your own resource types. The resource names and the number of resources are up to you.
Under the key tasks, you assign the resources you have defined to the tasks of RiD. Do not change the task names in tasks, as they are fixed in the code.
If you submit jobs to a Slurm environment, you will have to build the computational environment on the Slurm server yourself. We therefore recommend submitting jobs to a cloud environment such as Bohrium; in this case machine.json may look like:

{
    "resources": {
        "local_k8s": {
            "template_config" : {
                "image": "pkufjhdocker/rid-tf-cpu:latest",
                "image_pull_policy" : "IfNotPresent"
            }
        },
        "bohrium1": {
            "executor":{
            "machine_dict":{
                "batch_type": "DpCloudServer",
                "context_type": "DpCloudServerContext",
                "local_root" : "./",
                "remote_profile":{
                    "email": "",
                    "password": "",
                    "program_id": "",
                    "input_data":{
                        "api_version":2,
                        "job_type": "container",
                        "log_file": "tmp_log",
                        "grouped":true,
                        "job_name": "test_rid",
                        "disk_size": 100,
                        "_instance_group_id": 4,
                        "scass_type":"c8_m32_1 * NVIDIA V100",
                        "platform": "ali",
                        "image_name":"pkufjhdocker/rid-gmx-exploration:latest",
                        "on_demand":0
                        }
                }
            },
            "resources_dict":{
                "number_node": 1,
                "cpu_per_node": 8,
                "gpu_per_node": 1,
                "queue_name": "GPU",
                "group_size": 1
            }
        }
        },
        "bohrium2": {
            "executor":{
            "machine_dict":{
                "batch_type": "DpCloudServer",
                "context_type": "DpCloudServerContext",
                "local_root" : "./",
                "remote_profile":{
                    "email": "",
                    "password": "",
                    "program_id": "",
                    "input_data":{
                        "api_version":2,
                        "job_type": "container",
                        "log_file": "tmp_log",
                        "grouped":true,
                        "job_name": "test_rid",
                        "disk_size": 100,
                        "_instance_group_id": 4,
                        "scass_type":"c8_m32_1 * NVIDIA V100",
                        "platform": "ali",
                        "image_name":"pkufjhdocker/rid-gmx-plumed:latest",
                        "on_demand":0
                        }
                }
            },
            "resources_dict":{
                "number_node": 1,
                "cpu_per_node": 8,
                "gpu_per_node": 1,
                "queue_name": "GPU",
                "group_size": 1
            }
        }
        },
        "bohrium3": {
            "executor":{
            "machine_dict":{
                "batch_type": "DpCloudServer",
                "context_type": "DpCloudServerContext",
                "local_root" : "./",
                "remote_profile":{
                    "email": "",
                    "password": "",
                    "program_id": "",
                    "input_data":{
                        "api_version":2,
                        "job_type": "container",
                        "log_file": "tmp_log",
                        "grouped":true,
                        "job_name": "test_rid",
                        "disk_size": 100,
                        "_instance_group_id": 4,
                        "scass_type":"c8_m32_1 * NVIDIA V100",
                        "platform": "ali",
                        "image_name":"pkufjhdocker/rid-tf-cpu:latest",
                        "on_demand":0
                        }
                }
            },
            "resources_dict":{
                "number_node": 1,
                "cpu_per_node": 8,
                "gpu_per_node": 1,
                "queue_name": "GPU",
                "group_size": 1
            }
        }
        }
    },

    "tasks": {
        "prep_exploration_config": "local_k8s",
        "run_exploration_config": "bohrium1",
        "prep_label_config": "local_k8s",
        "run_label_config": "bohrium2",
        "prep_select_config": "local_k8s",
        "run_select_config": "bohrium3",
        "prep_data_config": "local_k8s",
        "run_train_config": "bohrium3",
        "workflow_steps_config": "local_k8s"
    }
}
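
The three Bohrium entries above differ only in image_name, so the file can also be generated rather than written by hand. This is a hedged sketch: bohrium_resource and build_machine_config are hypothetical helpers, all field values are copied from the example, and the empty email, password and program_id still have to be replaced with your own account details.

```python
import json


def bohrium_resource(image_name, email="", password="", program_id=""):
    """Build one Bohrium resource entry; only image_name varies in the example."""
    return {
        "executor": {
            "machine_dict": {
                "batch_type": "DpCloudServer",
                "context_type": "DpCloudServerContext",
                "local_root": "./",
                "remote_profile": {
                    "email": email,
                    "password": password,
                    "program_id": program_id,
                    "input_data": {
                        "api_version": 2,
                        "job_type": "container",
                        "log_file": "tmp_log",
                        "grouped": True,
                        "job_name": "test_rid",
                        "disk_size": 100,
                        "_instance_group_id": 4,
                        "scass_type": "c8_m32_1 * NVIDIA V100",
                        "platform": "ali",
                        "image_name": image_name,
                        "on_demand": 0,
                    },
                },
            },
            "resources_dict": {
                "number_node": 1,
                "cpu_per_node": 8,
                "gpu_per_node": 1,
                "queue_name": "GPU",
                "group_size": 1,
            },
        }
    }


def build_machine_config():
    """Assemble the same resources and task mapping as the example above."""
    images = {
        "bohrium1": "pkufjhdocker/rid-gmx-exploration:latest",
        "bohrium2": "pkufjhdocker/rid-gmx-plumed:latest",
        "bohrium3": "pkufjhdocker/rid-tf-cpu:latest",
    }
    resources = {
        "local_k8s": {
            "template_config": {
                "image": "pkufjhdocker/rid-tf-cpu:latest",
                "image_pull_policy": "IfNotPresent",
            }
        }
    }
    for name, image in images.items():
        resources[name] = bohrium_resource(image)
    tasks = {
        "prep_exploration_config": "local_k8s",
        "run_exploration_config": "bohrium1",
        "prep_label_config": "local_k8s",
        "run_label_config": "bohrium2",
        "prep_select_config": "local_k8s",
        "run_select_config": "bohrium3",
        "prep_data_config": "local_k8s",
        "run_train_config": "bohrium3",
        "workflow_steps_config": "local_k8s",
    }
    return {"resources": resources, "tasks": tasks}


config = build_machine_config()
machine_json = json.dumps(config, indent=4)  # text to save as machine.json
```

Writing machine_json to a file named machine.json gives a configuration equivalent to the hand-written one, and changing an image or an instance type then only has to be done in one place.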

Get Started!

We assume you have learned the basics of reinforced dynamics, which we won’t describe again here.

Users can monitor workflows from a browser UI. To enable that, you should forward the ports of argo and minio. This can be achieved with rid port-forward.

[14]:

! rid port-forward

2022-08-06 19:34:09 | INFO | rid.entrypoint.server | Port "agro-server" has been launched and running.
2022-08-06 19:34:09 | INFO | rid.entrypoint.server | Port "minio-server" has been launched and running.
2022-08-06 19:34:09 | INFO | rid.entrypoint.server | Port "minio-ui" has been launched and running.

In this case, we explore the phase space of alanine dipeptide. Prepare your initial conformation files in .gro format, a topology file in .top format and the configuration file rid.json. For convenience, we have prepared one at rid/template/rid.json. Remember to also provide your own forcefield files. Collect all these files into a directory and pass its path to rid-kit with the flag -i.

A minimal case has been prepared in rid-kit/tests/data/000. Then run rid submit:

[13]:

! rid submit -i ../tests/data/000 -c ../rid/template/rid.json -m /mnt/vepfs/yanze/dflow_project/test_dflow/template/machine.json

2022-08-06 19:29:53 | INFO | rid.entrypoint.main | Preparing RiD ...
Workflow has been submitted (ID: reinforced-dynamics-cx4rd)
2022-08-06 19:30:17 | INFO | rid.entrypoint.main | The task is displayed on "https://127.0.0.1:2746".
2022-08-06 19:30:17 | INFO | rid.entrypoint.main | Artifacts (Files) are listed on "https://127.0.0.1:9001".

The INFO messages indicate that this task has been submitted successfully. Record this workflow ID as we may use it later.
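
If you submit from a script, the workflow ID can be picked out of the printed text instead of being copied by hand. The snippet below is a small sketch under the assumption that the message keeps the exact wording shown above; extract_workflow_id is a hypothetical helper, not a rid-kit function.

```python
import re


def extract_workflow_id(output):
    """Return the workflow ID from `rid submit` output, or None if absent."""
    match = re.search(r"Workflow has been submitted \(ID: ([\w-]+)\)", output)
    return match.group(1) if match else None


# Text copied from the rid submit output above.
submit_output = """2022-08-06 19:29:53 | INFO | rid.entrypoint.main | Preparing RiD ...
Workflow has been submitted (ID: reinforced-dynamics-cx4rd)
"""

print(extract_workflow_id(submit_output))
```

The returned string is the ID that rid rm and rid resubmit expect as their workflow argument.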

Visit the URLs given in the last two lines of the rid submit output; all workflows and their corresponding files are listed in the UI.

The command line is also supported. Run rid ls to list your workflows and their status.

[15]:

! rid ls

2022-08-06 19:35:45 | INFO | rid.entrypoint.cli |

        Reinforced Dynamics Workflow

NAME                        STATUS    AGE   DURATION   PRIORITY
reinforced-dynamics-cx4rd   Running   5m    5m         0
reinforced-dynamics-pi65n   Running   8h    8h         0
reinforced-dynamics-yk3wl   Failed    8h    2m         0
reinforced-dynamics-bsc7j   Failed    9h    31m        0
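
When many workflows are listed, it can be handy to filter the table by status. The sketch below parses the text printed by rid ls and collects the failed workflows; parse_workflow_table is a hypothetical helper, and it assumes the whitespace-separated column layout shown above.

```python
def parse_workflow_table(text):
    """Turn the table printed by `rid ls` into a list of row dictionaries."""
    rows = []
    header = None
    for line in text.splitlines():
        fields = line.split()
        if fields[:2] == ["NAME", "STATUS"]:
            header = [field.lower() for field in fields]
            continue
        # Rows are only collected after the header line has been seen.
        if header and len(fields) == len(header):
            rows.append(dict(zip(header, fields)))
    return rows


# Text copied from the rid ls output above.
ls_output = """2022-08-06 19:35:45 | INFO | rid.entrypoint.cli |

        Reinforced Dynamics Workflow

NAME                        STATUS    AGE   DURATION   PRIORITY
reinforced-dynamics-cx4rd   Running   5m    5m         0
reinforced-dynamics-pi65n   Running   8h    8h         0
reinforced-dynamics-yk3wl   Failed    8h    2m         0
reinforced-dynamics-bsc7j   Failed    9h    31m        0
"""

failed = [row["name"] for row in parse_workflow_table(ls_output)
          if row["status"] == "Failed"]
print(failed)
```

For the listing above, failed contains the two workflows whose status is Failed, which are the candidates for removal or resubmission.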

rid-kit is based on dflow, argo and minikube, so more complex and flexible management of workflows can be achieved through their command lines, such as kubectl get pods -n argo and argo show.

For failed tasks, you may want to remove them or resubmit them from the failed steps.

To remove a workflow:

[ ]:

# rid rm task-ID
! rid rm reinforced-dynamics-bsc7j

To resubmit a workflow so that you can modify it and continue:

[ ]:

! rid resubmit -i your_dir -c path_to_rid.json -m path_to_machine.json Workflow-ID