This guide shows how to use dstack on Hot Aisle for orchestrating AMD GPU workloads, using AMD MI300X examples throughout. It focuses on automation: provisioning infrastructure, running containers, and managing development, inference, and training runs with versioned YAML.
You will set up a Hot Aisle backend, create fleets for on-demand and reserved capacity, and run dev-environment, service, and task configurations using one CLI workflow.
Why GPU orchestration?
Without orchestration, teams usually provision VMs manually, run containers with ad-hoc commands, and keep critical run parameters in shell history. That increases setup time, makes cost control harder, and creates drift across users and environments.
dstack addresses this by making both infrastructure and workloads declarative. You define provisioning policies in fleets, describe workloads as run configurations, and apply them through one workflow. This is especially useful for GPU workloads where automatic provisioning, management of idle instances, and predictable container startup directly improve cost control and iteration speed.
Setup
On-demand VMs
Install the CLI first:
uv tool install "dstack[all]"
Then configure the Hot Aisle backend in ~/.dstack/server/config.yml:
projects:
- name: main
  backends:
  - type: hotaisle
    team_handle: ${HOTAISLE_TEAM_HANDLE}
    creds:
      type: api_key
      api_key: ${HOTAISLE_API_KEY}
After updating server config, start or restart dstack server so backend changes are picked up.
Now check available MI300X offers:
dstack offer --gpu MI300X
Typical output (truncated):
# BACKEND REGION INSTANCE RESOURCES SPOT PRICE
1 hotaisle us-east-1 <mi300x-offer-1> 32xCPU, 256GB, 1xAMD MI300X (192GB) no $X.XX
2 hotaisle us-east-1 <mi300x-offer-2> 64xCPU, 512GB, 2xAMD MI300X (192GB) no $Y.YY
...
Create a backend fleet with a node range:
type: fleet
name: mi300x-on-demand
# Keep at most 4 VMs; provision on demand
nodes: 0..4
idle_duration: 30m
resources:
  # from 1 to 8 MI300X GPUs per VM
  gpu: MI300X:1..8
Apply it:
dstack apply -f fleet.dstack.yml
With nodes: 0..4, dstack creates a fleet template and provisions new VMs only when runs require capacity. When instances stay idle longer than idle_duration, they are terminated automatically.
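Recent dstack releases also let a run configuration target a fleet by name, which keeps runs from landing on unrelated capacity. A minimal sketch, assuming the `fleets` run property is available in your dstack version (the run name here is a placeholder):

```yaml
type: dev-environment
name: mi300x-smoke
image: rocm/dev-ubuntu-24.04:latest
ide: vscode
# Restrict provisioning to the fleet defined above
fleets: [mi300x-on-demand]
resources:
  gpu: MI300X:1
```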
Reserved clusters
If you reserve a cluster from Hot Aisle with fast interconnect, use an SSH fleet. This is useful for distributed training, large-model inference that exceeds one node, or prefill/decode disaggregation.
type: fleet
name: mi300x-reserved
nodes: 2
placement: cluster
ssh_config:
  user: ubuntu
  identity_file: ~/.ssh/id_ed25519
  hosts:
  - 203.0.113.11
  - 203.0.113.12
Apply:
dstack apply -f fleet-mi300x-reserved.dstack.yml
For AMD hosts, ensure AMDGPU-DKMS and ROCm userspace components are installed before attaching hosts to dstack.
IDEs & notebooks
For AMD runs, use a custom Docker image. The default dstack image is CUDA-oriented, so AMD workflows should specify an image explicitly.
Use a dev-environment to launch a remote IDE session on MI300X:
type: dev-environment
name: mi300x-dev
image: rocm/dev-ubuntu-24.04:latest
ide: cursor
resources:
  gpu: MI300X:1
Launch:
dstack apply -f dev.dstack.yml
dstack apply attaches automatically. Typical output includes an IDE link:
Launching `mi300x-dev`...
---> 100%
To open in Cursor Desktop, use this link:
cursor://vscode-remote/ssh-remote+mi300x-dev/workspace
Model inference
Use a service run when you need a stable model endpoint. The example below follows the same MI300X + ROCm pattern used in opencode-vllm-hotaisle.md.
type: service
name: qwen3-coder-mi300x
image: rocm/vllm:latest
env:
- HF_TOKEN
commands:
- |
  vllm serve Qwen/Qwen3-Coder-30B-A3B-Instruct \
    --port 8000 \
    --block-size 256 \
    --max-model-len 131072 \
    --enable-auto-tool-choice \
    --tool-call-parser qwen3_xml
port: 8000
model: Qwen/Qwen3-Coder-30B-A3B-Instruct
resources:
  gpu: MI300X:1
Apply:
HF_TOKEN=... dstack apply -f qwen.dstack.yml
dstack tracks service health and keeps replicas in the desired state. To inspect status:
dstack ps -v
If service auth is enabled (default), call the model endpoint with a dstack token:
curl -sS -X POST "http://localhost:3000/proxy/services/main/qwen3-coder-mi300x/v1/chat/completions" \
-H "Authorization: Bearer ${DSTACK_TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"model": "Qwen/Qwen3-Coder-30B-A3B-Instruct",
"messages": [{"role": "user", "content": "Write a one-line health check."}],
"max_tokens": 64
}'
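The same call can be made from Python. A minimal sketch that builds the request for the standard-library `urllib` client; the proxy URL and placeholder token mirror the curl example above:

```python
import json
import os

def build_chat_request(base_url, model, prompt, token, max_tokens=64):
    """Build URL, headers, and body for an OpenAI-compatible chat completion."""
    url = f"{base_url}/v1/chat/completions"
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }).encode()
    return url, headers, body

# Same endpoint as the curl call above; the token comes from the environment.
url, headers, body = build_chat_request(
    "http://localhost:3000/proxy/services/main/qwen3-coder-mi300x",
    "Qwen/Qwen3-Coder-30B-A3B-Instruct",
    "Write a one-line health check.",
    os.environ.get("DSTACK_TOKEN", "<token>"),
)
```

Once the service is up, send it with `urllib.request.urlopen(urllib.request.Request(url, data=body, headers=headers))`.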
For a more advanced way to expose endpoints (custom domain, HTTPS, and routing policies), configure a dstack gateway and publish the service through it. In that setup, the model endpoint is typically https://<service-name>.<gateway-domain>/v1 (for this example: https://qwen3-coder-mi300x.<gateway-domain>/v1).
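A gateway is itself a dstack configuration applied the same way as fleets and runs. A minimal sketch, where the backend, region, domain, and name are placeholders for your own setup (the gateway must run in a backend that supports gateways):

```yaml
type: gateway
name: example-gateway
backend: aws
region: us-east-1
# Services become reachable at https://<service-name>.<this-domain>
domain: example.com
```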
Training & web-apps
Use task runs for finite training jobs and internal web apps. For distributed training, run tasks on a fleet configured with placement: cluster and set nodes in the task.
A single-node MI300X training task:
type: task
name: mi300x-sft
image: rocm/pytorch:rocm7.2_ubuntu24.04_py3.12_pytorch_release_2.7.1
env:
- HF_TOKEN
commands:
- python -m pip install -U transformers datasets accelerate trl peft
- python train_sft.py --model Qwen/Qwen2.5-7B-Instruct --epochs 3
resources:
  gpu: MI300X:1
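Here train_sft.py is your own script; it just needs to accept the flags passed in the task's commands. A minimal, hypothetical entrypoint sketch, with the actual training body (e.g. a TRL-based fine-tuning loop) elided:

```python
import argparse

def parse_args(argv=None):
    """Parse the flags used in the task's `commands` section."""
    p = argparse.ArgumentParser(description="Minimal SFT entrypoint sketch")
    p.add_argument("--model", required=True, help="Hugging Face model ID")
    p.add_argument("--epochs", type=int, default=1, help="Number of training epochs")
    return p.parse_args(argv)

# Placeholder for the real training body (load model and dataset, run
# supervised fine-tuning); only the argument surface is sketched here.
args = parse_args(["--model", "Qwen/Qwen2.5-7B-Instruct", "--epochs", "3"])
print(f"fine-tuning {args.model} for {args.epochs} epochs")
```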
For failed runs:
dstack ps -v
dstack logs -d mi300x-sft
Tasks can expose ports; while attached, dstack forwards remote ports to localhost.
type: task
name: openclaw-mi300x
python: "3.12"
env:
- OPENCLAW_TOKEN
- MODEL_BASE_URL
- DSTACK_MODEL_TOKEN
commands:
- curl -fsSL https://openclaw.ai/install.sh | bash -s -- --no-onboard
- openclaw config set gateway.mode local
- openclaw config set gateway.auth.mode token
- openclaw config set gateway.auth.token "$OPENCLAW_TOKEN"
- openclaw config set gateway.trustedProxies '["127.0.0.1"]'
- |
  openclaw config set models.providers.mi300x '{
    "baseUrl":"'"$MODEL_BASE_URL"'",
    "apiKey":"'"$DSTACK_MODEL_TOKEN"'",
    "api":"openai-completions",
    "models":[
      {
        "id":"Qwen/Qwen3-Coder-30B-A3B-Instruct",
        "name":"Qwen3-Coder-30B-A3B-Instruct",
        "reasoning":true,
        "input":["text"],
        "cost":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0},
        "contextWindow":131072,
        "maxTokens":8192
      }
    ]
  }' --json
- openclaw gateway
ports:
- 18789
The OpenClaw task above starts a local gateway on localhost:18789 and points it to your model service endpoint.
Use these variables when applying:
- MODEL_BASE_URL: the gateway service /v1 endpoint (for example, https://qwen3-coder-mi300x.<gateway-domain>/v1)
- DSTACK_MODEL_TOKEN: your dstack user token, used by built-in service authentication
dstack apply \
-e OPENCLAW_TOKEN=... \
-e MODEL_BASE_URL=https://qwen3-coder-mi300x.<gateway-domain>/v1 \
-e DSTACK_MODEL_TOKEN=... \
-f task-openclaw.dstack.yml
Summary
dstack automates GPU provisioning and workload orchestration, and it simplifies container-based execution for day-to-day development, inference, and training.
Two operational behaviors matter most in day-to-day use:
- Fleet idle control: with idle_duration, dstack can automatically terminate idle VMs, which helps avoid wasting GPU resources.
- Team operation: projects, fleets, and runs are shared across members, so one team can use a consistent control plane and auth model.
Core commands to keep in regular use: dstack apply, dstack fleet, dstack ps, dstack logs, and dstack stop.
Useful references:
- dstack repository: https://github.com/dstackai/dstack
- dstack docs: https://dstack.ai
- dstack skill (for agent-assisted YAML/CLI workflows): https://skills.sh/dstackai/dstack/dstack
About the Author
This is a guest post from a friend of Hot Aisle. This content is not sponsored or paid.
Andrey Cheptsov is the founder and a core maintainer of dstack, focused on open-source tooling for AI infrastructure orchestration across cloud and on-prem GPU environments.
Connect with Andrey:
- LinkedIn: linkedin.com/in/andrey-cheptsov
- X (Twitter): @andrey_cheptsov
- GitHub: github.com/peterschmidt85
Contribute to Hot Aisle
The Hot Aisle website is open source under the MIT License and welcomes contributions from the community. Whether you want to fix a typo, improve documentation, or share your own technical content, we'd love to have your input.
Visit our GitHub repository: github.com/hotaisle/hotaisle-website
