0x00 Background

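With the rise of Chinese-made large models, a wave of local DeepSeek deployments has taken off. I figured I would put the machines I have on hand to use and let their compute shine again.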

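At first I used the wildly popular Ollama: one ollama pull, and running a 14B model worked reasonably well. (At the time of writing, my ITX box has an E5-2666 v3, an RTX 3060 Ti 8G, and 32 GB of RAM.) Later I stumbled on EXO, which can chain together the compute of heterogeneous multi-GPU, multi-device setups: it splits a large model and runs the pieces on separate GPUs, or offloads them to the CPU. In plain terms, it pools, schedules, and allocates the compute of all the machines we have. ~~I wanted to try it too~~, and after a round of tinkering I reached this conclusion: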

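This is how a helpful person on Discord answered my question: for now it does not support running DeepSeek on anything other than Mac M-series chips, and it does not support multiple GPUs in a single machine either. Heartbroken, I gave up on it.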

0x01 Choosing a tool

Next up is our protagonist, GPUStack. As relatively mature distributed deployment solutions go, it is the one to pick right now.

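Deployment is simple, a single command. The author thoughtfully has the installer set up a virtual environment (venv) and use pipx for the installation. Docker deployment is also supported, but it requires installing the NVIDIA Container Toolkit first, which is a bit of a hassle, so I don't recommend it.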

0x02 Deployment

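Before deploying, I did not know the installer would first create a virtual environment with venv, so I created one with conda and switched into it (this does not matter).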

Let's first look at the official installation commands:

Mac & Linux

# Run server.
curl -sfL https://get.gpustack.ai | sh -s -

# Run server with non-default port 
curl -sfL https://get.gpustack.ai | sh -s - --port 8080

# Run server without the embedded worker.
curl -sfL https://get.gpustack.ai | sh -s - --disable-worker

# Run server with TLS.
curl -sfL https://get.gpustack.ai | sh -s - --ssl-keyfile /path/to/keyfile --ssl-certfile /path/to/certfile

# Run server with external postgresql database.
curl -sfL https://get.gpustack.ai | sh -s - --database-url "postgresql://username:password@host:port/database_name"

# Run worker with specified IP.
curl -sfL https://get.gpustack.ai | sh -s - --server-url http://myserver --token mytoken --worker-ip 192.168.1.100

# Install with a custom index URL.
curl -sfL https://get.gpustack.ai | INSTALL_INDEX_URL=https://pypi.tuna.tsinghua.edu.cn/simple sh -s -

# Install a custom wheel package other than releases from pypi.org.
curl -sfL https://get.gpustack.ai | INSTALL_PACKAGE_SPEC=https://repo.mycompany.com/my-gpustack.whl sh -s -

# Install a specific version with extra audio dependencies.
curl -sfL https://get.gpustack.ai | INSTALL_PACKAGE_SPEC=gpustack[audio]==0.4.0 sh -s -

Windows

# Run server.
Invoke-Expression (Invoke-WebRequest -Uri "https://get.gpustack.ai" -UseBasicParsing).Content

# Run server without the embedded worker.
Invoke-Expression "& { $((Invoke-WebRequest -Uri 'https://get.gpustack.ai' -UseBasicParsing).Content) } -- --disable-worker"

# Run server with TLS.
Invoke-Expression "& { $((Invoke-WebRequest -Uri 'https://get.gpustack.ai' -UseBasicParsing).Content) } -- --ssl-keyfile 'C:\path\to\keyfile' --ssl-certfile 'C:\path\to\certfile'"


# Run server with external postgresql database.
Invoke-Expression "& { $((Invoke-WebRequest -Uri 'https://get.gpustack.ai' -UseBasicParsing).Content) } -- --database-url 'postgresql://username:password@host:port/database_name'"

# Run worker with specified IP.
Invoke-Expression "& { $((Invoke-WebRequest -Uri 'https://get.gpustack.ai' -UseBasicParsing).Content) } -- --server-url 'http://myserver' --token 'mytoken' --worker-ip '192.168.1.100'"

# Run worker with custom reserved resources.
Invoke-Expression "& { $((Invoke-WebRequest -Uri 'https://get.gpustack.ai' -UseBasicParsing).Content) } -- --server-url 'http://myserver' --token 'mytoken' --system-reserved '{""ram"":5, ""vram"":5}'"

# Install with a custom index URL.
$env:INSTALL_INDEX_URL = "https://pypi.tuna.tsinghua.edu.cn/simple"
Invoke-Expression (Invoke-WebRequest -Uri "https://get.gpustack.ai" -UseBasicParsing).Content

# Install a custom wheel package other than releases from pypi.org.
$env:INSTALL_PACKAGE_SPEC = "https://repo.mycompany.com/my-gpustack.whl"
Invoke-Expression (Invoke-WebRequest -Uri "https://get.gpustack.ai" -UseBasicParsing).Content

# Install a specific version with extra audio dependencies.
$env:INSTALL_PACKAGE_SPEC = "gpustack[audio]==0.4.0"
Invoke-Expression (Invoke-WebRequest -Uri "https://get.gpustack.ai" -UseBasicParsing).Content

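My setup:

I use Windows as the server and a Mac as the worker, so my combination looks like this.

First, on Windows, open PowerShell with administrator privileges, install the server, and view the token: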

# Use a Python package mirror in China to speed up dependency downloads
$env:INSTALL_INDEX_URL = "https://pypi.tuna.tsinghua.edu.cn/simple"
# Install command with Windows as the server
Invoke-Expression (Invoke-WebRequest -Uri "https://get.gpustack.ai" -UseBasicParsing).Content

# View the token
Get-Content -Path "$env:APPDATA\gpustack\token" -Raw

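Once you see this, the installation has succeeded!

Then open a terminal on the Mac and install the worker:

# myserver is the Windows machine's IP; mytoken is the initial credential generated by the server
curl -sfL https://get.gpustack.ai | sh -s - --server-url http://myserver --token mytoken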
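
The worker command is easy to mistype when the placeholders are filled in by hand. As a minimal sketch, and not part of GPUStack itself, the hypothetical helper below assembles the same worker command from a server URL and a token, shell-quoting both values. The name build_worker_command and the sample address and token are assumptions for illustration only.

```python
import shlex

INSTALL_URL = "https://get.gpustack.ai"  # installer URL from the official commands


def build_worker_command(server_url: str, token: str) -> str:
    """Return the shell command that joins this machine to a server as a worker.

    Hypothetical helper for illustration: it only formats the documented
    command as a string and does not execute anything.
    """
    token = token.strip()  # a token copied from a file may carry a trailing newline
    if not token:
        raise ValueError("token must not be empty")
    return (
        f"curl -sfL {INSTALL_URL} | sh -s - "
        f"--server-url {shlex.quote(server_url)} --token {shlex.quote(token)}"
    )


if __name__ == "__main__":
    # Placeholder values: substitute the Windows machine's address and the real token.
    print(build_worker_command("http://192.168.1.100", "mytoken"))
```

Paste the printed line into the Mac terminal. Quoting matters mainly when a value contains characters the shell would otherwise interpret.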

After that, the worker shows up in the GPUStack web UI on the server.

Then you can download and deploy models from the model catalog.

0x03 Bonus tip

Make sure the firewall lets it through!

Important things are worth saying three times!!!
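
To tell whether the firewall is what keeps a worker from joining, a quick TCP reachability check run from the worker machine can help. The sketch below is only a generic socket test, not a GPUStack tool. It assumes the server listens on port 80, since the worker command above uses http://myserver with no explicit port; change the port if the server was started with --port. The address 192.168.1.100 is a placeholder.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    server_host = "192.168.1.100"  # placeholder: the Windows server's IP
    server_port = 80               # assumed port; adjust if --port was used
    if can_connect(server_host, server_port):
        print("Server port is reachable from this machine.")
    else:
        print("Cannot reach the server port; check the firewall on the server.")
```

A failed check does not prove the firewall is at fault (the server may simply not be running), but a successful one rules the firewall out.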

Distributed Deployment of Large Models with GPUStack

https://iori-yimaga.site/archives/wei-ming-ming-wen-zhang

Author: Administrator

Published: 2025-02-16

Updated: 2025-02-16
