# How to deploy the Llama3 large language model and Node-RED in Docker

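# Before you begin

# Requirements

  • Docker and Docker Compose are installed.
  • The Llama3 model file has been downloaded and sits in the project root directory.
  • Docker images for Llama3 and Node-RED are available locally, or will be pulled from the official registry.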

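# Steps

First, the project root directory contains the following files:

project/
│
├── docker-compose.yml   # Docker Compose configuration file
├── llamasvr.py          # Startup script for the Llama3 service
└── Llama-3-ELYZA-JP-8B-q4_k_m.gguf  # Llama3 model file

Next, write the docker-compose.yml file. The following Docker Compose file deploys Llama3 and Node-RED together: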
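
Before starting the containers, it can help to confirm that this layout is actually in place. The following is a minimal sketch, assuming the three file names shown in the tree above; the names `REQUIRED_FILES` and `missing_files` are hypothetical helpers introduced only for illustration.

```python
from pathlib import Path

# File names taken from the project layout above.
REQUIRED_FILES = [
    "docker-compose.yml",
    "llamasvr.py",
    "Llama-3-ELYZA-JP-8B-q4_k_m.gguf",
]


def missing_files(root: str = ".") -> list[str]:
    """Return the required files that are not present directly under root."""
    base = Path(root)
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All required files are present.")
```

Run it from the project root; an empty result means the directory matches the layout described here.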

version: '3'
services:
  llama3:
    image: llama3i  # The Llama3 image you are using
    container_name: llama3_container
    ports:
      - "8000:8000"  # Port of the Llama3 service
    volumes:
      - .:/app  # Mount the current directory at /app inside the container
    networks:
      - mynetwork
    command: python /app/llamasvr.py  # Start the Llama3 service
  nodered:
    image: nodered/node-red  # Official Node-RED image
    container_name: nodered_container
    ports:
      - "1880:1880"  # Default Node-RED port
    networks:
      - mynetwork
networks:
  mynetwork:
    driver: bridge  # Shared network so the two containers can reach each other

The Llama3 server startup script ( llamasvr.py ):

from http.server import BaseHTTPRequestHandler, HTTPServer
import json

from llama_cpp import Llama

# Load the Llama model
llm = Llama(model_path="/app/Llama-3-ELYZA-JP-8B-q4_k_m.gguf", n_ctx=4098)


class LlamaServer(BaseHTTPRequestHandler):
    def do_POST(self):
        content_length = int(self.headers['Content-Length'])
        post_data = self.rfile.read(content_length)
        data = json.loads(post_data)
        # Parse the input text
        input_text = data.get('input_text', '')
        # Call the Llama model to generate a reply. Calling the Llama object
        # runs a text completion and returns a dict; llm.generate() expects
        # token IDs and yields tokens, so its result cannot be sent as JSON.
        # max_tokens=256 is an assumed limit; adjust it as needed.
        completion = llm(input_text, max_tokens=256)
        result = completion["choices"][0]["text"]
        # Return the generated result
        self.send_response(200)
        self.send_header('Content-Type', 'application/json')
        self.end_headers()
        self.wfile.write(json.dumps({"generated_text": result}).encode('utf-8'))


def run():
    # Listen on all interfaces so other containers can reach the service
    server_address = ('', 8000)
    httpd = HTTPServer(server_address, LlamaServer)
    print('Running Llama server...')
    httpd.serve_forever()


if __name__ == '__main__':
    run()
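
To exercise this service without Node-RED, a small client can send the same kind of request from the host. The sketch below assumes the service is published on http://localhost:8000 as in the Compose file and that the reply carries a `generated_text` field as in the handler above; `ask_llama` and its generous default timeout are illustrative choices, not part of the original project.

```python
import json
import urllib.request


def ask_llama(input_text: str, url: str = "http://localhost:8000",
              timeout: float = 300.0) -> str:
    """POST input_text as JSON and return the generated_text field of the reply."""
    body = json.dumps({"input_text": input_text}).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Generation on CPU can be slow, hence the long (assumed) timeout.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        reply = json.loads(response.read().decode("utf-8"))
    return reply["generated_text"]
```

Once the containers are running, `print(ask_llama("Hello Llama3"))` should print the model's completion.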

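# Build and start the containers

Use the following command to start the containers. The Compose file above has no build section, so --build has nothing to build here and the llama3i image must already exist locally:

# Build and start the services
docker-compose up --build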
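
Loading the model can take a while, so the Llama3 service may not answer immediately after the containers start. The following sketch polls the service until it returns any HTTP response. It assumes the service is reachable at a URL such as http://localhost:8000; `wait_for_http` is a hypothetical helper. Because the server only implements POST, a plain GET is expected to come back with an error status, which still shows that the server is up.

```python
import time
import urllib.error
import urllib.request


def wait_for_http(url: str, timeout: float = 120.0, interval: float = 2.0) -> bool:
    """Poll url until the server sends any HTTP response; False once timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            urllib.request.urlopen(url, timeout=interval).close()
            return True
        except urllib.error.HTTPError:
            # The server answered, even if with an error status (e.g. for GET).
            return True
        except (urllib.error.URLError, OSError):
            # Connection refused, reset, or timed out: not ready yet.
            time.sleep(interval)
    return False
```

For example, `wait_for_http("http://localhost:8000")` can be called before sending the first real request.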

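# Configure the flow in Node-RED

  1. Open Node-RED (by default at http://localhost:1880 ).

  2. Add the following nodes:

    • inject node: triggers the request.
    • function node: sets the JSON body of the request, for example {"input_text": "Hello Llama3"}
    • http request node: set the URL to http://llama3_container:8000 and the request method to POST
    • debug node: shows the returned result.

    The complete flow, which can be imported into Node-RED (the function node code is the func field of the set_payload node):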

    [
        {
            "id": "f6f2187d.f17ca8",
            "type": "tab",
            "label": "Flow 1",
            "disabled": false,
            "info": ""
        },
        {
            "id": "inject_trigger",
            "type": "inject",
            "z": "f6f2187d.f17ca8",
            "name": "Trigger Request",
            "props": [
                {
                    "p": "payload"
                }
            ],
            "repeat": "",
            "crontab": "",
            "once": false,
            "onceDelay": 0.1,
            "topic": "",
            "payload": "",
            "payloadType": "date",
            "x": 160,
            "y": 140,
            "wires": [
                [
                    "set_payload"
                ]
            ]
        },
        {
            "id": "set_payload",
            "type": "function",
            "z": "f6f2187d.f17ca8",
            "name": "Set Payload",
            "func": "msg.headers = {\n    \"Content-Type\": \"application/json\"\n};\n\nmsg.payload = {\n    \"input_text\": \"方世昊\"\n};\n\nreturn msg;",
            "outputs": 1,
            "timeout": "",
            "noerr": 0,
            "initialize": "",
            "finalize": "",
            "libs": [],
            "x": 340,
            "y": 140,
            "wires": [
                [
                    "791174083fe1a4c6"
                ]
            ]
        },
        {
            "id": "791174083fe1a4c6",
            "type": "http request",
            "z": "f6f2187d.f17ca8",
            "name": "POST Request",
            "method": "POST",
            "ret": "obj",
            "paytoqs": "ignore",
            "url": "http://llama3_container:8000",
            "tls": "",
            "persist": false,
            "proxy": "",
            "insecureHTTPParser": false,
            "authType": "",
            "senderr": false,
            "headers": [],
            "x": 540,
            "y": 140,
            "wires": [
                [
                    "debug_output"
                ]
            ]
        },
        {
            "id": "debug_output",
            "type": "debug",
            "z": "f6f2187d.f17ca8",
            "name": "Response",
            "active": true,
            "tosidebar": true,
            "console": false,
            "tostatus": false,
            "complete": "payload",
            "targetType": "msg",
            "statusVal": "",
            "statusType": "auto",
            "x": 720,
            "y": 140,
            "wires": []
        }
    ]
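
When editing an exported flow like this by hand, it is easy to leave a wire pointing at a node id that no longer exists. The sketch below is one way to check for that before importing. It assumes the flow is a JSON array of node objects with `id` and optional `wires` fields, as in the export above; `dangling_wires` is a hypothetical helper.

```python
import json


def dangling_wires(flow_json: str) -> list[tuple[str, str]]:
    """Return (source_id, target_id) pairs whose target is not a node in the flow."""
    nodes = json.loads(flow_json)
    known_ids = {node["id"] for node in nodes}
    problems = []
    for node in nodes:
        # "wires" holds one list of target ids per output port.
        for output in node.get("wires", []):
            for target in output:
                if target not in known_ids:
                    problems.append((node["id"], target))
    return problems
```

An empty list means every wire in the flow leads to a node that is defined in the same export.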

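The two containers can now communicate. Put the text you want to ask into the input_text value in the function node, trigger the inject node, and wait for the model's reply to appear in the debug sidebar.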