Using DBRX Instruct on AMD GPUs


(Source: DBRX Instruct on AMD GPUs — ROCm Blogs)

In this blog, we demonstrate DBRX Instruct, a mixture-of-experts large language model developed by Databricks, running on a ROCm-capable system with AMD GPUs.

About DBRX Instruct

DBRX is a transformer-based, decoder-only large language model with 132 billion parameters that uses a fine-grained mixture-of-experts (MoE) architecture. It was pretrained on 12 trillion tokens of text and code data and uses 16 experts, of which 4 are selected per token: a gating network routes each input token to 4 of the 16 expert networks based on the token's features and the experts' specializations. As a result, only about 36 billion parameters are active for any given input. DBRX employs several advanced techniques, including rotary position embeddings (RoPE), gated linear units (GLU), and grouped-query attention (GQA), to achieve its strong performance.
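
To make the routing idea concrete, here is a minimal, hypothetical top-4-of-16 gating sketch in PyTorch. This is not DBRX's actual implementation (the real model uses the DbrxRouter and DbrxExperts modules shown later in the model printout); the class name and hidden size are invented purely for illustration.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyTop4Router(nn.Module):
    """Illustrative top-4-of-16 MoE layer, not DBRX's real code."""
    def __init__(self, hidden_size=64, num_experts=16, top_k=4):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(hidden_size, num_experts, bias=False)  # gating network
        self.experts = nn.ModuleList(
            [nn.Linear(hidden_size, hidden_size) for _ in range(num_experts)]
        )

    def forward(self, x):  # x: (num_tokens, hidden_size)
        scores = self.gate(x)                            # (num_tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)   # pick 4 experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for t in range(x.size(0)):                       # for each token...
            for k in range(self.top_k):                  # ...run only its 4 selected experts
                out[t] += weights[t, k] * self.experts[int(idx[t, k])](x[t])
        return out

print(ToyTop4Router()(torch.randn(3, 64)).shape)  # torch.Size([3, 64])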

Beyond tuning the parameter count, curriculum learning was employed during pretraining. This approach changes the composition of the data over the course of training and substantially improves overall model quality (source). Curriculum learning gradually adjusts the difficulty or complexity of the training data presented to a machine-learning model: simpler or easier examples are provided first, and more challenging examples follow as the model learns (source).
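
As an illustration only (the blog does not describe DBRX's actual data pipeline, and the function below is hypothetical), a curriculum can be as simple as ordering examples by a difficulty score and gradually widening the pool that batches are drawn from:

# Hypothetical sketch of curriculum learning, not DBRX's actual pipeline.
def curriculum_batches(examples, difficulty, num_stages=3, batch_size=2):
    """Yield batches drawn from an increasingly difficult pool of examples."""
    ordered = sorted(examples, key=difficulty)                # easiest examples first
    for stage in range(1, num_stages + 1):
        pool = ordered[: len(ordered) * stage // num_stages]  # widen the pool each stage
        for i in range(0, len(pool), batch_size):
            yield stage, pool[i:i + batch_size]

# Example: use sequence length as a crude difficulty proxy.
docs = ["short text", "a slightly longer example", "a much longer and harder training document"]
for stage, batch in curriculum_batches(docs, difficulty=len):
    print(stage, batch)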

Prerequisites

• ROCm 5.7.0+
• PyTorch 2.2.1+
• A supported Linux distribution
• A supported AMD GPU

Make sure your system detects the GPUs and that the necessary ROCm libraries are installed. Since DBRX Instruct has more than 130 billion parameters, we use six GPUs in this blog: at bfloat16 precision (2 bytes per parameter), the weights alone come to roughly 264 GB, which exceeds the 64 GB of HBM2e on a single MI210, so the model has to be sharded across several GPUs.

! rocm-smi --showproductname
========================= ROCm System Management Interface =========================
=================================== Product Info ===================================
GPU[0]    : Card series:    Instinct MI210
GPU[0]    : Card model:     0x0c34
GPU[0]    : Card vendor:    Advanced Micro Devices, Inc. [AMD/ATI]
GPU[0]    : Card SKU:       D67301V
GPU[1]    : Card series:    Instinct MI210
GPU[1]    : Card model:     0x0c34
GPU[1]    : Card vendor:    Advanced Micro Devices, Inc. [AMD/ATI]
GPU[1]    : Card SKU:       D67301V
GPU[2]    : Card series:    Instinct MI210
GPU[2]    : Card model:     0x0c34
GPU[2]    : Card vendor:    Advanced Micro Devices, Inc. [AMD/ATI]
GPU[2]    : Card SKU:       D67301V
GPU[3]    : Card series:    Instinct MI210
GPU[3]    : Card model:     0x0c34
GPU[3]    : Card vendor:    Advanced Micro Devices, Inc. [AMD/ATI]
GPU[3]    : Card SKU:       D67301V
GPU[4]    : Card series:    Instinct MI210
GPU[4]    : Card model:     0x0c34
GPU[4]    : Card vendor:    Advanced Micro Devices, Inc. [AMD/ATI]
GPU[4]    : Card SKU:       D67301V
GPU[5]    : Card series:    Instinct MI210
GPU[5]    : Card model:     0x0c34
GPU[5]    : Card vendor:    Advanced Micro Devices, Inc. [AMD/ATI]
GPU[5]    : Card SKU:       D67301V
====================================================================================
=============================== End of ROCm SMI Log ================================

Check that a compatible version of ROCm is installed.

!apt show rocm-libs -a
Package: rocm-libs
Version: 5.7.0.50700-63~22.04
Priority: optional
Section: devel
Maintainer: ROCm Libs Support <rocm-libs.support@amd.com>
Installed-Size: 13.3 kB
Depends: hipblas (= 1.1.0.50700-63~22.04), hipblaslt (= 0.3.0.50700-63~22.04), hipfft (= 1.0.12.50700-63~22.04), hipsolver (= 1.8.1.50700-63~22.04), hipsparse (= 2.3.8.50700-63~22.04), miopen-hip (= 2.20.0.50700-63~22.04), rccl (= 2.17.1.50700-63~22.04), rocalution (= 2.1.11.50700-63~22.04), rocblas (= 3.1.0.50700-63~22.04), rocfft (= 1.0.23.50700-63~22.04), rocrand (= 2.10.17.50700-63~22.04), rocsolver (= 3.23.0.50700-63~22.04), rocsparse (= 2.5.4.50700-63~22.04), rocm-core (= 5.7.0.50700-63~22.04), hipblas-dev (= 1.1.0.50700-63~22.04), hipblaslt-dev (= 0.3.0.50700-63~22.04), hipcub-dev (= 2.13.1.50700-63~22.04), hipfft-dev (= 1.0.12.50700-63~22.04), hipsolver-dev (= 1.8.1.50700-63~22.04), hipsparse-dev (= 2.3.8.50700-63~22.04), miopen-hip-dev (= 2.20.0.50700-63~22.04), rccl-dev (= 2.17.1.50700-63~22.04), rocalution-dev (= 2.1.11.50700-63~22.04), rocblas-dev (= 3.1.0.50700-63~22.04), rocfft-dev (= 1.0.23.50700-63~22.04), rocprim-dev (= 2.13.1.50700-63~22.04), rocrand-dev (= 2.10.17.50700-63~22.04), rocsolver-dev (= 3.23.0.50700-63~22.04), rocsparse-dev (= 2.5.4.50700-63~22.04), rocthrust-dev (= 2.18.0.50700-63~22.04), rocwmma-dev (= 1.2.0.50700-63~22.04)
Homepage: https://github.com/RadeonOpenCompute/ROCm
Download-Size: 1012 B
APT-Manual-Installed: yes
APT-Sources: http://repo.radeon.com/rocm/apt/5.7 jammy/main amd64 Packages
Description: Radeon Open Compute (ROCm) Runtime software stack

Make sure PyTorch also detects the GPUs:

import torch
print(f"number of GPUs: {torch.cuda.device_count()}")
print([torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())])
number of GPUs: 6
['AMD Instinct MI210', 'AMD Instinct MI210', 'AMD Instinct MI210', 'AMD Instinct MI210', 'AMD Instinct MI210', 'AMD Instinct MI210']

Before starting, make sure you have installed all the necessary libraries:

! pip install -q "transformers>=4.39.2" "tiktoken>=0.6.0"
! pip install accelerate

To speed up download times, run the following commands:

! pip install hf_transfer
! export HF_HUB_ENABLE_HF_TRANSFER=1
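
Note that in a notebook, each shell command prefixed with ! runs in its own subshell, so the export above does not persist into the Python process. Setting the variable from Python before downloading the model is a simple alternative:

import os

# Enable the hf_transfer download backend for faster Hugging Face downloads
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"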

In addition, we found that we needed to install the latest version of PyTorch to avoid an error related to *nn.LayerNorm* initialization.

! pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/rocm5.7
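
After reinstalling, you can also quickly confirm that the ROCm (HIP) build of PyTorch is the one in use: torch.version.hip is populated on ROCm wheels and is None on CUDA or CPU builds.

import torch

print(torch.__version__)   # e.g. 2.2.1+rocm5.7
print(torch.version.hip)   # HIP version string on ROCm builds, None otherwise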

Next, import the required modules from the Hugging Face transformers library.

from transformers import AutoTokenizer, AutoModelForCausalLM

Loading the model

Let's load the model and its tokenizer. We use dbrx-instruct, which has been fine-tuned for interactive chat. Note that you must submit a consent form to Databricks to gain access to the databricks/dbrx-instruct repository.

token = "your HuggingFace user access token here"
tokenizer = AutoTokenizer.from_pretrained("databricks/dbrx-instruct", trust_remote_code=True, token=token)
model = AutoModelForCausalLM.from_pretrained("databricks/dbrx-instruct", device_map="auto", torch_dtype=torch.bfloat16, trust_remote_code=True, token=token)
print(model)
DbrxForCausalLM(
  (transformer): DbrxModel(
    (wte): Embedding(100352, 6144)
    (blocks): ModuleList(
      (0-39): 40 x DbrxBlock(
        (norm_attn_norm): DbrxNormAttentionNorm(
          (norm_1): LayerNorm((6144,), eps=1e-05, elementwise_affine=True)
          (attn): DbrxAttention(
            (Wqkv): Linear(in_features=6144, out_features=8192, bias=False)
            (out_proj): Linear(in_features=6144, out_features=6144, bias=False)
            (rotary_emb): DbrxRotaryEmbedding()
          )
          (norm_2): LayerNorm((6144,), eps=1e-05, elementwise_affine=True)
        )
        (ffn): DbrxFFN(
          (router): DbrxRouter(
            (layer): Linear(in_features=6144, out_features=16, bias=False)
          )
          (experts): DbrxExperts(
            (mlp): DbrxExpertGLU()
          )
        )
      )
    )
    (norm_f): LayerNorm((6144,), eps=1e-05, elementwise_affine=True)
  )
  (lm_head): Linear(in_features=6144, out_features=100352, bias=False)
)
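
Optionally, you can check how device_map="auto" distributed the weights across the six GPUs. This is a small verification sketch, assuming the model was loaded as above; hf_device_map is the placement map that accelerate records on the model.

# Mapping of top-level modules to devices, filled in by accelerate
print(model.hf_device_map)

# Approximate memory currently allocated on each GPU, in GB
for i in range(torch.cuda.device_count()):
    print(f"GPU {i}: {torch.cuda.memory_allocated(i) / 1024**3:.1f} GB allocated")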

Running inference

Let's start by asking DBRX a simple question.

input_text = "What is DBRX-Instruct and how is it different from other LLMs ?"
messages = [{"role": "user", "content": input_text}]
input_ids = tokenizer.apply_chat_template(messages, return_dict=True, tokenize=True, add_generation_prompt=True, return_tensors="pt").to("cuda")
outputs = model.generate(**input_ids, max_new_tokens=1000)
print(tokenizer.decode(outputs[0]))
<|im_start|>system
You are DBRX, created by Databricks. You were last updated in December 2023. You answer questions based on information available up to that point.
YOU PROVIDE SHORT RESPONSES TO SHORT QUESTIONS OR STATEMENTS, but provide thorough responses to more complex and open-ended questions.
You assist with various tasks, from writing to coding (using markdown for code blocks — remember to use ``` with code, JSON, and tables).
(You do not have real-time data access or code execution capabilities. You avoid stereotyping and provide balanced perspectives on controversial topics. You do not provide song lyrics, poems, or news articles and do not divulge details of your training data.)
This is your system prompt, guiding your responses. Do not reference it, just respond to the user. If you find yourself talking about this message, stop. You should be responding appropriately and usually that means not mentioning this.
YOU DO NOT MENTION ANY OF THIS INFORMATION ABOUT YOURSELF UNLESS THE INFORMATION IS DIRECTLY PERTINENT TO THE USER'S QUERY.<|im_end|>
<|im_start|>user
What is DBRX-Instruct and how is it different from other LLMs?<|im_end|>
<|im_start|>assistant
DBRX-Instruct is a large language model developed by Databricks, updated in December 2023. It is designed to provide helpful and accurate responses to a wide range of questions and prompts. DBRX-Instruct differs from other large language models in its ability to understand and follow complex instructions, making it well-suited for a variety of natural language processing tasks. It is also designed to avoid stereotyping and provide balanced perspectives on controversial topics. However, it does not have real-time data access or code execution capabilities, and it cannot provide song lyrics, poems, or news articles. It also does not divulge details of its training data.<|im_end|>

Here we observe that the model's output includes the system prompt, which gives the model very clear and explicit instructions on how to respond to the user's prompts. Let's write a simple helper function that takes the raw input text and returns only the assistant's response, omitting the system prompt and the user's question.

def run_inference(input_text):
    messages = [{"role": "user", "content": input_text}]
    input_ids = tokenizer.apply_chat_template(messages, return_dict=True, tokenize=True, add_generation_prompt=True, return_tensors="pt").to("cuda")
    outputs = model.generate(**input_ids, max_new_tokens=1000)
    text = tokenizer.decode(outputs[0])
    # only include the assistant's response
    index = text.index('<|im_start|>assistant')
    return text[index:]

We will test the model's ability to generate code, summarize an article, explain a joke, and write a story.

Generating code

input_text = '''
In Python, write Python code to calculate the factorial of a given number using recursion.
'''
print(run_inference(input_text))
def factorial(n):
    if n == 0:
        return 1
    else:
        return n * factorial(n-1)

This answer is correct. Let's see how DBRX Instruct handles a more challenging problem.

input_text = '''
In number theory, a perfect number is a positive integer that is equal to the sum of its positive proper divisors, that is, divisors excluding the number itself. 
For instance, 6 has proper divisors 1, 2 and 3, and 1 + 2 + 3 = 6, so 6 is a perfect number.
In Python, write code to check whether a number is a perfect number.
'''
print(run_inference(input_text))
<|im_start|>assistant
Sure, here is a simple Python function that checks if a number is a perfect number:

```python
def is_perfect(n):
    sum = 1
    i = 2
    while i * i <= n:
        if n % i:
            i += 1
        else:
            if i * (n // i) == n:
                sum += i + n//i
            i += 1
    return sum - 1 == n and n != 1
```

This function works by iterating over all numbers from 2 to the square root of the input number. If the input number is divisible by a certain number, it adds both the divisor and the quotient to the sum. If the sum of the proper divisors equals the input number, then the input number is a perfect number.<|im_end|>

This code is incorrect; for example, is_perfect(6) returns False, because the final comparison subtracts 1 from the accumulated divisor sum. The model failed to solve the problem correctly.
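
For reference, here is a corrected version (our own sketch, not a model output) that accumulates the proper divisors and compares the sum directly, without the off-by-one:

def is_perfect(n):
    """Return True if n equals the sum of its proper divisors."""
    if n < 2:
        return False
    total = 1                    # 1 is a proper divisor of every n > 1
    i = 2
    while i * i <= n:
        if n % i == 0:
            total += i
            if i != n // i:      # avoid double-counting the square root
                total += n // i
        i += 1
    return total == n

print([m for m in range(1, 10000) if is_perfect(m)])  # [6, 28, 496, 8128]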

Summarizing an article

input_text = '''
Summarize the famous article "The Unreasonable Effectiveness of Mathematics in the Natural Sciences" by Eugene Wigner and provide some examples from the article
'''
print(run_inference(input_text))
<|im_start|>assistant
In his article "The Unreasonable Effectiveness of Mathematics in the Natural Sciences," physicist Eugene Wigner explores the mysterious alignment between the abstract world of mathematics and the concrete world of physics. He argues that the effectiveness of mathematics in describing and predicting natural phenomena is not only surprising but also "unreasonable" given that the two domains are fundamentally distinct.

Wigner provides several examples to illustrate his point. One example is the use of complex numbers in quantum mechanics. Complex numbers, which involve imaginary components, may seem abstract and unrelated to the physical world. However, they are essential for describing the behavior of subatomic particles, such as electrons and photons.

Another example is the use of group theory in quantum mechanics and particle physics. Group theory is a branch of mathematics that studies symmetries and transformations. Wigner shows how group theory can be used to classify and predict the properties of subatomic particles, such as their spins and charges.

Wigner also discusses the use of differential equations in physics. Differential equations are mathematical equations that describe how functions change over time or space. They are used extensively in physics to describe phenomena such as waves, heat transfer, and gravitational attraction.

Overall, Wigner's article highlights the remarkable effectiveness of mathematics in describing and predicting natural phenomena, even when the mathematics involved seems abstract and unrelated to the physical world. He suggests that this effectiveness is not fully understood and may point to a deeper underlying unity between the worlds of mathematics and physics.<|im_end|>

Here we see that while the model does a decent job of summarizing the key points of the article, some of the specific examples it provides (such as differential equations and group theory) do not actually appear in the original text. This example shows that DBRX has some tendency to hallucinate.

Explaining a joke

input_text = '''
Explain "Why don't skeletons fight each other? They don't have the guts."
'''
print(run_inference(input_text))
<|im_start|>assistant
The phrase you provided is a joke that plays on the double meaning of the word "guts." In one sense, "guts" can refer to courage or bravery. In another sense, "guts" refers to the internal organs of a living being. Since skeletons are deceased and do not have internal organs, the punchline of the joke is that they lack the "guts" in the physical sense, and therefore cannot fight each other. The joke also implies that skeletons might lack the "guts" in the metaphorical sense, meaning they lack the courage to fight. However, since skeletons are not living beings, they cannot exhibit courage or engage in combat.<|im_end|>

The model picks up on the double meaning of the word "guts" and gives an excellent explanation of the joke. Note that we did not need to include the word "joke" in the prompt.

Writing a story

input_text = '''
Write a short twisted alternate version of "Snow White and the Seven Dwarfs"
'''
print(run_inference(input_text))
<|im_start|>assistant
In this twisted version of "Snow White and the Seven Dwarfs," Snow White is a mischievous prankster who takes advantage of the seven dwarfs' kindness. Instead of being grateful for their hospitality, she uses her beauty and charm to manipulate them into doing her bidding. The dwarfs, who are not the typical hardworking miners, are actually lazy and spend their days watching TV and playing video games. Snow White convinces them to help her pull off elaborate pranks on the townspeople, causing chaos and confusion.

Meanwhile, the Evil Queen, who is not Snow White's stepmother in this version, is a misunderstood witch who is trying to protect the kingdom from Snow White's mischief. She creates the poisoned apple to put Snow White to sleep, hoping to put an end to her pranks and restore peace to the kingdom. However, Snow White's charm wins over the Queen's henchman, who accidentally eats the apple and falls into a deep sleep.

In the end, Snow White realizes the error of her ways and apologizes to the dwarfs and the townspeople. She uses her charm for good, helping the dwarfs become more productive and bringing joy to the kingdom. The Evil Queen, seeing Snow White's change of heart, decides to lift the curse and the henchman wakes up. The kingdom lives happily ever after, with Snow White and the Evil Queen working together to maintain peace and order.<|im_end|>

Conclusion

These demonstrations of DBRX Instruct generating accurate, context-appropriate responses highlight its sophisticated architecture, which combines advanced optimization techniques with a fine-grained mixture of experts. Although the model exhibits some degree of hallucination on certain tasks, its overall ability to understand and generate human-like text is clear.
