Qualcomm AI Hub Examples (Part 2): Model Profiling

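Introduction

Model Profiling

When you deploy a model to a device, several important questions come up:

What is the inference latency on the target hardware?
Does the model fit within a given memory budget?
Can the model take advantage of the neural processing unit (NPU)?

Profiling the model on a physical device hosted in the cloud answers these questions.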
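Once a profile job has finished, its results can be checked against these three questions in code. The sketch below is only an illustration and is not part of the original example. It assumes the profile is available as a dictionary with an `execution_summary` entry (holding `estimated_inference_time` in microseconds and `estimated_inference_peak_memory` in bytes) and an `execution_detail` list whose items carry a `compute_unit` field. Those key names, the `check_profile` helper, and the sample values are all assumptions; inspect the real output of your job before relying on them.

```python
from collections import Counter
from typing import Any, Dict


def check_profile(
    profile: Dict[str, Any], max_latency_ms: float, max_memory_mb: float
) -> Dict[str, Any]:
    """Answer the three deployment questions from a profile dictionary.

    The key names read here are assumptions about the profile layout.
    """
    summary = profile["execution_summary"]
    latency_ms = summary["estimated_inference_time"] / 1000.0  # assumed microseconds
    memory_mb = summary["estimated_inference_peak_memory"] / (1024 * 1024)  # assumed bytes

    # Count how many layers ran on each compute unit (e.g. CPU, GPU, NPU).
    units = Counter(layer["compute_unit"] for layer in profile["execution_detail"])

    return {
        "latency_ms": latency_ms,
        "meets_latency": latency_ms <= max_latency_ms,
        "memory_mb": memory_mb,
        "meets_memory": memory_mb <= max_memory_mb,
        "uses_npu": units.get("NPU", 0) > 0,
        "layers_per_unit": dict(units),
    }


# Made-up values for illustration only (not measured results).
sample_profile = {
    "execution_summary": {
        "estimated_inference_time": 2500,  # 2.5 ms
        "estimated_inference_peak_memory": 20 * 1024 * 1024,  # 20 MB
    },
    "execution_detail": [
        {"name": "conv1", "compute_unit": "NPU"},
        {"name": "conv2", "compute_unit": "NPU"},
        {"name": "softmax", "compute_unit": "CPU"},
    ],
}

report = check_profile(sample_profile, max_latency_ms=5.0, max_memory_mb=64.0)
print(report)
```

Keeping the budgets as function arguments makes it easy to reuse the same check for different devices or product requirements.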

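Profiling a Previously Compiled Model

Qualcomm AI Hub supports profiling a model that has already been compiled. In this example, we optimize and profile a model that was previously compiled with submit_compile_job(). Note how we obtain the compiled model from compile_job by calling get_target_model().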

import qai_hub as hub

# Profile the previously compiled model.
# compile_job is the CompileJob returned by an earlier submit_compile_job() call.
profile_job = hub.submit_profile_job(
    model=compile_job.get_target_model(),
    device=hub.Device("Samsung Galaxy S23"),
)
assert isinstance(profile_job, hub.ProfileJob)

The return value is an instance of ProfileJob. To see a list of all your jobs, go to /jobs/.

Profiling a PyTorch Model

This example requires PyTorch, which can be installed as follows.

pip3 install "qai-hub[torch]"

In this example, we use Qualcomm AI Hub to optimize and profile a PyTorch model.

from typing import List, Tuple

import torch

import qai_hub as hub


class SimpleNet(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(5, 2)

    def forward(self, x):
        return self.linear(x)


input_shapes: List[Tuple[int, ...]] = [(3, 5)]
torch_model = SimpleNet()

# Trace the model using random inputs
torch_inputs = tuple(torch.randn(shape) for shape in input_shapes)
pt_model = torch.jit.trace(torch_model, torch_inputs)

# Submit compile job
compile_job = hub.submit_compile_job(
    model=pt_model,
    device=hub.Device("Samsung Galaxy S23 Ultra"),
    input_specs=dict(x=input_shapes[0]),
)
assert isinstance(compile_job, hub.CompileJob)

# Submit profile job using results from the compile job
profile_job = hub.submit_profile_job(
    model=compile_job.get_target_model(),
    device=hub.Device("Samsung Galaxy S23 Ultra"),
)
assert isinstance(profile_job, hub.ProfileJob)

For more information on the options available when uploading, compiling, and submitting jobs, see upload_model(), submit_compile_job(), and submit_profile_job().
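In the example above, the input_specs key `x` happens to match the argument name of `forward()`. For models with several inputs, one way to keep names and shapes in the same order is to read the names from the `forward()` signature. The sketch below is a convenience under that assumption, not something the AI Hub API requires: `build_input_specs` and `TwoInputNet` are hypothetical names, and the class is a plain-Python stand-in for a `torch.nn.Module` so that the snippet runs without PyTorch installed.

```python
import inspect
from typing import Dict, List, Tuple


class TwoInputNet:
    """Stand-in for a torch.nn.Module; only the forward() signature matters here."""

    def forward(self, image, mask):
        return image


def build_input_specs(model, shapes: List[Tuple[int, ...]]) -> Dict[str, Tuple[int, ...]]:
    """Pair each forward() argument name with a shape, in declaration order."""
    # For a bound method, inspect.signature() already omits `self`.
    names = list(inspect.signature(model.forward).parameters)
    if len(names) != len(shapes):
        raise ValueError(
            f"forward() takes {len(names)} inputs, but {len(shapes)} shapes were given"
        )
    return dict(zip(names, shapes))


input_specs = build_input_specs(TwoInputNet(), [(1, 3, 224, 224), (1, 1, 224, 224)])
print(input_specs)
```

The resulting dictionary has the same form as the `input_specs=dict(x=input_shapes[0])` argument used above, and the length check catches a mismatched shape list before a job is submitted.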

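Profiling a TorchScript Model

If you have already saved a traced or scripted torch model (using torch.jit.save), you can submit it directly. We will use mobilenet_v2.pt as an example. As in the previous example, a TorchScript model can only be profiled after it has been compiled for a suitable target.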

import qai_hub as hub

# Compile previously saved torchscript model
compile_job = hub.submit_compile_job(
    model="mobilenet_v2.pt",
    device=hub.Device("Samsung Galaxy S23 Ultra"),
    input_specs=dict(image=(1, 3, 224, 224)),
)
assert isinstance(compile_job, hub.CompileJob)

profile_job = hub.submit_profile_job(
    model=compile_job.get_target_model(),
    device=hub.Device("Samsung Galaxy S23 Ultra"),
)
assert isinstance(profile_job, hub.ProfileJob)

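Profiling an ONNX Model

Qualcomm AI Hub also supports ONNX. As in the previous examples, an ONNX model can only be profiled after it has been compiled for a suitable target. We will use mobilenet_v2.onnx as an example.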

import qai_hub as hub

compile_job = hub.submit_compile_job(
    model="mobilenet_v2.onnx",
    device=hub.Device("Samsung Galaxy S23 Ultra"),
)
assert isinstance(compile_job, hub.CompileJob)

profile_job = hub.submit_profile_job(
    model=compile_job.get_target_model(),
    device=hub.Device("Samsung Galaxy S23"),
)
assert isinstance(profile_job, hub.ProfileJob)

Profiling a TensorFlow Lite Model

Qualcomm AI Hub also supports profiling models in the .tflite format. We will use the SqueezeNet10 model.

import qai_hub as hub

# Profile TensorFlow Lite model (from file)
profile_job = hub.submit_profile_job(
    model="SqueezeNet10.tflite",
    device=hub.Device("Samsung Galaxy S23 Ultra"),
)

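Profiling a Model on Multiple Devices

It is often important to measure performance on more than one device. In this example, we profile on recent Snapdragon® 8 Gen 1 and Snapdragon™ 8 Gen 2 devices to get good test coverage. We reuse the SqueezeNet model from the TensorFlow Lite example, but this time we profile it on two devices.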

import qai_hub as hub

devices = [
    hub.Device("Samsung Galaxy S23 Ultra"),  # Snapdragon 8 Gen 2
    hub.Device("Samsung Galaxy S22 Ultra 5G"),  # Snapdragon 8 Gen 1
]

jobs = hub.submit_profile_job(model="SqueezeNet10.tflite", device=devices)

A separate profile job is created for each device.
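Because each device gets its own job, the results can be compared side by side once the jobs finish. The sketch below assumes you have already extracted one latency value in milliseconds per device; how you obtain it from each job is outside this snippet. The `rank_devices` helper and the numbers are hypothetical placeholders, not measured results.

```python
from typing import Dict, List, Tuple


def rank_devices(latency_ms: Dict[str, float]) -> List[Tuple[str, float, float]]:
    """Sort devices from fastest to slowest and report the slowdown vs. the fastest."""
    fastest = min(latency_ms.values())
    ranked = sorted(latency_ms.items(), key=lambda item: item[1])
    return [(name, ms, ms / fastest) for name, ms in ranked]


# Placeholder numbers for illustration only, not measured results.
latencies = {
    "Samsung Galaxy S22 Ultra 5G": 1.5,
    "Samsung Galaxy S23 Ultra": 1.0,
}

ranked = rank_devices(latencies)
for name, ms, ratio in ranked:
    print(f"{name}: {ms:.2f} ms ({ratio:.2f}x the fastest)")
```

Reporting the ratio alongside the absolute latency makes it easier to see how much headroom is lost on the older chipset.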

Uploading a Model for Profiling

A model (for example, SqueezeNet10.tflite) can be uploaded without submitting a profile job.

import qai_hub as hub

hub_model = hub.upload_model("SqueezeNet10.tflite")
print(hub_model)

You can now use the uploaded model's model_id to run a profile job.

import qai_hub as hub

# Retrieve model using ID
hub_model = hub.get_model("mabc123")

# Submit job
profile_job = hub.submit_profile_job(
    model=hub_model,
    device=hub.Device("Samsung Galaxy S23 Ultra"),
    input_shapes=dict(x=(1, 3, 224, 224)),
)

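Profiling a Model from a Previous Job

We can reuse the model from a previous job to launch a new profile job (for example, on a different device). This avoids uploading the same model multiple times.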

import qai_hub as hub

# Get the model from the profile job
profile_job = hub.get_job("jabc123")
hub_model = profile_job.model

# Run the model from the job
new_profile_job = hub.submit_profile_job(
    model=hub_model,
    device=hub.Device("Samsung Galaxy S22 Ultra 5G"),
)

Author: Zhongzhong Dai (戴忠忠), Qualcomm engineer
