diff --git a/AscendIE/TorchAIE/built-in/audio/espnet/README.md b/AscendIE/TorchAIE/built-in/audio/espnet/README.md
index 0852680cccf7a184e385c4b8c24c00b370c1e486..7300ebd1cb24d3b10e3ce671c85796d5b64eae28 100644
--- a/AscendIE/TorchAIE/built-in/audio/espnet/README.md
+++ b/AscendIE/TorchAIE/built-in/audio/espnet/README.md
@@ -1,46 +1,44 @@
# Espnet_conformer Model - Inference Guide
-- [Overview](#ZH-CN_TOPIC_0000001172161501)
+- [Overview](#overview)
+  - [Obtaining the Code](#obtaining-the-code)
+  - [Obtaining the Weight File](#obtaining-the-weight-file)
+  - [Preparing the Dataset](#preparing-the-dataset)
+  - [Input and Output Data](#input-and-output-data)
-- [Inference Environment Setup](#ZH-CN_TOPIC_0000001126281702)
+- [Inference Environment Setup](#inference-environment-setup)
-- [Quick Start](#ZH-CN_TOPIC_0000001126281700)
-
-  - [Obtaining the Source Code](#section183221994400)
-  - [Preparing the Dataset](#section183221994411)
+- [Quick Start](#quick-start)
  - [Model Inference](#model-inference)
  - [Inference Performance and Accuracy](#inference-performance-and-accuracy)
-
-
-
-
-# Overview
+# Overview
Espnet_conformer is an ASR model that uses the Conformer architecture.
-- Reference implementation:
+Reference implementation:
+
+```
+url=https://github.com/espnet/espnet
+branch=v.0.10.5
+model_name=tacotron2
+```
- ```
- url=git clone https://github.com/espnet/espnet
- branch=v.0.10.5
- model_name=tacotron2
- ```
-
+## Obtaining the Code
-Obtain the code for the corresponding commit_id via Git as follows:
+Obtain the code via Git as follows:
```
- git clone {repository_url}        # clone the repository
- cd {repository_name}              # enter the model repository directory
- git checkout {branch/tag}         # switch to the corresponding branch
- git reset --hard {commit_id}      # reset to the corresponding commit_id (optional)
- cd {code_path}                    # switch to the model code path; skip if the repo contains only this model
+git clone https://github.com/espnet/espnet.git  # clone the repository
+cd espnet                                       # enter the repository directory
+git checkout master                             # switch to the corresponding branch
```
+**Also move all files under this project into the cloned repository's directory.**
+
Installing ESPnet is fairly involved; see https://espnet.github.io/espnet/installation.html
If installing MKL fails, get it from launchpad.net/ubuntu/+source/intel-mkl/2020.0.166-1
@@ -49,123 +47,107 @@ Installing ESPnet is fairly involved; see https://espnet.github.io/espnet/installation
Note: MKL is not available for ARM; an x86 environment is recommended.
-## Input and Output Data
-
-- Encoder input
-
-  | Input | Shape | Data Type | Format |
-  | -------- | -------- | ------------------------- | ------------ |
-  | input | input_dynamic_axes_1 x 83 | FLOAT32 | ND |
-
-
-- Encoder output
-
-  | Output | Shape | Data Type | Format |
-  | -------- | -------- | -------- | ------------ |
-  | 2863 | Add2863_dim_0 x Add2863_dim_1 x 256 | FLOAT32 | ND |
-
+## Obtaining the Weight File
+Download from: https://github.com/espnet/espnet/blob/master/egs/aishell/asr1/RESULTS.md
+Use the model link under "Conformer (kernel size = 15) + SpecAugment + LM weight = 0.0".
+After extracting, place the conf, data, and exp folders under espnet/egs/aishell/asr1.
-# Inference Environment Setup \[All Versions\]
+## Preparing the Dataset
-- This model requires the following dependencies.
-
-  **Table 1** Version matrix
-
-| Component | Version |
-|-----------------------|-----------------|
-| CANN | 6.3.RC2.alph002 | - |
-| Python | 3.9.0 |
-| torch | 2.0.1 |
-| Ascend-cann-torch-aie | -
-| Ascend-cann-aie | -
-| Chip type | Ascend310P3 | - |
-
-# Quick Start
-Install the dependencies.
+In the espnet/egs/aishell/asr1/ directory, run bash run.sh --stage -1 --stop_stage -1 to download the dataset.
- Install om_gener
+Run bash run.sh --stage 0 --stop_stage 0 to process the dataset.
- ```
- git clone https://gitee.com/peng-ao/om_gener.git
- cd om_gener
- pip3 install .
- ```
+Run bash run.sh --stage 1 --stop_stage 1 to process the dataset.
- Install acl_infer
+Run bash run.sh --stage 2 --stop_stage 2 to process the dataset.
- ```
- git clone https://gitee.com/peng-ao/pyacl.git
- cd pyacl
- pip3 install .
- ```
+Run bash run.sh --stage 3 --stop_stage 3 to process the dataset.
-## Obtaining the Source Code
+If a required folder is missing, create it yourself, e.g. with the sketch below.
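+
+A minimal sketch of this step in Python (the folder names here are only examples; create whichever directories your run actually reports as missing):
+
+```python
+import os
+
+# hypothetical folder names -- substitute the ones reported as missing
+for d in ("data", "dump", "exp"):
+    os.makedirs(os.path.join("egs/aishell/asr1", d), exist_ok=True)
+```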
-Run the following commands in the working directory to obtain the source code and switch to the corresponding path.
+## Input and Output Data
-Install the code by following the official installation guide.
+- Encoder input
-## Preparing the Dataset
+
+| Input | Shape | Data Type | Format |
+| -------- | -------- | ------------------------- | ------------ |
+| input | input_dynamic_axes_1 x 83 | FLOAT32 | ND |
-In the espnet/egs/aishell/asr1/ directory, run bash run.sh --stage -1 --stop_stage -1 to download the dataset.
+- Encoder output
-Run bash run.sh --stage 0 --stop_stage 0 to process the dataset.
+
+| Output | Shape | Data Type | Format |
+| -------- | -------- | -------- | ------------ |
+| 2863 | Add2863_dim_0 x Add2863_dim_1 x 256 | FLOAT32 | ND |
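+
+For reference, a minimal sketch of building a tensor matching the encoder input layout (262 x 83 is just one of the length gears used later; any frames x 83 FLOAT32 tensor fits the dynamic axis):
+
+```python
+import torch
+
+# dummy acoustic features: (frames, 83), FLOAT32, ND layout
+dummy_input = torch.randn(262, 83, dtype=torch.float32)
+```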
-Run bash run.sh --stage 1 --stop_stage 1 to process the dataset.
-Run bash run.sh --stage 2 --stop_stage 2 to process the dataset.
+# Inference Environment Setup
-Run bash run.sh --stage 3 --stop_stage 3 to process the dataset.
+The firmware and environment used by this model are as follows:
+
+| Component | Version |
+|-----------------------|------------------|
+| CANN | 7.0.RC1.alpha003 |
+| Python | 3.10.0 |
+| torch | 2.1.0 |
+| MindIETorch | 1.0.RC1 |
+| SoC | Ascend310P3 |
-If a required folder is missing, create it yourself.
-
-## Model Inference
+Install om_gener with the following commands:
+```
+git clone https://gitee.com/peng-ao/om_gener.git
+cd om_gener
+pip3 install .
+```
-1. Model conversion.
+Install acl_infer with the following commands:
+```
+git clone https://gitee.com/peng-ao/pyacl.git
+cd pyacl
+pip3 install .
+```
-   This conversion is based on the Espnet_conformer model trained with the open-source PyTorch framework. PyTorch converts the .pth weight file to a .onnx file, and the ATC tool then converts the .onnx file to an offline inference .om file.
-   1. Obtain the weight file in the checkpoints directory.
+# Quick Start
-      Download from: https://github.com/espnet/espnet/blob/master/egs/aishell/asr1/RESULTS.md
-
-      Use the model link under Conformer (kernel size = 15) + SpecAugment + LM weight = 0.0.
-
-      After extracting, place the conf, data, and exp folders under espnet/egs/aishell/asr1.
-
-   2. Export the TorchScript model for compilation and optimization.
+## Model Compilation
-      1. Run the following commands to patch the source code
- ```shell
- cd espnet
- git checkout v.0.10.5
- patch -p1 < export_onnx.diff
- ```
-      2. Put export.py in the espnet root directory and run the following to generate espnet_trace.ts
- ```
- python3 export.py --model_path egs/aishell/asr1/exp/train_sp_pytorch_train_pytorch_conformer_kernel15_specaug/results/model.last10.avg.best
- ```
-   2. Run the following commands to compile the model (note: the environment for compiling the AIE model differs from the ESPnet runtime environment; see "Inference Environment Setup")
-      ```shell
-      # gear (static-shape bucket) model
-      python3 compile.py --model_path=./espnet_trace.ts --flag=gear
-      # dynamic-shape model
-      python3 compile.py --model_path=./espnet_trace.ts --flag=dynamic
-      ```
-   After it finishes, espnet_gear.ts, espnet_dynamic.ts, espnet_gear.om, and espnet_dynamic.om are generated in the current directory.
-   The two .ts files are used for later performance tests, and the two .om files for later accuracy tests.
+1. Export the TorchScript model for compilation and optimization
+   - Run the following commands to patch the source code
+ ```shell
+ cd espnet
+ git checkout v.0.10.5
+ patch -p1 < export_onnx.diff
+ ```
+   - Run the following command to generate espnet_trace.ts
+ ```
+ python3 export.py --model_path egs/aishell/asr1/exp/train_sp_pytorch_train_pytorch_conformer_kernel15_specaug/results/model.last10.avg.best
+ ```
+   - Run the following commands to compile the model (note: the compilation environment differs from the ESPnet runtime environment; see "Inference Environment Setup")
+     ```shell
+     # gear (static-shape bucket) model
+     python3 compile.py --model_path=./espnet_trace.ts --flag=gear
+     # dynamic-shape model
+     python3 compile.py --model_path=./espnet_trace.ts --flag=dynamic
+     ```
+     After it finishes, espnet_gear.ts, espnet_dynamic.ts, espnet_gear.om, and espnet_dynamic.om are generated in the current directory.
+     The two .ts files are used for later performance tests, and the two .om files for later accuracy tests.
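+     As a quick smoke test, a compiled .ts model can be loaded and run directly; a minimal sketch assuming device 0 and the gear model (mirroring what perf_test.py does):
+     ```python
+     import torch
+     import mindietorch
+
+     mindietorch.set_device(0)
+     model = torch.jit.load('./espnet_gear.ts').eval()
+     x = torch.rand(262, 83).to('npu:0')  # 262 is one of the compiled gears
+     out = model(x)
+     ```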
-2. Start inference verification.
+2. Export the FX model for compilation and optimization
+   - Run the following command to produce espnet_dynamic_fx.pt for later performance and accuracy tests
+     ```shell
+     python3 compile_fx.py
+     ```
-   1. Obtain the accuracy
+## Model Inference
-      ① Static shape
+1. Inference verification with TorchScript
+   - Accuracy verification
+     **Static shape**
First, change the om model path in acc.diff (around line 162) to the path of the generated om file.
-
```shell
cd espnet
git checkout v.0.10.5
@@ -174,10 +156,8 @@ Installing ESPnet is fairly involved; see https://espnet.github.io/espnet/installation
bash acc.sh
```
-      ② Dynamic shape
-
+     **Dynamic shape**
First, change the om model path in acc_dynamic.diff (around line 162) to the path of the generated om file.
-
```shell
cd espnet
git checkout v.0.10.5
@@ -186,10 +166,9 @@ Installing ESPnet is fairly involved; see https://espnet.github.io/espnet/installation
cd espnet/egs/aishell/asr1
bash acc.sh
```
-
The accuracy is printed to the screen and saved to espnet/egs/aishell/asr1/exp/train_sp_pytorch_train_pytorch_conformer_kernel15_specaug/decode_test_decode_lm0.0_lm_4/result.txt
-   2. Performance test
+   - Performance test
```shell
# gear (static-shape bucket) model
python3 perf_test.py --model_path=./espnet_gear.ts
@@ -198,12 +177,35 @@ Installing ESPnet is fairly involved; see https://espnet.github.io/espnet/installation
```
When it finishes, the performance results are printed.
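+
+     The reported fps is the throughput over the full AISHELL-1 test set (7176 utterances): each gear's mean latency is weighted by the number of test utterances of that length. A sketch of the computation perf_test.py performs:
+     ```python
+     # shape_t: mean latency per gear; shape_num: utterances per gear (sums to 7176)
+     fps = 7176 / sum(t * n for t, n in zip(shape_t, shape_num))
+     ```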
+2. Inference verification with FX
+
+   - Accuracy verification
+     Run the following to check the cosine similarity between the compiled model's outputs and the original torch model's outputs
+ ```shell
+ python3 perf_test_fx.py --mode accuracy
+ ```
+
+   - Performance test
+     Run the following to measure the PTA (PyTorch Adapter) baseline performance
+ ```shell
+ python3 perf_test_pta.py
+ ```
+     Run the following to measure the dynamic model's performance
+ ```shell
+ python3 perf_test_fx.py --mode performance
+ ```
+
-# Inference Performance and Accuracy
+# Inference Performance and Accuracy
-Inference is run via aclruntime; see the data below for performance and accuracy.
+TorchScript performance and accuracy reference data:
-| Model | 310P perf (pt plugin) | 310P perf (om) | 310P accuracy (Err) |
-|-----------------|--------------------|-------------|-------------|
-| Espnet_conformer | gear: 358 fps; dynamic: 55 fps | gear: 430 fps; dynamic: 25 fps | 5.4% |
+
+| Model | 310P perf (pt plugin) | 310P perf (om) | 310P accuracy (Err) |
+|-----------------|---------------------------|---------------------------|--------------|
+| Espnet_conformer | gear: 358 fps; dynamic: 55 fps | gear: 430 fps; dynamic: 25 fps | 5.4% |
+
+FX performance reference data are listed below; the dynamic model compiled from the FX graph is more than 1.5x faster than the PTA model (63.13 / 40.35 ≈ 1.56), meeting the delivery requirement.
+
+| Model | 310P perf |
+|----------------------|---------------------------|
+| FX compiled model (dynamic) | 63.13 fps |
+| PTA model | 40.35 fps |
diff --git a/AscendIE/TorchAIE/built-in/audio/espnet/compile.py b/AscendIE/TorchAIE/built-in/audio/espnet/compile.py
index e98d668f8022179359a5e21e645fef50c38c1a43..a250950ed3efe95df030e60db13f5a8dc39718b1 100644
--- a/AscendIE/TorchAIE/built-in/audio/espnet/compile.py
+++ b/AscendIE/TorchAIE/built-in/audio/espnet/compile.py
@@ -12,11 +12,11 @@
# See the License for the specific language governing permissions and
# limitations under the License.
+import os
import argparse
import torch
-import torch_aie
-from torch_aie import _enums
+import mindietorch
def parse_args():
@@ -32,7 +32,7 @@ def parse_args():
def main():
args = parse_args()
- torch_aie.set_device(args.device_id)
+ mindietorch.set_device(args.device_id)
model = torch.jit.load(args.model_path)
model.eval()
@@ -41,34 +41,34 @@ def main():
gear_list = [262, 326, 390, 454, 518, 582, 646, 710, 774, 838, 902, 966, 1028, 1284, 1478]
inputs = []
for gear in gear_list:
- inputs.append([torch_aie.Input((gear, 83))])
+ inputs.append([mindietorch.Input((gear, 83))])
elif args.flag == 'dynamic':
min_shape = (1, 83)
max_shape = (1500, 83)
- inputs = [torch_aie.Input(min_shape=min_shape, max_shape=max_shape)]
+ inputs = [mindietorch.Input(min_shape=min_shape, max_shape=max_shape)]
else:
raise ValueError('Invalid model type.')
print('Start compiling model...')
- compiled_model = torch_aie.compile(
+ compiled_model = mindietorch.compile(
model,
inputs=inputs,
- precision_policy=_enums.PrecisionPolicy.FP16,
+ precision_policy=mindietorch._enums.PrecisionPolicy.FP16,
allow_tensor_replace_int=False,
soc_version="Ascend310P3")
print('Model compiled successfully.')
compiled_model.save(f'./espnet_{args.flag}.ts')
print('Start exporting om model...')
- torch_aie.export_engine(
+ compiled_engine = mindietorch.export_engine(
model,
inputs=inputs,
- precision_policy=_enums.PrecisionPolicy.FP16,
+ precision_policy=mindietorch._enums.PrecisionPolicy.FP16,
allow_tensor_replace_int=False,
soc_version="Ascend310P3",
- method_name="forward",
- path=f'./espnet_{args.flag}.om'
- )
+ method_name="forward")
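+    # O_EXCL makes the open fail if the .om file already exists, so a previously
+    # exported engine is never silently overwritten; mode 0o700 restricts it to the owner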
+ with os.fdopen(os.open(f'./espnet_{args.flag}.om', os.O_WRONLY | os.O_CREAT | os.O_EXCL, mode=0o700), 'wb') as file:
+ file.write(compiled_engine)
print('Model exported successfully.')
diff --git a/AscendIE/TorchAIE/built-in/audio/espnet/compile_fx.py b/AscendIE/TorchAIE/built-in/audio/espnet/compile_fx.py
new file mode 100644
index 0000000000000000000000000000000000000000..1f90f2d94e4632752d1f343db1452c6dac2eb63e
--- /dev/null
+++ b/AscendIE/TorchAIE/built-in/audio/espnet/compile_fx.py
@@ -0,0 +1,64 @@
+# Copyright(C) 2023. Huawei Technologies Co.,Ltd. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import os
+import argparse
+
+import torch
+from torch._export import export
+import mindietorch
+from espnet.asr.pytorch_backend.asr import load_trained_model
+
+
+def fx_compile_dynamic(torch_model):
+ print("Begin compile dynamic model!")
+ min_shape = (64, 83)
+ max_shape = (1500, 83)
+ input_shape = (262, 83)
+ input_tensor = torch.randn(input_shape)
+
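+    # torch._export.export traces the encoder into an ExportedProgram; the
+    # (262, 83) example input is just one representative shape within the
+    # [min_shape, max_shape] range handed to mindietorch below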
+ ep = export(torch_model.encoder, args=(input_tensor,))
+ print("Finish export dynamic model!")
+
+ inps = [mindietorch.Input(min_shape=min_shape, max_shape=max_shape, dtype=torch.float32)]
+ compiled_model = mindietorch.compile(ep, inputs=inps, ir="dynamo")
+ torch.save(compiled_model, "./espnet_dynamic_fx.pt")
+ print("Finish compile dynamic model!")
+
+
+def main():
+ parser = argparse.ArgumentParser()
+ parser.add_argument('--model_path', type=str, default="egs/aishell/asr1/exp/"
+ "train_sp_pytorch_train_pytorch_conformer_kernel15_specaug/results/model.last10.avg.best")
+ parser.add_argument('--device_id', type=int, default=0, help='NPU device id')
+ args = parser.parse_args()
+
+ mindietorch.set_device(args.device_id)
+
+ model, train_args = load_trained_model(args.model_path)
+ model.eval()
+
+    # monkey patch: set assume_static_by_default=False so torch.export treats input dims as dynamic
+ original_export = torch._dynamo.export
+ def patched_export(*args, **kwargs):
+ kwargs['assume_static_by_default'] = False
+ return original_export(*args, **kwargs)
+ torch._dynamo.export = patched_export
+ print("Finish monkey patch!")
+
+ fx_compile_dynamic(model)
+
+
+if __name__ == '__main__':
+ main()
\ No newline at end of file
diff --git a/AscendIE/TorchAIE/built-in/audio/espnet/perf_test.py b/AscendIE/TorchAIE/built-in/audio/espnet/perf_test.py
index 04ac964d4cb029f8a6c45ef6741a40c4e77d762f..e20bf1277ca6d501b1a4e881047856f0ecac121f 100644
--- a/AscendIE/TorchAIE/built-in/audio/espnet/perf_test.py
+++ b/AscendIE/TorchAIE/built-in/audio/espnet/perf_test.py
@@ -16,7 +16,7 @@ import argparse
import time
import torch
-import torch_aie
+import mindietorch
import numpy as np
@@ -31,9 +31,9 @@ def parse_args():
if __name__ == '__main__':
args = parse_args()
- torch_aie.set_device(args.device_id)
+ mindietorch.set_device(args.device_id)
device = f'npu:{args.device_id}'
- stream = torch_aie.npu.Stream(device)
+ stream = mindietorch.npu.Stream(device)
model = torch.jit.load(args.model_path)
model.eval()
@@ -43,7 +43,7 @@ if __name__ == '__main__':
num_warmup = 20
random_input = torch.rand(1478, 83).to(device)
for _ in range(num_warmup):
- with torch_aie.npu.stream(stream):
+ with mindietorch.npu.stream(stream):
model(random_input)
stream.synchronize()
print('warmup done')
@@ -60,7 +60,7 @@ if __name__ == '__main__':
cur_time = 0
random_input = torch.rand(shape, 83).to(device)
for i in range(num_infer_per_shape):
- with torch_aie.npu.stream(stream):
+ with mindietorch.npu.stream(stream):
infer_start = time.time()
model(random_input)
stream.synchronize()
diff --git a/AscendIE/TorchAIE/built-in/audio/espnet/perf_test_fx.py b/AscendIE/TorchAIE/built-in/audio/espnet/perf_test_fx.py
new file mode 100644
index 0000000000000000000000000000000000000000..a584bb6ab3f5a793122d41b35fa44115300a6577
--- /dev/null
+++ b/AscendIE/TorchAIE/built-in/audio/espnet/perf_test_fx.py
@@ -0,0 +1,119 @@
+# Copyright(C) 2023. Huawei Technologies Co.,Ltd. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import argparse
+import time
+
+import torch
+import mindietorch
+import numpy as np
+
+from espnet.asr.pytorch_backend.asr import load_trained_model
+
+
+COSINE_THRESHOLD = 0.999
+
+def cosine_similarity(gt_tensor, pred_tensor):
+ gt_tensor = gt_tensor.flatten().to(torch.float32)
+ pred_tensor = pred_tensor.flatten().to(torch.float32)
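+    # an all-zero tensor makes cosine similarity ill-defined; fall back to allclose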
+ if torch.sum(gt_tensor) == 0.0 or torch.sum(pred_tensor) == 0.0:
+ if torch.allclose(gt_tensor, pred_tensor, atol=1e-4, rtol=1e-4, equal_nan=True):
+ return 1.0
+ res = torch.nn.functional.cosine_similarity(gt_tensor, pred_tensor, dim=0, eps=1e-6)
+ res = res.cpu().detach().item()
+ return res
+
+
+def accuracy(model, torch_model, device):
+    # accuracy test: compare compiled-model outputs against the original torch model
+ print('Start accuracy test.')
+ compare_res = 0
+ num_infer_per_shape = 20
+ shapes = [262, 326, 390, 454, 518, 582, 646, 710, 774, 838, 902, 966, 1028, 1284, 1478]
+ for shape in shapes:
+ for i in range(num_infer_per_shape):
+ random_input = torch.rand(shape, 83).to(device)
+
+ mindie_res = model(random_input)
+ torch_res = torch_model.encoder(random_input.to("cpu"))
+
+ for m, t in zip(mindie_res, torch_res):
+ res = cosine_similarity(m.to("cpu"), t)
+ if res < COSINE_THRESHOLD:
+ compare_res += 1
+
+ if compare_res == 0:
+ print("Compare success! Compiled model has the same output with origin torch model!")
+ else:
+ print("Compare failed! {} samples are not equal with origin torch model!".format(compare_res))
+
+
+def performance(model, device):
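+    # note: uses the module-level `stream` created in the __main__ block below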
+ # warm up
+ num_warmup = 20
+ random_input = torch.rand(1478, 83).to(device)
+ for _ in range(num_warmup):
+ with mindietorch.npu.stream(stream):
+ model(random_input)
+ stream.synchronize()
+ print('warmup done')
+
+ # performance test
+ print('Start performance test.')
+ num_infer_per_shape = 20
+ shapes = [262, 326, 390, 454, 518, 582, 646, 710, 774, 838, 902, 966, 1028, 1284, 1478]
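+    # number of AISHELL-1 test utterances per length gear (sums to 7176)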
+ shape_num = [96, 682, 1260, 1230, 1052, 940, 656, 462, 303, 207, 132, 67, 38, 48, 3]
+ shape_t = []
+ for shape in shapes:
+ cur_time = 0
+ random_input = torch.rand(shape, 83).to(device)
+ for i in range(num_infer_per_shape):
+ with mindietorch.npu.stream(stream):
+ infer_start = time.time()
+ model(random_input)
+ stream.synchronize()
+ infer_end = time.time()
+ cur_time += infer_end - infer_start
+ shape_t.append(cur_time / num_infer_per_shape)
+ total_time = np.multiply(np.array(shape_t), np.array(shape_num))
+ total_time = total_time.tolist()
+ fps = 1 / (sum(total_time) / 7176)
+ print("fps:", fps)
+
+
+if __name__ == '__main__':
+ parser = argparse.ArgumentParser()
+ parser.add_argument('--model_path', type=str, default='./espnet_dynamic_fx.pt',
+ help='Compiled model path')
+ parser.add_argument('--torch_model_path', type=str, default="egs/aishell/asr1/exp/"
+ "train_sp_pytorch_train_pytorch_conformer_kernel15_specaug/results/model.last10.avg.best")
+ parser.add_argument('--device_id', type=int, default=0, help='NPU device id')
+ parser.add_argument('--mode', type=str, default="performance")
+ args = parser.parse_args()
+
+ mindietorch.set_device(args.device_id)
+ device = f'npu:{args.device_id}'
+ stream = mindietorch.npu.Stream(device)
+
+ model = torch.load(args.model_path)
+ print('Model loaded successfully.')
+
+ if args.mode == "performance":
+ performance(model, device)
+ elif args.mode == "accuracy":
+ torch_model, train_args = load_trained_model(args.torch_model_path)
+ torch_model.eval()
+ accuracy(model, torch_model, device)
diff --git a/AscendIE/TorchAIE/built-in/audio/espnet/perf_test_pta.py b/AscendIE/TorchAIE/built-in/audio/espnet/perf_test_pta.py
new file mode 100644
index 0000000000000000000000000000000000000000..4c8a8cc9e21fc9d8edac4cafaa1c44bdb2cad3ed
--- /dev/null
+++ b/AscendIE/TorchAIE/built-in/audio/espnet/perf_test_pta.py
@@ -0,0 +1,71 @@
+# Copyright(C) 2023. Huawei Technologies Co.,Ltd. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import argparse
+import time
+
+import torch
+import torch_npu  # Ascend PyTorch adapter: registers the torch.npu backend used below
+import numpy as np
+from espnet.asr.pytorch_backend.asr import load_trained_model
+
+
+def parse_args():
+ parser = argparse.ArgumentParser()
+ parser.add_argument('--model_path', type=str, default="egs/aishell/asr1/exp/"
+ "train_sp_pytorch_train_pytorch_conformer_kernel15_specaug/results/model.last10.avg.best")
+ parser.add_argument('--device_id', type=int, default=0, help='NPU device id')
+ return parser.parse_args()
+
+
+if __name__ == '__main__':
+ args = parse_args()
+
+    device = f'npu:{args.device_id}'
+    model, train_args = load_trained_model(args.model_path)
+    model.eval()
+    model.to(device)
+ print('Model loaded successfully.')
+
+ # warm up
+ num_warmup = 20
+ random_input = torch.rand(1478, 83).to(device)
+ for _ in range(num_warmup):
+ torch.npu.synchronize()
+ model.encoder(random_input, None)
+ torch.npu.synchronize()
+ print('warmup done')
+
+ # performance test
+ print('Start performance test.')
+ num_infer_per_shape = 20
+ shapes = [262, 326, 390, 454, 518, 582, 646, 710, 774, 838, 902, 966, 1028, 1284, 1478]
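+    # number of AISHELL-1 test utterances per length gear (sums to 7176)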
+ shape_num = [96, 682, 1260, 1230, 1052, 940, 656, 462, 303, 207, 132, 67, 38, 48, 3]
+ shape_t = []
+ for shape in shapes:
+ cur_time = 0
+ random_input = torch.rand(shape, 83).to(device)
+        for i in range(num_infer_per_shape):
+            torch.npu.synchronize()
+            infer_start = time.time()
+            model.encoder(random_input, None)
+            torch.npu.synchronize()  # wait for async NPU kernels before stopping the timer
+            infer_end = time.time()
+            cur_time += infer_end - infer_start
+ shape_t.append(cur_time / num_infer_per_shape)
+ total_time = np.multiply(np.array(shape_t), np.array(shape_num))
+ total_time = total_time.tolist()
+ fps = 1 / (sum(total_time) / 7176)
+ print("fps:", fps)