diff --git a/ACL_PyTorch/built-in/audio/Wenet_for_Pytorch/README.md b/ACL_PyTorch/built-in/audio/Wenet_for_Pytorch/README.md
index e3d5cafbe8d26f37f43144049d41613ac5b2fba8..46a5a33fa512d641fa38d19c0de799253b6848c6 100644
--- a/ACL_PyTorch/built-in/audio/Wenet_for_Pytorch/README.md
+++ b/ACL_PyTorch/built-in/audio/Wenet_for_Pytorch/README.md
@@ -8,7 +8,9 @@
 pip3.7 install -r requirements.txt
 ```
 
-2. Obtain, modify, and install the open-source model code
+If any other dependency is missing, install it manually.
+
+2. Obtain, modify, and install the open-source model code
 
 ```
 git clone https://github.com/wenet-e2e/wenet.git
@@ -18,13 +20,17 @@ git reset 9c4e305bcc24a06932f6a65c8147429d8406cc63 --hard
 
 3. Download the network weights and export the ONNX models
 
-Download link: http://mobvoi-speech-public.ufile.ucloud.cn/public/wenet/aishell/20210601_u2pp_conformer_exp.tar.gz Download the archive, extract it, and place the extracted files under wenet/examples/aishell/s0/exp/conformer_u2; create this folder if it does not exist.
+Download link: http://mobvoi-speech-public.ufile.ucloud.cn/public/wenet/aishell/20210601_u2pp_conformer_exp.tar.gz
+
+Download the archive, extract it, and place the extracted files under wenet/examples/aishell/s0/exp/conformer_u2; create this folder if it does not exist.
 
 First put all the provided diff files in the wenet root directory and run patch -p1 < export_onnx.diff to adapt the code for ONNX export. Put the provided export_onnx.py, process_encoder_data_flash.py, process_encoder_data_noflash.py, recognize_attenstion_rescoring.py, and static.py files under wenet/wenet/bin/, put the provided slice_helper.py and acl_net.py under wenet/wenet/transformer, and put the provided sh scripts under wenet/examples/aishell/s0/. Then run bash export_onnx.sh exp/conformer_u2/train.yaml exp/conformer_u2/final.pt; the ONNX files are written to the onnx folder in the current directory.
 
 4. Run the scripts to convert the ONNX models to om models
 
-First modify the graphs with the graph-modification tool om_gener (available at https://gitee.com/liurf_hw/om_gener); after installing it, modify the models with the following commands:
+First modify the graphs with the graph-modification tool om_gener, available at https://gitee.com/liurf_hw/om_gener.
+
+Put the ONNX-adaptation scripts in the same folder as the generated ONNX files; after installing om_gener, modify the models with the following commands:
 
 python3 adaptdecoder.py generates decoder_final.onnx
 
@@ -32,9 +38,17 @@ python3 adaptencoder.py generates encoder_revise.onnx
 
 python3 adaptnoflashencoder.py generates no_flash_encoder_revise.onnx
 
-Configure the environment variables and use the atc tool to convert the models to om files; the commands are given in the provided encoder.sh, decoder.sh, and no_flash_encoder.sh scripts. Running them generates the corresponding om files. If the device is a 710 device, change in the sh scripts
+Configure the environment variables and use the atc tool to convert the models to om files; the commands are given in the provided encoder.sh, decoder.sh, and no_flash_encoder.sh scripts:
+
+```
+bash encoder.sh Ascend710
+bash decoder.sh Ascend710
+bash no_flash_encoder.sh Ascend710
+```
+
+Pass Ascend310 instead to generate the corresponding om models for 310 devices.
+
 
---soc_version=Ascend710 accordingly.
 
 5. Dataset download:
 
@@ -46,7 +60,7 @@ python3 adaptnoflashencoder.py generates no_flash_encoder_revise.onnx
 
 Run bash run.sh --stage 2 --stop_stage 2 to process the dataset
 
- Run bash run.sh --stage 3 --stop_stage 3 to process the dataset
+ Run bash run.sh --stage 3 --stop_stage 3 to process the dataset
 
 ## 2 Offline inference
 
@@ -60,7 +74,7 @@ python3 adaptnoflashencoder.py generates no_flash_encoder_revise.onnx
 
 ```
 git checkout .
-patch –p1 < get_no_flash_encoder_out.diff
+patch -p1 < get_no_flash_encoder_out.diff
 cd examples/aishell/s0/
 bash run_no_flash_encoder_out.sh
 ```
@@ -89,16 +103,18 @@ python3 adaptnoflashencoder.py generates no_flash_encoder_revise.onnx
 
 Note for the steps above: in wenet/bin/process_encoder_data_flash.py, --bin_path and --json_path hold the bin files generated by the encoder and the shape information of those bin files, respectively. Also remember to set the path of the streaming om model in the encoder_model parameter of the init function of the BaseEncoder class in wenet/transformer/encoder.py.
 
-    To get the decoder results in the streaming scenario, cd to the wenet root directory
-    ```
-    git checkout .
-    patch -p1 < getwer.diff
-    cd examples/aishell/s0/
-    bash run_attention_rescoring.sh
-    ```
-
-    Note that in wenet/bin/recognize_attenstion_rescoring.py, --bin_path, --model_path, and --json_path are, respectively, the bin files generated by the non-streaming encoder om (i.e. the path of the bin files generated in the previous step), the path of the decoder om model, and the json file with the shape information of the bin files generated by the streaming encoder (i.e. the json file generated in the previous step). Check the end of wenet/examples/aishell/s0/exp/conformer/test_attention_rescoring/wer for the overall accuracy. Testing in the streaming scenario is slow; to speed it up, change chunk_xs = xs[:, cur:end, :] to chunk_xs = xs[:, cur: num_frames, :] in BaseEncoder in encoder.py, and add a break on the line after offset += y.size(1) at the end of the for loop.**Evaluation results:**
+
+To get the decoder results in the streaming scenario, cd to the wenet root directory
+
+```
+git checkout .
+patch -p1 < getwer.diff
+cd examples/aishell/s0/
+bash run_attention_rescoring.sh
+```
+
+Note that in wenet/bin/recognize_attenstion_rescoring.py, --bin_path, --model_path, and --json_path are, respectively, the bin files generated by the non-streaming encoder om (i.e. the path of the bin files generated in the previous step), the path of the decoder om model, and the json file with the shape information of the bin files generated by the streaming encoder (i.e. the json file generated in the previous step). Check the end of wenet/examples/aishell/s0/exp/conformer/test_attention_rescoring/wer for the overall accuracy. Testing in the streaming scenario is slow; to speed it up, change chunk_xs = xs[:, cur:end, :] to chunk_xs = xs[:, cur: num_frames, :] in BaseEncoder in encoder.py, and add a break on the line after offset += y.size(1) at the end of the for loop.**Evaluation results:**
 
 | Model | Official pth accuracy | 710/310 offline inference accuracy | GPU performance | 710 performance | 310 performance |
 | :---: | :----------------------------: | :-------------------------: | :-----: | :-----: | ------- |
@@ -111,10 +127,14 @@
 Convert onnx to om:
 
 ```
-bash static_encoder.sh
-bash static_decoder.sh
+bash static_encoder.sh Ascend710
+bash static_decoder.sh Ascend710
 ```
 
+Pass Ascend310 instead to generate the corresponding om models for 310 devices.
+
+
+
 Accuracy test:
 
 First run export ASCEND_GLOBAL_LOG_LEVEL=3, point self.encoder_ascend and self.decoder_ascend in acc.diff to the statically converted encoder and decoder models, set average_checkpoint to false in run.sh, and change decode_modes to attention_rescoring
diff --git a/ACL_PyTorch/built-in/audio/Wenet_for_Pytorch/decoder.sh b/ACL_PyTorch/built-in/audio/Wenet_for_Pytorch/decoder.sh
index 9073e9d52de925775bd542b7c9a0fef17520b95d..66411db9f62de1fce450d6e60f51020ed26f11b8 100644
--- a/ACL_PyTorch/built-in/audio/Wenet_for_Pytorch/decoder.sh
+++ b/ACL_PyTorch/built-in/audio/Wenet_for_Pytorch/decoder.sh
@@ -3,8 +3,8 @@ export PATH=/usr/local/python3.7.5/bin:${install_path}/atc/ccec_compiler/bin:${i
 export PYTHONPATH=${install_path}/atc/python/site-packages:$PYTHONPATH
 export LD_LIBRARY_PATH=${install_path}/atc/lib64:$LD_LIBRARY_PATH
 export ASCEND_OPP_PATH=${install_path}/opp
-
+soc_version=$1
 atc --model=decoder_final.onnx --framework=5 --output=decoder_final --input_format=ND \
-    --input_shape_range="memory:[10,1~1500,256];memory_mask:[10,1,1~1500];ys_in_pad:[10,1~1500];ys_in_lens:[10];r_ys_in_pad:[10,1~1500]" --out_nodes="Add_488:0;Add_977:0" --log=error --soc_version=Ascend710
+    --input_shape_range="memory:[10,1~1500,256];memory_mask:[10,1,1~1500];ys_in_pad:[10,1~1500];ys_in_lens:[10];r_ys_in_pad:[10,1~1500]" --out_nodes="Add_488:0;Add_977:0" --log=error --soc_version=$soc_version
diff --git a/ACL_PyTorch/built-in/audio/Wenet_for_Pytorch/encoder.sh b/ACL_PyTorch/built-in/audio/Wenet_for_Pytorch/encoder.sh
index b7bcad526cf51bdaca4b36e8d5e11453a2a53e5a..6e947ab84e0a19270cf61f47fff06ac0e5c90e3d 100644
--- a/ACL_PyTorch/built-in/audio/Wenet_for_Pytorch/encoder.sh
+++ b/ACL_PyTorch/built-in/audio/Wenet_for_Pytorch/encoder.sh
@@ -4,6 +4,6 @@ export PATH=${install_path}/atc/ccec_compiler/bin:${install_path}/atc/bin:$PATH
 export ASCEND_OPP_PATH=${install_path}/opp
 export PYTHONPATH=${install_path}/atc/python/site-packages:$PYTHONPATH
 export LD_LIBRARY_PATH=${install_path}/acllib/lib64:$LD_LIBRARY_PATH
-
-atc --model=encoder_revise.onnx --framework=5 --output=encoder_revise --input_format=ND --input_shape_range="input:[1,1~1500,80];offset:[1];subsampling_cache:[1,1~1500,256];elayers_cache:[12,1,1~1500,256];conformer_cnn_cache:[12,1,256,7]" --log=error --soc_version=Ascend710
+soc_version=$1
+atc --model=encoder_revise.onnx --framework=5 --output=encoder_revise --input_format=ND --input_shape_range="input:[1,1~1500,80];offset:[1];subsampling_cache:[1,1~1500,256];elayers_cache:[12,1,1~1500,256];conformer_cnn_cache:[12,1,256,7]" --log=error --soc_version=$soc_version
diff --git a/ACL_PyTorch/built-in/audio/Wenet_for_Pytorch/no_flash_encoder.sh b/ACL_PyTorch/built-in/audio/Wenet_for_Pytorch/no_flash_encoder.sh
index 5ff3b4723dc2b9d640b8c3627de1ff5ce7661920..1cba9e022255b8c196bf56f5f5d8e3cbc0992c73 100644
--- a/ACL_PyTorch/built-in/audio/Wenet_for_Pytorch/no_flash_encoder.sh
+++ b/ACL_PyTorch/built-in/audio/Wenet_for_Pytorch/no_flash_encoder.sh
@@ -3,5 +3,6 @@ export PATH=/usr/local/python3.7.5/bin:${install_path}/atc/ccec_compiler/bin:${i
 export PYTHONPATH=${install_path}/atc/python/site-packages:$PYTHONPATH
 export LD_LIBRARY_PATH=${install_path}/atc/lib64:$LD_LIBRARY_PATH
 export ASCEND_OPP_PATH=${install_path}/opp
-atc --model=no_flash_encoder_revise.onnx --framework=5 --output=no_flash_encoder_revise --input_format=ND --input_shape_range="xs_input:[1,1~1500,80];xs_input_lens:[-1]" --log=error --soc_version=Ascend310
+soc_version=$1
+atc --model=no_flash_encoder_revise.onnx --framework=5 --output=no_flash_encoder_revise --input_format=ND --input_shape_range="xs_input:[1,1~1500,80];xs_input_lens:[-1]" --log=error --soc_version=$soc_version
diff --git a/ACL_PyTorch/built-in/audio/Wenet_for_Pytorch/process_encoder_data_noflash.py b/ACL_PyTorch/built-in/audio/Wenet_for_Pytorch/process_encoder_data_noflash.py
index 709d6f199db3fd81d919030a6bccd2f85a6e35b7..d7ac7eb1ad55e8e0aee4a8986a62a48314f05364 100644
--- a/ACL_PyTorch/built-in/audio/Wenet_for_Pytorch/process_encoder_data_noflash.py
+++ b/ACL_PyTorch/built-in/audio/Wenet_for_Pytorch/process_encoder_data_noflash.py
@@ -174,6 +174,8 @@ if __name__ == '__main__':
     #init acl
     if os.path.exists(args.json_path):
         os.remove(args.json_path)
+    if not os.path.exists(args.bin_path):
+        os.mkdir(args.bin_path)
     total_t = 0
     encoder_dic = {}
     import time
diff --git a/ACL_PyTorch/built-in/audio/Wenet_for_Pytorch/requirements.txt b/ACL_PyTorch/built-in/audio/Wenet_for_Pytorch/requirements.txt
index baf9a5ad64fa535b141494ce1989eb9087030e5d..7f1f0140516b8b6d255a3c93b968bfe197984d37 100644
--- a/ACL_PyTorch/built-in/audio/Wenet_for_Pytorch/requirements.txt
+++ b/ACL_PyTorch/built-in/audio/Wenet_for_Pytorch/requirements.txt
@@ -4,4 +4,11 @@ onnxruntime==1.8.1
 torchaudio==0.9.0
 numpy==1.18.5
 Pillow==7.2.0
-
+flake8==3.8.2
+pyyaml>=5.1
+sentencepiece
+tensorboard
+tensorboardX
+typeguard
+textgrid
+pytest
diff --git a/ACL_PyTorch/built-in/audio/Wenet_for_Pytorch/static_decoder.sh b/ACL_PyTorch/built-in/audio/Wenet_for_Pytorch/static_decoder.sh
index 3f6c53b365ea6359ee3458a6e2426856bbb3bae1..409b0644db5bd7e1fad8a21814ada3f550b49f86 100644
--- a/ACL_PyTorch/built-in/audio/Wenet_for_Pytorch/static_decoder.sh
+++ b/ACL_PyTorch/built-in/audio/Wenet_for_Pytorch/static_decoder.sh
@@ -3,7 +3,7 @@ export PATH=/usr/local/python3.7.5/bin:${install_path}/atc/ccec_compiler/bin:${i
 export PYTHONPATH=${install_path}/atc/python/site-packages:$PYTHONPATH
 export LD_LIBRARY_PATH=${install_path}/atc/lib64:$LD_LIBRARY_PATH
 export ASCEND_OPP_PATH=${install_path}/opp
-
+soc_version=$1
 atc --model=decoder_final.onnx --framework=5 --output=decoder_fendang --input_format=ND \
 --input_shape="memory:10,-1,256;memory_mask:10,1,-1;ys_in_pad:10,-1;ys_in_lens:10;r_ys_in_pad:10,-1" --log=error \
 --dynamic_dims="96,96,3,3;96,96,4,4;96,96,5,5;96,96,6,6;96,96,7,7;96,96,8,8;96,96,9,9;96,96,10,10;96,96,11,11;\
@@ -15,5 +15,5 @@ atc --model=decoder_final.onnx --framework=5 --output=decoder_fendang --input_fo
 384,384,16,16;384,384,17,17;384,384,18,18;384,384,19,19;384,384,20,20;384,384,21,21;384,384,22,22;384,384,23,23;\
 384,384,24,24;384,384,25,25;384,384,26,26;384,384,27,27;384,384,28,28;384,384,29,29;384,384,30,30;384,384,31,31;\
 384,384,32,32;384,384,33,33;384,384,34,34;384,384,35,35;384,384,36,36;384,384,37,37;384,384,38,38;384,384,39,39;384,384,40,40;384,384,41,41;" \
---soc_version=Ascend710
+--soc_version=$soc_version
diff --git a/ACL_PyTorch/built-in/audio/Wenet_for_Pytorch/static_encoder.sh b/ACL_PyTorch/built-in/audio/Wenet_for_Pytorch/static_encoder.sh
index dbc9fff9c20935b5bdf69a2ceff4a2cc71045ed1..4804a18acca3a5b97ce2e90b27b0dd591de6e200 100644
--- a/ACL_PyTorch/built-in/audio/Wenet_for_Pytorch/static_encoder.sh
+++ b/ACL_PyTorch/built-in/audio/Wenet_for_Pytorch/static_encoder.sh
@@ -3,8 +3,8 @@ export PATH=/usr/local/python3.7.5/bin:${install_path}/atc/ccec_compiler/bin:${i
 export PYTHONPATH=${install_path}/atc/python/site-packages:$PYTHONPATH
 export LD_LIBRARY_PATH=${install_path}/atc/lib64:$LD_LIBRARY_PATH
 export ASCEND_OPP_PATH=${install_path}/opp
-
+soc_version=$1
 atc --model=no_flash_encoder_revise.onnx --framework=5 --output=encoder_fendang_262_1478 --input_format=ND \
 --input_shape="xs_input:1,-1,80;xs_input_lens:1" --log=error \
 --dynamic_dims="262;326;390;454;518;582;646;710;774;838;902;966;1028;1284;1478" \
---soc_version=Ascend710
+--soc_version=$soc_version
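A practical note on the statically shaped models converted above: static_encoder.sh and static_decoder.sh enumerate the shapes the resulting om models accept via --dynamic_dims, so an input that does not match one of those "gears" has to be padded up to the nearest listed gear before inference (in this repository that is presumably handled by the provided static.py and acc.diff). The sketch below illustrates only that padding idea; the gear list is copied from static_encoder.sh, while the function and variable names are hypothetical and not part of the repository.

```python
# Illustrative sketch, not repository code: pad an fbank feature matrix up to the
# nearest frame-length "gear" accepted by the statically converted encoder om
# (encoder_fendang_262_1478.om, input xs_input with shape 1 x T x 80).
import numpy as np

# Gears copied from the --dynamic_dims argument in static_encoder.sh.
ENCODER_GEARS = (262, 326, 390, 454, 518, 582, 646, 710, 774, 838, 902, 966, 1028, 1284, 1478)


def pad_to_gear(xs: np.ndarray, gears=ENCODER_GEARS) -> np.ndarray:
    """Zero-pad xs of shape [1, T, 80] along the time axis to the smallest gear >= T."""
    t = xs.shape[1]
    target = next((g for g in gears if g >= t), None)
    if target is None:
        raise ValueError(f"utterance with {t} frames exceeds the largest gear {gears[-1]}")
    padded = np.zeros((xs.shape[0], target, xs.shape[2]), dtype=xs.dtype)
    padded[:, :t, :] = xs
    return padded


if __name__ == "__main__":
    fbank = np.random.randn(1, 300, 80).astype(np.float32)  # 300 frames of 80-dim fbank
    print(pad_to_gear(fbank).shape)  # -> (1, 326, 80), since 326 is the smallest gear >= 300
```

The gears in static_decoder.sh work the same way, except that each entry fixes two sizes at once: the encoder memory length (96 or 384) and the padded token length (3 to 41).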