Conformer

Model description

Recently Transformer and Convolution neural network (CNN) based models have shown promising results in Automatic Speech Recognition (ASR), outperforming Recurrent neural networks (RNNs). Transformer models are good at capturing content-based global interactions, while CNNs exploit local features effectively. In this work, we achieve the best of both worlds by studying how to combine convolution neural networks and transformers to model both local and global dependencies of an audio sequence in a parameter-efficient way. To this regard, we propose the convolution-augmented transformer for speech recognition, named Conformer. Conformer significantly outperforms the previous Transformer and CNN based models achieving state-of-the-art accuracies. On the widely used LibriSpeech benchmark, our model achieves WER of 2.1%/4.3% without using a language model and 1.9%/3.9% with an external language model on test/testother. We also observe competitive performance of 2.7%/6.3% with a small model of only 10M parameters.

Step 1:Installation

git clone --recursive -b r1.4 https://github.com/PaddlePaddle/PaddleSpeech.git
cd PaddleSpeech
pip3 install .

Step 2:Preparing datasets

cd examples/aishell/asr1/
bash run.sh --stage 0 --stop_stage 0

run.sh will download and process the datasets, The download process may be slow, you can download the data_aishell.tgz from wenet and put it in the /path/to/PaddleSpeech/dataset/aishell/, then return to execute the above command.

Step 3:Training

bash run.sh --stage 1 --stop_stage 3

Results

GPUs	IPS	CER
BI-V100 x 4	48.5	0.0495(checkpoint 81)

DeepSpark / DeepSparkHub

Conformer

Model description

Step 1:Installation

Step 2:Preparing datasets

Step 3:Training

Results

简介

发行版 (6)

贡献者

近期动态

DeepSpark / DeepSparkHub .gitee-modal { width: 500px !important; }

Conformer

Model description

Step 1:Installation

Step 2:Preparing datasets

Step 3:Training

Results

简介

发行版 (6)

开源评估指数源自 OSS-Compass 评估体系，评估体系围绕以下三个维度对项目展开评估：

贡献者

近期动态

搜索帮助

DeepSpark / DeepSparkHub