BERT

Model description

The BERT network was proposed by Google in 2018 and marked a breakthrough in NLP. It pre-trains one large network that can then be applied, without structural changes, to many text-based tasks simply by adding an output layer during fine-tuning. BERT's backbone adopts the Encoder structure of the Transformer; its attention mechanism lets the output layer capture high-dimensional global semantic information. Pre-training uses two denoising, self-supervised tasks: MLM (Masked Language Model) and NSP (Next Sentence Prediction). Because neither task needs labeled data, pre-training can run on massive amounts of raw text, after which only a small amount of data is needed to fine-tune downstream tasks to good results. The pre-training plus fine-tuning paradigm established by BERT has been widely adopted by subsequent NLP networks.
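The MLM task can be illustrated briefly. The sketch below is a minimal illustration rather than this repository's code; the 15%/80%/10%/10% split follows the original BERT paper.

```python
import random

def mask_tokens(tokens, vocab, mask_prob=0.15):
    """BERT-style MLM masking: select ~15% of positions; of those,
    80% become [MASK], 10% a random token, 10% stay unchanged."""
    labels = [None] * len(tokens)  # None = position not predicted
    for i, tok in enumerate(tokens):
        if tok in ("[CLS]", "[SEP]") or random.random() >= mask_prob:
            continue
        labels[i] = tok  # the model must recover the original token here
        r = random.random()
        if r < 0.8:
            tokens[i] = "[MASK]"
        elif r < 0.9:
            tokens[i] = random.choice(vocab)
        # else: leave the original token in place
    return tokens, labels

tokens, labels = mask_tokens("[CLS] the cat sat on the mat [SEP]".split(),
                             vocab=["the", "cat", "sat", "on", "mat"])
```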

Paper: Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv preprint arXiv:1810.04805.

Paper: Junqiu Wei, Xiaozhe Ren, Xiaoguang Li, Wenyong Huang, Yi Liao, Yasheng Wang, Jiashu Lin, Xin Jiang, Xiao Chen, Qun Liu. NEZHA: Neural Contextualized Representation for Chinese Language Understanding. arXiv preprint arXiv:1909.00204.

Step 1: Installing

pip3 install -r requirements.txt

Step 2: Prepare Datasets

1. Download the training dataset (.tf_record), eval dataset (.json), vocab.txt, and the checkpoint bert_large_ascend_v130_enwiki_official_nlp_bs768_loss1.1.ckpt:

cd scripts
mkdir -p squad

Please download the BERT vocab.txt here
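Once downloaded, the files should sit under scripts/squad. As a quick sanity check, the snippet below verifies they are in place; the tf_record and JSON file names are assumptions, so use the names of the files you actually downloaded.

```python
import os

SQUAD_DIR = "squad"  # relative to the scripts/ directory created above
expected = [
    "train.tf_record",  # training data (assumed file name)
    "dev-v1.1.json",    # SQuAD v1.1 eval data (assumed file name)
    "vocab.txt",
    "bert_large_ascend_v130_enwiki_official_nlp_bs768_loss1.1.ckpt",
]
for name in expected:
    path = os.path.join(SQUAD_DIR, name)
    print(("OK      " if os.path.exists(path) else "MISSING ") + path)
```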

  • Create the fine-tune dataset
    • Download datasets for fine-tuning and evaluation, such as Chinese named entity recognition (CLUENER), Chinese sentence classification (TNEWS), Chinese named entity recognition (ChineseNER), English question answering (SQuAD v1.1 train dataset and SQuAD v1.1 eval dataset), and the English sentence classification package (GLUE).
    • We have not yet provided scripts to create the TFRecord files. To convert dataset files from JSON format to TFRecord format, please refer to run_classifier.py or run_squad.py in the BERT repository, or to the official CLUE and CLUENER repositories; a minimal conversion sketch follows this list.
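Since the conversion scripts are not bundled here, the following is only a sketch of the general idea, assuming TensorFlow is installed: features are packed into a tf.train.Example and written with TFRecordWriter. The dummy arrays stand in for what run_squad.py actually produces with the WordPiece tokenizer; the feature names follow the convention of BERT's reference scripts.

```python
import collections
import tensorflow as tf

def create_int_feature(values):
    # Wrap a list of ints as a TFRecord-compatible feature.
    return tf.train.Feature(int64_list=tf.train.Int64List(value=list(values)))

# Illustrative only: in run_squad.py these arrays come from tokenizing each
# (question, paragraph) pair of the SQuAD JSON file.
features = collections.OrderedDict()
features["input_ids"] = create_int_feature([101, 2054, 2003, 102])  # dummy token ids
features["input_mask"] = create_int_feature([1, 1, 1, 1])
features["segment_ids"] = create_int_feature([0, 0, 0, 0])

example = tf.train.Example(features=tf.train.Features(feature=features))
with tf.io.TFRecordWriter("squad/train.tf_record") as writer:
    writer.write(example.SerializeToString())
```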

Pretrained models

We provide several kinds of pretrained checkpoints.

Step 3: Training

Run distributed fine-tuning on SQuAD; the trailing argument sets the number of GPUs (8 here, matching the 1*8 results below):

bash scripts/run_squad_gpu_distribute.sh 8

Evaluation results

Results on BI-V100

| GPUs | per step time | exact_match | F1     |
|------|---------------|-------------|--------|
| 1*8  | 1.898s        | 71.9678     | 81.422 |

Results on NV-V100s

| GPUs | per step time | exact_match | F1     |
|------|---------------|-------------|--------|
| 1*8  | 1.877s        | 71.9678     | 81.422 |
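Here exact_match and F1 are the standard SQuAD v1.1 metrics: exact match is the fraction of predictions that exactly equal a gold answer after normalization, and F1 measures token-level overlap between the predicted and gold answer spans. A minimal sketch of the two metrics (omitting SQuAD's full answer normalization, which also strips articles and punctuation):

```python
import collections

def exact_match(prediction, ground_truth):
    # 1.0 if the normalized strings are identical, else 0.0
    return float(prediction.lower().strip() == ground_truth.lower().strip())

def f1_score(prediction, ground_truth):
    # Token-level F1 between the predicted and gold answer spans.
    pred_tokens = prediction.lower().split()
    gold_tokens = ground_truth.lower().split()
    common = collections.Counter(pred_tokens) & collections.Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```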