jeff guo committed on 2022-09-30 15:06 · Initial Commit

# T5

## Model description

T5, or Text-to-Text Transfer Transformer, is a Transformer-based architecture that uses a text-to-text approach. Every task – including translation, question answering, and classification – is cast as feeding the model text as input and training it to generate some target text. This allows the same model, loss function, hyperparameters, etc. to be used across a diverse set of tasks.
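As an illustrative sketch of this casting (the task prefixes follow the convention of the original T5 paper; the helper function is hypothetical and not part of this repo):

```python
# Hypothetical helper (not from this repo): cast any task into T5's
# text-to-text format, where a task prefix is prepended to the input
# and the model is trained to emit the target as plain text.
def to_text_to_text(task_prefix: str, source_text: str) -> str:
    return f"{task_prefix}: {source_text}"

# Prefixes in the style used by the T5 paper: translation, summarization,
# and classification all become the same "text in, text out" problem.
pairs = [
    ("translate English to German", "That is good.", "Das ist gut."),
    ("summarize", "state authorities dispatched emergency crews tuesday.", "crews dispatched."),
    ("cola sentence", "The course is jumping well.", "not acceptable"),
]

for prefix, source, target in pairs:
    print(to_text_to_text(prefix, source), "->", target)
```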

## Step 1: Installing packages

```shell
cd <your_project_path>/t5/pytorch
bash examples_ix/init_torch.sh
```

## Step 2: Training

### On a single GPU

```shell
bash examples_ix/train_t5_small_torch.sh
```

### On a single GPU (AMP)

```shell
bash examples_ix/train_t5_small_amp_torch.sh
```
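The AMP script above is repo-specific, but the mixed-precision training step it presumably wraps can be sketched with PyTorch's standard AMP utilities. A minimal, hypothetical example using a toy linear model rather than T5 (falls back to full precision on CPU):

```python
# Hypothetical AMP training-step sketch (not the repo's script).
import torch

model = torch.nn.Linear(16, 4)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
use_cuda = torch.cuda.is_available()
device = "cuda" if use_cuda else "cpu"
model.to(device)

# GradScaler rescales the loss to avoid fp16 gradient underflow;
# with enabled=False (CPU) it becomes a transparent no-op.
scaler = torch.cuda.amp.GradScaler(enabled=use_cuda)

x = torch.randn(8, 16, device=device)
y = torch.randn(8, 4, device=device)

optimizer.zero_grad()
# autocast runs the forward pass in mixed precision where supported.
with torch.autocast(device_type=device, enabled=use_cuda):
    loss = torch.nn.functional.mse_loss(model(x), y)
scaler.scale(loss).backward()
scaler.step(optimizer)
scaler.update()
print(float(loss))
```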

### Multiple GPUs on one machine

```shell
bash examples_ix/train_t5_small_dist_torch.sh
```

### Multiple GPUs on one machine (AMP)

```shell
bash examples_ix/train_t5_small_amp_dist_torch.sh
```
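The distributed scripts themselves are not shown here, but a PyTorch distributed entry point typically reads its process-group layout from environment variables set by a launcher such as `torchrun`. A minimal, hypothetical sketch of that plumbing:

```python
import os

# Hypothetical helper (not from this repo): read the rank/world-size
# layout that launchers such as torchrun export via environment
# variables, falling back to single-process defaults.
def dist_config(env=None):
    env = os.environ if env is None else env
    return {
        "rank": int(env.get("RANK", 0)),            # global process index
        "world_size": int(env.get("WORLD_SIZE", 1)),  # total process count
        "local_rank": int(env.get("LOCAL_RANK", 0)),  # GPU index on this node
    }

print(dist_config())
```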

## Results on BI-V100

| GPUs | Samples/s | Loss |
|------|-----------|------|
| 1x1  | 339       | 1.18 |
| 1x8  | 2488      | 1.18 |

## Reference

- https://github.com/huggingface/
