Paper title: DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter...
Parts of this post are reprinted from 机器之心 (Synced). TinyBERT's main innovation is its new distillation scheme, which differs from ordinary knowledge distillation (KD)...
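Since the DistilBERT and TinyBERT entries above both build on knowledge distillation, a minimal sketch of the standard KD objective (temperature-softened teacher targets plus hard-label cross-entropy, as in Hinton et al., 2015) may help frame what TinyBERT departs from. The function name and the T/alpha values below are illustrative assumptions, not either paper's exact settings.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft-target term: KL divergence between temperature-softened
    # teacher and student distributions; the T**2 factor keeps
    # gradients on a comparable scale across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard-label term: ordinary cross-entropy on the gold labels.
    hard = F.cross_entropy(student_logits, labels)
    # alpha balances imitating the teacher against fitting the labels.
    return alpha * soft + (1.0 - alpha) * hard

# Toy usage: a batch of 8 examples over 3 classes (hypothetical shapes).
s = torch.randn(8, 3, requires_grad=True)
t = torch.randn(8, 3)
y = torch.randint(0, 3, (8,))
kd_loss(s, t, y).backward()
```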
Proposing institution: Alibaba DAMO Academy. Paper link: https://arxiv.org/pdf/1908.04577.pdf. The authors argue that BERT's pretraining tasks overlook...
Paper title: REFORMER: THE EFFICIENT TRANSFORMER. Paper link: https://arxiv.org/abs/2001...
<Paper Reading Series> This post is based on the 2019 Facebook paper: Cross-lingual Language Model Pretraining...
<Paper Reading Series> This post is based on the paper: Neural Chinese Medical Named Entity Recognition...
<Paper Reading Series> This post is based on the Facebook ICLR 2018 paper: WORD TRANSLATION WITHOUT PARALLEL DATA...
<Paper Reading Series> This post is based on the 2018 Facebook AI Research paper: Massively Multilingual...
<Paper Reading Series> This post is based on the ACL 2017 paper: Weakly Supervised Cross-Lingual Named Entity Recognition...