BERT
======

The BERT model was proposed in `BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding <https://arxiv.org/abs/1810.04805>`__ by Jacob Devlin, Ming-Wei Chang, Kenton Lee and Kristina Toutanova. It is a bidirectional transformer pre-trained using a combination of a masked language modeling objective and next sentence prediction.

BertAdapterModel
~~~~~~~~~~~~~~~~~~~~

.. autoclass:: adapters.BertAdapterModel
    :members:
    :inherited-members: BertPreTrainedModel
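
As a quick orientation for the class documented above, here is a minimal sketch of loading ``BertAdapterModel`` and setting up a new adapter for training. It assumes a recent ``adapters`` release in which the sequential bottleneck configuration is named ``seq_bn``; the checkpoint ``bert-base-uncased``, the adapter name ``my_adapter``, and ``num_labels=2`` are illustrative choices, not prescribed by this page.

.. code-block:: python

    from adapters import BertAdapterModel

    # Load pretrained BERT weights into the adapter-enabled model class.
    model = BertAdapterModel.from_pretrained("bert-base-uncased")

    # Add a new bottleneck adapter and a matching classification head
    # ("my_adapter" and num_labels=2 are illustrative).
    model.add_adapter("my_adapter", config="seq_bn")
    model.add_classification_head("my_adapter", num_labels=2)

    # Freeze the pretrained BERT weights and activate the adapter,
    # so only adapter (and head) parameters receive gradient updates.
    model.train_adapter("my_adapter")

Calling ``train_adapter()`` both freezes the base model parameters and marks the named adapter as active, which is the usual setup for parameter-efficient fine-tuning with this class.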