Transformer models combined with self-supervised pre-training (e.g., BERT, GPT-2, RoBERTa, XLNet, ALBERT, T5, ELECTRA) have been shown to be a powerful framework for general language learning, achieving state-of-the-art performance when fine-tuned on a wide array of language tasks. In prior work, the self-supervised objectives used in pre-training have been somewhat agnostic to the downstream application in favor of generality; we wondered whether better performance could be achieved if the self-supervised objective more closely mirrored the final task.
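
To make the distinction concrete, here is a minimal, hypothetical sketch of the two kinds of objectives. The function names and masking scheme are illustrative assumptions, not the method described in this work: a generic masked-token objective asks the model to recover isolated tokens, while a task-mirroring objective could remove an entire sentence and ask the model to generate it, so the pre-training target looks more like the output of a downstream generation task.

```python
# Hypothetical illustration of task-agnostic vs. task-mirroring
# self-supervised objectives. Function names are illustrative only.
import random

MASK = "<mask>"

def make_masked_lm_example(tokens, mask_prob=0.15, seed=0):
    """Generic objective: mask random tokens; the model predicts each one."""
    rng = random.Random(seed)
    inputs, targets = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            inputs.append(MASK)
            targets.append(tok)
        else:
            inputs.append(tok)
    return " ".join(inputs), targets

def make_task_mirroring_example(sentences, seed=0):
    """Task-mirroring objective: remove a whole sentence and ask the model
    to generate it, so pre-training resembles a downstream generation task."""
    rng = random.Random(seed)
    i = rng.randrange(len(sentences))
    target = sentences[i]
    inputs = sentences[:i] + [MASK] + sentences[i + 1:]
    return " ".join(inputs), target

if __name__ == "__main__":
    doc = [
        "The model was trained on a large corpus.",
        "It was then fine-tuned on several tasks.",
        "Performance improved across the board.",
    ]
    print(make_masked_lm_example(" ".join(doc).split()))
    print(make_task_mirroring_example(doc))
```

In this sketch, the second objective's input/output format resembles a downstream generation task far more closely than token-level masking does, which is the intuition behind aligning the pre-training objective with the final task.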
