Language Models are Few-Shot Learners


Resource history | v1 (current) | created by janarez

Details


created by janarez | Add topic "GPT-3"
Title
Language Models are Few-Shot Learners
Type
Paper
Created
2020-01-01
Description
Humans generally perform a new language task from only a few examples - something which current NLP systems still largely struggle to do. Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches. Specifically, we train GPT-3, an autoregressive language model with 175 billion parameters, 10x more than any previous non-sparse language model, and test its performance in the few-shot setting. For all tasks, GPT-3 is applied without any gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text interaction with the model. GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic.
Link
http://arxiv.org/abs/2005.14165
Identifier
https://github.com/openai/gpt-3
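The description above notes that GPT-3 is applied "without any gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text interaction with the model." A minimal sketch of what such a few-shot prompt can look like, using the 3-digit arithmetic task mentioned in the abstract (the helper name and Q/A formatting are illustrative, not taken from the paper):

```python
def build_few_shot_prompt(examples, query):
    """Format (question, answer) demonstrations followed by the query.

    The model would be asked to continue this text; the demonstrations
    alone specify the task, with no parameter updates.
    """
    lines = [f"Q: {q}\nA: {a}" for q, a in examples]
    # The final query is left unanswered for the model to complete.
    lines.append(f"Q: {query}\nA:")
    return "\n\n".join(lines)

demonstrations = [
    ("What is 123 + 456?", "579"),
    ("What is 250 + 250?", "500"),
]
prompt = build_few_shot_prompt(demonstrations, "What is 312 + 144?")
print(prompt)
```

In the few-shot setting the paper evaluates, typically 10 to 100 such demonstrations fit in the model's context window; the zero-shot and one-shot settings use the same format with zero or one demonstration.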

authors

This resource has no history of related authors.

topics

official for GPT-3
v1 | attached by janarez | Add topic "GPT-3"

resources

This resource has no history of related resources.