Language Models are Few-Shot Learners

Resource | v1 | created by janarez on Jul 21, 2020

Type Paper

Created 2020-01-01

Available at arxiv.org/abs/2005.14165

Identifier https://github.com/openai/gpt-3

Description

Humans generally perform a new language task from only a few examples - something which current NLP systems still largely struggle to do. Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches. Specifically, we train GPT-3, an autoregressive language model with 175 billion parameters, 10x more than any previous non-sparse language model, and test its performance in the few-shot setting. For all tasks, GPT-3 is applied without any gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text interaction with the model. GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic.
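As a minimal sketch of what "specified purely via text interaction" can look like, the snippet below assembles a few-shot prompt for the word-unscrambling task mentioned above. The `Input:`/`Output:` layout, the function name `build_few_shot_prompt`, and the example words are illustrative assumptions, not the format used in the paper; no model is called and no parameters are updated, since the demonstrations live only in the prompt text.

```python
def build_few_shot_prompt(task_description, demonstrations, query):
    """Assemble a plain-text few-shot prompt.

    The "Input:/Output:" layout is a hypothetical convention chosen for
    illustration; the paper's own prompt formats vary by task.
    """
    lines = [task_description, ""]
    # Each demonstration is a solved example shown to the model as text.
    for source, target in demonstrations:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    # The final query is left open for the model to complete.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


# Two demonstrations (few-shot) followed by one unsolved query.
demos = [("lpepa", "apple"), ("nanaab", "banana")]
prompt = build_few_shot_prompt(
    "Unscramble the letters into an English word.", demos, "rgaep"
)
print(prompt)
```

The resulting string would be sent as-is to an autoregressive model, which is expected to continue the text after the final `Output:`. Changing the number of demonstrations gives the zero-shot, one-shot, and few-shot variants.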

Relations

official for GPT-3

GPT-3 is a transformer-based text generation neural network released by OpenAI on May 29th, 2020.

