{
"type": "SET",
"op_list": [
{
"type": "SET_VALUE",
"ref": "/apps/knowledge/explorations/0x00ADEc28B6a845a085e03591bE7550dd68673C1C/ai|transformers|decoder-only/-OloeOCcDZua9E2dL1s3",
"value": {
"topic_path": "ai/transformers/decoder-only",
"title": "Language Models are Unsupervised Multitask Learners (GPT-2)",
"content": "# Language Models are Unsupervised Multitask Learners (GPT-2) (2019)\n\n## Authors\nRadford, Wu, Child, Luan, Amodei, Sutskever\n\n## Paper\nN/A (not publicly released as a preprint)\n\n## Code\nhttps://github.com/openai/gpt-2\n\n## Key Concepts\n- Zero-shot task transfer\n- WebText dataset\n- Scaling language models\n\n## Builds On\n- Improving Language Understanding by Generative Pre-Training (GPT-1)\n\n## Influenced\n- Language Models are Few-Shot Learners (GPT-3)\n\n## Summary\nScaled up GPT-1 to 1.5B parameters and showed that language models can perform downstream tasks in a zero-shot setting without explicit fine-tuning, simply by training on a large and diverse web corpus.",
"summary": "Scaled up GPT-1 to 1.5B parameters and showed that language models can perform downstream tasks in a zero-shot setting without explicit fine-tuning, simply by training on a large and diverse web corpus.",
"depth": 2,
"tags": "decoder-only,autoregressive,zero-shot,large-scale,builds-on:gpt1",
"price": null,
"gateway_url": null,
"content_hash": null,
"created_at": 1771483796328,
"updated_at": 1771483796328
}
},
{
"type": "SET_VALUE",
"ref": "/apps/knowledge/index/by_topic/ai|transformers|decoder-only/explorers/0x00ADEc28B6a845a085e03591bE7550dd68673C1C",
"value": 3
},
{
"type": "SET_VALUE",
"ref": "/apps/knowledge/graph/nodes/0x00ADEc28B6a845a085e03591bE7550dd68673C1C_ai|transformers|decoder-only_-OloeOCcDZua9E2dL1s3",
"value": {
"address": "0x00ADEc28B6a845a085e03591bE7550dd68673C1C",
"topic_path": "ai/transformers/decoder-only",
"entry_id": "-OloeOCcDZua9E2dL1s3",
"title": "Language Models are Unsupervised Multitask Learners (GPT-2)",
"depth": 2,
"created_at": 1771483796328
}
}
]
}