{"id": 820640, "name": "Number of parameters", "unit": "", "createdAt": "2024-02-15T16:16:43.000Z", "updatedAt": "2025-09-25T19:29:50.000Z", "coverage": "", "timespan": "2019-2023", "datasetId": 6385, "columnOrder": 0, "shortName": "model_size__parameters", "catalogPath": "grapher/artificial_intelligence/2024-02-15/epoch_llms/epoch_llms#model_size__parameters", "type": "int", "dataChecksum": "3516549254374301598", "metadataChecksum": "4207546776419586661", "datasetName": "Large Language Model Performance and Compute", "datasetVersion": "2024-02-15", "nonRedistributable": false, "display": {"numDecimalPlaces": 0}, "schemaVersion": 2, "processingLevel": "minor", "presentation": {"topicTagsLinks": ["Artificial Intelligence"]}, "descriptionKey": ["The number of parameters in AI models refers to the total count of learnable variables or weights that the model contains. Parameters are the internal variables that the model adjusts during the training process to optimize its performance and make predictions based on the input data.", "In the context of deep learning models, which are a type of AI model, parameters are typically associated with the connections between the neurons or units in the neural network. Each connection has a weight associated with it, and these weights collectively represent the parameters of the model.", "The number of parameters in a model depends on its architecture and complexity. Deep learning models often have multiple layers, each containing numerous neurons or units. The connections between these units contribute to the overall parameter count. Additionally, other components such as convolutional filters, recurrent connections, and attention mechanisms also add to the parameter count.", "The number of parameters in AI models has a significant impact on model capacity and its ability to learn complex patterns from data. A larger number of parameters can allow the model to capture more intricate relationships and potentially achieve higher accuracy. However, it also increases the risk of overfitting, where the model becomes too specialized to the training data and performs poorly on unseen examples. Balancing the number of parameters to avoid overfitting while maintaining sufficient model capacity is a crucial consideration in model design.", "In recent years, AI models with billions or even trillions of parameters have been developed. These models, known as \"giant models,\" have demonstrated state-of-the-art performance in various tasks but require substantial computational resources for training and inference. Efficiently managing and training models with a large number of parameters is an active area of research in the AI community."], "dimensions": {"years": {"values": [{"id": 2022}, {"id": 2023}, {"id": 2019}, {"id": 2020}, {"id": 2021}]}, "entities": {"values": [{"id": 367072, "name": "BLOOM", "code": null}, {"id": 367081, "name": "BloombergGPT", "code": null}, {"id": 273166, "name": "Chinchilla", "code": null}, {"id": 365992, "name": "GLM-130B", "code": null}, {"id": 367086, "name": "GPT-2 (finetuned)", "code": null}, {"id": 367068, "name": "GPT-3 (davinci)", "code": null}, {"id": 257096, "name": "GPT-NeoX-20B", "code": null}, {"id": 367254, "name": "Gopher (0.4B)", "code": null}, {"id": 367258, "name": "Gopher (1.4B)", "code": null}, {"id": 367255, "name": "Gopher (280B)", "code": null}, {"id": 367257, "name": "Gopher (7B)", "code": null}, {"id": 367426, "name": "LLaMA (13B)", "code": null}, {"id": 367427, "name": "LLaMA (33B)", "code": null}, {"id": 367304, "name": "LLaMA (65B)", "code": null}, {"id": 367425, "name": "LLaMA (7B)", "code": null}, {"id": 367085, "name": "OPT", "code": null}, {"id": 273167, "name": "PaLM (540B)", "code": null}, {"id": 367253, "name": "PaLM (62B)", "code": null}, {"id": 367252, "name": "PaLM (62B+)", "code": null}, {"id": 367256, "name": 
"PaLM (8B)", "code": null}, {"id": 367071, "name": "PaLM-2", "code": null}]}}, "origins": [{"id": 8745, "title": "Large Language Model Performance and Compute", "description": "Epoch dataset on how performance on an MMLU language benchmark scales with computational resources.", "producer": "Epoch AI", "citationFull": "Owen, David. (2023). Large Language Model performance and compute, Epoch (2023) [Data set]. In Extrapolating performance in language modeling benchmarks. Published online at epoch.ai. Retrieved from: 'https://epoch.ai/blog/extrapolating-performance-in-language-modelling-benchmarks' [online resource].", "urlMain": "https://epoch.ai/blog/extrapolating-performance-in-language-modelling-benchmarks", "dateAccessed": "2024-02-15", "datePublished": "2023-07-12", "license": {"url": "https://epoch.ai/blog/extrapolating-performance-in-language-modelling-benchmarks", "name": "Creative Commons BY 4.0"}}]}