
Model Artifact Configuration

Each model artifact has an associated JSON structure which describes some basic information about the model such as name and version, as well as technical metadata such as format, precision and quantization. This content is referred to as Model Artifact Configuration and is identified by the media type application/vnd.cncf.model.config.v1+json.

This section defines application/vnd.cncf.model.config.v1+json media type.

Terminology

The following terms are used in this section:

  • Layer

    A blob of the model artifact's content, packaged as a tar archive.

  • Layer DiffID

    A layer DiffID is the hash of the layer's uncompressed tar archive.
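For illustration, a DiffID can be computed by hashing the uncompressed tar bytes directly. A sketch in Python (the file name and contents of the layer are hypothetical):

```python
import hashlib
import io
import tarfile

def layer_diff_id(uncompressed_tar: bytes) -> str:
    """Compute a layer DiffID: the SHA-256 digest of the
    layer's uncompressed tar archive."""
    return "sha256:" + hashlib.sha256(uncompressed_tar).hexdigest()

# Build a tiny in-memory tar archive standing in for a model layer.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    data = b"fake model weights"
    info = tarfile.TarInfo(name="model.safetensors")
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))

diff_id = layer_diff_id(buf.getvalue())
print(diff_id)  # digest depends on the tar's metadata
```

Note that the hash is taken over the uncompressed archive, so the DiffID is stable regardless of which compression (if any) is applied to the distributed blob.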

Properties

  • descriptor object, REQUIRED

    Contains the general information about the model.

    • createdAt string, OPTIONAL

      The date and time at which the model was created, formatted as defined by RFC 3339, section 5.6.

    • authors array of strings, OPTIONAL

      A list of contact details for the individuals or organizations responsible for the model (freeform strings).

    • vendor string, OPTIONAL

      The name of the organization or company distributing the model.

    • family string, OPTIONAL

      The model family or lineage, such as "llama3", "gpt2", or "qwen2".

    • name string, OPTIONAL

      The name of the model.

    • version string, OPTIONAL

      The version of the model.

    • title string, OPTIONAL

      A human-readable title for the model.

    • description string, OPTIONAL

      A human-readable description of the model.

    • docURL string, OPTIONAL

      A URL to get more information or details about the model.

    • sourceURL string, OPTIONAL

      A URL to get the source code or resources needed to build or understand the model's implementation.

    • datasetsURL array of strings, OPTIONAL

      A list of links or references to the datasets that the model was trained on.

    • revision string, OPTIONAL

      The source control revision identifier for the model.

    • licenses array of string, OPTIONAL

      A list of licenses under which the model is distributed, represented as SPDX License Expressions.

  • config object, REQUIRED

    Contains the technical metadata for the model.

    • architecture string, OPTIONAL

      The architecture of the model, such as "transformer", "cnn", or "rnn".

    • format string, OPTIONAL

      The format of the model, such as "onnx", "safetensors", "gguf", or "pt" (PyTorch).

    • paramSize string, OPTIONAL

      The total number of parameters in the model, represented as a decimal count followed by a single-letter scale prefix in the format <count><scale-prefix>.

      • count: A numeric value giving the parameter count before scaling. It may include at most one digit after the decimal point, for example 6.7.

      • scale-prefix: A single letter indicating the order of magnitude multiplier applied to the count. The prefix is case-insensitive and must be one of the following:

        • Q or q (Quadrillion)
        • T or t (Trillion)
        • B or b (Billion)
        • M or m (Million)
        • K or k (Thousand)

      Some examples: 6.7B (6.7 billion parameters), 1.0t (1 trillion parameters), 100m (100 million parameters).
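The grammar above can be parsed with a small helper. A sketch in Python (the function name is illustrative; Decimal is used so counts like 6.7 scale without float rounding error):

```python
import re
from decimal import Decimal

# Multiplier for each scale prefix (matched case-insensitively).
SCALE = {"q": 10**15, "t": 10**12, "b": 10**9, "m": 10**6, "k": 10**3}

def parse_param_size(value: str) -> int:
    """Parse a paramSize string like '6.7B' into a total parameter count.

    The count allows at most one digit after the decimal point, followed
    by exactly one scale-prefix letter."""
    match = re.fullmatch(r"(\d+(?:\.\d)?)([qtbmkQTBMK])", value)
    if match is None:
        raise ValueError(f"invalid paramSize: {value!r}")
    count, prefix = match.groups()
    return int(Decimal(count) * SCALE[prefix.lower()])

print(parse_param_size("6.7B"))  # 6700000000
print(parse_param_size("100m"))  # 100000000
```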

    • precision string, OPTIONAL

      The computational precision of the model. Supported values include:

      Precision Description
      "float32" 32-bit floating point
      "float64" 64-bit floating point
      "float16" 16-bit floating point. Uses 1 sign, 5 exponent, and 10 significand bits.
      "bfloat16" 16-bit brain floating point. Uses 1 sign, 8 exponent and 7 significand bits.
      "float8_e4m3" 8-bit floating point, e4m3 format. Uses 1 sign, 4 exponent, and 3 significand bits.
      "float8_e5m2" 8-bit floating point, e5m2 format. Uses 1 sign, 5 exponent, and 2 significand bits.
      "complex32" 32-bit complex
      "complex64" 64-bit complex
      "complex128" 128-bit complex
      "int8" 8-bit signed integer
      "int16" 16-bit signed integer
      "int32" 32-bit signed integer
      "int64" 64-bit signed integer
      "uint8" 8-bit unsigned integer
      "uint16" 16-bit unsigned integer
      "uint32" 32-bit unsigned integer
      "uint64" 64-bit unsigned integer
      "bool" Boolean

      If multiple precisions are used, they should be separated by commas. For example, if the model uses float16 and float8_e4m3, the precision should be set to "float16,float8_e4m3".
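A producer can check a precision string against the table above before writing the configuration. A minimal sketch (the set and function names are illustrative):

```python
# The precision values listed in the table above.
KNOWN_PRECISIONS = {
    "float32", "float64", "float16", "bfloat16",
    "float8_e4m3", "float8_e5m2",
    "complex32", "complex64", "complex128",
    "int8", "int16", "int32", "int64",
    "uint8", "uint16", "uint32", "uint64",
    "bool",
}

def validate_precision(value: str) -> bool:
    """Check that every comma-separated entry is a known precision value."""
    return all(part in KNOWN_PRECISIONS for part in value.split(","))

print(validate_precision("float16,float8_e4m3"))  # True
print(validate_precision("fp16"))                 # False
```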

    • quantization string, OPTIONAL

      Quantization technique applied to the model, such as "awq", or "gptq".

    • transformerConfig object, OPTIONAL

      Transformer-specific architectural parameters. Should only be populated when architecture is "transformer".

      • attentionType string, OPTIONAL

        The attention mechanism variant. Supported values:

        Value Description
        "mha" Multi-Head Attention — standard attention with one KV head per query head
        "gqa" Grouped-Query Attention — fewer KV heads than query heads, reducing KV cache size (e.g. LLaMA 3, Mistral)
        "mla" Multi-Latent Attention — low-rank KV compression for minimal KV cache (e.g. DeepSeek-V2)
      • mlpType string, OPTIONAL

        The feed-forward / MLP layer variant. Supported values:

        Value Description
        "dense" Standard dense feed-forward layer
        "moe" Mixture-of-Experts — tokens are routed to a subset of expert FFN layers (e.g. Mixtral, DeepSeek-V3)
      • numLayers integer, OPTIONAL

        Total number of transformer layers (blocks).

      • numAttentionHeads integer, OPTIONAL

        Number of query attention heads.

      • numKVHeads integer, OPTIONAL

        Number of key/value heads. For GQA this is smaller than numAttentionHeads. Omitting this field or setting it equal to numAttentionHeads implies standard MHA.

      • hiddenSize integer, OPTIONAL

        The model's hidden dimension size (d_model).

      • intermediateSize integer, OPTIONAL

        The inner dimension of the feed-forward layer.
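These fields together let a consumer estimate memory requirements. As one sketch of why numKVHeads matters: the per-token KV-cache size is proportional to the number of KV heads, assuming the head dimension is hiddenSize / numAttentionHeads and one key and one value vector per KV head per layer (2 bytes per element here, i.e. float16 — an assumption, not part of the configuration):

```python
def kv_cache_bytes_per_token(num_layers: int, num_attention_heads: int,
                             num_kv_heads: int, hidden_size: int,
                             bytes_per_element: int = 2) -> int:
    """Estimate KV-cache bytes per token: one key and one value vector
    (each of size head_dim) per KV head, per layer."""
    head_dim = hidden_size // num_attention_heads
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_element

# GQA with 8 KV heads vs. standard MHA with 32, for a 32-layer,
# 4096-hidden model: a 4x smaller KV cache.
gqa = kv_cache_bytes_per_token(32, 32, 8, 4096)
mha = kv_cache_bytes_per_token(32, 32, 32, 4096)
print(gqa, mha, mha // gqa)  # 131072 524288 4
```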

    • capabilities object, OPTIONAL

      Special capabilities that the model supports, such as reasoning or tool usage.

      • inputTypes array of string, OPTIONAL

        An array of strings specifying the data types that the model can accept as input. The allowed values are: "text", "image", "audio", "video", or "embedding". For input types that are not explicitly defined, the value "other" should be used.

      • outputTypes array of string, OPTIONAL

        An array of strings specifying the data types that the model can produce as output. The allowed values are: "text", "image", "audio", "video", or "embedding". For output types that are not explicitly defined, the value "other" should be used.

      • knowledgeCutoff string, OPTIONAL

        The knowledge cutoff of the model: the date and time up to which the model's training data extends, formatted as defined by RFC 3339, section 5.6.

      • reasoning boolean, OPTIONAL

        Whether the model can perform reasoning tasks.

      • toolUsage boolean, OPTIONAL

        Whether the model can use external tools or APIs to perform tasks.

      • reward boolean, OPTIONAL

        Whether the model is a reward model.

      • languages array of string, OPTIONAL

        The natural languages the model supports, encoded as ISO 639-1 two-letter codes.

  • modelfs object, REQUIRED

    Contains hashes of each uncompressed layer's content.

    • type string, REQUIRED

      Must be set to "layers".

    • diffIds array of strings, REQUIRED

      An array of layer content hashes (DiffIDs), in order from first to last.
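A consumer can check the three REQUIRED top-level properties before trusting a document. A minimal structural sketch, not a full schema validation (the function name is illustrative):

```python
def check_model_config(doc: dict) -> list:
    """Return a list of problems with the REQUIRED parts of a model
    artifact configuration; an empty list means the checks passed."""
    problems = []
    for key in ("descriptor", "config", "modelfs"):
        if not isinstance(doc.get(key), dict):
            problems.append(f"missing required object: {key}")
    modelfs = doc.get("modelfs")
    if isinstance(modelfs, dict):
        if modelfs.get("type") != "layers":
            problems.append('modelfs.type must be "layers"')
        diff_ids = modelfs.get("diffIds")
        if not (isinstance(diff_ids, list) and
                all(isinstance(d, str) for d in diff_ids)):
            problems.append("modelfs.diffIds must be an array of strings")
    return problems

ok = {"descriptor": {}, "config": {},
      "modelfs": {"type": "layers", "diffIds": ["sha256:abc"]}}
print(check_model_config(ok))  # []
print(check_model_config({}))  # three "missing required object" problems
```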

Example

Here is an example model artifact configuration JSON document:

{
  "descriptor": {
    "createdAt": "2025-01-01T00:00:00Z",
    "authors": [
      "xyz@xyz.com"
    ],
    "vendor": "XYZ Corp.",
    "family": "xyz3",
    "name": "xyz-3-8B-Instruct",
    "version": "3.1",
    "title": "XYZ 3 8B Instruct",
    "description": "xyz is a large language model.",
    "docURL": "https://www.xyz.com/get-started/",
    "sourceURL": "https://github.com/xyz/xyz3",
    "datasetsURL": ["https://www.xyz.com/datasets/"],
    "revision": "1234567890",
    "licenses": [
      "Apache-2.0"
    ]
  },
  "config": {
    "architecture": "transformer",
    "format": "pt",
    "paramSize": "8b",
    "precision": "float16",
    "quantization": "gptq",
    "transformerConfig": {
      "attentionType": "gqa",
      "mlpType": "dense",
      "numLayers": 32,
      "numAttentionHeads": 32,
      "numKVHeads": 8,
      "hiddenSize": 4096,
      "intermediateSize": 14336
    },
    "capabilities": {
      "inputTypes": [
        "text"
      ],
      "outputTypes": [
        "text",
        "image"
      ],
      "knowledgeCutoff": "2024-05-21T00:00:00Z",
      "reasoning": true,
      "toolUsage": false,
      "reward": false,
      "languages": ["en", "zh"]
    }
  },
  "modelfs": {
    "type": "layers",
    "diffIds": [
      "sha256:1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef",
      "sha256:abcdef1234567890abcdef1234567890abcdef1234567890abcdef1234567890"
    ]
  }
}