speechmatics · cabbage-ice-cream · Jun 12, 2026
diff --git a/custom-words.txt b/custom-words.txt
@@ -339,3 +339,5 @@ seqs
 vllm
 configmap
 sessiongroups
+melia
+مرحبا
diff --git a/docs/deployments/container/cpu-speech-to-text.mdx b/docs/deployments/container/cpu-speech-to-text.mdx
@@ -262,11 +262,11 @@ In general, the format is: `{language}_{domain}_{processor}_{operating_point}:{p
 The parameters are:
 - `language` - One of the supported [language codes](/speech-to-text/languages)
 
-- `domain` - One of `general` or a domain used for some [multi-lingual transcription](/speech-to-text/languages#multilingual-speech-to-text) use cases. For example: `SM_PREWARM_ENGINE_MODES='es_bilingual-en_gpu_standard:1'`
+- `domain` - One of `general` or a domain used for some [multi-lingual transcription](/speech-to-text/languages#bilingual-and-multi-language-packs) use cases. For example: `SM_PREWARM_ENGINE_MODES='es_bilingual-en_gpu_standard:1'`
 
 - `processor` - One of `cpu` or `gpu`. Note that selecting `gpu` requires a [GPU Inference Container](/deployments/container/gpu-speech-to-text)
 
-- `operating_point` - One of `standard` or `enhanced`. The [operating point](/speech-to-text/languages#models) you want to prewarm
+- `operating_point` - One of `standard` or `enhanced`. The [operating point](/speech-to-text/models) you want to prewarm
 
 - `prewarm_connections` - Integer. The number of engine instances of the specific mode you want to pre-warm. The total number of `prewarm_connections` cannot be greater than `SM_MAX_CONCURRENT_CONNECTIONS`. After the pre-warming is complete, this parameter does not limit the types of connections the engine can start.
 

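To make the mode format above concrete, here is a minimal Python sketch that splits one `SM_PREWARM_ENGINE_MODES` entry into the five documented parameters. The helper `parse_prewarm_mode` and the `PrewarmMode` type are hypothetical names for illustration, not part of any Speechmatics tooling, and the sketch assumes that no individual field contains an underscore.

```python
from typing import NamedTuple


class PrewarmMode(NamedTuple):
    language: str
    domain: str
    processor: str
    operating_point: str
    prewarm_connections: int


def parse_prewarm_mode(mode: str) -> PrewarmMode:
    """Split one mode string into the five documented parameters.

    Hypothetical helper. Assumes no field contains an underscore.
    """
    spec, sep, count = mode.rpartition(":")
    if not sep:
        raise ValueError(f"missing ':' before prewarm_connections in {mode!r}")
    parts = spec.split("_")
    if len(parts) != 4:
        raise ValueError(f"expected 4 underscore-separated fields in {spec!r}")
    language, domain, processor, operating_point = parts
    if processor not in {"cpu", "gpu"}:
        raise ValueError(f"processor must be 'cpu' or 'gpu', got {processor!r}")
    if operating_point not in {"standard", "enhanced"}:
        raise ValueError(
            f"operating_point must be 'standard' or 'enhanced', got {operating_point!r}"
        )
    # int() raises ValueError if prewarm_connections is not an integer.
    return PrewarmMode(language, domain, processor, operating_point, int(count))


print(parse_prewarm_mode("es_bilingual-en_gpu_standard:1"))
```

A caller could sum `prewarm_connections` across its entries and compare the total with `SM_MAX_CONCURRENT_CONNECTIONS` before starting the container.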
diff --git a/docs/deployments/container/gpu-speech-to-text.mdx b/docs/deployments/container/gpu-speech-to-text.mdx
@@ -107,7 +107,7 @@ Once the GPU Server is running, follow the [Instructions for Linking a CPU Conta
 
 ### Running only one operating point
 
-[Operating Points](/speech-to-text/languages#models) represent different levels of model complexity.
+[Operating Points](/speech-to-text/models) represent different levels of model complexity.
 To save GPU memory for throughput, you can run the server with only one Operating Point loaded. To do this, pass the
 `SM_OPERATING_POINT` environment variable to the container and set it to either `standard` or `enhanced`.
 

diff --git a/docs/deployments/index.md b/docs/deployments/index.md
@@ -31,7 +31,7 @@ Feature availability varies depending on the deployment method you choose. Below
 
 | Feature                                                                               | Modes           | Deployments   |
 | ------------------------------------------------------------------------------------- | --------------- | ------------- |
-| [Multilingual speech to text](/speech-to-text/languages#multilingual-speech-to-text) | Batch, Realtime | SaaS, On-prem |
+| [Multilingual speech to text](/speech-to-text/languages#bilingual-and-multi-language-packs) | Batch, Realtime | SaaS, On-prem |
 | [Alignment](/speech-to-text/batch/alignment)                                          | Batch           | SaaS          |
 | [Audio events](/speech-to-text/features/audio-events)                                 | Batch, Realtime | SaaS, On-prem |
 | [Audio filtering](/speech-to-text/features/audio-filtering)                           | Batch, Realtime | SaaS, On-prem |

diff --git a/docs/get-started/authentication.mdx b/docs/get-started/authentication.mdx
@@ -76,6 +76,8 @@ Speechmatics Batch SaaS supports the following endpoints for production use:
 
 Jobs are created in the region corresponding to the endpoint used. You must use the same endpoint for all requests relating to a specific job.
 
+The Melia 1 model is available in the EU1 and US1 regions only. For details, refer to [Models](/speech-to-text/models#melia-1).
+
 :::warning
 The EU2 and US2 Batch SaaS endpoints are provided for enterprise customer high availability and failover purposes only. Jobs created in these environments will not be visible in the Portal.
 :::

diff --git a/docs/speech-to-text/batch/input.mdx b/docs/speech-to-text/batch/input.mdx
@@ -1,6 +1,6 @@
 ---
 keywords: [speechmatics, transcription, speech recognition, asr, api, limits]
-toc_max_heading_level: 2
+toc_max_heading_level: 3
 title: 'Input – Batch'
 sidebar_label: 'Input'
 description: 'Learn about configuration and supported input audio formats for the Speechmatics Batch API'
@@ -40,6 +40,26 @@ Below are the complete fields of the configuration object:
 
 <SchemaNode schema={batchSchema.definitions.JobConfig} />
 
+### Language hints
+
+The Melia 1 model detects every language it hears automatically, so language hints are optional. To select Melia 1, refer to [Models](/speech-to-text/models).
+
+Hints tell the model which languages to expect in the audio, biasing detection toward them. They make language labeling more reliable and are most useful for short clips, audio with heavy accents, or recordings in which two languages sound similar.
+
+Provide hints as a list of [supported languages](/speech-to-text/languages#transcription-languages) to guide detection without restricting it. This config hints that the audio contains English and Arabic:
+
+```json
+{
+  "type": "transcription",
+  "transcription_config": {
+    "model": "melia-1",
+    "language": "multi",
+    "language_hints": ["en", "ar"]
+  }
+}
+```
+
+The model can still detect and label a language you did not hint, and it labels only the languages it actually hears.
 
 ## Fetch URL
 

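As a rough illustration of the language hints config added in this file, the following Python sketch builds the same JSON object programmatically. The function name `build_melia_config` is hypothetical; only the field names and values (`model`, `language`, `language_hints`, `melia-1`, `multi`) come from the documented example.

```python
import json
from typing import Iterable, Optional


def build_melia_config(language_hints: Optional[Iterable[str]] = None) -> dict:
    """Build a Batch job config for Melia 1 (hypothetical helper).

    Hints are optional, so the key is left out when none are given.
    """
    transcription_config = {"model": "melia-1", "language": "multi"}
    if language_hints:
        transcription_config["language_hints"] = list(language_hints)
    return {"type": "transcription", "transcription_config": transcription_config}


# Reproduces the documented example that hints English and Arabic.
print(json.dumps(build_melia_config(["en", "ar"]), indent=2))
```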
diff --git a/docs/speech-to-text/batch/language-identification.mdx b/docs/speech-to-text/batch/language-identification.mdx
@@ -35,6 +35,10 @@ Once you are set up, just set `language` to `auto` to use Automatic Language Ide
 }
 ```
 
+:::note
+The Melia 1 model does not support `language: auto`; requests that set it return an error. Set `language` to `multi` instead, because Melia 1 is multilingual and detects languages automatically. Refer to [Models](/speech-to-text/models).
+:::
+
 :::info
 To reliably identify the predominant language, the file should contain at least 60 seconds of speech in that language.
 :::

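The note added above could be mirrored by a client-side check before a job is submitted. This is a sketch only: `check_language_setting` is a hypothetical helper, and it assumes the config uses the `model` and `language` fields shown in the Batch input example. It does not reproduce the error the API itself returns.

```python
def check_language_setting(config: dict) -> None:
    """Reject the documented unsupported combination before submitting a job.

    Hypothetical client-side check: Melia 1 with language 'auto'.
    """
    transcription_config = config.get("transcription_config", {})
    model = transcription_config.get("model")
    language = transcription_config.get("language")
    if model == "melia-1" and language == "auto":
        raise ValueError(
            "melia-1 does not support language 'auto'; set language to 'multi'"
        )


# Passes silently: 'multi' is the documented setting for Melia 1.
check_language_setting(
    {"type": "transcription",
     "transcription_config": {"model": "melia-1", "language": "multi"}}
)
```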
diff --git a/docs/speech-to-text/batch/output.mdx b/docs/speech-to-text/batch/output.mdx
@@ -131,6 +131,56 @@ The following is an example of a transcript response, which you should see as an
   {JSON.stringify(transcriptResponseExample, null, 2)}
 </CodeBlock>
 
+### Multilingual transcript output
+
+For a Melia 1 job, the `language` property on each word reflects the language detected for that word, so it can change across the transcript. For Enhanced and Standard jobs, which transcribe one selected language, the same language is reported for every word.
+
+The example below shows two words in different languages within one transcript:
+
+```json
+{
+  "results": [
+    {
+      "alternatives": [
+        { "content": "Hello", "confidence": 0.98, "language": "en" }
+      ],
+      "start_time": 0.20,
+      "end_time": 0.52,
+      "type": "word"
+    },
+    {
+      "alternatives": [
+        { "content": "مرحبا", "confidence": 0.95, "language": "ar" }
+      ],
+      "start_time": 0.60,
+      "end_time": 1.04,
+      "type": "word"
+    }
+  ]
+}
+```
+
+For multilingual transcripts, `language_pack_info` reports the word delimiter and writing direction per language rather than for a single language pack:
+
+```json
+{
+  "metadata": {
+    "language_pack_info": {
+      "per_language_word_delimiters": {
+        "en": " ",
+        "ar": " "
+      },
+      "per_language_writing_direction": {
+        "en": "left-to-right",
+        "ar": "right-to-left"
+      }
+    }
+  }
+}
+```
+
+`per_language_word_delimiters` gives the word delimiter for each language in the transcript, and `per_language_writing_direction` gives the writing direction for each language.
+
 ## Quicklinks
 
 <Grid columns={{initial: "1", md: "2"}} gap="3">

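One way a consumer might use the per-word `language` property together with `per_language_word_delimiters` is to group consecutive words by language and join each group with that language's delimiter. The Python sketch below assumes both JSON excerpts from this file appear in the same transcript response. The function `language_runs` is hypothetical, and falling back to a single space when a language has no delimiter entry is an assumption of this sketch.

```python
from itertools import groupby


def language_runs(transcript: dict) -> list:
    """Return (language, text) pairs for consecutive words in one language.

    Hypothetical helper. Falls back to a space if a delimiter is missing.
    """
    pack_info = transcript.get("metadata", {}).get("language_pack_info", {})
    delimiters = pack_info.get("per_language_word_delimiters", {})
    # Take the first alternative of every word result.
    words = [
        result["alternatives"][0]
        for result in transcript.get("results", [])
        if result.get("type") == "word" and result.get("alternatives")
    ]
    runs = []
    for language, group in groupby(words, key=lambda word: word.get("language")):
        delimiter = delimiters.get(language, " ")
        runs.append((language, delimiter.join(word["content"] for word in group)))
    return runs
```

The matching `per_language_writing_direction` entry could then be used to decide how each run is rendered.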
diff --git a/docs/speech-to-text/features/audio-filtering.mdx b/docs/speech-to-text/features/audio-filtering.mdx
@@ -73,6 +73,6 @@ To obtain volume labelling without filtering any audio, supply an empty config o
 
 Once the audio is in a raw format (16kHz 16bit mono), it is split into 0.01s chunks. For each chunk, the root mean square amplitude of the signal is calculated, and scaled to the range `0 - 100`. If the volume is less than the supplied cut-off, the chunk will be replaced with silence.
 
-To work successfully without degrading accuracy, the background speech must be significantly quieter than the foreground speech, otherwise the filtering process may remove small sections of the audio which should be transcribed. For this reason, the feature works better with the [enhanced model](/speech-to-text/languages#operating-points), which is more robust against inadvertent damage to the audio.
+To work successfully without degrading accuracy, the background speech must be significantly quieter than the foreground speech, otherwise the filtering process may remove small sections of the audio which should be transcribed. For this reason, the feature works better with the [enhanced model](/speech-to-text/models), which is more robust against inadvertent damage to the audio.
 
 The word volume calculation takes the start and end times of words, and applies a weighted average of the volumes of each audio chunk which make up the word. The weighting attempts to ignore areas of silence within long words, and provide a better match with the volume classification a human listener would make.
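The chunk-level filtering described in this hunk can be sketched in Python as follows. This is not the engine's implementation: the function names are hypothetical, and the linear scaling of the root mean square amplitude against the 16-bit full-scale value (32768) is an assumption, since the text only states that the result is scaled to the range `0 - 100`.

```python
import math

SAMPLE_RATE = 16_000                 # 16kHz mono, as described in the text
CHUNK_SAMPLES = SAMPLE_RATE // 100   # 0.01s chunks -> 160 samples
FULL_SCALE = 32768                   # 16-bit full scale (assumed scaling reference)


def chunk_volume(samples: list) -> float:
    """RMS amplitude of one chunk, scaled linearly to 0-100 (assumed scaling)."""
    if not samples:
        return 0.0
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 100.0 * rms / FULL_SCALE


def filter_quiet_chunks(samples: list, cutoff: float) -> list:
    """Replace every chunk whose volume is below the cut-off with silence."""
    filtered = []
    for start in range(0, len(samples), CHUNK_SAMPLES):
        chunk = samples[start:start + CHUNK_SAMPLES]
        if chunk_volume(chunk) < cutoff:
            filtered.extend([0] * len(chunk))  # silence
        else:
            filtered.extend(chunk)
    return filtered
```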
diff --git a/docs/speech-to-text/features/feature-discovery.mdx b/docs/speech-to-text/features/feature-discovery.mdx
@@ -18,11 +18,11 @@ curl "https://eu1.asr.api.speechmatics.com/v1/discovery/features"
 
 The feature discovery endpoint will include an object with the following properties:
 - `metadata`
-    - `language_pack_info` - For each of our [supported languages](/speech-to-text/languages), give the full name of the language, as well as any [Domain Language Optimizations](/speech-to-text/languages#multilingual-speech-to-text) or [Output Locales](/speech-to-text/formatting#output-locale)
+    - `language_pack_info` - For each of our [supported languages](/speech-to-text/languages), give the full name of the language, as well as any [Domain Language Optimizations](/speech-to-text/languages#bilingual-and-multi-language-packs) or [Output Locales](/speech-to-text/formatting#output-locale)
 - `batch` - Capabilities relating to our Batch API
     - `transcription` - Capabilities relating to transcription
         - `languages` - Includes a list of supported ISO language codes
         - `locales` - Includes any languages with a supported [Output Locale](/speech-to-text/formatting#output-locale)
-        - `domains` - Includes any languages with a supported [Domain Language Optimizations](/speech-to-text/languages#multilingual-speech-to-text)
+        - `domains` - Includes any languages with a supported [Domain Language Optimizations](/speech-to-text/languages#bilingual-and-multi-language-packs)
     - `translation` - Includes all [supported translation pairs](/speech-to-text/features/translation#languages)
     - `languageid` - List of languages supported by [Language Identification](/speech-to-text/batch/language-identification)