[Feature]: per-request billable-duration cap on POST /v1/listen #709

@0xddy

Summary

/v1/listen

Problem to solve

We operate a paid platform that re-sells Deepgram pre-recorded transcription to end users on a per-minute basis. Our app accepts a media URL from an end user, forwards it to POST /v1/listen (async with callback), and bills the user only after the webhook returns metadata.duration.

The /v1/listen endpoint has no parameter that lets the API caller bound the maximum billable duration of a single request. The full OpenAPI spec at https://developers.deepgram.com/reference/speech-to-text/listen-pre-recorded lists callback, callback_method, extra, tag, sentiment, summarize, topics, custom_topic, custom_topic_mode, intents, custom_intent, custom_intent_mode, detect_entities, detect_language, diarize, dictation, encoding, filler_words, keyterm, keywords, language, measurements, model, multichannel, numerals, paragraphs, profanity_filter, punctuate, redact, replace, search, smart_format, utterances, utt_split, version, mip_opt_out — none of which constrains duration or billable seconds.

Because the caller cannot tell Deepgram "stop and bill at most N seconds for this submission", any caller who exposes the API indirectly to untrusted end users is exposed to an unbounded cost-amplification attack:

  1. The end user submits a media URL pointing to an arbitrarily long, perfectly valid audio file (e.g. a 1000-hour public-domain recording — no metadata forgery required).
  2. The platform forwards the URL to /v1/listen. There is no client-side way to know the true playable duration short of fully downloading and re-decoding the file, which is operationally prohibitive on every submission.
  3. Deepgram decodes the entire file and bills the platform for the full real duration.
  4. The end user's account at the platform may only carry a few minutes' worth of credit; the platform absorbs the rest of the cost.

The caller can validate everything except what only Deepgram can know: the true number of billable seconds Deepgram will charge for this specific request. Deepgram learns the exact number during decoding; only Deepgram can fail the request before billable seconds accrue beyond a caller-specified cap.

This is the single largest unbounded-cost vector for any Deepgram customer who exposes the API indirectly to untrusted end users (re-sellers, white-label SaaS, marketplaces, customer-facing transcription UIs).
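To put rough numbers on the scenario above, here is a minimal sketch of the exposure arithmetic. The rate of 0.01 per minute is a placeholder chosen for illustration, not Deepgram's actual price, and `uncovered_cost` is a hypothetical helper name.

```python
def uncovered_cost(real_seconds: float, credit_seconds: float, rate_per_minute: float) -> float:
    """Cost the platform absorbs when a file runs longer than the user's credit covers."""
    uncovered_seconds = max(0.0, real_seconds - credit_seconds)
    return uncovered_seconds / 60.0 * rate_per_minute


# Hypothetical inputs: a 1000-hour file, 5 minutes of end-user credit,
# and a placeholder rate of 0.01 currency units per minute.
loss = uncovered_cost(real_seconds=1000 * 3600, credit_seconds=5 * 60, rate_per_minute=0.01)
print(f"{loss:.2f}")  # prints 599.95
```

The loss scales linearly with file length, and nothing on the caller's side bounds that length.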

Proposed solution

Add an optional request-level cap on POST /v1/listen, e.g. `max_billable_seconds`:

  - Type: integer (or float), in seconds.
  - Default: absent (preserves current behavior; backward compatible).
  - On excess: reject the request before any billable seconds accrue beyond the cap. Suggested response: 413 Payload Too Large, or a dedicated 400 with a documented error code such as "MAX_BILLABLE_SECONDS_EXCEEDED".
  - Billing guarantee: the dollar charge for a single request must never exceed `max_billable_seconds × per-second rate`, regardless of the file's real duration.
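The intended semantics of the bullets above can be sketched as a small decision function. This is only an illustration of the requested behavior, not a description of how Deepgram works internally. `CapDecision` and `apply_cap` are hypothetical names, and charging nothing on rejection is one possible reading; the proposal itself only requires that the charge never exceed the cap.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CapDecision:
    accepted: bool
    billable_seconds: float
    error_code: Optional[str] = None


def apply_cap(decoded_seconds: float, max_billable_seconds: Optional[float]) -> CapDecision:
    # Parameter absent: current behavior, the full decoded duration is billed.
    if max_billable_seconds is None:
        return CapDecision(True, decoded_seconds)
    # Over the cap: reject. This sketch assumes a rejected request bills nothing.
    if decoded_seconds > max_billable_seconds:
        return CapDecision(False, 0.0, "MAX_BILLABLE_SECONDS_EXCEEDED")
    # Within the cap: behaves exactly as today.
    return CapDecision(True, decoded_seconds)
```

In every branch where a cap is supplied, `billable_seconds` is at most `max_billable_seconds`, which is the billing guarantee stated above.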

Example:

  POST /v1/listen?callback=...&max_billable_seconds=14400
  { "url": "https://example.com/audio.mp3" }

  -> if decoded media > 14400s, request fails with no charge above 14400s; otherwise behaves identically to today.
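On the caller side, a re-seller would probably derive the cap from the end user's remaining credit. The sketch below only builds the request URL and body and sends nothing. `max_billable_seconds` is the proposed parameter and does not exist in the API today, and `build_capped_request` and the callback URL in the usage lines are hypothetical.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://api.deepgram.com/v1/listen"


def build_capped_request(media_url: str, callback_url: str, user_credit_seconds: float):
    """Return (url, json_body) for a capped submission; no network call is made."""
    query = urlencode({
        "callback": callback_url,
        # PROPOSED parameter, not part of the current API.
        # Rounded down so the cap never exceeds the user's credit.
        "max_billable_seconds": int(user_credit_seconds),
    })
    body = json.dumps({"url": media_url})
    return f"{BASE_URL}?{query}", body


# Hypothetical usage: the user has 4 hours of credit left.
url, body = build_capped_request(
    "https://example.com/audio.mp3",
    "https://platform.example/webhook",  # placeholder callback endpoint
    14400,
)
```

With this wiring, the worst-case charge per submission would be known before the request is sent.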

A complementary feature would be an account-wide per-request cap configured from the Deepgram console, so every submission from a given API key is automatically capped without per-call wiring. This is arguably the better default for re-seller and B2B2C use cases.
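If both the per-request parameter and an account-wide setting existed, they would need a precedence rule. One plausible rule, assumed here and not part of the request itself, is that the stricter cap wins. `effective_cap` is a hypothetical name.

```python
from typing import Optional


def effective_cap(request_cap: Optional[float], account_cap: Optional[float]) -> Optional[float]:
    """Return the stricter of the two caps, or None when neither is set."""
    caps = [c for c in (request_cap, account_cap) if c is not None]
    return min(caps) if caps else None
```

Under this rule a per-call value can tighten the account-wide default but never loosen it, which suits keys shared across several services.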

Precedent in comparable usage-billed APIs:
  - OpenAI: `max_tokens` / `max_completion_tokens` per request.
  - AWS Transcribe: per-job duration limits surfaced via job-config errors.
  - Google Cloud Speech-to-Text: long-running operation duration cap.

Alternatives considered

No response

Scope

All SDKs (parity)

Priority

Blocker

Extra context / links

No response

Session ID (optional)

No response

Project ID (optional)

No response

Request ID (optional)

No response

Labels

enhancement (New feature or request)
