containers: further restrict container label names by martinezjandrew · Pull Request #6420 · cloudflare/workerd

martinezjandrew · 2026-03-25T21:05:15Z

Follows up on PR #6352 (#6352), which only validated against control characters in container label names and values.

This PR further restricts label names to alphanumeric characters ([A-Za-z0-9]). Alphanumeric-only names prevent injection when labels flow into GraphQL queries.

…e ASCII

ask-bonk · 2026-03-25T21:06:51Z

Now I have a clear picture. Let me analyze the type issue:

field.value is a kj::String, and iterating it yields char values. In C++, char may be signed (typically on x86). The comparison c >= 0x20 && c <= 0x7E on a signed char means that if c has a value >= 128 (e.g., UTF-8 multibyte), it would be negative as a signed char. Since a negative value is < 0x20, the check would correctly reject it. So the logic is actually sound — any byte with the high bit set would be negative and fail c >= 0x20.

Wait, actually that's fine — we want to reject non-ASCII, and any byte >= 0x80 would be negative as signed char and would fail the >= 0x20 check. So the behavior is correct.

Let me double check: for the name validation in the old code, static_cast<kj::byte>(c) >= 0x20 was used. The new code for names doesn't need a cast since it's checking ranges like c >= 'A'. For values, the new code uses c >= 0x20 && c <= 0x7E without the static_cast<kj::byte> — this is fine since any byte >= 0x80 is negative as signed char and correctly rejected.

Let me also check if there's a missing test for the DEL character (0x7F) in label values:

Looking at the tests, there's no explicit test for 0x7F (DEL), which is the boundary between accepted (0x7E ~) and rejected. The DEL character is a control character that would have been accepted by the old check (>= 0x20) but is now correctly rejected by the new check (<= 0x7E). A boundary test would be nice but it's not a blocking issue.

Let me also check: is there a test for empty label values? The code doesn't check for empty values — an empty string would pass the validation loop (no characters to check). That seems fine since an empty value for a label is a valid use case.

Let me check if the error messages properly identify which label triggered the error. The messages include "at index i" which refers to the position of the label in the dict, not the position of the offending character within the name/value. That's consistent with the original code. Let me also verify the original error message style said "(index 0)" with parens and the new one says "at index 0" without parens:

Looking at the diff:

Old: "Label names cannot contain control characters (index ", i, ")"
New: "Label names must contain only alphanumeric characters (A-Z, a-z, 0-9) " "at index ", i

The new message drops the closing parenthesis and changes the format. This is fine — the test regex patterns match accordingly.

Actually, I notice the old error messages had parentheses (index N) while the new ones use at index N — consistent within the PR but a style change. Not a real issue.

Let me think about whether there are any real issues here:

Backward compatibility: Container API is experimental/new (2025), so tightening validation is fine without a compat flag.
Security: Tightening validation is a security improvement — restricting label names to alphanumeric prevents injection into downstream systems (GraphQL queries).
Correctness: The char signedness for value validation is correct — negative signed chars (>= 0x80 unsigned) will fail c >= 0x20 since they're negative.
Tests: Good coverage of edge cases (hyphens, emoji, accented chars, control chars, plus positive test for valid labels).
Missing edge case in tests: No test for DEL (0x7F) in values or for label value at the exact boundary (0x7E ~). But the positive test includes ! which is 0x21, and any-value_is.fine! covers several printable ASCII chars. Not a blocking issue.
No test for label name length limit: There's no max length check, but that's pre-existing and not part of this PR.

Overall this is a clean, well-tested tightening of validation on an experimental API. I don't see actionable issues.

LGTM

github run

src/workerd/api/container.c++

…ontainer-label-validation

src/workerd/api/container.c++

anonrig · 2026-03-26T16:09:30Z

src/workerd/api/container.c++

+constexpr bool isAlphanumericString(kj::StringPtr s) {
+  for (auto c: s) {
+    if (!((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9') || c == '_'))
+      return false;
+  }
+  return true;
+}
+
+constexpr bool hasNoControlCharacters(kj::StringPtr s) {
+  for (auto c: s) {
+    if (static_cast<kj::byte>(c) < 0x20) return false;
+  }
+  return true;
+}


This is still extremely slow:

On top:

#include <cstdint>

Suggested change

constexpr bool isAlphanumericString(kj::StringPtr s) {

for (auto c: s) {

if (!((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9') || c == '_'))

return false;

}

return true;

}

constexpr bool hasNoControlCharacters(kj::StringPtr s) {

for (auto c: s) {

if (static_cast<kj::byte>(c) < 0x20) return false;

}

return true;

}

static constexpr uint64_t kAlnumBits[4] = {

UINT64_C(0x03FF000000000000), // '0'–'9' (bits 48–57)

UINT64_C(0x07FFFFFE87FFFFFE), // 'A'–'Z', '_', 'a'–'z'

UINT64_C(0), // 0x80–0xBF: none

UINT64_C(0), // 0xC0–0xFF: none

};

static constexpr uint64_t kNoCtrlBits[4] = {

UINT64_C(0xFFFFFFFF00000000), // 0x20–0x3F valid (bits 32–63)

UINT64_C(0xFFFFFFFFFFFFFFFF), // 0x40–0x7F: all valid

UINT64_C(0xFFFFFFFFFFFFFFFF), // 0x80–0xBF: all valid

UINT64_C(0xFFFFFFFFFFFFFFFF), // 0xC0–0xFF: all valid

};

constexpr bool isAlphanumericString(kj::StringPtr s) {

uint64_t acc = ~uint64_t(0);

for (auto c : s) {

auto uc = static_cast<unsigned char>(c);

acc &= kAlnumBits[uc >> 6] >> (uc & 63);

}

return (acc & 1) != 0;

}

constexpr bool hasNoControlCharacters(kj::StringPtr s) {

uint64_t acc = ~uint64_t(0);

for (auto c : s) {

auto uc = static_cast<unsigned char>(c);

acc &= kNoCtrlBits[uc >> 6] >> (uc & 63);

}

return (acc & 1) != 0;

}

That's quite a bit of added complexity in the implementation that might not be worthwhile. How hot of a path is the name validation in? Are these names large in general? If they are short and the path is not that hot, then these don't need much micro-optimization

I appreciate the help! I applied the change in 7e847c5.

That's quite a bit of added complexity in the implementation that might not be worthwhile. How hot of a path is the name validation in? Are these names large in general? If they are short and the path is not that hot, then these don't need much micro-optimization

I agree, this seems like overkill here, the label validation is almost certainly not going to be the bottleneck in this RPC call. Label names are capped at 16 bytes long (label values are 64 bytes max), the simpler implementation we had before is likely fine imo.

If we did want to use an implementation like this then container.c++ is probably not the right place for it but should be in a more general purpose string library.

…ontainer-label-validation

restrict container label names to alphanumeric and values to printabl…

fbd8dbb

…e ASCII

martinezjandrew marked this pull request as ready for review March 25, 2026 21:05

martinezjandrew requested review from a team as code owners March 25, 2026 21:05

anonrig requested changes Mar 25, 2026

View reviewed changes

src/workerd/api/container.c++ Outdated Show resolved Hide resolved

src/workerd/api/container.c++ Outdated Show resolved Hide resolved

extract label validation into constexpr string functions

f5f7a6e

gpanders reviewed Mar 25, 2026

View reviewed changes

src/workerd/api/container.c++ Outdated Show resolved Hide resolved

revert container label value validation

856d61f

martinezjandrew requested a review from anonrig March 26, 2026 14:19

martinezjandrew changed the title ~~containers: further restrict container label names and values~~ containers: further restrict container label names Mar 26, 2026

Merge branch 'main' of github.com:cloudflare/workerd into amartinez/c…

6f948ed

…ontainer-label-validation

joshthoward approved these changes Mar 26, 2026

View reviewed changes

gpanders requested changes Mar 26, 2026

View reviewed changes

src/workerd/api/container.c++ Outdated Show resolved Hide resolved

allow underscores in container label names

9748168

gpanders approved these changes Mar 26, 2026

View reviewed changes

anonrig reviewed Mar 26, 2026

View reviewed changes

anonrig approved these changes Mar 26, 2026

View reviewed changes

jasnell approved these changes Mar 26, 2026

View reviewed changes

martinezjandrew added 2 commits March 26, 2026 11:55

optimize container label validation

7e847c5

Merge branch 'main' of github.com:cloudflare/workerd into amartinez/c…

05d356e

…ontainer-label-validation

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

containers: further restrict container label names#6420

containers: further restrict container label names#6420
martinezjandrew wants to merge 7 commits intomainfrom
amartinez/container-label-validation

martinezjandrew commented Mar 25, 2026 •

edited

Loading

Uh oh!

ask-bonk bot commented Mar 25, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

anonrig Mar 26, 2026

Uh oh!

jasnell Mar 26, 2026

Uh oh!

martinezjandrew Mar 26, 2026

Uh oh!

gpanders Mar 26, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

-constexpr bool isAlphanumericString(kj::StringPtr s) {
-  for (auto c: s) {
-    if (!((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9') || c == '_'))
-      return false;
-  }
-  return true;
-}
-constexpr bool hasNoControlCharacters(kj::StringPtr s) {
-  for (auto c: s) {
-    if (static_cast<kj::byte>(c) < 0x20) return false;
-  }
-  return true;
-}
+static constexpr uint64_t kAlnumBits[4] = {
+    UINT64_C(0x03FF000000000000),  // '0'–'9'           (bits 48–57)
+    UINT64_C(0x07FFFFFE87FFFFFE),  // 'A'–'Z', '_', 'a'–'z'
+    UINT64_C(0),                    // 0x80–0xBF: none
+    UINT64_C(0),                    // 0xC0–0xFF: none
+};
+static constexpr uint64_t kNoCtrlBits[4] = {
+    UINT64_C(0xFFFFFFFF00000000),  // 0x20–0x3F valid (bits 32–63)
+    UINT64_C(0xFFFFFFFFFFFFFFFF),  // 0x40–0x7F: all valid
+    UINT64_C(0xFFFFFFFFFFFFFFFF),  // 0x80–0xBF: all valid
+    UINT64_C(0xFFFFFFFFFFFFFFFF),  // 0xC0–0xFF: all valid
+};
+constexpr bool isAlphanumericString(kj::StringPtr s) {
+    uint64_t acc = ~uint64_t(0);
+    for (auto c : s) {
+        auto uc = static_cast<unsigned char>(c);
+        acc &= kAlnumBits[uc >> 6] >> (uc & 63);
+    }
+    return (acc & 1) != 0;
+}
+constexpr bool hasNoControlCharacters(kj::StringPtr s) {
+    uint64_t acc = ~uint64_t(0);
+    for (auto c : s) {
+        auto uc = static_cast<unsigned char>(c);
+        acc &= kNoCtrlBits[uc >> 6] >> (uc & 63);
+    }
+    return (acc & 1) != 0;
+}

Conversation

martinezjandrew commented Mar 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ask-bonk bot commented Mar 25, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

anonrig Mar 26, 2026

Choose a reason for hiding this comment

Uh oh!

jasnell Mar 26, 2026

Choose a reason for hiding this comment

Uh oh!

martinezjandrew Mar 26, 2026

Choose a reason for hiding this comment

Uh oh!

gpanders Mar 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

martinezjandrew commented Mar 25, 2026 •

edited

Loading

gpanders Mar 26, 2026 •

edited

Loading