Mcp vuln by piyushroshan · Pull Request #364 · OWASP/crAPI

piyushroshan · 2026-02-24T10:16:15Z

Description

Web changes for mcp.

services/chatbot/src/chatbot/aws_credentials.py

-        not (has_access_key and has_secret_key)
+        region,
+        has_access_key,
+        has_secret_key,


General approach: avoid logging values that are directly derived from secret environment variables, even as booleans, or at least avoid logging them in a way that explicitly reveals per-secret presence. We should still preserve useful diagnostics but base them on non-sensitive or more generic information.

Best concrete fix here:

Stop logging the individual has_access_key, has_secret_key, and has_session_token booleans, since they are directly derived from secret env vars.

Keep logging non-sensitive values like region and whether the code will fall back to instance profile/IRSA.

Optionally compute a higher-level, non-secret-focused flag (e.g., credential_source = "env" vs "instance_profile") that doesn’t expose per-secret presence, but for minimal impact we can just keep the will_use_instance_profile flag.

Implementation details in services/chatbot/src/chatbot/aws_credentials.py:

In _get_base_session, keep the local variables has_access_key, has_secret_key, and has_session_token (since they’re used to decide will_use_instance_profile), but change the logger.info call on lines 45–53 so it no longer logs these three flags.

Replace the current log format string and arguments with a simpler message that only includes region and will_use_instance_profile (derived from the existing expression not (has_access_key and has_secret_key)), which does not come directly from a single secret env var.

No new methods or imports are required; this is a straightforward change to the logging statement.

services/chatbot/src/chatbot/aws_credentials.py

        logger.debug(
-            "Cached new credentials - expires_at: %s",
-            credentials["expiry_time"]
+            "Cached new credentials - expires_at: %s", credentials["expiry_time"]


General fix approach: avoid logging values derived from objects that contain secrets, or ensure that only clearly non‑sensitive fields are logged and that taint no longer flows from the secret-bearing structure into the logger. A clean way is to separate sensitive and non‑sensitive data into different structures, and/or copy out safe fields into local variables before logging.

Best fix for this code: in _set_cached_credentials, instead of logging credentials["expiry_time"] directly from the tainted credentials dict, read the expiry time into a local variable and log that variable. This both clarifies intent (we are only logging a numeric expiry timestamp) and can satisfy static analysis by breaking the taint flow through the credentials dict. Functionality remains unchanged: we still cache the same data and log the same value.

Concrete change (file services/chatbot/src/chatbot/aws_credentials.py):

In _set_cached_credentials, introduce a local variable expiry_time = credentials["expiry_time"].

Use this local variable both to set _credentials_cache["expiration"] and in the logger.debug call, instead of indexing the credentials dict in the log expression.

No new imports or helper methods are required.

services/chatbot/src/chatbot/aws_credentials.py

+            credentials["access_key"][:8] + "..."
+            if credentials.get("access_key")
+            else "(none)",


General approach: Remove or further anonymize any logging that includes credential material (even prefixes) that CodeQL considers sensitive. We should keep logging high-level status (e.g., “assume role succeeded”) without including access_key details.

Best single fix without changing functionality: In _get_credentials, change the logger.info call on lines 238–243 so it no longer formats or logs credentials["access_key"]. Instead, log non-sensitive details, such as the configured role ARN or a generic success message. This preserves the success-logging behavior while eliminating the sensitive sink that CodeQL flags. We do not need new imports or helper methods; a simple change of the log message and arguments is sufficient.

Concretely:

In services/chatbot/src/chatbot/aws_credentials.py, replace the logger.info block starting at line 238:

From: "[AWS_CREDS] Assume role succeeded - access_key_prefix: %s", credentials["access_key"][:8] + "..." if ...

To: a message like "[AWS_CREDS] Assume role succeeded" or one that only references Config.AWS_ASSUME_ROLE_ARN (which is configuration, not a secret) and does not access credentials.

No other parts of the file need to change for this specific CodeQL finding.

services/chatbot/src/chatbot/aws_credentials.py

+        credentials["access_key"][:8] + "..."
+        if credentials.get("access_key")
+        else "(none)",


General fix: stop logging any value derived from AWS credential material (access key IDs, secret keys, tokens), or replace such logs with non-sensitive metadata (e.g., booleans like has_access_key, opaque internal IDs, or static messages). Do not change how credentials are obtained or used; only change what is written to logs.

Best concrete fix here: in all the logging calls that currently include an access_key_prefix derived from credentials["access_key"] (or cached variants), replace that argument with a non-sensitive indicator such as has_access_key = bool(credentials.get("access_key")). This preserves useful debug information (whether a key is present) while avoiding exposure of any part of the credential. The signatures and return values of functions remain unchanged.

Specific locations in services/chatbot/src/chatbot/aws_credentials.py:

Around lines 129–139 in _assume_role():

Replace the access_key_prefix argument (credentials["AccessKeyId"][:8] + "..." ...) with a boolean credentials.get("AccessKeyId") is not None or bool(credentials.get("AccessKeyId")).

Update the format string label to something like has_access_key: %s.

Around lines 221–226 in the cached-credentials branch of get_aws_credentials():

Replace cached["access_key"][:8] + "..." if cached.get("access_key") else "(none)" with bool(cached.get("access_key")).

Update the message text to has_access_key: %s.

Around lines 238–243 after _assume_role() in get_aws_credentials():

Replace credentials["access_key"][:8] + "..." if credentials.get("access_key") else "(none)" with bool(credentials.get("access_key")).

Update the message text accordingly.

Around lines 311–318 in get_bedrock_runtime_client():

Replace credentials["access_key"][:8] + "..." if credentials.get("access_key") else "(none)" with bool(credentials.get("access_key")).

Update the log message label.

No new imports or helper methods are necessary; we only adjust log messages and their arguments.

services/chatbot/src/chatbot/aws_credentials.py

+        else "(none)",
        bool(credentials.get("token")),
-        region
+        region,


General approach: Ensure that no log statement can include sensitive values such as SecretAccessKey, and explicitly separate non‑sensitive fields from any container that holds secrets so that static analysis does not consider them tainted. We do not need to change how credentials are obtained or used—only how we derive and log the region value.

Best concrete fix here: In get_bedrock_client_with_explicit_credentials() (around lines 300–318), avoid logging the region value that is directly derived from the tainted credentials dict. Instead, compute a separate, clearly non‑tainted effective_region variable that uses only region environment variables and/or the configuration values that do not depend on secrets, and use that for both the log message and the client creation. Functionally, this does not change behavior if we preserve the same resolution order for region; it only makes the dataflow clearer and guarantees that the logged value cannot contain secrets.

Specifically:

Replace the current region = (...) assignment (lines 305–309) with a new effective_region computed without using credentials, e.g.:

Prefer os.getenv("AWS_REGION") or AWS_DEFAULT_REGION.

Optionally, fall back to Config.AWS_REGION (if appropriate), but do not read a region value from the credentials dict for logging.

Use effective_region instead of region in:

The logger.info call at lines 311–318.

The region_name argument in the boto3.client call at line 326.

This keeps the same observable behavior as long as credentials["region"] is redundant with the env/config region, which appears to be the case in this helper; if not, the safest adjustment is to still derive effective_region from the same expression but copy it into a new local variable before logging (ensuring no mutation and making the taint split explicit).

No new imports or helper methods are necessary; changes are localized to services/chatbot/src/chatbot/aws_credentials.py in the get_bedrock_client_with_explicit_credentials function.

services/chatbot/src/chatbot/aws_credentials.py

+        credentials["access_key"][:8] + "..."
+        if credentials.get("access_key")
+        else "(none)",


General approach: Remove or further sanitize logging of any data derived from AWS credentials and secrets. Specifically, avoid including access key prefixes or any details derived from access_key, secret_key, token, or related environment variables in log messages. Keep high-level status/diagnostic messages but without credential material.

Best concrete fix in this file:

In _assume_role(), change the success logger.info so it no longer logs access_key_prefix. Keep non-sensitive data like role_arn, session_name, expires_at, and assumed_role_id.

In get_aws_credentials() (the section around lines 214–249):

When using cached credentials, remove access_key_prefix from the logger.info message or replace it with a generic placeholder.

After assuming and caching credentials, likewise remove access_key_prefix from the success log message.

In get_bedrock_credentials_kwargs() (around lines 377–384), adjust the logger.info message to omit access_key_prefix and log only non-sensitive booleans like has_token and the region.

We do not need new methods or imports; we only simplify the format strings and arguments in these logging calls. Functionality of credential retrieval and use remains unchanged.

services/chatbot/src/chatbot/aws_credentials.py

+        else "(none)",
        bool(credentials.get("token")),
-        region
+        region,


General approach: ensure that no values derived from or associated with secret_key (or other secrets) are logged in clear text. For values that are operationally useful but taint-connected, either (a) log a sanitized/abstracted representation, or (b) avoid logging them entirely.

Best concrete fix here:

Keep logging non-sensitive booleans (has_token) and high-level status messages.

Continue to log only a truncated prefix of the access key, which does not expose the full secret.

Remove or sanitize logging of region in the sensitive context that CodeQL flags, because it is taint-connected to secrets.

This change leaves core functionality (credential retrieval and use) untouched; only log contents are slightly reduced.

Specific changes in services/chatbot/src/chatbot/aws_credentials.py:

In get_bedrock_credentials_kwargs (around lines 377–384), update the logger.info call:

Remove region from the format string and from the arguments.

Optionally, if you still want some location info, you could log a constant like "region: <redacted>", but to keep the patch minimal we just drop it.

No new imports or helper methods are needed; we only adjust the existing logging statement.

services/community/api/models/post.go

+		},
+	}
+
+	cur, err := collection.Find(ctx, filter, options)


In general, to fix this kind of problem you should avoid letting raw user input directly control a database query operator such as $regex. For text search, the usual approach is to either (a) escape all regex metacharacters so that the input is treated as a literal string, or (b) constrain and validate the regex to a safe subset. With the MongoDB Go driver, you can construct the filter using bson.M but first run the input through a helper that escapes regex special characters.

For this codebase, the least invasive fix that preserves existing functionality (“filter posts by title, case‑insensitive”) is to escape regex metacharacters in title before using it in the filter. That way, a user searching for foo.*bar will match titles containing the literal text foo.*bar, rather than arbitrary patterns, and attempts to craft pathological regexes will be neutralized. We can add a small helper function escapeRegex in services/community/api/models/post.go that prefixes all regex special characters with a backslash, and then call it inside FindPostsByTitle when building filter. To implement this, we need to import Go’s standard regexp package (no third‑party dependencies) and define the helper function near the other model‑level helpers.

Concretely:

In services/community/api/models/post.go, add regexp to the import list.

In the same file, define:

func escapeRegex(input string) string { // Escape all regex metacharacters so the input is treated literally. return regexp.QuoteMeta(input) }

Update FindPostsByTitle so that the $regex value is escapeRegex(title) instead of title.

This keeps the API and behavior compatible (still a case‑insensitive contains‑style match, but with literal interpretation of user input) and removes the direct use of untrusted input as a regex.

services/community/api/models/post.go

+
+	postsResponse.Posts = postList
+
+	count, err1 := collection.CountDocuments(context.Background(), filter)


General approach: Avoid using raw user input as a regular expression. Either (a) treat the input as a literal string by escaping any regex metacharacters, or (b) validate and restrict the allowed pattern syntax and length. The safest minimal change here is to escape the user-provided title so it is interpreted as a literal substring search rather than an arbitrary regex, while keeping the existing $regex-based search and pagination behavior intact.

Best fix in this code: In FindPostsByTitle, before building the filter, transform title using regexp.QuoteMeta. This turns any regex metacharacters into literals, preventing malicious patterns from affecting the regex engine. Then use this escaped string as the $regex value. To do that we need to add an import for Go’s standard regexp package in services/community/api/models/post.go and update the filter construction so it uses escapedTitle instead of title. All other logic (sorting, limit/offset, counting) stays the same.

Concretely:

In services/community/api/models/post.go, add regexp to the import list.

In FindPostsByTitle, right before the filter := bson.M{...} block, compute escapedTitle := regexp.QuoteMeta(title).

Change the $regex value from title to escapedTitle.

No changes are needed in services/community/api/controllers/post_controller.go.

github-actions · 2026-02-24T10:20:26Z

Test Results

93 tests 93 ✅ 2s ⏱️
17 suites 0 💤
7 files 0 ❌

Results for commit ea83854.

♻️ This comment has been updated with latest results.

piyushroshan added 7 commits February 23, 2026 21:24

MCP vuln

521e6f9

Spotless

d1278cd

Add auth to mcp

afb6643

Blog search

14f60fc

Post content

9b00556

FIx post

21cba90

Web changes

75376a1

github-advanced-security bot found potential problems Feb 24, 2026

View reviewed changes

piyushroshan added 5 commits February 24, 2026 18:58

Add setuptools

8e2c484

Setuptools

05062fc

Fix tests

cb4ab2b

Update tools

fea8cd9

Update tools

ea83854

piyushroshan merged commit 7890092 into develop Feb 24, 2026
9 of 10 checks passed

piyushroshan deleted the mcp_vuln branch February 24, 2026 14:39

@@ -43,12 +43,8 @@
                 has_secret_key = bool(os.getenv("AWS_SECRET_ACCESS_KEY"))
                 has_session_token = bool(os.getenv("AWS_SESSION_TOKEN"))
                 logger.info(
-                    "[BASE_SESSION] Creating boto3 session - region: %s, has_access_key: %s, "
-                    "has_secret_key: %s, has_session_token: %s, will_use_instance_profile: %s",
+                    "[BASE_SESSION] Creating boto3 session - region: %s, will_use_instance_profile: %s",
                     region,
-                    has_access_key,
-                    has_secret_key,
-                    has_session_token,
                     not (has_access_key and has_secret_key),
                 )
                 # Use None for empty strings so boto3 falls back to instance profile/IRSA

@@ -182,10 +182,11 @@
             def _set_cached_credentials(credentials: dict) -> None:
                 """Cache the credentials."""
                 with _credentials_cache["lock"]:
+                    expiry_time = credentials["expiry_time"]
                     _credentials_cache["credentials"] = credentials
-                    _credentials_cache["expiration"] = credentials["expiry_time"]
+                    _credentials_cache["expiration"] = expiry_time
                     logger.debug(
-                        "Cached new credentials - expires_at: %s", credentials["expiry_time"]
+                        "Cached new credentials - expires_at: %s", expiry_time
                     )

@@ -235,12 +235,7 @@
                     logger.info("[AWS_CREDS] No cached credentials, will call assume role now")
                     credentials = _assume_role()
                     _set_cached_credentials(credentials)
-                    logger.info(
-                        "[AWS_CREDS] Assume role succeeded - access_key_prefix: %s",
-                        credentials["access_key"][:8] + "..."
-                        if credentials.get("access_key")
-                        else "(none)",
-                    )
+                    logger.info("[AWS_CREDS] Assume role succeeded")
                     return {
                         "access_key": credentials["access_key"],
                         "secret_key": credentials["secret_key"],

@@ -128,14 +128,12 @@
                     logger.info(
                         "[ASSUME_ROLE] SUCCESS - role_arn: %s, session_name: %s, expires_at: %s, "
-                        "assumed_role_id: %s, access_key_prefix: %s",
+                        "assumed_role_id: %s, has_access_key: %s",
                         role_arn,
                         session_name,
                         credentials["Expiration"],
                         response.get("AssumedRoleUser", {}).get("AssumedRoleId", "unknown"),
-                        credentials["AccessKeyId"][:8] + "..."
-                        if credentials.get("AccessKeyId")
-                        else "(none)",
+                        bool(credentials.get("AccessKeyId")),
                     )
                     return {
@@ -219,10 +213,8 @@
                     cached = _get_cached_credentials()
                     if cached:
                         logger.info(
-                            "[AWS_CREDS] Using CACHED assume role credentials - access_key_prefix: %s",
-                            cached["access_key"][:8] + "..."
-                            if cached.get("access_key")
-                            else "(none)",
+                            "[AWS_CREDS] Using CACHED assume role credentials - has_access_key: %s",
+                            bool(cached.get("access_key")),
                         )
                         return {
                             "access_key": cached["access_key"],
@@ -236,10 +228,8 @@
                     credentials = _assume_role()
                     _set_cached_credentials(credentials)
                     logger.info(
-                        "[AWS_CREDS] Assume role succeeded - access_key_prefix: %s",
-                        credentials["access_key"][:8] + "..."
-                        if credentials.get("access_key")
-                        else "(none)",
+                        "[AWS_CREDS] Assume role succeeded - has_access_key: %s",
+                        bool(credentials.get("access_key")),
                     )
                     return {
                         "access_key": credentials["access_key"],
@@ -309,10 +299,8 @@
                 )
                 logger.info(
-                    "[BEDROCK_CLIENT] Using credentials - access_key_prefix: %s, has_token: %s, region: %s",
-                    credentials["access_key"][:8] + "..."
-                    if credentials.get("access_key")
-                    else "(none)",
+                    "[BEDROCK_CLIENT] Using credentials - has_access_key: %s, has_token: %s, region: %s",
+                    bool(credentials.get("access_key")),
                     bool(credentials.get("token")),
                     region,
                 )

@@ -302,10 +302,10 @@
                     "[BEDROCK_CLIENT] Creating bedrock-runtime client with explicit credentials"
                 )
                 credentials = get_aws_credentials()
-                region = (
-                    credentials.get("region")
-                    or os.getenv("AWS_REGION")
+                effective_region = (
+                    os.getenv("AWS_REGION")
                     or os.getenv("AWS_DEFAULT_REGION")
+                    or credentials.get("region")
                 )
                 logger.info(
@@ -314,7 +313,7 @@
                     if credentials.get("access_key")
                     else "(none)",
                     bool(credentials.get("token")),
-                    region,
+                    effective_region,
                 )
                 # Create client with explicit credentials, bypassing any default chain or token discovery
@@ -323,7 +322,7 @@
                     aws_access_key_id=credentials["access_key"],
                     aws_secret_access_key=credentials["secret_key"],
                     aws_session_token=credentials.get("token"),
-                    region_name=region,
+                    region_name=effective_region,
                     # Disable any retries on auth errors to fail fast
                     config=BotocoreConfig(
                         signature_version="v4",  # Force SigV4, not bearer token

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Mcp vuln#364

Mcp vuln#364
piyushroshan merged 12 commits intodevelopfrom
mcp_vuln

piyushroshan commented Feb 24, 2026

Uh oh!

Check failure

Copilot Autofix

Check failure

Copilot Autofix

Check failure

Copilot Autofix

Check failure

Copilot Autofix

Check failure

Copilot Autofix

Check failure

Copilot Autofix

Check failure

Copilot Autofix

Check failure

Copilot Autofix

Check failure

Copilot Autofix

github-actions bot commented Feb 24, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

@@ -19,6 +19,7 @@
             	"errors"
             	"html"
             	"log"
+            	"regexp"
             	"strings"
             	"time"
@@ -28,6 +29,12 @@
             	"go.mongodb.org/mongo-driver/mongo/options"
             )
+            // escapeRegex escapes all regular expression metacharacters in the input
+            // so that user-provided strings are treated as literals in MongoDB $regex queries.
+            func escapeRegex(input string) string {
+            	return regexp.QuoteMeta(input)
+            }
             // Post Field
             type Post struct {
             	ID        string     `gorm:"primary_key;auto_increment" json:"id"`
@@ -171,9 +178,10 @@
             	ctx := context.Background()
             	collection := client.Database("crapi").Collection("post")
+            	escapedTitle := escapeRegex(title)
             	filter := bson.M{
             		"title": bson.M{
-            			"$regex":   title,
+            			"$regex":   escapedTitle,
             			"$options": "i",
             		},
             	}


		postsResponse.Posts = postList

		count, err1 := collection.CountDocuments(context.Background(), filter)

Conversation

piyushroshan commented Feb 24, 2026

Description

Uh oh!

Check failure

Uh oh!

Copilot Autofix

Check failure

Uh oh!

Copilot Autofix

Check failure

Uh oh!

Copilot Autofix

Check failure

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot Autofix

Check failure

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot Autofix

Check failure

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot Autofix

Check failure

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot Autofix

Check failure

Uh oh!

Copilot Autofix

Check failure

Uh oh!

Copilot Autofix

github-actions bot commented Feb 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Test Results

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

github-actions bot commented Feb 24, 2026 •

edited

Loading