Skip to content

RDK-59001: [POC] Create Marker Instrumentation Discovery Tool#288

Open
yogeswaransky wants to merge 3 commits intodevelopfrom
feature/Discovery_tool
Open

RDK-59001: [POC] Create Marker Instrumentation Discovery Tool#288
yogeswaransky wants to merge 3 commits intodevelopfrom
feature/Discovery_tool

Conversation

@yogeswaransky
Copy link
Copy Markdown
Contributor

No description provided.

Signed-off-by: Yogeswaran K <yogeswaransky@gmail.com>
@yogeswaransky yogeswaransky requested a review from a team as a code owner March 17, 2026 13:10
Copilot AI review requested due to automatic review settings March 17, 2026 13:10
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a POC Python-based “marker discovery” CLI under tools/marker_discovery/ to inventory T2 telemetry marker instrumentation across repos (C/C++ direct calls + wrapper resolution, script calls, and .patch additions) and emit a consolidated Markdown report.

Changes:

  • Introduces GitHub org scanning + shallow clone orchestration, plus C/script/patch marker scanners.
  • Adds Markdown report generation (sorting, duplicate detection, dynamic marker section).
  • Adds pytest unit tests and supporting spec/docs artifacts (plus a sample generated report.md).

Reviewed changes

Copilot reviewed 20 out of 22 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
tools/marker_discovery/init.py Defines the MarkerRecord datamodel used across scanners/reporting.
tools/marker_discovery/main.py Enables python -m tools.marker_discovery execution.
tools/marker_discovery/code_parser.py tree-sitter-based C/C++ parser for direct marker calls + wrapper detection/resolution.
tools/marker_discovery/github_client.py GitHub API client for repo listing/search, plus shallow cloning with branch fallback.
tools/marker_discovery/marker_scanner.py CLI entry point and orchestration across orgs/repos; runs all scanners and generates report.
tools/marker_discovery/patch_parser.py Scans added lines in .patch files for marker API usage.
tools/marker_discovery/report_generator.py Produces the Markdown inventory report including duplicate/dynamic sections.
tools/marker_discovery/script_parser.py Scans non-C/patch files for t2ValNotify/t2CountNotify, flags dynamic markers, resolves $1 for simple wrappers.
tools/marker_discovery/requirements.txt Declares Python dependencies for the tool.
tools/marker_discovery/tests/init.py Test package marker.
tools/marker_discovery/tests/test_code_parser.py Unit tests for direct-call extraction and wrapper call-site resolution.
tools/marker_discovery/tests/test_script_parser.py Unit tests for script marker extraction, dynamic detection, and positional arg resolution.
tools/marker_discovery/tests/test_patch_parser.py Unit tests for patch scanning (added lines only) across C + script APIs.
tools/marker_discovery/tests/test_report_generator.py Unit tests for report structure, sorting, duplicates, and dynamic marker handling.
tools/marker_discovery/tests/integration_test.py Adds a manual “integration test” style script (currently placed under tests).
tools/marker_discovery/tests/debug_wrappers.py Adds a local debugging helper script for wrapper detection.
report.md Sample/generated marker inventory report output.
openspec/config.yaml Adds OpenSpec config scaffold.
openspec/changes/marker-discovery/.openspec.yaml OpenSpec metadata for the change.
openspec/changes/marker-discovery/proposal.md Proposal/spec for the marker discovery tool.
openspec/changes/marker-discovery/design.md Design doc describing architecture and behavior.
openspec/changes/marker-discovery/tasks.md Task breakdown/checklist for implementing the tool.

repos_to_clone = [r["name"] for r in all_repos if r["name"] in repo_names_to_clone]

cloned = github_client.clone_matching_repos(org, repos_to_clone, branch, temp_dir)
return cloned, len(all_repos)
Comment on lines +209 to +230
def clone_matching_repos(org, repo_names, branch, temp_dir):
"""Clone a list of repos with branch fallback.

Args:
org: GitHub organization name
repo_names: iterable of repo names to clone
branch: target branch name
temp_dir: directory to clone into

Returns list of dicts: {'name': str, 'path': str, 'branch': str}
"""
cloned = []
for repo_name in sorted(repo_names):
clone_path, actual_branch = clone_repo(org, repo_name, branch, temp_dir)
if clone_path:
cloned.append({
"name": repo_name,
"path": clone_path,
"branch": actual_branch,
})
logger.info("Cloned %d/%d repos for org %s", len(cloned), len(list(repo_names)), org)
return cloned
Comment on lines +102 to +123
def _find_shell_function_for_line(file_content, line_num):
"""Find what shell function a given line belongs to.

Returns (function_name, positional_arg_index) if the marker at line_num
uses a positional parameter like $1, $2, or None if not in a function.
"""
lines = file_content.split('\n')
if line_num < 1 or line_num > len(lines):
return None

# Walk backwards to find function definition
func_pattern = re.compile(r'^\s*(?:function\s+)?(\w+)\s*\(\s*\)')
for i in range(line_num - 1, -1, -1):
line = lines[i]
m = func_pattern.match(line)
if m:
return m.group(1)
# If we hit another closing brace at column 0, we've left the function
if line.strip() == '}' and i < line_num - 1:
return None

return None
file_path: str
line: int
api: str
source_type: str # "source" | "script" | "patch"
Comment on lines +47 to +160
def run_fast_path(org, branch, temp_dir):
"""Fast path: use search API to find repos with markers, then clone only those."""
logger.info("Fast path: searching for marker repos in %s via Search API...", org)

matching_repos = github_client.search_all_markers_in_org(org)
if not matching_repos:
logger.warning("No repos with markers found in %s via search", org)
return [], 0

logger.info("Found %d repos with potential markers, cloning...", len(matching_repos))

# Get clone URLs for matching repos
all_repos = github_client.list_org_repos(org)
repo_names_to_clone = {r for r in matching_repos}
repos_to_clone = [r["name"] for r in all_repos if r["name"] in repo_names_to_clone]

cloned = github_client.clone_matching_repos(org, repos_to_clone, branch, temp_dir)
return cloned, len(all_repos)


def run_full_path(org, branch, temp_dir):
"""Full path: list all repos, clone all, scan locally."""
logger.info("Full path: listing all repos in %s...", org)

all_repos = github_client.list_org_repos(org)
repo_names = [r["name"] for r in all_repos]

logger.info("Cloning %d repos on branch %s...", len(repo_names), branch)
cloned = github_client.clone_matching_repos(org, repo_names, branch, temp_dir)
return cloned, len(all_repos)


def scan_cloned_repos(cloned_repos):
"""Run all scanners on cloned repos. Returns list of MarkerRecord."""
all_markers = []

for repo_info in cloned_repos:
repo_path = repo_info["path"]
repo_name = repo_info["name"]

logger.info("Scanning %s...", repo_name)

# C/C++ source scanning (tree-sitter)
try:
markers = code_parser.scan_repo(repo_path, repo_name)
all_markers.extend(markers)
except Exception as e:
logger.warning("Code parser failed for %s: %s", repo_name, e)

# Script scanning
try:
markers = script_parser.scan_repo_scripts(repo_path, repo_name)
all_markers.extend(markers)
except Exception as e:
logger.warning("Script parser failed for %s: %s", repo_name, e)

# Patch scanning
try:
markers = patch_parser.scan_repo_patches(repo_path, repo_name)
all_markers.extend(markers)
except Exception as e:
logger.warning("Patch parser failed for %s: %s", repo_name, e)

return all_markers


def main(argv=None):
args = parse_args(argv)
setup_logging(args.verbose)

temp_dir = github_client.create_temp_dir()
logger.info("Using temp directory: %s", temp_dir)

try:
all_markers = []
total_repos_scanned = 0

for org in args.orgs:
# Choose strategy based on branch
if args.branch == "develop":
cloned, total_in_org = run_fast_path(org, args.branch, temp_dir)
else:
cloned, total_in_org = run_full_path(org, args.branch, temp_dir)

total_repos_scanned += total_in_org

# Scan cloned repos
markers = scan_cloned_repos(cloned)
all_markers.extend(markers)

# Generate report
report = report_generator.generate_report(
markers=all_markers,
branch=args.branch,
orgs=args.orgs,
components_scanned=total_repos_scanned,
)

# Output
if args.output:
with open(args.output, "w") as f:
f.write(report)
logger.info("Report written to %s", args.output)
else:
print(report)

logger.info("Done. Found %d markers total.", len(all_markers))
return 0

except Exception as e:
logger.error("Fatal error: %s", e, exc_info=True)
return 1

finally:
Comment on lines +1 to +16
# Telemetry Marker Inventory
**Branch**: develop
**Organizations**: rdkcentral
**Generated**: 2026-03-16 06:31:31 UTC

## Summary
- **Total Markers**: 913
- **Static Markers**: 909
- **Dynamic Markers**: 4 (contain shell variables)
- **Components Scanned**: 594
- **Duplicate Markers**: 56 ⚠️

## Marker Inventory
| Marker Name | Component | File Path | Line | API |
|-------------|-----------|-----------|------|-----|
| 2GRxPackets_split | OneWifi | scripts/process_monitor_atom.sh | 592 | t2ValNotify |
Comment on lines +1 to +37
"""Quick integration test against a real cloned repo."""
import sys
sys.path.insert(0, "/mnt/L2_CONTAINER_SHARED_VOLUME/59001")

from tools.marker_discovery.code_parser import scan_repo
from tools.marker_discovery.script_parser import scan_repo_scripts
from tools.marker_discovery.patch_parser import scan_repo_patches
from tools.marker_discovery.report_generator import generate_report

repo_path = "/tmp/test_telemetry"
repo_name = "telemetry"

print("=== Code Parser ===")
code_markers = scan_repo(repo_path, repo_name)
for m in code_markers[:10]:
print(f" {m.marker_name:40s} | {m.file_path:50s} | {m.line:5d} | {m.api}")
print(f" ... Total code markers: {len(code_markers)}")

print("\n=== Script Parser ===")
script_markers = scan_repo_scripts(repo_path, repo_name)
for m in script_markers[:10]:
print(f" {m.marker_name:40s} | {m.file_path:50s} | {m.line:5d} | {m.api}")
print(f" ... Total script markers: {len(script_markers)}")

print("\n=== Patch Parser ===")
patch_markers = scan_repo_patches(repo_path, repo_name)
for m in patch_markers[:10]:
print(f" {m.marker_name:40s} | {m.file_path:50s} | {m.line:5d} | {m.api}")
print(f" ... Total patch markers: {len(patch_markers)}")

all_markers = code_markers + script_markers + patch_markers
print(f"\n=== TOTAL: {len(all_markers)} markers ===")

print("\n=== Report Preview (first 30 lines) ===")
report = generate_report(all_markers, "develop", ["rdkcentral"], 1)
for line in report.split("\n")[:30]:
print(line)

def _create_parser():
"""Create a tree-sitter parser with C grammar."""
parser = Parser(C_LANGUAGE)
Signed-off-by: Yogeswaran K <yogeswaransky@gmail.com>
Signed-off-by: Yogeswaran K <yogeswaransky@gmail.com>
Copilot AI review requested due to automatic review settings March 29, 2026 18:42
"""Remove a directory tree, handling stubborn files (e.g. git-lfs)."""
def _on_error(_func, _path, _exc_info):
try:
os.chmod(_path, 0o777)

Check failure

Code scanning / CodeQL

Overly permissive file permissions High

Overly permissive mask in chmod sets file to world writable.

Copilot Autofix

AI 4 days ago

In general, to fix overly permissive file permissions, ensure that when you relax permissions to enable cleanup or access, you grant only the minimum rights necessary, typically to the owner only (e.g., 0o700 for directories, 0o600 for files) rather than world-readable/writable modes like 0o777.

In this specific case, _force_rmtree is trying to make stubborn files or directories deletable. Deleting a file or directory on POSIX primarily depends on write and execute permissions on the parent directory, but in practice tools often relax permissions on the problematic path as well. Instead of setting 0o777, we can set an owner-only mode that still ensures the current process can read/write/execute the path but does not grant rights to group/others. A simple and safe choice that works for both files and directories is 0o700 (rwx for owner only). That is sufficient to allow the process to modify or remove the entry without opening it to other users.

Concretely, in tools/marker_discovery/github_client.py, inside _force_rmtree, change the line os.chmod(_path, 0o777) to os.chmod(_path, 0o700). No new imports or helper functions are required, and this change does not alter the external behavior of _force_rmtree other than tightening permissions during error handling.

Suggested changeset 1
tools/marker_discovery/github_client.py

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/tools/marker_discovery/github_client.py b/tools/marker_discovery/github_client.py
--- a/tools/marker_discovery/github_client.py
+++ b/tools/marker_discovery/github_client.py
@@ -464,7 +464,7 @@
     """Remove a directory tree, handling stubborn files (e.g. git-lfs)."""
     def _on_error(_func, _path, _exc_info):
         try:
-            os.chmod(_path, 0o777)
+            os.chmod(_path, 0o700)
             _func(_path)
         except OSError:
             pass
EOF
@@ -464,7 +464,7 @@
"""Remove a directory tree, handling stubborn files (e.g. git-lfs)."""
def _on_error(_func, _path, _exc_info):
try:
os.chmod(_path, 0o777)
os.chmod(_path, 0o700)
_func(_path)
except OSError:
pass
Copilot is powered by AI and may make mistakes. Always verify output.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a Python-based “Marker Discovery Tool” under tools/marker_discovery/ to scan GitHub org repos for T2 telemetry marker instrumentation (C/C++ direct calls + wrapper resolution, script calls, and .patch additions) and emit a consolidated markdown inventory report.

Changes:

  • Introduces GitHub org enumeration + shallow cloning workflow and a CLI orchestrator to scan repos and generate a markdown report.
  • Adds parsers for C/C++ (tree-sitter), scripts (regex + limited positional arg resolution), and patch files (added-lines-only).
  • Adds pytest unit tests and supporting documentation/spec artifacts.

Reviewed changes

Copilot reviewed 22 out of 24 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
tools/marker_discovery/code_parser.py tree-sitter-based extraction of t2_event_* markers + wrapper detection/resolution.
tools/marker_discovery/github_client.py GitHub API client + cloning helpers (branch/commit fallback strategies).
tools/marker_discovery/marker_scanner.py CLI entry point/orchestration across orgs or an input manifest.
tools/marker_discovery/script_parser.py Scans non-C/non-binary files for t2ValNotify/t2CountNotify, flags dynamic markers, attempts $1 resolution.
tools/marker_discovery/patch_parser.py Scans .patch files for added-line marker calls (C + script APIs).
tools/marker_discovery/report_generator.py Markdown report generation, sorting, duplicates, dynamic markers, unresolved components.
tools/marker_discovery/init.py Defines MarkerRecord dataclass used across scanners/reporting.
tools/marker_discovery/main.py Enables python3 -m tools.marker_discovery execution.
tools/marker_discovery/requirements.txt Python dependencies for the scanner (tree-sitter + requests).
tools/marker_discovery/README.md Tool documentation, usage, and design details.
tools/marker_discovery/tests/test_code_parser.py Unit tests for direct call extraction, wrapper detection, wrapper call resolution.
tools/marker_discovery/tests/test_script_parser.py Unit tests for script parsing, dynamic marker classification, positional arg resolution.
tools/marker_discovery/tests/test_patch_parser.py Unit tests for .patch scanning behavior.
tools/marker_discovery/tests/test_report_generator.py Unit tests for markdown report formatting, sorting, duplicates, dynamic/unresolved sections.
tools/marker_discovery/tests/integration_test.py Adds a “quick integration test” script (currently executed at import time).
tools/marker_discovery/tests/debug_wrappers.py Adds a wrapper-debug helper script.
tools/marker_discovery/tests/init.py Marks test package.
report.md Checked-in generated report output (sample inventory).
report6.md Checked-in generated report output (sample inventory).
openspec/config.yaml Adds OpenSpec configuration scaffold.
openspec/changes/marker-discovery/tasks.md Spec-driven task breakdown for the feature.
openspec/changes/marker-discovery/proposal.md Proposal/spec for the marker discovery tool.
openspec/changes/marker-discovery/design.md Design document for architecture and module responsibilities.
openspec/changes/marker-discovery/.openspec.yaml OpenSpec metadata for the change.

Comment on lines +1 to +12
# Telemetry Marker Inventory
**Branch**: per-component (from versions-e.txt)
**Organizations**: rdkcentral, rdk-e, rdk-common, rdk-gdcs
**Generated**: 2026-03-29 18:30:14 UTC

## Summary
- **Total Markers**: 491
- **Static Markers**: 487
- **Dynamic Markers**: 4 (contain shell variables)
- **Components Scanned**: 196
- **Duplicate Markers**: 43 ⚠️

Copy link

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These appear to be generated output reports (large, time-stamped inventories). Committing generated scan artifacts will quickly become stale and inflates the repository/PR diff. Suggest removing these from source control (or moving to a docs/examples location with clear naming) and generating them as build artifacts instead.

Copilot uses AI. Check for mistakes.
Comment on lines +1 to +11
# Telemetry Marker Inventory
**Branch**: develop
**Organizations**: rdkcentral
**Generated**: 2026-03-16 06:31:31 UTC

## Summary
- **Total Markers**: 913
- **Static Markers**: 909
- **Dynamic Markers**: 4 (contain shell variables)
- **Components Scanned**: 594
- **Duplicate Markers**: 56 ⚠️
Copy link

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a generated, time-stamped output report rather than source. Keeping it committed will go stale and adds a very large diff/repo footprint. Recommend removing it from version control and generating it on demand (or storing a small, curated example under docs/ if needed).

Copilot uses AI. Check for mistakes.
Comment on lines +3 to +37
sys.path.insert(0, "/mnt/L2_CONTAINER_SHARED_VOLUME/59001")

from tools.marker_discovery.code_parser import scan_repo
from tools.marker_discovery.script_parser import scan_repo_scripts
from tools.marker_discovery.patch_parser import scan_repo_patches
from tools.marker_discovery.report_generator import generate_report

repo_path = "/tmp/test_telemetry"
repo_name = "telemetry"

print("=== Code Parser ===")
code_markers = scan_repo(repo_path, repo_name)
for m in code_markers[:10]:
print(f" {m.marker_name:40s} | {m.file_path:50s} | {m.line:5d} | {m.api}")
print(f" ... Total code markers: {len(code_markers)}")

print("\n=== Script Parser ===")
script_markers = scan_repo_scripts(repo_path, repo_name)
for m in script_markers[:10]:
print(f" {m.marker_name:40s} | {m.file_path:50s} | {m.line:5d} | {m.api}")
print(f" ... Total script markers: {len(script_markers)}")

print("\n=== Patch Parser ===")
patch_markers = scan_repo_patches(repo_path, repo_name)
for m in patch_markers[:10]:
print(f" {m.marker_name:40s} | {m.file_path:50s} | {m.line:5d} | {m.api}")
print(f" ... Total patch markers: {len(patch_markers)}")

all_markers = code_markers + script_markers + patch_markers
print(f"\n=== TOTAL: {len(all_markers)} markers ===")

print("\n=== Report Preview (first 30 lines) ===")
report = generate_report(all_markers, "develop", ["rdkcentral"], 1)
for line in report.split("\n")[:30]:
print(line)
Copy link

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This module will be collected by pytest (matches *_test.py) and executes network/FS-dependent code at import time (hardcoded sys.path, /tmp/test_telemetry, print loops). This will break pytest runs in CI. Convert this into a proper test with fixtures and pytest.mark.integration + opt-in skipping, or move it out of tests/ and guard execution with if __name__ == "__main__":.

Suggested change
sys.path.insert(0, "/mnt/L2_CONTAINER_SHARED_VOLUME/59001")
from tools.marker_discovery.code_parser import scan_repo
from tools.marker_discovery.script_parser import scan_repo_scripts
from tools.marker_discovery.patch_parser import scan_repo_patches
from tools.marker_discovery.report_generator import generate_report
repo_path = "/tmp/test_telemetry"
repo_name = "telemetry"
print("=== Code Parser ===")
code_markers = scan_repo(repo_path, repo_name)
for m in code_markers[:10]:
print(f" {m.marker_name:40s} | {m.file_path:50s} | {m.line:5d} | {m.api}")
print(f" ... Total code markers: {len(code_markers)}")
print("\n=== Script Parser ===")
script_markers = scan_repo_scripts(repo_path, repo_name)
for m in script_markers[:10]:
print(f" {m.marker_name:40s} | {m.file_path:50s} | {m.line:5d} | {m.api}")
print(f" ... Total script markers: {len(script_markers)}")
print("\n=== Patch Parser ===")
patch_markers = scan_repo_patches(repo_path, repo_name)
for m in patch_markers[:10]:
print(f" {m.marker_name:40s} | {m.file_path:50s} | {m.line:5d} | {m.api}")
print(f" ... Total patch markers: {len(patch_markers)}")
all_markers = code_markers + script_markers + patch_markers
print(f"\n=== TOTAL: {len(all_markers)} markers ===")
print("\n=== Report Preview (first 30 lines) ===")
report = generate_report(all_markers, "develop", ["rdkcentral"], 1)
for line in report.split("\n")[:30]:
print(line)
def main() -> None:
# Adjust sys.path so local tools package can be imported when run as a script.
sys.path.insert(0, "/mnt/L2_CONTAINER_SHARED_VOLUME/59001")
from tools.marker_discovery.code_parser import scan_repo
from tools.marker_discovery.script_parser import scan_repo_scripts
from tools.marker_discovery.patch_parser import scan_repo_patches
from tools.marker_discovery.report_generator import generate_report
repo_path = "/tmp/test_telemetry"
repo_name = "telemetry"
print("=== Code Parser ===")
code_markers = scan_repo(repo_path, repo_name)
for m in code_markers[:10]:
print(f" {m.marker_name:40s} | {m.file_path:50s} | {m.line:5d} | {m.api}")
print(f" ... Total code markers: {len(code_markers)}")
print("\n=== Script Parser ===")
script_markers = scan_repo_scripts(repo_path, repo_name)
for m in script_markers[:10]:
print(f" {m.marker_name:40s} | {m.file_path:50s} | {m.line:5d} | {m.api}")
print(f" ... Total script markers: {len(script_markers)}")
print("\n=== Patch Parser ===")
patch_markers = scan_repo_patches(repo_path, repo_name)
for m in patch_markers[:10]:
print(f" {m.marker_name:40s} | {m.file_path:50s} | {m.line:5d} | {m.api}")
print(f" ... Total patch markers: {len(patch_markers)}")
all_markers = code_markers + script_markers + patch_markers
print(f"\n=== TOTAL: {len(all_markers)} markers ===")
print("\n=== Report Preview (first 30 lines) ===")
report = generate_report(all_markers, "develop", ["rdkcentral"], 1)
for line in report.split("\n")[:30]:
print(line)
if __name__ == "__main__":
main()

Copilot uses AI. Check for mistakes.
Comment on lines +2 to +6
sys.path.insert(0, "/mnt/L2_CONTAINER_SHARED_VOLUME/59001")
from tools.marker_discovery.code_parser import detect_wrappers
ws = detect_wrappers("/tmp/test_telemetry")
for w in ws:
print(w)
Copy link

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This helper script hardcodes a container-specific sys.path and a local path (/tmp/test_telemetry). Even if it isn't collected by pytest today, keeping this under tests/ is brittle and encourages non-reproducible debugging paths. Consider moving it to a separate scripts/ or devtools/ directory and/or making the paths configurable via CLI args/env vars.

Suggested change
sys.path.insert(0, "/mnt/L2_CONTAINER_SHARED_VOLUME/59001")
from tools.marker_discovery.code_parser import detect_wrappers
ws = detect_wrappers("/tmp/test_telemetry")
for w in ws:
print(w)
import os
import argparse
def main() -> None:
parser = argparse.ArgumentParser(
description="Debug helper for listing marker discovery wrappers."
)
parser.add_argument(
"--code-path",
dest="code_path",
default=os.environ.get("TELEMETRY_CODE_PATH"),
help=(
"Optional path to prepend to sys.path so that "
"tools.marker_discovery can be imported "
"(default: value of TELEMETRY_CODE_PATH, if set)."
),
)
parser.add_argument(
"--workspace",
dest="workspace",
default=os.environ.get("TELEMETRY_WORKSPACE", "/tmp/test_telemetry"),
help=(
"Path to the telemetry workspace to scan for wrappers "
"(default: TELEMETRY_WORKSPACE env var or /tmp/test_telemetry)."
),
)
args = parser.parse_args()
if args.code_path:
sys.path.insert(0, args.code_path)
from tools.marker_discovery.code_parser import detect_wrappers
ws = detect_wrappers(args.workspace)
for w in ws:
print(w)
if __name__ == "__main__":
main()

Copilot uses AI. Check for mistakes.
file_path: str
line: int
api: str
source_type: str # "source" | "script" | "patch"
Copy link

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MarkerRecord.source_type is documented as only "source" | "script" | "patch", but the implementation/reporting also uses "script_dynamic". Update this comment/type annotation to reflect the actual allowed values to avoid confusing callers and test writers.

Suggested change
source_type: str # "source" | "script" | "patch"
source_type: str # "source" | "script" | "patch" | "script_dynamic"

Copilot uses AI. Check for mistakes.
Comment on lines +105 to +106
Returns (function_name, positional_arg_index) if the marker at line_num
uses a positional parameter like $1, $2, or None if not in a function.
Copy link

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring for _find_shell_function_for_line says it returns (function_name, positional_arg_index) but the function actually returns only the function name (or None). Please correct the docstring to match behavior (or adjust the function to return the documented tuple).

Suggested change
Returns (function_name, positional_arg_index) if the marker at line_num
uses a positional parameter like $1, $2, or None if not in a function.
Returns the name of the enclosing shell function as a string, or None if
the line is not inside a function definition.

Copilot uses AI. Check for mistakes.
Comment on lines +301 to +322
def clone_matching_repos(org, repo_names, branch, temp_dir):
"""Clone a list of repos with branch fallback.

Args:
org: GitHub organization name
repo_names: iterable of repo names to clone
branch: target branch name
temp_dir: directory to clone into

Returns list of dicts: {'name': str, 'path': str, 'branch': str}
"""
cloned = []
for repo_name in sorted(repo_names):
clone_path, actual_branch = clone_repo(org, repo_name, branch, temp_dir)
if clone_path:
cloned.append({
"name": repo_name,
"path": clone_path,
"branch": actual_branch,
})
logger.info("Cloned %d/%d repos for org %s", len(cloned), len(list(repo_names)), org)
return cloned
Copy link

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clone_matching_repos logs the total repo count using len(list(repo_names)), which can (a) consume an iterator/generator and report an incorrect total, and (b) allocate an extra list unnecessarily. Compute the total once up front (e.g., repo_names = list(repo_names) at entry) and reuse it for both sorting and logging.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants