RDK-59001: [POC] Create Marker Instrumentation Discovery Tool #288
yogeswaransky wants to merge 3 commits into develop from
Conversation
Signed-off-by: Yogeswaran K <yogeswaransky@gmail.com>
Pull request overview
Adds a POC Python-based “marker discovery” CLI under tools/marker_discovery/ to inventory T2 telemetry marker instrumentation across repos (C/C++ direct calls + wrapper resolution, script calls, and .patch additions) and emit a consolidated Markdown report.
Changes:
- Introduces GitHub org scanning + shallow clone orchestration, plus C/script/patch marker scanners.
- Adds Markdown report generation (sorting, duplicate detection, dynamic marker section).
- Adds pytest unit tests and supporting spec/docs artifacts (plus a sample generated report.md).
Reviewed changes
Copilot reviewed 20 out of 22 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| tools/marker_discovery/__init__.py | Defines the MarkerRecord data model used across scanners/reporting. |
| tools/marker_discovery/__main__.py | Enables python -m tools.marker_discovery execution. |
| tools/marker_discovery/code_parser.py | tree-sitter-based C/C++ parser for direct marker calls + wrapper detection/resolution. |
| tools/marker_discovery/github_client.py | GitHub API client for repo listing/search, plus shallow cloning with branch fallback. |
| tools/marker_discovery/marker_scanner.py | CLI entry point and orchestration across orgs/repos; runs all scanners and generates report. |
| tools/marker_discovery/patch_parser.py | Scans added lines in .patch files for marker API usage. |
| tools/marker_discovery/report_generator.py | Produces the Markdown inventory report including duplicate/dynamic sections. |
| tools/marker_discovery/script_parser.py | Scans non-C/patch files for t2ValNotify/t2CountNotify, flags dynamic markers, resolves $1 for simple wrappers. |
| tools/marker_discovery/requirements.txt | Declares Python dependencies for the tool. |
| tools/marker_discovery/tests/__init__.py | Test package marker. |
| tools/marker_discovery/tests/test_code_parser.py | Unit tests for direct-call extraction and wrapper call-site resolution. |
| tools/marker_discovery/tests/test_script_parser.py | Unit tests for script marker extraction, dynamic detection, and positional arg resolution. |
| tools/marker_discovery/tests/test_patch_parser.py | Unit tests for patch scanning (added lines only) across C + script APIs. |
| tools/marker_discovery/tests/test_report_generator.py | Unit tests for report structure, sorting, duplicates, and dynamic marker handling. |
| tools/marker_discovery/tests/integration_test.py | Adds a manual “integration test” style script (currently placed under tests). |
| tools/marker_discovery/tests/debug_wrappers.py | Adds a local debugging helper script for wrapper detection. |
| report.md | Sample/generated marker inventory report output. |
| openspec/config.yaml | Adds OpenSpec config scaffold. |
| openspec/changes/marker-discovery/.openspec.yaml | OpenSpec metadata for the change. |
| openspec/changes/marker-discovery/proposal.md | Proposal/spec for the marker discovery tool. |
| openspec/changes/marker-discovery/design.md | Design doc describing architecture and behavior. |
| openspec/changes/marker-discovery/tasks.md | Task breakdown/checklist for implementing the tool. |
```python
    repos_to_clone = [r["name"] for r in all_repos if r["name"] in repo_names_to_clone]

    cloned = github_client.clone_matching_repos(org, repos_to_clone, branch, temp_dir)
    return cloned, len(all_repos)
```
```python
def clone_matching_repos(org, repo_names, branch, temp_dir):
    """Clone a list of repos with branch fallback.

    Args:
        org: GitHub organization name
        repo_names: iterable of repo names to clone
        branch: target branch name
        temp_dir: directory to clone into

    Returns list of dicts: {'name': str, 'path': str, 'branch': str}
    """
    cloned = []
    for repo_name in sorted(repo_names):
        clone_path, actual_branch = clone_repo(org, repo_name, branch, temp_dir)
        if clone_path:
            cloned.append({
                "name": repo_name,
                "path": clone_path,
                "branch": actual_branch,
            })
    logger.info("Cloned %d/%d repos for org %s", len(cloned), len(list(repo_names)), org)
    return cloned
```
```python
def _find_shell_function_for_line(file_content, line_num):
    """Find what shell function a given line belongs to.

    Returns (function_name, positional_arg_index) if the marker at line_num
    uses a positional parameter like $1, $2, or None if not in a function.
    """
    lines = file_content.split('\n')
    if line_num < 1 or line_num > len(lines):
        return None

    # Walk backwards to find function definition
    func_pattern = re.compile(r'^\s*(?:function\s+)?(\w+)\s*\(\s*\)')
    for i in range(line_num - 1, -1, -1):
        line = lines[i]
        m = func_pattern.match(line)
        if m:
            return m.group(1)
        # If we hit another closing brace at column 0, we've left the function
        if line.strip() == '}' and i < line_num - 1:
            return None

    return None
```
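As a quick sanity check of the backward walk above, here is a self-contained re-sketch (same regex and loop, reproduced so it runs standalone) that resolves the enclosing shell function for a marker line:

```python
import re

# Matches "name()" or "function name()" at the start of a line.
FUNC_PATTERN = re.compile(r'^\s*(?:function\s+)?(\w+)\s*\(\s*\)')

def find_enclosing_function(file_content, line_num):
    """Walk backwards from a 1-based line number to the nearest function header."""
    lines = file_content.split('\n')
    if line_num < 1 or line_num > len(lines):
        return None
    for i in range(line_num - 1, -1, -1):
        line = lines[i]
        m = FUNC_PATTERN.match(line)
        if m:
            return m.group(1)
        # A bare closing brace before the header means we left the function body.
        if line.strip() == '}' and i < line_num - 1:
            return None
    return None

script = 'upload_logs()\n{\n  t2ValNotify "SomeMarker" "$1"\n}\n'
print(find_enclosing_function(script, 3))  # upload_logs
```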
```python
    file_path: str
    line: int
    api: str
    source_type: str  # "source" | "script" | "patch"
```
```python
def run_fast_path(org, branch, temp_dir):
    """Fast path: use search API to find repos with markers, then clone only those."""
    logger.info("Fast path: searching for marker repos in %s via Search API...", org)

    matching_repos = github_client.search_all_markers_in_org(org)
    if not matching_repos:
        logger.warning("No repos with markers found in %s via search", org)
        return [], 0

    logger.info("Found %d repos with potential markers, cloning...", len(matching_repos))

    # Get clone URLs for matching repos
    all_repos = github_client.list_org_repos(org)
    repo_names_to_clone = {r for r in matching_repos}
    repos_to_clone = [r["name"] for r in all_repos if r["name"] in repo_names_to_clone]

    cloned = github_client.clone_matching_repos(org, repos_to_clone, branch, temp_dir)
    return cloned, len(all_repos)


def run_full_path(org, branch, temp_dir):
    """Full path: list all repos, clone all, scan locally."""
    logger.info("Full path: listing all repos in %s...", org)

    all_repos = github_client.list_org_repos(org)
    repo_names = [r["name"] for r in all_repos]

    logger.info("Cloning %d repos on branch %s...", len(repo_names), branch)
    cloned = github_client.clone_matching_repos(org, repo_names, branch, temp_dir)
    return cloned, len(all_repos)


def scan_cloned_repos(cloned_repos):
    """Run all scanners on cloned repos. Returns list of MarkerRecord."""
    all_markers = []

    for repo_info in cloned_repos:
        repo_path = repo_info["path"]
        repo_name = repo_info["name"]

        logger.info("Scanning %s...", repo_name)

        # C/C++ source scanning (tree-sitter)
        try:
            markers = code_parser.scan_repo(repo_path, repo_name)
            all_markers.extend(markers)
        except Exception as e:
            logger.warning("Code parser failed for %s: %s", repo_name, e)

        # Script scanning
        try:
            markers = script_parser.scan_repo_scripts(repo_path, repo_name)
            all_markers.extend(markers)
        except Exception as e:
            logger.warning("Script parser failed for %s: %s", repo_name, e)

        # Patch scanning
        try:
            markers = patch_parser.scan_repo_patches(repo_path, repo_name)
            all_markers.extend(markers)
        except Exception as e:
            logger.warning("Patch parser failed for %s: %s", repo_name, e)

    return all_markers


def main(argv=None):
    args = parse_args(argv)
    setup_logging(args.verbose)

    temp_dir = github_client.create_temp_dir()
    logger.info("Using temp directory: %s", temp_dir)

    try:
        all_markers = []
        total_repos_scanned = 0

        for org in args.orgs:
            # Choose strategy based on branch
            if args.branch == "develop":
                cloned, total_in_org = run_fast_path(org, args.branch, temp_dir)
            else:
                cloned, total_in_org = run_full_path(org, args.branch, temp_dir)

            total_repos_scanned += total_in_org

            # Scan cloned repos
            markers = scan_cloned_repos(cloned)
            all_markers.extend(markers)

        # Generate report
        report = report_generator.generate_report(
            markers=all_markers,
            branch=args.branch,
            orgs=args.orgs,
            components_scanned=total_repos_scanned,
        )

        # Output
        if args.output:
            with open(args.output, "w") as f:
                f.write(report)
            logger.info("Report written to %s", args.output)
        else:
            print(report)

        logger.info("Done. Found %d markers total.", len(all_markers))
        return 0

    except Exception as e:
        logger.error("Fatal error: %s", e, exc_info=True)
        return 1

    finally:
```
```markdown
# Telemetry Marker Inventory
**Branch**: develop
**Organizations**: rdkcentral
**Generated**: 2026-03-16 06:31:31 UTC

## Summary
- **Total Markers**: 913
- **Static Markers**: 909
- **Dynamic Markers**: 4 (contain shell variables)
- **Components Scanned**: 594
- **Duplicate Markers**: 56 ⚠️

## Marker Inventory
| Marker Name | Component | File Path | Line | API |
|-------------|-----------|-----------|------|-----|
| 2GRxPackets_split | OneWifi | scripts/process_monitor_atom.sh | 592 | t2ValNotify |
```
| """Quick integration test against a real cloned repo.""" | ||
| import sys | ||
| sys.path.insert(0, "/mnt/L2_CONTAINER_SHARED_VOLUME/59001") | ||
|
|
||
| from tools.marker_discovery.code_parser import scan_repo | ||
| from tools.marker_discovery.script_parser import scan_repo_scripts | ||
| from tools.marker_discovery.patch_parser import scan_repo_patches | ||
| from tools.marker_discovery.report_generator import generate_report | ||
|
|
||
| repo_path = "/tmp/test_telemetry" | ||
| repo_name = "telemetry" | ||
|
|
||
| print("=== Code Parser ===") | ||
| code_markers = scan_repo(repo_path, repo_name) | ||
| for m in code_markers[:10]: | ||
| print(f" {m.marker_name:40s} | {m.file_path:50s} | {m.line:5d} | {m.api}") | ||
| print(f" ... Total code markers: {len(code_markers)}") | ||
|
|
||
| print("\n=== Script Parser ===") | ||
| script_markers = scan_repo_scripts(repo_path, repo_name) | ||
| for m in script_markers[:10]: | ||
| print(f" {m.marker_name:40s} | {m.file_path:50s} | {m.line:5d} | {m.api}") | ||
| print(f" ... Total script markers: {len(script_markers)}") | ||
|
|
||
| print("\n=== Patch Parser ===") | ||
| patch_markers = scan_repo_patches(repo_path, repo_name) | ||
| for m in patch_markers[:10]: | ||
| print(f" {m.marker_name:40s} | {m.file_path:50s} | {m.line:5d} | {m.api}") | ||
| print(f" ... Total patch markers: {len(patch_markers)}") | ||
|
|
||
| all_markers = code_markers + script_markers + patch_markers | ||
| print(f"\n=== TOTAL: {len(all_markers)} markers ===") | ||
|
|
||
| print("\n=== Report Preview (first 30 lines) ===") | ||
| report = generate_report(all_markers, "develop", ["rdkcentral"], 1) | ||
| for line in report.split("\n")[:30]: | ||
| print(line) |
```python
def _create_parser():
    """Create a tree-sitter parser with C grammar."""
    parser = Parser(C_LANGUAGE)
```
| """Remove a directory tree, handling stubborn files (e.g. git-lfs).""" | ||
| def _on_error(_func, _path, _exc_info): | ||
| try: | ||
| os.chmod(_path, 0o777) |
Check failure — Code scanning / CodeQL: **Overly permissive file permissions** (High)

Copilot Autofix (AI, 4 days ago):
In general, to fix overly permissive file permissions, ensure that when you relax permissions to enable cleanup or access, you grant only the minimum rights necessary, typically to the owner only (e.g., `0o700` for directories, `0o600` for files) rather than world-readable/writable modes like `0o777`.
In this specific case, `_force_rmtree` is trying to make stubborn files or directories deletable. Deleting a file or directory on POSIX primarily depends on write and execute permissions on the parent directory, but in practice tools often relax permissions on the problematic path as well. Instead of setting `0o777`, we can set an owner-only mode that still ensures the current process can read/write/execute the path but does not grant rights to group/others. A simple and safe choice that works for both files and directories is `0o700` (rwx for owner only). That is sufficient to allow the process to modify or remove the entry without opening it to other users.
Concretely, in `tools/marker_discovery/github_client.py`, inside `_force_rmtree`, change the line `os.chmod(_path, 0o777)` to `os.chmod(_path, 0o700)`. No new imports or helper functions are required, and this change does not alter the external behavior of `_force_rmtree` other than tightening permissions during error handling.
```diff
@@ -464,7 +464,7 @@
     """Remove a directory tree, handling stubborn files (e.g. git-lfs)."""
     def _on_error(_func, _path, _exc_info):
         try:
-            os.chmod(_path, 0o777)
+            os.chmod(_path, 0o700)
             _func(_path)
         except OSError:
             pass
```
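For reference, a minimal standalone sketch of the same cleanup pattern (a hypothetical `force_rmtree`, not the PR's exact helper — this variant also restores owner permissions on the parent directory, since on POSIX deletability depends on the parent's write/execute bits):

```python
import os
import shutil
import tempfile

def force_rmtree(path):
    """Remove a tree even when directories have had write permission stripped."""
    def _on_error(func, failed_path, _exc_info):
        # Owner-only 0o700 is enough for the current process to retry the
        # failed operation; there is no need for world-writable 0o777.
        os.chmod(os.path.dirname(failed_path), 0o700)
        try:
            os.chmod(failed_path, 0o700)
        except OSError:
            pass
        func(failed_path)  # retry the operation that failed
    shutil.rmtree(path, onerror=_on_error)

# Usage: build a tree with an unwritable subdirectory, then remove it.
root = tempfile.mkdtemp()
sub = os.path.join(root, "sub")
os.mkdir(sub)
open(os.path.join(sub, "f"), "w").close()
os.chmod(sub, 0o500)  # read/traverse only: a plain rmtree would normally fail
force_rmtree(root)
```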
Pull request overview
Adds a Python-based “Marker Discovery Tool” under tools/marker_discovery/ to scan GitHub org repos for T2 telemetry marker instrumentation (C/C++ direct calls + wrapper resolution, script calls, and .patch additions) and emit a consolidated markdown inventory report.
Changes:
- Introduces GitHub org enumeration + shallow cloning workflow and a CLI orchestrator to scan repos and generate a markdown report.
- Adds parsers for C/C++ (tree-sitter), scripts (regex + limited positional arg resolution), and patch files (added-lines-only).
- Adds pytest unit tests and supporting documentation/spec artifacts.
Reviewed changes
Copilot reviewed 22 out of 24 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| tools/marker_discovery/code_parser.py | tree-sitter-based extraction of t2_event_* markers + wrapper detection/resolution. |
| tools/marker_discovery/github_client.py | GitHub API client + cloning helpers (branch/commit fallback strategies). |
| tools/marker_discovery/marker_scanner.py | CLI entry point/orchestration across orgs or an input manifest. |
| tools/marker_discovery/script_parser.py | Scans non-C/non-binary files for t2ValNotify/t2CountNotify, flags dynamic markers, attempts $1 resolution. |
| tools/marker_discovery/patch_parser.py | Scans .patch files for added-line marker calls (C + script APIs). |
| tools/marker_discovery/report_generator.py | Markdown report generation, sorting, duplicates, dynamic markers, unresolved components. |
| tools/marker_discovery/__init__.py | Defines MarkerRecord dataclass used across scanners/reporting. |
| tools/marker_discovery/__main__.py | Enables python3 -m tools.marker_discovery execution. |
| tools/marker_discovery/requirements.txt | Python dependencies for the scanner (tree-sitter + requests). |
| tools/marker_discovery/README.md | Tool documentation, usage, and design details. |
| tools/marker_discovery/tests/test_code_parser.py | Unit tests for direct call extraction, wrapper detection, wrapper call resolution. |
| tools/marker_discovery/tests/test_script_parser.py | Unit tests for script parsing, dynamic marker classification, positional arg resolution. |
| tools/marker_discovery/tests/test_patch_parser.py | Unit tests for .patch scanning behavior. |
| tools/marker_discovery/tests/test_report_generator.py | Unit tests for markdown report formatting, sorting, duplicates, dynamic/unresolved sections. |
| tools/marker_discovery/tests/integration_test.py | Adds a “quick integration test” script (currently executed at import time). |
| tools/marker_discovery/tests/debug_wrappers.py | Adds a wrapper-debug helper script. |
| tools/marker_discovery/tests/__init__.py | Marks test package. |
| report.md | Checked-in generated report output (sample inventory). |
| report6.md | Checked-in generated report output (sample inventory). |
| openspec/config.yaml | Adds OpenSpec configuration scaffold. |
| openspec/changes/marker-discovery/tasks.md | Spec-driven task breakdown for the feature. |
| openspec/changes/marker-discovery/proposal.md | Proposal/spec for the marker discovery tool. |
| openspec/changes/marker-discovery/design.md | Design document for architecture and module responsibilities. |
| openspec/changes/marker-discovery/.openspec.yaml | OpenSpec metadata for the change. |
```markdown
# Telemetry Marker Inventory
**Branch**: per-component (from versions-e.txt)
**Organizations**: rdkcentral, rdk-e, rdk-common, rdk-gdcs
**Generated**: 2026-03-29 18:30:14 UTC

## Summary
- **Total Markers**: 491
- **Static Markers**: 487
- **Dynamic Markers**: 4 (contain shell variables)
- **Components Scanned**: 196
- **Duplicate Markers**: 43 ⚠️
```
These appear to be generated output reports (large, time-stamped inventories). Committing generated scan artifacts will quickly become stale and inflates the repository/PR diff. Suggest removing these from source control (or moving to a docs/examples location with clear naming) and generating them as build artifacts instead.
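If the reports are kept out of version control as suggested, a one-line ignore rule would prevent them from being re-added accidentally (assuming the generated files keep the `report*.md` naming seen in this PR):

```gitignore
# Generated marker inventory reports (report.md, report6.md, ...)
report*.md
```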
```markdown
# Telemetry Marker Inventory
**Branch**: develop
**Organizations**: rdkcentral
**Generated**: 2026-03-16 06:31:31 UTC

## Summary
- **Total Markers**: 913
- **Static Markers**: 909
- **Dynamic Markers**: 4 (contain shell variables)
- **Components Scanned**: 594
- **Duplicate Markers**: 56 ⚠️
```
This looks like a generated, time-stamped output report rather than source. Keeping it committed will go stale and adds a very large diff/repo footprint. Recommend removing it from version control and generating it on demand (or storing a small, curated example under docs/ if needed).
```python
sys.path.insert(0, "/mnt/L2_CONTAINER_SHARED_VOLUME/59001")

from tools.marker_discovery.code_parser import scan_repo
from tools.marker_discovery.script_parser import scan_repo_scripts
from tools.marker_discovery.patch_parser import scan_repo_patches
from tools.marker_discovery.report_generator import generate_report

repo_path = "/tmp/test_telemetry"
repo_name = "telemetry"

print("=== Code Parser ===")
code_markers = scan_repo(repo_path, repo_name)
for m in code_markers[:10]:
    print(f"  {m.marker_name:40s} | {m.file_path:50s} | {m.line:5d} | {m.api}")
print(f"  ... Total code markers: {len(code_markers)}")

print("\n=== Script Parser ===")
script_markers = scan_repo_scripts(repo_path, repo_name)
for m in script_markers[:10]:
    print(f"  {m.marker_name:40s} | {m.file_path:50s} | {m.line:5d} | {m.api}")
print(f"  ... Total script markers: {len(script_markers)}")

print("\n=== Patch Parser ===")
patch_markers = scan_repo_patches(repo_path, repo_name)
for m in patch_markers[:10]:
    print(f"  {m.marker_name:40s} | {m.file_path:50s} | {m.line:5d} | {m.api}")
print(f"  ... Total patch markers: {len(patch_markers)}")

all_markers = code_markers + script_markers + patch_markers
print(f"\n=== TOTAL: {len(all_markers)} markers ===")

print("\n=== Report Preview (first 30 lines) ===")
report = generate_report(all_markers, "develop", ["rdkcentral"], 1)
for line in report.split("\n")[:30]:
    print(line)
```
This module will be collected by pytest (matches *_test.py) and executes network/FS-dependent code at import time (hardcoded sys.path, /tmp/test_telemetry, print loops). This will break pytest runs in CI. Convert this into a proper test with fixtures and pytest.mark.integration + opt-in skipping, or move it out of tests/ and guard execution with if __name__ == "__main__":.
Suggested change:

```diff
-sys.path.insert(0, "/mnt/L2_CONTAINER_SHARED_VOLUME/59001")
-
-from tools.marker_discovery.code_parser import scan_repo
-from tools.marker_discovery.script_parser import scan_repo_scripts
-from tools.marker_discovery.patch_parser import scan_repo_patches
-from tools.marker_discovery.report_generator import generate_report
-
-repo_path = "/tmp/test_telemetry"
-repo_name = "telemetry"
-
-print("=== Code Parser ===")
-code_markers = scan_repo(repo_path, repo_name)
-for m in code_markers[:10]:
-    print(f"  {m.marker_name:40s} | {m.file_path:50s} | {m.line:5d} | {m.api}")
-print(f"  ... Total code markers: {len(code_markers)}")
-
-print("\n=== Script Parser ===")
-script_markers = scan_repo_scripts(repo_path, repo_name)
-for m in script_markers[:10]:
-    print(f"  {m.marker_name:40s} | {m.file_path:50s} | {m.line:5d} | {m.api}")
-print(f"  ... Total script markers: {len(script_markers)}")
-
-print("\n=== Patch Parser ===")
-patch_markers = scan_repo_patches(repo_path, repo_name)
-for m in patch_markers[:10]:
-    print(f"  {m.marker_name:40s} | {m.file_path:50s} | {m.line:5d} | {m.api}")
-print(f"  ... Total patch markers: {len(patch_markers)}")
-
-all_markers = code_markers + script_markers + patch_markers
-print(f"\n=== TOTAL: {len(all_markers)} markers ===")
-
-print("\n=== Report Preview (first 30 lines) ===")
-report = generate_report(all_markers, "develop", ["rdkcentral"], 1)
-for line in report.split("\n")[:30]:
-    print(line)
+def main() -> None:
+    # Adjust sys.path so local tools package can be imported when run as a script.
+    sys.path.insert(0, "/mnt/L2_CONTAINER_SHARED_VOLUME/59001")
+
+    from tools.marker_discovery.code_parser import scan_repo
+    from tools.marker_discovery.script_parser import scan_repo_scripts
+    from tools.marker_discovery.patch_parser import scan_repo_patches
+    from tools.marker_discovery.report_generator import generate_report
+
+    repo_path = "/tmp/test_telemetry"
+    repo_name = "telemetry"
+
+    print("=== Code Parser ===")
+    code_markers = scan_repo(repo_path, repo_name)
+    for m in code_markers[:10]:
+        print(f"  {m.marker_name:40s} | {m.file_path:50s} | {m.line:5d} | {m.api}")
+    print(f"  ... Total code markers: {len(code_markers)}")
+
+    print("\n=== Script Parser ===")
+    script_markers = scan_repo_scripts(repo_path, repo_name)
+    for m in script_markers[:10]:
+        print(f"  {m.marker_name:40s} | {m.file_path:50s} | {m.line:5d} | {m.api}")
+    print(f"  ... Total script markers: {len(script_markers)}")
+
+    print("\n=== Patch Parser ===")
+    patch_markers = scan_repo_patches(repo_path, repo_name)
+    for m in patch_markers[:10]:
+        print(f"  {m.marker_name:40s} | {m.file_path:50s} | {m.line:5d} | {m.api}")
+    print(f"  ... Total patch markers: {len(patch_markers)}")
+
+    all_markers = code_markers + script_markers + patch_markers
+    print(f"\n=== TOTAL: {len(all_markers)} markers ===")
+
+    print("\n=== Report Preview (first 30 lines) ===")
+    report = generate_report(all_markers, "develop", ["rdkcentral"], 1)
+    for line in report.split("\n")[:30]:
+        print(line)
+
+if __name__ == "__main__":
+    main()
```
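Alternatively, if the file is to stay under tests/, the collection problem can be handled with an opt-in gate rather than only a `__main__` guard. A sketch using just the standard library (the `RUN_INTEGRATION` variable name is made up for illustration):

```python
import os
import unittest

# Hypothetical opt-in switch: the heavyweight test runs only when the
# RUN_INTEGRATION environment variable is set, so a plain test run stays hermetic.
RUN_INTEGRATION = bool(os.environ.get("RUN_INTEGRATION"))

class IntegrationScanTest(unittest.TestCase):
    @unittest.skipUnless(RUN_INTEGRATION, "set RUN_INTEGRATION=1 to enable")
    def test_scan_real_repo(self):
        # Real scan logic (clone + scan_repo + report generation) would go here.
        pass
```

A `pytest.mark.skipif` marker with the same condition achieves the identical effect under pytest.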
```python
sys.path.insert(0, "/mnt/L2_CONTAINER_SHARED_VOLUME/59001")
from tools.marker_discovery.code_parser import detect_wrappers
ws = detect_wrappers("/tmp/test_telemetry")
for w in ws:
    print(w)
```
This helper script hardcodes a container-specific sys.path and a local path (/tmp/test_telemetry). Even if it isn't collected by pytest today, keeping this under tests/ is brittle and encourages non-reproducible debugging paths. Consider moving it to a separate scripts/ or devtools/ directory and/or making the paths configurable via CLI args/env vars.
Suggested change:

```diff
-sys.path.insert(0, "/mnt/L2_CONTAINER_SHARED_VOLUME/59001")
-from tools.marker_discovery.code_parser import detect_wrappers
-ws = detect_wrappers("/tmp/test_telemetry")
-for w in ws:
-    print(w)
+import os
+import argparse
+
+def main() -> None:
+    parser = argparse.ArgumentParser(
+        description="Debug helper for listing marker discovery wrappers."
+    )
+    parser.add_argument(
+        "--code-path",
+        dest="code_path",
+        default=os.environ.get("TELEMETRY_CODE_PATH"),
+        help=(
+            "Optional path to prepend to sys.path so that "
+            "tools.marker_discovery can be imported "
+            "(default: value of TELEMETRY_CODE_PATH, if set)."
+        ),
+    )
+    parser.add_argument(
+        "--workspace",
+        dest="workspace",
+        default=os.environ.get("TELEMETRY_WORKSPACE", "/tmp/test_telemetry"),
+        help=(
+            "Path to the telemetry workspace to scan for wrappers "
+            "(default: TELEMETRY_WORKSPACE env var or /tmp/test_telemetry)."
+        ),
+    )
+    args = parser.parse_args()
+
+    if args.code_path:
+        sys.path.insert(0, args.code_path)
+
+    from tools.marker_discovery.code_parser import detect_wrappers
+
+    ws = detect_wrappers(args.workspace)
+    for w in ws:
+        print(w)
+
+if __name__ == "__main__":
+    main()
```
```python
    file_path: str
    line: int
    api: str
    source_type: str  # "source" | "script" | "patch"
```
MarkerRecord.source_type is documented as only "source" | "script" | "patch", but the implementation/reporting also uses "script_dynamic". Update this comment/type annotation to reflect the actual allowed values to avoid confusing callers and test writers.
Suggested change:

```diff
-    source_type: str  # "source" | "script" | "patch"
+    source_type: str  # "source" | "script" | "patch" | "script_dynamic"
```
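One way to make the allowed values machine-checked rather than comment-only would be a `typing.Literal` alias. A sketch (field list abbreviated; only `source_type` is the point here):

```python
from dataclasses import dataclass
from typing import Literal

# The four values the scanners actually produce, per the review comment above.
SourceType = Literal["source", "script", "patch", "script_dynamic"]

@dataclass
class MarkerRecord:  # illustrative subset of the real record's fields
    marker_name: str
    file_path: str
    line: int
    api: str
    source_type: SourceType
```

Static checkers such as mypy would then flag any scanner that assigns an unknown source type.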
```python
    Returns (function_name, positional_arg_index) if the marker at line_num
    uses a positional parameter like $1, $2, or None if not in a function.
```
The docstring for _find_shell_function_for_line says it returns (function_name, positional_arg_index) but the function actually returns only the function name (or None). Please correct the docstring to match behavior (or adjust the function to return the documented tuple).
Suggested change:

```diff
-    Returns (function_name, positional_arg_index) if the marker at line_num
-    uses a positional parameter like $1, $2, or None if not in a function.
+    Returns the name of the enclosing shell function as a string, or None if
+    the line is not inside a function definition.
```
```python
def clone_matching_repos(org, repo_names, branch, temp_dir):
    """Clone a list of repos with branch fallback.

    Args:
        org: GitHub organization name
        repo_names: iterable of repo names to clone
        branch: target branch name
        temp_dir: directory to clone into

    Returns list of dicts: {'name': str, 'path': str, 'branch': str}
    """
    cloned = []
    for repo_name in sorted(repo_names):
        clone_path, actual_branch = clone_repo(org, repo_name, branch, temp_dir)
        if clone_path:
            cloned.append({
                "name": repo_name,
                "path": clone_path,
                "branch": actual_branch,
            })
    logger.info("Cloned %d/%d repos for org %s", len(cloned), len(list(repo_names)), org)
    return cloned
```
clone_matching_repos logs the total repo count using len(list(repo_names)), which can (a) consume an iterator/generator and report an incorrect total, and (b) allocate an extra list unnecessarily. Compute the total once up front (e.g., repo_names = list(repo_names) at entry) and reuse it for both sorting and logging.
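The failure mode is easy to demonstrate: `sorted()` already consumes a generator, so a later `len(list(...))` sees nothing. A minimal illustration with hypothetical helper names:

```python
def summarize_bad(names):
    ordered = sorted(names)     # first (and only) pass over the iterable
    total = len(list(names))    # a generator is already exhausted here -> 0
    return ordered, total

def summarize_good(names):
    names = list(names)         # materialize exactly once, up front
    return sorted(names), len(names)

print(summarize_bad(n for n in ["b", "a"]))   # (['a', 'b'], 0) -- wrong total
print(summarize_good(n for n in ["b", "a"]))  # (['a', 'b'], 2)
```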
No description provided.