chore/SOF 7908 #329

Open
VsevolodX wants to merge 4 commits into main from chore/SOF-7908

Conversation

@VsevolodX VsevolodX commented Jun 5, 2026

Member
  • update: remove collab
  • update: remove collab related

Summary by CodeRabbit

Release Notes

  • Documentation

    • Updated authentication guidance to use OIDC device-flow login via browser popup instead of manual credential management.
  • New Features

    • Enhanced API examples demonstrating unified authenticated client interface with simplified account initialization.
  • Refactor

    • Updated all example notebooks to use streamlined API client methods for consistent API interactions across materials, jobs, workflows, and properties.

@review-notebook-app


Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


@coderabbitai

coderabbitai Bot commented Jun 5, 2026

Warning

Review limit reached

@VsevolodX, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 47 minutes and 22 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5bc0203a-e0ab-4c71-9eb8-3c5bbf66ad9e

📥 Commits

Reviewing files that changed from the base of the PR and between 5620bcd and 8ca2bdc.

📒 Files selected for processing (6)
  • examples/job/get-file-from-job.ipynb
  • examples/job/ml-train-model-predict-properties.ipynb
  • examples/job/run-simulations-and-extract-properties.ipynb
  • examples/reproducing_publications/band_gaps_for_interface_bilayer_twisted_molybdenum_disulfide.ipynb
  • examples/reproducing_publications/band_structure_for_interface_bilayer_twisted_molybdenum_disulfide.ipynb
  • examples/workflow/qe_scf_calculation.ipynb
📝 Walkthrough

Walkthrough

This pull request refactors notebook examples across the repository to use a unified OIDC-authenticated APIClient pattern instead of separate endpoint classes. The system get_authentication_params example is rewritten to demonstrate device-flow authentication. Google Colab support is removed, and documentation is updated to reflect the new authentication flow.

Changes

Notebook Examples APIClient Migration

| Layer | File(s) | Summary |
|---|---|---|
| Documentation and authentication pattern setup | `README.md`, `src/py/mat3ra/notebooks_utils/core/api/settings.py` | README table and Usage section updated to describe OIDC device-flow login and JupyterLite credential injection; settings.py comments clarified on environment variable precedence and legacy token setup guidance. |
| System authentication example notebook | `examples/system/get_authentication_params.ipynb` | Complete rewrite: demonstrates OIDC login via authenticate(), initializes APIClient, exposes authenticated account details via ACCOUNT_ID/ORGANIZATION_ID environment variables, and lists available accounts; replaces prior username/password + login-endpoint flow. |
| Material examples: authentication and client setup | `examples/material/create_material.ipynb`, `examples/material/get_materials_by_formula.ipynb`, `examples/material/upload_materials_from_file_poscar.ipynb` | Consistent OIDC authentication and APIClient initialization added to all three; OWNER_ID derived from ORGANIZATION_ID env var or authenticated account; prior authorization-form logic removed. |
| Material examples: client-based API calls | `examples/material/*` | Material creation, listing, and import operations switched from MaterialEndpoints to client.materials.create(), client.materials.list(), and client.materials.import_from_file(). |
| Job examples: authentication and setup | `examples/job/create_and_submit_job.ipynb`, `examples/job/get-file-from-job.ipynb`, `examples/job/ml-train-model-predict-properties.ipynb`, `examples/job/run-simulations-and-extract-properties.ipynb` | All four notebooks refactored with OIDC auth, APIClient initialization, and consistent OWNER_ID derivation; prior endpoint-based initialization removed. |
| Job examples: workflow, material, and compute setup | `examples/job/*` | Workflow copying, material creation/listing, and compute configuration setup refactored to use client.bank_workflows, client.materials, and client.jobs.build_compute_config(). |
| Job examples: job creation, submission, and monitoring | `examples/job/*` | Job creation, submission, and async monitoring switched from job_endpoints to client.jobs methods; wait_for_jobs_to_finish_async() now receives client.jobs instead of endpoint instance. |
| Workflow examples: authentication and setup | `examples/workflow/get_workflows.ipynb`, `examples/workflow/qe_scf_calculation.ipynb` | OIDC authentication and APIClient initialization added; OWNER_ID derived from environment or authenticated account; prior endpoint-based initialization removed. |
| Workflow examples: client-based workflow and job operations | `examples/workflow/*` | Workflow listing and job operations switched from WorkflowEndpoints and JobEndpoints to client.workflows.list() and client.jobs.*() methods. |
| Publication examples: authentication and infrastructure setup | `examples/reproducing_publications/band_gaps_for_interface_bilayer_twisted_molybdenum_disulfide.ipynb`, `examples/reproducing_publications/band_structure_for_interface_bilayer_twisted_molybdenum_disulfide.ipynb` | OIDC authentication, APIClient initialization, and environment-based OWNER_ID derivation added to both complex publication examples; prior platform-login boilerplate removed. |
| Publication examples: material and workflow operations | `examples/reproducing_publications/*` | Material creation, workflow copying, and workflow updates refactored to use client.materials.create(), client.bank_workflows, and client.workflows.update(). |
| Publication examples: job creation and execution flow | `examples/reproducing_publications/*` | Job creation, submission, parameter updates, and async monitoring switched to client.jobs methods; property extraction for results uses client.properties helpers. |
| Build cleanup: remove Colab helper and entry point | `pyproject.toml` | Removed [project.scripts] entry point for notebook-path console script (pointed to removed Google Colab helper); Google Colab support module no longer referenced. |
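
Several entries above describe OWNER_ID as being derived from the ORGANIZATION_ID environment variable or, failing that, from the authenticated account. A minimal sketch of that fallback, assuming the account id is already available as a string; `resolve_owner_id` and the ids are hypothetical and not part of the PR:

```python
import os
from typing import Mapping, Optional


def resolve_owner_id(account_id: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return ORGANIZATION_ID from the environment if set, else the account id.

    Hypothetical helper: the notebooks may derive OWNER_ID differently.
    """
    env = os.environ if env is None else env
    # An empty string is treated the same as an unset variable.
    return env.get("ORGANIZATION_ID") or account_id


# Made-up ids, for illustration only.
personal_owner = resolve_owner_id("acct-123", env={})
org_owner = resolve_owner_id("acct-123", env={"ORGANIZATION_ID": "org-456"})
print(personal_owner, org_owner)
```

Passing the environment as an argument keeps the fallback easy to check without touching the real process environment.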

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • Exabyte-io/api-examples#313: Refactors wait_for_jobs_to_finish_async into mat3ra.notebooks_utils.api.job, which is now called with client.jobs throughout this PR's notebook updates.
  • Exabyte-io/api-examples#301: Earlier notebook setup refactors for similar examples (e.g., create_material.ipynb, qe_scf_calculation.ipynb) that overlap with this PR's authentication/client initialization changes.
  • Exabyte-io/api-examples#310: Depends on notebooks-utils package auth/settings refactors that align with this PR's removal of legacy Colab helpers and introduction of the OIDC authenticate() pattern.

Suggested reviewers

  • timurbazhirov

Poem

🐰 Hops through notebooks with glee,
Old endpoints fade, new clients we see,
OIDC flows and auth so divine,
Each example refactored—a modern design!
Colab helpers retire with grace,
Unified patterns now take their place. 🚀

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Title check | ❓ Inconclusive | The title is vague and does not clearly convey the main purpose of the changes. While it references a ticket number (SOF 7908), it does not summarize what the PR actually accomplishes. | Replace the title with a clear, descriptive summary such as 'Migrate notebooks to OIDC authentication and remove Google Colab support' or 'Update notebook examples to use APIClient and remove collab module'. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (7)
examples/material/upload_materials_from_file_poscar.ipynb (1)

80-91: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Keep the POSCAR_PATH instructions consistent with the example.

Line 80 says POSCAR_PATH should be absolute, but Line 91 uses a relative ../assets/... path. That mismatch makes the setup instructions misleading for users following the notebook.

Suggested change
- - **POSCAR_PATH**: absolute path to the POSCAR file
+ - **POSCAR_PATH**: path to the POSCAR file (relative to the notebook working directory in this example)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@examples/material/upload_materials_from_file_poscar.ipynb` around lines 80 -
91, The notebook's parameter instructions are inconsistent: the markdown says
POSCAR_PATH must be absolute but the code cell sets a relative path; update
either the description or the example so they match — for example, change the
markdown line referencing POSCAR_PATH to indicate a relative path is acceptable,
or replace the POSCAR_PATH value in the code cell (the variable POSCAR_PATH next
to NAME) with an absolute path string; ensure the variable name POSCAR_PATH and
the explanatory markdown are consistent.
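
If the notebook keeps the relative path, one way to make the behavior explicit is to resolve it against a known base directory. This is only a sketch: `resolve_poscar_path` and the `/repo/...` directory are hypothetical, and the notebook itself does not necessarily do this.

```python
from pathlib import Path


def resolve_poscar_path(poscar_path: str, base_dir: Path) -> Path:
    """Resolve a possibly relative path against base_dir; absolute paths pass through."""
    path = Path(poscar_path)
    return path if path.is_absolute() else (base_dir / path).resolve()


# Hypothetical notebook directory, used only for this example.
notebook_dir = Path("/repo/examples/material")
resolved = resolve_poscar_path("../assets/mp-978534.poscar", notebook_dir)
print(resolved)
```

With a helper like this, the markdown can document POSCAR_PATH as "absolute, or relative to the notebook directory" and both forms behave the same.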
examples/reproducing_publications/band_structure_for_interface_bilayer_twisted_molybdenum_disulfide.ipynb (1)

869-877: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Do not shift the band energies in place.

band_data aliases result["band_structure"]["data"], so rerunning this plotting cell subtracts the Fermi level again and produces progressively wrong plots.

Suggested fix
 for result in results:
     if result["band_structure"]:
         band_data = result["band_structure"]["data"]
         # adjust for Fermi level
         fermi_level = result["fermi_level"]["data"]["value"]
-        for i in range(len(band_data["yDataSeries"])):
-            band_data["yDataSeries"][i] = [e - fermi_level for e in band_data["yDataSeries"][i]]
-
-        plot_band_structure_with_labels(band_data, ylim=[MIN_E, MAX_E])
+        shifted_band_data = {
+            **band_data,
+            "yDataSeries": [[e - fermi_level for e in energies] for energies in band_data["yDataSeries"]],
+        }
+
+        plot_band_structure_with_labels(shifted_band_data, ylim=[MIN_E, MAX_E])
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@examples/reproducing_publications/band_structure_for_interface_bilayer_twisted_molybdenum_disulfide.ipynb`
around lines 869 - 877, The code currently mutates
result["band_structure"]["data"] in place by assigning band_data =
result["band_structure"]["data"] and then subtracting fermi_level from each
entry in band_data["yDataSeries"]; instead, create a non-mutating copy of the
band data (or build a new shifted_yDataSeries list) and apply the Fermi-level
shift to that copy so result remains unchanged, then pass the copied/shifted
structure to plot_band_structure_with_labels; refer to result, band_data,
fermi_level, and the "yDataSeries" key to locate and update the logic.
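
The aliasing problem can be reproduced without the platform. The sketch below uses made-up numbers and a hypothetical `shift_by_fermi_level` helper; the `xDataArray` key is an assumption, only `yDataSeries` appears in the review.

```python
def shift_by_fermi_level(band_data: dict, fermi_level: float) -> dict:
    """Return a copy of band_data with every energy shifted by -fermi_level."""
    return {
        **band_data,
        "yDataSeries": [[e - fermi_level for e in series] for series in band_data["yDataSeries"]],
    }


# Made-up numbers, not real results.
band_data = {"xDataArray": [0.0, 0.5], "yDataSeries": [[1.0, 2.0], [3.0, 4.0]]}

first = shift_by_fermi_level(band_data, 1.5)
second = shift_by_fermi_level(band_data, 1.5)  # simulates rerunning the plotting cell

# Rerunning gives the same shifted data, and the source data is untouched.
assert first == second
assert band_data["yDataSeries"] == [[1.0, 2.0], [3.0, 4.0]]
```

Because the helper builds new inner lists, the plotting cell becomes idempotent: running it twice yields the same plot.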
examples/workflow/qe_scf_calculation.ipynb (1)

191-199: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fix _material in job payload: use JSON null/None, not the string "null"

In examples/workflow/qe_scf_calculation.ipynb the JOB_BODY sends "_material": "null", which serializes to a literal string rather than JSON null. For material-less jobs, _material should be omitted or set to None so it serializes to JSON null.
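
The difference is easy to see with the standard `json` module. This sketch only shows the serialization; how the server treats either value is not covered by the review.

```python
import json

as_string = json.dumps({"_material": "null"})
as_none = json.dumps({"_material": None})

print(as_string)  # {"_material": "null"}
print(as_none)    # {"_material": null}

# Round-tripping confirms that only None maps to JSON null.
assert json.loads(as_string)["_material"] == "null"
assert json.loads(as_none)["_material"] is None
```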

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@examples/workflow/qe_scf_calculation.ipynb` around lines 191 - 199, The
JOB_BODY payload sets "_material" to the string "null", which will serialize as
a JSON string; update the JOB_BODY in examples/workflow/qe_scf_calculation.ipynb
so that the "_material" entry is either removed entirely or set to Python None
(not the string) so it serializes to JSON null; locate the JOB_BODY definition
and replace "\"_material\": \"null\"" with either no "_material" key or
"\"_material\": None" to fix the serialization.
examples/job/create_and_submit_job.ipynb (1)

123-128: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add _project to the job creation payload (to match the “default account’s project” claim).

examples/job/create_and_submit_job.ipynb builds config with owner, _material, workflow, and name, but omits _project, while the markdown states the job is created inside the default account’s project. The shared helper src/py/mat3ra/notebooks_utils/core/entity/job/api.py includes _project in the payload passed to api_client.jobs.create(...), and other job examples explicitly fetch a default project_id via client.projects.list({ "isDefault": True, "owner._id": OWNER_ID })[0]["_id"] and pass it into job creation helpers.

Suggested fix
+default_project = client.projects.list({"isDefault": True, "owner._id": OWNER_ID})[0]
+
 config = {
     "owner": {"_id": OWNER_ID},
+    "_project": {"_id": default_project["_id"]},
     "_material": {"_id": material_id},
     "workflow": {"_id": workflow_id},
     "name": JOB_NAME,
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@examples/job/create_and_submit_job.ipynb` around lines 123 - 128, The job
creation payload `config` in the notebook is missing the `_project` field
required to create the job in the default account project; update the notebook
to fetch the default project id (e.g., via `client.projects.list({ "isDefault":
True, "owner._id": OWNER_ID })[0]["_id"]`) and add `"_project": {"_id":
project_id}` to `config` before calling the job creation helper so the payload
aligns with `notebooks_utils/core/entity/job/api.py` and the
`api_client.jobs.create(...)` usage.
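
Indexing the project list with `[0]` raises a bare IndexError when no default project matches. A small guard, sketched here on plain dictionaries, gives a clearer failure; `default_project_id` is hypothetical and the project records are made up.

```python
from typing import Sequence


def default_project_id(projects: Sequence[dict]) -> str:
    """Return the _id of the first project, or fail with a readable message."""
    if not projects:
        raise LookupError("no default project found for this owner")
    return projects[0]["_id"]


# Made-up response, standing in for the result of the projects query.
projects = [{"_id": "proj-1", "isDefault": True}]
project_id = default_project_id(projects)
config_fragment = {"_project": {"_id": project_id}}
print(config_fragment)
```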
examples/job/run-simulations-and-extract-properties.ipynb (2)

68-80: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Re-add json; the compute cell still parses CLUSTERS.

The later json.loads(os.getenv("CLUSTERS")) call now crashes with NameError because this import block no longer brings json in.

🛠️ Suggested fix
+import json
 import time
 from IPython.display import IFrame
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@examples/job/run-simulations-and-extract-properties.ipynb` around lines 68 -
80, The import block is missing the json module which causes
json.loads(os.getenv("CLUSTERS")) to raise NameError later; add an import for
the json module (e.g., import json) alongside the existing imports in the top
cell that contains wait_for_jobs_to_finish_async,
get_property_by_subworkflow_and_unit_indicies, dataframe_to_html and
flatten_material so json.loads and os.getenv("CLUSTERS") work correctly.
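
Beyond the missing import, note that `json.loads(None)` raises TypeError, so the CLUSTERS lookup also fails when the variable is unset. A defensive variant could look like the sketch below; `first_cluster` and the `fqdn` key are assumptions, since the real CLUSTERS content is not shown in the review.

```python
import json
from typing import Optional


def first_cluster(raw: Optional[str]) -> dict:
    """Parse a JSON list of clusters and return the first entry, or {} if none."""
    clusters = json.loads(raw) if raw else []
    return next(iter(clusters), {})


# Hypothetical value of the CLUSTERS environment variable.
raw_clusters = '[{"fqdn": "cluster-001"}, {"fqdn": "cluster-002"}]'
cluster_config = first_cluster(raw_clusters)
print(cluster_config)
```

In a notebook this would be called as `first_cluster(os.getenv("CLUSTERS"))`.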

417-422: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fix the row-building loop; it currently drops rows and duplicates the initial structure.

Only Line 419 is inside the loop right now, so the cell appends a single row for the last result. It also fills the FIN-* columns from initial_structure again instead of final_structure.

🛠️ Suggested fix
 table = []
 for result in results:
     data = flatten_material(result["initial_structure"])
-data.extend(flatten_material(result["initial_structure"]))
-data.extend([result["pressure"], result["band_gap_direct"], result["band_gap_indirect"]])
-table.append(data)
+    data.extend(flatten_material(result["final_structure"]))
+    data.extend([result["pressure"], result["band_gap_direct"], result["band_gap_indirect"]])
+    table.append(data)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@examples/job/run-simulations-and-extract-properties.ipynb` around lines 417 -
422, The loop body only contains the first line, causing only the last result to
be appended and duplicating initial_structure for FIN-* columns; fix by moving
all row-building statements into the for loop so each iteration does: create
data = flatten_material(result["initial_structure"]), then extend it with
flatten_material(result["final_structure"]) (not initial_structure again), then
extend with result["pressure"], result["band_gap_direct"],
result["band_gap_indirect"], and finally append data to table so every result
produces one row.
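
The corrected loop can be checked offline with a stand-in for the helper. In this sketch `flatten_material` is a stub and the results are made up; only the loop shape mirrors the suggested fix.

```python
def flatten_material(material: dict) -> list:
    """Stand-in for the notebook helper; the real one is not shown in the review."""
    return [material["name"], material["volume"]]


# Made-up results, for illustration only.
results = [
    {
        "initial_structure": {"name": "A", "volume": 10.0},
        "final_structure": {"name": "A-relaxed", "volume": 9.5},
        "pressure": 0.1, "band_gap_direct": 1.2, "band_gap_indirect": 1.0,
    },
    {
        "initial_structure": {"name": "B", "volume": 20.0},
        "final_structure": {"name": "B-relaxed", "volume": 19.0},
        "pressure": 0.2, "band_gap_direct": 2.2, "band_gap_indirect": 2.0,
    },
]

table = []
for result in results:
    data = flatten_material(result["initial_structure"])
    data.extend(flatten_material(result["final_structure"]))
    data.extend([result["pressure"], result["band_gap_direct"], result["band_gap_indirect"]])
    table.append(data)

# One row per result, with the final structure in the FIN-* positions.
assert len(table) == len(results)
```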
examples/reproducing_publications/band_gaps_for_interface_bilayer_twisted_molybdenum_disulfide.ipynb (1)

725-729: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fix the notebook cell to call the correct job-wait helper and match its signature

In examples/reproducing_publications/band_gaps_for_interface_bilayer_twisted_molybdenum_disulfide.ipynb (lines 725-729), the cell imports wait_for_jobs_to_finish, but only wait_for_jobs_to_finish_async(endpoint, job_ids) exists and is exported in src/py/mat3ra/notebooks_utils/api/job.py, so the import itself fails. The cell then calls wait_for_jobs_to_finish_async(...), which it never imports (a NameError if the failing import were simply removed), and passes poll_interval=60 even though the helper signature does not accept poll_interval (a TypeError once the name is imported).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@examples/reproducing_publications/band_gaps_for_interface_bilayer_twisted_molybdenum_disulfide.ipynb`
around lines 725 - 729, The notebook imports wait_for_jobs_to_finish but the
available helper is wait_for_jobs_to_finish_async; update the import to "from
mat3ra.notebooks_utils.api.job import wait_for_jobs_to_finish_async" and call it
using the correct signature (await wait_for_jobs_to_finish_async(client.jobs,
job_ids)) — remove the unsupported poll_interval=60 argument and ensure you pass
the client.jobs and job_ids variables as shown.
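
Since the prompt asks to verify each finding against current code, the claim about `poll_interval` can be checked with `inspect.signature` before editing the call. The function below is a stand-in that copies the signature quoted in this comment; the real helper is not shown here, and `accepts_keyword` is hypothetical.

```python
import inspect


async def wait_for_jobs_to_finish_async(endpoint, job_ids):
    """Stand-in with the signature quoted in the review, not the real helper."""
    return None


def accepts_keyword(func, name: str) -> bool:
    """Return True if func can be called with name=... as a keyword argument."""
    params = inspect.signature(func).parameters
    keyword_kinds = (inspect.Parameter.POSITIONAL_OR_KEYWORD, inspect.Parameter.KEYWORD_ONLY)
    if name in params and params[name].kind in keyword_kinds:
        return True
    # A **kwargs parameter accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


print(accepts_keyword(wait_for_jobs_to_finish_async, "poll_interval"))  # False
print(accepts_keyword(wait_for_jobs_to_finish_async, "job_ids"))        # True
```

Running the same check against the imported helper in the notebook environment would confirm or refute the TypeError claim for the installed version.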
🧹 Nitpick comments (2)
examples/material/create_material.ipynb (1)

136-137: ⚡ Quick win

Use the named owner_id argument in the create call.

Line 137 is the only migrated material-create example here that still passes the owner as a bare second positional argument. The in-repo helper in src/py/mat3ra/notebooks_utils/core/entity/material/api.py:4-23 uses owner_id=..., so keeping this positional form makes the notebook depend on the external client's parameter order.

Suggested change
- material = client.materials.create(CONFIG, OWNER_ID)
+ material = client.materials.create(CONFIG, owner_id=OWNER_ID)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@examples/material/create_material.ipynb` around lines 136 - 137, The call to
client.materials.create passes the owner as a second positional argument
(CONFIG, OWNER_ID); change it to use the named parameter owner_id so the call
becomes client.materials.create(CONFIG, owner_id=OWNER_ID) to avoid depending on
parameter order — update the invocation of client.materials.create and ensure
OWNER_ID remains the same variable used for owner_id.
examples/reproducing_publications/band_structure_for_interface_bilayer_twisted_molybdenum_disulfide.ipynb (1)

724-747: ⚡ Quick win

Avoid hard-coding workflow indexes for property lookup.

These subworkflows[1]["units"][0/1] lookups tie the notebook to the current bank-workflow layout. src/py/mat3ra/notebooks_utils/core/entity/property/api.py already has helpers that resolve the Fermi-energy flowchart from the job, which would make this example survive workflow-template changes.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@examples/reproducing_publications/band_structure_for_interface_bilayer_twisted_molybdenum_disulfide.ipynb`
around lines 724 - 747, The loop is hard-coded to
job["workflow"]["subworkflows"][1]["units"][0/1] via unit_flowchart_id_0 and
unit_flowchart_id_1 which will break if the workflow template changes; replace
those index lookups by calling the existing helper that resolves the correct
unit/flowchart for a given property (use it to fetch the flowchart id for
"fermi_energy" and "band_structure") and pass that id to
client.properties.get_property instead of unit_flowchart_id_0/1; update the code
that computes unit_flowchart_id_0 and unit_flowchart_id_1 (and any references)
to use the helper so the loop using job and client.properties.get_property
becomes resilient to workflow layout changes.
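
To illustrate the idea of looking units up by identity rather than position, here is a sketch that walks the job's subworkflows. It is not the helper from property/api.py: `find_unit_flowchart_id` is hypothetical, and the unit `name` key and its values are assumptions about the job document.

```python
from typing import Optional


def find_unit_flowchart_id(job: dict, unit_name: str) -> Optional[str]:
    """Return the flowchartId of the first unit with the given name, or None."""
    for subworkflow in job["workflow"]["subworkflows"]:
        for unit in subworkflow["units"]:
            # Assumption: each unit carries a "name" key.
            if unit.get("name") == unit_name:
                return unit["flowchartId"]
    return None


# Made-up job document with the nesting used in the notebook.
job = {
    "workflow": {
        "subworkflows": [
            {"units": [{"name": "relax", "flowchartId": "fc-0"}]},
            {"units": [
                {"name": "pw_scf", "flowchartId": "fc-1"},
                {"name": "pw_bands", "flowchartId": "fc-2"},
            ]},
        ]
    }
}

print(find_unit_flowchart_id(job, "pw_bands"))  # fc-2
```

A lookup like this keeps working if a subworkflow is inserted or units are reordered, which is the failure mode the index-based version has.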
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@examples/job/get-file-from-job.ipynb`:
- Line 9: Update the notebook intro to reflect the migrated auth flow: replace
the instruction to update ../../utils/settings.json with a note that
authentication now uses authenticate() / APIClient.authenticate() and show where
to call it, and fix the link text/target to point to create_and_submit_job.ipynb
(singular) instead of create_and_submit_jobs.ipynb so readers are directed to
the correct sibling example.
- Around line 105-115: The notebook computes material_id from a copied bank
material but then calls client.jobs.create_by_ids using materials =
client.materials.list({"owner._id": owner_id}) which ignores material_id; update
the call so it only passes the copied material (either filter the results of
client.materials.list to the copied material_id or construct a single-item
list/dict for the copied material using material_id) and then pass that
filtered/constructed materials list into client.jobs.create_by_ids to ensure the
job uses the intended material_id (references: material_id, bank_materials,
bank_material_id, client.materials.list, client.jobs.create_by_ids).

In `@examples/job/ml-train-model-predict-properties.ipynb`:
- Around line 46-52: The notebook is missing an import for json which causes a
NameError when evaluating cluster_config =
next(iter(json.loads(os.getenv("CLUSTERS"))), {}); add "import json" alongside
the other imports at the top of the file so json.loads can be used, ensuring
CLUSTERS parsing works and cluster_config is properly created.

In `@examples/job/run-simulations-and-extract-properties.ipynb`:
- Around line 385-389: The band-gap extraction is using the flowchart ID from
subworkflow index 0 (relaxation) instead of the band-gap subworkflow (index 1);
update how unit_flowchart_id is derived so client.properties.get_direct_band_gap
and get_indirect_band_gap receive the flowchartId from
job["workflow"]["subworkflows"][1]["units"][1] (i.e., use subworkflows[1] rather
than subworkflows[0]) so the calls to
client.properties.get_direct_band_gap(job["_id"], unit_flowchart_id) and
client.properties.get_indirect_band_gap(...) target the correct subworkflow.

In `@examples/material/upload_materials_from_file_poscar.ipynb`:
- Around line 18-27: The notebook text claims POSCAR_PATH is absolute but the
code sets POSCAR_PATH = "../assets/mp-978534.poscar"; change the documentation
to state POSCAR_PATH is a relative path (or update POSCAR_PATH to an absolute
path) so the docstring and variable agree, and ensure any readers know which
behavior you choose; also avoid top-level await in the notebook to prevent
lint/export issues by wrapping calls like await install_packages(...) and await
authenticate() inside an async function (e.g., main) and calling it via
asyncio.run or similar, or document the lint risk (F704/PLE1142) if you
intentionally keep top-level await. Include references to POSCAR_PATH,
install_packages, and authenticate when making these changes.

In
`@examples/reproducing_publications/band_structure_for_interface_bilayer_twisted_molybdenum_disulfide.ipynb`:
- Around line 702-706: The notebook imports wait_for_jobs_to_finish but then
awaits an undefined wait_for_jobs_to_finish_async; fix by using the imported
helper or importing the async variant: either replace await
wait_for_jobs_to_finish_async(client.jobs, job_ids, poll_interval=60) with await
wait_for_jobs_to_finish(client.jobs, job_ids, poll_interval=60), or update the
import to bring in wait_for_jobs_to_finish_async and ensure it accepts
(client.jobs, job_ids, poll_interval=60) so the awaited call matches a defined
symbol.

In `@examples/workflow/get_workflows.ipynb`:
- Around line 18-27: Top-level await calls (install_packages and authenticate)
must be converted so the notebook can be linted as plain Python: replace lines
using "await install_packages('api')" and "await authenticate()" with a
synchronous entry (import asyncio) and either wrap them in an async def main()
and call asyncio.run(main()) or call asyncio.run(install_packages("api")) /
asyncio.run(authenticate()); ensure you add "import asyncio" and keep the
original function names (install_packages, authenticate) unchanged.

In `@examples/workflow/qe_scf_calculation.ipynb`:
- Around line 43-49: Add an import for the json module at the top of the
notebook so json is available when the compute setup runs; specifically, add
"import json" alongside the existing "import os" before the code that calls
json.loads(os.getenv("CLUSTERS")) (near where APIClient.authenticate(), client,
selected_account and OWNER_ID are defined) to prevent the NameError.
- Around line 245-246: The notebook awaits wait_for_jobs_to_finish_async but
never imports it, causing a NameError; add an import for
wait_for_jobs_to_finish_async at the top of the notebook from the module that
provides the job helper utilities (the same place other job helpers/clients are
imported from) so the symbol is defined before calling await
wait_for_jobs_to_finish_async(client.jobs, [JOB_RESP["_id"]]).
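
For the top-level await items above, the `main()` wrapper could look like the sketch below. `install_packages` and `authenticate` are stand-in coroutines here, since the real helpers are not shown in the review. One caveat worth checking before applying it: `asyncio.run()` raises RuntimeError when an event loop is already running, which is the case inside a Jupyter kernel, so this form suits notebooks exported to plain scripts while top-level await remains the usable form in the live notebook.

```python
import asyncio


async def install_packages(name: str) -> str:
    """Stand-in for the notebook helper; the real one is not shown in the review."""
    await asyncio.sleep(0)
    return name


async def authenticate() -> bool:
    """Stand-in for the OIDC login helper."""
    await asyncio.sleep(0)
    return True


async def main() -> tuple:
    # Same order as the notebook cell: install first, then authenticate.
    installed = await install_packages("api")
    authenticated = await authenticate()
    return installed, authenticated


if __name__ == "__main__":
    print(asyncio.run(main()))
```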

---

Outside diff comments:
In `@examples/job/create_and_submit_job.ipynb`:
- Around line 123-128: The job creation payload `config` in the notebook is
missing the `_project` field required to create the job in the default account
project; update the notebook to fetch the default project id (e.g., via
`client.projects.list({ "isDefault": True, "owner._id": OWNER_ID })[0]["_id"]`)
and add `"_project": {"_id": project_id}` to `config` before calling the job
creation helper so the payload aligns with
`notebooks_utils/core/entity/job/api.py` and the `api_client.jobs.create(...)`
usage.

In `@examples/job/run-simulations-and-extract-properties.ipynb`:
- Around line 68-80: The import block is missing the json module which causes
json.loads(os.getenv("CLUSTERS")) to raise NameError later; add an import for
the json module (e.g., import json) alongside the existing imports in the top
cell that contains wait_for_jobs_to_finish_async,
get_property_by_subworkflow_and_unit_indicies, dataframe_to_html and
flatten_material so json.loads and os.getenv("CLUSTERS") work correctly.
- Around line 417-422: The loop body only contains the first line, causing only
the last result to be appended and duplicating initial_structure for FIN-*
columns; fix by moving all row-building statements into the for loop so each
iteration does: create data = flatten_material(result["initial_structure"]),
then extend it with flatten_material(result["final_structure"]) (not
initial_structure again), then extend with result["pressure"],
result["band_gap_direct"], result["band_gap_indirect"], and finally append data
to table so every result produces one row.

In `@examples/material/upload_materials_from_file_poscar.ipynb`:
- Around line 80-91: The notebook's parameter instructions are inconsistent: the
markdown says POSCAR_PATH must be absolute but the code cell sets a relative
path; update either the description or the example so they match — for example,
change the markdown line referencing POSCAR_PATH to indicate a relative path is
acceptable, or replace the POSCAR_PATH value in the code cell (the variable
POSCAR_PATH next to NAME) with an absolute path string; ensure the variable name
POSCAR_PATH and the explanatory markdown are consistent.

In
`@examples/reproducing_publications/band_gaps_for_interface_bilayer_twisted_molybdenum_disulfide.ipynb`:
- Around line 725-729: The notebook imports wait_for_jobs_to_finish but the
available helper is wait_for_jobs_to_finish_async; update the import to "from
mat3ra.notebooks_utils.api.job import wait_for_jobs_to_finish_async" and call it
using the correct signature (await wait_for_jobs_to_finish_async(client.jobs,
job_ids)) — remove the unsupported poll_interval=60 argument and ensure you pass
the client.jobs and job_ids variables as shown.

In
`@examples/reproducing_publications/band_structure_for_interface_bilayer_twisted_molybdenum_disulfide.ipynb`:
- Around line 869-877: The code currently mutates
result["band_structure"]["data"] in place by assigning band_data =
result["band_structure"]["data"] and then subtracting fermi_level from each
entry in band_data["yDataSeries"]; instead, create a non-mutating copy of the
band data (or build a new shifted_yDataSeries list) and apply the Fermi-level
shift to that copy so result remains unchanged, then pass the copied/shifted
structure to plot_band_structure_with_labels; refer to result, band_data,
fermi_level, and the "yDataSeries" key to locate and update the logic.

In `@examples/workflow/qe_scf_calculation.ipynb`:
- Around line 191-199: The JOB_BODY payload sets "_material" to the string
"null", which will serialize as a JSON string; update the JOB_BODY in
examples/workflow/qe_scf_calculation.ipynb so that the "_material" entry is
either removed entirely or set to Python None (not the string) so it serializes
to JSON null; locate the JOB_BODY definition and replace "\"_material\":
\"null\"" with either no "_material" key or "\"_material\": None" to fix the
serialization.
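
The difference is easy to see with the standard `json` module. The payloads below are hypothetical, heavily trimmed stand-ins for `JOB_BODY`, kept only to show how the two values serialize.

```python
import json

# Hypothetical, trimmed payloads; the notebook's JOB_BODY has more keys.
body_with_string = {"name": "SCF job", "_material": "null"}
body_with_none = {"name": "SCF job", "_material": None}

serialized_string = json.dumps(body_with_string)
serialized_none = json.dumps(body_with_none)

print(serialized_string)  # {"name": "SCF job", "_material": "null"}
print(serialized_none)    # {"name": "SCF job", "_material": null}
```

Only the second form produces a JSON `null`; the first sends a four-character string that a server may try to interpret as a material id.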

---

Nitpick comments:
In `@examples/material/create_material.ipynb`:
- Around line 136-137: The call to client.materials.create passes the owner as a
second positional argument (CONFIG, OWNER_ID); change it to use the named
parameter owner_id so the call becomes client.materials.create(CONFIG,
owner_id=OWNER_ID) to avoid depending on parameter order — update the invocation
of client.materials.create and ensure OWNER_ID remains the same variable used
for owner_id.
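
To illustrate why the keyword form is safer, here is a small sketch with a hypothetical stand-in function. Its signature is invented for the example and is not the real `client.materials.create` signature; it only shows what happens when another parameter sits before `owner_id`.

```python
def create(config, project_id=None, owner_id=None):
    """Hypothetical stand-in; NOT the real client.materials.create signature."""
    return {"config": config, "project_id": project_id, "owner_id": owner_id}


CONFIG = {"name": "Si"}
OWNER_ID = "owner-123"

# Positional: the value silently binds to whichever parameter comes second.
positional = create(CONFIG, OWNER_ID)

# Keyword: the value always binds to owner_id, regardless of parameter order.
keyword = create(CONFIG, owner_id=OWNER_ID)
```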

In
`@examples/reproducing_publications/band_structure_for_interface_bilayer_twisted_molybdenum_disulfide.ipynb`:
- Around line 724-747: The loop is hard-coded to
job["workflow"]["subworkflows"][1]["units"][0/1] via unit_flowchart_id_0 and
unit_flowchart_id_1 which will break if the workflow template changes; replace
those index lookups by calling the existing helper that resolves the correct
unit/flowchart for a given property (use it to fetch the flowchart id for
"fermi_energy" and "band_structure") and pass that id to
client.properties.get_property instead of unit_flowchart_id_0/1; update the code
that computes unit_flowchart_id_0 and unit_flowchart_id_1 (and any references)
to use the helper so the loop using job and client.properties.get_property
becomes resilient to workflow layout changes.
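
The review does not name the existing helper, so the sketch below uses a hypothetical `find_unit_flowchart_id` function to show the idea. It assumes each unit carries a `flowchartId` and a `results` list of `{"name": ...}` dictionaries; both the function and that job layout are assumptions for illustration, and the `job` dictionary is stand-in data.

```python
def find_unit_flowchart_id(job, property_name):
    """Return the flowchartId of the first unit that lists `property_name` in its results.

    Hypothetical helper: assumes units look like
    {"flowchartId": "...", "results": [{"name": "..."}]}.
    """
    for subworkflow in job["workflow"]["subworkflows"]:
        for unit in subworkflow.get("units", []):
            result_names = {r.get("name") for r in unit.get("results", [])}
            if property_name in result_names:
                return unit["flowchartId"]
    raise KeyError(f"No unit produces property {property_name!r}")


# Stand-in job document with two subworkflows.
job = {
    "workflow": {
        "subworkflows": [
            {"units": [{"flowchartId": "fc-relax", "results": [{"name": "total_energy"}]}]},
            {
                "units": [
                    {"flowchartId": "fc-scf", "results": [{"name": "fermi_energy"}]},
                    {"flowchartId": "fc-bands", "results": [{"name": "band_structure"}]},
                ]
            },
        ]
    }
}

fermi_unit_id = find_unit_flowchart_id(job, "fermi_energy")
bands_unit_id = find_unit_flowchart_id(job, "band_structure")
```

Looking units up by the property they produce means a reordered or extended workflow template raises a clear `KeyError` instead of silently reading the wrong unit.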


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9b0e3d94-b00d-4cbd-9638-2e726e23d83e

📥 Commits

Reviewing files that changed from the base of the PR and between 3712fa8 and 5620bcd.

📒 Files selected for processing (16)
  • README.md
  • examples/job/create_and_submit_job.ipynb
  • examples/job/get-file-from-job.ipynb
  • examples/job/ml-train-model-predict-properties.ipynb
  • examples/job/run-simulations-and-extract-properties.ipynb
  • examples/material/create_material.ipynb
  • examples/material/get_materials_by_formula.ipynb
  • examples/material/upload_materials_from_file_poscar.ipynb
  • examples/reproducing_publications/band_gaps_for_interface_bilayer_twisted_molybdenum_disulfide.ipynb
  • examples/reproducing_publications/band_structure_for_interface_bilayer_twisted_molybdenum_disulfide.ipynb
  • examples/system/get_authentication_params.ipynb
  • examples/workflow/get_workflows.ipynb
  • examples/workflow/qe_scf_calculation.ipynb
  • pyproject.toml
  • src/py/mat3ra/notebooks_utils/core/api/settings.py
  • src/py/mat3ra/notebooks_utils/ipython/_collab.py
💤 Files with no reviewable changes (2)
  • src/py/mat3ra/notebooks_utils/ipython/_collab.py
  • pyproject.toml

"<img alt=\"Open in Google Colab\" src=\"https://user-images.githubusercontent.com/20477508/128780728-491fea90-9b23-495f-a091-11681150db37.jpeg\" width=\"150\" border=\"0\">\n",
"</a>"
]
"source": "# Get-File-From-Job\n\nThis example demonstrates how to use Mat3ra RESTful API to check for and acquire files from jobs which have been run. This example assumes that the user is already familiar with the [creation and submission of jobs](create_and_submit_jobs.ipynb) using our API.\n\n> <span style=\"color: orange\">**IMPORTANT NOTE**</span>: In order to run this example in full, an active Mat3ra.com account is required. Alternatively, Readers may substitute the workflow ID below with another one (an equivalent one for VASP, for example) and adjust extraction of the results (\"Viewing job files\" section). RESTful API credentials shall be updated in [settings](../../utils/settings.json).\n\n\n## Steps\n\nAfter working through this notebook, you will be able to:\n\n1. Import [the structure of Si](https://materialsproject.org/materials/mp-149/) from Materials Bank\n2. Set up and run a single-point calculation using Quantum Espresso.\n3. List files currently in the job's directory\n4. Check metadata for every file (modification date, size, etc)\n5. Access file contents directly and print them to console\n6. Download files to your local machine\n\n## Pre-requisites\n\nThe explanation below assumes that the reader is familiar with the concepts used in Mat3ra platform and RESTful API. We outline these below and direct the reader to the original sources of information:\n\n- [Generating RESTful API authentication parameters](../system/get_authentication_params.ipynb)\n- [Creating and submitting jobs](../job/create_and_submit_job.ipynb)"


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Update the intro to match the migrated auth flow.

This block still tells readers to update ../../utils/settings.json, even though the notebook now authenticates via authenticate() / APIClient.authenticate(). It also links to create_and_submit_jobs.ipynb, while the sibling notebook in this PR is create_and_submit_job.ipynb.

🧰 Tools
🪛 Ruff (0.15.15)

[error] 9-9: await statement outside of a function

(F704)


[error] 9-9: await should be used within an async function

(PLE1142)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@examples/job/get-file-from-job.ipynb` at line 9, Update the notebook intro to
reflect the migrated auth flow: replace the instruction to update
../../utils/settings.json with a note that authentication now uses
authenticate() / APIClient.authenticate() and show where to call it, and fix the
link text/target to point to create_and_submit_job.ipynb (singular) instead of
create_and_submit_jobs.ipynb so readers are directed to the correct sibling
example.

Comment on lines 105 to 115
  "# Get materials from bank and copy one to our account\n",
- "material_bank_endpoints = BankMaterialEndpoints(*ENDPOINT_ARGS)\n",
  "MATERIAL_QUERY = {\"formula\": \"Si\"}\n",
- "bank_materials = material_bank_endpoints.list(MATERIAL_QUERY)\n",
+ "bank_materials = client.bank_materials.list(MATERIAL_QUERY)\n",
  "bank_material_id = bank_materials[0][\"_id\"]\n",
- "material_endpoints = MaterialEndpoints(*ENDPOINT_ARGS)\n",
- "material_id = material_bank_endpoints.copy(bank_material_id, owner_id)[\"_id\"]\n",
- "materials = material_endpoints.list({\"owner._id\": owner_id})\n",
+ "material_id = client.bank_materials.copy(bank_material_id, owner_id)[\"_id\"]\n",
+ "materials = client.materials.list({\"owner._id\": owner_id})\n",
  "\n",
  "# Create the job\n",
- "job_endpoints = JobEndpoints(*ENDPOINT_ARGS)\n",
- "job = job_endpoints.create_by_ids(\n",
+ "job = client.jobs.create_by_ids(\n",
  " materials=materials, workflow_id=workflow_id, project_id=project_id, owner_id=owner_id, prefix=\"Test_Job_Output\"\n",
  ")[0]\n",


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Locate the notebook and the job API helper
echo "== Files containing create_by_ids =="
rg -n "create_by_ids" -S .

echo "== Job API helper (create_job) =="
rg -n "def create_job" -S src examples .

# Print the relevant helper implementation if found
python3 - <<'PY'
import subprocess, re, os, sys, shlex, textwrap, json, pathlib

# Find candidate files for create_job
out = subprocess.check_output(["bash","-lc","rg -n \"def create_job\" -S src examples || true"], text=True)
files=set()
for line in out.splitlines():
    m=re.match(r"([^:]+):\d+:def create_job", line)
    if m: files.add(m.group(1))
print("Candidate create_job files:", sorted(files))
for f in sorted(files):
    # Print a window around the function definition
    # Use sed to show ~120 lines around matches
    # Determine line number(s)
    out2 = subprocess.check_output(["bash","-lc", f"rg -n \"def create_job\" -S {shlex.quote(f)} || true"], text=True)
PY

Repository: Exabyte-io/api-examples

Length of output: 1349


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Search create_by_ids definition =="
rg -n "def create_by_ids|create_by_ids\(" src/py -S

echo "== Show src/py/mat3ra/notebooks_utils/core/entity/job/api.py around create_job =="
python3 - <<'PY'
import pathlib, itertools
path = pathlib.Path("src/py/mat3ra/notebooks_utils/core/entity/job/api.py")
text = path.read_text(encoding="utf-8").splitlines()
for i,line in enumerate(text, start=1):
    if "def create_job" in line:
        start=max(1, i-15)
        end=min(len(text), i+120)
        for j in range(start,end+1):
            print(f"{j:5d}: {text[j-1]}")
        break
PY

echo "== Show examples/job/get-file-from-job.ipynb around materials construction and create_by_ids call =="
python3 - <<'PY'
import json, pathlib
p = pathlib.Path("examples/job/get-file-from-job.ipynb")
nb = json.loads(p.read_text(encoding="utf-8"))
# Find the cell(s) containing create_by_ids
hits=[]
for idx,cell in enumerate(nb.get("cells",[])):
    src="".join(cell.get("source",[]))
    if "create_by_ids" in src:
        hits.append((idx, src))
print("Cells containing create_by_ids:", [h[0] for h in hits])
# Print the first hit region with some context from the previous/next cell
if hits:
    i = hits[0][0]
    for k in range(max(0,i-2), min(len(nb["cells"]), i+3)):
        cell=nb["cells"][k]
        src="".join(cell.get("source",[]))
        if src.strip():
            print("\n--- cell", k, "---")
            print(src)
PY

Repository: Exabyte-io/api-examples

Length of output: 105


🏁 Script executed:

#!/bin/bash
set -u

echo "== Search create_by_ids occurrences in src/py =="
rg -n "create_by_ids" src/py -S || true

echo "== Show create_job implementation (with line numbers) =="
python3 - <<'PY'
import pathlib
path = pathlib.Path("src/py/mat3ra/notebooks_utils/core/entity/job/api.py")
text = path.read_text(encoding="utf-8").splitlines()
for i,line in enumerate(text, start=1):
    if "def create_job" in line:
        start=max(1,i-20)
        end=min(len(text), i+220)
        for j in range(start,end+1):
            print(f"{j:5d}: {text[j-1]}")
        break
PY

echo "== Dump the exact notebook cell(s) that call client.jobs.create_by_ids =="
python3 - <<'PY'
import json, pathlib
p = pathlib.Path("examples/job/get-file-from-job.ipynb")
nb = json.loads(p.read_text(encoding="utf-8"))
for idx,cell in enumerate(nb.get("cells",[])):
    src = "".join(cell.get("source",[]))
    if "client.jobs.create_by_ids" in src:
        print(f"\n--- cell index {idx} ---\n{src}")
PY

Repository: Exabyte-io/api-examples

Length of output: 4200


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Search whole repo for create_by_ids implementation =="
rg -n "def create_by_ids|create_by_ids\s*\(" -S . || true

echo "== Search for jobs client class/module =="
rg -n "class .*Jobs|jobs\.create_by_ids|create_by_ids" -S src || true

echo "== List likely job client files =="
fd -t f "job" src/py || true

Repository: Exabyte-io/api-examples

Length of output: 1107


Pass only the copied Si material into client.jobs.create_by_ids(...)

In examples/job/get-file-from-job.ipynb (lines 105-115), the notebook computes material_id from the copied silicon material, but the create_by_ids(...) call uses materials = client.materials.list({"owner._id": owner_id}) (all materials for the account), leaving the copied material_id unused and risking the wrong _id being used (the create_job(...) helper sets _material from materials[0]["_id"]). Filter materials to the copied _id or pass a single-item list containing the copied material dict.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@examples/job/get-file-from-job.ipynb` around lines 105 - 115, The notebook
computes material_id from a copied bank material but then calls
client.jobs.create_by_ids using materials = client.materials.list({"owner._id":
owner_id}) which ignores material_id; update the call so it only passes the
copied material (either filter the results of client.materials.list to the
copied material_id or construct a single-item list/dict for the copied material
using material_id) and then pass that filtered/constructed materials list into
client.jobs.create_by_ids to ensure the job uses the intended material_id
(references: material_id, bank_materials, bank_material_id,
client.materials.list, client.jobs.create_by_ids).
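
One way to narrow the list to the copied material is sketched below. The ids and dictionaries are stand-in data; in the notebook they would come from `client.bank_materials.copy(...)` and `client.materials.list(...)`.

```python
# Stand-in values; in the notebook these come from the API calls.
material_id = "mat-2"
all_materials = [
    {"_id": "mat-1", "formula": "Ge"},
    {"_id": "mat-2", "formula": "Si"},
]

# Keep only the material that was just copied from the bank.
materials = [m for m in all_materials if m["_id"] == material_id]

# Fail early if the copied material is missing or duplicated in the listing.
if len(materials) != 1:
    raise ValueError(f"Expected exactly one material with _id {material_id!r}, got {len(materials)}")
```

The resulting single-item `materials` list can then be passed to `client.jobs.create_by_ids(...)`, so `materials[0]["_id"]` is guaranteed to be the copied Si material.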

Comment thread examples/job/ml-train-model-predict-properties.ipynb Outdated
Comment thread examples/job/run-simulations-and-extract-properties.ipynb Outdated
Comment on lines +18 to 27
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Overview\n",
"from mat3ra.notebooks_utils.packages import install_packages\n",
"\n",
"This example demonstrates how to import a material from a POSCAR file via [Material](https://docs.mat3ra.com/api/Material/post_materials_import) endpoints."
"await install_packages(\"api\")"
]


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) Inspect the notebook JSON for top-level await and referenced POSCAR paths.
NOTEBOOK="examples/material/upload_materials_from_file_poscar.ipynb"
if [ ! -f "$NOTEBOOK" ]; then
  echo "Missing $NOTEBOOK"
  exit 1
fi

python3 - <<'PY'
import json
import re
from pathlib import Path

nb_path = Path("examples/material/upload_materials_from_file_poscar.ipynb")
nb = json.loads(nb_path.read_text(encoding="utf-8"))

await_lines = []
poscar_lines = []
source_cell_idx = 0

for cell in nb.get("cells", []):
    if cell.get("cell_type") != "code":
        continue
    src = cell.get("source", [])
    if isinstance(src, str):
        src = src.splitlines(True)
    for i, line in enumerate(src, start=1):
        # rough checks for "await" usage in source strings
        if re.search(r'(^|\s)await\s', line):
            await_lines.append((source_cell_idx, i, line.rstrip("\n")))
        if "POSCAR" in line or "poscar" in line:
            poscar_lines.append((source_cell_idx, i, line.rstrip("\n")))
    source_cell_idx += 1

print("=== Top-level-ish await occurrences (cell_index, line_in_cell, line) ===")
for cidx, lno, text in await_lines[:200]:
    print(f"cell={cidx} line={lno}: {text}")
print(f"... total await lines: {len(await_lines)}")

print("\n=== POSCAR/poscar occurrences (cell_index, line_in_cell, line) ===")
for cidx, lno, text in poscar_lines[:200]:
    print(f"cell={cidx} line={lno}: {text}")
print(f"... total poscar lines: {len(poscar_lines)}")
PY

# 2) Locate CI linting/export steps for notebooks (nbqa, ruff on .ipynb, etc.)
echo "=== Searching for notebook lint/export in CI ==="
rg -n --hidden --glob '.github/workflows/*' -S "nbqa|ruff.*ipynb|ipynb.*ruff|ruff.*nb|lint.*ipynb|export.*notebook|jupytext|convert.*ipynb|nbconvert" . || true

echo "=== Searching for ruff config targeting notebooks ==="
ls -la .ruff* 2>/dev/null || true
rg -n --hidden -S "ipynb" pyproject.toml ruff.toml setup.cfg . || true

# 3) Also search for the specific Ruff codes mentioned (F704/PLE1142) across the repo.
echo "=== Searching for ruff F704/PLE1142 references ==="
rg -n --hidden -S "F704|PLE1142" . || true

Repository: Exabyte-io/api-examples

Length of output: 50447


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Focused CI searches
echo "=== Search workflows for ruff/nbqa/jupytext/nbconvert/ipynb ==="
rg -n --hidden -S "\.ipynb|nbqa|jupytext|nbconvert|ruff" .github/workflows || true
echo "=== Search for nbconvert render/lint scripts in repo root ==="
rg -n --hidden -S "nbqa|jupytext|nbconvert|ruff.*ipynb|ipynb.*ruff|inspect.*ipynb" . || true

# Focused Ruff config extraction from pyproject.toml
echo "=== Ruff config sections from pyproject.toml ==="
python3 - <<'PY'
import re, pathlib
p = pathlib.Path("pyproject.toml")
if not p.exists():
    print("pyproject.toml not found")
    raise SystemExit(0)
txt = p.read_text(encoding="utf-8")
# Print only likely ruff config block lines
for m in re.finditer(r'(?ms)^\[tool\.ruff[^\]]*\].*?(?=^\[tool\.|\Z)', txt):
    block = m.group(0)
    if any(k in block for k in ["ipynb","extend-include","include","exclude","lint","select","ignore","F704","PLE1142"]):
        print(block)
PY

# Search for explicit ruff F704/PLE1142 mentions
echo "=== Search for F704/PLE1142 in repo ==="
rg -n --hidden -S "F704|PLE1142" . || true

Repository: Exabyte-io/api-examples

Length of output: 9847


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== List .github workflows ==="
ls -la .github/workflows || true
echo

echo "=== Search CI for render-notebooks / nbconvert / ruff / nbqa / jupytext ==="
rg -n --hidden -S "render-notebooks\.sh|nbconvert|jupytext|nbqa|ruff|jupyter-nbconvert" .github/workflows . || true
echo

echo "=== Show render-notebooks.sh (key parts) ==="
sed -n '1,120p' scripts/render-notebooks.sh
echo
sed -n '120,240p' scripts/render-notebooks.sh || true
echo

echo "=== Inspect examples/config.py (key parts) ==="
sed -n '1,120p' examples/config.py
echo
sed -n '120,240p' examples/config.py || true
echo

echo "=== Check pyproject for any notebook lint/export integration ==="
rg -n --hidden -S "ipynb|nbconvert|nbqa|jupytext|jupyter-nbconvert|jupytext|ruff.*ipynb" pyproject.toml || true
echo
sed -n '1,220p' pyproject.toml
echo

echo "=== Search the target notebook for POSCAR_PATH / absolute wording ==="
NOTEBOOK="examples/material/upload_materials_from_file_poscar.ipynb"
python3 - <<'PY'
import json, re
from pathlib import Path
nb_path = Path("examples/material/upload_materials_from_file_poscar.ipynb")
nb = json.loads(nb_path.read_text(encoding="utf-8"))
patterns = [r"POSCAR_PATH", r"absolute", r"absolute path", r"relative", r"../assets/.*poscar"]
for cell in nb.get("cells", []):
    if cell.get("cell_type") not in ("markdown","code"):
        continue
    src = cell.get("source", [])
    if isinstance(src, str):
        src = src.splitlines(True)
    text = "".join(src)
    if any(re.search(p, text, flags=re.IGNORECASE) for p in patterns):
        print("\n--- cell_type:", cell.get("cell_type"), "---")
        # print first ~12 lines for context
        lines = text.splitlines()
        for l in lines[:12]:
            print(l)
PY

Repository: Exabyte-io/api-examples

Length of output: 17460


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Check .github contents ==="
ls -la .github || true
find .github -maxdepth 3 -type f -print || true

echo "=== Inspect pre-commit config for ruff ==="
ls -la .pre-commit-config.yaml || true
sed -n '1,200p' .pre-commit-config.yaml || true

echo "=== Search for ruff invocation in scripts/ci ==="
rg -n --hidden -S "pre-commit|ruff check|ruff\s|nbqa|jupytext|jupyter-nbconvert|nbconvert.*python" . || true

echo "=== Search for any CI configs besides GitHub workflows ==="
ls -la .gitlab-ci.yml .circleci/config.yml .travis.yml 2>/dev/null || true
find . -maxdepth 4 -type f \( -iname '*ci*' -o -iname '*pipeline*' -o -iname '*workflow*' \) | head -n 50 || true

echo "=== Extract the relevant cells from the target notebook (code+markdown around POSCAR_PATH/await) ==="
python3 - <<'PY'
import json
from pathlib import Path
nb_path = Path("examples/material/upload_materials_from_file_poscar.ipynb")
nb = json.loads(nb_path.read_text(encoding="utf-8"))

def show(cell_idx):
    cell = nb["cells"][cell_idx]
    print(f"\n--- cell_index={cell_idx} cell_type={cell.get('cell_type')} ---")
    src = cell.get("source", [])
    if isinstance(src, str):
        src = src.splitlines(True)
    for i,line in enumerate(src, start=1):
        print(f"{i:>3}: {line.rstrip()}")

# Show first few cells with await
for idx in range(min(6, len(nb["cells"]))):
    cell = nb["cells"][idx]
    if cell.get("cell_type") == "code" and any("await " in (l if isinstance(cell["source"], list) else "") for l in (cell.get("source") or [])):
        show(idx)

# Show cells mentioning POSCAR_PATH
for idx, cell in enumerate(nb["cells"]):
    if cell.get("cell_type") == "code":
        src = cell.get("source", [])
        if isinstance(src, str):
            src = src.splitlines(True)
        if any("POSCAR_PATH" in l for l in src):
            show(idx)
    if cell.get("cell_type") == "markdown":
        src = cell.get("source", [])
        if isinstance(src, str):
            src = src.splitlines(True)
        if any("POSCAR_PATH" in l for l in src):
            show(idx)
PY

Repository: Exabyte-io/api-examples

Length of output: 12384


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== .github/workflows/cicd.yml ==="
sed -n '1,120p' .github/workflows/cicd.yml
echo
sed -n '120,240p' .github/workflows/cicd.yml || true

echo
echo "=== .pre-commit-config.yaml (full) ==="
cat .pre-commit-config.yaml

echo
echo "=== ruff.toml (full) ==="
cat ruff.toml

echo
echo "=== Search for ruff hook options (ipynb) ==="
rg -n --hidden -S "id: ruff|ruff\.toml|ipynb|F704|PLE1142|--extend|--select|--ignore" .pre-commit-config.yaml pyproject.toml ruff.toml .github/workflows || true

Repository: Exabyte-io/api-examples

Length of output: 7286


🌐 Web query:

Exabyte-io/pre-commit-hooks ruff hook id: ruff supports .ipynb

💡 Result:

The repository Exabyte-io/pre-commit-hooks appears to be a separate, inactive project [1] and is not the official repository for the Ruff pre-commit hook. The official Ruff pre-commit integration is maintained at https://github.com/astral-sh/ruff-pre-commit [2][3]. This official hook does support Jupyter Notebooks (.ipynb) [2][4]. As of August 2024, the official ruff-pre-commit hook checks Jupyter notebooks by default [4]. If you wish to configure the hook, you can control which file types are processed by modifying the types_or list in your .pre-commit-config.yaml file [2][3]. For example, to include Jupyter notebooks, you can use:

- repo: https://github.com/astral-sh/ruff-pre-commit
  rev: v0.15.14
  hooks:
    - id: ruff-check
      types_or: [python, pyi, jupyter]
      args: [--fix]
    - id: ruff-format
      types_or: [python, pyi, jupyter]

If you want to exclude notebooks, simply remove jupyter from that list [2][3].

Citations:


🌐 Web query:

Exabyte-io/pre-commit-hooks ruff hook configuration types_or ipynb

💡 Result:

The Ruff pre-commit hook, primarily maintained at astral-sh/ruff-pre-commit [1][2], uses types_or: [python, pyi, jupyter] by default, meaning it runs on Python files, type stubs, and Jupyter Notebooks [3]. To specifically configure the types_or setting for Jupyter notebooks in your .pre-commit-config.yaml, you can include or exclude jupyter in the list of file types [1][2][4]. For example, to run Ruff only on Python files and type stubs (thereby disabling it for Jupyter notebooks), you would configure the hook as follows [1][2]:

- repo: https://github.com/astral-sh/ruff-pre-commit
  rev: v0.15.14 # Use the latest version
  hooks:
    - id: ruff-check
      types_or: [python, pyi]
      args: [--fix]
    - id: ruff-format
      types_or: [python, pyi]

If you wish to keep Jupyter notebooks enabled, you can either omit the types_or line (to use the default) or explicitly include it [1][3]:

- id: ruff-check
  types_or: [python, pyi, jupyter]

Note that Exabyte-io/pre-commit-hooks [5] is a distinct repository and does not appear to be the official source for the standard Ruff pre-commit integration. For production setups, it is recommended to use the official astral-sh/ruff-pre-commit repository [1][4].

Citations:


🌐 Web query:

site:github.com/Exabyte-io/pre-commit-hooks ruff hook id ruff types_or jupyter ipynb

💡 Result:

In Exabyte-io/pre-commit-hooks, the hook with id: ruff is currently restricted to Python files only:

  • - id: ruff
  • types: [python] (note: not types_or, and no jupyter) (github.com)

So it will not run on Jupyter notebooks (.ipynb), since pre-commit matches notebooks via the jupyter file type. To include notebooks, change it to something like:

- id: ruff
  types_or: [python, pyi, jupyter]

This is the same approach documented for Ruff’s official pre-commit integration (types_or: [..., jupyter]). (github.com)

Ruff itself has built-in support for linting/formatting Jupyter notebooks. (github.com)

Citations:


Fix POSCAR_PATH documentation mismatch and reassess lint-blocking risk for await.

  • The notebook claims POSCAR_PATH is an absolute path, but the code sets POSCAR_PATH = "../assets/mp-978534.poscar" (relative path).
  • The notebook uses top-level await (e.g., await install_packages(...), await authenticate()). CI runs pre-commit (.github/workflows/cicd.yml), but the configured ruff hook in Exabyte-io/pre-commit-hooks is limited to Python files (types: [python]), so .ipynb won’t be linted there; this will still matter if the notebook is linted/exported as Python elsewhere (F704/PLE1142).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@examples/material/upload_materials_from_file_poscar.ipynb` around lines 18 -
27, The notebook text claims POSCAR_PATH is absolute but the code sets
POSCAR_PATH = "../assets/mp-978534.poscar"; change the documentation to state
POSCAR_PATH is a relative path (or update POSCAR_PATH to an absolute path) so
the docstring and variable agree, and ensure any readers know which behavior you
choose; also avoid top-level await in the notebook to prevent lint/export issues
by wrapping calls like await install_packages(...) and await authenticate()
inside an async function (e.g., main) and calling it via asyncio.run or similar,
or document the lint risk (F704/PLE1142) if you intentionally keep top-level
await. Include references to POSCAR_PATH, install_packages, and authenticate
when making these changes.
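
For the `asyncio.run` option, a minimal sketch is shown below. The two coroutines are stand-ins that only simulate async work; they are not the real `install_packages` or `authenticate` helpers, whose return values are unknown here. Note that this form suits a notebook exported to a plain `.py` script: inside a running Jupyter kernel an event loop is already active, so `asyncio.run(...)` raises `RuntimeError`, while top-level `await` is supported by IPython.

```python
import asyncio


async def install_packages(name):
    """Stand-in for the notebook helper; only simulates async work."""
    await asyncio.sleep(0)
    return f"installed:{name}"


async def authenticate():
    """Stand-in for the notebook helper; only simulates async work."""
    await asyncio.sleep(0)
    return "authenticated"


async def main():
    # The awaits now live inside a coroutine, so F704/PLE1142 no longer apply.
    installed = await install_packages("api")
    auth = await authenticate()
    return installed, auth


# Works in a plain Python process; fails inside an already running event loop.
results = asyncio.run(main())
print(results)
```

Given that caveat, documenting the lint risk and keeping top-level `await` may be the less disruptive choice for notebooks that are only ever run in Jupyter.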

Comment on lines +18 to 27
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Overview\n",
"from mat3ra.notebooks_utils.packages import install_packages\n",
"\n",
"Inside this example we contact [Workflow](https://docs.mat3ra.com/api/Workflows/get_workflows) endpoint to obtain a list of workflows that an account has access to."
"await install_packages(\"api\")"
]


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

❓ Verification inconclusive

🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="examples/workflow/get_workflows.ipynb"
echo "== File: $FILE =="
if [ ! -f "$FILE" ]; then
  echo "Missing file!"
  exit 1
fi

# Show line numbers around 18-50
nl -ba "$FILE" | sed -n '1,120p' | sed -n '15,60p'

# Extract all code cells' sources that contain "await"
python3 - <<'PY'
import json, re
path="examples/workflow/get_workflows.ipynb"
with open(path,'r',encoding='utf-8') as f:
    nb=json.load(f)
hits=[]
for i,cell in enumerate(nb.get("cells",[])):
    if cell.get("cell_type")!="code": 
        continue
    src=cell.get("source",[])
    text="".join(src) if isinstance(src,list) else str(src)
    if re.search(r'(^|\n)\s*await\s+', text) or re.search(r'await\s+install_packages', text):
        hits.append((i,text))
print("== Code cells containing 'await' ==")
for idx,text in hits:
    print(f"\n--- cell_index={idx} ---")
    print(text)
PY

# Find Ruff invocation / config
echo "== Search for Ruff config/CI =="
ls -la
fd -t f -i 'pyproject.toml' . -x printf '%p\n' 2>/dev/null | head -n 20 || true
fd -t f -i '.ruff.toml' . | head -n 20 || true
fd -t f -i 'ruff.*toml' . | head -n 50 || true
fd -t f -i '*ruff*' .gitlab-ci* .github/workflows 2>/dev/null | head -n 50 || true

# Show likely Ruff config files
for f in pyproject.toml ruff.toml .ruff.toml .ruff.toml; do
  [ -f "$f" ] && echo "FOUND $f"
done

# If pyproject exists, extract ruff section
if [ -f "pyproject.toml" ]; then
  echo "== pyproject.toml ruff section (approx) =="
  python3 - <<'PY'
import tomllib, pathlib
p=pathlib.Path("pyproject.toml")
data=tomllib.loads(p.read_text("utf-8"))
ruff=data.get("tool",{}).get("ruff",{})
print("ruff keys:", list(ruff.keys()))
print("select:", ruff.get("lint",{}).get("select") if isinstance(ruff.get("lint"),dict) else None)
print("ignore:", ruff.get("lint",{}).get("ignore") if isinstance(ruff.get("lint"),dict) else None)
print("extend-select:", ruff.get("lint",{}).get("extend-select") if isinstance(ruff.get("lint"),dict) else None)
PY
fi

# Search workflow/CI for ruff
rg -n --hidden --glob '!**/node_modules/**' "ruff.*(check|format)|\\bRuff\\b" .github/workflows 2>/dev/null || true
rg -n --hidden "ruff\\s+check|ruff\\s+format" . 2>/dev/null | head -n 50 || true

# Search for F704/PLE1142 in repo
rg -n "F704|PLE1142" . | head -n 50 || true

Repository: Exabyte-io/api-examples

Length of output: 179


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="examples/workflow/get_workflows.ipynb"
echo "== File: $FILE =="
if [ ! -f "$FILE" ]; then
  echo "Missing file!"
  exit 1
fi

nl -ba "$FILE" | sed -n '1,200p' | sed -n '15,90p'

python3 - <<'PY'
import json, re
path="examples/workflow/get_workflows.ipynb"
with open(path,'r',encoding='utf-8') as f:
    nb=json.load(f)
hits=[]
for i,cell in enumerate(nb.get("cells",[])):
    if cell.get("cell_type")!="code":
        continue
    src=cell.get("source",[])
    text="".join(src) if isinstance(src,list) else str(src)
    if "await" in text:
        hits.append((i,text))
print("== Code cells with any 'await' ==")
for idx,text in hits:
    print(f"\n--- cell_index={idx} ---")
    for line in text.splitlines():
        if line.strip():
            print(line)
PY

echo "== Search for Ruff config/CI =="
ls -la > /dev/null
if [ -f "pyproject.toml" ]; then
  echo "FOUND pyproject.toml"
  python3 - <<'PY'
import tomllib, pathlib
data=tomllib.loads(pathlib.Path("pyproject.toml").read_text("utf-8"))
ruff=data.get("tool",{}).get("ruff",{})
print("ruff keys:", list(ruff.keys()))
lint=ruff.get("lint",{})
print("lint keys:", list(lint.keys()) if isinstance(lint,dict) else None)
print("lint.select:", lint.get("select") if isinstance(lint,dict) else None)
print("lint.extend-select:", lint.get("extend-select") if isinstance(lint,dict) else None)
print("lint.ignore:", lint.get("ignore") if isinstance(lint,dict) else None)
PY
fi

# List possible ruff config files
fd -t f -i 'ruff*.toml' . 2>/dev/null | head -n 20 || true
fd -t f -i '*ruff*' .github/workflows 2>/dev/null | head -n 50 || true

rg -n --hidden "ruff\\s+(check|format)" .github/workflows 2>/dev/null || true
rg -n --hidden "ruff\\s+(check|format)" . 2>/dev/null | head -n 80 || true

rg -n "F704|PLE1142" . | head -n 80 || true

Repository: Exabyte-io/api-examples

Length of output: 179


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="examples/workflow/get_workflows.ipynb"
echo "== File: $FILE =="
if [ ! -f "$FILE" ]; then
  echo "Missing file!"
  exit 1
fi

echo "== Raw JSON lines 1-120 (subset 15-90) =="
nl -ba "$FILE" | sed -n '15,90p'

echo "== Extract code-cell sources containing 'await' =="
python3 - <<'PY'
import json, re
path="examples/workflow/get_workflows.ipynb"
with open(path,'r',encoding='utf-8') as f:
    nb=json.load(f)
hits=[]
for i,cell in enumerate(nb.get("cells",[])):
    if cell.get("cell_type")!="code":
        continue
    src=cell.get("source",[])
    text="".join(src) if isinstance(src,list) else str(src)
    if re.search(r'\bawait\b', text):
        hits.append((i,text))
print("code_cell_count:", sum(1 for c in nb.get("cells",[]) if c.get("cell_type")=="code"))
print("== Code cells with any 'await' ==")
for idx,text in hits:
    print(f"\n--- code_cell_index={idx} ---")
    print(text)
PY

echo "== Ruff configuration & CI =="
if [ -f "pyproject.toml" ]; then
  echo "FOUND pyproject.toml"
  python3 - <<'PY'
import tomllib, pathlib, json
data=tomllib.loads(pathlib.Path("pyproject.toml").read_text("utf-8"))
ruff=data.get("tool",{}).get("ruff",{})
lint=ruff.get("lint",{})
print("ruff.lint keys:", list(lint.keys()) if isinstance(lint,dict) else None)
for k in ["select","extend-select","ignore","extend-ignore"]:
    if isinstance(lint,dict) and k in lint:
        print(f"ruff.lint.{k} =", lint[k])
PY
fi

echo "== Searching for ruff invocations =="
rg -n --hidden "ruff\\s+(check|format)" .github/workflows 2>/dev/null || true
rg -n --hidden "ruff\\s+(check|format)" . 2>/dev/null | head -n 100 || true

echo "== Searching for mentions of F704/PLE1142 =="
rg -n "F704|PLE1142" . 2>/dev/null | head -n 100 || true

Repository: Exabyte-io/api-examples

Length of output: 221


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="examples/workflow/get_workflows.ipynb"
echo "== File: $FILE =="
test -f "$FILE" || { echo "Missing file"; exit 1; }

echo "== Show notebook excerpt (lines 15-90) =="
# use sed directly for JSON excerpt
sed -n '15,90p' "$FILE" | head -n 200

echo "== Extract code-cell sources containing 'await' (and show code-cell index) =="
python3 - <<'PY'
import json, re
path="examples/workflow/get_workflows.ipynb"
with open(path,'r',encoding='utf-8') as f:
    nb=json.load(f)

cells=nb.get("cells",[])
print("code_cell_count:", sum(1 for c in cells if c.get("cell_type")=="code"))
hits=[]
for i,cell in enumerate(cells):
    if cell.get("cell_type")!="code":
        continue
    src=cell.get("source",[])
    text="".join(src) if isinstance(src,list) else str(src)
    if re.search(r'\bawait\b', text):
        hits.append((i,text))
print("== Code cells with any 'await' ==")
for idx,text in hits:
    print(f"\n--- code_cell_index={idx} ---")
    print(text)
PY

echo "== Search repo for Ruff config/CI =="
# list common ruff config files
fd -t f -i 'pyproject.toml' . | head -n 50 || true
fd -t f -i '.ruff.toml' . | head -n 50 || true
fd -t f -i 'ruff*.toml' . | head -n 50 || true

# show ruff invocation in CI workflows
rg -n --hidden --glob '!**/node_modules/**' "ruff\\s+(check|format)" .github/workflows 2>/dev/null || true
rg -n --hidden --glob '!**/node_modules/**' "Ruff" .github/workflows 2>/dev/null || true

echo "== Dump Ruff relevant config (if pyproject.toml exists) =="
if [ -f "pyproject.toml" ]; then
  python3 - <<'PY'
import tomllib, pathlib
p=pathlib.Path("pyproject.toml")
data=tomllib.loads(p.read_text("utf-8"))
ruff=data.get("tool",{}).get("ruff",{})
print("ruff_present:", bool(ruff))
lint=ruff.get("lint",{})
print("ruff.lint keys:", list(lint.keys()) if isinstance(lint,dict) else None)
for k in ["select","extend-select","ignore","extend-ignore","fixable","unfixable"]:
    if isinstance(lint,dict) and k in lint:
        print(f"ruff.lint.{k} =", lint[k])
# notebook-specific config if present
for k in ["extend-exclude","exclude","src","target-version","format"]:
    if k in ruff:
        pass
PY
fi

echo "== Search for F704 / PLE1142 mentions =="
rg -n "F704|PLE1142" . 2>/dev/null | head -n 80 || true

Repository: Exabyte-io/api-examples

Length of output: 2881


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== pyproject.toml (top-level excerpt) =="
# Print only relevant lines containing 'ruff' or 'lint' or 'ipynb'
rg -n "ruff|ipynb|notebook|jupyter" pyproject.toml ruff.toml .github/workflows 2>/dev/null || true

echo "== ruff.toml =="
if [ -f ruff.toml ]; then
  sed -n '1,200p' ruff.toml
fi

echo "== pyproject.toml ruff-related sections =="
if [ -f pyproject.toml ]; then
  # Extract sections containing [tool.ruff] and related
  # Use python to parse toml keys safely
  python3 - <<'PY'
import tomllib, pathlib
p=pathlib.Path("pyproject.toml")
data=tomllib.loads(p.read_text("utf-8"))
tool=data.get("tool",{})
ruff=tool.get("ruff")
print("tool.ruff present:", bool(ruff))
if ruff:
    print("tool.ruff keys:", list(ruff.keys()))
    # print nested lint/select/extend-select/ignore if exist
    lint=ruff.get("lint")
    if isinstance(lint,dict):
        for k in ["select","extend-select","ignore","extend-ignore","per-file-ignores","extend-per-file-ignores","exclude","fixable"]:
            if k in lint:
                print("lint."+k+" =", lint[k])
    target=ruff.get("target-version")
    print("target-version:", target)
PY
  # show any [tool.ruff.*] blocks more directly
  # best-effort: show lines around tool.ruff occurrences
  rg -n "^\[tool\.ruff" pyproject.toml || true
  # Print limited ranges around those matches
  for line in $(rg -n "^\[tool\.ruff" pyproject.toml | cut -d: -f1); do
    start=$((line-5)); end=$((line+60));
    echo "--- pyproject.toml around line $line ---"
    sed -n "${start},${end}p" pyproject.toml
  done
fi

echo "== Search CI/workflows for ruff & ipynb =="
rg -n --hidden --glob '!**/node_modules/**' "ruff\\s+(check|format)|ruff\\.check|ruff\\.format|Ruff\\b" .github/workflows . 2>/dev/null | head -n 200 || true
rg -n --hidden --glob '!**/node_modules/**' "ipynb" .github/workflows . 2>/dev/null | head -n 200 || true

echo "== Search for any local scripts invoking ruff on notebooks =="
rg -n --hidden --glob '!**/node_modules/**' "ipynb.*ruff|ruff.*ipynb|nbconvert|jupyter|jupytext" . 2>/dev/null | head -n 200 || true

Repository: Exabyte-io/api-examples

Length of output: 50379


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== pyproject.toml / ruff.toml: locate Ruff+ipynb related text =="
rg -n "ruff|ipynb|notebook|jupyter" pyproject.toml ruff.toml .github/workflows 2>/dev/null || true

echo "== ruff.toml (first 200 lines) =="
if [ -f ruff.toml ]; then
  sed -n '1,220p' ruff.toml
else
  echo "No ruff.toml"
fi

echo "== pyproject.toml: parse tool.ruff =="
if [ -f pyproject.toml ]; then
  python3 - <<'PY'
import tomllib, pathlib, json
data=tomllib.loads(pathlib.Path("pyproject.toml").read_text("utf-8"))
ruff=data.get("tool",{}).get("ruff")
print("tool.ruff present:", bool(ruff))
if not ruff:
    raise SystemExit
print("tool.ruff keys:", list(ruff.keys()))
lint=ruff.get("lint")
if isinstance(lint,dict):
    for k in ["select","extend-select","ignore","extend-ignore","per-file-ignores","extend-per-file-ignores","exclude","fixable","unfixable"]:
        if k in lint:
            v=lint[k]
            if isinstance(v,(dict,list,str,int,float,bool)):
                print("lint."+k+" =", v if not isinstance(v,dict) else list(v.keys()))
            else:
                print("lint."+k+" type:", type(v))
    if "per-file-ignores" in lint:
        print("lint.per-file-ignores =", lint["per-file-ignores"])
if "extend-exclude" in ruff:
    print("tool.ruff.extend-exclude =", ruff["extend-exclude"])
if "exclude" in ruff:
    print("tool.ruff.exclude =", ruff["exclude"])
PY

  echo "== pyproject.toml: show [tool.ruff] blocks with limited context =="
  rg -n "^\[tool\.ruff" pyproject.toml || true
  for line in $(rg -n "^\[tool\.ruff" pyproject.toml | cut -d: -f1); do
    start=$((line-5)); end=$((line+80));
    echo "--- around line $line ---"
    sed -n "${start},${end}p" pyproject.toml
  done
fi

echo "== CI/workflows: find ruff invocations =="
rg -n --hidden --glob '!**/node_modules/**' "ruff\\s+(check|format)|Ruff\\b" .github/workflows 2>/dev/null || true

echo "== Repo-wide: find ipynb mentions =="
rg -n --hidden --glob '!**/node_modules/**' "ipynb" . 2>/dev/null | head -n 200 || true

echo "== Repo-wide: find F704 / PLE1142 =="
rg -n "F704|PLE1142" . 2>/dev/null | head -n 200 || true

Repository: Exabyte-io/api-examples

Length of output: 48265


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Ruff config files exist? =="
ls -la pyproject.toml ruff.toml 2>/dev/null || true

echo "== Extract Ruff-related config from ruff.toml (if present) =="
if [ -f ruff.toml ]; then
  sed -n '1,260p' ruff.toml
fi

echo "== Extract Ruff-related sections from pyproject.toml (if present) =="
if [ -f pyproject.toml ]; then
  # show relevant lines around tool.ruff occurrences
  rg -n "^\[tool\.ruff" pyproject.toml || true
  rg -n "tool\.ruff|ipynb|notebook" pyproject.toml || true
fi

echo "== Parse tool.ruff from pyproject.toml (if present) =="
if [ -f pyproject.toml ]; then
  python3 - <<'PY'
import tomllib, pathlib
p=pathlib.Path("pyproject.toml")
data=tomllib.loads(p.read_text("utf-8"))
ruff=data.get("tool",{}).get("ruff")
print("tool.ruff:", bool(ruff))
if ruff:
    print("keys:", list(ruff.keys()))
    print("lint:", ruff.get("lint",{}))
    # print full lint section only if small enough
PY
fi

echo "== Search CI/workflows for ruff commands =="
rg -n --hidden --glob '!**/node_modules/**' "ruff\\s+(check|format)|Ruff\\b" .github/workflows 2>/dev/null || true

echo "== Search repo for ipynb linting patterns =="
rg -n --hidden --glob '!**/node_modules/**' "ipynb.*ruff|ruff.*ipynb|jupytext|nbqa" . 2>/dev/null || true

Repository: Exabyte-io/api-examples

Length of output: 2309


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Files =="
ls -la pyproject.toml ruff.toml .github/workflows 2>/dev/null || true

echo "== Show ruff.toml =="
if [ -f ruff.toml ]; then
  sed -n '1,260p' ruff.toml
fi

echo "== Show relevant parts of pyproject.toml for tool.ruff =="
if [ -f pyproject.toml ]; then
  rg -n "^\[tool\.ruff" pyproject.toml || true
  # Print ruff blocks with limited context
  for line in $(rg -n "^\[tool\.ruff" pyproject.toml | cut -d: -f1); do
    start=$((line-5)); end=$((line+120));
    echo "--- around line $line ---"
    sed -n "${start},${end}p" pyproject.toml
  done
fi

echo "== Search workflows for ruff =="
rg -n --hidden --glob '!**/node_modules/**' "ruff\\s+(check|format)|Ruff\\b" .github/workflows 2>/dev/null || true

echo "== Search config for ipynb handling (include/exclude/extend-include/etc) =="
if [ -f ruff.toml ]; then
  rg -n "ipynb|notebook|jupyter" ruff.toml || true
fi
if [ -f pyproject.toml ]; then
  rg -n "ipynb|notebook|jupyter|extend-include|extend-exclude|include|exclude" pyproject.toml || true
fi

echo "== Search repo for F704 / PLE1142 again (wider) =="
rg -n "F704|PLE1142" . 2>/dev/null || true

Repository: Exabyte-io/api-examples

Length of output: 3304


Avoid top-level await in examples/workflow/get_workflows.ipynb unless the lint/export pipeline understands notebook async

Line 26 and Line 37 contain await in notebook code cells (await install_packages("api"), await authenticate()). This is valid in Jupyter, but becomes invalid syntax if the cells are linted as plain Python scripts. The exact CI impact depends on whether notebooks are included in the Ruff lint path (directly or via rendered/exported .py output).

Also applies to: 30-38
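
The difference between the two contexts can be reproduced with the standard library alone. The sketch below is illustrative and not taken from the repository: it compiles a one-line cell body first as an ordinary module, then with `ast.PyCF_ALLOW_TOP_LEVEL_AWAIT`, a flag used here only to approximate how an interactive notebook environment accepts top-level `await`. The helper names `compiles_as_script` and `compiles_as_notebook_cell` are hypothetical.

```python
import ast

# A cell body like the one flagged in the notebook. The name is never
# resolved, because the source is only compiled, not executed.
cell_source = "await authenticate()\n"


def compiles_as_script(source: str) -> bool:
    """Return True if the source compiles as a plain Python module."""
    try:
        compile(source, "<cell>", "exec")
        return True
    except SyntaxError:
        return False


def compiles_as_notebook_cell(source: str) -> bool:
    """Return True if the source compiles when top-level await is allowed."""
    try:
        compile(source, "<cell>", "exec", flags=ast.PyCF_ALLOW_TOP_LEVEL_AWAIT)
        return True
    except SyntaxError:
        return False


print("plain script:", compiles_as_script(cell_source))
print("top-level await allowed:", compiles_as_notebook_cell(cell_source))
```

Whether a linter treats a given `.ipynb` cell like the first case or the second depends on how that linter is configured to handle notebooks, which is what the scripts above were trying to establish.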

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@examples/workflow/get_workflows.ipynb` around lines 18 - 27, Top-level await
calls (install_packages and authenticate) must be converted so the notebook can
be linted as plain Python: replace lines using "await install_packages('api')"
and "await authenticate()" with a synchronous entry (import asyncio) and either
wrap them in an async def main() and call asyncio.run(main()) or call
asyncio.run(install_packages("api")) / asyncio.run(authenticate()); ensure you
add "import asyncio" and keep the original function names (install_packages,
authenticate) unchanged.
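
A minimal sketch of the `async def main()` variant described in the prompt follows. The bodies of `install_packages` and `authenticate` are hypothetical stand-ins (the real helpers are not shown in this thread), and `run_sync` is an illustrative name. One caveat to check before applying the suggestion: `asyncio.run()` cannot be called while another event loop is already running in the same thread, which is typically the case inside a Jupyter kernel, so the sketch falls back to a clear error in that situation instead of calling it blindly.

```python
import asyncio


async def install_packages(name: str) -> str:
    # Hypothetical stand-in for the notebook helper of the same name.
    await asyncio.sleep(0)
    return f"installed {name}"


async def authenticate() -> str:
    # Hypothetical stand-in for the notebook helper of the same name.
    await asyncio.sleep(0)
    return "authenticated"


async def main() -> list:
    # Keep the original call order: install first, then authenticate.
    return [await install_packages("api"), await authenticate()]


def run_sync(coro):
    """Run a coroutine from synchronous code when no event loop is running."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop in this thread (plain script): asyncio.run is safe here.
        return asyncio.run(coro)
    # A loop is already running (for example in a notebook kernel).
    coro.close()  # avoid a "coroutine was never awaited" warning
    raise RuntimeError("event loop already running; await the coroutine instead")


results = run_sync(main())
print(results)
```

Run as a plain script, this executes both steps through `asyncio.run`. Inside a notebook cell the guard raises, which suggests that keeping `await main()` in the notebook and adjusting the lint configuration may be the less invasive option.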

Comment thread examples/workflow/qe_scf_calculation.ipynb Outdated
Comment thread examples/workflow/qe_scf_calculation.ipynb Outdated