
Add faqs, and fix microos kubeovn networkpolicy blocks liveness probes #164

Open
typhoonzero wants to merge 1 commit into master from troubleshoot_kubeflow_networkpolicy_with_kubeovn_join

@typhoonzero typhoonzero commented Mar 24, 2026

Summary by CodeRabbit

  • Documentation
    • Added FAQ section with Kubeflow installation and operational guides
    • New guidance for Pod Security Admission configuration, login platform customization, external S3/MinIO setup, GPU resource type configuration, and kube-ovn Pod startup troubleshooting

coderabbitai bot commented Mar 24, 2026

Walkthrough

Added a comprehensive FAQ section to Kubeflow installation documentation covering Pod Security Admission label adjustments, oauth2-proxy OIDC URL configuration, S3/MinIO integration for pipeline runs, GPU resource customization, and kube-ovn Pod startup troubleshooting.

Changes

| Cohort / File(s) | Summary |
|---|---|
| FAQ and Troubleshooting Content (`docs/en/installation/kubeflow.mdx`) | Added six new operational guides: PSA label configuration for restricted pod-security, oauth2-proxy platform address setup via ModuleInfo, external S3/MinIO pipeline root configuration, GPU resource type customization, and kube-ovn network timeout troubleshooting with NetworkPolicy manifest. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~5 minutes

Possibly related PRs

  • Add kubeflow deploy doc #95: Modifies the same Kubeflow installation file with overlapping oauth2-proxy, kfp, and networking configuration guidance.
  • AI-None improve 1.4 install #26: Addresses Pod Security Admission remediation for workloads, complementing the PSA label adjustment guidance in this PR.
  • Fix kubeflow install doc #109: Covers oauth2-proxy and OIDC auth URL configuration for Kubeflow, related to the ModuleInfo oidcAuthURL setup in this PR.

Poem

🐰 A hop through the docs, so clear and bright,
Pod security, oauth, GPU might!
MinIO secrets and kube-ovn care,
FAQ wisdom now floating in air!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title partially addresses the changeset: it mentions 'faqs' and the kubeovn networkpolicy issue, but omits other significant additions like PSA labels, OIDC configuration, GPU customization, and S3/MinIO pipeline configuration. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/en/installation/kubeflow.mdx`:
- Around line 380-424: Fix the spelling/grammar in the Kubeflow Notebook GPU
config section: change "resouce" to "resource" and "these hardware" to "this
hardware" in the introductory paragraph, replace "can not" with "cannot" (or
"can't") in the NOTE line, and consider changing "NOTE, you can only" to "NOTE:
you can only" for punctuation consistency; update any corresponding text around
the "gpus", "value", and "vendors" blocks so the wording is clear and
grammatically correct.
- Around line 341-378: Correct two spelling typos in the Kubeflow S3/MinIO docs:
replace "configuation" with "configuration" (occurrence near the sentence
starting "When you installed Kubeflow...") and replace "configration" with
"configuration" (occurrence in the last sentence starting "After add this
configmap...") in docs/en/installation/kubeflow.mdx so the documentation reads
correctly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5453358a-2b16-43c7-b82f-0ecf323670cc

📥 Commits

Reviewing files that changed from the base of the PR and between 904d33b and 40d0a64.

📒 Files selected for processing (1)
  • docs/en/installation/kubeflow.mdx

Comment on lines +341 to +378
### How to start a Kubeflow Pipeline Run with external S3/MinIO storage

When you installed Kubeflow with an external S3/MinIO storage service, you need to add a "KFP Launcher" configmap to setup storage used by current namespace or user. You can checkout Kubeflow document https://www.kubeflow.org/docs/components/pipelines/operator-guides/configure-object-store/#s3-and-s3-compatible-provider for more details. If no configuation is set, the pipeline runs may still accessing the default service address like "minio-service.kubeflow:9000" which is not correct.

Below is a simple sample for you to start:

```yaml
apiVersion: v1
data:
  defaultPipelineRoot: s3://mlpipeline
  providers: |-
    s3:
      default:
        endpoint: minio.minio-system.svc:80
        disableSSL: true
        region: us-east-2
        forcePathStyle: true
        credentials:
          fromEnv: false
          secretRef:
            secretName: mlpipeline-minio-artifact
            accessKeyKey: accesskey
            secretKeyKey: secretkey
kind: ConfigMap
metadata:
  name: kfp-launcher
  namespace: wy-testns
```

For example, you should setup below values in this configmap to point to your own S3/MinIO storage

- defaultPipelineRoot: where to store the pipeline intermediate data
- endpoint: s3/MinIO service endpoint. Note, should NOT start with "http" or "https"
- disableSSL: whether disable "https" access to the endpoint
- region: s3 region. If using MinIO, any value will be fine
- credentials: AK/SK in the secrets

After add this configmap, the newly started Kubeflow Pipeline Runs will automatically read this configration, and save stuff that is used by Kubeflow Pipeline.
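
The sample ConfigMap above points at a Secret named `mlpipeline-minio-artifact` with the keys `accesskey` and `secretkey`. A minimal sketch of such a Secret follows, assuming it is created in the same namespace as the ConfigMap (`wy-testns` in the sample). The two credential values are hypothetical placeholders, not values from this PR, and must be replaced with your own AK/SK.

```yaml
apiVersion: v1
kind: Secret
metadata:
  # Must match credentials.secretRef.secretName in the kfp-launcher ConfigMap
  name: mlpipeline-minio-artifact
  # Assumption: same namespace as the kfp-launcher ConfigMap
  namespace: wy-testns
type: Opaque
stringData:
  # Key names must match accessKeyKey / secretKeyKey in the ConfigMap
  accesskey: "REPLACE_WITH_ACCESS_KEY"   # placeholder
  secretkey: "REPLACE_WITH_SECRET_KEY"   # placeholder
```

Using `stringData` lets you write the values in plain text; Kubernetes stores them base64-encoded under `data`.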

⚠️ Potential issue | 🟡 Minor

Fix spelling errors in S3/MinIO configuration section.

The ConfigMap structure and guidance are correct, but there are two typos:

  • Line 343: "configuation" should be "configuration"
  • Line 378: "configration" should be "configuration"
📝 Proposed fix for spelling errors
-If no configuation is set, the pipeline runs may still accessing the default service address like "minio-service.kubeflow:9000" which is not correct.
+If no configuration is set, the pipeline runs may still accessing the default service address like "minio-service.kubeflow:9000" which is not correct.
-After add this configmap, the newly started Kubeflow Pipeline Runs will automatically read this configration, and save stuff that is used by Kubeflow Pipeline.
+After add this configmap, the newly started Kubeflow Pipeline Runs will automatically read this configuration, and save stuff that is used by Kubeflow Pipeline.
🧰 Tools
🪛 LanguageTool

[grammar] ~343-~343: Ensure spelling is correct
Context: ...atible-provider for more details. If no configuation is set, the pipeline runs may still acc...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


[grammar] ~378-~378: Ensure spelling is correct
Context: ...eline Runs will automatically read this configration, and save stuff that is used by Kubeflow...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/en/installation/kubeflow.mdx` around lines 341 - 378, Correct two
spelling typos in the Kubeflow S3/MinIO docs: replace "configuation" with
"configuration" (occurrence near the sentence starting "When you installed
Kubeflow...") and replace "configration" with "configuration" (occurrence in the
last sentence starting "After add this configmap...") in
docs/en/installation/kubeflow.mdx so the documentation reads correctly.

Comment on lines +380 to +424
### Configure Kubeflow Notebook to use custom GPU resources

You can add other GPU resouce types so that Kubeflow Notebook web page can create instances leveraging these hardware, e.g. when using Ascend GPUs.

Edit the configmap by running this command:

```shell
kubectl -n kubeflow get configmap | grep jupyter-web-app-config
kubectl -n kubeflow edit configmap jupyter-web-app-config-<actual-cm-suffix>
```

Find below section and add your GPU resource types like "your-custom.com/gpu".

> NOTE, you can only add resource types using integer values, like 1,2,4,8. Also, you can not add "Virtual" or "Shared" GPU resources using both "Cores" and "Memory" like when you are using HAMi.

```yaml
################################################################
# GPU/Device-Plugin Resources
################################################################
gpus:
  readOnly: false

  # configs for gpu/device-plugin limits of the container
  # https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/#using-device-plugins
  value:
    # the `limitKey` of the default vendor
    # (to have no default, set as "")
    vendor: ""

    # the list of available vendors in the dropdown
    #  `limitsKey` - what will be set as the actual limit
    #  `uiName` - what will be displayed in the dropdown UI
    vendors:
      - limitsKey: "nvidia.com/gpu"
        uiName: "NVIDIA"
      - limitsKey: "amd.com/gpu"
        uiName: "AMD"
      - limitsKey: "habana.ai/gaudi"
        uiName: "Intel Gaudi"
      - limitsKey: "your-custom.com/gpu"
        uiName: "Your Custom Vendor"
    # the default value of the limit
    # (possible values: "none", "1", "2", "4", "8")
    num: "none"
```
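
To illustrate what `limitsKey` controls: per the comments in the config above, it is the key that is set as the actual limit on the notebook container. As a rough sketch, assuming a user picks "Your Custom Vendor" with a count of 1, the resulting container spec would be expected to carry a fragment like the one below. The resource name `your-custom.com/gpu` is the placeholder from the sample, and a device plugin advertising that resource is assumed to be installed on the cluster.

```yaml
# Fragment of a container spec, not a complete manifest
resources:
  limits:
    # Extended resources take whole-number quantities,
    # which is why only integer counts can be offered in the UI
    your-custom.com/gpu: "1"
```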

⚠️ Potential issue | 🟡 Minor

Fix spelling and grammar errors in GPU configuration section.

The kubectl commands and ConfigMap structure are correct, but there are several typos and grammar issues:

  • Line 382: "resouce" should be "resource"
  • Line 382: "these hardware" should be "this hardware"
  • Line 393: "can not" should be "cannot" (or "can't")
📝 Proposed fixes for spelling and grammar
-You can add other GPU resouce types so that Kubeflow Notebook web page can create instances leveraging these hardware, e.g. when using Ascend GPUs.
+You can add other GPU resource types so that Kubeflow Notebook web page can create instances leveraging this hardware, e.g. when using Ascend GPUs.
-> NOTE, you can only add resource types using integer values, like 1,2,4,8. Also, you can not add "Virtual" or "Shared" GPU resources using both "Cores" and "Memory" like when you are using HAMi.
+> NOTE, you can only add resource types using integer values, like 1,2,4,8. Also, you cannot add "Virtual" or "Shared" GPU resources using both "Cores" and "Memory" like when you are using HAMi.
🧰 Tools
🪛 LanguageTool

[grammar] ~382-~382: Ensure spelling is correct
Context: ...om GPU resources You can add other GPU resouce types so that Kubeflow Notebook web pag...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


[grammar] ~382-~382: Ensure spelling is correct
Context: ...eb page can create instances leveraging these hardware, e.g. when using Ascend GPUs. ...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


[style] ~393-~393: Unless you want to emphasize “not”, use “cannot” which is more common.
Context: ...integer values, like 1,2,4,8. Also, you can not add "Virtual" or "Shared" GPU resources...

(CAN_NOT_PREMIUM)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/en/installation/kubeflow.mdx` around lines 380 - 424, Fix the
spelling/grammar in the Kubeflow Notebook GPU config section: change "resouce"
to "resource" and "these hardware" to "this hardware" in the introductory
paragraph, replace "can not" with "cannot" (or "can't") in the NOTE line, and
consider changing "NOTE, you can only" to "NOTE: you can only" for punctuation
consistency; update any corresponding text around the "gpus", "value", and
"vendors" blocks so the wording is clear and grammatically correct.

@cloudflare-workers-and-pages

Deploying alauda-ai with Cloudflare Pages

Latest commit: 40d0a64
Status: ✅  Deploy successful!
Preview URL: https://6b731ccc.alauda-ai.pages.dev
Branch Preview URL: https://troubleshoot-kubeflow-networ.alauda-ai.pages.dev
