Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
30 changes: 30 additions & 0 deletions .claude/guidelines/logging.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,30 @@
# Logging - Logger Levels and Sentry Noise

This guideline covers how to choose `Logger` levels in the Lightning Phoenix app so that real bugs stay visible and Sentry stays signal-rich.

## Level Policy

- **`Logger.info` / `Logger.warning`** - expected, user-actionable, or transient conditions. Examples: a user must re-authorise a credential, a project is misconfigured, an upstream provider returned 429/503.
- **`Logger.error`** - reserve for genuine application faults and invariant violations (the things you actually want to be paged about).

## Why It Matters

Sentry's `LoggerHandler` is configured with `level: :error` and `capture_log_messages: true` in `lib/lightning/application.ex`. That means **every `Logger.error` becomes a Sentry error event** - including bare log messages, not just exceptions.

Using `:error` for non-faults creates Sentry noise that buries real bugs. Downgrading to `info`/`warning` keeps the line visible in the logs (important for self-hosted / logs-only operators - the plain-text formatter in `config/config.exs` includes `:run_id`) without paging anyone.

## Single Log Site

Log a given condition at exactly **one** site, not at every layer it propagates through.

For credential resolution that site is `Lightning.Credentials.Resolver` (`lib/lightning/credentials/resolver.ex`) - the run-credential-resolution layer - **not** `LightningWeb.Channels.RunChannel`, which only formats the error and replies.

## Worked Example: Credential Resolution

| Reason | Level | Rationale |
| --- | --- | --- |
| `project_not_found` | `error` | Genuine invariant violation |
| `environment_not_configured` | `warning` | User-actionable config |
| `environment_mismatch` | `warning` | User-actionable config |
| `reauthorization_required` | `info` | User / IdP state |
| `temporary_failure` | `info` | Transient provider error |
6 changes: 6 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,12 @@ and this project adheres to

### Changed

- Stop reporting expected credential-resolution failures (OAuth re-auth needed,
misconfigured project environment, transient provider errors) to Sentry. These
are now logged once, in `Lightning.Credentials.Resolver`, at `info`/`warning`
instead of `error`; only a genuinely missing project still logs at `error`.
[#4814](https://github.com/OpenFn/lightning/issues/4814)

### Fixed

- Channel join crashes when multiple users open the same workflow concurrently
Expand Down
1 change: 1 addition & 0 deletions CLAUDE.md
Original file line number Diff line number Diff line change
Expand Up @@ -210,4 +210,5 @@ Detailed guidelines in `.claude/guidelines/`:
- `e2e-testing.md` - Playwright E2E testing
- `yex-guidelines.md` - Critical Yex (Yjs/Elixir) usage rules
- `toast-notifications.md` - Notification patterns
- `logging.md` - Logger level conventions and Sentry noise (info/warning vs error)
- `ui-patterns.md` - Button variants, disabled states, Tailwind conventions
35 changes: 28 additions & 7 deletions lib/lightning/credentials/resolver.ex
Original file line number Diff line number Diff line change
Expand Up @@ -95,6 +95,7 @@ defmodule Lightning.Credentials.Resolver do
{:ok, ResolvedCredential.from(credential, body)}

{:error, reason} ->
log_resolution_error(reason)
{:error, {reason, credential}}
end
end
Expand All @@ -105,14 +106,11 @@ defmodule Lightning.Credentials.Resolver do
{:ok, ResolvedCredential.from(credential, body)}

{:error, :environment_not_found} ->
Logger.error(
"Credential environment does not match project environment",
project_env: project_env
)

log_resolution_error(:environment_mismatch, project_env: project_env)
{:error, {:environment_mismatch, credential}}

{:error, reason} ->
log_resolution_error(reason)
{:error, {reason, credential}}
end
end
Expand Down Expand Up @@ -147,11 +145,11 @@ defmodule Lightning.Credentials.Resolver do
{:ok, env}

%Project{env: nil} ->
Logger.error("Project has no environment configured")
log_resolution_error(:environment_not_configured)
{:error, :environment_not_configured}

nil ->
Logger.error("Project not found for run")
log_resolution_error(:project_not_found)
{:error, :project_not_found}
end
end
Expand Down Expand Up @@ -216,4 +214,27 @@ defmodule Lightning.Credentials.Resolver do
nil
end
end

defp log_resolution_error(reason, metadata \\ [])

defp log_resolution_error(:project_not_found, meta),
do: Logger.error("Project not found for run", meta)

defp log_resolution_error(:environment_not_configured, meta),
do: Logger.warning("Project has no environment configured", meta)

defp log_resolution_error(:environment_mismatch, meta),
do:
Logger.warning(
"Credential environment does not match project environment",
meta
)

defp log_resolution_error(:reauthorization_required, meta),
do: Logger.info("OAuth refresh token has expired", meta)

defp log_resolution_error(:temporary_failure, meta),
do: Logger.info("Could not reach the OAuth provider", meta)

defp log_resolution_error(_reason, _meta), do: :ok
end
15 changes: 1 addition & 14 deletions lib/lightning_web/channels/run_channel.ex
Original file line number Diff line number Diff line change
Expand Up @@ -486,8 +486,6 @@ defmodule LightningWeb.RunChannel do
_project_id,
_run_id
) do
Logger.error("Project has no environment configured")

error =
LightningWeb.ErrorFormatter.format(:environment_not_configured, %{
project: socket.assigns.project_id
Expand All @@ -503,8 +501,6 @@ defmodule LightningWeb.RunChannel do
_project_id,
_run_id
) do
Logger.error("Project not found for run")

error = LightningWeb.ErrorFormatter.format(:project_not_found, %{})
{:reply, {:error, error}, socket}
end
Expand All @@ -519,11 +515,6 @@ defmodule LightningWeb.RunChannel do
project_env =
Lightning.Projects.get_project!(socket.assigns.project_id).env || "unknown"

Logger.error(
"Credential environment does not match project environment",
project_env: project_env
)

error =
LightningWeb.ErrorFormatter.format(
{:environment_mismatch, credential},
Expand All @@ -540,8 +531,6 @@ defmodule LightningWeb.RunChannel do
_project_id,
_run_id
) do
Logger.error("OAuth refresh token has expired")

error =
LightningWeb.ErrorFormatter.format(reason, %{
project: socket.assigns.project_id
Expand All @@ -557,9 +546,7 @@ defmodule LightningWeb.RunChannel do
_project_id,
_run_id
) do
Logger.error("Could not reach the oauth provider")

{:reply, {:error, "Could not reach the oauth provider. Try again later"},
{:reply, {:error, "Could not reach the OAuth provider. Try again later"},
socket}
end
end
2 changes: 1 addition & 1 deletion priv/gettext/en/LC_MESSAGES/errors.po
Original file line number Diff line number Diff line change
Expand Up @@ -72,7 +72,7 @@ msgstr "Please change your email"

msgid "oauth_reauth_required"
msgstr ""
"Oauth token has expired. Reauthorize with your external system:"
"OAuth token has expired. Reauthorize with your external system:"
" 1. Go to %{credentials_url}"
" 2. Find %{credential_name}"
" 3. Click \"Edit\" and then \"Reauthorize\""
Expand Down
141 changes: 137 additions & 4 deletions test/lightning/credentials/resolver_test.exs
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,10 @@ defmodule ResolverTest do

alias Lightning.Credentials.Resolver

require Logger

import Lightning.Factories
import ExUnit.CaptureLog

describe "resolve_credential/1 with regular credential" do
test "returns ResolvedCredential with credential body" do
Expand Down Expand Up @@ -248,10 +251,14 @@ defmodule ResolverTest do

credential = Repo.preload(credential, :oauth_client)

assert {:error, {:reauthorization_required, credential}} =
Resolver.resolve_credential(credential)
{result, log} =
capture_info_log(fn -> Resolver.resolve_credential(credential) end)

assert {:error, {:reauthorization_required, credential}} = result
assert credential.name == "Test Googlesheets Credential"

assert log =~ "[info]"
assert log =~ "OAuth refresh token has expired"
end

test "when refresh fails with rate limit returns temporary_failure error", %{
Expand Down Expand Up @@ -282,8 +289,13 @@ defmodule ResolverTest do

credential = Repo.preload(credential, :oauth_client)

assert {:error, {:temporary_failure, _credential}} =
Resolver.resolve_credential(credential)
{result, log} =
capture_info_log(fn -> Resolver.resolve_credential(credential) end)

assert {:error, {:temporary_failure, _credential}} = result

assert log =~ "[info]"
assert log =~ "Could not reach the OAuth provider"
end

test "when refresh fails with other error returns generic error", %{
Expand Down Expand Up @@ -725,5 +737,126 @@ defmodule ResolverTest do
assert {:error, :not_found} =
Resolver.resolve_credential(run, other_credential.id)
end

test "logs environment_mismatch at warning level when credential lacks the project environment body",
%{user: user} do
project =
insert(:project, env: "staging", project_users: [%{user: user}])

credential =
insert(:credential, user: user, name: "Mismatch Credential")
|> with_body(%{name: "main", body: %{"key" => "value"}})

%{jobs: [job]} =
workflow =
build(:workflow, project: project)
|> with_job(%{
project_credential: %{credential: credential, project: project}
})
|> insert()

dataclip = insert(:dataclip)

%{runs: [run]} =
insert(:workorder, workflow: workflow)
|> with_run(%{dataclip: dataclip, starting_job: job})

{result, log} =
with_log(fn -> Resolver.resolve_credential(run, credential.id) end)

assert {:error, {:environment_mismatch, _credential}} = result

assert log =~ "[warning]"
assert log =~ "Credential environment does not match project environment"
end

test "logs environment_not_configured at warning level for a non-root project with no env",
%{user: user} do
parent = insert(:project, env: "main")

project =
insert(:project,
parent_id: parent.id,
env: nil,
project_users: [%{user: user}]
)

credential =
insert(:credential, user: user, name: "Unconfigured Env Credential")
|> with_body(%{name: "main", body: %{"key" => "value"}})

%{jobs: [job]} =
workflow =
build(:workflow, project: project)
|> with_job(%{
project_credential: %{credential: credential, project: project}
})
|> insert()

dataclip = insert(:dataclip)

%{runs: [run]} =
insert(:workorder, workflow: workflow)
|> with_run(%{dataclip: dataclip, starting_job: job})

{result, log} =
with_log(fn -> Resolver.resolve_credential(run, credential.id) end)

assert {:error, {:environment_not_configured, nil}} = result

assert log =~ "[warning]"
assert log =~ "Project has no environment configured"
end

test "logs project_not_found at error level when the run's project is missing" do
# No project_users / project_credentials so the project can be removed
# without tripping restrict FKs; project lookup is what we exercise.
project = insert(:project)

%{jobs: [job]} =
workflow =
build(:workflow, project: project)
|> with_job()
|> insert()

dataclip = insert(:dataclip)

%{runs: [run]} =
insert(:workorder, workflow: workflow)
|> with_run(%{dataclip: dataclip, starting_job: job})

# Remove the project so the run's in-memory struct resolves to a
# missing project (workflow/workorder/run cascade-delete, but the
# struct still drives get_project_for_run/1 -> nil).
Repo.delete_all(
from(p in Lightning.Projects.Project, where: p.id == ^project.id)
)

fake_credential_id = Ecto.UUID.generate()

{result, log} =
with_log(fn -> Resolver.resolve_credential(run, fake_credential_id) end)

assert {:error, {:project_not_found, nil}} = result

assert log =~ "[error]"
assert log =~ "Project not found for run"
end
end

# The test logger level is :warning (see config/test.exs), which gates
# :info messages at the primary :logger level before any capture handler
# sees them. Per-process levels (Logger.put_process_level/2) can only
# restrict below the primary level, not lift above it, so the primary level
# must be lowered for the duration of the capture and restored afterwards.
defp capture_info_log(fun) do
previous_level = Logger.level()
Logger.configure(level: :info)

try do
with_log([level: :info], fun)
after
Logger.configure(level: previous_level)
end
end
end
2 changes: 1 addition & 1 deletion test/lightning_web/channels/run_channel_test.exs
Original file line number Diff line number Diff line change
Expand Up @@ -654,7 +654,7 @@ defmodule LightningWeb.RunChannelTest do

assert_reply ref,
:error,
"Could not reach the oauth provider. Try again later"
"Could not reach the OAuth provider. Try again later"
end
end

Expand Down
Loading