Add EEP draft for in-app export visibility #85

Open
happi wants to merge 3 commits into erlang:master from happi:add-in-app-export-visibility-eep
Conversation

@happi happi commented May 19, 2026

This PR proposes application-scoped function visibility for Erlang by adding -app/1 and -app_export/1 module attributes with BEAM runtime enforcement.

The goal is to provide a package-private-style boundary for functions that are exported for use inside one application but are not public API for external callers.

The draft includes semantics for remote calls, apply/3, remote funs, distributed calls, hot code loading, backwards compatibility, security considerations, prior art from EEP 5/67, and implementation notes.

A prototype has been used to validate the basic design, but a clean public implementation fork will be linked later.

Comment thread eeps/eep-0080.md Outdated
application name. Using the OTP application name is recommended, but the
attribute defines a visibility domain. This lets the mechanism remain useful
for code that is not packaged as an OTP application while preserving the common
OTP use case.

This also enables having a group of applications that share the same scope. For example Cowlib has many internal functions for Cowboy that should not be used by Cowboy users. A common scope would enforce that.

Worth pointing out that this can be circumvented by having a module use the same scope and proxying functions to that module. I still think there is value since the user has to explicitly use the scope, demonstrating intent to use internal functions.

Author

Yes, that is true. A module can be placed anywhere and declare -app(my_app), and then it can call functions restricted to my_app.

That is intentional in the current design. The boundary is based on explicit module metadata, not filesystem layout, code path location, or OTP application discovery. I think that is preferable because those alternatives become implicit and brittle, especially with releases, generated code, test modules, vendored code, and custom build systems.

This is not meant to be a security sandbox. It is an encapsulation mechanism for making internal API boundaries explicit and enforceable against accidental use. If someone deliberately adds a false -app(...) attribute to circumvent the boundary, they are already modifying code in the system and should understand that they are opting into that visibility domain.

A module can only declare one app visibility domain, so the attribute still gives tools and the VM a clear, inspectable rule: this module belongs to exactly one visibility group, and app-local calls are allowed only within that group.

Contributor

> A module can only declare one app visibility domain

That seems too limiting to me. E.g. cowboy_module_x declares -app(cowlib) so it can use app-exported functions from the cowlib app scope, but then it can't declare its own scope to use app_exports for its own functions? Should there be a distinction between -app/1 and e.g. a -use_app/1 attribute, where -app/1 is one per module and -use_app/1 is many per module?

Author

You could put everything in the app cow if you want.

Contributor

You could, but it would be a semantic mess.

What if -app/1 were used only when you want to use an app, and -app_export took two arguments: the app and the list of exports? That would count as both the app definition and the app export.

@essen essen May 19, 2026

I think I understand the argument, or at least have a case where the current proposal is limited.

  • Cowboy and Gun both depend on Cowlib which contains many private functions made specifically for Cowboy, Gun and/or both. If used only by one they should still sit in Cowlib to facilitate testing (e.g. parser+builder of RFC components together).
  • Cowlib defines cow scope.
  • Cowboy and Gun also define the cow scope to use the functions from Cowlib.
  • Problem: Cowboy uses Gun only for testing (and vice versa) so the scope is not enough to ensure Cowboy itself doesn't call Gun functions internally (and vice versa).

Using separate scopes for separate modules helps a little, but the problem can still occur. It would likely be caught elsewhere (possibly by xref).

The problem would not happen if a module's scope and the scopes used by the module were separate. Cowboy could have all its modules be in the cowboy scope, Gun in gun and Cowlib in cow, and Cowboy modules that need access to Cowlib (and not to Gun) could say they use the cow scope.

So you'd have the scope of the module, and extra scopes used by the module.
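A minimal sketch of the two-level rule suggested above, assuming a caller is described by the scope it belongs to plus a list of extra scopes it uses. The module name `scope_use_sketch`, the function `allowed/2`, and the map shape are hypothetical and not part of the EEP; the scope atoms follow the Cowboy/Gun/Cowlib example.

```erlang
-module(scope_use_sketch).
-export([allowed/2]).

%% Caller is a map: #{scope := OwnScope, uses := [ExtraScope]}.
%% A call into TargetScope is allowed when the caller lives in that scope
%% or explicitly lists it among the scopes it uses.
-spec allowed(#{scope := atom(), uses := [atom()]}, atom()) -> boolean().
allowed(#{scope := Scope}, Scope) ->
    true;
allowed(#{uses := Uses}, TargetScope) ->
    lists:member(TargetScope, Uses).
```

With `Cowboy = #{scope => cowboy, uses => [cow]}`, calls into `cowboy` and `cow` are allowed, while calls into `gun` are not, which is the separation the single-scope model cannot express.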

@happi happi force-pushed the add-in-app-export-visibility-eep branch from 27a73a2 to 34b5046 Compare May 19, 2026 12:24
@happi happi force-pushed the add-in-app-export-visibility-eep branch from 34b5046 to 7b58015 Compare May 19, 2026 12:54
happi added a commit to happi/otp-app-export that referenced this pull request May 19, 2026
Implements the runtime side of EEP 80 "BEAM-Level In-App Export
Visibility" as a single squashed patch on top of upstream erlang/otp.
Public spec: erlang/eep#85

Mechanism
=========

Two new module attributes:

    -app(App).
    -app_export([Name/Arity, ...]).

A function listed in -app_export/1 is exported normally, but remote
calls to it succeed only when the caller module declares the same
-app(App) value. Cross-app callers and callers with no -app/1 raise

    error:{badappcall, [{TargetModule, TargetFunction, ArgsOrArity, []}]}

Modules without -app/1 behave exactly as today. Functions not listed
in -app_export/1 behave exactly as today.
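As an illustration only, the rule above can be modelled as a pure function. This is a sketch of the described semantics, not the prototype's C implementation; the module name `app_visibility_model`, the function `check/3`, and the use of `undefined` to stand for "no -app/1" are assumptions made for the example.

```erlang
-module(app_visibility_model).
-export([check/3]).

%% check(CallerApp, {TargetApp, {M, F, Arity}}, Restricted)
%%   CallerApp  : the caller module's -app/1 atom, or undefined if it has none
%%   Restricted : true when the target function is listed in -app_export/1
-spec check(atom(), {atom(), mfa()}, boolean()) -> ok.
check(_CallerApp, _Target, false) ->
    %% Functions not listed in -app_export/1 behave exactly as today.
    ok;
check(App, {App, _MFA}, true) when App =/= undefined ->
    %% Caller declares the same -app(App) value as the target.
    ok;
check(_CallerApp, {_TargetApp, {M, F, A}}, true) ->
    %% Cross-app callers and callers with no -app/1 are rejected.
    erlang:error({badappcall, [{M, F, A, []}]}).
```

For example, `check(cow, {cow, {cow_internal, parse, 1}}, true)` returns `ok`, while the same call with `gun` or `undefined` as the caller raises the `badappcall` error. `cow_internal:parse/1` is a made-up target used only as data.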

Implementation summary
======================

Loader (erts/emulator/beam/beam_load.c)
  Parses the new attributes. Stores the module's app atom on the
  Module struct, and for each app-restricted export sets the
  app_restricted flag plus the app_atom field on the Export entry.

Export and Module state (erts/emulator/beam/export.{c,h},
module.{c,h})
  Adds Eterm app_atom and int app_restricted to Export.
  Adds Eterm app_atom to Module.

Process state (erts/emulator/beam/erl_process.{c,h})
  Adds c_p->caller_app (preserved-caller cache, used for the
  error_handler bounce) and c_p->caller_saved_i (a PC inside the
  calling JIT'd function that call_error_handler / apply() can resolve
  back to the caller's module).

Atoms (erts/emulator/beam/atom.names)
  am_app, am_app_export, am_badappcall, am_undefined as needed.

Interpreter check (erts/emulator/beam/beam_common.c,
emu/instrs.tab, emu/macros.tab, emu/beam_emu.c)
  DISPATCH_EXPORT and DISPATCH_EXPORT_TAILCALL gate the dispatch
  through erts_check_app_visibility, passing the caller PC.
  i_call_ext and the apply opcodes write c_p->caller_saved_i so the
  apply() runtime and call_error_handler can recover the caller after
  a tail call or an error_handler bounce.

apply() / fixed_apply() (beam_common.c)
  Populates c_p->caller_app from caller_saved_i when the immediate
  caller has a real app atom; preserves the cached value when the
  caller is system code (e.g. error_handler).

call_error_handler() (beam_common.c)
  When dispatch hits a stub Export and is about to bounce through
  error_handler:undefined_function/3, capture the original caller's
  app from the topmost Erlang-stack CP and, failing that, from
  c_p->caller_saved_i. This is what makes a direct same-app call to a
  never-loaded restricted function succeed: the first dispatch finds
  ep->app_restricted == 0 (the loader has not run for the target
  module yet), so the gate skips, but the bounce preserves the
  caller's identity for the second visibility check after error_handler
  loads the module and re-applies.

ARM JIT (erts/emulator/beam/jit/arm/{beam_asm.hpp,instr_call.cpp,
instr_fun.cpp})
  emit_check_app_visibility calls erts_check_app_visibility_jit with
  the caller module atom baked in as an immediate.
  emit_app_visibility_gate wraps the check with the app_restricted
  fast-path skip and writes c_p->caller_saved_i = adr(gate_pc) before
  the cbz so call_error_handler has a PC fallback.
  i_call_ext, i_call_ext_only, both bypass branches of
  move_call_ext_last, and the apply paths (variable_apply / fixed_apply)
  route their dispatch through the gate.
  emit_call_fun tests FUN_HEADER_KIND_OFFS and gates external funs
  (fun M:F/A); local funs and closures skip the gate (their bodies
  are JIT'd in the originating module and the next call_ext inside the
  body is already gated against that module's app).

x86 JIT (erts/emulator/beam/jit/x86/{beam_asm.hpp,instr_call.cpp,
instr_fun.cpp})
  emit_check_app_visibility / emit_save_app_visibility_caller_pc /
  emit_check_external_fun_app_visibility provide the same surface as
  the ARM helpers and are wired into the equivalent x86 emitters.

Compiler tooling (lib/stdlib/src/erl_lint.erl,
lib/compiler/src/v3_core.erl)
  Recognises -app/1 and -app_export/1 attributes and warns when
  -app_export/1 lists a function not exported or when -app_export/1
  is present without -app/1.

module_info (erts/emulator/beam/erl_bif_info.c)
  Reports app and app_export in module_info(attributes).
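A sketch of how a tool might read this metadata. On a stock OTP release, unrecognised attributes such as -app/1 and -app_export/1 are kept as ordinary user-defined attributes, so the example below compiles and runs without the patch; whether the prototype reports the values in exactly this shape is an assumption. The module name `app_meta_sketch` and its helper functions are hypothetical.

```erlang
-module(app_meta_sketch).
-app(my_app).
-app_export([helper/1]).
-export([helper/1, app_of/1, app_exports_of/1]).

helper(X) -> X.

%% Returns the atom declared with -app/1, or undefined if the module has none.
app_of(Mod) ->
    case proplists:get_value(app, Mod:module_info(attributes)) of
        [App] -> App;
        undefined -> undefined
    end.

%% Returns the {Name, Arity} pairs listed in -app_export/1, or [] if none.
app_exports_of(Mod) ->
    proplists:get_value(app_export, Mod:module_info(attributes), []).
```

Because the lookup relies only on module_info(attributes), the same code could serve xref-style tooling whether or not the runtime enforces anything.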

Verified scenarios
==================

All paths tested with both -emu_flavor jit and -emu_flavor emu where
applicable (an external escript test_comprehensive plus an internal
validate.erl plus an internal dist_test):

  Direct calls, same-app and cross-app
  Tail calls, same-app and cross-app
  apply/3 with literal args
  Dynamic apply/3 (M, F, A from runtime values)
  Cross-app calls through a stub Export (target not yet loaded), via
    both direct call and apply/3
  Same-app calls through a stub Export
  Caller has no -app at all (shell / escript / erl_eval)
  Cache-pollution sequences (a same-app gated call does not allow a
    later cross-app call to slip through)
  External funs (fun M:F/A) same-app and cross-app
  Closures: same-app closure body that invokes a restricted function
    is allowed; cross-app closure body is blocked at its own call_ext.
  rpc:call between two nodes, with no-app caller and with bar_c-style
    cross-app caller resident on the callee; enforcement happens on
    the callee node using its metadata as the EEP specifies.

Co-authored-by: Erik Stenman <erik@stenmans.org>
@happi happi changed the title from "Add EEP 80 draft for in-app export visibility" to "Add EEP draft for in-app export visibility" May 19, 2026
Comment thread eeps/eep-0080.md Outdated
Comment on lines +22 to +27
The feature adds package-private-style visibility without adding new Erlang
syntax and without changing the behavior of existing modules. Code that does
not use the new attributes behaves as it does today. The proposal is intended
to make internal APIs explicit, prevent accidental coupling between OTP
applications or libraries, and give tools a shared source of visibility
metadata.

@ieQu1 ieQu1 May 19, 2026

If this is the goal, then I don't see why the runtime should enforce it.
It should be sufficient to give this information to the static checking tools, e.g. xref.

Runtime checks are very problematic for various reasons:

  • It complicates live debugging and troubleshooting. Sometimes we have to call internal functions from a remote shell either to query state of a live system or to recover it.
  • Concept of application leaks from the standard library to the VM. I could be wrong, but I don't think the VM currently has any direct knowledge about the applications; rather the application controller uses VM primitives like group leader to implement the concept.
  • Other BEAM languages may have different views on what is considered an application, and how they are organized.
  • Additional costs for every remote call. The document already mentions it, but given the problems above, I don't think it's worth it.

Author

Static tooling would help, and the EEP explicitly expects xref/Dialyzer/language servers to use the metadata. The reason I proposed runtime enforcement is that static checks do not cover all call paths reliably: apply/3, remote funs, dynamically loaded modules, shell-created calls, generated code, mixed compile pipelines, and modules compiled without the checker. Without runtime behavior, the attribute becomes documentation plus optional tooling, which is close to what EEP 67 explored.

On the specific concerns:

Debugging/recovery: I agree this is a real cost. The current design would make intentionally calling an app-restricted function from the shell fail unless the shell/eval context has matching app metadata. One possible mitigation is an explicit unsafe escape hatch for privileged debugging, but I would want that to be clearly outside normal call semantics rather than silently weakening the boundary.

Application concept in the VM: The intent is not for the VM to understand OTP applications or .app files. -app/1 is just an atom carried as module metadata and copied into export metadata. It defines a visibility domain, not an OTP application. That said, the name may be misleading; this is one reason I listed naming as an open question. A name like -visibility_domain/1 or similar might avoid implying that the VM is learning OTP application structure.

Other BEAM languages: This is also why I prefer explicit BEAM/module metadata over deriving anything from OTP application layout, code paths, or build tools. Other languages could either emit the metadata or ignore it. Modules that do not emit it keep current behavior.

Remote-call cost: The prototype numbers suggest public calls are within noise, while allowed restricted calls pay a few ns/op in the x86 JIT microbench. But the larger question is whether runtime enforcement is valuable enough to justify any cost and complexity. That is exactly what I am hoping to validate with the EEP discussion.

So I think the disagreement is not about whether static tooling is useful. It is useful. The question is whether the feature should define enforceable semantics, or remain advisory metadata. My argument for enforcement is that “internal API boundary” is much more predictable when every dynamic call path observes the same rule.
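To make the dynamic call paths mentioned above concrete, here is a small sketch of a call whose target module and function exist only as runtime values, so a purely static checker has nothing to resolve. The module name `dynamic_call_sketch` and the function `call/3` are hypothetical.

```erlang
-module(dynamic_call_sketch).
-export([call/3]).

%% The target is assembled from strings at runtime (for example from
%% configuration or a network message), so the call site carries no
%% statically known M:F/A.
-spec call(string(), string(), [term()]) -> term().
call(ModStr, FunStr, Args) ->
    M = list_to_existing_atom(ModStr),
    F = list_to_existing_atom(FunStr),
    apply(M, F, Args).
```

For instance, `dynamic_call_sketch:call("erlang", "length", [[a, b, c]])` dispatches to `erlang:length/1`. Under the proposal, the same path could reach an app-restricted function, and only a runtime check would see it.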

@ieQu1 ieQu1 May 19, 2026

> static checks do not cover all call paths reliably: apply/3, remote funs, dynamically loaded modules, shell-created calls, generated code

As a library developer, I am not concerned about someone going out of their way to shoot themselves in the foot by calling an internal API. If I were, for some reason, absolutely determined not to allow anyone to call a common function, then I'd put it in an hrl or something.

> Remote-call cost: The prototype numbers suggest public calls are within noise, while allowed restricted calls pay a few ns/op in the x86 JIT microbench.

Based on common sense, I presume that the majority of remote calls happen between modules of the same application (scope, namespace, ...), with cross-scope calls being relatively rare. The feature penalizes the former. So if the feature becomes widely used, then we can assume that most calls will be penalized.

@MarkoMin
Contributor

I think the attribute names are misleading. -app is short for application, but what is described in the EEP would be something like -scope or -namespace. -app_export/1 would be suitable if the semantics were the same as in EEP 67, meaning that those functions should be used "within the application". I don't like modules defining custom -app attributes. I don't even know why it is called that, because it is not tied to application names and it doesn't restrict access to functions based on the application. Having app in the .app.src file and app in modules mean two very different things would be confusing. The link between the attribute names and their semantics is very weak. I think what you described is an API namespacing mechanism.

Also, as I commented, there should be a distinction between "using" some app (namespace) and defining it, e.g. -namespace/1 and -use_namespace/1 should be defined.

Performance: Those benchmark results seem really promising. I was thinking about an implementation in the VM, but then concluded that it would probably be too slow. However, I think there should be a flag for the BEAM to completely disable those checks at runtime (i.e. I'd never enforce those; ideally it should be opt-in). I'd also like it if you could turn it on/off at runtime without having to restart the VM.

@erszcz

erszcz commented May 19, 2026

I second the point that if the attribute is or suggests an application, then in this context it'd be assumed to be an OTP application. Calling it a namespace, scope, or boundary (there's prior art in the BEAM ecosystem...), could help clarify it.

> It complicates live debugging and troubleshooting. Sometimes we have to call internal functions from a remote shell either to query state of a live system or to recover it.

Strong 👍 I think it's one of BEAM's great strengths that it's just easy to "look under the hood" and do troubleshooting like this. The java-esque private/package/public model is a bit in conflict with that approach.

> My argument for enforcement [...]

Since the feature is supposed to be opt-in, we can just choose not to use it in a codebase. The problem is dependencies. Advisory metadata is a strong enough signal for those willing to leverage it. Enforcement on a dependency that's chosen to use the feature, without an escape hatch, could make troubleshooting and fixing dependencies much less pleasant 😬

@josevalim
Contributor

How will the apply restrictions compose with OTP behaviours? The gen_server module is part of stdlib, but if my application supervisor (my_app_server) has app exports, then gen_server module will likely fail when it calls Mod:init/1. So we'd probably need a linter that -app_export must not overlap with any callback required or optional for behaviours declared by the module.

I also share all the concerns raised by @ieQu1. In my opinion this feature fully belongs in a static analyzer. Even if you eventually want to make it part of the VM, I'd suggest first implementing it as a static analysis tool; otherwise it will be a pain to find out only at runtime that I have introduced an exception due to visibility changes. For example, imagine cowlib or elixir starts enforcing those rules and it breaks a dependent. I know in an ideal world folks would not invoke private APIs and that tests should catch any regressions, but those violations should be surfaced as early as possible, at compile-time, by default.
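The linter rule suggested above (that -app_export must not overlap with behaviour callbacks) could be sketched roughly as follows. This is not part of erl_lint; the module name `callback_overlap_sketch` and the function `overlap/2` are hypothetical. It relies on `Behaviour:behaviour_info(callbacks)`, which modules that declare -callback attributes, such as gen_server, provide.

```erlang
-module(callback_overlap_sketch).
-export([overlap/2]).

%% AppExports : [{Name, Arity}] as listed in -app_export/1
%% Behaviours : modules named in -behaviour/1 attributes
%% Returns the app-exported functions that are also behaviour callbacks
%% (required or optional), which external framework code would need to call.
-spec overlap([{atom(), arity()}], [module()]) -> [{atom(), arity()}].
overlap(AppExports, Behaviours) ->
    Callbacks = lists:append([B:behaviour_info(callbacks) || B <- Behaviours]),
    [FA || FA <- AppExports, lists:member(FA, Callbacks)].
```

A non-empty result, for example `[{init, 1}]` for a gen_server module that app-exports `init/1`, would be the condition to warn about at compile time.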

@happi
Author

happi commented May 19, 2026

Thanks everyone. I think the discussion has surfaced several separate design questions, and I should treat them separately instead of folding them into one large argument.

First, I agree that -app/1 is probably the wrong name. The proposal is about a visibility domain, not necessarily an OTP application. I will revise the draft toward terminology such as scope or visibility_domain, unless there is a better established BEAM term.

Second, the Cowlib/Cowboy/Gun example shows a real limitation of the current one-domain model. Separating “the domain this module belongs to” from “additional domains this module may use” would express more cases. I still think the single-domain model is the smallest useful semantics, but I will document the two-domain/use-domain model as an alternative or possible extension so the trade-off is explicit.

Third, functions required as callbacks by behaviours outside the visibility domain should use ordinary -export/1. Scoped exports are intended for internal cross-module APIs, not for callbacks invoked by external framework code such as gen_server, supervisor, application, etc. A separate callback-specific export form might be useful in the future, but I do not think this EEP should try to solve that.

Fourth, I still think static-only metadata is a different feature from this proposal. Static tooling is valuable and should use the metadata, but it does not define behaviour for apply/3, remote funs, dynamically loaded modules, generated code, or modules compiled without the checker. The core question is whether Erlang should have enforceable internal API semantics at all. If the answer is yes, the VM needs to participate; if the answer is no, then the proposal becomes advisory metadata and is much closer to EEP 67.

Static analysis could also be used to report what would happen if an existing export were given a narrower scope. That can help maintainers, but it cannot fully answer how library users call the code in the wild. A user calling internal functions would discover quickly that upgrading to a version with stricter visibility breaks their code, which is similar to relying on any undocumented internal API.

The debugging concern is valid. I will think through an explicit unsafe/debug escape hatch rather than silently weakening normal call semantics. In practice, when I need to debug a running system, I often recompile a module with export_all, so this may be solvable as an explicit debug mode rather than part of the normal visibility rule.

@happi
Author

happi commented May 19, 2026

Small clarification on the debugging point: I think export_all should probably act as a debug escape hatch for scoped exports.

Erlang debugging already has a precedent for explicitly loading code with different visibility during investigation, for example:

    c(foo, [export_all]).

That is a deliberate local debugging action, with the usual caveats around code loading and source access in a live system. For example, loading a module again while a process is still running old code can purge the old code and kill that process.

In my experience, if an operator is allowed to run a shell on a live system, they are usually also allowed access to the source code or to a way of producing an equivalent debug build.

@kikofernandez
Contributor

For general information, OTP is reading the comments and checking the community opinion on this matter.
Thank you everyone for the EEP and the comments on it :)

@essen

essen commented May 20, 2026

> Small clarification on the debugging point: I think export_all should probably act as a debug escape hatch for scoped exports.

IMO the shell should have a mode (possibly the default in interactive) that allows calling any functions regardless of scope. It would be far too inconvenient to use the shell for development otherwise.

@ieQu1

ieQu1 commented May 21, 2026

Some late comments on debugging:

  1. While calling an internal function from the shell of a production system is rather common (provided one studies its logic), I would consider loading code recompiled with "export_all" directive too dangerous due to side effects of code replacement or potentially lost type-guided optimizations for internal functions.
  2. Escape hatches, such as making shell privileged in some way, nullify the benefits of runtime checks, and basically revert security to the honor system we have now, but with extra hoops: I can already hear security researchers typing bugs "combining feature X with feature Y allows to bypass application A's security perimeter".
