Port dataset-authoring & deployment functionality from the core monorepo #24

@LNSD

Summary

The TypeScript SDK that currently lives in-tree in the Rust monorepo (edgeandnode/amp under typescript/amp/) is going to be removed. The dataset-authoring & deployment functionality it contains needs to be ported here, since amp-typescript does not yet implement it.

amp-typescript already contains (and expands) the read/query SDK: the Arrow Flight client (query/streamQuery/explain), the Admin API client (richer than the monorepo version), the Registry HTTP API client, Auth (PKCE + device flow), protocol-stream, and the query + auth CLI commands. The gap is the amp build / register / deploy / publish / dev / proxy workflow driven by amp.config.ts.

Important nuance for whoever picks this up

The domain schemas already exist here in packages/amp/src/core/domain.ts (DatasetConfig, TableDefinition, FunctionDefinition, FunctionSource, DatasetMetadata, DatasetDerived, DatasetManifest, …), and the Admin API client (packages/amp/src/admin/api.ts) and Registry HTTP API client (packages/amp/src/registry/api.ts, incl. publishMyDataset / publishMyDatasetVersion) are present. So this is about porting the runtime machinery and the CLI that consume those models, not re-declaring the schemas.


Missing functionality to port

A. ABI → SQL authoring DSL — src/config.ts (entirely absent)

Helpers that turn a contract ABI into SQL table definitions:

  • defineDataset(fn) — config entry-point wrapper
  • eventQuery(abiEvent, rpcSource) — builds the evm_decode_log(...) / evm_topic(...) SQL
  • eventTable, eventTables(abi, rpcSource), eventTableName, camelToSnake

Requires abitype (not currently a dependency of packages/amp).
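
As a rough illustration of the naming helpers, here is a minimal sketch. It assumes a simplified, hypothetical `AbiEventLike` shape in place of abitype's `AbiEvent`, and it assumes table names are the snake_case form of the event name. `eventSignature` is a hypothetical helper that only builds the canonical signature string; the actual `evm_decode_log(...)` / `evm_topic(...)` SQL emitted by `eventQuery` is not reproduced here.

```typescript
// Hypothetical, simplified ABI event shape; the real DSL would use abitype's AbiEvent.
interface AbiEventInput {
  readonly name: string
  readonly type: string
  readonly indexed?: boolean
}
interface AbiEventLike {
  readonly type: "event"
  readonly name: string
  readonly inputs: ReadonlyArray<AbiEventInput>
}

// Convert "TransferSingle" or "approvalForAll" to snake_case.
function camelToSnake(name: string): string {
  return name.replace(/([a-z0-9])([A-Z])/g, "$1_$2").toLowerCase()
}

// Canonical Solidity event signature, e.g. "Transfer(address,address,uint256)".
// Note: tuple inputs would need their components expanded; omitted in this sketch.
function eventSignature(event: AbiEventLike): string {
  return `${event.name}(${event.inputs.map((input) => input.type).join(",")})`
}

// Table name derived from the event name (assumed convention).
function eventTableName(event: AbiEventLike): string {
  return camelToSnake(event.name)
}

const transfer: AbiEventLike = {
  type: "event",
  name: "Transfer",
  inputs: [
    { name: "from", type: "address", indexed: true },
    { name: "to", type: "address", indexed: true },
    { name: "value", type: "uint256" }
  ]
}

console.log(eventTableName(transfer), eventSignature(transfer))
```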

B. Config loader — src/ConfigLoader.ts (absent)

ConfigLoader Effect service:

  • Loads amp.config.{ts,mts,cts,js,mjs,cjs,json} via jiti (already a dependency)
  • Context class with functionSource(relativePath) to read UDF source files relative to the config, with a directory-traversal guard
  • find() — config-file discovery across the candidate list
  • watch() — hot-reload stream with config/manifest change detection (Stream.changesWith)
  • build() — load + delegate to ManifestBuilder
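
The discovery and traversal-guard bullets above can be sketched roughly as follows. This is a plain-TypeScript illustration, not the Effect service: `findConfig` and `resolveInside` are hypothetical names, and treating the extension list as the lookup order is an assumption.

```typescript
import * as path from "node:path"

// Candidate file names, in the order listed above (assumed to be the lookup order).
const CANDIDATES = ["ts", "mts", "cts", "js", "mjs", "cjs", "json"].map(
  (ext) => `amp.config.${ext}`
)

// Pick the first candidate that is present in a directory listing.
function findConfig(filesInDir: ReadonlyArray<string>): string | undefined {
  return CANDIDATES.find((name) => filesInDir.includes(name))
}

// Resolve a UDF source path relative to the config directory and reject
// anything that escapes that directory (e.g. "../secret.js" or an absolute path).
function resolveInside(baseDir: string, relativePath: string): string {
  const base = path.resolve(baseDir)
  const target = path.resolve(base, relativePath)
  const rel = path.relative(base, target)
  const escapes =
    rel === "" || rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel)
  if (escapes) {
    throw new Error(`Path "${relativePath}" resolves outside of "${base}"`)
  }
  return target
}
```

Comparing the `path.relative` result, rather than checking the raw input for `..`, also catches inputs such as `udfs/../../x` that only escape after normalization.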

C. Manifest builder — src/ManifestBuilder.ts (absent)

ManifestBuilder service: DatasetConfig → { metadata, manifest }. Resolves table schemas by calling the Admin getOutputSchema endpoint (which already exists in the external Admin client) and assembles a DatasetDerived manifest with tables + functions. Includes ManifestBuilderError.
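
To make the data flow concrete, here is a minimal sketch of the config-to-manifest step. All shapes (`DatasetConfigLike`, `Column`, `BuiltDataset`) are hypothetical simplifications of the real domain schemas, functions are left out for brevity, and the schema lookup is injected as a plain async function standing in for the Admin getOutputSchema call.

```typescript
// Hypothetical, simplified shapes; the real ones live in core/domain.ts.
interface Column { readonly name: string; readonly type: string }
interface DatasetConfigLike {
  readonly name: string
  readonly version: string
  readonly tables: Readonly<Record<string, { readonly sql: string }>>
}
interface BuiltTable { readonly sql: string; readonly schema: ReadonlyArray<Column> }
interface BuiltDataset {
  readonly metadata: { readonly name: string; readonly version: string }
  readonly manifest: { readonly tables: Record<string, BuiltTable> }
}
// Stand-in for the Admin getOutputSchema endpoint (signature is an assumption).
type GetOutputSchema = (sql: string) => Promise<ReadonlyArray<Column>>

// Pure assembly step: pair every table's SQL with its resolved schema.
function assembleManifest(
  config: DatasetConfigLike,
  schemas: Readonly<Record<string, ReadonlyArray<Column>>>
): BuiltDataset {
  const tables: Record<string, BuiltTable> = {}
  for (const [name, table] of Object.entries(config.tables)) {
    const schema = schemas[name]
    if (schema === undefined) {
      throw new Error(`No schema resolved for table "${name}"`)
    }
    tables[name] = { sql: table.sql, schema }
  }
  return {
    metadata: { name: config.name, version: config.version },
    manifest: { tables }
  }
}

// Resolve all table schemas concurrently, then assemble.
async function buildManifest(
  config: DatasetConfigLike,
  getOutputSchema: GetOutputSchema
): Promise<BuiltDataset> {
  const schemas: Record<string, ReadonlyArray<Column>> = {}
  await Promise.all(
    Object.entries(config.tables).map(async ([name, table]) => {
      schemas[name] = await getOutputSchema(table.sql)
    })
  )
  return assembleManifest(config, schemas)
}
```

Keeping the assembly step pure makes it testable without an Admin server; only `buildManifest` needs the network.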

D. Manifest context wiring — src/ManifestContext.ts (absent)

ManifestContext tag + layerFromConfigFile(Option<file>) (find-or-load → build). The glue every authoring CLI command depends on.

E. Registry publish-flow orchestration — src/AmpRegistry.ts (absent)

The HTTP endpoints exist in registry/api.ts, but the publishFlow orchestration is not ported: ownership checks, version-exists checks, DTO mapping (AmpRegistry*Dto), and the tagged errors DatasetOwnershipError, VersionAlreadyExistsError, RegistryApiError, DatasetAlreadyExistsError, DatasetNotFoundError.
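
The decision logic behind those checks might look roughly like the sketch below. It is a pure function over hypothetical, simplified inputs; the real flow uses Effect tagged errors and the Registry client, and the order of the checks here is an assumption. Only the error names and the two endpoint names come from this issue.

```typescript
// Hypothetical, simplified view of what the registry knows about a dataset.
interface ExistingDataset {
  readonly owner: string
  readonly versions: ReadonlyArray<string>
}

type PublishPlan =
  | { readonly _tag: "CreateDataset"; readonly version: string } // would call publishMyDataset
  | { readonly _tag: "AddVersion"; readonly version: string } // would call publishMyDatasetVersion

type PublishError =
  | { readonly _tag: "DatasetOwnershipError"; readonly owner: string; readonly caller: string }
  | { readonly _tag: "VersionAlreadyExistsError"; readonly version: string }

// Decide what a publish should do before touching the network.
function planPublish(
  existing: ExistingDataset | undefined,
  caller: string,
  version: string
): PublishPlan | PublishError {
  if (existing === undefined) {
    // Nothing published under this name yet: first publish creates the dataset.
    return { _tag: "CreateDataset", version }
  }
  if (existing.owner !== caller) {
    return { _tag: "DatasetOwnershipError", owner: existing.owner, caller }
  }
  if (existing.versions.includes(version)) {
    return { _tag: "VersionAlreadyExistsError", version }
  }
  return { _tag: "AddVersion", version }
}
```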

F. CLI commands — only auth + query exist here; these 6 are missing

| Command | What it does |
| --- | --- |
| build | Build manifest from amp.config.ts; print or -o to file |
| register | Register the dataset definition with Admin under a tag / dev |
| deploy | Deploy a version for extraction; base-dataset terminal-state pre-check + confirm prompt, --end-block, --force |
| publish | Full flow: register → deploy → registry publish; changelog prompt; prints playground URL |
| dev | Watch amp.config.ts, re-register + deploy under dev on each change (hot reload) |
| proxy | Connect↔gRPC proxy in front of the Arrow Flight server (node:http + connectNodeAdapter) |
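
For the dev command, the key behavior is that a file event should only trigger re-register + deploy when the built manifest actually changed. A minimal sketch of that de-duplication, in plain TypeScript rather than Stream.changesWith, could look like this. `makeChangeDetector` is a hypothetical helper, and comparing JSON.stringify output is an assumption about what counts as "changed".

```typescript
// Returns a function that reports whether a value differs from the previous one.
// Comparison is by JSON text, so object key order matters in this sketch.
function makeChangeDetector<A>(): (value: A) => boolean {
  let previous: string | undefined
  return (value) => {
    const next = JSON.stringify(value)
    const changed = next !== previous
    previous = next
    return changed
  }
}

// Usage: only redeploy when the manifest content changed.
const manifestChanged = makeChangeDetector<{ tables: Record<string, string> }>()
for (const manifest of [
  { tables: { transfer: "SELECT 1" } },
  { tables: { transfer: "SELECT 1" } }, // file saved without changes: skipped
  { tables: { transfer: "SELECT 2" } }
]) {
  if (manifestChanged(manifest)) {
    console.log("would re-register + deploy under dev")
  }
}
```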

G. Minor / to confirm

  • createClient / createAuthInterceptor (src/index.ts): convenience wrappers, functionally superseded by the service API (ArrowFlight.layer + layerInterceptorToken). Port only if preserving the simpler API shape is desired.
  • amp query cached-token behavior: in the monorepo, query auto-reads the login token from the auth cache when --bearer-token is omitted; here query relies on --token / context values and does not appear to wire the cached AuthInfo in. Worth confirming and closing if desired.
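
If the cached-token fallback is wanted, the precedence could be as simple as the sketch below. `CachedAuthInfo` and its `accessToken` field are hypothetical stand-ins for whatever the auth cache actually stores; expiry and refresh handling are deliberately left out.

```typescript
// Hypothetical shape of the cached login info.
interface CachedAuthInfo {
  readonly accessToken: string
}

// An explicit flag wins; otherwise fall back to the cached login token, if any.
function resolveBearerToken(
  flagToken: string | undefined,
  cached: CachedAuthInfo | undefined
): string | undefined {
  if (flagToken !== undefined && flagToken !== "") {
    return flagToken
  }
  return cached?.accessToken
}
```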

Porting gotchas

  • CLI framework changed: the monorepo uses @effect/cli (Options / Args); this repo uses effect/unstable/cli (Flag / Argument / Command). Commands must be rewritten, not copied.
  • Effect v3 → v4-beta + TS 6: the Schema API differs (Schema.Class → Schema.Struct, .pipe(...) → .check(...)). The schemas are already migrated in core/domain.ts — reuse them.
  • New dependency: abitype for the DSL (§A).
  • LocalCache.ts is not a gap — the CLI already replicates ~/.amp/cache via KeyValueStore.layerFileSystem in cli.ts.
  • Arrow.ts (apache-arrow schema-gen) is not a gap — this repo replaces it with its own internal/arrow-flight-ipc/* decoder.

Source → target mapping

| Monorepo (typescript/amp/src/…) | Suggested target (packages/…) | Status |
| --- | --- | --- |
| config.ts | amp/src/authoring/dsl.ts (new) | port |
| ConfigLoader.ts | amp/src/authoring/config-loader.ts (new) | port |
| ManifestBuilder.ts | amp/src/authoring/manifest-builder.ts (new) | port |
| ManifestContext.ts | amp/src/authoring/manifest-context.ts (new) | port |
| AmpRegistry.ts publishFlow + DTOs/errors | amp/src/registry/service.ts (new) | port |
| cli/commands/{build,register,deploy,publish,dev,proxy}.ts | cli/src/commands/… | port |
| index.ts createClient / createAuthInterceptor | amp/src/index.ts | optional |

(Target paths are suggestions; align with the conventions already established in this repo.)
