Skip to content

Latest commit

 

History

History
449 lines (319 loc) · 14.9 KB

File metadata and controls

449 lines (319 loc) · 14.9 KB

Contributing

sq welcomes new issues, pull requests and discussion.

For user documentation, see sq.io.

Documentation site (site/)

The sq.io website is a Hugo project in site/. From site/, use make for the usual workflow (make deps, make site-dev, make site-test, make site-build, or make ci to match CI). Bun equivalents are in site/README.md.

If you are changing anything under site/, read site/README.md first: it explains the stable vs full link-check split, what PR CI blocks on, and what runs as informational/nightly follow-up.

Changes under site/ are validated by .github/workflows/site-ci.yml.

site/ import background (maintainers)

The site/ tree was imported from the former sq-web repository as a flat add. Prior history remains in the archived sq-web repository.

Branch protection / rulesets (maintainers)

Configure repository rulesets so that pull requests touching site/** require the Site CI check (site-ci.yml), without requiring that check on Go-only PRs (path-scoped rules, or equivalent). Plain “required status” lists interact badly with workflows that use paths filters and do not run on every PR.

Netlify (maintainers)

Production hosting for sq.io is on Netlify. After the monorepo migration, the Netlify site should use repository neilotoole/sq, base directory site, and the existing site/netlify.toml. Re-link the repo in Netlify if needed; confirm deploy previews and the /version function.

Tooling

This documentation presumes you are on macOS. If not, adapt appropriately. This section also presumes you want to do full-stack sq development; if not, you may not need all of these tools. You'll definitely need go.

  • go: brew install go
  • make: brew install make
  • shellcheck: brew install shellcheck
  • java: brew install java
    • The SLQ grammar is generated by the antlr tool (which is Java-based) from SLQ.g4.

Makefile

Yes, we are a Go project, and shouldn't need a Makefile. But, sq is also a fairly complex project, with generated code, CGo (due to embedded SQLite), test containers, related docs website, and a bunch of other stuff. Therefore, if for no other reason, it is recommended to use the Makefile when developing locally.

General advice

As a kick-off point, run make all. This will generate code, format, lint, run tests, build the local binary to dist/, and install a local sq build. See the Makefile for individual targets.

Opening issues

There are already GitHub templates in place: just use the usual GitHub process to open an issue for sq. Remember to search the existing issues first.

Opening a PR

Use the usual GitHub process to open a PR. Before you do so, please:

  • Merge the latest master into your branch: git merge origin/master.
  • Run make all.

CHANGELOG.md

The CHANGELOG.md file is sacrosanct, in that it must be updated every time there is a new release. Note that this is a task for the project maintainers; you do not need to worry about this if creating a PR.

This project follows Keep a Changelog and Semantic Versioning.

Scope: Entries describe the sq CLI and core libraries (what ships in the release binary). Changes that only touch site/ (the sq.io Hugo site) do not need CHANGELOG.md updates unless a maintainer wants a release-note line tied to the sq product.

Unreleased Section

When there is work-in-progress, CHANGELOG.md uses an ## Unreleased section at the top for accumulating changes during development.

## Unreleased

### Fixed

### Changed

### Added

## [v0.48.5] - 2025-01-19
...

Workflow:

  1. Starting new work: Add an ## Unreleased section at the top of the CHANGELOG with the standard subsection headers (Fixed, Changed, Added).

  2. During development: Add entries under ## Unreleased for CLI and library changes as they land (see Scope above). Site-only PRs can skip the changelog.

  3. At release time: When creating a new version (e.g., git tag v1.2.3):

    • Replace ## Unreleased with ## [v1.2.3] - YYYY-MM-DD
    • Remove empty subsections
    • Add the version comparison link at the bottom of the file

The ## Unreleased section should not exist when there is no work-in-progress.

Version Entry Structure

## [vX.Y.Z] - YYYY-MM-DD

Optional brief description paragraph for significant releases.

### Fixed

- Bug fixes

### Changed

- Changes to existing functionality

### Added

- New features go here

When present, sections should appear in this order: Fixed, Changed, Added. Not all sections are required for every release.

Special Markers

  • ☢️ - Breaking changes, place at start of entry
  • 🐥 - Alpha/beta features, place at start of entry
  • 👉 - Important notes/callouts within entry text

Entry Formatting

Reference GitHub issues at the start of entries when applicable:

- [#123]: Description of the change.

Issue link definitions go at the bottom of the file:

[#123]: https://github.com/neilotoole/sq/issues/123

Use fenced code blocks with language hints (shell, sql, json, csv, plaintext). Indent code blocks with two spaces when nested under a list item.

Example entry with code block:

- [#338]: The [`having`](https://sq.io/docs/query#having) function is now
  implemented.

  $ sq '.payment | group_by(.customer_id) | having(sum(.amount) > 200)'

Formatting tips:

  • Link config options to docs: [`config.option`](https://sq.io/docs/config#configoption)
  • Use backticks for commands: `sq add`
  • Use backticks for flags: `--verbose`

Breaking Changes

Prefix with ☢️ and explain what changed. Include before/after examples when helpful:

- ☢️ The `--old-flag` flag has been renamed to `--new-flag`.

Writing Style

  • Start entries with a verb or noun phrase
  • Use present tense for features ("Implements...", "Adds...")
  • Use past tense for fixed bugs ("Fixed...", "Resolved...")
  • Provide concrete examples with shell commands
  • Link to documentation for detailed features
  • Be specific about what changed and why

Version Links

At the bottom of the file, add version comparison links:

[vX.Y.Z]: https://github.com/neilotoole/sq/compare/vA.B.C...vX.Y.Z

For the first release in a sequence, link to the release tag:

[v0.15.2]: https://github.com/neilotoole/sq/releases/tag/v0.15.2

New driver implementations

In sq parlance, a "driver" implements a datasource type, e.g. Postgres, MySQL, CSV, JSON etc. See the sq.io drivers section.

See ARCHITECTURE.md for a diagram that gives an overview of the sq architecture.

There are two varieties of drivers: "SQL", and "non-SQL" (aka "Document") drivers. These are defined by whether they implement just the driver.Driver interface, or also the driver.SQLDriver interface.

For the SQL drivers, it is expected that there exists a sakiladb/DRIVER_NAME docker image, where DRIVER_NAME matches the driver type string (e.g., sakiladb/postgres, sakiladb/mysql, sakiladb/clickhouse). See the sakiladb images. These images contain the Sakila dataset, enabling uniform integration tests across SQL drivers.

Note that SQLite is a special case, because, although it is a SQL-based driver, it is also file-based. That is to say, SQLite implements the driver.SQLDriver interface, but it does not need a standalone docker container to serve up its SQL interface.

Getting started: Examine an existing driver implementation as a reference.

For SQL drivers, drivers/postgres or drivers/mysql are good templates.

As mentioned above, for SQL drivers, you'll need a sakiladb/DRIVER_NAME docker image: open a sq issue when you need that docker image.

For document drivers, see drivers/csv or drivers/json.

All drivers

Driver type registration

Each driver defines a Type constant that corresponds to a value in libsq/source/drivertype/drivertype.go. For example:

// In libsq/source/drivertype/drivertype.go
const ClickHouse = Type("clickhouse")

// In drivers/clickhouse/clickhouse.go
const Type = drivertype.ClickHouse

The driver type string (e.g., "clickhouse") is used in:

  • Connection URL schemes: clickhouse://host:port/database
  • Source handles: @my_clickhouse_db (the handle itself is user-defined, but the driver type determines how the source is processed)
  • The sakiladb docker image name: sakiladb/clickhouse

Package structure

A typical driver package contains:

  • {driver}.go: Main driver implementation (Provider, Driver, connection handling).
  • grip.go: Database handle wrapper (Grip implementation).
  • metadata.go: Schema introspection and type mapping.
  • render.go: SQL statement generation (for SQL drivers).
  • errors.go: Driver-specific error handling and wrapping (optional).
  • internal_test.go: Exports unexported functions for external test packages.
  • {driver}_test.go: Integration tests using the external test package.

Test file organization

Driver tests use Go's external test package pattern (package driver_test). To test unexported functions, create an internal_test.go file in the main package that exports them as variables:

// internal_test.go
package clickhouse

// Exported variables for testing unexported functions from external test
// packages. The naming convention is to capitalize the first letter of the
// unexported function name (e.g., buildCreateTableStmt becomes
// BuildCreateTableStmt).

var (
    KindFromDBTypeName   = kindFromDBTypeName
    BuildCreateTableStmt = buildCreateTableStmt
)

Then import and use these in your *_test.go files:

// metadata_test.go
package clickhouse_test

import "github.com/neilotoole/sq/drivers/clickhouse"

func TestKindFromDBTypeName(t *testing.T) {
    got := clickhouse.KindFromDBTypeName("String")
    require.Equal(t, kind.Text, got)
}

Test handles

Test handles for sakila sources are defined in testh/sakila/sakila.go. Add your driver's handle there:

const (
    CH25 = "@sakila_ch25"
    CH   = CH25  // Alias for latest version
)

Integration tests that require a running database should use tu.SkipShort(t, true) to skip when running in short mode (go test -short):

func TestSmoke(t *testing.T) {
    tu.SkipShort(t, true)

    th := testh.New(t)
    src := th.Source(sakila.CH)
    // ... test code
}

SQL drivers

Type mapping

SQL drivers must map between the database's native types and sq's kind.Kind type system. Key considerations:

  • Wrapper types: Some databases use wrapper types like Nullable(T) or LowCardinality(T) (ClickHouse). Your type mapping function must unwrap these to determine the underlying kind.
  • Parameterized types: Types like Decimal(18,4), FixedString(255), or VARCHAR(100) need prefix matching, not exact string comparison.
  • Default to kind.Text: Unknown types should map to kind.Text as a safe fallback.

Example pattern for handling wrapped types:

func kindFromDBTypeName(dbType string) kind.Kind {
    // Strip Nullable wrapper: Nullable(Int64) -> Int64
    if strings.HasPrefix(dbType, "Nullable(") {
        dbType = dbType[9 : len(dbType)-1]
    }

    // Strip LowCardinality wrapper
    if strings.HasPrefix(dbType, "LowCardinality(") {
        dbType = dbType[15 : len(dbType)-1]
        return kindFromDBTypeName(dbType) // Recurse for nested wrappers
    }

    switch {
    case dbType == "String", strings.HasPrefix(dbType, "FixedString"):
        return kind.Text
    case strings.HasPrefix(dbType, "Int"), strings.HasPrefix(dbType, "UInt"):
        return kind.Int
    // ... etc
    default:
        return kind.Text
    }
}

Database-specific quirks

Document any database-specific behaviors that affect driver implementation:

  • Transaction support: Some databases (e.g., ClickHouse) don't support traditional ACID transactions.
  • DDL requirements: ClickHouse's MergeTree engine requires an ORDER BY clause, and nullable columns cannot be used in the sorting key.
  • Update syntax: Some databases use non-standard UPDATE syntax (e.g., ClickHouse uses ALTER TABLE ... UPDATE).
  • Schema vs catalog: Terminology varies between databases. Document how your driver maps "catalog" and "schema" concepts.

Nullable column handling

When creating tables, be aware of how nullable columns interact with other database features. For example, in ClickHouse:

// ClickHouse's MergeTree engine doesn't allow nullable columns in ORDER BY.
// Find the first NOT NULL column, or use tuple() if all are nullable.
orderByCol := ""
for _, col := range tblDef.Cols {
    if col.NotNull {
        orderByCol = col.Name
        break
    }
}
if orderByCol != "" {
    sb.WriteString("ORDER BY " + enquote(orderByCol))
} else {
    sb.WriteString("ORDER BY tuple()")
}

Dialect configuration

SQL drivers must return a properly configured dialect.Dialect from the Dialect() method. Key settings include:

  • Enquote function: How to quote identifiers (backticks, double quotes, brackets).
  • Placeholder style: ? for positional, $1 for numbered.
  • IntBool: Whether the database uses integers (0/1) for boolean values.

Non-SQL drivers

Non-SQL (document) drivers handle file-based data sources like CSV, JSON, and Excel files. These drivers implement only driver.Driver, not driver.SQLDriver.

Key considerations: