Add reading and writing of GP6 (.gpx) and GP7 (.gp) files #62

Closed
knoguchi wants to merge 6 commits into Perlence:master from knoguchi:feature/gpx

Summary

Adds support for the GP6 (.gpx) and GP7 (.gp) container formats to both guitarpro.parse and guitarpro.write, giving the library read/write coverage for these formats alongside the existing GP3/GP4/GP5 binary support.

  • gpx.py — the container layer: BCFZ bitstream (de)compression, the BCFS sector archive reader/writer, and GP7 ZIP handling. The decompression algorithm is based on code contributed by J. Jørgen von Bargen on the old feature/gpx branch.
  • gpif.py — maps the embedded score.gpif XML into and out of the existing Song model. The model is a tree (Measure → Voice → Beat → Note) while GPIF stores flat, cross-referenced lists joined by id; the reader resolves those references and the writer hoists the tree back into them.
  • io.py: parse() detects the BCFZ/BCFS/PK magic and dispatches to the new reader; write() dispatches to the new writer for version=(6, 0, 0) / (7, 0, 0) or a .gpx / .gp extension. The binary GP3/4/5 path is unchanged.
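
To make the tree-versus-flat mismatch handled by gpif.py concrete, here is a minimal sketch of hoisting a nested structure into id-referenced tables and resolving it back. The `Note`, `Beat`, `hoist` and `resolve` names and the dict layout are hypothetical stand-ins for illustration; they are not the PR's actual code or the real GPIF schema.

```python
from dataclasses import dataclass, field


@dataclass
class Note:
    string: int
    fret: int


@dataclass
class Beat:
    notes: list = field(default_factory=list)


def hoist(beats):
    """Flatten Beat objects into two tables joined by integer ids."""
    note_table = []  # flat list of note records; the list index is the id
    beat_table = []  # each beat record stores note ids, not the notes
    for beat in beats:
        note_ids = []
        for note in beat.notes:
            note_ids.append(len(note_table))
            note_table.append({"string": note.string, "fret": note.fret})
        beat_table.append({"notes": note_ids})
    return beat_table, note_table


def resolve(beat_table, note_table):
    """Inverse of hoist: follow the ids to rebuild the tree."""
    return [
        Beat([Note(**note_table[i]) for i in record["notes"]])
        for record in beat_table
    ]


beats = [Beat([Note(1, 0), Note(2, 3)]), Beat([Note(3, 2)])]
beat_table, note_table = hoist(beats)
assert resolve(beat_table, note_table) == beats  # round-trip is lossless
```

The same idea would presumably extend one level at a time (measures referencing voices, voices referencing beats), with one table per level.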

Coverage

Reads and writes the core musical content: song info, tracks and tunings, master bars and time signatures, repeats, voices, beats, durations (incl. dotted notes and tuplets), dynamics and notes (fret/string/tie). Advanced note and beat effects (bends, harmonics, slides, grace notes, palm mute, …) are not yet translated.

Tests

The tests run parse → write → parse round-trips and assert full Song equality and hash equality, following the existing testReadWriteEquals pattern. New tests cover:

  • Reading three real GP6 files (counts verified against the raw resolved gpif graph)
  • GP7 ZIP container reading
  • BCFZ compress/decompress identity
  • GP6/GP7 write round-trips + extension-based dispatch
  • Edge cases: zero-padded BCFZ final byte, non-seekable streams, missing score.gpif
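
The non-seekable edge case comes down to peeking at the magic bytes without losing them. A minimal sketch, assuming a hypothetical `sniff` helper and made-up labels (the PR names the BCFZ, BCFS and PK magics, but not this function or its return values):

```python
import io


def classify(magic):
    """Map the leading bytes of a file to a hypothetical format label."""
    if magic[:4] in (b"BCFZ", b"BCFS"):
        return "gp6"
    if magic[:2] == b"PK":
        return "gp7"
    return "gp3-5"


def sniff(stream):
    """Return (label, stream) with the stream positioned where it started."""
    if not stream.seekable():
        # Pipes cannot be rewound, so copy everything into memory first.
        stream = io.BytesIO(stream.read())
    start = stream.tell()
    magic = stream.read(4)
    stream.seek(start)  # undo the peek so the real reader sees all bytes
    return classify(magic), stream
```

Buffering the whole input trades memory for simplicity; score files are small enough that this seems a reasonable default.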

All 208 tests pass.
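
For readers who want to reproduce the GP7 container test without a binary fixture, the idea can be sketched with the standard zipfile module. The member path `Content/score.gpif`, the helper names, and the use of ValueError in place of the library's GPException are assumptions made for this illustration.

```python
import io
import zipfile

GPIF_PATH = "Content/score.gpif"  # assumed location inside the archive


def make_gp7(gpif_xml):
    """Wrap a score.gpif payload in an in-memory ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(GPIF_PATH, gpif_xml)
    return buffer.getvalue()


def read_gpif(data):
    """Extract score.gpif, turning a missing member into a clear error."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        try:
            return archive.read(GPIF_PATH)
        except KeyError:
            # zipfile raises KeyError for unknown members; re-raise as a
            # domain error (ValueError stands in for GPException here).
            raise ValueError("container has no score.gpif") from None


xml = b"<GPIF><Score/></GPIF>"
assert read_gpif(make_gp7(xml)) == xml
```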

Two caveats worth flagging

  1. BCFZ compression emits literal runs only. A faithful LZ encoder was both quadratic and, in this port, incorrect on non-repetitive data. A literal-only stream is a valid BCFZ payload that decodes identically and keeps the encoder linear, at ~1.12× size. This is intentional and documented.
  2. Round-trips are verified through PyGuitarPro, not against the Guitar Pro application. I don't have a Guitar Pro 6/7 install to confirm the written files open in the app. The score.gpif is well-formed and self-consistent, but app-level compatibility is unverified.
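
The ~1.12× figure in caveat 1 can be sanity-checked with a little arithmetic. The sketch below assumes a literal token is a 1-bit flag plus a 2-bit length, followed by up to three raw bytes. That layout is an assumption not stated in this PR, and `literal_only_bits` is a hypothetical helper.

```python
import math


def literal_only_bits(n_bytes, max_run=3, header_bits=3):
    """Bits needed to store n_bytes as literal runs only (assumed layout)."""
    runs = math.ceil(n_bytes / max_run)
    return runs * header_bits + n_bytes * 8


n = 30_000
ratio = literal_only_bits(n) / (n * 8)
print(f"{ratio:.3f}")  # 1.125
```

Under that assumption every 24 payload bits cost 3 header bits, which gives 27/24 = 1.125 and is consistent with the size stated above.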

I'm aware the project previously listed GPX as out of scope — happy to adjust scope, naming, or the compression approach if you'd consider it.

🤖 Generated with Claude Code

knoguchi and others added 5 commits June 25, 2026 19:53
GP6 and GP7 store the score as a score.gpif XML document inside a
container: GP6 uses a BCFZ-compressed BCFS virtual filesystem, while GP7
is a plain ZIP archive. This adds a reader for both.

- gpx.py: BCFZ bitstream decompression (based on the algorithm from the
  feature/gpx branch contributed by J. Jorgen von Bargen), the BCFS
  sector archive reader, and GP7 ZIP extraction.
- gpif.py: maps the score.gpif XML into the existing Song model,
  resolving the format's cross-referenced bars/voices/beats/notes/rhythms
  lists. Covers song info, tracks, tunings, master bars and time
  signatures, voices, beats, durations and notes.
- io.py: parse() now detects the BCFZ/BCFS/ZIP magic and dispatches to
  the new reader; the binary GP3/4/5 path is unchanged.

Writing GP6/GP7 is not supported. Advanced note/beat effects are not yet
translated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The BCFZ payload may end mid-byte, so the last compression token can
require bits beyond the end of the stream. Yield zero padding bits past
end-of-stream instead of raising IndexError, matching the reference
decoder's EOF handling. Add "Dear Song.gpx" which exercises this path.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Synthesize a GP7 archive from an existing .gpx score's score.gpif and
assert it parses into an identical Song, covering the ZIP extraction
branch without committing a binary GP7 fixture.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

guitarpro.write can now produce GP6 and GP7 files, completing the
read/write parity the library offers for GP3/4/5.

- gpif.py: GPIFWriter serializes a Song into a score.gpif document,
  hoisting the measure/voice/beat/note tree into the format's flat,
  cross-referenced Bars/Voices/Beats/Notes/Rhythms lists.
- gpx.py: BCFZ compression (literal-run encoding, which is valid BCFZ and
  keeps the encoder linear) and a BCFS image builder; GP7 is written as a
  ZIP archive.
- io.py: write() dispatches to the GPX writer for version (6,0,0)/(7,0,0)
  or a .gpx/.gp extension.

Round-trips parse -> write -> parse with full Song equality. Writes cover
the same subset of the model that reading populates; advanced effects are
not yet serialized. Also tightened voice reading to skip unused (-1)
voice slots so the voice count is deterministic across a round-trip.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

- Buffer non-seekable streams before the magic peek so parse() works on
  pipes and other non-seekable file-like objects.
- Raise GPException (not KeyError) when a GP6 container has no score.gpif.
- Emit one Bar per track in every MasterBar so the Bars reference list has
  a consistent length.
- Drop the internal buildBCFS helper from the module's public __all__.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

knoguchi marked this pull request as draft June 26, 2026 03:38
knoguchi commented Jun 26, 2026

I'd like someone to test GP7 files. Also I'm working on another PR to support GP6/GP7 effects such as hammer/pull, slides, harmonics, fingering, accent, beat text.

The declared uncompressed length is only an upper bound: real GP6 streams
that use back-references end before reaching it, with the final byte
zero-padded. The previous code padded past end-of-stream with zero bits,
which decode to empty literal runs and made decompression loop forever on
any such file (the literal-heavy sample files happened to hit the declared
length exactly and so were unaffected).

Raise at end-of-stream and stop, matching the reference decoder. Verified
against the alphaTab GP6 test corpus: all 35 files now decompress, parse
and round-trip.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
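
The end-of-stream behavior described in this commit can be illustrated with a toy decoder. It assumes a simplified, MSB-first token layout (1-bit flag, 2-bit run length, then raw bytes) and handles literal runs only. The real BCFZ bit order and back-reference tokens may differ, and `BitReader` and `decode_literals` are hypothetical names.

```python
class BitReader:
    """MSB-first bit reader that raises EOFError past the last byte."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # current position, in bits

    def read(self, count):
        if self.pos + count > len(self.data) * 8:
            raise EOFError("bitstream exhausted")
        value = 0
        for _ in range(count):
            byte = self.data[self.pos // 8]
            bit = (byte >> (7 - self.pos % 8)) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def decode_literals(data, declared_length):
    """Decode literal runs until the declared length or end-of-stream."""
    reader = BitReader(data)
    out = bytearray()
    try:
        while len(out) < declared_length:
            if reader.read(1) != 0:
                raise ValueError("back-references are outside this sketch")
            run = reader.read(2)
            for _ in range(run):
                out.append(reader.read(8))
    except EOFError:
        pass  # the declared length is only an upper bound; stop cleanly
    return bytes(out)


# One run of two bytes ("hi") followed by five zero padding bits.
payload = bytes([0x4D, 0x0D, 0x20])
print(decode_literals(payload, 10))  # b'hi'
```

If `read` returned zero bits past the end instead of raising, the trailing padding would keep decoding as empty runs and the loop would never reach the declared length of 10, which is the hang this commit fixes.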
knoguchi commented

Just saw #58, which serves the same purpose, so I closed this one.

knoguchi closed this Jun 26, 2026
knoguchi deleted the feature/gpx branch June 26, 2026 05:38