
FEAT: Add legacy .xls support to converter task #76

Merged
Shivang Nagta (ShivangNagta) merged 3 commits into patterninc:main from ShivangNagta:feat/legacy-xls-support
Jun 19, 2026

@ShivangNagta Shivang Nagta (ShivangNagta) commented Jun 16, 2026

Contributor

Description

Adds support for converting legacy Excel 97-2003 (.xls) files to CSV
via a new format: xls on the converter task. The existing xlsx converter uses
excelize, which only reads modern Office Open XML (.xlsx/.xlsm) and cannot
parse the legacy binary .xls format - so a separate handler is required.

Implementation

  • New format: xls handler (xls.go),
    backed by github.com/yamitzky/xlrd-go (https://github.com/yamitzky/xlrd-go), a
    pure-Go port of Python's xlrd.
  • Mirrors the xlsx converter's options (sheets, skip_rows, skip_rows_by_sheet,
    sanitize_headers, sanitize_sheet_names) and per-sheet CSV output (one record per
    sheet, sheet name in the xlsx_sheet_name context key).
  • Includes hidden sheets.
  • excelize (used by the xlsx converter) returns already-formatted data (numbers, dates, booleans).
    xlrd-go is a much lower-level library that exposes only the raw cell value and cell type, so we have
    to do the formatting ourselves. This is why the change has more lines of code than expected.
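To illustrate the kind of formatting a low-level reader leaves to the caller, here is a minimal sketch in Go. The helper names (formatNumber, boolText, serialToTime, formatDate) and the output layouts are assumptions made for illustration and are not taken from the PR's xls.go. The date conversion assumes the 1900 date system and is only valid from 1900-03-01 onward.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"time"
)

// formatNumber renders a raw float cell value, dropping the fractional
// part for whole numbers (hypothetical helper, not the PR's code).
func formatNumber(v float64) string {
	if v == math.Trunc(v) && math.Abs(v) < 1e15 {
		return strconv.FormatInt(int64(v), 10)
	}
	return strconv.FormatFloat(v, 'f', -1, 64)
}

// boolText maps a boolean cell to the text Excel displays for it.
func boolText(b bool) string {
	if b {
		return "TRUE"
	}
	return "FALSE"
}

// serialToTime converts an Excel serial date (1900 date system) to UTC time.
// Day zero is taken as 1899-12-30, which compensates for Excel's
// non-existent 1900-02-29; serials below 61 are therefore off by one day.
func serialToTime(serial float64) time.Time {
	epoch := time.Date(1899, time.December, 30, 0, 0, 0, 0, time.UTC)
	days := math.Floor(serial)
	secs := math.Round((serial - days) * 86400)
	return epoch.AddDate(0, 0, int(days)).Add(time.Duration(secs) * time.Second)
}

// formatDate renders a serial date with an assumed fixed layout.
func formatDate(serial float64) string {
	return serialToTime(serial).Format("2006-01-02 15:04:05")
}

func main() {
	fmt.Println(formatNumber(42))
	fmt.Println(formatNumber(3.14))
	fmt.Println(boolText(true))
	fmt.Println(formatDate(25569.5))
}
```

A real implementation would also need to honour the cell's number format (for example, date-only versus date-time), which this sketch ignores.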

Testing

Created and ran unit tests (TestFormatDate, TestFormatNumber, TestBoolText, TestIsBuiltinDateFormat, TestTrimTrailingEmpty). The test file is not included in the commit.
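Since the test file is not part of the commit, the following Go sketch shows what two of the helpers implied by the test names might look like. The function names are inferred from the test names and the bodies are assumptions, not the PR's code. The date check covers only built-in number format indices 14-22 and 45-47 (the standard date and time formats); a full implementation would likely handle locale-specific ranges and custom format strings as well.

```go
package main

import "fmt"

// isBuiltinDateFormat reports whether a built-in number format index
// denotes a date or time format. Only indices 14-22 and 45-47 are
// covered here (assumed subset; hypothetical helper).
func isBuiltinDateFormat(idx int) bool {
	return (idx >= 14 && idx <= 22) || (idx >= 45 && idx <= 47)
}

// trimTrailingEmpty drops empty cells from the end of a row so the CSV
// output does not end in a run of empty fields. Empty cells in the
// middle of the row are kept.
func trimTrailingEmpty(row []string) []string {
	end := len(row)
	for end > 0 && row[end-1] == "" {
		end--
	}
	return row[:end]
}

func main() {
	fmt.Println(isBuiltinDateFormat(14), isBuiltinDateFormat(2))
	fmt.Println(trimTrailingEmpty([]string{"a", "", "b", "", ""}))
}
```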

Tested against automotive_browse_tree_guide.xls, which contains 4 sheets (1 hidden).

YAML used:

tasks:
  - name: read_xls
    type: file
    path: /some_path/automotive_browse_tree_guide.xls
  - name: to_csv
    type: converter
    format: xls
  - name: write_csv
    type: file
    path: /some_path/___CATERPILLAR___CONTEXT___xlsx_sheet_name___.csv
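The write_csv path above produces one file per sheet because each record carries its sheet name in the xlsx_sheet_name context key. As a rough sketch only, and assuming the placeholder is replaced literally by the context value (the actual templating logic is not shown in this PR), the expansion could look like the following. The outputPath function and the sheetPlaceholder constant are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// sheetPlaceholder is the token used in the YAML path above.
const sheetPlaceholder = "___CATERPILLAR___CONTEXT___xlsx_sheet_name___"

// outputPath substitutes the sheet name into a path template
// (hypothetical helper; assumes plain literal replacement).
func outputPath(template, sheetName string) string {
	return strings.ReplaceAll(template, sheetPlaceholder, sheetName)
}

func main() {
	tmpl := "/some_path/" + sheetPlaceholder + ".csv"
	for _, sheet := range []string{"Sheet1", "Sheet2"} {
		fmt.Println(outputPath(tmpl, sheet))
	}
}
```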

One of the sheets in Excel:
[image: excel]

Its output in CSV:
[image: csv]

All output CSV files:
[image: Screenshot 2026-06-16 at 3 27 55 PM]

Checks for all supported options:
[image: Screenshot 2026-06-16 at 4 43 51 PM]

Types of changes

  • Docs change / refactoring / dependency upgrade
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist

  • My code follows the code style of this project.
  • My change requires a change to the documentation and I have updated the documentation accordingly.
  • I have added tests to cover my changes.

@ShivangNagta Shivang Nagta (ShivangNagta) marked this pull request as ready for review June 16, 2026 10:46
@ShivangNagta Shivang Nagta (ShivangNagta) requested a review from a team as a code owner June 16, 2026 10:46
Copilot AI review requested due to automatic review settings June 16, 2026 10:46

Copilot AI left a comment

Contributor


Copilot was unable to review this pull request because the user who requested the review has reached their quota limit.



Mayuresh Pawar (Mayureshpawar29) commented Jun 17, 2026

Contributor

Shivang Nagta (@ShivangNagta), maintaining a separate fork of the grate repo would add quite a bit of overhead. Let's see if we can find an alternative library that supports hyperlink parsing.

@ShivangNagta

Contributor Author

> Shivang Nagta (@ShivangNagta), maintaining a separate fork of the grate repo would add quite a bit of overhead. Let's see if we can find an alternative library that supports hyperlink parsing.

I get the concern. Let me see if I can find something else.

"os"

"github.com/patterninc/grate"
gratexls "github.com/patterninc/grate/xls"

Contributor


Can you please look into one of the packages from https://pkg.go.dev/?

Ref: https://pkg.go.dev/github.com/extrame/xls

@ShivangNagta Shivang Nagta (ShivangNagta) merged commit 80f47c8 into patterninc:main Jun 19, 2026
7 checks passed
