This repository was archived by the owner on Mar 24, 2026. It is now read-only.


Go File Processor

Parallel and resilient processing of massive files with Worker Pool in Go.




Note: Archived Project
This was my second major project in Go, built as a deep dive into the language's idiomatic concurrency patterns and high-performance I/O. It is now archived, but it remains a solid reference for ETL (Extract, Transform, Load) implementations in Go.

Go File Processor is a high-performance command-line tool and library designed to efficiently convert massive CSV files (millions of records) into structured JSON. It demonstrates the power of Go's concurrency primitives to achieve maximum throughput with minimal memory overhead.

🚀 Core Learning Objectives

This project was a hands-on laboratory to master several Go concepts:

  • Concurrency via Worker Pool: Leveraging goroutines and channels to process data in parallel without overwhelming the system.
  • Memory Efficiency (Streaming): Using io.Reader and io.Writer to process gigabytes of data with a constant, tiny memory footprint.
  • The Middleware Pattern: Implementing a "Chain of Responsibility" for data transformation that is both flexible and type-safe.
  • Atomic Operations: Using sync/atomic for high-speed metrics tracking, avoiding the overhead of mutexes.
  • Idiomatic Project Layout: Following standard Go folder structures (cmd/, internal/) and build automation with Makefile.

Demonstration

As a Library

```go
// Create a streaming CSV-to-JSON processor with 8 parallel workers.
proc := processor.NewCSVToJSONProcessor()
config := processor.Config{WorkerCount: 8}

// Fluent transformation chain: keep company emails, then mask the field.
config.AddTransformer(processor.EmailFilter(`@company.com$`))
config.AddTransformer(processor.FieldMasker("email"))

metrics, err := proc.Process("input.csv", "output.json", config)
if err != nil {
	log.Fatalf("processing failed: %v", err)
}
log.Printf("metrics: %+v", metrics)
```

As a CLI

```bash
./fileproc -input data.csv -output data.json -workers 4
```

Tech Stack & Architecture

| Technology | What I Learned |
| --- | --- |
| Worker Pool | How to orchestrate multiple goroutines for parallel work. |
| Channels | Managing safe communication and backpressure between stages. |
| Streaming I/O | Processing files record-by-record instead of loading everything into RAM. |
| Atomic Counters | Implementing thread-safe counters with maximum performance. |
| Structured Logs | Using `slog` for modern, machine-readable observability. |

Pipeline Flow

The system uses a streaming model to maintain low memory usage:

Input CSV → Producer → Job Channel → [Workers + Transformers] → Result Channel → Consumer → Output JSON

Makefile Targets

| Target | Description |
| --- | --- |
| `make build` | Compiles the `fileproc` binary. |
| `make test` | Runs the full unit test suite. |
| `make bench` | Runs benchmarks comparing parallel vs. sequential throughput. |
| `make generate-data` | Generates a 100k-row test file for performance testing. |

📚 Final Thoughts

Building this project taught me that Go isn't just about syntax; it's about a philosophy of simplicity and performance. The transition from sequential processing to a parallel worker pool showed me how Go empowers developers to build tools that scale effortlessly.


Author

Enoque Sousa

LinkedIn GitHub Portfolio


Made with ❤️ by Enoque Sousa

Project Status: Archived — Educational Milestone
