jogeshwar01/solana-indexer
Solana Blockchain Indexer

A real-time Solana blockchain data indexer that streams transactions and account changes using Yellowstone gRPC Geyser plugin, processes them through Kafka, and stores structured data in ClickHouse for analytics and querying.

Architecture

Architecture Diagram

Features

  • Real-time Streaming: Consumes Solana blockchain data from Kafka topics
  • Protobuf Decoding: Decodes Yellowstone gRPC protobuf messages (ConfirmedBlock, ConfirmedTransaction, Transaction)
  • Table Storage: Stores all data in ClickHouse
  • RESTful API: Provides endpoints for querying and managing data
  • TypeScript: Fully typed codebase for better development experience
  • Graceful Shutdown: Proper cleanup of Kafka and ClickHouse connections
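The graceful-shutdown behaviour can be sketched roughly as follows. The `Disconnectable` interface here is a stand-in for the real kafkajs consumer and `@clickhouse/client` instance, not the project's actual code:

```typescript
// Minimal shape of the two connections to tear down. This interface
// stands in for the real kafkajs consumer and @clickhouse/client
// objects; the project's actual types may differ.
interface Disconnectable {
  disconnect(): Promise<void>;
}

// Close the Kafka consumer first (stop accepting new messages), then
// ClickHouse, so no message is consumed after its sink is gone.
// Returns the close order so the sequence can be observed.
async function shutdown(
  consumer: Disconnectable,
  clickhouse: Disconnectable,
): Promise<string[]> {
  const closed: string[] = [];
  await consumer.disconnect();
  closed.push("kafka");
  await clickhouse.disconnect();
  closed.push("clickhouse");
  return closed;
}

// In server.ts this would typically be wired to process signals, e.g.:
// process.once("SIGINT", () => shutdown(consumer, ch).then(() => process.exit(0)));
```

Closing the consumer before the database ensures in-flight messages still have somewhere to be written.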

Tech Stack

  • Backend: Node.js + Express + TypeScript
  • Message Queue: Apache Kafka
  • Database: ClickHouse
  • Protobuf: protobufjs for message decoding
  • Blockchain: Solana (via Yellowstone gRPC Geyser plugin)

Prerequisites

  • Node.js 18+ and npm
  • Apache Kafka running on localhost:9092
  • ClickHouse running on localhost:8123
  • Solana node with Yellowstone gRPC Geyser plugin streaming to Kafka topic grpc1

Installation

  1. Clone and set up the project:

     ```bash
     git clone <repository-url>
     cd project/indexer/backend
     npm install
     ```

  2. Build TypeScript:

     ```bash
     npm run build
     ```

  3. Start the server:

     ```bash
     # Development mode with auto-reload
     npm run dev

     # Production mode
     npm start
     ```

The server will start on http://localhost:3000

Project Structure

```
backend/
├── src/
│   ├── database/
│   │   └── clickhouse.ts          # ClickHouse client and table management
│   ├── kafka/
│   │   └── consumer.ts            # Kafka consumer setup
│   ├── protobuf/
│   │   └── decoder.ts             # Protobuf message decoding
│   ├── routes/
│   │   └── index.ts               # API routes
│   ├── services/
│   │   └── message-processor.ts   # Core message processing logic
│   ├── types/
│   │   └── index.ts               # TypeScript interfaces
│   ├── utils/
│   │   └── timestamp.ts           # Timestamp formatting utilities
│   └── server.ts                  # Main server entry point
├── message.proto                  # Solana protobuf schema
├── package.json
├── tsconfig.json
└── README.md
```

API Endpoints

General Information

  • GET / - Server status and configuration

Block Data Management

  • GET /stats - Database statistics (total blocks, protobuf decoded, decode success)
  • GET /blocks?limit=10 - Get recent blocks from ClickHouse
  • DELETE /blocks - Clear all blocks from database
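The `limit` query parameter on `/blocks` presumably needs sanitising before it reaches ClickHouse. A minimal sketch of one way to do that; the default (10) and cap (1000) are illustrative assumptions, not the project's actual values:

```typescript
// Parse and clamp the ?limit= parameter so arbitrary input cannot
// produce a huge or malformed ClickHouse query. The fallback (10)
// and cap (1000) are assumed values for illustration.
function parseLimit(raw: string | undefined, fallback = 10, max = 1000): number {
  const n = Number.parseInt(raw ?? "", 10);
  if (Number.isNaN(n) || n < 1) return fallback;
  return Math.min(n, max);
}

// Build the recent-blocks query from the sanitised limit. Interpolating
// is safe here only because parseLimit guarantees a bounded integer.
function recentBlocksQuery(raw: string | undefined): string {
  return `SELECT * FROM blocks ORDER BY timestamp DESC LIMIT ${parseLimit(raw)}`;
}
```

With `@clickhouse/client`, the resulting string would then be passed to the client's query method inside the route handler.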

Database Schema

ClickHouse Table

blocks

Stores all processed Kafka messages with decoded data:

```sql
CREATE TABLE blocks (
  timestamp DateTime64(3),
  topic String,
  partition UInt32,
  offset String,
  message_type String,
  decoded_data String,
  is_protobuf Bool,
  decode_success Bool
) ENGINE = MergeTree()
ORDER BY (timestamp, topic, partition, offset)
```

Fields:

  • timestamp - When the message was processed
  • topic - Kafka topic name (grpc1)
  • partition - Kafka partition number
  • offset - Kafka message offset
  • message_type - Type of decoded message (ConfirmedBlock, JSON, Raw, etc.)
  • decoded_data - JSON string of the decoded message content
  • is_protobuf - Whether message was successfully decoded as protobuf
  • decode_success - Whether any decoding was successful
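One way to picture how a consumed Kafka message maps onto this schema. The field names mirror the table above, but the `toBlockRow` helper itself is hypothetical, not the project's actual code:

```typescript
// Shape of one row in the blocks table, mirroring the schema above.
interface BlockRow {
  timestamp: string;      // DateTime64(3), e.g. "2024-01-01 00:00:00.000"
  topic: string;
  partition: number;
  offset: string;
  message_type: string;
  decoded_data: string;   // decoded payload serialised as JSON
  is_protobuf: boolean;
  decode_success: boolean;
}

// Turn one decode attempt into a row. `decoded` is null when every
// decoder failed, which is what decode_success records.
function toBlockRow(
  topic: string,
  partition: number,
  offset: string,
  messageType: string,
  decoded: unknown | null,
  isProtobuf: boolean,
): BlockRow {
  return {
    // "YYYY-MM-DD HH:MM:SS.mmm", the shape DateTime64(3) expects
    timestamp: new Date().toISOString().replace("T", " ").replace("Z", ""),
    topic,
    partition,
    offset,
    message_type: messageType,
    decoded_data: JSON.stringify(decoded ?? {}),
    is_protobuf: isProtobuf,
    decode_success: decoded !== null,
  };
}
```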

Usage Examples

Get Database Statistics

```bash
curl http://localhost:3000/stats
```

Response:

```json
{
  "stats": {
    "total_blocks": 1250,
    "protobuf_decoded": 1200,
    "decode_success": 1240
  }
}
```

Query Recent Blocks

```bash
curl "http://localhost:3000/blocks?limit=5"
```

Clear All Data

```bash
curl -X DELETE http://localhost:3000/blocks
```

ClickHouse Direct Queries

```sql
-- Get message type distribution
SELECT
  message_type,
  COUNT(*) AS count
FROM blocks
GROUP BY message_type
ORDER BY count DESC;

-- Get recent protobuf messages
SELECT
  timestamp,
  message_type,
  is_protobuf,
  decode_success
FROM blocks
WHERE is_protobuf = true
ORDER BY timestamp DESC
LIMIT 10;

-- Get processing success rate per day
SELECT
  toDate(timestamp) AS date,
  COUNT(*) AS total_messages,
  SUM(decode_success) AS successful_decodes,
  (SUM(decode_success) * 100.0 / COUNT(*)) AS success_rate
FROM blocks
GROUP BY toDate(timestamp)
ORDER BY date DESC;
```

Development

Available Scripts

  • npm run build - Compile TypeScript to JavaScript
  • npm run dev - Start development server with auto-reload
  • npm run dev:watch - Start with file watching
  • npm start - Start production server

Environment Variables

  • PORT - Server port (default: 3000)
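Inside server.ts the port is presumably resolved along these lines (a sketch, not the project's exact code):

```typescript
// Resolve the listen port from the environment, defaulting to 3000.
// A missing or non-numeric PORT falls back to the default rather
// than crashing at startup.
function resolvePort(env: Record<string, string | undefined>): number {
  const parsed = Number.parseInt(env.PORT ?? "", 10);
  return Number.isNaN(parsed) ? 3000 : parsed;
}

// In server.ts this would be called as resolvePort(process.env)
// before app.listen(...).
```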

Monitoring and Debugging

The application provides comprehensive logging:

  • Raw message hex bytes for protobuf debugging
  • Message type identification
  • Decode success/failure status
  • ClickHouse operation status

All data is stored in ClickHouse for analysis and debugging.

Data Flow

  1. Kafka Consumer receives messages from grpc1 topic
  2. Message Processor attempts to decode messages in this order:
    • Protobuf (ConfirmedBlock β†’ ConfirmedTransaction β†’ Transaction)
    • JSON parsing
    • Raw data storage
  3. ClickHouse Writer stores all messages with metadata in blocks table
  4. API Endpoints provide access to stored data and statistics
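The decode order in step 2 can be sketched as a simple fallback chain. The protobuf stage is elided here because it needs the compiled message.proto types via protobufjs; the JSON and Raw stages mirror the order described above:

```typescript
interface DecodeResult {
  messageType: string;
  decoded: unknown;
}

// Try each decoder in order and fall back to storing raw hex bytes,
// so every message is persisted even when nothing can decode it.
function decodeMessage(raw: Buffer): DecodeResult {
  // 1. Protobuf: ConfirmedBlock -> ConfirmedTransaction -> Transaction
  //    (omitted in this sketch -- requires protobufjs + message.proto)

  // 2. JSON parsing
  try {
    return { messageType: "JSON", decoded: JSON.parse(raw.toString("utf8")) };
  } catch {
    // not valid JSON, fall through to raw storage
  }

  // 3. Raw: keep the bytes as hex so nothing is lost
  return { messageType: "Raw", decoded: raw.toString("hex") };
}
```

Ordering the decoders from most to least structured means `message_type` records the richest interpretation that succeeded for each message.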

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests if applicable
  5. Submit a pull request

About

Solana indexer built with the Yellowstone gRPC Geyser plugin, Kafka, and ClickHouse to stream and index real-time blockchain transactions.
