⚠️ Alpha Release: This is currently in alpha. APIs may change between releases. This is not production-ready software. For production use, consider waiting for the stable release or pinning to a specific alpha version.
A comprehensive, high-performance Apache Arrow Flight streaming framework for Node.js that enables efficient, real-time data streaming across distributed systems. Built with a modular plugin architecture, FlightStream provides both server-side streaming capabilities and client-side data access patterns, making it ideal for modern data pipelines, analytics applications, and microservices architectures.
- Data Engineering: Stream CSV files to analytics engines (Apache Spark, DuckDB, Pandas)
- API Modernization: Replace REST APIs with efficient columnar data transfer
- Real-time Analytics: Power dashboards and BI tools with live data streams
- Microservices: Enable high-performance data sharing between services
- Multi-language Integration: Connect applications written in different programming languages
Extensible adapter system for any data source - CSV, databases, cloud storage
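The adapter idea above can be sketched in plain JavaScript: an adapter turns a raw data source into a stream of fixed-size record batches that the server can push over Flight. The names and batching strategy below are illustrative only, not FlightStream's actual adapter API.

```javascript
// Illustrative sketch of the adapter pattern: read CSV text and yield
// rows grouped into fixed-size record batches. FlightStream's real
// adapters stream from files/databases; this toy version works on a
// string so it is self-contained.
function* csvBatches(csvText, batchSize = 2) {
  const [header, ...rows] = csvText.trim().split('\n');
  const columns = header.split(',');
  let batch = [];
  for (const row of rows) {
    const values = row.split(',');
    batch.push(Object.fromEntries(columns.map((c, i) => [c, values[i]])));
    if (batch.length === batchSize) {
      yield batch; // emit a full batch to the consumer
      batch = [];
    }
  }
  if (batch.length > 0) yield batch; // flush the final partial batch
}

const sample = 'id,name\n1,alice\n2,bob\n3,carol';
const batches = [...csvBatches(sample, 2)];
// batches[0] holds two rows, batches[1] holds the remaining row
```

Batching like this is what lets a server stream arbitrarily large files with bounded memory, one batch at a time.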
```bash
# Clone and install
git clone https://github.com/ggauravr/flightstream.git
cd flightstream
npm install

# Start the example server
npm start

# Test with the first dataset found in the data/ directory
npm test

# Test with a specific dataset
npm test <dataset>
```
That's it! The server automatically discovers CSV files in the `data/` directory and streams them via the Arrow Flight protocol, while the test client connects and displays the streamed data in real time. For a sense of scale: a CSV file with ~41k rows streams to the client in about 0.25s, and an ~800k-row CSV (selected by dataset id) in under 4s!
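As a quick sanity check, the figures quoted above work out to roughly 160k–200k rows per second end to end:

```javascript
// Back-of-envelope throughput from the figures quoted above.
const smallRunRate = 41_000 / 0.25; // ~41k rows in 0.25s → 164,000 rows/s
const largeRunRate = 800_000 / 4;   // ~800k rows in under 4s → ≥200,000 rows/s
```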
- Flight Server: started on `localhost:8080` with the CSV adapter
- Sample Data: automatically discovered from the `./data/` directory
- Test Client: connected via gRPC and streamed Arrow data
- Live Reload: the server restarts automatically when you modify code
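Conceptually, the request flow the quick start exercises follows Arrow Flight's two-step shape: the client resolves a descriptor to flight info, then redeems the returned ticket with a DoGet call. The toy classes below illustrate only the protocol shape in plain JavaScript; they are not FlightStream's real server or client types.

```javascript
// Toy model of the Flight GetFlightInfo → DoGet flow.
class ToyFlightServer {
  constructor() {
    this.datasets = new Map(); // dataset name → array of record batches
  }
  register(name, batches) {
    this.datasets.set(name, batches);
  }
  // GetFlightInfo: resolve a descriptor to endpoints carrying tickets.
  getFlightInfo(descriptor) {
    if (!this.datasets.has(descriptor.path)) throw new Error('dataset not found');
    return { descriptor, endpoints: [{ ticket: descriptor.path }] };
  }
  // DoGet: stream the record batches identified by a ticket.
  *doGet(ticket) {
    yield* this.datasets.get(ticket);
  }
}

const server = new ToyFlightServer();
server.register('flights.csv', [[{ id: 1 }], [{ id: 2 }]]);

// Client side: descriptor → flight info → ticket → batches.
const info = server.getFlightInfo({ path: 'flights.csv' });
const received = [...server.doGet(info.endpoints[0].ticket)].flat();
```

In the real protocol these calls travel over gRPC and the batches are Arrow record batches, but the descriptor/ticket indirection is the same.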
The monorepo contains focused, reusable packages:
| Package | Version | Description |
|---|---|---|
| @flightstream/core-server | 1.0.0-alpha.7 | Core Flight server framework with gRPC support |
| @flightstream/core-client | 1.0.0-alpha.3 | Core Flight client framework with connection management |
| @flightstream/core-shared | 1.0.0-alpha.3 | Shared utilities and protocol helpers |
| @flightstream/adapters-csv | 1.0.0-alpha.5 | CSV file adapter with streaming and schema inference |
| @flightstream/utils-arrow | 1.0.0-alpha.5 | Advanced Arrow utilities and type system |
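The schema inference that the CSV adapter performs can be pictured as sampling each column's values and choosing the narrowest type that fits every sample. The heuristics and type names below are a simplified illustration; the actual package may differ.

```javascript
// Illustrative CSV schema inference: pick the narrowest type that
// matches every sampled value in a column.
function inferType(values) {
  if (values.every((v) => /^-?\d+$/.test(v))) return 'int64';
  if (values.every((v) => /^-?\d+(\.\d+)?$/.test(v))) return 'float64';
  if (values.every((v) => v === 'true' || v === 'false')) return 'bool';
  return 'utf8'; // fall back to string when nothing narrower fits
}

function inferSchema(header, rows) {
  return header.map((name, i) => ({
    name,
    type: inferType(rows.map((r) => r[i])),
  }));
}

const schema = inferSchema(
  ['id', 'price', 'active'],
  [
    ['1', '9.99', 'true'],
    ['2', '12.50', 'false'],
  ]
);
// id → int64, price → float64, active → bool
```

Inferring from a sample keeps the first batch fast; a real adapter must also decide what to do when a later row contradicts the inferred type.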
- Data Lakes: Serve files efficiently from S3, GCS, Snowflake, or local storage
- Analytics Pipelines: Stream data to Apache Spark, DuckDB, or custom analytics
- Real-time ETL: High-performance data transformation and streaming
- API Modernization: Replace REST APIs with efficient columnar data transfer for real-time analytics products
- Multi-language Integration: Connect Python, Java, C++, and JavaScript applications
The project includes working examples:
- Basic Server (`examples/basic-server/`): complete CSV server implementation
- Basic Client (`examples/basic-client/`): client with connection management and streaming
- GitHub: ggauravr/flightstream
- Issues: Report bugs and request features
- Discussions: Community discussions
- Contributions: Please see the Contributing Guide for details
This project is licensed under the MIT License.
- Apache Arrow for the columnar data format
- DuckDB for the embedded analytical database and the mind-blowing single-node performance
- gRPC for the high-performance RPC framework
- Apache Arrow Flight for the amazing message transfer protocol



