Like a durable parquet floor, GigAPI provides a rock-solid data foundation for your queries and analytics
GigAPI by Gigapipe is our twist on future query engines – one where you focus on your data, not your infrastructure, servers or capacity. By combining the performance of DuckDB with cloud-native architecture principles, we've created a simple, lightweight solution designed for unlimited time series and analytical datasets: it makes traditional server-based OLAP databases feel like costly relics and cuts infrastructure costs by 50-90% without performance loss. All 100% open source, with no open-core cloud gimmicks.
Warning
GigAPI is an open beta developed in public. Bugs and changes should be expected. Use at your own risk.
- Fast: DuckDB SQL + Parquet powered OLAP API Engine
- Flexible: Schema-less Parquet Ingestion & Compaction
- Simple: Low Maintenance, Portable, Infinitely Scalable
- Smart: Independent storage/write and compute/read components
- Extensible: Built-In Query Engine (DuckDB) or DIY (ClickHouse, DataFusion, etc.)
S3 Support Coming Soon
services:
  gigapi:
    image: ghcr.io/gigapi/gigapi:latest
    container_name: gigapi
    hostname: gigapi
    restart: unless-stopped
    volumes:
      - ./data:/data
    ports:
      - "7971:7971"
    environment:
      - GIGAPI_ENABLED=true
      - GIGAPI_MERGE_TIMEOUT_S=10
      - GIGAPI_ROOT=/data
      - PORT=7971
  gigapi-querier:
    image: ghcr.io/gigapi/gigapi-querier:latest
    container_name: gigapi-querier
    hostname: gigapi-querier
    volumes:
      - ./data:/data
    ports:
      - "7972:7972"
    environment:
      - DATA_DIR=/data
      - PORT=7972
| Env Var Name | Description | Default Value |
|---|---|---|
| GIGAPI_ROOT | Root directory for the databases and tables | |
| GIGAPI_MERGE_TIMEOUT_S | Merge timeout in seconds | 10 |
| GIGAPI_SAVE_TIMEOUT_S | Save timeout in seconds | 1.0 |
| GIGAPI_NO_MERGES | Disables merges when set to true | false |
| PORT | Port number for the server to listen on | 7971 |
As write requests come into GigAPI, they are parsed and progressively appended to parquet files alongside their metadata. The ingestion buffer is flushed to disk at configurable intervals using a hive partitioning schema. Generated parquet files and their respective metadata are progressively compacted and sorted over time based on configuration parameters.
GigAPI provides an HTTP API for clients to write data, currently supporting the InfluxDB Line Protocol format:
cat <<EOF | curl -X POST "http://localhost:7971/write?db=mydb" --data-binary @/dev/stdin
weather,location=us-midwest,season=summer temperature=82
weather,location=us-east,season=summer temperature=80
weather,location=us-west,season=summer temperature=99
EOF
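The same write can also be issued from code. This is a minimal sketch using Python's `requests` library against the `/write?db=` endpoint and line-protocol payload from the curl example above; the script itself is illustrative and not part of GigAPI.

```python
import requests

# InfluxDB Line Protocol payload, identical to the curl example above.
lines = "\n".join([
    "weather,location=us-midwest,season=summer temperature=82",
    "weather,location=us-east,season=summer temperature=80",
    "weather,location=us-west,season=summer temperature=99",
])

resp = requests.post(
    "http://localhost:7971/write",
    params={"db": "mydb"},        # target database, created on the fly
    data=lines.encode("utf-8"),   # raw line-protocol body
    timeout=10,
)
resp.raise_for_status()
```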
Note
More ingestion protocols coming soon!
GigAPI is a schema-on-write database that manages databases, tables and schemas on the fly. Columns can be added or removed over time, leaving schema reconciliation up to readers.
/data
  /mydb
    /weather
      /date=2025-04-10
        /hour=14
          *.parquet
          metadata.json
        /hour=15
          *.parquet
          metadata.json
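As an illustration of this layout, here is a minimal sketch of how a timestamp maps to a partition directory. The `partition_path` helper is hypothetical and simply mirrors the `date=`/`hour=` scheme shown above.

```python
from datetime import datetime, timezone
from pathlib import Path

def partition_path(root: str, db: str, table: str, ts: datetime) -> Path:
    # Hive-style partitioning by date and hour, mirroring the tree above.
    return Path(root) / db / table / f"date={ts:%Y-%m-%d}" / f"hour={ts:%H}"

print(partition_path("/data", "mydb", "weather",
                     datetime(2025, 4, 10, 14, 30, tzinfo=timezone.utc)))
# /data/mydb/weather/date=2025-04-10/hour=14
```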
GigAPI-managed parquet files use the following naming schema:
{UUID}.{LEVEL}.parquet
GigAPI files are progressively compacted based on the following logic (subject to future changes):

| Merge Level | Source | Target | Frequency | Max Size |
|---|---|---|---|---|
| Level 1 -> 2 | .1 | .2 | MERGE_TIMEOUT_S = 10 | 100 MB |
| Level 2 -> 3 | .2 | .3 | MERGE_TIMEOUT_S * 10 | 400 MB |
| Level 3 -> 4 | .3 | .4 | MERGE_TIMEOUT_S * 10 * 10 | 4 GB |
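For illustration only (this is not GigAPI's compactor), a reader or sidecar tool could group the files in a partition by merge level using the `{UUID}.{LEVEL}.parquet` naming schema; with the default `MERGE_TIMEOUT_S = 10`, the cadence above works out to merges roughly every 10 s, 100 s and 1000 s.

```python
import re
from collections import defaultdict
from pathlib import Path

# Files follow the {UUID}.{LEVEL}.parquet naming schema described above.
NAME_RE = re.compile(r"^(?P<uuid>[0-9a-fA-F-]+)\.(?P<level>\d+)\.parquet$")

def files_by_level(partition_dir: str) -> dict[int, list[Path]]:
    """Group the parquet files in one hive partition by merge level."""
    levels: dict[int, list[Path]] = defaultdict(list)
    for path in Path(partition_dir).glob("*.parquet"):
        match = NAME_RE.match(path.name)
        if match:
            levels[int(match.group("level"))].append(path)
    return dict(levels)

print(files_by_level("/data/mydb/weather/date=2025-04-10/hour=14"))
```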
As read requests come into GigAPI, they are parsed and transpiled using the GigAPI Metadata catalog to resolve data locations based on the database, table and time range in each request. Series can be queried with or without time ranges, e.g. for calculating averages.
$ curl -X POST "http://localhost:7972/query?db=mydb" \
-H "Content-Type: application/json" \
-d '{"query": "SELECT count(*), avg(temperature) FROM weather"}'
{"results":[{"avg(temperature)":87.025,"count_star()":"40"}]}
GigAPI readers can be implemented in any language and with any OLAP engine supporting Parquet files.
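As one example, a DuckDB-based reader in Python can query the hive-partitioned files directly from disk. This sketch assumes the `/data/mydb/weather/date=.../hour=...` layout shown earlier and bypasses the bundled querier entirely.

```python
import duckdb

con = duckdb.connect()  # in-memory DuckDB, no server required
rows = con.execute(
    """
    SELECT count(*) AS points, avg(temperature) AS avg_temp
    FROM read_parquet('/data/mydb/weather/*/*/*.parquet', hive_partitioning = true)
    """
).fetchall()
print(rows)
```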
%%{
init: {
'theme': 'base',
'themeVariables': {
'primaryColor': '#6a329f',
'primaryTextColor': '#fff',
'primaryBorderColor': '#7C0000',
'lineColor': '#6f329f',
'secondaryColor': '#006100',
'tertiaryColor': '#fff'
}
}
}%%
graph TD;
GigAPI-->ParquetWriter;
ParquetWriter-->Storage;
ParquetWriter-->Metadata;
Storage-->Compactor;
Compactor-->Storage;
Compactor-->Metadata;
Storage-.->LocalFS;
Storage-.->S3;
HTTP-API-- GET/POST --> GigAPI;
DuckDB-->Storage;
DuckDB-->Metadata;
subgraph GigAPI[GigAPI Server]
ParquetWriter
Compactor
Metadata;
DuckDB;
end
Footnotes

1. DuckDB ® is a trademark of DuckDB Foundation. All rights reserved by their respective owners.
2. ClickHouse ® is a trademark of ClickHouse Inc. No direct affiliation or endorsement.
3. InfluxDB ® is a trademark of InfluxData. No direct affiliation or endorsement.
4. Released under the MIT license. See LICENSE for details. All rights reserved by their respective owners.