time periods. In other words, writes primarily occur as new inserts, not as updates to existing rows. Further, while the database needs to be able to support backfill for delayed data, writes are made primarily to recent time intervals.

(2) Workloads have a natural partitioning across both time and space. Writes typically are made to the latest time interval(s) and across the "partitioning key" in the space dimension (e.g., data sources, devices, users, etc.). Queries typically ask questions about a specific time series or data source, or across many data sources constrained to some time interval. Yet the queries might not be limited to a particular metric, but may regularly select multiple metrics at once (or use predicates that rely on multiple metrics).
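For example, a dashboard backed by such a workload might regularly issue a query of the following shape (a minimal sketch; the metrics table and its columns are hypothetical and purely illustrative):

-- Hypothetical schema: metrics(time, device_id, cpu, mem_free, ...)
-- Fetch several metrics at once for one device over a recent interval,
-- using a predicate that spans multiple metrics.
SELECT time, cpu, mem_free
FROM metrics
WHERE device_id = 'device_123'
  AND cpu > 0.9
  AND mem_free < 0.1
  AND time > now() - interval '1 hour'
ORDER BY time DESC;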
Such workloads open up a new space of database architecture possibilities, of which TimescaleDB heavily takes advantage. Notably, these characteristics differ not only from traditional OLTP workloads, but also from analytical (OLAP) workloads that focus on read-heavy rollups and aggregations of single metrics.

2 Limitations of existing solutions

Existing solutions require users to choose between either scalability or rich query support.

Vanilla RDBMS (e.g., PostgreSQL, MySQL). Traditional SQL databases have two key problems in handling high ingest rates: they have poor write performance for large tables, and this problem only becomes worse over time as data volume grows linearly. These problems emerge when table indexes can no longer fit in memory, as each insert then translates into many disk fetches to swap in portions of the indexes' B-trees. Further, any data deletion (to save space or to implement data retention policies) will require expensive "vacuuming" operations to defragment the disk storage associated with such tables. Also, out-of-the-box open-source solutions for scaling out an RDBMS across many servers are still lacking.

NoSQL and time-series DBs. In response, developers sometimes adopt NoSQL databases (e.g., Cassandra, Mongo) or modern time-series databases (e.g., OpenTSDB, InfluxDB) for their needs. These are typically column-oriented databases designed for fast ingest and fast analytical queries over one column, often exposing a simpler key-value interface. Depending on data-model choices, this works well for visualizing either a single metric (e.g., the CPU utilization of a device) or some aggregate metric (e.g., average CPU over all devices). But they often lack a rich query language or secondary-index support, and suffer high latency on complex queries.

For example, databases like InfluxDB will fall back to full table scans if the WHERE clause of their custom query language includes a numerical column (e.g., cpu > 0.7). String columns pose other challenges for most NoSQL time-series databases. For example, one approach for indexing string columns (e.g., ip_address, uuid, event_name) is to use a concatenated string of all indexed values as a key. This creates a tradeoff between the number of indexed fields and performance: searching for one field requires a scan across all other indexed fields. These problems make multi-field queries (e.g., "all metrics from devices of a certain type with low battery") less efficient. Such queries are quite common in dashboards, reports, and data exploration. Further, these systems lack the reliability, tooling, and ecosystem of more widely-used traditional RDBMSs.

Data lakes (e.g., SQL / Hadoop / Spark on HDFS). Distributed block/file systems like HDFS avoid the need to predefine data models or schemas, and easily scale by adding more servers. They can also handle high write rates by writing large, immutable batches of data. However, they pay the cost at query time, lacking the highly structured indexes needed for fast and resource-efficient queries. Further, data backfill and updates are difficult and very expensive, given their underlying storage in large immutable blocks.

So, while query engines on top of HDFS can expose a SQL or rich programming interface (e.g., Presto, Pig, SparkSQL, etc.), many types of queries that an RDBMS with appropriate index support can handle efficiently turn into full table scans in their underlying storage interface. While HDFS allows such scans to be easily parallelized, query latency and throughput still suffer significantly. This relegates such architectures more to single-tenant data exploration or report generation than to dashboarding or real-time operational monitoring.

3 TimescaleDB architecture

TimescaleDB is an open-source database for time-series data. Its goal is to enable both the scale-out nature of NoSQL databases and the reliability and query support of a traditional relational DB. We've designed a new clustered DB in this manner, built around a Postgres core running on each server.

TimescaleDB supports the key features of a modern time-series database, as summarized in Figure 1.

Hypertables and chunks. At a high level, the database exposes the abstraction of a single continuous table, a hypertable, across all space and time intervals, such that one can query it via vanilla SQL. A hypertable is defined with a standard schema with column names and types, with at least one column specifying a time value and, in clustered deployments, one column specifying a "partitioning key" over which the dataset can be additionally partitioned.
• Time-series data optimized
• Full SQL interface
• Scaling up and out
  − Transparent time/space partitioning
  − Parallelized ops across chunks and servers
  − Right-sized chunks for single nodes
• High data write rates
  − Batched commits
  − In-memory indexes
  − Transactional support
  − Support for data backfill
• Optimizations for complex queries
  − Intelligent chunk selection for queries
  − Minimize scanning for distinct items
  − Limit pushdowns
  − Parallelized aggregation
• Leverage existing query planner
  − Join against relational data
  − Geo-spatial queries
• Flexible management
  − Leverage existing DB ecosystem and tooling
  − Highly reliable (streaming replication, backups)
  − Automated data retention policies

Figure 1: Key features of TimescaleDB.
Internally, TimescaleDB automatically splits the hypertable into chunks, where a chunk corresponds to a "two-dimensional" split according to a specific time interval and a region of the partition key's space (e.g., using hashing). Each chunk is implemented using a standard database table that is automatically placed on one of the database nodes (or replicated between multiple nodes), although this detail is largely hidden from users. A single TimescaleDB deployment can store multiple hypertables, each with a different schema.

Engineered up from PostgreSQL. By choosing to engineer up from PostgreSQL, rather than building from scratch, TimescaleDB gains four immediate benefits.

1. Rock-solid reliability. At its core, TimescaleDB's reliability stems from Postgres' 20-year open-source record and strong developer community.
2. Mature ecosystem. TimescaleDB users can connect via standard ODBC, JDBC, or Postgres connectors to third-party visualization tools, BI tools, management interfaces, web platforms, and ORMs.
3. Standard interface. TimescaleDB users do not need to learn a new query language and management framework, and can leverage their existing comfort with SQL and Postgres.
4. Operational ease of use. Users can reuse known and trusted methods for backups, snapshots, active replication, and other operational tasks.

In fact, all of TimescaleDB is implemented as a Postgres extension, rather than a fork, so it can be installed on a standard distribution of Postgres.
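Concretely, enabling TimescaleDB in a database is an ordinary Postgres extension install; the packaging details and any required server configuration (such as preloading the extension library) may vary by version and platform:

-- Minimal sketch: enable the extension on a standard Postgres installation.
-- Depending on the TimescaleDB version, the server may also need
-- shared_preload_libraries = 'timescaledb' set in postgresql.conf first.
CREATE EXTENSION IF NOT EXISTS timescaledb;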
When you might want to consider alternatives. Database design typically requires making deliberate trade-offs, and we would be remiss not to mention the scenarios where there may be better alternatives:

• Simple read requirements: When most of your query patterns are simple in nature (e.g., key-based lookups, or one-dimensional rollups over time).
• Low available storage: When resource constraints place storage at a premium and heavy compression is required. (Although this is an area of active development, and we expect TimescaleDB to improve.)
• Sparse and/or unstructured data: When your time-series data is especially sparse and/or generally unstructured. (But even if your data is partially structured, TimescaleDB includes a JSONB field type for the unstructured part(s). This allows you to maintain indexes on the structured parts of your data combined with the flexibility of unstructured storage.)

4 Technical details

Ideally, database users should be able to interact with time-series data as if it were in a simple continuous database table. However, for reasons discussed above, using a single table does not scale. Yet requiring users to manually partition their data exposes a host of complexities, e.g., forcing users to constantly specify which partitions to query, how to compute JOINs between them, or how to properly size these tables as workloads change.

To avoid this management complexity while still scaling and supporting efficient queries, TimescaleDB hides its automated data partitioning and query optimizations behind its hypertable abstraction. Creating a hypertable and its corresponding schema is a simple, standard SQL command (see Figure 2), and this hypertable can then be accessed as if it were a single table using standard SQL commands (see Figure 3). Further, just like a normal database table, this schema can be altered via standard SQL commands; transparently to the user, TimescaleDB atomically modifies the schemas of all the underlying chunks that comprise the hypertable.
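For instance, adding a column to the sensor_data hypertable defined in Figure 2 is plain DDL (a minimal sketch; the new column name is hypothetical):

-- Standard ALTER TABLE on the hypertable; TimescaleDB applies the change
-- to every underlying chunk on the user's behalf.
ALTER TABLE sensor_data ADD COLUMN firmware_version TEXT NULL;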
-- Create a schema for a new hypertable
CREATE TABLE sensor_data (
  "time"      timestamp with time zone NOT NULL,
  device_id   TEXT NOT NULL,
  location    TEXT NULL,
  temperature NUMERIC NULL,
  humidity    NUMERIC NULL,
  pm25        NUMERIC
);

-- Create a hypertable from this data
SELECT create_hypertable('sensor_data', 'time', 'device_id', 16);

-- Migrate data from an existing Postgres table into
-- a TimescaleDB hypertable
INSERT INTO sensor_data (SELECT * FROM old_data);

-- Query the hypertable like any SQL table
SELECT device_id, AVG(temperature) FROM sensor_data
WHERE temperature IS NOT NULL AND humidity > 0.5
  AND time > now() - interval '7 day'
GROUP BY device_id;

Figure 2: Creating a new hypertable is a simple SQL command. Subsequently, users can interact with the hypertable as if it were a standard, single database table, including issuing full SQL queries on it.

-- Metrics about resource-constrained devices
SELECT time, cpu, freemem, battery FROM devops
WHERE device_id = 'foo'
  AND cpu > 0.7 AND freemem < 0.2
ORDER BY time DESC
LIMIT 100;

-- Calculate total errors by latest firmware versions
-- per hour over the last 7 days
SELECT date_trunc('hour', time) AS hour, firmware,
       COUNT(error_msg) AS errno FROM data
WHERE firmware > 50
  AND time > now() - interval '7 day'
GROUP BY hour, firmware
ORDER BY hour DESC, errno DESC;

-- Find average bus speed in the last hour
-- for each NYC borough
SELECT loc.region, AVG(bus.speed) FROM bus
INNER JOIN loc ON (bus.bus_id = loc.bus_id)
WHERE loc.city = 'nyc'
  AND bus.time > now() - interval '1 hour'
GROUP BY loc.region;

Figure 3: Query examples using standard SQL and the abstraction of a single global table. Note that the complexities of data partitioning for scalability are hidden from the user.
TimescaleDB provides this functionality by hooking into the query planner of Postgres, so that it receives the native SQL parse tree. It can then use this tree to determine which servers and hypertable chunks (native database tables) to access, how to perform distributed and parallel optimizations, etc.

Many of these same optimizations even apply to single-node deployments, where automatically splitting hypertables into chunks and the related query optimizations still provide a number of performance benefits.

We highlight and explain TimescaleDB's architecture and design choices in the rest of this section.

4.1 Scaling up and out

Transparent time/space partitioning. As discussed, the database scales by partitioning hypertables in two dimensions: by time interval, and by a "partitioning key" over some primary index for the data (e.g., device identifiers for sensor data, locations, customers, users, etc.). Each time/space partition is called a chunk, which is created and placed on a server and disk automatically by the system. An illustration of TimescaleDB's architecture is shown in Figure 4.

Every database node knows about the time and space ranges comprising the data in each chunk, and each node builds local indexes on each individual chunk. As we discuss later, information about chunks' ranges allows the query planner to determine which chunks to query when resolving a query, particularly when WHERE or GROUP BY clauses include the time or space (partitioning key) dimension (this is commonly referred to as constraint exclusion analysis).

Local indexes can be built on any database column, not only on the main partitioning key, and are defined as the conjunction of both time and the column being indexed. For example, in the hypertable shown in Figure 2, if the partitioning key is device_id, secondary indexes can also be defined on location and any of the numerical sensor readings (temperature, humidity, particulate matter). By defining the local index on both such a column and time, TimescaleDB's planner again knows how to optimize queries given any use of time in the query predicate.
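As a minimal sketch, such a secondary index can be declared with ordinary Postgres DDL against the hypertable (the exact indexes TimescaleDB creates by default may differ by version):

-- Pair the indexed column with time, matching the (column, time) composite
-- described above; each chunk gets its own local copy of this index.
CREATE INDEX ON sensor_data (location, time DESC);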
Parallelizing across chunks and servers. Chunks are dynamically created by the runtime and sized to optimize performance in both cluster and single-node environments. When run as a cluster, chunks are placed on different servers; on a single machine, chunks can also be automatically spread across disks. With either approach, partitioning by space parallelizes inserts to recent time intervals. Similarly, query patterns often slice across time or space, so they also enjoy performance improvements through smart chunk placement.

By default, chunks belonging to the same region of the partition keyspace, yet varying by time interval, are collocated on the same servers. This avoids queries touching all servers when performing queries for a single object in space (e.g., a particular device), which helps reduce tail latency under higher query loads.
[Figure 4 diagram: chunks arranged along a TIME axis (latest to oldest) and a PARTITION KEY axis, grouped by SERVER, with queries shown slicing across space and across time.]

Figure 4: TimescaleDB architecture. Data is partitioned across both time and space, with the resulting chunks organized for common query patterns. Writes are largely sent to the latest chunks, while queries slice across both time and space.

Right-sizing chunks for single nodes. Even in single-node settings, chunking still improves performance over the vanilla use of a single database table, for both inserts and deletes. Right-sized chunks ensure that all of the B-trees for a table's indexes can reside in memory during inserts, avoiding thrashing while modifying arbitrary locations in those trees. Further, by avoiding overly large chunks, we can avoid expensive "vacuuming" operations when removing deleted data according to automated retention policies, as the runtime can perform such operations by simply dropping chunks (internal tables) rather than deleting individual rows. At the same time, avoiding too-small chunks improves query performance by not needing to read additional tables and indexes from disk.

TimescaleDB performs such time/space partitioning automatically based on table sizes, rather than the more commonly practiced approach based on static time intervals (e.g., creating a separate table per day). When systems lack TimescaleDB's transparent hypertable abstraction, interval-based partitioning might make the manual table selection and joins at least tractable (although not easy). But such intervals work poorly as data volumes change; e.g., a time interval appropriate for a service pulling data from 100 devices is not appropriate when that system scales to 100K devices. TimescaleDB avoids the need to make this choice by managing data partitioning automatically.

4.2 High data write rates

Batched commits. Writes are typically made to recent time intervals, rather than to old tables. This allows TimescaleDB to efficiently write batch inserts to a small number of tables, as opposed to performing many small writes. Further, our scale-out design also takes advantage of the fact that time-series writes go to recent time intervals, in order to parallelize writes across many servers and/or disks and further support high data ingest rates. These approaches improve performance on either HDDs or SSDs.
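For example, a collector draining a message queue might commit many readings in one statement (a minimal sketch against the sensor_data hypertable from Figure 2; the values are illustrative):

-- One multi-row INSERT writes a batch of recent readings in a single commit,
-- touching only the chunk(s) that cover the latest time interval.
INSERT INTO sensor_data ("time", device_id, temperature, humidity, pm25)
VALUES
  (now(), 'device_1', 70.0, 0.45, 10.2),
  (now(), 'device_2', 68.3, 0.51,  9.8),
  (now(), 'device_3', 71.2, 0.40, 11.5);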
In-memory indexes. Because chunks are right-sized to servers, and thus the database never builds massive single tables, TimescaleDB avoids swapping indexes to disk for recent time intervals (where most writes occur). Yet it can efficiently support any type of Postgres index on its columns, from more traditional text or numerical columns to more specialized indexes on array data types or GIS (spatial) columns.

Transactional support. TimescaleDB supports full transactions over entries with the same partition key. In a monitoring application, for example, this ensures transactional semantics on a per-device basis and guarantees that multiple device measurements, which may each involve many individual sensor metrics, are atomically inserted.
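In SQL terms this is just an ordinary Postgres transaction over rows sharing a partition key (a minimal sketch against the sensor_data hypertable; the device and values are hypothetical):

-- Either all of this device's measurements become visible, or none do.
BEGIN;
INSERT INTO sensor_data ("time", device_id, temperature, humidity)
VALUES (now(), 'device_42', 69.1, 0.48);
INSERT INTO sensor_data ("time", device_id, pm25)
VALUES (now(), 'device_42', 12.3);
COMMIT;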
Support for data backfill. Even though TimescaleDB's architecture is optimized for the scenario where most writes go to the latest time intervals, it fully supports the "backfill" of delayed data. Additionally, the database's automated chunk management is aware of the possibility of backfill, so a modest amount of delayed data can be configured not to "overflow" a chunk's size limit and unnecessarily create additional undersized chunks for that interval.

Performance benefits for single nodes. While data partitioning is traditionally seen as a mechanism for scaling out to many servers, TimescaleDB's approach also provides meaningful performance improvements even when employed on a single machine.

To evaluate this behavior, we performed experiments where clients insert both individual rows to the databases and large batches of rows in single operations. Each row includes 12 values in separate columns: a timestamp, an indexed randomly-chosen primary id, and 10 additional numerical metrics. Such batched inserts are common practice for databases employed in more high-scale production environments, e.g., when ingesting data from a distributed queue like Kafka. Figure 5 illustrates the results.

In both scenarios, standard Postgres tables hit a performance cliff after tens of millions of rows. Not only does throughput drop off, but the variance increases significantly. With single-row inserts, the database quite regularly achieves only hundreds of inserts per second. With batched inserts, the insert rate for Postgres converges to only thousands of inserts per second, around a 30x decrease from its performance at the onset.

TimescaleDB, on the other hand, maintains constant insert performance and low variance regardless of the database size, as individual chunks remain appropriately sized. When inserting in batches, TimescaleDB starts out at an average rate of about 140,000 rows per second, much like vanilla Postgres. But unlike Postgres, TimescaleDB maintains its stable performance regardless of data size.
[Figure 5 plots: (a) Single-row inserts, 4 GB cache (insert batch size 1); (b) Batched inserts, 16 GB cache (insert batch size 10,000). Y-axis: insert rate [rows/second]; series: PostgreSQL vs. TimescaleDB.]

Figure 5: Insert throughput into TimescaleDB's chunked hypertable (red) compared to a standard table in PostgreSQL (blue) on a single database server. Experiments were run using PostgreSQL 9.6.2 on an Azure standard DS4 v2 (8-core) machine with SSD-based (premium LRS) storage. Every inserted row has 12 columns. Trend lines constitute a polynomial fit of the underlying data; each datapoint shows the average insert throughput over a 20s period. Throughput drops in PostgreSQL as tables grow large, while hypertable insert performance remains constant regardless of total data volume.
4.3 Optimizations for complex queries

Intelligently selecting the chunks needed to satisfy queries. Common queries to time-series data include (i) slicing across time for a given object (e.g., a device), (ii) slicing across many objects for a given time interval, or (iii) querying the last reported data records across (a subset of) all objects or some other distinct object label. Such queries over time or space, which may require scanning over many chunks (disks, servers), are illustrated in Figure 4.

While users perform these queries as if interacting with a single hypertable, TimescaleDB leverages internally-managed metadata to query only those chunks that may possibly satisfy the query predicate. By aggressively pruning the chunks and servers to contact in its query plan, TimescaleDB improves both query latency and throughput.

Minimizing scanning back in time: "LIMIT BY" queries for distinct items. Similarly, for items like unique devices, users, or locations, one often wants to ask questions like "give me the last reading for every device." While this query can be natively expressed in SQL using windowing operators, such a query would turn into a full table scan for most relational databases. In fact, this full table scan could continue back to the beginning of time to capture "for every device," or at best sacrifice completeness with some arbitrarily-specified time range.

To efficiently support such queries, TimescaleDB automatically tracks metadata about "distinct" items in the database, as specified in the hypertable's schema. When coupled with its other optimizations, such queries touch only the necessary chunks and perform efficiently-indexed queries on each individual chunk.
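One way to express such a query in Postgres SQL, sketched here against the sensor_data hypertable from Figure 2, is with DISTINCT ON; the distinct-item metadata described above is what lets TimescaleDB avoid the full scan this form would otherwise imply:

-- Last reading for every device.
SELECT DISTINCT ON (device_id) device_id, time, temperature, humidity
FROM sensor_data
ORDER BY device_id, time DESC;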
Minimizing ordered data from distinct chunks. TimescaleDB provides a number of additional query optimizations that benefit both single-node and clustered deployments. Postgres already enables a "merge append" optimization when combining in-order data from multiple chunks, where a query efficiently combines data from these tables in sorted order, adding rows one-by-one to the query result in the proper sorted order, and then stopping when the global query has been satisfied (e.g., based on its LIMIT). This optimization ensures that a subquery is only incrementally processed on a table if its result would benefit the final global result set. This is particularly beneficial for queries with complex predicates, where finding the "next" item that matches the predicate can involve scanning a significant amount of data.

TimescaleDB extends such merge-append optimizations to bring these significant efficiencies to time-based aggregates. Such aggregates appear quite regularly in time-series analysis, such as "tell me the average of a metric per hour, for the last 12 hours that the device has reported data," which expressed in SQL involves "GROUP BY hour ORDER BY hour DESC LIMIT 12." This way, even without a strict time range specified by the user (unlike in many time-series databases), the database will only process the minimal set of chunks and data needed to answer this query.
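Spelled out against the devops table from Figure 3 (a minimal sketch; the metric chosen is illustrative), such a query looks like:

-- Hourly average of a metric for the last 12 hours in which this device
-- reported data, with no explicit time range in the WHERE clause.
SELECT date_trunc('hour', time) AS hour, AVG(cpu) AS avg_cpu
FROM devops
WHERE device_id = 'foo'
GROUP BY hour
ORDER BY hour DESC
LIMIT 12;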
Parallelized aggregation. Much like its LIMIT pushdown, TimescaleDB also pushes down aggregations for many common functions (e.g., SUM, AVG, MIN, MAX, COUNT) to the servers on which the chunks reside.
Primarily a benefit for clustered deployments, this distributed query optimization greatly minimizes network transfers by performing large rollups or GROUP BYs in situ on the chunks' servers, so that only the computed results need to be joined toward the end of the query, rather than the raw data from each chunk.

4.4 Leveraging the RDBMS query planner

Because each node runs a full-fledged Postgres query planner and data model, deploying new optimizations for particular queries, indexes, and data types is easy.

From the start, TimescaleDB supports Postgres' full SQL interface. In the current implementation, some queries are optimized more than others; however, subsequent releases will include additional query optimizations, allowing the database to "grow" with users' needs without requiring any changes by users.

Join against relational data. Today, you can compute joins between hypertables and standard relational tables, which are either stored directly in the database or accessed via foreign data wrappers to external databases. Future optimizations will minimize data movement during joins.

Most time-series databases today do not support such JOINs. That lack of support forces users to denormalize their data by storing additional metadata or labels with every time-series record. This approach greatly expands data sizes and makes updating metadata very expensive. Alternatively, application writers "silo" their data between databases and then must perform this join between relational and time-series data outside of the database, which increases overall system complexity.

Geo-spatial queries. TimescaleDB supports arbitrary Postgres data types within the time series, including GPS-coordinate data, by leveraging PostGIS, which provides best-in-class support for geo-spatial data. Because a chunk can use any type of indexing on its data, GIN/GiST indexes are supported on GIS data right out of the box.
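For instance, with PostGIS installed, a spatial column and its index can be added to a hypertable using plain DDL (a minimal sketch against the sensor_data hypertable; the column name is hypothetical):

-- A PostGIS geography column on the hypertable, indexed with GiST;
-- as with any other column, the index is built locally on each chunk.
ALTER TABLE sensor_data ADD COLUMN geo_location geography(POINT, 4326);
CREATE INDEX ON sensor_data USING GIST (geo_location);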
4.5 Flexible management

Because chunks are native database tables internally, we can readily leverage the rich set of existing Postgres capabilities in TimescaleDB.

Tooling ecosystem. TimescaleDB leverages the database management tooling and features that have developed within the Postgres ecosystem over two decades. Users can connect to TimescaleDB via standard JDBC or ODBC connectors and the psql command-line tool. Yet given the way that TimescaleDB's partitioning is implemented, users typically only see their hypertables (on which they can specify these management functions), rather than their constituent chunks.

Highly reliable (replication and backups). TimescaleDB can reuse the battle-tested replication techniques employed by Postgres, namely streaming replication and cold/hot standbys, as well as backups. It also uses Postgres' write-ahead log (WAL) for consistent checkpointing. In other words, even though replication or backup policies can be defined (or commands issued) on the hypertable, TimescaleDB performs these actions by replicating or checkpointing the hypertable's constituent chunks.

Automated data retention policies. TimescaleDB allows for easily defining data retention policies based on time. For example, users can configure the system to clean up or erase data more than X weeks old. TimescaleDB's time-interval-based chunking also helps make such retention policies more efficient, as the database can then just DROP the internal data tables that have expired, as opposed to needing to delete individual rows and aggressively vacuum the resulting tables. For efficiency, these policies can be implemented lazily, i.e., individual records that are older than the expiry period might not be deleted immediately. Rather, when all data in a chunk becomes expired, the entire chunk can simply be dropped.
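As a sketch of what such a policy can look like (the exact function name and argument order have varied across TimescaleDB versions, so treat this as illustrative rather than definitive), dropping all sensor_data chunks whose data is entirely older than four weeks might be expressed as:

-- Drop whole chunks past the retention window instead of deleting rows.
SELECT drop_chunks('sensor_data', INTERVAL '4 weeks');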
5 Conclusion and Status

Many time-series applications today are asking more complex questions of their data than before: analyzing historical trends, monitoring current behavior, identifying possible problems, predicting future behavior, etc. The data, in turn, is being collected at higher volumes and velocities. In order to serve these applications, the modern time-series database needs to marry scalability with support for highly performant complex queries.

TimescaleDB achieves that combination through its automatic time/space partitioning, optimized query planning, and deep integration with PostgreSQL. At the same time, TimescaleDB delivers this performance through an easy-to-use interface, thanks to its hypertable abstraction as well as full SQL support.

TimescaleDB is in active development by a team of PhDs based in New York City and Stockholm, backed by top-tier investors. An open-source single-node version featuring scalability and full SQL support is currently available for download. A clustered version is in private beta with select customers. We welcome any feedback at hello@timescale.com.

Publication: v6 – April 7, 2017