{{ viewTitle }}
{{ viewDesc }}
MergeTree tables, then fans every insert out to a set of materialized-view indexes and dictionaries. Each one re-sorts or pre-aggregates the same rows so a lookup that would scan billions touches only thousands. Engines are colour-coded below; CREATE scripts are at the bottom.blocks.
transactions.
active_addresses_last_30_days_from_dict / _to_dict — back the from_address_murmur / to_address_murmur MATERIALIZED columns, so only active wallets get a non-zero morton code in the sort key.Morton (Z-order) primary key
eth.transactionsA row should be findable by either party — but only for the wallets where that pays off. mortonEncode interleaves the bits of two values into one sort key, co-locating rows that share a value in either position. The catch: it isn't fed the raw addresses but the *_address_murmur columns — which are non-zero only for active wallets. Every other address hashes to 0.
CREATE TABLE eth.transactions
(
...
from_address FixedString(20),
-- murmur = the address hash ONLY for active wallets
-- (in the dict: > 256 txns over the last 30 days); else 0
from_address_murmur UInt64 MATERIALIZED if(dictHas(
'active_addresses_last_30_days_from_dict',
murmurHash3_64(from_address)), murmurHash3_64(from_address), 0),
to_address FixedString(20),
to_address_murmur UInt64 MATERIALIZED if(dictHas(
'active_addresses_last_30_days_to_dict',
murmurHash3_64(to_address)), murmurHash3_64(to_address), 0),
...
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(block_timestamp)
ORDER BY (mortonEncode(to_address_murmur, from_address_murmur),
block_number, transaction_index)
SETTINGS index_granularity = 256
Both *_address_murmur columns are MATERIALIZED — computed on insert, non-zero only when the address is in the active dictionary; everyone else hashes to 0.
The ORDER BY sorts on mortonEncode(to, from), so an active address's rows cluster whether it's the sender or the receiver. PARTITION BY month and index_granularity 256 keep every scan bounded.
morton 0SELECT from_address, to_address, from_address_murmur, to_address_murmur,
mortonEncode(to_address_murmur, from_address_murmur) AS morton
FROM eth.transactions
ORDER BY morton, block_number, transaction_index
LIMIT 4 -- the very front of the table
Inactive↔inactive transfers — almost the entire chain — all hash to 0, so they share morton 0 and sit together at the very front, ordered only by block_number (note the genesis-era blocks). They're located through the reverse index's block-number bitmap, never by morton.
SELECT from_address, to_address, from_address_murmur, to_address_murmur,
mortonEncode(to_address_murmur, from_address_murmur) AS morton
FROM eth.transactions
WHERE to_address_murmur != 0 AND from_address_murmur = 0
ORDER BY morton DESC, block_number DESC
LIMIT 4
This active recipient (to_murmur = 4828955614929454015) is paid by ordinary, inactive senders (from_murmur = 0). Because only the active side carries a hash, mortonEncode(to_murmur, 0) is identical for every one — all of its received transactions land on a single morton code, i.e. one contiguous run on disk the primary index seeks straight to. No scan.
The bargain: spend morton's interleave only on the ~32k wallets busy enough to be looked up by address, and let the billion-address long tail collapse to one code reached by bitmap. It's never about an address value being large or small — only active or not.
-- granules = rows touched; ranges = their contiguous runs
SELECT
_part,
count() AS rows,
uniqExact(intDiv(_part_offset,256)) AS granules,
arrayCount(
x -> x > 1,
arrayDifference(arraySort(
groupUniqArray(intDiv(_part_offset,256))
))
) + 1 AS ranges,
max(_part_offset) - min(_part_offset) + 1 AS span
FROM eth.transactions
WHERE <this address>
GROUP BY _part
0x288204d5…0bbf
part 202103
Both murmurs are 0 → morton 0. Its 93 rows scatter across 17 granules in 5 separate ranges (span 6,984 ≫ 93) — the reader makes 5 seeks via the block bitmap.
0xeaf9a34b…dc42
part 202304
to_murmur ≠ 0 → one morton code. 155 rows packed into 2 adjacent granules forming 1 contiguous range (span = 155, zero gaps) — a single seek.
Granules are the rows touched; ranges are the separate seeks to reach them. Active: 1 range. Inactive: 5 ranges. Fewer, fatter reads — that's the payoff morton buys for the wallets that get looked up.
Reverse-index materialized views
*_reverse_idxThe table is sorted by morton code, so finding a transaction by hash would otherwise be a full scan. A materialized view inverts that lookup, mapping each hash to its block. (The address → day rollup that powers wallet activity is covered in the Window section.)
-- 1. hash → block : a flat lookup table CREATE MATERIALIZED VIEW transactions_hash_reverse_idx_stream TO transactions_hash_reverse_idx AS SELECT murmurHash3_32(hash) AS hash, block_number FROM eth.transactions;
-- using it: resolve the block from the index, then read just that block
SELECT * FROM eth.transactions
WHERE block_number IN (
SELECT block_number FROM transactions_hash_reverse_idx
WHERE hash = murmurHash3_32(unhex(substring('0x63f7…20c59a', 3)))
)
AND hash = unhex(substring('0x63f7…20c59a', 3));
A flat map of murmurHash3_32(hash) → block_number, sorted by the hash. Resolve the block first, then read just that one block instead of the whole chain.
A projection can be the reverse index
No separate table needed: a projection index is a lightweight projection that stores only the row order for an expression — internally SELECT _part_offset ORDER BY murmurHash3_32(hash). ClickHouse uses it to resolve WHERE hash = … on its own — a reverse index rebuilt automatically on every insert, with no MV and no subquery to write.
-- a projection INDEX: stores just the permutation for an expression
-- (ClickHouse projection-index feature, TYPE basic = a normal index)
ALTER TABLE eth.transactions
ADD PROJECTION order_by_hash INDEX murmurHash3_32(hash) TYPE basic;
-- the planner uses it automatically; no subquery, no join
SELECT * FROM eth.transactions
WHERE hash = unhex(substring('0x63f7…20c59a', 3));
optimize_use_projections = 0, measured with the condition cache off so it doesn't skew the count. Turn the condition cache on and the same lookup drops to ~7K.
Versus the MV approach, there's nothing to query explicitly — the projection index lives in the table and the planner picks it when the WHERE filters on hash (matched via murmurHash3_32(hash) under the hood).
See it live in the Transaction tab: Projection uses order_by_hash; Naive runs the same query with projections off and scans the whole chain.
Windowed last-N over the aggregating index
transactions_address_reverse_idxThe reverse index is an AggregatingMergeTree grouped by (address, direction, day), keeping per day a transaction count, a SET of morton codes, and a BITMAP of block numbers. To return an address's most recent N transactions, walk its day-groups newest-first, keep a running total with a window function, and stop the moment the newer groups already cover N — then probe the main table by just those codes & blocks. No fixed cap, so it can never come up short.
WITH idx AS (
SELECT morton, block_number,
sum(transactions_count) OVER (ORDER BY block_date DESC
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS newer
FROM eth.transactions_address_reverse_idx
WHERE address = murmurHash3_64(unhex('…'))
QUALIFY newer < 25 -- keep groups until 25 is covered
)
SELECT ... FROM eth.transactions AS t
WHERE mortonEncode(t.to_address_murmur, t.from_address_murmur)
IN (SELECT arrayJoin(morton) FROM idx)
OR block_number IN (SELECT arrayJoin(bitmapToArray(block_number)) FROM idx)
ORDER BY block_number DESC, transaction_index DESC
LIMIT 25
For this address (143 txns over 49 days), the last 25 are covered by the 9 newest day-groups — 29 candidate rows. The other 40 days, including a 48-tx day just below the cut, are never read. On a long-lived address it's a handful of recent groups instead of its entire history.
Projections (a pre-aggregated rollup)
PROJECTION tx_per_dayA projection is a second copy of the table stored inside it — it can re-sort the rows or, as here, pre-aggregate them with a GROUP BY. The aggregation does the heavy lifting: tx_per_day collapses 383M transactions into 365 daily rows, so a year's count reads 365 rows instead of re-scanning the chain. ClickHouse swaps it in automatically when a query's GROUP BY matches; flip any Analytics panel to Naive (optimize_use_projections = 0) to force the raw path.
-- transactions per day across all of 2023 SELECT toDate(block_timestamp) AS day, count() FROM eth.transactions WHERE toDate(block_timestamp) BETWEEN '2023-01-01' AND '2023-12-31' GROUP BY day SETTINGS optimize_use_projections = 1 -- then 0
Both paths return the same 365 rows. The projection reads one pre-grouped row per day; the raw path re-scans every transaction in the year — 383 million rows, 1.8 GB.
PROJECTION tx_per_day (
SELECT toDate(block_timestamp) AS date,
count() AS total_count,
countIf(receipt_status) AS completed_count
GROUP BY date )
It's stored and merged alongside the table, so it's always in sync — no manual refresh. The planner picks it transparently whenever a query's GROUP BY matches.
Bloom skip index on a sort-key column
address_bloomSkip indexes aren't only for non-key columns. The reverse index is already ORDER BY (address, …), yet it also carries a bloom on address. The sparse primary index narrows to candidate granules but can't prove a value is absent — the bloom can, with no false negatives, so ClickHouse drops granules the primary key would otherwise read.
-- one address, run twice: skip index ON vs OFF
SELECT count()
FROM eth.transactions_address_reverse_idx
WHERE address = murmurHash3_64(unhex('eaf9a34b…dc42'))
SETTINGS use_skip_indexes = 1 -- then 0
Both runs return the same 212 rows. The index is partitioned by month: with the bloom off, the primary key reads one candidate granule in every one of the 264 parts; with it on, the 215 parts where this address never appears are skipped — only 49 are touched.
INDEX address_bloom address TYPE bloom_filter(0.0001) GRANULARITY 100
A sparse index only orders by the first key
ClickHouse's primary index is sparse — one mark per granule, in full sort-key order. A filter on the leading key bisects straight to a handful of granules. But a later key is only sorted within each run of the one before it, so filtering on it alone gives the sparse index nothing to bisect and degrades to a near-full scan. block_number is the case in point — blocks arrive in order, but the table is ORDER BY (mortonEncode(…), block_number, …), so morton scatters the blocks across granules. A minmax + bloom_filter skip index on block_number puts the pruning back.
INDEX block_number_minmax block_number TYPE minmax GRANULARITY 2 INDEX block_number_bloom block_number TYPE bloom_filter(0.001) GRANULARITY 15
minmax keeps the [min, max] block per granule-group and drops any whose range can't hold N; the bloom confirms presence for the survivors. So a filter on a non-leading key prunes the table again — the mechanism behind the Block tab's rows-read counter.
Query condition cache
ClickHouse 25.3+ · use_query_condition_cacheDifferent from the query result cache, which memorises the whole answer (and can go stale): the condition cache records, for each granule, a single bit — did any row match this WHERE? A repeat query with the same predicate skips every 0-bit granule, even on a column with no index, projection, or sort-key help — and it still applies when the rest of the query changes, as long as the filter is the same.
-- a selective filter on a non-indexed column, run twice SELECT count() -- 87 transfers over 30,000 ETH FROM eth.transactions WHERE block_timestamp >= '2023-06-01' AND block_timestamp < '2023-07-01' AND value > 30000e18 SETTINGS use_query_condition_cache = 1
The first run pays the scan and records the bitmap; every later query with that filter rides it. It can't go stale — parts are immutable, so a recorded bit stays valid; only brand-new parts are still unknown.
SYSTEM CLEAR QUERY CONDITION CACHE resets it.Query result cache & paging prefetch
use_query_cache · LIMIT over-fetchToggle query cache in the header. The first run of a deterministic query is stored, so an identical re-run returns straight from memory — rows read ≈ 0, answered in microseconds. The listings lean on this with one deliberate trick: they over-fetch.
-- the table shows 10 rows per page, but every listing query asks for 100 SELECT … FROM eth.transactions WHERE … ORDER BY … LIMIT 100; -- one deterministic batch — identical text on every page flip
Paging by the displayed 10 rows would issue a new LIMIT … OFFSET … per page — every one a cache miss. Fetching 100 at once makes pages 1–10 one cached query; the client just shows a different 10-row slice.
The result cache only reuses an answer when the query text and settings match exactly. A fixed LIMIT 100 — not a per-page offset — keeps the text identical, so a re-run or a return to page 1 reads ≈ 0 rows.
Column data types & codecs
the smallest type winsThe biggest compression lever isn't a codec — it's the data type. The S3 feed arrives as wide String and Int64 columns; narrowing each to the smallest type that fits the data cut the transactions table nearly in half — before touching a single codec.
eth.transactions.
A 66-char hex String becomes FixedString(32): drop the 0x, unhex 64→32 bytes, and lose String's length byte — >2× smaller. Hashes are pure entropy, so codecs can't shrink them; the type is the only lever (hash even uses CODEC(NONE)).
Int64 wastes 4–7 bytes per value. block_number → UInt32 (130+ yrs of headroom), transaction_index → UInt16 (~1,500 tx/block), receipt_status & transaction_type → UInt8 (2 and 4 values).
block_hash repeats for every transaction in a block — thousands of identical values. Dictionary-encoding drops uncompressed size 1.17 GiB → 80 MiB, and a GROUP BY on it from 263 MiB → 8 MiB peak memory.
Apply only where they pay
Once the types are right, a codec is the second lever — and not a free one. ClickHouse processes data uncompressed, so a heavier codec is paid back as CPU on every read. Reach for it only on the largest columns, where the byte savings dwarf the decode cost.
Per-column reads balloon: from_address 0.86s → 2.65s (3×), transaction_index 0.05s → 0.99s (19×). A whole-table scan is ~1.8× slower and writes ~2× slower.
input CODEC(ZSTD(3)) from_address, to_address CODEC(ZSTD(3)) nonce, *_gas_used CODEC(T64, LZ4) hash CODEC(NONE) -- random