Instead of saving all values, ClickHouse's sparse primary index saves only one entry per group of rows, making the primary index very small. The first index entry (mark 0) stores the key column values of the first row of granule 0, the second index entry (mark 1 in the diagram below) stores the key column values of the first row of granule 1, and so on. For tables with wide format and without adaptive index granularity, ClickHouse uses .mrk mark files as visualised above, which contain entries with two 8-byte addresses per entry. Each mark file entry for a specific column stores two locations in the form of offsets: the first offset ('block_offset' in the diagram above) locates the block, within the compressed column data file, that contains the compressed version of the selected granule; the second offset locates the granule within the uncompressed block data. This offset information is only needed for the UserID and URL columns, because those are the columns our sample query reads. As a consequence, if we want to significantly speed up our sample query that filters for rows with a specific URL, and in addition keep the good performance of our sample query that filters for rows with a specific UserID, then we need to use multiple primary indexes. In that case it makes sense to remove the second key column from the primary index (resulting in less memory consumption of the index) and to use multiple primary indexes instead. When using ReplicatedMergeTree, there are also two additional parameters, identifying shard and replica. When choosing primary key columns, follow several simple rules, discussed below: getting the order right leads to better data compression and better disk usage.
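Because there is exactly one index entry per granule, the granule count of each part can be sanity-checked from system metadata. A minimal sketch (the table name hits_UserID_URL is an assumption taken from the examples later in this guide):

```sql
-- One primary-index entry ("mark") exists per granule, so marks per part
-- approximates rows / index_granularity.
SELECT
    name                  AS part_name,
    marks,                -- number of granules (and index entries) in the part
    rows,                 -- rows stored in the part
    round(rows / marks)   AS avg_rows_per_granule
FROM system.parts
WHERE table = 'hits_UserID_URL' AND active;
```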
In order to illustrate how the generic exclusion search works, we give some details using our table with compound primary key (UserID, URL). When a query is filtering on a column that is part of a compound key and is the first key column, then ClickHouse is running the binary search algorithm over the key column's index marks. When the query filters only on the second key column, URL, ClickHouse instead falls back to the generic exclusion search, and the trace log shows how unselective that is here: Executor): Key condition: (column 1 in ['http://public_search', ...]), Executor): Used generic exclusion search over index for part all_1_9_2, 1076/1083 marks by primary key, 1076 marks to read from 5 ranges, Executor): Reading approx. 8814592 rows with 10 streams. However, as we will see later, only 39 granules out of those selected 1076 granules actually contain matching rows. The reason lies in the column cardinalities: 2.39 million distinct URL values, 119.08 thousand distinct UserID values, and 4 distinct IsRobot values. When parts are merged, the merged parts' primary indexes are also merged. To stream the rows of a selected granule into the query engine, ClickHouse needs to know the physical location of that granule, for example granule 176. For data processing purposes, a table's column values are logically divided into granules.
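Granule selection can also be inspected without digging through the server trace log. A sketch using EXPLAIN (the table name is an assumption; indexes = 1 reports, per part, how many granules the primary index selected):

```sql
-- Shows the primary-key analysis for a filter on the secondary key column URL,
-- including the selected granule ranges per part.
EXPLAIN indexes = 1
SELECT count()
FROM hits_UserID_URL
WHERE URL = 'http://public_search';
```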
A good ORDER BY usually has 3 to 5 columns, running from the lowest-cardinality column on the left (and the most important for filtering) to the highest-cardinality column (and the least important for filtering). Pick the order that will cover most of the partial primary key usage use cases (e.g. queries that use only the first 1 or 2 columns of a 3-column primary key). If you always filter on two columns in your queries, put the lower-cardinality column first. The diagram below sketches the on-disk order of rows for a primary key where the key columns are ordered by cardinality in ascending order. We discussed that the table's row data is stored on disk ordered by primary key columns; ClickHouse is a column-oriented database management system, and this column separation and sorting implementation makes future data retrieval more efficient. Keep in mind the questions our bot traffic analysis needs to answer with these key columns: how much (percentage of) traffic to a specific URL is from bots, or how confident we are that a specific user is (not) a bot (what percentage of traffic from that user is, or is not, assumed to be bot traffic). For filtering on columns outside the primary key, ClickHouse offers data skipping indexes. For example, these two statements create and populate a minmax data skipping index on the URL column of our table: ClickHouse then creates an additional index that is storing - per group of 4 consecutive granules (note the GRANULARITY 4 clause in the ALTER TABLE statement) - the minimum and maximum URL value. The first index entry (mark 0 in the diagram above) stores the minimum and maximum URL values for the rows belonging to the first 4 granules of our table; the second index entry (mark 1) stores the minimum and maximum URL values for the rows belonging to the next 4 granules of our table, and so on.
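The two statements referenced above are not reproduced in this excerpt; a minimal sketch of what they typically look like (the index name url_minmax is an assumption):

```sql
-- Create a minmax skipping index over URL, one entry per group of 4 granules,
-- then build ("materialize") it for the parts that already exist.
ALTER TABLE hits_UserID_URL
    ADD INDEX url_minmax URL TYPE minmax GRANULARITY 4;

ALTER TABLE hits_UserID_URL
    MATERIALIZE INDEX url_minmax;
```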
Granule 0 can not be excluded because the directly succeeding index mark 1 does not have the same UserID value as the current mark 0, so ClickHouse can not make any assumption about the URL values inside that granule. For our data set, a classic row-level primary index - often a B(+)-Tree data structure - would contain 8.87 million entries, one per row. ClickHouse's index design instead allows the primary index to be small (it can, and must, completely fit into the main memory), whilst still significantly speeding up query execution times: especially for range queries that are typical in data analytics use cases. Finding rows in any ClickHouse MergeTree table with the table's primary index works in this same way. Note that you can't really change the primary key columns with the ALTER TABLE ... MODIFY ORDER BY command. If in a column similar data is placed close to each other, for example via sorting, then that data will be compressed better. As a consequence, if we want to significantly speed up our sample query that filters for rows with a specific URL, then we need to use a primary index optimized to that query - for example by creating a materialized view on our existing table, optimized for speeding up queries filtering on URLs rather than on UserIDs.
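A sketch of such a materialized view, following the pattern described here (the view name mv_hits_URL_UserID is an assumption):

```sql
-- The implicit table backing this view stores the same rows,
-- but ordered by (URL, UserID), so filters on URL can use binary search.
CREATE MATERIALIZED VIEW mv_hits_URL_UserID
ENGINE = MergeTree()
ORDER BY (URL, UserID)
POPULATE                 -- backfill the view from the existing 8.87 million rows
AS SELECT * FROM hits_UserID_URL;
```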
In order to demonstrate the impact of key column order, we create two table versions for our bot traffic analysis data: the table hits_URL_UserID_IsRobot with the compound primary key (URL, UserID, IsRobot), and the table hits_IsRobot_UserID_URL with the compound primary key (IsRobot, UserID, URL). We populate both with the same 8.87 million rows that we used to populate the previous table. When a query is filtering on at least one column that is part of a compound key, and it is the first key column, then ClickHouse is running the binary search algorithm over the key column's index marks. For later key columns the generic exclusion search applies instead, and its effectiveness depends on the predecessor key column: if the UserID values of mark 0 and mark 1 were the same (meaning that the UserID value stays the same for all table rows within granule 0), then ClickHouse could assume that all URL values of all table rows in granule 0 are larger or equal to 'http://showtopics.html%3'; because the UserID values differ, no such assumption is possible and the granule can not be excluded. A compressed block read from a column data file potentially contains a few compressed granules. Despite the name, a primary key is not unique: you can insert many rows with the same value of the primary key. The ALTER TABLE ... MODIFY ORDER BY command is lightweight in the sense that it only changes metadata. Also note that in engines such as ReplacingMergeTree, data is updated and deleted by the primary key; be aware of this when using them with a partitioned table.
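A sketch of the two table versions (the column types are assumptions based on the sample data set):

```sql
CREATE TABLE hits_URL_UserID_IsRobot
(
    UserID  UInt32,
    URL     String,
    IsRobot UInt8
)
ENGINE = MergeTree
ORDER BY (URL, UserID, IsRobot);   -- compound primary key (URL, UserID, IsRobot)

CREATE TABLE hits_IsRobot_UserID_URL
(
    UserID  UInt32,
    URL     String,
    IsRobot UInt8
)
ENGINE = MergeTree
ORDER BY (IsRobot, UserID, URL);   -- compound primary key (IsRobot, UserID, URL)

-- Populate the second table with the same 8.87 million rows as the first:
INSERT INTO hits_IsRobot_UserID_URL
SELECT UserID, URL, IsRobot FROM hits_URL_UserID_IsRobot;
```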
The result shows the compression of the UserID column in both table versions:

Table: hits_URL_UserID_IsRobot - Column: UserID - Uncompressed: 33.83 MiB - Compressed: 11.24 MiB - Ratio: 3
Table: hits_IsRobot_UserID_URL - Column: UserID - Uncompressed: 33.83 MiB - Compressed: 877.47 KiB - Ratio: 39

This guide has shown how indexing in ClickHouse is different from traditional relational database management systems, how ClickHouse is building and using a table's sparse primary index, and what some of the best practices are for indexing in ClickHouse. The main points:

- Data is stored on disk ordered by primary key column(s).
- Data is organized into granules for parallel data processing.
- The primary index has one entry per granule.
- The primary index is used for selecting granules; mark files are used for locating granules.
- When a query filters on the first key column, ClickHouse runs the binary search algorithm over that key column's index marks; when it filters on a later key column, ClickHouse uses the generic exclusion search algorithm, which is not very effective for similarly high cardinality columns (see the URL column being part of the compound primary key (UserID, URL)). In other words, secondary key columns can (not) be inefficient: a table with compound primary key (URL, UserID) doesn't benefit much from the second key column being in the index.
- For efficient filtering on secondary key columns there are several options for creating additional primary indexes: a secondary table that we created explicitly, a materialized view, or a projection (see also https://github.com/ClickHouse/ClickHouse/issues/47333).
- Having a high-cardinality column early in the key is detrimental for the compression ratio of other table columns.

In order to be memory efficient, we explicitly specified a primary key that only contains columns that our queries are filtering on. ClickHouse sorts data by primary key, so the more consistent adjacent row values are, the better the compression.
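Numbers like the ones above can be reproduced with a query against system.columns; a sketch:

```sql
-- Compare the on-disk compression of the UserID column in both table versions.
SELECT
    table,
    name                                        AS column,
    formatReadableSize(data_uncompressed_bytes) AS uncompressed,
    formatReadableSize(data_compressed_bytes)   AS compressed,
    round(data_uncompressed_bytes / data_compressed_bytes) AS ratio
FROM system.columns
WHERE table IN ('hits_URL_UserID_IsRobot', 'hits_IsRobot_UserID_URL')
  AND name = 'UserID';
```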
And that is very good for the compression ratio of the content column, as a compression algorithm in general benefits from data locality (the more similar the data is, the better the compression ratio is). Without an index optimized for it, the sample query filtering on URL has to scan essentially the whole table: Processed 8.87 million rows, 18.40 GB (59.38 thousand rows/s., 123.16 MB/s.). A few rules apply when declaring keys: the primary key needs to be a prefix of the sorting key if both are specified, and this only works for tables in the MergeTree family (including replicated tables). In some cases it makes sense to specify a sorting key that is different from the primary key: the short primary key prefix keeps the in-memory index small, while the longer sorting key can improve compression. The fact that inserts do not have to maintain a heavyweight per-row index is one of the key reasons behind ClickHouse's astonishingly high insert performance on large batches. When a query is filtering (only) on a column that is part of a compound key, but is not the first key column, then ClickHouse is using the generic exclusion search algorithm over the key column's index marks. We discussed earlier that ClickHouse selected primary index mark 176, and therefore granule 176, as possibly containing matching rows; in parallel with reading the UserID data, ClickHouse is doing the same for granule 176 of the URL.bin data file.
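A sketch of a table whose sorting key is longer than its primary key (table and column names are assumptions):

```sql
-- The primary key must be a prefix of the sorting key.
-- Only (UserID, URL) is loaded into memory as the primary index,
-- while rows on disk are additionally sorted by EventTime.
CREATE TABLE hits_sorted
(
    UserID    UInt32,
    URL       String,
    EventTime DateTime
)
ENGINE = MergeTree
PRIMARY KEY (UserID, URL)
ORDER BY (UserID, URL, EventTime);
```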
Our sample query aggregates and counts the URL values per group for all rows where the UserID is 749.927.693, before finally outputting the 10 largest URL groups in descending count order. We will use a subset of 8.87 million rows (events) from the sample data set. Because UserID is the first key column, ClickHouse can run a binary search over the index marks, which the trace log confirms: Executor): Key condition: (column 0 in [749927693, 749927693]), Executor): Running binary search on index range for part all_1_9_2 (1083 marks), Executor): Found (LEFT) boundary mark: 176, Executor): Found (RIGHT) boundary mark: 177, Executor): Found continuous range in 19 steps. We discussed that because a ClickHouse table's row data is stored on disk ordered by primary key column(s), having a very high cardinality column (like a UUID column) in a primary key, or in a compound primary key before columns with lower cardinality, is detrimental for the compression ratio of other table columns. One way to identify and retrieve (a specific version of) pasted content is to use a hash of the content as the UUID for the table row that contains the content. For additional primary indexes there are three options - a secondary table, a materialized view, and a projection - and the three options differ in how transparent that additional table is to the user with respect to the routing of queries and of insert statements. For creating replicated tables with ReplicatedMergeTree, see https://clickhouse.tech/docs/en/engines/table_engines/mergetree_family/replication/#creating-replicated-tables.
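The aggregation described above corresponds to a query of the following shape (the table name is an assumption):

```sql
-- Top 10 URLs for one user; UserID is the first primary key column,
-- so ClickHouse locates the matching granules via binary search.
SELECT URL, count(URL) AS Count
FROM hits_UserID_URL
WHERE UserID = 749927693
GROUP BY URL
ORDER BY Count DESC
LIMIT 10;
```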
For tables with wide format and with adaptive index granularity, ClickHouse uses .mrk2 mark files, which contain similar entries to .mrk mark files but with an additional third value per entry: the number of rows of the granule that the current entry is associated with. (For tables with compact format, ClickHouse uses .mrk3 mark files.) The ClickHouse docs have a very detailed explanation of the reasoning behind this design: https://clickhouse.com. The following illustrates how the ClickHouse generic exclusion search algorithm works when granules are selected via a secondary column whose predecessor key column has a low(er) or high(er) cardinality. To keep the property that data part rows are ordered by the sorting key expression, you cannot add expressions containing existing columns to the sorting key (only columns added by the ADD COLUMN command in the same ALTER query, without a default column value). In order to significantly improve the compression ratio for the content column while still achieving fast retrieval of specific rows, pastila.nl is using two hashes (and a compound primary key) for identifying a specific row: a fingerprint that stays stable across small edits, and a hash of the content that is distinct for distinct data. The rows on disk are first ordered by fingerprint, and for rows with the same fingerprint value, their hash value determines the final order.
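A sketch of such a table (the column names fingerprint and hash follow the description above; the types and table name are assumptions):

```sql
-- Rows are ordered by (fingerprint, hash): similar documents cluster together,
-- which compresses content well, while the full hash still pinpoints one row.
CREATE TABLE pastila_content
(
    fingerprint UInt64,   -- stays stable across small edits (data locality)
    hash        UInt64,   -- distinct for distinct data (exact retrieval)
    content     String
)
ENGINE = MergeTree
ORDER BY (fingerprint, hash);
```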
With these three columns we can already formulate some typical web analytics queries. All runtime numbers given in this document are based on running ClickHouse 22.2.1 locally on a MacBook Pro with the Apple M1 Pro chip and 16GB of RAM. In this guide we are doing a deep dive into ClickHouse indexing. The quite similar cardinality of the primary key columns UserID and URL matters: the URL column is not the first key column, therefore ClickHouse is using a generic exclusion search algorithm (instead of binary search) over the URL column's index marks, and the effectiveness of that algorithm is dependent on the cardinality difference between the URL column and its predecessor key column UserID. ClickHouse is storing the column data files (.bin), the mark files (.mrk2) and the primary index (primary.idx) of the implicitly created table in a special folder within the ClickHouse server's data directory. The implicitly created table (and its primary index) backing the materialized view can now be used to significantly speed up the execution of our example query filtering on the URL column: because effectively the implicitly created table backing the materialized view is identical to the secondary table that we created explicitly, the query is executed in the same effective way as with the explicitly created table.
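The cardinalities in question can be computed directly; a sketch (the guide reports 2.39 million distinct URLs and 119.08 thousand distinct UserIDs for this data set):

```sql
-- uniqExact computes exact distinct counts per column.
SELECT
    uniqExact(URL)    AS cardinality_URL,
    uniqExact(UserID) AS cardinality_UserID
FROM hits_UserID_URL;
```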
Because of the similarly high cardinality of the primary key columns UserID and URL, a query that filters on the second key column doesn't benefit much from the second key column being in the index. The mark file entries are physical locations of granules, which all have the same size. Searching an entry in a B(+)-Tree data structure has an average time complexity of O(log2 n). Mark 1 in the diagram above thus indicates that the UserID values of all table rows in granule 1, and in all following granules, are guaranteed to be greater than or equal to 4.073.710. Once ClickHouse has identified and selected the index mark for a granule that can possibly contain matching rows for a query, a positional array lookup can be performed in the mark files in order to obtain the physical locations of the granule. We create a table that has a compound primary key with key columns UserID and URL; in order to simplify the discussions later on in this guide, as well as to make the diagrams and results reproducible, the DDL statement pins down the relevant index settings explicitly.
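A sketch of that DDL (the EventTime column and the exact settings are assumptions; the settings shown match the non-adaptive granularity described earlier):

```sql
CREATE TABLE hits_UserID_URL
(
    UserID    UInt32,
    URL       String,
    EventTime DateTime
)
ENGINE = MergeTree
PRIMARY KEY (UserID, URL)            -- compound primary key
ORDER BY (UserID, URL, EventTime)    -- primary key is a prefix of the sorting key
SETTINGS index_granularity = 8192,       -- 8192 rows per granule (the default)
         index_granularity_bytes = 0;    -- disable adaptive granularity for reproducibility
```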
In a compound primary key the order of the key columns can significantly influence both the efficiency of the filtering on secondary key columns in queries, and the compression ratio of the table's data files. In order to demonstrate that, we will use a version of our web traffic sample data set (a full scan of it processes 8.87 million rows, 15.88 GB, at 92.48 thousand rows/s., 165.50 MB/s., elapsed: 2.898 sec.). As a positive example of the generic exclusion search, consider index mark 0, for which the URL value is smaller than W3 and for which the URL value of the directly succeeding index mark is also smaller than W3: when the first key column value is the same for both marks, granule 0 cannot contain the URL value W3 and can therefore be excluded.
The corresponding trace log in the ClickHouse server log file confirms that ClickHouse is running a binary search over the index marks of the projection's hidden table: Executor): Running binary search on index range for part prj_url_userid (1083 marks), Executor): Choose complete Normal projection prj_url_userid, Executor): projection required columns: URL, UserID. To get this behaviour, we create a projection on our existing table. ClickHouse is storing the column data files (.bin), the mark files (.mrk2) and the primary index (primary.idx) of the hidden table in a special folder (marked in orange in the screenshot below) next to the source table's data files, mark files, and primary index files. The hidden table (and its primary index) created by the projection can now be (implicitly) used to significantly speed up the execution of our example query filtering on the URL column. Because effectively the hidden table created by the projection is identical to the secondary table that we created explicitly, the query is executed in the same effective way as with the explicitly created table.
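A sketch of such a projection, matching the part name prj_url_userid seen in the trace log above:

```sql
-- The hidden table behind the projection stores the same rows
-- ordered by (URL, UserID), so filters on URL can use binary search.
ALTER TABLE hits_UserID_URL
    ADD PROJECTION prj_url_userid
    (
        SELECT *
        ORDER BY (URL, UserID)
    );

-- Build the projection for parts that existed before it was added:
ALTER TABLE hits_UserID_URL
    MATERIALIZE PROJECTION prj_url_userid;
```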