Discussion:
Best compaction strategy
raman gugnani
2018-10-25 12:59:12 UTC
Permalink
Hi All,

I have one table in which some data has a TTL of 2 days and some data has
a TTL of 60 days. Which compaction strategy suits this best?

1. LeveledCompactionStrategy (LCS)
2. SizeTieredCompactionStrategy (STCS)
3. TimeWindowCompactionStrategy (TWCS)
--
Raman Gugnani

8588892293
Alexander Dejanovski
2018-10-25 13:28:19 UTC
Permalink
Hi Raman,

TWCS is the best compaction strategy for data written with a TTL, even if
you have different TTLs (set the time window based on your largest TTL, so
it would be 1 day in your case).
Enable unchecked tombstone compaction to clear the data with the 2-day TTL
along the way. This is done by setting:

ALTER TABLE my_table WITH compaction =
{'class':'TimeWindowCompactionStrategy',
'unchecked_tombstone_compaction':'true', ...}

If you're running 3.11.1 at least, you can turn on the
unsafe_aggressive_sstable_expiration introduced by CASSANDRA-13418
<https://issues.apache.org/jira/browse/CASSANDRA-13418>.
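
As a fuller sketch of the statement above, assuming a 1-day window and the
my_table name from the example (the window values are illustrative, not
something tested against your cluster):

```cql
-- Assumed values: 1-day windows, i.e. roughly one SSTable bucket per day.
-- unchecked_tombstone_compaction lets single-SSTable tombstone compactions
-- run without the usual overlap pre-check.
ALTER TABLE my_table WITH compaction = {
    'class': 'TimeWindowCompactionStrategy',
    'compaction_window_unit': 'DAYS',
    'compaction_window_size': '1',
    'unchecked_tombstone_compaction': 'true'
};
```

To also opt in to the aggressive expiration, the option
'unsafe_aggressive_sstable_expiration':'true' would go in the same map. As
far as I know it is only honoured when the node is also started with
-Dcassandra.allow_unsafe_aggressive_sstable_expiration=true, so check the
ticket before relying on it.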

Cheers,
--
-----------------
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com
Jonathan Haddad
2018-10-25 19:43:36 UTC
Permalink
To add to what Alex suggested, if you know which keys use which TTL, you
could store them in different tables, with different window settings.
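
A rough sketch of that split, assuming the two TTLs from the original
question. The table names, columns and window sizes below are hypothetical
and only meant to show the shape of it:

```cql
-- Hypothetical table for short-lived data: 2-day TTL (172800 s), 2-hour windows.
CREATE TABLE readings_2d (
    sensor_id text,
    ts timestamp,
    payload text,
    PRIMARY KEY ((sensor_id), ts)
) WITH default_time_to_live = 172800
  AND compaction = {
    'class': 'TimeWindowCompactionStrategy',
    'compaction_window_unit': 'HOURS',
    'compaction_window_size': '2'
  };

-- Hypothetical table for long-lived data: 60-day TTL (5184000 s), 1-day windows.
CREATE TABLE readings_60d (
    sensor_id text,
    ts timestamp,
    payload text,
    PRIMARY KEY ((sensor_id), ts)
) WITH default_time_to_live = 5184000
  AND compaction = {
    'class': 'TimeWindowCompactionStrategy',
    'compaction_window_unit': 'DAYS',
    'compaction_window_size': '1'
  };
```

With each table holding a single TTL, every SSTable in a window expires at
about the same time, so whole files can be dropped instead of compacted.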

Jon
--
Jon Haddad
http://www.rustyrazorblade.com
twitter: rustyrazorblade
Dor Laor
2018-10-25 19:26:34 UTC
Permalink
TWCS is good for time series, but if your workload updates the same keys
within the TTL, it's the wrong strategy.
This diagram is a good rule of thumb:

[image: image.png]
Jeff Jirsa
2018-10-27 01:42:16 UTC
Permalink
This isn’t true if your clustering is time-based, because the read path can selectively include or exclude SSTables based on the clustering keys.
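
A minimal sketch of what a time-based clustering layout could look like (the table and column names are hypothetical, chosen only for illustration):

```cql
-- Hypothetical time-series table: rows in a partition are ordered by ts.
CREATE TABLE sensor_readings (
    sensor_id text,
    ts timestamp,
    reading double,
    PRIMARY KEY ((sensor_id), ts)
) WITH CLUSTERING ORDER BY (ts DESC)
  AND compaction = {'class': 'TimeWindowCompactionStrategy'};

-- A read bounded on the clustering column: SSTables whose clustering
-- range falls entirely outside this slice do not need to be consulted.
SELECT ts, reading FROM sensor_readings
 WHERE sensor_id = 'sensor-1'
   AND ts >= '2018-10-24 00:00:00+0000'
   AND ts <  '2018-10-25 00:00:00+0000';
```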
--
Jeff Jirsa