Cassandra data loss
My Cassandra database has just lost most of its data. It was only test data, but I still need to understand what is going on and make sure it does not happen with real data.
I'm running Cassandra 1.1 as a service on a Windows Server. The db is fed with data from a C# application. A script stopped and restarted the Cassandra service. After that, all data from the last 20 hours or so was gone. Older data was still there.
It is possible that the data in question was never written to disk at all. However, the db answered queries correctly during the 20h in question, so the data must have been in memory at least.
The config is identical to the default config except for storage location etc. The flushing strategy is:
    commitlog_sync: periodic
    commitlog_sync_period_in_ms: 10000
Any hint is appreciated, including what to try and what to look for in the log files or in the config.
Edit: After experimenting a little more I can now reproduce the following:
- insert new data - ok
- query the new data - ok
- stop and restart db - all new data is now gone :( (old data is still there)
- nothing in the log file, just "Log replay complete, 0 replayed mutations"
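To check this symptom across restarts, the replay count can be pulled out of the log with a small script. This is only a minimal sketch: it assumes the log line uses exactly the wording quoted above, and the function name `replay_counts` is made up for illustration.

```python
import re

# Matches the commit log replay summary line quoted above.
# Assumption: the wording is "Log replay complete, N replayed mutations".
REPLAY_RE = re.compile(r"Log replay complete, (\d+) replayed mutations")


def replay_counts(lines):
    """Return the replayed-mutation count of every matching log line, in order."""
    counts = []
    for line in lines:
        match = REPLAY_RE.search(line)
        if match:
            counts.append(int(match.group(1)))
    return counts


# Example with a hand-written sample line (not real log output).
sample = ["INFO Log replay complete, 0 replayed mutations"]
print(replay_counts(sample))  # prints [0]
```

A count of 0 right after a restart that followed fresh inserts is the suspicious case described here; passing the lines of the real log file (for example an open file object) instead of `sample` should work the same way.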
Edit2: Starting with a fresh, empty db, everything works fine now (same config, of course). When I use the backup of my broken db, I can reproduce the problem above again. Have I discovered a bug in Cassandra? Apparently my db is in a state where the commit logs are either not written or not replayed correctly.
"New mutations aren't replayed, but old ones are still there" sounds like https://issues.apache.org/jira/browse/CASSANDRA-4782, which was fixed in 1.1.6. The most recent 1.1 release is 1.1.8; you should upgrade to that.