What is the best way to sync 2 sqlite tables over http and json?
Should I consider the use of SHA1 hash or a UUID of each entry or is a last synced timestamp sufficient? How do I make sure there are no duplicate entries? Is there a simpler algorithm I could follow?
I am assuming changes are likely to be at the end. I don't know the character of insert and updates but here is my idea;
- I would SHA1 (or MD5, it doesn't matter in this case) days of the current month and months before. Comparing against these fingerprints is a fast way to see were the differences are. (I am leaving today unhashed)
- If previous months have differences;
- If volume for a month is too big we can then split the month and simply generate daily fingerprint on the fly instead of comparing the whole month.
- Otherwise we can treat a monthly change the same way we treat a daily change.
- After finding out where the changes occur, master copy would send a list of all unique id's for that period. (Always sending today's info)
- The slave then deletes what has to be deleted and compiles a list of id's to be inserted.
- The master sends only those records (in full).
The time categories (day, month) can be adjusted according to the data volume.
Of course this is a naive and simple algorithm. If I was processing sensitive/critical data I would look for a transactional algorithm.