SQLite - Get a specific row index for a Sorted/Filtered Query

I'm creating a caching system to take data from an SQLite database table using a sorted/filtered query and display it. The tables I'm pulling from can be potentially very large and, of course, I need to minimize impact on memory by only retaining a maximum number of rows in memory at any given time. This is easily done by using LIMIT and OFFSET to load only the records I need and update the cache as needed. Implementing this is trivial. The problem I'm having is determining where the insertion index is for a new record inserted into a particular query so I can update my UI appropriately. Is there an easy way to do this? So far the ideas I've had are:

  1. Dump the entire cache, re-count the Query results (there's no guarantee the new row will be included), refresh the cache and refresh the entire UI. I hope it's obvious why that's not really desirable.
  2. Use my own algorithm to determine whether the new row is included in the current query, whether it falls within the currently cached results, and at what index it should be inserted if it does. The biggest drawback of this approach is its complexity and the risk that my own sorting/filtering algorithm won't match SQLite's.

Of course, what I want is to be able to ask SQLite: Given 'Query A' what is the index of 'Row B', without loading the entire query results. However, so far I haven't been able to find a way to do this.

I don't think it matters, but this is all occurring on an iOS device, using the Objective-C programming language.

More Info

The query and the resulting cache are based on user input. Essentially the user can re-sort and filter (or search) to alter the results they're seeing. I'm reluctant to simply recreate the cache on insertions (and edits, actually) because I want to provide a 'smoother' UI experience.

I should point out that I'm leaning toward option "2" at the moment. I played around with creating my own caching/indexing system by loading all the records in a table and performing the sort/filter in memory using my own algorithms. So much of the code needed to determine whether and/or where a particular record is in the cache is already there, so I'm slightly predisposed to use it. The danger lies in having a cache that doesn't match the underlying query. If I include a record in the cache that the query wouldn't return, I'll be in trouble and probably crash.

Answers


You don't need record numbers.

Save the values of the ordered field in the first and last records of the LIMITed query result. Then you can use these to check whether the new record falls into this range.

In other words, assuming that you order by the Name field, and that the original query was this:

SELECT Name, ...
  FROM mytab
  WHERE some_conditions
  ORDER BY Name
  LIMIT x OFFSET y

then try to get at the new record with a similar query:

SELECT 1
  FROM mytab
  WHERE some_conditions
    AND PrimaryKey = LastInsertedValue
    AND Name BETWEEN CachedMin AND CachedMax

Similarly, to find out before (or after) which record the new record was inserted, start directly after the inserted record and use a limit of one, like this:

SELECT Name
  FROM mytab
  WHERE some_conditions
    AND Name > MyInsertedName
    AND Name BETWEEN CachedMin AND CachedMax
  ORDER BY Name
  LIMIT 1

This doesn't give you a number; you still have to check where the returned Name is in your cache.
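
The two queries above can be tried end to end with a small sketch. It uses Python's built-in sqlite3 module rather than Objective-C purely so it runs anywhere; the SQL is the part that carries over. The `id` and `Active` columns, the sample names, and the helper functions `load_page`, `insert_row`, and `cache_index_for` are all assumptions made for illustration, with `Active = 1` standing in for some_conditions. It also assumes the cache still matches the database apart from the one new row.

```python
import sqlite3

# Hypothetical schema: "Active = 1" stands in for some_conditions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytab (id INTEGER PRIMARY KEY, Name TEXT, Active INTEGER)")
names = ["Alice", "Bob", "Carol", "Dave", "Eve", "Frank", "Grace", "Heidi"]
conn.executemany("INSERT INTO mytab (Name, Active) VALUES (?, 1)", [(n,) for n in names])

def load_page(limit, offset):
    """Load one window of the sorted, filtered query into a list (the cache)."""
    rows = conn.execute(
        "SELECT Name FROM mytab WHERE Active = 1 ORDER BY Name LIMIT ? OFFSET ?",
        (limit, offset),
    ).fetchall()
    return [r[0] for r in rows]

def insert_row(name, active):
    """Insert a row and return its primary key (LastInsertedValue)."""
    cur = conn.execute("INSERT INTO mytab (Name, Active) VALUES (?, ?)", (name, active))
    return cur.lastrowid

def cache_index_for(new_id, cache):
    """Return the cache index where the new row belongs, or None if it is
    filtered out or sorts outside the cached range."""
    cached_min, cached_max = cache[0], cache[-1]
    # First query: does the new row match the filter and fall inside the range?
    row = conn.execute(
        "SELECT Name FROM mytab WHERE Active = 1 AND id = ? AND Name BETWEEN ? AND ?",
        (new_id, cached_min, cached_max),
    ).fetchone()
    if row is None:
        return None
    # Second query: the first cached-range row that sorts after the new one.
    successor = conn.execute(
        "SELECT Name FROM mytab WHERE Active = 1 AND Name > ? "
        "AND Name BETWEEN ? AND ? ORDER BY Name LIMIT 1",
        (row[0], cached_min, cached_max),
    ).fetchone()
    if successor is None:
        return len(cache)  # nothing in range sorts after it: append at the end
    # Look the successor up in the in-memory cache to get a UI index.
    return cache.index(successor[0])

cache = load_page(4, 2)          # Carol, Dave, Eve, Frank
new_id = insert_row("Dina", 1)
index = cache_index_for(new_id, cache)
```

Because both the range check and the successor lookup are evaluated by SQLite, the filter and the sort order cannot drift from the ones used by the original query. Only the final lookup happens in application code.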


Typically you'd expect a cache to be invalidated whenever the underlying data changes. I think dropping it and starting over will be your simplest and most maintainable solution. I would recommend it unless you have a very good reason not to.

You could write another query that just returned the row count (example below) to see if your cache should be invalidated. That would save recreating the cache when it did not change.

SELECT name,address FROM people WHERE area_code=970;
SELECT COUNT(rowid) FROM people WHERE area_code=970;
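
As a rough sketch of that count check, again using Python's built-in sqlite3 module only for convenience: the table contents and the `matching_count` helper are made up for illustration. Note that a count alone cannot detect an edit to an existing row, or an insert paired with a delete, so treat this as a cheap first test rather than a complete one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, address TEXT, area_code INTEGER)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [("Ann", "1 Elm St", 970), ("Ben", "2 Oak St", 970), ("Cat", "3 Fir St", 303)],
)

def matching_count(area_code):
    """Count the rows the cached query would return (hypothetical helper)."""
    return conn.execute(
        "SELECT COUNT(*) FROM people WHERE area_code = ?", (area_code,)
    ).fetchone()[0]

# Remember the count at the time the cache was filled.
cached_count = matching_count(970)

# An insert that the filter excludes leaves the count, and the cache, alone.
conn.execute("INSERT INTO people VALUES ('Dan', '4 Ash St', 303)")
still_valid = matching_count(970) == cached_count

# An insert that the filter includes changes the count: time to rebuild.
conn.execute("INSERT INTO people VALUES ('Eve', '5 Yew St', 970)")
needs_rebuild = matching_count(970) != cached_count
```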

The information you'd need from sqlite to know when your cache was invalidated would require some rather intimate knowledge of how the query and/or index was working. I would say that is fairly high coupling.

Otherwise, you'd want to know where the row was inserted with regard to the sorting. You would probably key each cached page on the sorted field, then drop every page from the inserted or deleted row's position onward. Any time you change the sorting you'd drop everything.

Something like the below would be a start if you were using C++. I realize you aren't doing C++, but hopefully it is evident as to what I'm trying to do.

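#include <set>
#include <string>
#include <vector>

struct Person {
  std::string name;
  std::string addr;
};

// One cached page of results, keyed on a value of the sorted field.
struct Page {
  std::string key;
  std::vector<Person> persons;
  struct Less {
    bool operator()(const Page &lhs, const Page &rhs) const {
      return lhs.key.compare(rhs.key) < 0;
    }
  };
};

typedef std::set<Page, Page::Less> pages_t;
pages_t pages;

// Placeholder: performs the actual INSERT and returns true on success.
bool sql_insert(const Person &person);

void insert(const Person &person) {
  if (sql_insert(person)) {
    // lower_bound() takes a Page, so probe with one that carries only the key.
    Page probe;
    probe.key = person.name;
    pages_t::iterator drop_cache_start = pages.lower_bound(probe);
    //... drop this page and everything after it
  }
}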

You'd have to do some wrangling to get keys of different data types to work nicely, but it's possible.

Theoretically you could just leave the pages out of it and only use the objects themselves. The database would no longer "own" the data though. If you only fill pages from the database, then you'll have less data consistency worries.

This may be a bit off topic, but you aren't re-implementing views, are you? A view doesn't cache per se, but it isn't clear whether caching is a requirement of your project.

