Heroku database performance experience needed?
We are experiencing some serious scaling challenges for our intelligent search engine/aggregator. Our database holds around 200k objects. From profiling and newrelic it seems most of our troubles may come from the database. We are using the smallest dedicated database Heroku provide (Ronin).
We have been looking into indexing and caching. So far we managed to solve our problems by reducing database calls and caching content intelligently, but now even this seems to reach an end. We are constantly asking ourselves if our code/configuration is good enough or if we are simply not using enough "hardware".
We suspect that the database solution we buy from Heroku may be performing insufficiently. For example, just doing a simple count (no joins, no nothing) on the 200k items takes around 250ms. This seems like a long time, even though postgres is known for its bad performance on counts?
We have also started to use geolocation lookups based on latitude/longitude. Both columns are indexed floats. Doing a distance calculation involves pretty complicated math, but we are using the very well recommended geocoder gem that is suspected to run very optimized queries. Even geocoder still takes 4-10 seconds to perform a lookup on, say, 40.000 objects, returning only a limit of the first nearest 10. This again sounds like a long time, and all the experienced people we consult says that it sound very odd, again hinting at the database performance.
So basically we wonder: What can we expect from the database? Might there be a problem? And what can we expect if we decide to upgrade?
An additional question I have is: I read here that we can improve performance by loading the entire database into memory. Are we supposed to configure this ourselves and if so how?
UPDATE ON THE LAST QUESTION: I got this from the helpful people at Heroku support:
"What this means is having enough memory (a large enough dedicated database) to store your hot data set in memory. This isn't something you have to do manually, Postgres is configured automatically use all available memory on our dedicated databases.
I took a look at your database and it looks like you're currently using about 1.25 GB of RAM, so you haven't maxed your memory usage yet."
UPDATE ON THE NUMBERS AND FIGURES
Okay so now I've had time to look into the numbers and figures, and I'll try to answer the questions below as follows:
- First of all, the db consists of around 29 tables with a lot of relations. But in reality most queries are done on a single table (some additional resources are joined in, to provide all needed information for the views).
- The table has 130 columns.
- Currently it holds around 200k records but only 70k are active - hence all indexes are made as partial-indexes on this "state".
- All columns we search are indexed correctly and none is of text-type, and many are just booleans.
Answers to questions:
- Hmm the baseline performance it's kind of hard to tell, we have sooo many different selects. The time it takes varies typically from 90ms to 250ms selecting a limit of 20 rows. We have a LOT of counts on the same table all varying from 250ms to 800ms.
- Hmm well, that's hard to say cause they wont give it a shot.
- We have around 8-10 users/clients running requests at the same time.
- Our query load: In new relic's database reports it says this about the last 24 hours: throughput: 9.0 cpm, total time: 0.234 s, avg time: 25.9 ms
- Yes we have examined the query plans of our long-running queries. The count queries are especially slow, often over 500ms for a pretty simple count on the 70k records done on indexed columns with a result around 300
I've tuned a few Rails apps hosted on Heroku, and also hosted on other platforms, and usually the problems fall into a few basic categories:
- Doing too much in ruby that could be done at the db level (sorting, filtering, join data, etc)
- Slow queries
- Inefficient use of indexes (not enough, or too many)
- Trying too hard to do it all in the db (this is not as common in rails, but does happen)
- Not optimizing cacheable data
- Not effectively using background processing
Right now its hard to help you because your question doesn't contain any specifics. I think you'll get a better response if you pinpoint the biggest issue you need help with and then ask.
Some info that will help us help you:
- What is the average response time of your actions? (from new relic, request-log-analyzer, logs)
- What is the slowest request that you want help with?
- What are the queries and code in that request?
- Is the site's performance different when you run it locally vs. heroku?
In the end I think you'll find that it is not an issue specific to Heroku, and if you had your app deployed on amazon, engineyard, etc you'd have the same performance. The good news is I think that your problems are common, and shouldn't be too hard to fix once you've done some benchmarking and profiling.