Optimize feed fetching

Hello, I'm working on a site that has to fetch users' feeds. How can I best optimize the fetching if I have a database with, let's say, 300 feeds? I'm going to set up a cron job that fetches the feeds, but should I fetch something like 5 feeds every other minute?

Any ideas on how to do this the best way in PHP?

Answers


Based on the new information I think I would do something like this:

Let the "first" client initiate the update work and store a timestamp with it. Every other client that asks for the information gets the cached version until that version is too old. The next hit from a client then refreshes the cache, which is used by all clients until it is too old again.

The client that actually initiates the update work should not have to wait for it to finish; just serve the old cached version, and keep serving it until the work is done.

That way you don't have to update anything if no clients are requesting it.
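
The idea above can be sketched roughly as follows. This is only an illustration, not a finished implementation: the cache entry layout (`fetched_at`, `refreshing_since`, `body`), the two constants, and the function names are all assumptions made for the example. The first client to hit a stale entry claims the refresh, and everyone (including that client) gets the old cached body.

```php
<?php
// Assumed values, pick whatever suits your feeds.
const CACHE_MAX_AGE   = 900; // cached feed counts as "too old" after 15 minutes
const REFRESH_TIMEOUT = 120; // a refresh claim expires after 2 minutes

/**
 * Serve the cached body and report whether this request should start
 * a background refresh. Hypothetical entry shape:
 * ['fetched_at' => int, 'refreshing_since' => ?int, 'body' => string]
 */
function serveFeed(array &$entry, int $now): array
{
    $isStale = ($now - $entry['fetched_at']) > CACHE_MAX_AGE;

    // Another client already claimed the refresh and the claim has not expired.
    $claimActive = $entry['refreshing_since'] !== null
        && ($now - $entry['refreshing_since']) < REFRESH_TIMEOUT;

    $startRefresh = $isStale && !$claimActive;
    if ($startRefresh) {
        $entry['refreshing_since'] = $now; // claim the refresh for this client
    }

    // The old body is returned in every case, so nobody waits for the fetch.
    return ['body' => $entry['body'], 'start_refresh' => $startRefresh];
}

/** Store the freshly fetched body and release the refresh claim. */
function finishRefresh(array &$entry, string $newBody, int $now): void
{
    $entry['body'] = $newBody;
    $entry['fetched_at'] = $now;
    $entry['refreshing_since'] = null;
}
```

When `start_refresh` is true, the request would kick off the fetch in the background and call `finishRefresh()` when it completes. With several web processes, the claim would need to be made atomically (for example with a conditional update in the database); the in-memory array here is only for illustration.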


If I understand your question, you are basically working on a feed aggregator site?

You can do the following: start by refreshing every hour (for example). Once you have enough entries from a feed, calculate the average interval between entries, then use that as the interval for fetching that feed.

For example, if the site published 7 articles in the last 7 days, you can fetch its feed every 24 hours (1 day).

I use this algorithm with a few changes: when I calculate the average interval I divide it by 2 (to be sure not to fetch too rarely). If the result is less than 60 minutes, I set the interval to 1 hour; if it is more than 24 hours, I set it to 24 hours.

For example, something like this:

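    public function updateRefreshInterval() {
        // Count the articles this feed published during the last 7 days.
        $sql = 'select count(*) _count ' .
               'from article ' .
               'where created > adddate(now(), interval -7 day) and feed_id = ' . (int) $this->getId();
        $array = Db::loadArray( $sql );

        $count = (int) $array[ '_count' ];

        // Average number of seconds between articles; the +1 avoids division by zero.
        $interval = 7 * 24 * 60 * 60 / ( $count + 1 );
        // Halve it so the feed is not fetched too rarely, and store whole seconds.
        $interval = (int) ( $interval / 2 );
        // Clamp to the allowed range (presumably 1 hour and 24 hours, in seconds).
        if( $interval < self::MIN_REFRESH_INTERVAL ) {
            $interval = self::MIN_REFRESH_INTERVAL;
        }
        if( $interval > self::MAX_REFRESH_INTERVAL ) {
            $interval = self::MAX_REFRESH_INTERVAL;
        }

        Db::execute( 'update feed set refresh_interval = ' . $interval . ' where id = ' . (int) $this->getId() );
    }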

The table is 'feed'; 'refreshed' is the timestamp of the last time the feed was refreshed, and 'refresh_interval' is the desired time interval between two fetches of the same feed.
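
To tie this back to the cron job in the question, here is a minimal sketch of how each run might pick the feeds that are due. It assumes 'refreshed' is stored as a Unix timestamp and 'refresh_interval' in seconds, and `selectDueFeeds()` and the array layout are hypothetical names used only for this example.

```php
<?php
/**
 * Return the feeds whose refresh interval has elapsed, most overdue first,
 * limited to $limit feeds per cron run.
 * Assumes each feed is ['id' => int, 'refreshed' => int, 'refresh_interval' => int].
 */
function selectDueFeeds(array $feeds, int $now, int $limit): array
{
    // Keep only the feeds whose next fetch time has already passed.
    $due = array_filter($feeds, function (array $feed) use ($now): bool {
        return $feed['refreshed'] + $feed['refresh_interval'] <= $now;
    });

    // Sort by how long each feed has been overdue, longest first.
    usort($due, function (array $a, array $b) use ($now): int {
        $overdueA = $now - ($a['refreshed'] + $a['refresh_interval']);
        $overdueB = $now - ($b['refreshed'] + $b['refresh_interval']);
        return $overdueB <=> $overdueA;
    });

    return array_slice($due, 0, $limit);
}
```

In practice the same filter would probably be pushed into the `where` clause of the query on the 'feed' table, so that each cron run only loads the handful of feeds it is going to fetch.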

