The greatest challenge of BOINCstats is keeping it going. Over time, the number of BOINC users and the features on this site have greatly increased. This alone puts enormous stress on the server. At the beginning of this year, the problems had become so bad that the site went down whenever the stats were updated, simply because the combined load of running the stats and serving the web pages was too high.
Thanks to some generous donations and a collaboration with PrimeGrid, I was able to add a dedicated server just for updating the stats. The ‘old’ server had already been upgraded once a few months earlier, and now serves as the webserver. It keeps a copy of the stats database, so essentially there are two copies of the stats on two servers, which keeps the load under control.
Since then, more users have come to BOINC and BOINCstats, and BAM! was launched. BAM! runs for the most part on the database server; only the web interface runs on the webserver.
The load kept increasing, so I made drastic changes to the code of the website to lower the load on the databases. These measures include caching of webpages and images, so that the same data is not requested from the database over and over again.
The highest load is seen just after the stats are updated (either an incremental or a daily update): the caches are empty and need to be filled again. And by now, almost everybody knows when BOINCstats has the new daily stats online, and this is reflected in the number of visitors at that time.
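To illustrate the caching idea, here is a minimal sketch in Python. It is not the actual BOINCstats code: the `PageCache` class, its `render` callback (standing in for whatever builds a page from the database) and the `db_hits` counter are all hypothetical names made up for this example. It only shows why repeated views are cheap and why the load peaks right after an update empties the cache.

```python
from typing import Callable, Dict


class PageCache:
    """Keep rendered pages in memory until the next stats update."""

    def __init__(self, render: Callable[[str], str]) -> None:
        # 'render' is a hypothetical function that builds a page from the database.
        self._render = render
        self._pages: Dict[str, str] = {}
        self.db_hits = 0  # counts how often the database actually had to be used

    def get(self, url: str) -> str:
        """Return the cached page, rendering it only on a cache miss."""
        if url not in self._pages:
            self.db_hits += 1
            self._pages[url] = self._render(url)
        return self._pages[url]

    def invalidate(self) -> None:
        """Empty the cache, e.g. after an incremental or daily stats update."""
        self._pages.clear()


if __name__ == "__main__":
    cache = PageCache(lambda url: "page for " + url)
    cache.get("/user/1")
    cache.get("/user/1")   # served from the cache, no database work
    cache.invalidate()     # stats were updated
    cache.get("/user/1")   # has to be rendered again
    print(cache.db_hits)
```

Every page requested after `invalidate()` costs a database query again, which is what happens when many visitors arrive right after the daily update.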
I have no problem with any of this. BOINCstats is pretty popular, and is for me personally a huge success. And I want to make it even better and attract more visitors. High load is (I think) perfectly normal for any popular website. I simply have to find ways to keep it going, and I have already explained some of the measures I took. If everything continues to grow as it has in recent months, I anticipate needing a new (faster) server by the beginning of next year.
But the one thing I can’t get under control is scraping. As explained before, scraping is the automated downloading of BOINCstats webpages, to extract just a small part of the page for use on another site or for other stats.
‘Professional’ scrapers write a program that fetches hundreds to thousands of pages from BOINCstats in sequence, which cripples the database. This is comparable to a load of ten times the number of visitors BOINCstats now has.
Instead of writing their own stats engine, they simply take the numbers from BOINCstats in the most inefficient way, without asking permission.
To accommodate all scrapers, I would have to add at least one more web/database server to handle their requests, and that’s simply not an option.
If you watch your stats on BOINCstats and copy some of the numbers to an Excel sheet or something else, or if you show the BOINCstats signature or another BOINCstats image on your site, you are NOT considered to be a scraper.
Most scrapers are found either by checking how much bandwidth a single IP address uses, or when the site goes down due to a large number of requests from a single IP. By simply viewing your stats you can’t bring the site down or generate much traffic.
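As a rough illustration of the per-IP check, here is a small Python sketch. It is an assumption about how such a check could look, not the filter BOINCstats really uses: the function name `find_heavy_ips`, the `(ip, bytes_sent)` input format and both thresholds are invented for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def find_heavy_ips(
    requests: Iterable[Tuple[str, int]],
    max_requests: int = 1000,        # hypothetical limit on requests per IP
    max_bytes: int = 50_000_000,     # hypothetical limit on bytes served per IP
) -> List[str]:
    """Return the IPs that exceed either limit, sorted alphabetically.

    'requests' is a sequence of (ip, bytes_sent) pairs, for example
    parsed from one day of webserver access logs.
    """
    counts: Dict[str, int] = defaultdict(int)
    volume: Dict[str, int] = defaultdict(int)
    for ip, size in requests:
        counts[ip] += 1
        volume[ip] += size
    return sorted(
        ip for ip in counts
        if counts[ip] > max_requests or volume[ip] > max_bytes
    )


if __name__ == "__main__":
    # A normal visitor (5 page views) and a scraper (2000 page views).
    log = [("10.0.0.1", 20_000)] * 5 + [("10.0.0.2", 20_000)] * 2000
    print(find_heavy_ips(log))
```

With limits like these, somebody who just looks at their own stats stays far below both thresholds, while a program fetching thousands of pages in a row stands out immediately.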
99.99% of you will never be accused of being a scraper. I’m pretty good at filtering them out. The other 0.01% know perfectly well what they are doing (especially after this warning).
I hope this clarifies things, but most probably I leave you with more questions.
Please do not PM, IM or email me for support (they will go unread/ignored). Use the forum for support.