@skippybosco: Yes, I agree with djbaxter this IS a good idea. That's why I already had my installation configured that way.
I also increased my local cache retention time from 30 minutes to 480 minutes (8 hours), and I increased my local log retention time from 30 to 60 days. Increasing the cache time dramatically reduces the number of lookup requests I send to the central server and greatly reduces the likelihood that I'll find that server too busy to handle my requests. At the moment, my server processes and rejects about 120 bot registrations a day (that's 5 per hour). By increasing my cache time from 30 minutes to 8 hours, my recent-registration-attempts cache grows from about 3 records to about 40, but my load on the central server drops to roughly 1/13th of what it would have been otherwise.
In short, from what I can tell, by increasing my cache time to 16 times its original length I've reduced the load my site places on the central server by over 92%, and I've also improved the performance of SFS on my site because it's getting a local cache hit in many more cases rather than waiting for 3 database queries against an overloaded remote server. Frankly, I recommend those changes to EVERY site that's using SFS. I suspect we'd totally eliminate the central server performance issues if we all did this.
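Just to show where those numbers come from, here's the back-of-envelope arithmetic written out in Python. The cache-hit rates are my own guesses based on the repeat-offender patterns in my rejection log, not anything I've measured, and the 3-queries-per-attempt figure assumes the addon looks up IP, email and username separately:

```python
# Rough back-of-envelope for one site (mine). Everything here is an estimate:
# 120 attempts/day is observed, the hit rates are guesses, 3 queries/attempt assumed.
ATTEMPTS_PER_DAY = 120
QUERIES_PER_ATTEMPT = 3  # IP, email and username lookups

def cache_records(ttl_hours):
    """Roughly how many records sit in the local cache at any moment."""
    return ATTEMPTS_PER_DAY / 24 * ttl_hours

def central_queries_per_day(cache_hit_rate):
    """Lookups my site still sends to the central SFS server each day."""
    return ATTEMPTS_PER_DAY * (1 - cache_hit_rate) * QUERIES_PER_ATTEMPT

print(cache_records(0.5))   # ~2.5 records with a 30-minute cache
print(cache_records(8))     # ~40 records with an 8-hour cache -- still tiny

# If the 30-minute cache catches essentially nothing and the 8-hour cache
# catches ~92% of repeat attempts, the central server sees ~1/13th the load:
print(central_queries_per_day(0.00))   # ~360 queries/day
print(central_queries_per_day(0.92))   # ~29 queries/day
```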
@pedigree:
I'm confused about something here. It occurred to me this evening that I never actually saw anything that says this addon checks its own local database of rejected registrations (by the registering user's email address, IP address and username) BEFORE it goes off to check the central server. Yes, I realize it looks in the local cache covering the last xx (user-configurable) minutes of registration attempts before it queries the host, but the "standard" cache time is only 30 minutes, whereas the local rejected-registrations table covers weeks or months of history (I'm keeping 60 days). It seems to me the load on the central SFS server could be HUGELY reduced by increasing the cache time to 4 or 8 hours and then checking BOTH the cache from the last (user-configurable) minutes AND the local rejection log from the last (user-configurable) days, rather than going off to check the central database practically every time a bot tries to register on any site.
For example, my site receives about 120 bot registration attempts per day. That's 5 per hour, or about 2.5 attempts every 30 minutes. Compared to the 1,200 rejected registrations captured in our local SFS rejection log over the past 10 days, that suggests the 30-minute local cache is so small it's almost useless.
As I examine my own local SFS rejection log, which already contains 1,200 bounced registration attempts after just 10 days, I can see many of these bots come back time after time, every day, and try registering with the same IP address, username and email address over and over again. Furthermore, many of the bot registration attempts occur day after day, several times per day; then the bot goes away and comes back again after 24 hours or so. With those behavior patterns in mind, it looks to me as if the load on your central server could be cut WAY down, to maybe 5% or 10% of the current load, if the hack were just modified to first check the local cache, then query the local rejection log for the last 5 or 10 or 15 or 30 (user-configurable) days, BEFORE going off to ask the remote server whether this registrant has been reported as a spammer by any other site (roughly the lookup order sketched below). If you combined this mod with a significant increase in the cache time, it looks to me like you'd eliminate most of the query requests the central server now sees -- especially on sites that have been around a while.
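To make the proposed look-up order concrete, here's a minimal sketch. The names and data structures are made up for illustration; the real addon is a vBulletin (PHP) product with its own schema, so treat this as pseudocode rather than a drop-in patch:

```python
from collections import namedtuple
from datetime import datetime, timedelta

# Hypothetical record shape for both the short-term cache and the rejection log.
Attempt = namedtuple("Attempt", "ip email username seen")

CACHE_TTL = timedelta(hours=8)    # up from the stock 30 minutes
LOG_WINDOW = timedelta(days=30)   # user-configurable: 5, 10, 15, 30 days...

def matches(rec, ip, email, username):
    return rec.ip == ip or rec.email == email or rec.username == username

def is_known_spammer(ip, email, username, local_cache, rejection_log, query_central):
    now = datetime.utcnow()

    # 1. Short-term local cache of recent registration attempts.
    if any(matches(r, ip, email, username) and now - r.seen < CACHE_TTL for r in local_cache):
        return True

    # 2. Local rejection log: weeks or months of history, still no remote call.
    if any(matches(r, ip, email, username) and now - r.seen < LOG_WINDOW for r in rejection_log):
        return True

    # 3. Only now ask the central SFS server (the expensive step).
    return query_central(ip, email, username)
```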
It's just a suggestion, pedigree, but I suspect that if you'd just increase the cache time and make this one simple change to your look-up logic, your problems with database and name server performance at the central server would completely disappear. As it stands now, with at least 1,000 sites using your product and accessing your database, and an average of let's say 120 bot registration attempts per day per site (that's my own site's average), that works out to at least 5,000 bot registration attempts per hour (that's up to 15,000 central database queries per hour) being handled by your product worldwide. However, since recording the fact that they're using your software is NOT mandatory here for ANY webmaster, I'll bet that 1,000-site estimate is low and the actual number is 2 or 3 times that. Even if we assume half those 5,000 bot registration attempts never reach your central database because the local cache is blocking them (with a 30-minute cache time I'll bet the percentage being blocked locally is much lower than 50%), that still means your central database receives up to 7,500 query requests per hour to look up the IP address, email address or username of a bot who in all probability has visited the requesting site one or more times in the past few hours, days or weeks.
If my "SWAG" is right and there are actually 2,000 or 3,000 sites using your software rather than the 1,000 sites shown as having clicked "install" here at vbulletin.org, then your central server could be receiving 15,000 - 22,500 query requests per hour. That starts to sound like a helluva LOT of database work and would certainly explain why the central server is overloaded. To make matters worse, if my guess about the local cache hit rate is correct, then you're getting a much smaller percentage of local cache hits than the 50% I assumed above. In that case, your central server's database workload could be as high as 30,000 to 67,500 query requests per hour rather than 7,500, 15,000 or 22,500.
But the good news is, if my hunch about the cache time being too short is correct, you could reduce that query load by 93% just by increasing the cache time from 30 minutes to 8 hours. That would cut a central server query load of 45,000 requests per hour to roughly 3,000 per hour. And if the 7,500 queries per hour figure is a more accurate estimate, then by increasing the cache time you'd decrease the central server workload from 7,500 to about 600 queries per hour.
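Here's the same fleet-wide arithmetic written out so the assumptions are visible. The site counts, the 120-attempts-per-day figure, the 50%-blocked-locally guess and the 93% reduction are all estimates, not measurements:

```python
ATTEMPTS_PER_SITE_PER_DAY = 120   # my site's average, assumed typical
QUERIES_PER_ATTEMPT = 3           # IP + email + username
REDUCTION = 0.93                  # my estimated cut from the 8-hour cache + local log check

def central_queries_per_hour(num_sites, blocked_locally):
    attempts_per_hour = num_sites * ATTEMPTS_PER_SITE_PER_DAY / 24
    return attempts_per_hour * (1 - blocked_locally) * QUERIES_PER_ATTEMPT

for sites in (1000, 2000, 3000):
    for blocked in (0.50, 0.00):  # optimistic vs. pessimistic view of today's 30-minute cache
        now = central_queries_per_hour(sites, blocked)
        print(f"{sites} sites, {blocked:.0%} blocked locally: "
              f"~{now:,.0f}/hr today, ~{now * (1 - REDUCTION):,.0f}/hr after the changes")
```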
If you increase the cache time, check the local cache first and then query the local log second, BEFORE going to the central server, you are effectively spreading that 7,500-query-per-hour workload out across thousands of servers. In doing that, I bet you'd eliminate 95% of the load on your central server, and everyone who uses your product would see much better performance.
In my mind, that's definitely worth thinking about.
So tell me, what have I missed here, pedigree? Where has my reasoning gone wrong?
I hope this helps.