View Full Version : Is there a way to restrict how often guests can refresh?
spamgirl
08-20-2015, 02:44 PM
I was wondering if there is any add-on that can limit how frequently guests are allowed to refresh? I'd like members to be able to refresh as much as they want, but guests to be limited, regardless of the server load. Thanks for your help!
Elite_360_
08-20-2015, 03:29 PM
There is no way to stop someone from refreshing their browser.
Max Taxable
08-20-2015, 03:34 PM
Leverage browser caching of static content; that way the browser doesn't re-download the whole page weight on refresh. In fact, it will load only elements it didn't already encounter on first load.
For example, if your site loads 400 KB, a refresh should only transfer 1 or 2 percent of that, because the rest is cached.
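To sketch what that caching setup might look like in .htaccess (assuming Apache with mod_expires available; the types and lifetimes here are only examples, not recommendations):

```apache
# Cache static assets so returning visitors don't re-download them.
# Requires mod_expires; adjust lifetimes to suit how often assets change.
<IfModule mod_expires.c>
    ExpiresActive On
    ExpiresByType image/png "access plus 1 month"
    ExpiresByType image/jpeg "access plus 1 month"
    ExpiresByType text/css "access plus 1 week"
    ExpiresByType application/javascript "access plus 1 week"
</IfModule>
```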
spamgirl
08-20-2015, 03:40 PM
It's actually to ensure people aren't using scripts to scrape our site. We don't want to turn off public access, but we do want people to stop taking content from our site and reposting it elsewhere. Having to track down their host information and file a copyright complaint is getting to be a real time suck.
We'd just like an error to be shown if they refresh more than once a minute, which I know is possible when server load is high (if it's above x, then certain member groups see an error message while other member groups do not). I'd even be happy if it only updated the page content every minute.
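To illustrate the kind of throttle meant here, a minimal sketch (the 60-second window, the function name, and the in-memory table are all invented for the example; a real forum would hook this into its request handling):

```python
import time

# Hypothetical per-IP throttle: guests may load a page at most once per
# WINDOW seconds; members are never throttled.
WINDOW = 60
_last_seen = {}  # ip -> timestamp of last allowed request

def allow_request(ip, is_member, now=None):
    """Return True if this request should be served, False to show an error."""
    now = time.time() if now is None else now
    if is_member:
        return True
    last = _last_seen.get(ip)
    if last is not None and now - last < WINDOW:
        return False  # refreshed within the window: show the error page
    _last_seen[ip] = now
    return True
```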
Max Taxable
08-21-2015, 12:31 AM
You need "Ban Spiders by User Agent" then. A good, comprehensive list of bad bots is available and contains most of the known content scrapers, and you can add any you see to the list as well.
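For anyone who'd rather do this at the server level instead of (or alongside) the mod, a rough .htaccess equivalent might look like this; the agent names are just examples, not the mod's actual list:

```apache
# Illustrative only: refuse requests from a few known scraper user agents.
# Requires mod_rewrite; extend the alternation as you spot new agents.
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} (HTTrack|WebCopier|SiteSnagger) [NC]
RewriteRule .* - [F,L]
```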
spamgirl
08-21-2015, 12:48 AM
The problem is that it's a single person scraping our site for their own, and I don't know their IP, otherwise I'd just ban them. :(
Max Taxable
08-21-2015, 03:48 PM
You get the IP and their user agent string while they are on your site, from the WoL or even the server logs.
But let me get this straight - you want to restrict the reload of all visitors, because you have one person manually scraping content?
spamgirl
08-21-2015, 04:01 PM
We have hundreds of guests on the site, I have no way to determine which the scraper is.
I just want to temporarily slow them until we can figure out what's going on. If you have a better idea, I'd be happy to take your advice. :)
Max Taxable
08-22-2015, 01:19 AM
I would solve this problem by installing Paul M's "Track Guest Visits" and studying the log it provides daily, looking for IP addresses that load a lot of pages. That mod tracks visitors that way. It also gives you their user agent, tells you exactly which pages they visited, and timestamps everything.
You must identify the bad actor and stop IT, not penalize all visitors. If you slow down your page loading or otherwise restrict visitors, get ready for the hit from Google in your search results and PageRank.
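If you'd rather comb the raw access log than install a mod, a rough stdlib Python sketch of that hunt might look like this (the combined-log regex, the static-asset filter, and the 100-hit threshold are all assumptions):

```python
import re
from collections import Counter

# Count page requests per IP in an Apache-style access log to spot scrapers.
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(?:GET|POST) (\S+)')

def heavy_hitters(lines, min_hits=100):
    """Return (ip, count) pairs for IPs with at least min_hits page loads."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue
        ip, path = m.groups()
        # Skip static assets; scrapers stand out on actual page views.
        if not path.endswith((".css", ".js", ".png", ".jpg", ".gif")):
            counts[ip] += 1
    return [(ip, n) for ip, n in counts.most_common() if n >= min_hits]
```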
spamgirl
08-22-2015, 11:50 AM
Yeah, you're right. :( I'll give that a try, thank you!
Zachery
08-22-2015, 04:58 PM
If you want to stop people from scraping your site, don't put it on the internet.
TheLastSuperman
08-22-2015, 08:14 PM
I know you weren't sitting there all riled up, intentionally posting something to sound mean or rude, yet I thought back to an old saying most of us were taught as kids: "If you don't have anything nice to say, don't say anything at all." That's not you, in my opinion. Since tone is always missing I can't assume, but do you ever re-read what you type and realize it's not offering one bit of help sometimes? I think the OP has a valid concern and wants helpful suggestions, not a reply that can't be taken any other way but as being a smarty-pants.
Spamgirl,
I think Max had an excellent idea... it may take more time to review the logs for certain guests with Paul's mod, but if you do it now and find who you think the culprit is, it might help! Remember, though, that overseas a person can unplug their modem/router and BAM, instant new IP address, so if they happen to be where that can happen, let's hope they only scrape content and aren't toooooo web savvy :cool:.
spamgirl
08-22-2015, 08:28 PM
FWIW, I get what Zachery is saying, but that doesn't mean I won't try to at least stem the flow. If we sit back and don't fight, we let the monsters win, and I refuse to do that in any situation. Nothing is hopeless. :)
Anyhoo, I agree that Max had an excellent idea! Already three IPs are sticking out like a sore thumb, and one of them seems to be the culprit (with a scraper I didn't even know about potentially being a second problem user). Based on their shitty web design skills, I'm hopeful that means they aren't tech savvy at all. :) Thank you all so much for your advice!
--------------- Added 08-23-2015 ---------------
I've found the IPs and tried to block them with .htaccess. I included my own IP in order to test it, but I am still able to access the forum, I just can't see the CSS or images. Here is what I did:
order allow,deny
deny from ###.#.#.
deny from ###.#.#.
deny from ###.#.#.
allow from all
Does anyone know why it would be so wonky? I put it in the main folder of my forum (html1). My site is hosted on EC2, if that matters. I tried it last week and it worked, so I don't know why it wouldn't now...
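One guess, offered without certainty: EC2 sites often sit behind a load balancer or proxy, in which case Apache sees the proxy's address and `deny from` may never match the real client. A possible workaround is matching the X-Forwarded-For header instead (the IP prefix below is a placeholder):

```apache
# If requests arrive through a proxy/load balancer, block on the
# forwarded client address rather than the connecting address.
# Requires mod_rewrite; 203.0.113. is a documentation-range placeholder.
RewriteEngine On
RewriteCond %{HTTP:X-Forwarded-For} ^203\.0\.113\.
RewriteRule .* - [F,L]
```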
Zachery
08-24-2015, 11:03 PM
Sometimes the truth hurts, but it's important to understand the limitations of what you can do. You can ban an IP, but it will probably change and they'll come back.
You can make it so only registered users can view content, but then your search rankings go down.
You can make some content pay-only, but chances are if it's stuff people want, someone will steal it, and hopefully they don't do it with a stolen credit card.
I do think you should fight, just be ready for the long haul.
If they're actually stealing and rehosting your content on their site, you could try a DMCA, but it may or may not work.
spamgirl
08-24-2015, 11:22 PM
I've been doing the DMCA, but they just change hosts every day. Now I'm blocking by IP, and just redoing it constantly. I've actually found *multiple* scrapers since installing the Track Guests extension, go figure. :/ I'll just keep up the good fight and hope I annoy them into scraping someone else lol
bridge2heyday
08-28-2015, 10:00 AM
Is this what you are looking for ?
Limited Guest Viewing -- Motivate Guests to Register (https://vborg.vbsupport.ru/showthread.php?t=231352)
It's not easy to prevent people from scraping your site.
IPs can be changed, proxies can be used, and headers can be spoofed (making methods that detect the user agent useless).
There may be one way to stop scrapers: add a JavaScript check before people are able to view your site. CloudFlare does this to prevent certain DDoS attacks. However, people could simply go to your site in a normal browser and save each file individually to their desktop.
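To make the idea concrete, here is a minimal Python sketch of such a cookie challenge (the secret, cookie name, and token scheme are invented; the page served on "challenge" would carry inline JavaScript that sets the cookie and reloads, so clients that never run JavaScript never get past it):

```python
import hmac
import hashlib

# Invented secret for illustration; a real deployment would generate its own.
SECRET = b"change-me"

def expected_token(ip):
    """Per-IP token the challenge page's script is expected to set."""
    return hmac.new(SECRET, ip.encode(), hashlib.sha256).hexdigest()[:16]

def handle_request(ip, cookies):
    """Return 'content' if the challenge cookie checks out, else 'challenge'."""
    if cookies.get("js_check") == expected_token(ip):
        return "content"
    # Serve an HTML page whose inline JS sets js_check and reloads.
    return "challenge"
```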
spamgirl
08-28-2015, 03:33 PM
That is! Thank you so much. :)
Zachery
08-30-2015, 10:48 PM
FYI, sometimes search engines can penalize you for this. It won't work for anyone who is blocking cookies, or who decides to use specific user agents that are generally whitelisted.
More often than not it just leads to:
- More users leaving your site
- Some users registering just to view content, but not participate
Max Taxable
08-31-2015, 12:30 AM
Pretty sure spiders are immune to it, if memory serves. I used it for a while.
But for the rest, you're right. All it really does is irritate people.
Zachery
08-31-2015, 10:55 PM
It can be considered content cloaking.
spamgirl
08-31-2015, 11:07 PM
I'm not actually planning to use it, the Track Guests extension was extremely helpful. I was just thankful that it was suggested :)
vBulletin® v3.8.12 by vBS, Copyright ©2000-2025, vBulletin Solutions Inc.