Limit Viewing/Allow Spidering


dynamite
01-05-2004, 10:00 PM
I would like to request the ability to limit guests to viewing thread titles only. When they attempt to view the actual thread, they would receive a message stating that they must register to view posts.
On the flip side, I still want spiders like Google to be able to fully spider the site so I can continue to draw traffic from the search engines.

The reason for this request is that my bandwidth use is greatly increased by users who aren't even participating on the site. Most come to view certain information contained in a thread, get that info, never return, and never participate. I figure if they have to take the time to register, verify their email, etc., then they might actually take the time to participate.

Thanks for any help you can give me!

NTLDR
01-05-2004, 10:21 PM
You do realise that giving Google full access to the threads, but not normal guests, will A) still allow guests to view the posts from Google's cache, and B) potentially be very easy to bypass anyway, depending on how it's done.

dynamite
01-06-2004, 12:47 AM
Yeah, but for the majority this will be enough of an inconvenience that they either sign up or go away. I know there are the caches, but I think the majority of people don't pay attention to them, otherwise they wouldn't even be coming to my site. Besides, if they are reading the caches then they might not even come to the site anyway. I figure when I have 20,000+ unique visitors a month and only 3,000 registered users, with a few hundred actually participating, something is out of whack.
I know I could just require viewers to register to see anything, but I think this turns them away even more, and it could affect my page ranking on Google, which is a 6 right now.

dynamite
01-13-2004, 01:19 AM
I think this would be fairly simple to do... guess this means I am going to try myself because it looks like others would like it too: https://vborg.vbsupport.ru/showthread.php?t=60137

Just one thing: can someone point me to where it identifies the spiders so I can look at that code? Thanks!

dynamite
01-13-2004, 02:14 AM
Quick question... does anyone know if this hack (https://vborg.vbsupport.ru/showpost.php?p=235328&postcount=1) has any impact on spiders (i.e. will it only allow them to spider 5 threads and then block them)? If not, I believe this one could be adjusted to fit vB3 and serve the same purpose (that is, if it is OK with Zzed to modify it).

Zachery
01-13-2004, 02:29 AM
Either LET spiders visit, or DON'T.

And I believe it's done with sessions, so search engines wouldn't properly spider it anyway.

dynamite
01-13-2004, 02:13 PM
Well, the thing is that I WANT spiders to be able to do their job. I just don't want guests to be able to view anything other than the thread title without registering.

NTLDR
01-13-2004, 02:48 PM
I'm pretty sure this is against Google's policy and could result in your site being removed from their index, IIRC.

dano
01-13-2004, 03:30 PM
Can't you get around that with the archive built into vB3? Google and other spiders can still crawl that, and you won't have any issues turning off the board to non-members.

NTLDR
01-13-2004, 03:39 PM
If you hacked the forum to prevent viewing by guests but allowed it on the archive, why bother to register when you could read the thread in the archive?

dano
01-13-2004, 03:42 PM
I doubt too many people would think of that option, but who knows. I think this hack could maybe help him:

https://vborg.vbsupport.ru/showthread.php?t=59859&page=1&pp=15

dynamite
01-13-2004, 06:18 PM
That is what I want, but the only problem is that it blocks the spiders also, and without the spiders there would be NO traffic. Plus I need them for AdSense.

I have been playing around with it all day, but I am having trouble coming up with what to add to make it recognize a spider and allow it.

Something to the effect of:

// Guess at the idea: block guests (userid 0) unless they are a spider.
if ($bbuserinfo['userid'] == 0 AND $guests['spider'] != $spider)
{
	print_no_permission();
}

I have no idea what I'm really doing, so I would appreciate it if someone could point me in the right direction on checking for the spiders as well.

Zachery
01-13-2004, 06:47 PM
That would also block spiders.

Anyway, what it comes down to is: allow spiders and guests to view, or don't. If the spiders are indexing your site, your guests can search the archive OR Google, so it doesn't matter. Just allow both :)

dynamite
01-13-2004, 08:31 PM
OK... can somebody with some understanding of coding take a look at this for me? It seems to be working, allowing Google to spider while denying guests, from what I can tell by Who's Online. Please let me know if this seems OK to you.

// Block guests (userid 0) and usergroup 3 unless the user agent contains "googlebot".
// strpos() returns 0 for a match at the start of the string, so compare with === false.
if ((strpos(strtolower($_SERVER['HTTP_USER_AGENT']), "googlebot") === false) AND ($bbuserinfo['userid'] == 0 OR $bbuserinfo['usergroupid'] == 3))
{
	print_no_permission();
}

Granted, if this seems OK, then basically I would just have to allow the other user agents for the spiders I want... right?
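
Allowing other spiders the same way could be sketched roughly as below. This is only an illustration under stated assumptions: the function name is_allowed_spider() and the list of user-agent substrings are hypothetical (neither is part of vBulletin), and the list would need to be checked against what the spiders actually send.

```php
<?php
// Hypothetical helper, not a vBulletin function: returns true when the
// user-agent string contains one of the allowed spider substrings.
function is_allowed_spider($userAgent, $allowedSpiders)
{
    $userAgent = strtolower((string) $userAgent);
    foreach ($allowedSpiders as $needle) {
        // strpos() returns 0 for a match at the very start of the string,
        // so the result is compared with !== false rather than != false.
        if (strpos($userAgent, strtolower($needle)) !== false) {
            return true;
        }
    }
    return false;
}

// Example list (assumed substrings; adjust to the spiders you want to allow).
$allowedSpiders = array('googlebot', 'slurp', 'msnbot');
```

The strpos() test in the condition would then become !is_allowed_spider($_SERVER['HTTP_USER_AGENT'], $allowedSpiders). Keep in mind that the user agent is sent by the client, so anyone can claim to be a spider; that is the easy bypass NTLDR mentioned earlier.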

Zachery
01-13-2004, 08:33 PM
I'm fairly sure that will just disallow all Google spiders and anyone in usergroup 3 :)

dynamite
01-13-2004, 08:49 PM
This is what sucks about not having a full understanding of coding and basically mixing and matching. Here is a screenshot of what my Who's Online looks like. As you can see, the Google spiders are not getting the https://vborg.vbsupport.ru/external/2004/01/4.gif while the guests are. From the looks of it, it seems to be allowing the spiders to do their job but blocking the guests.

Zachery
01-13-2004, 08:53 PM
Basically, what that says is:

if you're Googlebot OR in group 3, show the no permissions screen :)

dynamite
01-13-2004, 09:00 PM
So basically, I need to switch it to != false, and that should take care of it.

What I want it to say is

if the user is NOT googlebot, AND they have no userid OR are in groupid 3, then show the no permissions page
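
The sentence above can be written out as a small truth table. A minimal sketch, assuming the spider check has already produced a true/false value; should_block() is a hypothetical helper (not a vBulletin function) and the user ids and usergroup ids in the test cases are made up for illustration.

```php
<?php
// Hypothetical helper mirroring the sentence above.
// $isSpider: result of whatever spider check is used
// $userid / $usergroupid: the values the thread reads from $bbuserinfo
function should_block($isSpider, $userid, $usergroupid)
{
    // NOT a spider, AND (no userid OR in usergroup 3)
    return !$isSpider && ($userid == 0 || $usergroupid == 3);
}

// Print each combination to see who gets the no-permission page.
$cases = array(
    array(true, 0, 1),   // spider, not logged in
    array(false, 0, 1),  // ordinary guest
    array(false, 42, 3), // logged in, usergroup 3
    array(false, 42, 2), // logged in, some other usergroup
);
foreach ($cases as $case) {
    list($isSpider, $userid, $usergroupid) = $case;
    echo var_export($isSpider, true), ' ', $userid, ' ', $usergroupid, ' => ',
        should_block($isSpider, $userid, $usergroupid) ? 'block' : 'allow', "\n";
}
```

The parentheses around the OR part matter: without them, a spider that happens to be in usergroup 3 would be blocked as well.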

Zachery
01-13-2004, 09:00 PM
yes :)

Link14716
01-13-2004, 09:24 PM
I think you are wrong Faranth.

The original code said that if it is NOT a googlebot and blah blah, show the no permissions screen.

Zachery
01-13-2004, 09:25 PM
We talked about that already and told him to reverse it...

Link14716
01-13-2004, 09:26 PM
Reverse what? He had it right the first time. != false means that if it's true, a.k.a. the user IS a googlebot, show it a no permissions screen. He had it right the first time. :p
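
A side note on comparing the result of strpos() with false: the function returns the position of the match, and that position is 0 when the match sits at the very start of the string. A loose comparison treats 0 the same as false, so only the strict operators tell "found at position 0" apart from "not found". A small sketch of the difference; the two helper names are hypothetical and the user-agent strings in the comments are just examples.

```php
<?php
// Loose version: a match at position 0 is wrongly reported as "not found".
function contains_loose($haystack, $needle)
{
    return strpos($haystack, $needle) != false;
}

// Strict version: only a real false (no match at all) counts as "not found".
function contains_strict($haystack, $needle)
{
    return strpos($haystack, $needle) !== false;
}

// Example inputs (lowercased, as in the thread's code):
//   'googlebot/2.1 (+http://www.google.com/bot.html)'          match at position 0
//   'mozilla/5.0 (compatible; googlebot/2.1; ...)'             match later in the string
//   'mozilla/4.0 (compatible; msie 6.0; windows nt 5.1)'       no match
```

The two versions only disagree for the first kind of string, which is why a loose check can appear to work for some user agents and fail for others.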

Zachery
01-13-2004, 09:28 PM
My bad lol ^^ I've been a bit off all day.

Link14716
01-13-2004, 09:30 PM
So have I... wow, did I repeat a sentence?

dano
01-13-2004, 09:43 PM
What's the bottom line here for doing this, then? If it's figured out, will someone post it? Or is it what he posted the first time?

Also, if I switch the forums off so that neither the guests nor the crawlers can see them, the crawlers will still be able to get the archive, right? That's what I am most concerned with.

NTLDR
01-13-2004, 09:46 PM
The archive gets switched off along with the forums.

dynamite
01-13-2004, 10:59 PM
The whole point of what I have been trying to do here is give the spiders free roam of the site. They can spider the normal forums (for those who use AdSense), as well as the archives, but a normal person would not be able to do this. They would have to register in order to view anything other than the thread title.

dano
01-21-2004, 02:52 PM
Did you or anyone ever figure this one out?

dynamite
01-21-2004, 08:36 PM
Well, it seemed to work, but then I started noticing that some of the spiders were allowed and some were not. I haven't figured that one out yet, but in the process of testing I managed to get my Google PR cut in half. I'm waiting for that to build back up, and I am going to try this on a site where the PR doesn't really matter. So if I figure out any more, I will post an update.

Zachery
01-21-2004, 08:37 PM
What's the point of blocking guests but not spiders? They can use Google or whatever other search engine to view your pages.

dano
01-21-2004, 08:49 PM
Very cool, I look forward to the update. I have like 250 guests online right now, and I would really like to make them register. :)

Zachery
01-21-2004, 09:13 PM
What's the point of having 50,000 users with 500 posts?

I'd rather have a low member count with a high thread and post count :)

hadley
05-27-2004, 11:49 PM
I'll tell you the point:

I have a site with a ton more content than the forums alone, and forcing people to register is one way (via subsequent emails) to introduce them to other parts of the site that they may regard as more valuable than the forums. I want the same thing that Dynamite wants.

Yes, I do understand that most forums exist for discussion, but mine now contain two years of valuable content and expert advice -- many people are satisfied to read that Q&A exchange with experts, and don't see the need to post. However, I think it's fair that I ask them for their email address (with full opt-out privileges, of course), before I grant them access.

An example: Ordinarily, I wouldn't bother to make this post, because I know most of you don't care. But it bothers me that someone from the vB.com team would make a comment like "what's the point." In the real world of business, number of users is the point.

Zachery
05-28-2004, 02:01 AM
If you allow spiders to spider your pages and display this info in a search engine, then anyone who googles your site could read it without registering.

My comment stands.