vb.org Archive (https://vborg.vbsupport.ru/index.php)
-   vBulletin 2.x Full Releases (https://vborg.vbsupport.ru/forumdisplay.php?f=4)
-   -   vbArchive - Search Engine Indexer for vBulletin (https://vborg.vbsupport.ru/showthread.php?t=47667)

saint_seiya 02-06-2003 07:41 PM

Resolve IPs, cool thanks man! :)

BTW, congrats on the visitors, I hope that every single one of them joins (same for my site :bandit: ) :):):)

kuska 02-06-2003 10:28 PM

Damn, Google just won't visit me :(

saint_seiya 02-06-2003 10:32 PM

I got like 6 bots on right now. googlebots. =|

TECK 02-07-2003 01:28 AM

Quote:

Originally posted by xiphoid
damn
IT DOESN'T STOP

4 Members and 53 Guests

Most users ever online was 58 on 06-02-2003 at 21:21.
There are currently 0 members and 49 guests on the boards.

That helped you break the visitor record too, heh. And you still have Google on your board now...

Overgrow 02-07-2003 02:55 AM

AUTO-FORWARDING

I'll put this out there for those of you who want to send users to the real forum page and spiders to the archive. I hope you don't get banned for link cloaking... see my last post.

PHP Code:

$homeurl = "yourdomain.com";
$referer = (string) getenv('HTTP_REFERER');
// Redirect when there is no referrer or it is not on our own domain.
if ($referer == '' or !stristr($referer, $homeurl)) {
    header("Location: http://www.$homeurl/forum/showthread.php?threadid=$thread[threadid]");
    exit;
}

It checks whether the referrer page is on your domain. If it's not, they obviously came in from somewhere else (i.e., search results), so they get sent out of the archive and on to the real thread.
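
The check described above can be pulled into a small function, which makes the cases easier to see. This is only a sketch: `should_leave_archive()` is a made-up helper, not part of vBulletin or vbArchive, and the referrer is supplied by the browser, so it can be missing or forged.

```php
<?php
// Hypothetical helper (not part of vBulletin or vbArchive): decide whether a
// visitor to an archive page should be sent on to the real thread.
function should_leave_archive($referer, $homeurl)
{
    // No referrer at all: typed URL, bookmark, or a client that hides it.
    if (!is_string($referer) || $referer === '') {
        return true;
    }
    // Compare only the host part, so "yourdomain.com" showing up inside a
    // search engine's query string does not count as an internal link.
    $host = parse_url($referer, PHP_URL_HOST);
    if (!is_string($host)) {
        return true;
    }
    $host = strtolower($host);
    $home = strtolower($homeurl);
    // Accept the bare domain and any subdomain such as www.
    if ($host === $home) {
        return false;
    }
    $suffix = '.' . $home;
    return substr($host, -strlen($suffix)) !== $suffix;
}
```

A plain substring test would treat a referrer like `http://www.google.com/search?q=yourdomain.com` as internal, because the domain name appears in the query string. Comparing hosts avoids that case.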

wooolF[RM] 02-07-2003 02:02 PM

Quote:

Originally posted by wooolF[RM]
Just got an idea...

Imagine Forum home page :
Users Currently Online: 200 [ 100 users + 100 guests ] <-- just an EXAMPLE

idea is to trace IPs of all users and if they match any of the IPs owned by any of search crawlers like googlebot, altavista etc, show this :

Users Currently Online: 200 [ 100 users + 80 guests + Google + Altavista ] <-- just an EXAMPLE


Maybe looks ugly... maybe add extra queries... instead of dnsing/tracing all IPs u can just look after its ident (like Mozilla for IE).


PS: maybe it's not clever to add it on the forum home, but I would REALLY like to see this feature implemented on Who's Online page :)

I know u can do it, TECK ;)

no comments at all? :paranoid:

TECK 02-07-2003 02:31 PM

You can't do it, because of the way the database is currently set up in vBulletin, at least not to my knowledge.
Code:

$loggedins = $DB_site->query_first("
  SELECT COUNT(*) AS sessions
  FROM session
  WHERE userid=0 AND lastactivity>$datecut
");


Floris 02-07-2003 02:32 PM

KuraFire has released a hack to modify Who's Online :) after my suggestion and request.

Online 15 users
Staff: 3 (user1,user2,user3)
Members: 7 (user4,user5,user6,user7,user8,user9,user10)
Guests: 5 (user11,user12,user13,user14,user15)

Maybe guests can now be split into:
Guests: 2 (user11,user12)
Search Engines: 3 (user13,user14,user15)

TECK 02-07-2003 02:33 PM

You cannot, Floris. That's why I posted the query.

Floris 02-07-2003 02:54 PM

When I typed my post, your post wasn't there yet :)
teck: Today 05:31 PM
xip: Today 05:32 PM

Idea: can't we rewrite it to have a separate usergroup ID for search engine bots? Like the ban script function, but instead of banning, it shows that it's a search engine bot: Google. :)

wooolF[RM] 02-07-2003 02:56 PM

:(

TECK 02-07-2003 03:25 PM

This is how vBulletin works now:
Every time a user (guest or member) enters the site, a unique session is created, which is automatically deleted after 900 seconds if it is no longer in use.

The userid=0 part of the query matches only guests, since they have no user IDs. So the query counts the sessions opened by those users, not their user agent or any other means of identification.

Unfortunately, there is no way around this... it is not as simple as it is with members.
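
To make the limitation concrete, here is a rough sketch of the same count done over plain arrays instead of the session table. The row layout and the function name are assumptions for illustration. Only the 900-second cutoff and the userid=0 test come from the post above.

```php
<?php
// Hypothetical stand-in for the session table: each row is one open session.
function count_guest_sessions(array $sessions, $now, $timeout = 900)
{
    // Same role as $datecut in the query: anything older counts as expired.
    $datecut = $now - $timeout;
    $guests = 0;
    foreach ($sessions as $row) {
        // userid 0 means "not logged in"; a crawler and a human guest
        // are indistinguishable by this test.
        if ($row['userid'] == 0 && $row['lastactivity'] > $datecut) {
            $guests++;
        }
    }
    return $guests;
}

$now = 1044631500; // fixed example timestamp
$sessions = array(
    array('userid' => 0,  'lastactivity' => $now - 60),   // human guest
    array('userid' => 0,  'lastactivity' => $now - 120),  // crawler, also userid 0
    array('userid' => 42, 'lastactivity' => $now - 30),   // logged-in member
    array('userid' => 0,  'lastactivity' => $now - 1000)  // expired guest session
);
echo count_guest_sessions($sessions, $now), "\n"; // prints 2
```

The human guest and the crawler both land in the same count, which is the point being made: the count alone cannot tell them apart.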

inphinity 02-07-2003 03:45 PM

something slightly similar (see attachment)
search engines are listed in italics

wooolF[RM] yes it is possible although it would add 1 extra query to index.php - covering every search engine isnt realistic (new ones everyday etc) but doing the major ones is fairly easy.

i'll get floris to test some bits tonight if he's around on irc

inphinity 02-07-2003 03:47 PM

Quote:

Originally posted by TECK
Hmmmmmm... I'm really pissed... you see the crawler918.com in the pic above?
That's a spy. Read more here about it.

To block the scum, add this at the top of your htaccess.txt file:
Code:

<limit GET>
  order allow,deny
  deny from 12.148.196.
  deny from 12.148.209.
  allow from all
</limit>

then upload the file and rename it to .htaccess
That will block them for good, damn crooks.

deny from 12.148.196.
deny from 12.148.209.

those denies block more than just the monkeys at crawler918.com
my limit section in htaccess looks like:

Code:

<limit GET POST>
  order allow,deny

## -----------------------------------------------------------------------------
## block crawler918.com - http://www.nameprotect.com
## http://www.advogato.org/article/610.html
## http://ws.arin.net/cgi-bin/whois.pl?...AMEPROTECT.COM
## http://ws.arin.net/cgi-bin/whois.pl?queryinput=!%20NET-12-148-196-128-1  /25
## http://ws.arin.net/cgi-bin/whois.pl?queryinput=!%20NET-12-148-209-192-1  /26
## http://ws.arin.net/cgi-bin/whois.pl?queryinput=!%20NET-12-175-0-32-1      /28
  deny from 12.148.209.192/26
  deny from 12.148.196.128/25
  deny from 12.175.0.32/28
## -----------------------------------------------------------------------------
## block cyveillance - http://www.cyveillance.com
## http://www.webmasterworld.com/forum11/1587.htm
## http://ws.arin.net/cgi-bin/whois.pl?...ut=CYVEILLANCE
## http://ws.arin.net/cgi-bin/whois.pl?queryinput=!%20NET-63-148-99-224-1    /27
## http://ws.arin.net/cgi-bin/whois.pl?queryinput=!%20NET-65-118-41-192-1    /27
  deny from 63.148.99.224/27
  deny from 65.118.41.192/27
## -----------------------------------------------------------------------------

## =============================================================================
## -----------------------------------------------------------------------------
  allow from all
</limit>

cyveillance do roughly the same thing as nameprotect
also in my robots.txt i've got:

# allow everyone else
User-agent: *
Disallow:

# block turnitin.com
User-agent: TurnitinBot
Disallow: /

www.turnitin.com might be a good cause for teachers - but they charge for accessing the data they've collected - so i'd rather not have them using my bandwidth/server load for free. So they can stay off my site until they decide to give a little back.
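
To see why the bare prefix `12.148.196.` blocks more than the `/25` range listed above, a CIDR test can be sketched in a few lines. `ip_in_cidr()` is a made-up helper for this example (IPv4 only). Apache does this matching itself and does not need it.

```php
<?php
// Hypothetical helper (IPv4 only, assumes a 64-bit PHP build):
// true if $ip falls inside the CIDR range $cidr.
function ip_in_cidr($ip, $cidr)
{
    $parts = explode('/', $cidr);
    if (count($parts) !== 2) {
        return false;
    }
    $ipLong  = ip2long($ip);
    $netLong = ip2long($parts[0]);
    $bits    = (int) $parts[1];
    if ($ipLong === false || $netLong === false || $bits < 0 || $bits > 32) {
        return false;
    }
    // Mask with the top $bits bits set, e.g. /25 -> 255.255.255.128.
    $mask = (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF;
    return ($ipLong & $mask) === ($netLong & $mask);
}

// The bare prefix "12.148.196." covers the same addresses as 12.148.196.0/24.
var_dump(ip_in_cidr('12.148.196.10', '12.148.196.0/24'));    // bool(true)
var_dump(ip_in_cidr('12.148.196.10', '12.148.196.128/25'));  // bool(false)
var_dump(ip_in_cidr('12.148.196.200', '12.148.196.128/25')); // bool(true)
```

An address such as 12.148.196.10 is caught by the broad prefix but sits outside the /25, so the narrower rule leaves it alone.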

TECK 02-07-2003 03:56 PM

Hmm, I did a WHOIS on their company (crawler918) and it came up with those 2 IPs...
Just curious, what results did you get (other names)?

wooolF[RM] 02-07-2003 04:28 PM

[ 19:31:44 ] /dns [ www.crawler918.com ] ...
[ 19:31:45 ] Failed to resolve : [ no such user ]

wooolF[RM] 02-07-2003 04:29 PM

Quote:

Originally posted by inphinity
something slightly similar (see attachment)
search engines are listed in italics

wooolF[RM] yes it is possible although it would add 1 extra query to index.php - covering every search engine isnt realistic (new ones everyday etc) but doing the major ones is fairly easy.

i'll get floris to test some bits tonight if he's around on irc

Looks SEXY!!! Just what I needed :D

Also 1 extra query is not that much... maybe just adding it to who's online...

Thanx for the effort! :D

Floris 02-07-2003 05:05 PM

Wow, that was easy, only 5 lines of code or something :)

wooolF[RM] 02-07-2003 08:51 PM

as said earlier, looks sexy :) if you could also release it instead of teasing me :p ;)

Floris 02-07-2003 08:58 PM

If inp allows me to make a release, sure :)

wooolF[RM] 02-07-2003 09:00 PM

release? it's just an addon to an existing hack... uhm...

Floris 02-07-2003 09:03 PM

"I will release it"
"I will addon it"

I will go with 'release'.

wooolF[RM] 02-07-2003 09:10 PM

sorry... /me hides in the nearest bush and cries silently...

kuska 02-07-2003 09:58 PM

w00t, this is to l33t teck :)
FINALLY !!!!!!!!!!!!!!!!
43 Google BOTS crawling since yesterday and STILL AT IT !!!!!!
Thanks TECK :)
HOTM for this HACK !!!!!!!

limey 02-07-2003 10:10 PM

im so lucky...im mostly experiencing the turnitin.com crawl.

added them to robots.txt, but have to wait till their cached version expires.

inphinity 02-07-2003 10:58 PM

you can add a broad deny for them for 48 hrs (time it takes for their cache of robots.txt to expire)

## turnitin.com
deny from 64.140.49

remember to remove it though, it blocks a few more than turnitin.com but they don't own their own ip block...

>as said earlier, looks sexy if you could also release it instead of teasing me

just cleaning stuff up atm need to write instructions as well :/

wooolF[RM] 02-07-2003 11:07 PM

oki, I'll just hang on, thanx for the job u guys do :)

TECK 02-07-2003 11:50 PM

Quote:

Originally posted by inphinity
something slightly similar (see attachment)
search engines are listed in italics

wooolF[RM] yes it is possible although it would add 1 extra query to index.php - covering every search engine isnt realistic (new ones everyday etc) but doing the major ones is fairly easy.

i'll get floris to test some bits tonight if he's around on irc

My mistake, I misunderstood. I thought wooolF[RM] was referring to the online users on the forumhome page, not the actual online.php file.

About the index.php file: you said it can be done there too. Can you post the code? I would like to see it, please, so I can pick up a tip from you.
It doesn't look possible to me...

TECK 02-07-2003 11:57 PM

Quote:

Originally posted by kuska
w00t, this is to l33t teck :)
FINALLY !!!!!!!!!!!!!!!!
43 Google BOTS crawling since yesterday and STILL AT IT !!!!!!
Thanks TECK :)
HOTM for this HACK !!!!!!!

As I said, it takes time.
Please don't panic if the links are dropped in a week or 2, that is normal... they are moved from the "fast" crawl to the deep one.

TECK 02-08-2003 11:10 AM

If you want to display nice names for your crawlers, instead of "Guest", see attached file (20 seconds install).
All you have to do is to add your crawler name and IP part.

NOTE: Pay attention to the commas, when you add each crawler.
Notice that the last one doesn't have a comma at the end.
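
The attached file is not reproduced here, so the following is only a guess at the general shape of such a list, not TECK's actual code. The variable name, the helper and the bot names are invented, and the prefixes are reserved documentation ranges rather than real crawler addresses.

```php
<?php
// Placeholder list: IP part => display name. Substitute your own entries.
// Note the commas: every entry except the last one is followed by a comma.
$crawlers = array(
    '192.0.2.'    => 'ExampleBot',
    '198.51.100.' => 'OtherBot'
);

// Hypothetical helper: return the crawler name for a guest's IP, or "Guest".
function crawler_name($ip, array $crawlers)
{
    foreach ($crawlers as $prefix => $name) {
        // Match only at the very start of the address.
        if (strpos($ip, $prefix) === 0) {
            return $name;
        }
    }
    return 'Guest';
}

echo crawler_name('192.0.2.15', $crawlers), "\n";  // ExampleBot
echo crawler_name('203.0.113.7', $crawlers), "\n"; // Guest
```

Keeping the trailing dot on each IP part matters in a sketch like this: without it, `192.0.2` would also match `192.0.25.1`.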

wooolF[RM] 02-08-2003 11:34 AM

Quote:

Originally posted by TECK
If you want to display nice names for your crawlers, instead of "Guest", see attached file (20 seconds install).

Thanx for the snippet :cool: :banana: :D

Floris 02-08-2003 11:36 AM

You guys should have just waited.

wooolF[RM] 02-08-2003 11:38 AM

Quote:

Originally posted by xiphoid
You guys should have just waited.

for what? :) TECK released a nice snippet, it works here (just tried), no extra load noticed... Just a great addition to the board :)

TECK 02-08-2003 02:09 PM

Example of the script in action for crawler name instead of Guest...
Hmmm, 27 Google crawlers now chewing the web site...

TECK 02-08-2003 02:13 PM

Guys, if you get new crawler IP's, please post them here so everyone can add them...
Thanks.

Floris 02-08-2003 02:37 PM

This is why our script is better, it doesn't care about the IP

Here are some screenshots for inph to link to.

Floris 02-08-2003 02:37 PM

He will soon release his addon, which adjusts the nosessionhash part, turns "Guest" into the bot's name on online.php, and shows how many bots are online in the Who's Online section of index.php (following me?) hehe
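
A script that "doesn't care about the IP" presumably looks at the user agent string instead. The following is a guess at that general approach, not the unreleased addon itself. The function names are invented, and the list of user agent substrings is illustrative and far from complete.

```php
<?php
// Substring of the User-Agent header => name to show. Illustrative list only.
$bots = array(
    'Googlebot'   => 'Google',
    'Scooter'     => 'AltaVista',
    'TurnitinBot' => 'Turnitin'
);

// Hypothetical helper: name of the search engine, or null for a normal guest.
function bot_name($useragent, array $bots)
{
    foreach ($bots as $needle => $name) {
        if (stripos($useragent, $needle) !== false) {
            return $name;
        }
    }
    return null;
}

// Hypothetical helper: split guest user agents into plain guests and bots.
function split_guests(array $useragents, array $bots)
{
    $result = array('guests' => 0, 'bots' => array());
    foreach ($useragents as $ua) {
        $name = bot_name($ua, $bots);
        if ($name === null) {
            $result['guests']++;
        } elseif (isset($result['bots'][$name])) {
            $result['bots'][$name]++;
        } else {
            // First session seen for this engine.
            $result['bots'][$name] = 1;
        }
    }
    return $result;
}
```

The trade-off against an IP list is that a user agent is trivially faked, but no address list has to be maintained as crawlers move around.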

TECK 02-08-2003 02:39 PM

Quote:

Originally posted by xiphoid
This is why our script is better, it doesn't care about the IP

Here are some screenshots for inph to link to.

Who's saying my script is better?
You should release it so everyone can use it.

Floris 02-08-2003 02:40 PM

Quote:

Originally posted by TECK
Who's saying my script is better?
You should release it so everyone can use it.

Because we are working on it, geez, told you 500x on irc already :) Everybody can use, *when it's done*
Just sit down and wait :banana:

TECK 02-08-2003 02:41 PM

Well, you posted screenshots, so I presumed it was done.
Then you should wait before you post anything... :p

And I don't like to sit down. :banana:


All times are GMT. The time now is 10:35 PM.
