Quote:
Originally Posted by Avros
Not all Bots will adhere to or obey the text file. If anything it will provide them with more information than you want them to have.
Only a minority of spiders adhere to those rules.
They can ignore "robots.txt", but they can't ignore this. It's not "robots.txt"; it has teeth.
Here's my current blacklist:
SemrushBot
SeznamBot
http_requester
BIXOCRAWLER
KomodiaBot
QACC
Plukkie
botje
Opera/9.80
'Mozilla
CompSpyBot
ScreenerBot
Chrome/15
python-requests
ZumBot
Ruby
Add Catalog
Genieo
socialbm_bot
Ezooms
omgilibot
Go http package
JikeSpider
Python-urllib
Iron
sputnik
Xenu
Wotbox
200PleaseBot
360Spider
Indy Library
Sogou
SEOstats
baiduspider
beta.statsit.com
statsit
SiteIntel
Yandex
GomezAgent
Nesotebot
DCPbot
AOL Advertising R&D
DataCha0s
aiHitBot
Apache-HttpClient
Zend_Http_Client
ReverseGet
XXX bot Content
vBSEO
spbot
OffByOne
thyroidbuzz
AcoonBot
coccoc
xpymep
proxyproxy2884
AppEngine
start.exe
Semiocast HTTP client
Firefox/3.6.23
Firefox/3.6.3
TurnitinBot
curl
SwpLc/1.6
GrepNetstat.com
news bot
AskTbPTV
checks
panopta
App3le
PhantomJS
AlwaysOnline
SISTRIX
proximic
CRAWL-E/0.6.4
WebMoney
HTMLParser
oBot
UnisterBot
ERACrawler
MSIE 2
MSIE 3
MSIE 4
MSIE 5
MSIE 6
crawler4j
NCSA_Mosaic
Rippers
80legs
Firefox/3.5.6
YaBrowser
majestic
EasouSpider
User-Agent
FunWebProducts
I am not seeing anything from "majestic" in online.php, or from any of these for that matter.
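
For anyone wondering how a list like this might be applied, here is a rough Python sketch. It is not the code I run. It assumes each blacklist entry is treated as a case-insensitive substring of the User-Agent header and that a match gets a 403. The names `BLACKLIST`, `is_blocked` and `status_for` are made up for the example, and the list is only a small sample of the entries above.

```python
# Hypothetical sample of the blacklist entries posted above.
BLACKLIST = [
    "SemrushBot",
    "python-requests",
    "curl",
    "majestic",
    "MSIE 6",
    "Chrome/15",
]


def is_blocked(user_agent, blacklist=BLACKLIST):
    """Return True if any blacklist entry occurs in the User-Agent.

    Assumption: entries are plain substrings, compared case-insensitively.
    A missing User-Agent (None) is treated as an empty string.
    """
    ua = (user_agent or "").lower()
    return any(entry.lower() in ua for entry in blacklist)


def status_for(user_agent):
    """Return the HTTP status this sketch would send: 403 if blocked, else 200."""
    return 403 if is_blocked(user_agent) else 200


if __name__ == "__main__":
    for ua in (
        "Mozilla/5.0 (compatible; SemrushBot/7~bl)",
        "python-requests/2.31.0",
        "Mozilla/5.0 (X11; Linux x86_64; rv:115.0) Gecko/20100101 Firefox/115.0",
    ):
        print(status_for(ua), ua)
```

One thing to watch with plain substring matching: a short entry such as "Chrome/15" would also match a User-Agent containing "Chrome/150", so short or generic entries can catch more than intended.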