View Full Version : Heckling (and Jeckling) magpie-crawler


Digital Jedi
10-29-2013, 07:43 AM
Has anyone noticed or experienced issues with Brandwatch's magpie-crawler? I'll admit, I sometimes don't pay as close attention to my server logs as I should. But I did use to notice that magpie-crawler visited my site quite often.

Over a year ago, I had to shut down my site because of persistent database errors, mostly failures to connect or too many connections. Every time I sorted it out, it would crash even more often, so I decided it wasn't worth getting my account suspended all the time and shut the place down until I could fix it properly. The database errors stopped for that whole stretch, until the last couple of months, when I've been actively working on my websites every day. Suddenly it's crashing again, on websites where I'm the only visitor. So I checked my logs again.

I noticed that magpie-crawler had visited my website over 235 times today. That seems excessive. I did some research online, but I could only find little-known, somewhat overly emotional blog entries about the crawler chewing up bandwidth, and nothing more professionally written. And oddly, no forum posts about it. I was wondering what your experience with them is. While 235 hits is a bit much for a bot, I would have thought a typical shared hosting environment could handle it.
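For anyone who wants to run the same kind of count on their own logs, here is a rough Python sketch. It assumes the access log is in the common Apache/Nginx "combined" format, which may not match your host's setup. The function name `count_bot_hits`, the sample lines, the simplified user-agent string, and the IP addresses (taken from the reserved documentation ranges) are all made up for illustration and are not from my actual logs.

```python
import re
from collections import Counter

# Assumed "combined" log format:
# ip ident user [time] "request" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} \S+ "[^"]*" "([^"]*)"'
)


def count_bot_hits(lines, needle="magpie-crawler"):
    """Return a Counter mapping client IP -> number of requests whose
    user-agent string contains `needle` (case-insensitive)."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and needle.lower() in match.group(2).lower():
            hits[match.group(1)] += 1
    return hits


# Hypothetical sample lines, for illustration only.
SAMPLE = [
    '192.0.2.10 - - [29/Oct/2013:07:00:01 -0500] "GET /forum/ HTTP/1.1" 200 5120 "-" "magpie-crawler/1.1 (example)"',
    '192.0.2.10 - - [29/Oct/2013:07:00:03 -0500] "GET /forum/a HTTP/1.1" 200 2048 "-" "magpie-crawler/1.1 (example)"',
    '198.51.100.7 - - [29/Oct/2013:07:01:10 -0500] "GET / HTTP/1.1" 200 1024 "-" "magpie-crawler/1.1 (example)"',
    '203.0.113.5 - - [29/Oct/2013:07:02:00 -0500] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (example browser)"',
]

if __name__ == "__main__":
    per_ip = count_bot_hits(SAMPLE)
    print("total bot hits:", sum(per_ip.values()))
    for ip, n in per_ip.most_common():
        print(ip, n)
```

To use it on a real file, pass an open file handle instead of `SAMPLE`, since the function only needs an iterable of lines. The per-IP breakdown is also a quick way to collect the addresses worth blocking.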

Since I'm not all that concerned with whatever it is Brandwatch does, I went ahead and added their deny line to my robots.txt and IP-blocked the three IPs I found for them in my logs. I'll be watching my site (and my logs) over the next couple of days to see if that even makes a difference.
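As a sanity check on a deny rule like that, here is a minimal Python sketch using the standard library's `urllib.robotparser`. The rule text below is my assumption of what a deny-everything entry for this bot looks like (a `User-agent: magpie-crawler` line followed by `Disallow: /`), not a quote from Brandwatch, and the helper name `is_allowed` and the example.com URL are hypothetical. Keep in mind that robots.txt is only advisory, so this shows what a well-behaved crawler should do, not what the bot will actually do.

```python
from urllib.robotparser import RobotFileParser

# Assumed deny rule for the crawler; adjust to match your own robots.txt.
ROBOTS_TXT = """\
User-agent: magpie-crawler
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if `user_agent` may fetch `url` under the given
    robots.txt text, according to Python's robots.txt parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "http://example.com/forum/"  # placeholder URL
    print("magpie-crawler:", is_allowed(ROBOTS_TXT, "magpie-crawler/1.1", url))
    print("Googlebot:", is_allowed(ROBOTS_TXT, "Googlebot", url))
```

Because the rule names only the one bot, other crawlers are unaffected, which is why the IP block is still worth keeping as a fallback in case the bot ignores the file.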

final kaoss
10-29-2013, 02:53 PM
You can always try out the "miserable users" mod on the bot's IP addresses :)

https://vborg.vbsupport.ru/showthread.php?t=231106