Quote:
Originally Posted by T3rminator
Is it the sheer number of crawlers or are the crawlers doing weird stuff? Asked some of the folks at work, if the former, they were pointing at some form of DDOS protection....but I believe that is big $$$.
It's mostly the number, but also the speed at which they are crawling. They are doing what spiders normally do when they are here, but the general rule is that the number of spiders is kept reasonably small so that the impact isn't great. For example, we generally have about 30 Google or MSN spiders at any one time, but we had more like 150 of these Huawei ones, and they were working 3x as fast as the Google or MSN ones.
The (useless) answer from our ISP was to install a Fortinet firewall in front of our server for an extra US$50 / month, but I've managed to block a number of the spiders myself, and I can still use the Linux iptables firewall if I need to.
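For anyone wondering what the iptables fallback might look like, here is a rough sketch in Python that turns a list of address ranges into DROP rules. It is only an illustration: the function name `drop_rules` and the example ranges are made up (the ranges come from the reserved documentation blocks, not from any real crawler), and it only prints the commands rather than running them.

```python
import ipaddress


def drop_rules(cidrs, chain="INPUT"):
    """Return iptables commands that drop traffic from each IPv4 range."""
    rules = []
    for cidr in cidrs:
        # strict=False normalises entries with host bits set,
        # e.g. 198.51.100.17/28 becomes 198.51.100.16/28
        net = ipaddress.ip_network(cidr, strict=False)
        if net.version != 4:
            # iptables handles IPv4 only; IPv6 ranges would need ip6tables
            continue
        rules.append(f"iptables -A {chain} -s {net} -j DROP")
    return rules


# Hypothetical ranges (reserved documentation addresses), NOT real crawler IPs
EXAMPLE_RANGES = ["203.0.113.0/24", "198.51.100.17/28"]

if __name__ == "__main__":
    for rule in drop_rules(EXAMPLE_RANGES):
        print(rule)
```

The real ranges would have to come from your own access logs, and it is worth reviewing the printed rules by hand before applying any of them as root.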
To give you an idea of the impact (and why it is hurting us), here are some raw numbers:
At this time of year our data usage averages 15 GB / day, but since May 28th the average has been 141 GB / day.
Likewise, we would normally serve ~500k pages a day with 700k server hits, but the averages since May 28th have been 3.3M pages and 3.5M hits.
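To put those figures side by side, here is a quick bit of arithmetic in Python. It uses only the numbers quoted above; the dictionary layout and the name `increase_factor` are just for illustration.

```python
# (normal daily average, daily average since May 28th), as quoted above
traffic = {
    "data/day": (15, 141),
    "pages/day": (500_000, 3_300_000),
    "hits/day": (700_000, 3_500_000),
}


def increase_factor(before, after):
    """How many times larger the new average is than the normal one."""
    return after / before


factors = {
    name: round(increase_factor(before, after), 1)
    for name, (before, after) in traffic.items()
}

for name, factor in factors.items():
    print(f"{name}: {factor}x normal")
```

So data usage is up by roughly 9.4x, pages by about 6.6x and hits by about 5x.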
It is improving gradually and there are currently no Huawei spiders active but there are some other Chinese ones like Baidu that I still need to knock on the head.