Spider-Based Strategies

While observing hits on our honeynet, we noticed a high volume of traffic from spiders. A spider is a program that fetches a series of web pages for analysis; Google's and Yahoo's web crawlers are well-known examples. Typically a spider announces itself as such in the 'User-Agent' field of an HTTP request, for example 'Googlebot 1.0'. Other spider programs we observed identify themselves as a typical web browser, but the tiny interval between their successive requests shows they are running without user interaction.

We have determined that the spamming attempts we received were caused by the presence of web forms on our honeypot. Search engines cannot be used to search for a form within a web site, so a spider or other parsing tool must have discovered a form on our honeypot. Once a form was discovered, spam was immediately inserted into it, regardless of the more valuable shell access the honeypot advertised. This points to an automated, spider-based attacker rather than a human.
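The two signals described above can be sketched in code. The following Python fragment (with invented log data, User-Agent patterns, and an assumed 500 ms threshold, none of which come from our honeynet configuration) flags clients that declare themselves as crawlers in the User-Agent header, and clients that present a browser-like User-Agent but issue requests too quickly for human browsing:

```python
import re
from datetime import datetime, timedelta

# Hypothetical log entries: (client IP, timestamp, User-Agent header).
LOG = [
    ("203.0.113.7", datetime(2024, 1, 1, 12, 0, 0), "Googlebot 1.0"),
    ("198.51.100.4", datetime(2024, 1, 1, 12, 0, 1, 0), "Mozilla/5.0 (Windows NT 10.0)"),
    ("198.51.100.4", datetime(2024, 1, 1, 12, 0, 1, 200000), "Mozilla/5.0 (Windows NT 10.0)"),
    ("198.51.100.4", datetime(2024, 1, 1, 12, 0, 1, 400000), "Mozilla/5.0 (Windows NT 10.0)"),
]

# Substrings that self-identifying crawlers commonly place in User-Agent.
BOT_PATTERN = re.compile(r"bot|crawler|spider|slurp", re.IGNORECASE)

# Assumed cutoff: requests arriving faster than this from one client
# suggest automation rather than a human clicking through pages.
MIN_HUMAN_INTERVAL = timedelta(milliseconds=500)

def classify(log):
    """Label each client 'declared-bot', 'suspected-bot', or 'browser'."""
    verdicts = {}
    last_seen = {}
    for ip, ts, agent in log:
        prev = last_seen.get(ip)
        last_seen[ip] = ts
        if BOT_PATTERN.search(agent):
            verdicts[ip] = "declared-bot"
            continue
        if verdicts.get(ip) == "declared-bot":
            continue
        if prev is not None and ts - prev < MIN_HUMAN_INTERVAL:
            # Browser-like User-Agent, but the inter-request gap is
            # too small for user-driven browsing.
            verdicts[ip] = "suspected-bot"
        elif verdicts.get(ip) != "suspected-bot":
            verdicts[ip] = "browser"
    return verdicts
```

Running `classify(LOG)` on the sample data labels the first client 'declared-bot' from its User-Agent alone, and the second 'suspected-bot' from its 200 ms request intervals. A real deployment would of course need to tolerate bursty but legitimate human behavior (e.g. parallel asset fetches by a browser), which this sketch ignores.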