One conclusion was immediately obvious when we started to analyse data from the UK honeynet compromise described in phishing technique one above: because of multiple simultaneous attacks by different blackhat groups, a significant amount of time would be required to extract and prepare the data from the network streams before more detailed analysis could take place. This data extraction process is repetitive and tedious and, if carried out manually, represents an inefficient use of valuable analysis time. An automated solution was required.
The honeysnap script, written by David Watson of the UK Honeynet Project, grew out of this idea. It was designed to process honeynet data feeds on a daily basis and produce a simple summary output to direct later manual analysis. Honeysnap breaks down the data for each honeypot and provides lists of outbound HTTP and FTP GETs, IRC messages and Sebek keystroke logs. TCP stream re-assembly for interesting connections is automated, as is the extraction, identification and storage of files downloaded by FTP or HTTP. Much of the time-consuming preparatory work of incident analysis is therefore removed, leaving analysts free to concentrate on manually investigating the key elements of an incident. Honeysnap also screens IRC traffic automatically for interesting keywords (e.g. bank, account, password) and sends daily summary reports by email.
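The keyword screening step can be illustrated with a short sketch. This is not honeysnap's own code (the alpha release is a UNIX shell script); it is a minimal Python illustration that assumes the IRC messages have already been extracted to plain text, one message per line. The function names `screen_irc_lines` and `format_summary` are hypothetical, and the keyword list contains only the three examples given above.

```python
import re
from collections import Counter

# Example keywords from the text; a real deployment would likely use a longer list.
KEYWORDS = ("bank", "account", "password")


def screen_irc_lines(lines, keywords=KEYWORDS):
    """Return (hits, counts) for an iterable of IRC message lines.

    hits   -- list of (line_number, line) for every line containing a keyword
    counts -- Counter mapping each keyword to the number of lines it appeared in
    """
    # Match a keyword at the start of a word, allowing suffixes such as
    # "accounts" or "passwords", without regard to case.
    alternatives = "|".join(re.escape(k) for k in keywords)
    pattern = re.compile(r"\b(" + alternatives + r")\w*", re.IGNORECASE)

    hits = []
    counts = Counter()
    for number, line in enumerate(lines, start=1):
        found = {m.group(1).lower() for m in pattern.finditer(line)}
        if found:
            hits.append((number, line.rstrip("\n")))
            counts.update(found)
    return hits, counts


def format_summary(hits, counts):
    """Build a plain-text summary suitable for the body of a daily report."""
    out = ["IRC keyword summary", "-------------------"]
    for keyword in sorted(counts):
        out.append(f"{keyword}: {counts[keyword]} line(s)")
    out.append("")
    out.extend(f"line {number}: {line}" for number, line in hits)
    return "\n".join(out)
```

Given a file of extracted IRC messages, `screen_irc_lines(open(path))` would return the matching lines and the per-keyword totals, and `format_summary` turns them into text that could be mailed to the analysts. Matching on word prefixes is a design choice of this sketch: it catches plurals and related forms at the cost of some false positives, which is acceptable when the output only directs later manual analysis.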
Currently honeysnap is a basic proof-of-concept UNIX shell script. The alpha release can be found here, whilst a set of sample honeysnap output can be found here. A modular and fully expandable version written in Python is currently under development by members of the Honeynet Project and is due for beta release to the community in June 2005.