New Zealand Chapter Status Report For 2008

Report covering activities from late 2007 to December 2008. 
ORGANIZATION
Changes in the structure of your organization.
Christian Seifert has relocated to Seattle, and Peter Komisarczuk and Ian Welch from Victoria University have taken on the lead role in the Chapter for 2009. We would like to thank Christian for running the chapter for the last couple of years, as well as for leading the research and development of our client honeypots at Victoria University of Wellington.
The chapter remains fairly informal. Christian remains a key contributor and the architect of Capture-HPC, Capture-BAT and HoneyC, working remotely. Christian has also recruited Chiraag Aval to the chapter; Chiraag is a researcher at the University of Washington who is currently working on bare-metal support for Capture-HPC. We also have two new PhD students working on mobile exploit detection and on AI tools for analysis.
The chapter has recently reviewed its membership and is developing plans for further R&D on client honeypot technology, continued deployment and measurement of drive-by-downloads, and analysis tools.
 
List current chapter members and their activities
Dr Peter Komisarczuk, Senior Lecturer, Victoria University of Wellington, coordinator/supervisor
Dr Ian Welch, Senior Lecturer, Victoria University of Wellington, coordinator/supervisor
Mr Christian Seifert, PhD student, Victoria University of Wellington, and at Microsoft, Redmond, USA, architect of Capture, lead researcher.
Mr David Stirling, MSc student, Victoria University of Wellington, developer of scalable client honeypots using Grid technology, operations manager of client honeypot deployment for 2008.
Mr Radek Hes, Programmer, Victoria University of Wellington, developer. 
Mr Ramon Steenson, associated with Victoria University of Wellington, developer of Capture.
Mr Russell Fulton, Information Security Officer, University of Auckland.
Mr Bojan Zdrnja, University of Auckland.
Fahim Abbasi, PhD student, Massey University, researcher.
Chiraag Aval, MSc student, University of Washington, USA, researcher and developer.
Recent student members and summer research assistants:
Mr Pacharawit Topark-Ngarm, PhD student, Victoria University of Wellington, researching mobile client honeypots, developer and client honeypot operations 2009.
Mr Van Lam Le, PhD student, Victoria University of Wellington, researching AI techniques for analysis, developer and client honeypot operations 2009.
Mr Ryan Chard, Honours student/research assistant, Victoria University of Wellington, developer.
Mr David Fowler, undergraduate student/research assistant, Victoria University of Wellington, developer.
 
DEPLOYMENTS
List current technologies deployed.
The University of Auckland has continued to provide a Global Distributed Honeynet (GDH) node. This will continue through 2009.
A client honeypot deployment is provided through Victoria University of Wellington (sponsored partially by InternetNZ and Victoria University). The client honeypot has been deployed and operational since April 2008. Several reports have been written and presentations given at InternetNZ, the Kiwicon hacker conference and the NZ-BTF, and an academic paper is being prepared. We intend to maintain this deployment through 2009; data is being shared with the New Zealand Centre for Critical Infrastructure Protection and AusCERT.
A generation III honeypot (roo) was hosted at Victoria University on a temporary basis for researcher Fahim Abbasi at Massey University. This will be supported by Massey University in 2009. The project identified and investigated design problems in building virtual honeynets, and Fahim is compiling a comprehensive explanatory report. Special emphasis was placed on achieving maximum logging capability within the available resources. The deployment ran for just over two months, during which the system recorded five successful compromises and various types of attacks; a summary will be published in a report in 2009.
 
Activity timeline: Highlight attacks, compromises, and interesting information collected.
See the presentations and papers listed for highlights of results, primarily from the client honeypot deployment. Information of interest is currently recorded in Christian Seifert’s blog; see http://homepages.mcs.vuw.ac.nz/~cseifert/blog/index.php
 
RESEARCH AND DEVELOPMENT
List any new tools, projects or ideas you are currently researching or developing.
Future work on developing Capture-HPC, Capture-BAT and HoneyC is primarily discussed in the next section. New work has started in a number of areas and we have released a new tool:
1. Christian Seifert has developed FFDetect, a Java library for detecting fast-flux domains, based on a paper by Thorsten Holz and on the Team Cymru ASN service (the tool is available at: http://homepages.mcs.vuw.ac.nz/~cseifert/FFDetect/index.shtml; a sketch of the underlying heuristic appears after item 5).
2. Hybrid high interaction and low interaction client honeypot. A research prototype is under development. This hybrid model will allow systems to scan web servers/pages faster.
3. Web services wrapping of Capture-HPC for large-scale deployment of client honeypots using BPEL workflows and Grid computing (Globus GT4). A prototype will be tested in Q1/2009; a production version may be developed for release later in 2009.
4. Mobile client honeypot research began in late 2008 and will carry on through to 2011. The tools developed may be released, depending on limitations imposed by funding sources.
5. AI analysis tools for automated processing of client honeypot results; this work began in late 2007 and will carry on through to 2011. Tools may be released periodically, depending on limitations imposed by funding received.
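To illustrate the kind of heuristic FFDetect builds on (item 1), here is a minimal Java sketch, assuming a placeholder ASN lookup: repeatedly resolve a domain, count the distinct A records and ASNs observed, and combine them into a linear flux score. The weights and threshold below are illustrative stand-ins, not the values used by FFDetect or in the Holz paper.

    import java.net.InetAddress;
    import java.util.HashSet;
    import java.util.Set;

    public class FluxScoreSketch {
        public static void main(String[] args) throws Exception {
            String domain = args.length > 0 ? args[0] : "example.com";
            Set<String> addresses = new HashSet<>();
            // Resolve the domain twice, pausing so that short-TTL fast-flux
            // domains have a chance to rotate their A records.
            for (int round = 0; round < 2; round++) {
                for (InetAddress a : InetAddress.getAllByName(domain)) {
                    addresses.add(a.getHostAddress());
                }
                Thread.sleep(2000); // in practice: wait for the record TTL to expire
            }
            // A real implementation maps each address to its ASN, e.g. via the
            // Team Cymru IP-to-ASN service; stubbed out here.
            int distinctAsns = lookupAsns(addresses).size();
            // Linear score over the two features; weights/threshold are illustrative.
            double score = 1.0 * addresses.size() + 10.0 * distinctAsns;
            System.out.printf("%s: %d addresses, %d ASNs, score %.1f -> %s%n",
                    domain, addresses.size(), distinctAsns, score,
                    score > 20.0 ? "possible fast flux" : "probably benign");
        }

        // Placeholder: a production tool would query the Cymru service here.
        private static Set<String> lookupAsns(Set<String> addresses) {
            return new HashSet<>();
        }
    }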
At the University of Washington there are two specific projects:
6. Developing a bare-metal version of Capture-HPC that runs without virtual machines. This work includes a study analyzing the kinds of attacks that are not caught by the current version of Capture-HPC, since malware can detect and avoid virtual machines (a minimal illustration of one such check follows this list).
7. Extending Capture-HPC by hooking the browser to give more information (such as which ActiveX components are loaded on each page), which should help answer some interesting research questions on how ActiveX components can be used to compromise machines.
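As one small example of why a bare-metal honeypot matters, the Java sketch below performs one of the simplest virtual machine checks malware is known to use: looking for a network adapter whose MAC prefix (OUI) belongs to a virtualisation vendor. The OUI list is illustrative rather than exhaustive, and real malware combines many such checks (timing, device names, registry keys).

    import java.net.NetworkInterface;
    import java.util.Enumeration;

    public class VmCheckSketch {
        // Well-known VMware-assigned OUIs; illustrative, not exhaustive.
        private static final String[] VM_OUIS = {"00:05:69", "00:0C:29", "00:1C:14", "00:50:56"};

        public static void main(String[] args) throws Exception {
            Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
            while (nics.hasMoreElements()) {
                byte[] mac = nics.nextElement().getHardwareAddress();
                if (mac == null || mac.length < 3) continue;
                String oui = String.format("%02X:%02X:%02X", mac[0], mac[1], mac[2]);
                for (String vmOui : VM_OUIS) {
                    if (oui.equalsIgnoreCase(vmOui)) {
                        System.out.println("Virtual machine adapter detected (OUI " + oui + ")");
                        return;
                    }
                }
            }
            System.out.println("No obvious VM adapter found");
        }
    }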
 
List tools you enhanced during the last year
Capture-HPC
Capture-HPC is a high interaction client honeypot system. It has been greatly enhanced in 2008. The current production release of Capture-HPC is 2.5.1; see https://projects.honeynet.org/capture-hpc/wiki/Releases for current release information and prior releases.
Capture-HPC is currently being extended in a number of areas:

  • Hooking the network API to collect further network events
  • Making Capture stateful, with database support, including the capture-client sending the malicious page/code to the capture-server. This allows us to start, pause and stop execution, etc.
  • Some integration with other tools for status and analysis (geo-location/ASN analysis, etc.; a sketch of such an ASN lookup follows this list)
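For the ASN side of that analysis, one option is the Team Cymru IP-to-ASN DNS interface, which FFDetect already builds on: a TXT query for the reversed IP under origin.asn.cymru.com returns the origin AS and prefix. A minimal Java sketch, assuming the platform's default DNS resolver is usable through JNDI:

    import javax.naming.directory.Attributes;
    import javax.naming.directory.InitialDirContext;
    import java.util.Hashtable;

    public class AsnLookupSketch {
        public static void main(String[] args) throws Exception {
            String ip = args.length > 0 ? args[0] : "192.0.2.1";
            // Reverse the octets: a.b.c.d -> d.c.b.a.origin.asn.cymru.com
            String[] o = ip.split("\\.");
            String name = o[3] + "." + o[2] + "." + o[1] + "." + o[0] + ".origin.asn.cymru.com";

            Hashtable<String, String> env = new Hashtable<>();
            env.put("java.naming.factory.initial", "com.sun.jndi.dns.DnsContextFactory");
            Attributes attrs = new InitialDirContext(env).getAttributes(name, new String[]{"TXT"});
            // The TXT record looks like: "AS | prefix | country | registry | date"
            System.out.println(attrs.get("TXT"));
        }
    }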

The development and testing is expected to be complete in Q2/2009. Future options for development work on Capture-HPC are documented at https://projects.honeynet.org/capture-hpc/wiki/Proposals. This is to be discussed at the Honeynet Workshop in February 2009.
Capture-BAT
Capture-BAT is a behavioral analysis tool for applications on the Win32 operating system family. Capture-BAT is able to monitor the state of a system during the execution of applications and the processing of documents, which provides an analyst with insight into how the software operates even if no source code is available. Capture-BAT is the analysis tool at the heart of the Capture-HPC client. See https://www.honeynet.org/node/315.
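One detail of that monitoring is worth illustrating: Capture-BAT filters the flood of benign state changes through exclusion lists of regular expressions, so that only unexpected events reach the analyst. The Java sketch below shows the filtering idea only; the event format and rules are invented for illustration and are not Capture-BAT's own syntax.

    import java.util.List;
    import java.util.regex.Pattern;

    public class ExclusionListSketch {
        public static void main(String[] args) {
            // Illustrative exclusion rules for known-benign state changes.
            List<Pattern> exclusions = List.of(
                    Pattern.compile("^file: .*\\\\Temporary Internet Files\\\\.*"),
                    Pattern.compile("^registry: set HKCU\\\\Software\\\\Microsoft\\\\Windows\\\\CurrentVersion\\\\Explorer\\\\.*"));
            // Illustrative event stream captured while a page renders.
            List<String> events = List.of(
                    "file: C:\\Users\\victim\\Temporary Internet Files\\ad.gif",
                    "registry: set HKCU\\Software\\Microsoft\\Windows\\CurrentVersion\\Explorer\\StreamMRU",
                    "process: created C:\\Windows\\Temp\\dropper.exe",
                    "registry: set HKLM\\Software\\Microsoft\\Windows\\CurrentVersion\\Run\\updater");
            for (String event : events) {
                boolean benign = exclusions.stream().anyMatch(p -> p.matcher(event).matches());
                if (!benign) System.out.println("REPORT: " + event); // only unexpected events
            }
        }
    }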
Capture-BAT will be updated to include the functionality described above for the next major release of Capture-HPC, and should be ready for release sometime in Q2/2009. It is our aim to keep Capture-BAT and Capture-HPC development in step as far as possible.
HoneyC
HoneyC is a low interaction client honeypot framework that allows us to find malicious web servers on a network. Instead of using a fully functional operating system and client to perform this task, HoneyC uses emulated clients that solicit only as much of a response from a server as is necessary for the analysis of malicious content. Version 1.3.0 was released on 19 January 2008; see https://projects.honeynet.org/honeyc/wiki/Releases for more information. No development plans are currently in place to extend HoneyC. New low interaction systems, based on the principles described in publication [1], are in the research phase. Release of these new tools has not yet been decided.
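To make the low interaction idea concrete, here is a minimal sketch of an emulated client in the same spirit (HoneyC itself is written in Ruby and matches responses against Snort-style signatures): fetch the page without rendering it and match the raw response against patterns. The two patterns below are illustrative stand-ins, not actual HoneyC rules.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;
    import java.util.List;
    import java.util.regex.Pattern;

    public class LowInteractionSketch {
        private static final List<Pattern> SIGNATURES = List.of(
                Pattern.compile("(?i)<iframe[^>]+(width|height)=[\"']?0"), // hidden iframe
                Pattern.compile("(?i)unescape\\s*\\(\\s*[\"']%u"));        // escaped shellcode

        public static void main(String[] args) throws Exception {
            URL url = new URL(args.length > 0 ? args[0] : "http://example.com/");
            StringBuilder body = new StringBuilder();
            try (BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()))) {
                String line;
                while ((line = in.readLine()) != null) body.append(line).append('\n');
            }
            for (Pattern sig : SIGNATURES) {
                if (sig.matcher(body).find()) {
                    System.out.println(url + " matched signature: " + sig.pattern());
                    return;
                }
            }
            System.out.println(url + " matched no signatures");
        }
    }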
 
Would you like to integrate this with any other tools, or are you looking for help or collaboration with others in testing / developing tools?
 
We are open to working with other chapters, depending on available (usually manpower) resources. We are looking to extend the analysis offered through Capture-HPC by including, for example, geo-location of malicious servers and exploit servers. We expect to reuse some components from various other projects such as Nepenthes (ASN, GeoIP lookups) and FFDetect, and to add in statistics and graphs. We would be happy to receive guidance and information on potential tools that could be used.

Explain what kind of help or tools or collaboration you are interested in.
As above, we are open to working with other chapters, depending on available (usually manpower) resources. As we develop our analysis tools we will need further data sets for AI training purposes.
We would like to extend our client honeypot deployment, as one hypothesis we would like to explore is whether location is a large factor in the delivery of exploits. The University of Auckland has offered to deploy a client honeypot.
 
FINDINGS
Highlight any unique findings, attacks, tools, or methods.
In [1] we present a novel classification method for detecting malicious web pages that involves inspecting the underlying static attributes of the initial HTTP response and HTML code. Because malicious web pages import exploits from remote resources and hide exploit code, static attributes characterizing these actions can be used to identify a majority of malicious web pages. Combining high-interaction client honeypots and this new classification method into a hybrid system leads to significant performance improvements.
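The hybrid arrangement is easy to sketch. In the illustrative Java below, a cheap static scoring pass stands in for the classifier in [1] and decides whether a page is worth a full high-interaction visit; the features, weights and threshold are invented for illustration, not the learned classifier from the paper.

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class StaticHeuristicSketch {
        static int count(String regex, String html) {
            Matcher m = Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(html);
            int n = 0;
            while (m.find()) n++;
            return n;
        }

        /** Static pre-filter: true means "escalate to the high-interaction honeypot". */
        static boolean suspicious(String html) {
            int hiddenIframes = count("<iframe[^>]+(width|height)=[\"']?0", html);
            int escapeCalls   = count("unescape\\s*\\(", html);            // common shellcode packing
            int remoteScripts = count("<script[^>]+src=[\"']?https?://", html);
            return 2 * hiddenIframes + 2 * escapeCalls + remoteScripts >= 2;
        }

        public static void main(String[] args) {
            String page = "<iframe src=\"http://bad.example/\" width=0 height=0></iframe>"
                    + "<script>document.write(unescape('%u9090%u6858'));</script>";
            System.out.println(suspicious(page)
                    ? "escalate to high-interaction honeypot"
                    : "classify benign without a browser visit");
        }
    }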
Paper [2] presents a novel classification method for detecting malicious web pages that involves inspecting the underlying server relationships. Because of the unique structure of malicious front-end web pages and centralized exploit servers, merely counting the number of domain name extensions and DNS servers used to resolve the host names of all web servers involved in rendering a page is sufficient to determine whether a web page is malicious or benign, independent of the vulnerable web browser targeted by these pages. Combining high-interaction client honeypots and this new classification method into a hybrid system leads to performance improvements.
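As a toy illustration of how lightweight the check in [2] is, the sketch below counts the distinct domain name extensions across the servers involved in rendering one page; the host list and threshold are invented for illustration, and the full method also counts the DNS servers used to resolve those hosts.

    import java.util.List;
    import java.util.Set;
    import java.util.TreeSet;

    public class RelationshipSketch {
        public static void main(String[] args) {
            // Host names of every server contacted while rendering one page.
            List<String> hostsOnPage = List.of(
                    "www.example.co.nz", "ads.example.ru", "counter.example.cn");
            Set<String> extensions = new TreeSet<>();
            for (String host : hostsOnPage) {
                extensions.add(host.substring(host.lastIndexOf('.') + 1)); // crude extension parse
            }
            System.out.println(extensions.size() + " extensions " + extensions
                    + (extensions.size() >= 3 ? " -> suspicious" : " -> benign"));
        }
    }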
In [3] we argue that merely recording network traffic is insufficient to perform an efficient forensic analysis of an attack: custom tools need to be developed to access and examine the embedded data of the network protocols, and even once the information is extracted from the network data, it cannot be used to perform a behavioral analysis of the attack, limiting the ability to answer what exactly happened on the attacked system. We propose a record/replay mechanism that allows the forensic examiner to easily extract application data from recorded network streams and allows applications to interact with such data for behavioral analysis purposes. A concrete implementation of such a setup for the HTTP and DNS protocols, using the HTTP proxy Squid and the DNS proxy pdnsd, is presented and its effect on digital forensic analysis demonstrated.
In [4] we show that applications that aim to defraud the victim are the primary malware type identified, and that antivirus products are only able to detect, on average, approximately 70% of the malware pushed in a drive-by-download attack.
In the KYE paper “Behind the Scenes of Malicious Web Servers” [5] we increase our understanding of malicious web servers through analysis of several web exploitation kits that appeared in 2006/07: WebAttacker, MPack, and IcePack. Our discoveries will necessitate adjustments in how we think about malicious web servers and will have direct implications for client honeypot technology and future studies.
In [6] we present the design and analysis of a new algorithm for high interaction client honeypots for finding malicious servers on a network. The algorithm uses the divide-and-conquer paradigm and results in a considerable performance gain over the existing sequential algorithm used in Capture-HPC. The performance gain not only allows the client honeypot to inspect more servers with a given set of identical resources, but also allows researchers to increase the classification delay to investigate false negatives incurred by the use of artificial time delays in current solutions. The divide-and-conquer algorithm, as well as a bulk algorithm (as used by the MITRE honeypot), is incorporated in release 2.5.1 of Capture-HPC; see https://projects.honeynet.org/capture-hpc/wiki/Releases.
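The strategy itself is compact enough to sketch. In the illustrative Java below, visit() stands in for a full Capture-HPC run over a batch of URLs, reporting only whether the virtual machine was compromised: a clean batch is classified benign in a single visit, while a compromised batch is split and re-checked until the malicious URLs are isolated.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.function.Predicate;

    public class DivideAndConquerSketch {
        /** visit.test(batch) is true iff visiting the whole batch compromises the VM. */
        static List<String> findMalicious(List<String> batch, Predicate<List<String>> visit) {
            List<String> malicious = new ArrayList<>();
            if (batch.isEmpty() || !visit.test(batch)) {
                return malicious;                    // clean batch: classified in one visit
            }
            if (batch.size() == 1) {
                malicious.add(batch.get(0));         // isolated a single malicious URL
                return malicious;
            }
            int mid = batch.size() / 2;              // otherwise split and recurse
            malicious.addAll(findMalicious(batch.subList(0, mid), visit));
            malicious.addAll(findMalicious(batch.subList(mid, batch.size()), visit));
            return malicious;
        }

        public static void main(String[] args) {
            List<String> urls = List.of("http://a.example", "http://b.example",
                    "http://evil.example", "http://c.example");
            // Stand-in for a real honeypot run: "compromise" iff the batch holds evil.example.
            Predicate<List<String>> fakeVisit = b -> b.contains("http://evil.example");
            System.out.println(findMalicious(urls, fakeVisit)); // [http://evil.example]
        }
    }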
In [8] we introduce the Grid Enabled Internet Instrument (GEII, http://homepages.ecs.vuw.ac.nz/~peterk/geii/) concept and discuss instruments that are being developed at Victoria University to measure Internet quality. The first instrument is a Grid version of the network telescope for studying Internet Background Radiation (IBR), and the second is a hybrid client honeypot system using high and low interaction devices for scanning the web for malicious content and servers. The GEII framework is a work in progress, and in [7] we outline how workflows using BPEL and Grid computing could be used for the deployment of client honeypots.
In [9] we discuss strategies for sampling Internet Background Radiation (IBR) with network telescopes. The paper is based on the thesis by Dean Pemberton, which contains some analysis of the 210GB IBR trace that was captured on a /16 network telescope; see http://homepages.ecs.vuw.ac.nz/~peterk/geii/dean-pemberton-thesis.pdf. The dataset is available to researchers as a compressed pcap archive (DatCat, http://www.datcat.org/).
From the virtual honeypot deployment, Fahim reports a few notable observations, including a sharp rise in MSSQL worm propagation (September to November). Attack analysis led to insights into botnet formation by various hacker groups. It was concluded that European hackers compromise large numbers of poorly patched and insecure hosts in China, and thus utilize those resources for achieving their goals. Fahim also studied various aspects of effectively informing the administrators of infected networks about the illicit use and abuse of their resources by the hackers. Sophisticated hackers are now using encryption to mask their IRC bot logs. Hackers have automated their attack tools and have thus increased their attack efficiency; some of these tools have been isolated from compromised honeypots. Fahim is investigating how to inform web hosting companies of hacker tools, warez and rootkits being hosted by attackers who are abusing their services.
 
Any trends seen in the past year?
Our main activity has been around the detection of drive-by-download servers in the .nz domain. The number of detected web servers indicates no clear trend of increasing malicious server numbers in the .nz domain. The data below gives the number of confirmed malicious servers detected from April to November 2008. Refer to our publications for more information; a detailed paper on the .nz scan is expected in 2009.
Month       Malicious servers
April       51
May         (no measurement available)
June        62
July        97
August      78
September   77
October     88
November    62
Table 1. Number of malicious web servers detected in the .nz domain from April to November 2008.
 
What are you using for data analysis?
A variety of tools: spreadsheets, plus some analysis in R (for the IBR data).
 
What is working well, and what is missing, what data analysis functionality would you like to see developed?
As discussed above.
 
PAPERS AND PRESENTATIONS
Are you working on or did you publish any papers or presentations, such as KYE or academic papers?  If yes, please provide a description and link (if possible)
Working on several papers related to client honeypots and IBR/network telescopes, to be published in 2009. 
[1] Seifert, C., Komisarczuk, P. and Welch, I. Identification of Malicious Web Pages with Static Heuristics. In Proceedings of the Australasian Telecommunication Networks and Applications Conference (ATNAC 2008), Adelaide, Australia, December 2008.
[2] Seifert, C., Komisarczuk, P., Welch, I., Aval, C. U. and Endicott-Popovsky, B. Identification of Malicious Web Pages Through Analysis of Underlying DNS and Web Server Relationships. In Proceedings of the 4th IEEE LCN Workshop on Network Security (WNS 2008), Montreal, Canada, 2008.
[3] Seifert, C., Endicott-Popovsky, B., Frincke, D., Komisarczuk, P., Muschevici, R. and Welch, I. Justifying the Need for Forensically Ready Protocols: A Case Study of Identifying Malicious Web Servers Using Client Honeypots. In Proceedings of the 4th Annual IFIP WG 11.9 International Conference on Digital Forensics, Kyoto, Japan, 2008.
[4] Navarez, J., Seifert, C., Endicott-Popovsky, B., Welch, I. and Komisarczuk, P. Drive-By-Downloads. Technical report, Victoria University of Wellington, Wellington, 2008; available from http://www.mcs.vuw.ac.nz/~cseifert/publications/CS-TR-08-01, accessed on 8 April 2008.
[5] Seifert, C. Know Your Enemy: Behind the Scenes of Malicious Web Servers. The Honeynet Project, 2007; available at http://www.honeynet.org/papers/wek/KYE-Behind_the_Scenes_of_Malicious_Web_Servers.pdf, accessed on 7 November 2007.
[6] Seifert, C., Komisarczuk, P. and Welch, I. Application of Divide-and-Conquer Algorithm Paradigm to Improve the Detection Speed of High Interaction Client Honeypots. In Proceedings of the 23rd Annual ACM Symposium on Applied Computing, Ceará, Brazil, April 2008.
[7] Stirling, D., Welch, I. and Komisarczuk, P. Designing Workflows for Grid Enabled Internet Instruments. In Proceedings of the 8th IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2008), Lyon, France, May 2008.
[8] Komisarczuk, P., Seifert, C., Pemberton, D. and Welch, I. Grid Enabled Internet Instruments. In Proceedings of the 2007 IEEE Global Communications Conference (GLOBECOM 2007), Washington DC, USA, November 2007.
[9] Pemberton, D., Komisarczuk, P. and Welch, I. Internet Background Radiation Arrival Density and Network Telescope Sampling Strategies. In Proceedings of the Australasian Telecommunication Networks and Applications Conference (ATNAC 2007), Christchurch, New Zealand, December 2007.

Are you looking for any data or people to help with your papers?
Data from other client honeypot deployments would be of interest.
 
Where did you present honeypot-related material? (selected publications)
Presented client honeypots and the .nz scan at the following meetings: (i) the NZ-BTF, (ii) Kiwicon (a hacker conference), and (iii) InternetNZ (sponsors of the .nz scan).
Christian has also presented at the University of Washington and at Microsoft.
 
GOALS
Which of your goals did you meet for the past year?
We increased the visibility of our work in New Zealand and were awarded a small grant from InternetNZ to study the .nz domain for malicious servers. We have increased the number of students working on honeypot technology and have written a number of papers and a KYE paper, so we generally met our goals.
 
Goals for the next year.
1. Continue to extend our chapter and develop students with honeynet knowledge.
2. Further develop our tools and integrate with third party tools as appropriate.
3. Continue the .nz domain scan (seeking funding). Share data as appropriate.
4. Apply for funding to support future R&D.
5. Maintain our research and development and develop linkages with other researchers.
 
MISC ACTIVITIES
None