Project 14 - Glastopf improvements

Student: Phani Vadrevu
Primary mentor: Jamie Riden
Backup mentor: Lukas Rist

Google Melange: http://www.google-melange.com/gsoc/project/google/gsoc2012/pvadrevu/16001

Project Overview:
The project aims to implement several ideas that will enhance the functionality of Glastopf (the new version will be called Glaspot v3) as a web application honeypot. Glaspot will become more autonomous because HTTP requests will be classified automatically, and new patterns will be extracted from the classified requests. Requests using the POST method will be handled. Forms and scripts will be added to attract and trap comment spammers and brute forcers. The PHP sandbox will be made more secure and fingerprint resistant. FTP attacks will also be analyzed.

Project Plan:

  • April 23rd - May 20th: Community Bonding Period
  • May 21st: GSoC 2012 coding officially starts
  • May 21st - May 28th:
    1. Set up an FTP honeypot (Part-5)
    2. Work on inet and bgp prefix datatype support (Part-4).
    3. Code the IP profiler module. (Part-1a)
    4. Begin the code for the clustering algorithms required for the scan classifier module (Part-1c)
  • June 2 - June 11:
    1. Write the Search engine filter module (Part-1b)
    2. Finish all the back end code for the classifier module (Part-1c)
    3. Design the scan classifier module (Part-1c)
  • June 11 - June 20:
    1. Code the scan classifier (Part-1c)
    2. Work on date/time datatype support (Part-4).
    3. Debug the 3 modules and test for issues (Part-1[abc])
  • June 21 - July 4:
    1. Code all the Request handling features to be added (Part-2)
    2. Discuss the results of FTP honeypot setup with the mentors (Part-5)
    3. Code changes for dealing with issues regarding trackability (Part-5)
  • July 5 - July 18:
    1. Code the changes to PHP sandbox (Part-4)
    2. Code scripts to extract patterns from generated clusters (Part-1c)
    3. Act on FTP honeypot discussion (Part-5)
    4. Analyze the results of the data collected in past 3 months
    5. July 9th - July 13th: Mid Term Assessments
  • July 19 - July 27:
    1. Test the functioning of extracted patterns (Part-1c)
    2. Code the PostgreSQL module (Part-4)
  • July 28 - August 7:
    1. Work on new features proposed as part of the above discussion
    2. Bug fixing and testing of all old and new modules
  • August 8 - August 13:
    1. Work on documentation issues. (Part-6)
    2. Code cleanup to suit PEP-8 requirements (Part-6)
  • August 13th: Suggested "pencils down" date, coding close to done
  • August 20th: Firm "pencils down" date, coding must be done
  • August 24th - August 27th: Final Assessments
  • August 31st: Public code uploaded and available to Google

Project Deliverables:

  1. Addition of modules for IP profiling, clustering of HTTP requests and generation of new patterns.
  2. Handling of POST requests for known attacks and addition of new HTML forms and parsing scripts for dealing with comment spammers and brute forcers.
  3. Implement a whitelist approach to secure the PHP sandbox. Randomize the data output by the functions to make it fingerprint resistant.
  4. Addition of custom SQLite functions for dealing with new data types (like IPv4 addresses, date/time fields etc) as may be deemed necessary by the other parts.
  5. Miscellaneous issues: Explore ways to make Glaspot installations untrackable. Also, run Glaspot in parallel with Kippo and Dionaea to trap automated scripts that might have been written to break into the FTP/SSH daemons of web servers.
  6. Diagnose and fix PEP8 issues in the entire code base.
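
Deliverable 4 can be illustrated with Python's standard `sqlite3` module, which lets an application register its own SQL functions. The following is only a minimal sketch under stated assumptions: the function names `ip_in_prefix` and `ip_to_int`, the `connect` helper, and the use of the `ipaddress` module are all hypothetical choices for illustration and are not taken from the Glaspot code base.

```python
import ipaddress
import sqlite3


def ip_in_prefix(address, prefix):
    """Return 1 if `address` lies inside the CIDR `prefix`, else 0.

    Hypothetical helper: malformed or NULL input simply yields 0.
    """
    try:
        network = ipaddress.ip_network(prefix, strict=False)
        return int(ipaddress.ip_address(address) in network)
    except ValueError:
        return 0


def ip_to_int(address):
    """Return the integer form of an IPv4 address, or None if it is invalid."""
    try:
        return int(ipaddress.IPv4Address(address))
    except ValueError:
        return None


def connect(path=":memory:"):
    """Open a SQLite connection with the two custom functions registered."""
    conn = sqlite3.connect(path)
    # create_function(name, number_of_arguments, python_callable)
    conn.create_function("ip_in_prefix", 2, ip_in_prefix)
    conn.create_function("ip_to_int", 1, ip_to_int)
    return conn
```

With such a connection, a query like `SELECT source_ip FROM events WHERE ip_in_prefix(source_ip, '192.0.2.0/24') ORDER BY ip_to_int(source_ip)` filters and sorts addresses numerically even though they are stored as plain text. The `events` table and `source_ip` column here are placeholders.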

Project Source Code Repository: svn://glastopf.org:9090/glaspot

Student Weekly Blog: https://www.honeynet.org/blog/349

Project Useful Links:

Project Updates:
Week 1 (21st May - 28th May):

  • Set up Glaspot with support for running behind Squid
  • Implemented the TRACE method
  • Experimented with k-means clustering on request data collected from hpfeeds
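
The k-means experiment mentioned above could look roughly like the following minimal sketch. The feature choices in `featurize` (path length, parameter count, URL-valued parameters, `../` markers) and all function names are illustrative assumptions, not the features or code used in Glaspot. The hpfeeds transport is left out entirely, and the initial centroids are simply the first k distinct points so that runs are repeatable.

```python
from urllib.parse import parse_qsl, urlsplit


def featurize(url):
    """Map a request URL to a small numeric vector (hypothetical features)."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    return (
        float(len(parts.path)),                                    # path length
        float(len(query)),                                         # number of parameters
        float(sum("http" in value.lower() for _, value in query)), # RFI-style values
        float(url.count("../")),                                   # traversal markers
    )


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iterations=50):
    """Plain k-means; returns (labels, centroids)."""
    # Deterministic initialisation: the first k distinct points.
    centroids = []
    for point in points:
        if point not in centroids:
            centroids.append(point)
        if len(centroids) == k:
            break
    if len(centroids) < k:
        raise ValueError("need at least k distinct points")

    labels = None
    for _ in range(max_iterations):
        # Assignment step: attach every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(point, centroids[c]))
            for point in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids
```

Calling `kmeans([featurize(u) for u in urls], k)` on a list of logged request URLs returns one cluster label per request. In practice the features would need scaling and a far richer representation before the clusters become useful for pattern extraction.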

Week 2 (29th May - 4th June):

  • Wrote the code for keeping track of the scans
  • Coded the IP profiler module and deployed it for testing

Week 3 (5th June - 11th June):
Done Last Week:

  • Worked on debugging issues in the IP profiler module

Planned for next week:

  • POST request handling
  • Work on changes to the IP profiler module according to feedback

Issues:

  • Sample target clusters should be determined for the pattern extraction module

Week 4 (12th June - 18th June):
Done Last Week:

  • Worked mainly on POST request handling
  • Set up RFI handling for POST requests
  • Set up HTML forms to capture login brute-force attacks and comment spam (this code is not committed to the main repo yet; I will wait for some results before doing that)

Planned for next week:

  • Focus mainly on improving the PHP sandbox: collect a few bot samples and make sure they can be run in the sandbox
  • Start working on the feedback received regarding the IP profiler module

Weeks 5, 6 (19th June - 2nd July):
Done:

  • Added black/whitelisting for the PHP Sandbox
  • Added a comment post emulator for reflecting back comments
  • Worked on the 2nd patch for the IP Profiler
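
A comment emulator that reflects submissions back to the visitor might be sketched as follows. This is an assumption-laden illustration rather than the Glaspot implementation: the `CommentStore` class, its per-IP bucketing, and the limits `max_per_ip` and `max_length` are all hypothetical. The one fixed point is that the attacker-supplied text is escaped with the standard library's `html.escape` before it is written back into a page.

```python
import html
from collections import defaultdict


class CommentStore:
    """Keeps recent comments per source IP and renders them as safe HTML."""

    def __init__(self, max_per_ip=10, max_length=500):
        if max_per_ip < 1:
            raise ValueError("max_per_ip must be at least 1")
        self.max_per_ip = max_per_ip
        self.max_length = max_length
        self._comments = defaultdict(list)

    def add(self, source_ip, raw_comment):
        """Store a truncated copy of the comment for this IP."""
        bucket = self._comments[source_ip]
        bucket.append(raw_comment[: self.max_length])
        # Keep only the newest max_per_ip entries.
        del bucket[: -self.max_per_ip]

    def render(self, source_ip):
        """Return an HTML list holding this IP's comments, escaped."""
        items = "".join(
            "<li>{}</li>".format(html.escape(comment))
            for comment in self._comments.get(source_ip, [])
        )
        return "<ul>{}</ul>".format(items)
```

Storing the raw text and escaping only at render time keeps the original payload available for later analysis while the reflected page stays inert.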

Planned for next week:

  • Next week, I am going to test the code and wrap up the work on the IP profiler module.

Issues:

  • Indexing of the website's links by Google is slow, although the crawl rate is good. As none of the RFI dorks have been indexed yet, I have not seen any RFI attacks so far. Comment spam has also been slow to arrive.

Week 7 (3rd July - 9th July):
Done:

  • Continued to work on modifying the IP profiler module. Committed some code, but progress is slower than expected. I will try to finish this part this week.

Planned for next week:

  • Port the scans table into a Python data structure and do the generic computation there.
  • Implement the IP profiler for other SQL-based DBs as well.
  • Sanitize the comments received and implement IP-specific comment handling

Week 8 (10th July - 16th July):
Done:

  • Completed the coding of the IP profiler module.
  • Deployed the profiler module and tested it. Fixed some bugs in the module.

To be done:

  • Implement the IP profiler for other SQL DBs
  • Integrate IP-specific comment handling with the profiler module

Weeks 9, 10 (17th July - 30th July):
Done:

  • Implemented IP-based comment handling

Planned for next week:

  • Port APD to the next Zend version
  • Implement the IP profiler for other SQL DBs

Week 11 (31st July - 6th August):
Done:

  • Worked on porting APD to Zend 2.4, but failed to achieve any results