Primary mentor: David Watson (UK)
Student: György Kohut
Project Overview:
The goal is to create a central, public, web-based malware information service built on continuous data collection from the Honeynet Project sensor network. The system will serve as a central repository for sensor-collected data, with mechanisms to enrich that data by invoking related services (dynamic malware analysis with CWSandbox/Anubis, virus scanning with VirusTotal, geo-IP lookup) and by capturing or generating statistics from the collection process and the collected data. The resulting data will be exposed through a rich web-based interface that aims to make a presumably large set of information easy to explore and to offer an overview of high-level trends.
Project Plan:
The general architecture of the system consists of three main components: a data collection back-end (1), a web front-end (2), and a database back-end (3).
Component 1 and 2 will be developed from scratch and represent the two main deliverables of the project.
Component 3 will be a current stable release of PostgreSQL, ideally PostgreSQL 9 or later, to leverage the new replication features for load balancing read-only queries from the web front-end or when running presumably expensive statistics queries.
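As a rough illustration of how read-only queries from a Django front-end could be spread over PostgreSQL replicas, the sketch below uses Django's database router hooks (`db_for_read`, `db_for_write`, `allow_relation`, `allow_migrate`). The aliases `default`, `replica1` and `replica2` are assumptions made for this example and would have to match entries in the `DATABASES` setting; the project itself does not specify a routing scheme.

```python
import random


class ReadReplicaRouter:
    """Send reads to a replica and writes to the primary.

    The aliases "default", "replica1" and "replica2" are hypothetical and
    would have to match entries in Django's DATABASES setting.
    """

    replicas = ["replica1", "replica2"]

    def db_for_read(self, model, **hints):
        # Pick any replica; all of them are expected to hold the same data.
        return random.choice(self.replicas)

    def db_for_write(self, model, **hints):
        # Only the primary accepts writes.
        return "default"

    def allow_relation(self, obj1, obj2, **hints):
        # Every alias points at a copy of the same data set.
        return True

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # Schema changes run on the primary and reach replicas via replication.
        return db == "default"
```

A router like this would be listed in the `DATABASE_ROUTERS` setting. Replicas can lag behind the primary, so a read issued right after a write may not see the new row.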
1. Data collection back-end
This component interfaces with the Honeynet Project sensor network and further processes the sensor-collected data, mostly by invoking external/third-party services. The resulting data is stored in the database back-end.
For minimum functionality, the following interfaces will be provided:
Honeynet:
Third party:
The component has a message-driven architecture. Sub-components responsible for processing or providing interfaces are largely self-contained and are connected to each other by message queues to form the component's workflow.
Generally, the workflow is triggered immediately by the arrival of a new Dionaea submission; however, the design should allow a fair amount of control and extensibility.
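The queue-connected workflow described above can be sketched in a few lines. The example uses Python's standard `queue` and `threading` modules purely to show the flow of messages between self-contained stages; it is not the planned JMS-based implementation, and it does not model durable messaging or transactions. The stage functions `add_geoip` and `add_scan_result` and the record fields are hypothetical placeholders.

```python
import queue
import threading

STOP = object()  # sentinel that tells a stage to shut down


def run_stage(handler, inbox, outbox):
    """Consume messages from inbox, apply handler, forward results to outbox."""
    while True:
        message = inbox.get()
        if message is STOP:
            outbox.put(STOP)  # propagate shutdown to the next stage
            break
        outbox.put(handler(message))


def add_geoip(submission):
    # Hypothetical enrichment step; a real one would call a geo-IP service.
    return {**submission, "country": "unknown"}


def add_scan_result(submission):
    # Hypothetical enrichment step; a real one would query a virus scanner.
    return {**submission, "scan": "pending"}


def process(submissions):
    """Push submissions through both stages and collect the enriched records."""
    q_in, q_mid, q_out = queue.Queue(), queue.Queue(), queue.Queue()
    workers = [
        threading.Thread(target=run_stage, args=(add_geoip, q_in, q_mid)),
        threading.Thread(target=run_stage, args=(add_scan_result, q_mid, q_out)),
    ]
    for worker in workers:
        worker.start()
    for submission in submissions:
        q_in.put(submission)  # arrival of a submission triggers the workflow
    q_in.put(STOP)
    results = []
    while True:
        item = q_out.get()
        if item is STOP:
            break
        results.append(item)
    for worker in workers:
        worker.join()
    return results
```

Because each stage only knows its inbox and outbox, a further enrichment step can be added by inserting one more queue and worker without touching the existing stages.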
The component will be implemented in Java, largely within the semantics of the JMS API. Durable messaging and distributed transaction support to coordinate message queue and database access will be used for robustness.
Which third-party components will be used for the implementation (most notably the JMS provider and the transaction manager) will be decided before coding starts. The Spring framework will probably be used to wire the parts together. Optionally, the GlassFish 3.1 application server could serve as the runtime environment, since it provides a JMS provider and a transaction manager out of the box.
2. Web front-end
This component presents the collected data and generated statistics through a rich HTML/JavaScript interface. This includes at least the following:
Generating statistics (and possibly visualizations) is considered to be part of this component. These operations are carried out in a batch-like manner and results are cached.
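One possible shape for batch-computed, cached statistics is shown below. It is a minimal sketch under stated assumptions: the `CachedStatistic` class, the `submissions_per_country` function, and the `country` field are invented for illustration, and the cache lives in process memory rather than in whatever store the project ends up using.

```python
import time
from collections import Counter


class CachedStatistic:
    """Recompute a statistic at most once per max_age seconds."""

    def __init__(self, compute, max_age, clock=time.monotonic):
        self._compute = compute      # callable that runs the expensive query
        self._max_age = max_age      # cache lifetime in seconds
        self._clock = clock          # injectable so tests can control time
        self._value = None
        self._computed_at = None

    def get(self):
        now = self._clock()
        stale = (
            self._computed_at is None
            or now - self._computed_at >= self._max_age
        )
        if stale:
            self._value = self._compute()
            self._computed_at = now
        return self._value


def submissions_per_country(records):
    """Count submissions by the (hypothetical) 'country' field."""
    return Counter(record["country"] for record in records)
```

A view would call `get()` on each request and pay for the underlying query only when the cached value has expired.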
The component will be implemented in Python using the Django framework on the back-end and the Ext JS library on the front-end.
The use of additional or alternative third-party components will be decided before implementation of this component starts (see the timeline).
Project Timeline:
Until May 23 (Community bonding period)
- planning
- evaluating/choosing/setting up tools/third party components/libraries
May 24 - July 10 (First interim period)
May 24 - June 12
- writing code for component 1
June 13 - June 19
- writing code for component 1
- begin to move gradually towards real world testing component 1
June 20 - June 26
- testing and patching of component 1
- writing documentation of component 1
June 27 - July 3
- testing and patching of component 1
- begin to gradually change focus to component 2
July 4 - July 10
- component 1 considered "finished for now"
- focus changed to component 2, begin to write code
July 11 - July 15 (Mid-term evaluation period)
- writing code for component 2
- mid-term evaluation
July 18 - August 15 (Second interim period)
July 18 - July 31
- writing code for component 2
August 1 - August 7
- writing code for component 2
- begin to move gradually towards real world testing component 2
August 8 - August 15
- testing and patching of component 2
- writing documentation of component 2
August 16 - August 21 ("Pencils down" Period)
- testing and patching of component 2
- general review of code and documentation
August 22 - Firm "pencils down" date
August 26 - Final evaluation deadline
Updates:
- Firm "pencils down" - final update (for now)
Of course, charts are not the only thing to show on the web front-end, but taken as a whole they are probably the hardest part, so I wanted to start with them and get them as complete as possible. I'm planning to keep working on the project after GSoC, to finish what is already there in some form and to add what was envisioned but didn't fit into the timeframe (more about that later in a separate post).
- August 15
- August 8
Planned for this week:
- July 25
Planned for this week:
- July 15 Mid-Term
Planned for next week:
- July 4
Planned for this week:
- June 27
Planned for this week:
- June 13
Planned for this week:
- June 6
Planned for this week:
Source Code
HonEeeBox development is still a work in progress, but you can find snapshots of the GSoC project code here: