Project 7 - Network malware simulation

Student: Jing Conan Wang
Primary mentor: Hugo Gonzalez
Backup mentor: Jianwei Zhuge

Google Melange: http://www.google-melange.com/gsoc/project/google/gsoc2012/jingconanwang/17001

Project Overview:
The software will support both simulation and emulation. Emulation will be the focus, but we will design the structure to be general enough to support simulation as well. Support for GNS (Graphical Network Simulator) will be included. We will focus on two types of malware behaviour: botnets and worms.

Project Plan:
My planned timeline is as follows:
April 23rd - May 20th: Community Bonding Period. Literature review of malware network behaviour; get familiar with ns3.
May 21st: GSoC 2012 coding officially starts
May 21 - June 5: Design the general structure of the simulator.
June 5 - July 8: Implement the botnet part. By July 8, we should have working software that can both simulate and emulate botnet behaviour, have run some experiments with it, and have finished a report on the results.
June 5 - June 20: Work on the simulation part
June 21 - June 30: Work on the emulation part
July 1 - July 8: Write testbed and get some results
July 9th - July 13th: Mid Term Assessments
July 14th - August 12th: Implement the worm part. By August 12th, most of the software should be finished, and it should also be able to simulate and emulate worm behaviour. We should have run some worm experiments with it and finished a report.
July 14 - July 25: simulation part
July 25 - Aug 5: emulation part
Aug 5 - Aug 13: write testbed and do some experiments
August 13th: Suggested "pencils down" date, coding close to done
August 14 - August 19: refactoring the code.
August 20th: Firm "pencils down" date, coding must be done
August 24th - August 27th: Final Assessments
August 31st - Public code uploaded and available to Google

Project Deliverables:
By the end of the summer, we should have software that can: (1) simulate the attack traffic generated by a botnet; (2) simulate the behaviour of worm infection in the network. From May 21st to the Mid Term Assessments I will focus on botnets, and after that I will focus on worms.

Project Source Code Repository:
hg clone https://bitbucket.org/hbhzwj/imalse

Student Weekly Blog: https://www.honeynet.or/blog/343

Project Useful Links:
http://people.bu.edu/wangjing/open-source/sadit/html/index.html

Project Updates:
For the rich-format version, please go to:
https://docs.google.com/document/d/1oyS2StrPvVs0_cpiuXh40e2lUtpkFCQRsLGsVUANBr0/edit

May 28th
Done last week:
This week I focused on the design of the software framework.

  • Downloaded NS3 and experimented with it.
  • Learned the NS3 emulation module.
  • Researched the taxonomy of computer security incidents; read the paper “A Common Language for Computer Security Incidents” by John D. Howard and Thomas A. Longstaff.
  • General description of attacks: read articles about the STAT language, a domain-independent language for describing computer attacks.
  • Designed the software framework: https://docs.google.com/document/d/12KNDSM4FXamSlDqqquYxkVNvPBU4TLJzIPeLh9xbb00/edit

Issues:
Need to define a socket adaptation layer that unifies simulated and real attacks, so that the same attack code can run in both modes.
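
One way to picture such an adaptation layer is sketched below. It is only an illustration under assumptions: the class names (SocketAdapter, RealSocketAdapter, SimSocketAdapter) and the helper send_command are hypothetical and not taken from the imalse code, and an in-memory queue stands in for a simulator socket.

```python
import socket
from abc import ABC, abstractmethod
from collections import deque


class SocketAdapter(ABC):
    """Hypothetical common interface that attack code would program against."""

    @abstractmethod
    def send(self, data):
        ...

    @abstractmethod
    def recv(self, bufsize):
        ...


class RealSocketAdapter(SocketAdapter):
    """Emulation side: forwards calls to an operating-system socket."""

    def __init__(self, sock):
        self._sock = sock

    def send(self, data):
        self._sock.sendall(data)
        return len(data)

    def recv(self, bufsize):
        return self._sock.recv(bufsize)


class SimSocketAdapter(SocketAdapter):
    """Simulation side: in-memory queues stand in for a simulator socket."""

    def __init__(self, inbox, outbox):
        self._inbox = inbox
        self._outbox = outbox

    def send(self, data):
        self._outbox.append(bytes(data))
        return len(data)

    def recv(self, bufsize):
        if not self._inbox:
            return b""
        data = self._inbox.popleft()
        if len(data) > bufsize:
            # Keep the unread remainder for the next call.
            self._inbox.appendleft(data[bufsize:])
        return data[:bufsize]


def send_command(adapter, command):
    """Attack logic only sees the adapter, never the concrete socket type."""
    return adapter.send(command.encode("ascii"))
```

With this shape, send_command(adapter, "ping") behaves the same whether adapter wraps a real socket or the simulated stand-in.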

Plan for next week:

  • Finish the design of the software.
  • Find out whether more SADIT code can be reused.
  • Decide whether to still use NS3 for the simulation part.

June 3rd
Done last week:

  • Specify the server command set and client command set for a simple attack
  • Design the modules in the software
  • Write sample scripts for simulation & emulation

Plan for next week:

  • Finish a simple version of the code as a proof of concept.

June 10th
Done in last week:

  • Implemented a basic botnet framework.
  • The finite state machine can be exported to a PNG file.
  • Implemented a simple attack as a proof of concept: all compromised computers continuously print the message “I am the master, I have controlled your”.

Issues:

  • The structure of the botnet framework needs to be clearer to make it more flexible.

Plan for the next week:

  • Finish a more mature version of the botnet framework.
  • Implement a simple version of a DDoS attack as a proof of concept.

For more information about this update, please visit the Google Docs page:
https://docs.google.com/document/d/1oyS2StrPvVs0_cpiuXh40e2lUtpkFCQRsLGsVUANBr0/edit

June 18th

Done Last Week:

  • Improved the structure of the botnet framework.
  • Implemented a simple ddos_flooding_attack as a proof of concept.
  • Tried to add support for fs, a network simulator developed by Joel Sommers.

Issues:

  • Adding support for fs failed. fs is not actively maintained. The only feasible way to interact with fs is to write the configuration in advance; however, many attacks cannot be implemented this way because they require dynamic interaction.

Plan for Next Week:

  • Add support for NS3.
  • Run a simple ddos_flooding_attack under NS3 as a proof of concept.

June 24th
Done last week:

  • Searched for ways to integrate NS3. Found two open source projects that may be useful: netns3 (http://www.nsnam.org/wiki/index.php/HOWTO_use_Linux_namespaces_with_ns-3) and CORE (http://cs.itd.nrl.navy.mil/work/core/).
  • Redesigned the code to run under netns3.

Plan for the Next Week:

  • Set up a demo simulation under ns3.

July 2nd
Done last week:

  • Added NS3 (network simulator) simulation support to imalse.
  • Finished a ddos_ping_flooding attack as a proof of concept under NS3.
  • Used Linux namespaces as virtual machines. The virtual machines can be launched automatically, which makes the simulation more flexible.
  • The simulation can be hybrid, meaning that some nodes are simulated while others are real. This support is still preliminary.

Plan for next week:

  • Complete the work on hybrid simulation, in which some nodes are simulated through ns3, some are virtual machines, and some are real computers.
  • Clean up the code for the mid-term assessment.

July 8th
Done last week:

  • The previous version could only work on a CSMA network. This week I added the ability to load a topology file, which makes it possible to simulate large-scale networks. The program can now load topology files generated by the Inet Internet topology generator. Support for the Orbis and Rocketfuel topology generators will be added later. Pcap files can be generated from any node in the network.
  • Improved the hybrid simulation of NS3 simulated nodes and real nodes. Implemented some APIs for simulated nodes in NS3.
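
As a rough illustration of the topology loading step, the sketch below parses an Inet-style file in plain Python. It assumes the layout read by the ns-3 Inet topology reader: a first line with the node and link counts, then one line per node (id, x, y), then one line per link (from, to, weight). The function name parse_inet is hypothetical, and this is not the reader implemented in imalse.

```python
def parse_inet(text):
    """Parse Inet-style topology text into (nodes, links).

    Assumed layout: "<n_nodes> <n_links>", then n_nodes lines "id x y",
    then n_links lines "from to weight".
    """
    rows = [line.split() for line in text.splitlines() if line.strip()]
    n_nodes, n_links = int(rows[0][0]), int(rows[0][1])

    nodes = {}
    for fields in rows[1:1 + n_nodes]:
        # Node id mapped to its (x, y) coordinates.
        nodes[fields[0]] = (float(fields[1]), float(fields[2]))

    links = []
    for fields in rows[1 + n_nodes:1 + n_nodes + n_links]:
        # Keep the weight as text; its meaning depends on the generator.
        links.append((fields[0], fields[1], fields[2]))
    return nodes, links
```

The returned link list can then drive the creation of one point-to-point connection per entry in whatever simulator API is used.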

Issues:

  • Although the C++ version of NS3 is good, its Python bindings are buggy, and the binding tool used by NS3 (pybindgen) lacks documentation. I am frustrated by this, as I have wasted a lot of time working around NS3 bugs. For example, a bug in the Python binding of the topology reader makes that module unusable, so I had to implement it myself in Python. A bug in the Python API of CreateObject makes it very hard to look up the IP address of a specific node, so I had to implement a hack version of the lookup function myself.

Plan for next week:

  • Find a way to circumvent the bugs in the Python bindings of NS3. The current problem is caused by pybindgen's poor STL support. A possible solution is to use boost::python to generate Python bindings for some modules.

July 15

Done last week
Last week I focused on cleaning up the code and solving some issues left over
from implementing ns3 simulation support.
In the end I decided to keep using the native Python bindings of ns3, for two
reasons. First, none of the other tools is good enough to convince me that it
would not suffer from similar problems once applied extensively.
Second, generating Python bindings for ns3 (at least for the modules I need
to use) is such a large project that it is beyond the scope of Google Summer
of Code.

This choice is not perfect, though. I am still hampered by the lack of several
APIs for manipulating packets. Because of this, a simulated node cannot
manipulate packet content directly, which makes pure simulation even harder
than the hybrid approach. This
problem doesn't exist for real nodes, as I can use the standard Python library
to manipulate packets. Fortunately, a simulated node can add
arbitrary zero padding to packets. I took advantage of this and implemented
a mapping from each message to a number of padding zeros. I implemented an
experiment with a small simulated network with static routing, in which all
nodes are simulated and the attack can be initiated successfully.
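
The padding trick can be sketched as follows. This is a minimal illustration, not the imalse implementation: the command names in COMMANDS are hypothetical, and the only idea taken from the text is that the length of an all-zero payload identifies the message.

```python
# Hypothetical command set; the real imalse commands may differ.
COMMANDS = ["connect_ack", "send_ping", "stop_ping"]


def encode(message):
    """Represent a message as N zero bytes, where N is its 1-based index."""
    return b"\x00" * (COMMANDS.index(message) + 1)


def decode(payload):
    """Recover the message from the number of zero bytes received."""
    if not payload or any(payload) or len(payload) > len(COMMANDS):
        raise ValueError("payload is not a valid zero-padding message")
    return COMMANDS[len(payload) - 1]
```

Because only the payload length carries information, both ends must share the same ordered command list, and no parameters can travel with a command.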

The structure of the code has changed: the experiments have been pulled
up to the top directory, and the finite state machine description has been
abandoned. The original goal was to let users write several FSM description
sentences to describe a scenario. In practice, however, I found this method
neither efficient nor clear. Users can now subclass several
descriptive classes to implement new scenarios.
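
To give a feel for the subclassing approach, here is a small sketch. The class names (ClientDescription, PingFloodClient) and the on_<command> naming convention are assumptions made for illustration; the actual descriptive classes in imalse may look quite different.

```python
class ClientDescription:
    """Hypothetical base class: routes a received command name to a method."""

    def dispatch(self, command, **params):
        handler = getattr(self, "on_" + command, None)
        if handler is None:
            raise ValueError("unknown command: %s" % command)
        return handler(**params)


class PingFloodClient(ClientDescription):
    """A new scenario is defined by adding on_<command> methods."""

    def on_send_ping(self, target, count=3):
        # A real bot would emit ICMP echo requests here; the sketch only
        # returns a description of what it would do.
        return ["ping %s" % target] * count
```

A scenario author would then write only the subclass, while command routing stays in the base class.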

Plan for Next Week
Until now I have tested the framework using only the ddos_ping_flooding
scenario. As the framework is becoming more mature and both emulation and ns3
simulation support have been added, it's time to implement more attack
scenarios. Next week I will still spend two days or so cleaning up and
refactoring the framework code, and then start implementing another
attack scenario.

Issues:
Although simulated nodes can send messages to each other through the mapping
between messages and padding zeros, real nodes still cannot receive messages
from simulated nodes. Extra hack code needs to be added to the real nodes to
translate the mapped information; this shouldn't be hard, though.

July 22

Done last week
This week I implemented the attack scenario “file_exfiltration” to demonstrate the usability of imalse. When the botmaster issues a file_exfiltration command to the server, the server sends the exfiltration command to the corresponding bots. After receiving the command, the bots search the file system for files matching a specific pattern. The botmaster can also specify the type of files. For example, the botmaster can ask bots to search for any .txt file that contains the pattern “assword”. The bots then upload those files to an FTP server whose address and password are also specified by the botmaster's commands.
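
The search step of this scenario could look roughly like the sketch below. The function name find_files and its parameters are hypothetical, and the upload step is left out; it only shows walking a directory tree and keeping files of a given type whose content contains a pattern.

```python
import fnmatch
import os


def find_files(root, name_glob="*.txt", pattern="assword"):
    """Return sorted paths under root whose name matches name_glob and
    whose text content contains pattern."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in fnmatch.filter(filenames, name_glob):
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", errors="ignore") as handle:
                    if pattern in handle.read():
                        matches.append(path)
            except OSError:
                # Unreadable files are skipped rather than aborting the scan.
                continue
    return sorted(matches)
```

In the scenario as described, the resulting list would then be handed to the upload step; Python's standard ftplib module is one possible tool for that part.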

Issues:
The scenario works perfectly in emulation mode and under netns3. However, there is still a small bug under ns3: the bots cannot receive the last command.

Plan for next week:

  • Extract some pcap files from the emulation and simulation of the two existing scenarios, “ddos_ping_flooding” and “file_exfiltration”.
  • Implement another attack scenario.

July 30
Unfortunately, I have started to suffer from an eye illness and cannot stay in front of a computer for long. This will delay the progress of the project; I will try my best to limit its impact. Fortunately, the hardest part (design and implementation of the framework) has been finished.
Done last week:

  • Fixed some bugs, including a bug in the topology reader I implemented that produced a wrong topology, and a bug in updating the routing table.
  • Added visualization support for pure ns3 simulation through a new option, --SimulatorImpl. The available values are ['Realtime', 'Default', 'Visual']. When Visual is selected, the program tries to run the ns3 visualizer to visualize the result.
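
A command-line option of this kind can be sketched with argparse as below. The option name and its three values come from the text; the default value, the helper names, and the mapping to ns-3 simulator type names are assumptions about how such an option might be wired up, not a description of the imalse code.

```python
import argparse

SIMULATOR_IMPLS = ["Realtime", "Default", "Visual"]


def build_parser():
    parser = argparse.ArgumentParser(description="simulation options (sketch)")
    parser.add_argument(
        "--SimulatorImpl",
        choices=SIMULATOR_IMPLS,
        default="Default",  # assumed default, not confirmed by the source
        help="simulator implementation to use",
    )
    return parser


def simulator_type_name(choice):
    """Assumed mapping from the option value to an ns-3 type name."""
    return "ns3::%sSimulatorImpl" % choice
```

For example, build_parser().parse_args(["--SimulatorImpl", "Visual"]) yields a namespace whose SimulatorImpl attribute is "Visual", and any value outside the three choices is rejected by argparse.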

Issues:

  • Since the ns3 visualizer doesn’t support real nodes, only pure simulation can be visualized.
  • Again because some important APIs are missing from the Python bindings of ns3, pure simulated nodes can communicate with real nodes using only a very limited set of messages.

Plan for next week:

  • Extract some pcap files from the simulation.
  • Clean up the code and write documentation.

Aug. 6
Done last week:

  • This week I wrote the help document for the project and created a video demo. The video is available at http://www.youtube.com/watch?v=CZ91McFlIvo&feature=plcp

Plan for the next week:

  • Several bugs still need to be solved. The most urgent one is the ns3 packet manipulation bug: mapping messages to a number of padding zeros is an ugly workaround. I am considering implementing the header manipulation part in C++.
  • Add a convenient method for collecting pcap data.

Aug 12:
Done last week:

  • Added a help document for installation on Ubuntu. The help document is now available at http://people.bu.edu/wangjing/open-source/imalse/html/index.html
  • Fixed the long-standing bug in the NS3 Python bindings for manipulating packets. I added a new imalse module to NS3 to handle imalse packet manipulation.
  • Added ManualExperiment, which can read its configuration from a settings file. Users can specify the IP address of each interface, as well as the delay and rate of each link. This makes it much easier to integrate with the GUI topology editor in CORE.

Plan for the next week:

  • Add a trace system to imalse.
  • CORE has a very mature GUI topology editor. Users can edit the topology with the mouse and set IP addresses using menus. I can add imalse support to the GUI to make it much easier for users to create experiments.
Attachment: simulate.py_001.png (47.94 KB)