RoT-1 (aka the Texas Honeynet Project) is a young chapter; we were officially formed in January 2011. That said, our founding members have been active in The Honeynet Project since its inception in the late 1990s, so we’re not as young as it may seem.
Our rough center of mass and primary meeting point is Austin, TX, though we also have members in Houston and a few new recruits from the Dallas-Fort Worth area.
Our group was formed as an invitation-only chapter; however, we encourage people near our bases of operation (Austin, Houston, DFW) to reach out to us, build a reputation and trust, and solicit their own invitation. Anyone interested in contributing to our ongoing projects is welcome to contact us as well, and we’ll review these requests on a case-by-case basis, based on trust and merit.
The following is the current list of active members and their focus areas:
Our first deployment, whose development was sponsored and funded by Praetorian, was of the Scalable Tailored Application Analysis Framework (STAAF), developed by Ryan W Smith and Adam Pridgen. The goal of this framework was to provide large-scale static analysis of Android applications, yielding high-level analytics, statistics, and patterns. Our initial data processing was completed in May 2011, and consisted of the static analysis of over 50,000 applications from both the official Android Marketplace and third-party marketplaces. We were able to extract data such as manifest values, permissions, receivers, interfaces, Dex bytecode, methods implemented, methods called, objects defined, control flow graphs, and URLs contacted. Many of the modules that extract these values use one or more third-party tools, which were integrated into our modular framework.
We were able to provide a high-level picture of certain attributes, such as permissions requested and libraries used. However, it became clear that computing the more complex aggregate information we intend to produce would require us to address certain scalability and data management issues. We are currently rewriting STAAF to be more modular, to use aggressive parallelization (including an EC2 deployment), to further reduce processing and data redundancies, and to use a much more scalable, distributed database implementation. We plan to complete and release this new framework under the Apache 2 license later this year.
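To illustrate the kind of high-level picture described above, the following is a minimal Python sketch that counts how many applications request each permission. It is not STAAF code: it assumes each application's manifest has already been decoded to plain XML text by one of the third-party tools, and the function names `requested_permissions` and `permission_frequencies` are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# XML namespace used for android:* attributes in AndroidManifest.xml.
ANDROID_NS = "http://schemas.android.com/apk/res/android"


def requested_permissions(manifest_xml):
    """Return the permission names requested in one decoded manifest.

    Hypothetical helper: expects the manifest as plain XML text, not the
    binary XML stored inside the APK.
    """
    root = ET.fromstring(manifest_xml)
    name_attr = "{%s}name" % ANDROID_NS
    return [
        element.get(name_attr)
        for element in root.iter("uses-permission")
        if element.get(name_attr)
    ]


def permission_frequencies(manifests):
    """Count, for each permission, how many applications request it."""
    counts = Counter()
    for manifest_xml in manifests:
        # A set ensures each application is counted at most once per permission.
        counts.update(set(requested_permissions(manifest_xml)))
    return counts
```

Calling `permission_frequencies` on an iterable of manifest strings returns a `Counter`, so `most_common(10)` would list the ten most frequently requested permissions in the corpus.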
STAAF is designed to allow large-scale distributed Android application analysis, and achieves this through aggressive parallelization of analysis tasks, de-duplication of processing effort and stored data, and efficient data storage and recall. Because applications can be processed independently of each other, we are able to distribute the processing tasks for each application, which showed a marked improvement over serialized application analysis. Additionally, rather than feeding every individual analysis tool the raw application, we extract and process the required resources once, and then feed the processed results into the analysis tools that require them. Furthermore, certain parts of an application, such as library code (e.g. advertising libraries) and certain resources, are often reused between applications. Rather than analyzing these shared resources again for each application that includes them, many tasks can ignore them, significantly cutting down on redundant data processing. Finally, we have designed the system around a distributed NoSQL database. This database design provides low-latency storage and recall, and also allows us to transparently include additional remote third-party analysis databases for collaborative analysis and data sharing.
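The extract-once and de-duplication ideas above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the STAAF implementation: applications are modeled as dictionaries of already-unpacked resources, shared resources are detected by SHA-256 content hash (the actual matching strategy is not described here), and `extract_resources`, `analyze_resource`, and `analyze_corpus` are hypothetical names.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def extract_resources(app):
    """Stand-in for unpacking an application once: resource name -> bytes."""
    return app["resources"]


def analyze_resource(data):
    """Stand-in for an analysis tool; here it only records the size."""
    return {"size": len(data)}


def analyze_corpus(apps, workers=4):
    """Analyze every distinct resource in the corpus exactly once."""
    # Applications are independent, so extraction can run in parallel.
    # Threads keep the sketch portable; CPU-bound tools would more likely
    # run in separate processes or on separate hosts.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        extracted = list(pool.map(extract_resources, apps))

    # De-duplicate by content hash: a library bundled in many applications
    # is stored under a single key.
    unique = {}
    per_app = {}
    for app, resources in zip(apps, extracted):
        keys = {}
        for name, data in resources.items():
            key = hashlib.sha256(data).hexdigest()
            unique.setdefault(key, data)
            keys[name] = key
        per_app[app["name"]] = keys

    # Run the analysis once per distinct resource, again in parallel.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        analyses = pool.map(analyze_resource, unique.values())
        results = dict(zip(unique.keys(), analyses))
    return per_app, results
```

Here `per_app` maps each application to the content hashes of its resources and `results` holds one analysis record per distinct hash, so two applications that bundle the same advertising library share a single result entry.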
v0.2 - Available upon request, on a case-by-case basis. Note that v0.2 is a Python implementation and does not provide the distribution or scalability enhancements.
v0.3 - Will be available later this year under the Apache 2 license, and will include all the distributed and scalability features listed above.
STAAF is a modular framework of analysis tools, and leverages many other open source Android analysis tools, such as androguard, apktool, baksmali, and axmlprinter2. These tools provide the fundamental Android data parsing and interpretation, which we then use to produce higher-level data and results. Much credit goes to Anthony Desnos of the Androguard Project for his quick feedback and new features.
STAAF is designed to allow the integration of new “native” tools to extract new features or data very quickly and easily. That said, we are interested in collaborating with and integrating any tools that do data analysis or reverse engineering on Android applications.
We also plan to extend our framework to other mobile platforms, so we would appreciate any expertise, experience, or tools for iPhone, Windows Phone 7, BlackBerry, or even Chrome apps or browser extensions.
We are also in need of a free or low-cost cloud solution, or simply resources for hosting a large number of virtual machines to test and deploy the STAAF framework and future distributed data analysis projects. We are currently using EC2, but our servers/services are generic Java services on each host, and we’re using a multi-instance, multi-database CouchDB solution for the database, so it’s generic enough to move out of EC2 with no modifications.
Full details can be found at http://www.honeynet.org/gsoc/slot6. This project is being mentored by Ryan W Smith and implemented by student Cong Zheng. The intent of the project is to produce a free, context-aware GUI for Android analysis. Cong has made great progress so far and has implemented modules for APK information browsing, module/method context menus, a contextual Smali code view, and a contextual CFG view. After the midterm he plans to implement features such as notes, annotations, and contextual jumps between the CFG and Smali code, as well as the ability to save and share analysis notes and views.
We would like to organize and lead a KYE paper on Android malware and Android analysis. I know there are plenty of other members with different perspectives and expertise, so I’d like to recommend that we pool our knowledge, pick some focus areas, and release a KYE from our collective experience.
Because this is our first year, we can’t judge past performance. That said, for our first year (2011), our goals are: