Code-Red: An analytic model of its spread

From: Stuart Staniford (stuartat_private)
Date: Fri Jul 20 2001 - 00:48:30 PDT

  • Next message: kawaii: "Re: Jetdirect card Attack???--Followup"

    I have worked up a tentative quantitative theory of what happened with the
    spread of the Code-Red worm on Thursday July 17th (yesterday, for purposes of
    this email).  Although I can't prove it, my theory seems to explain the limited
    data I have fairly well, and to support a slightly novel conclusion.  I'd really
    appreciate getting more complete data from anyone who has it.  This is not
    exactly peer-reviewed science at this point, but rather late night, hasty math
    which hopefully won't turn out to be too wrong in the morning.
    
    What I believe happened is that sometime yesterday morning (US time), someone
    released a new version of Code-Red.  At a minimum, the new version had a
    corrected random number generator and seeding algorithm.  The new version spread
    very rapidly until almost all vulnerable IIS servers on the Internet were
    compromised.  It stopped trying to spread at midnight UTC (5pm in California
    where I am).  I do not know in what other ways the new version might have been
    different from the earlier one.
    
    Background.  
    
    Code-Red was first seen in the wild on July 13th, according to Ryan Permeh and
    Marc Maiffret of Eeye Digital Security who analyzed it at
    
    http://www.eeye.com/html/Research/Advisories/AL20010717.html and 
    http://www.eeye.com/html/advisories/codered.zip
    
    According to Eeye, Code-Red spreads by compromising Microsoft IIS web servers
    using the .ida vulnerability discovered also by Eeye and published June 18th:
    
    http://www.eeye.com/html/Research/Advisories/AD20010618.html
    
    Once it has infected a host, Code-Red spreads by launching 99 threads which
    generate random IP addresses, and then try to compromise that IP address using
    the same vulnerability.  A hundredth thread defaces the web server in some
    cases.
    
    However, the worm analyzed by Eeye has what seems like a bug.  The random number
    generator is initialized with a fixed seed, so that all copies of the worm in
    all threads, on all hosts, will attempt to generate and compromise exactly the
    same sequence of IP addresses.
    
    The spread data.
    
    The first thing that caught my attention was that our web server at Silicon
    Defense only showed Code-Red attempts against it on July 19th.  There were no
    earlier attempts, but there were over 20 attempts on the 19th.  Several other
    correspondents reported a very similar pattern.
    
    Next, I found the following data from Ken Eichmann of cas.org 
    at http://www.incidents.org/diary/diary.php.
     
    		   Hour    # Code Red Worm Scans     # Scanning During the Hour
                        ------  ---------------------   -------------------------
                         00           12699                     2450
                         01           13059                     2577
                         02           13272                     2590
                         03           13056                     2564
                         04           13283                     2632
                         05           13229                     2612
                         06           13554                     2601
                         07           13517                     2608
                         08           13746                     2685
                         09           16819                     3325
                         10           36589                     7838
                         11          116083                    26823
                         12          295348                    68085
                         13          466542                   103522
                         14          520973                   113451
                         15          513513                   115124
                         16          513894                    90931 
    
    These first column is the hour of the day (I believe in Columbus, Ohio).  The
    second is the number of port 80 probes, and the third is the number of
    independent source IPs giving rise to those probes.  In the Excel spreadsheet
    attached to this mail, there is a graph which shows the probe count in dark
    blue, and the IP count in pink. 
    
    I will now develop two simple theories of the worm spread.  The first, which
    I'll call the Random Spread (or RS) theory assumes that the worm had a proper
    random number generator that was seeded with basically random information.  The
    second, which I'll call the Constrained Spread (or CS) theory, assumes that the
    seeding works as Eeye describes - every copy of the worm starts with the same
    fixed seed and the same random number generation algorithm.  The next part of
    the discussion assumes a little calculus knowledge, but you can probably skip
    over that part with limited loss of comprehension.  I'll subsequently apply this
    theory to the data above and argue that the RS theory can explain it, but the CS
    theory cannot.
    
    I'll start with the RS theory.  I define the following parameters:
    
    N is the total number of vulnerable IIS servers which can be potentially
    compromised from the Internet.  (I assume N is fixed - ignoring both patching of
    systems during the worm spread and normal deploying and removing of systems.  I
    also ignore any spread of the worm behind firewalls on private Intranets).
    
    K is the initial compromise rate.  That is, the number of vulnerable hosts which
    an infected host can find and compromise per hour at the start of the incident
    when few other hosts are compromised.  I assume that K is a global constant, and
    does not depend on the processor speed, network connection, location, etc of the
    infected machine.  (Constant K is an approximation).  I assume that a
    compromised machine picks other machines to attack completely at random.  I
    assume that once a machine is compromised, it cannot be compromised again, or
    that if it is, that does not increase the rate at which it can find and attack
    new systems.  I assume that once it is compromised, it stays that way.
    
    T is a time which fixes when the incident happens.
    
    I then have the following variables:
    
    a is the proportion of vulnerable machines which have been compromised.
    
    t is the time (in hours).
    
    Now, I analyze the problem by assuming that at some particular time t, a
    proportion of the machines a have been compromised, and then asking how many
    more machines N*da, will get compromised in the next amount of time dt.  The
    answer is
    
    N * da = a N K dt (1-a)
    
    The reason is that the number of machines compromised in the next increment of
    time is proportional to the number of machines already compromised (N*a), times
    the rate at which a compromised machine can compromise new machines K (1-a),
    times the increment of time dt.  (Note that machines can compromise K others per
    unit time to begin with, but only K*(1-a) once a proportion of other machines
    are compromised already.
    
    This give us the differential equation
    
    da/dt = K*a*(1-a)
    
    Bringing to bear my extremely rusty mathematical skills, I found that the
    solution to that equation is
    
    a = Exp(K * (t-T))/ (1 + Exp(K * (t-T)))
    
    where T is a constant of integration that fixes the time position of the
    incident.
    
    This is an interesting equation.  For early t (significantly before T), a grows
    exponentially.  For large t (significantly after T), a goes to 1 (all machines
    are compromised.  The rate at which this happens depends only on K (the rate at
    which one machine can compromise others), and not at all on the number of
    machines.  This is interesting because it tells us that a worm like this can
    compromise all vulnerable machines on the Internet fairly fast.
    
    In the attached spreadsheet, the yellow curve in the chart is a manually made
    fit to the cas.org scan data using the RS model above.  (I assume that the
    number of probes is directly proportional to the fraction of machines
    compromised).  Inspection of the graph will show that the fit is really quite
    nice.  So the RS model can explain the data well.  Basically, I fit it to fall
    slightly below the data curve, since it seems there is a fixed background rate
    of web probes that was going on before the rapid rise due to the worm spread.
    
    It uses a K of 1.8 - at the outset, a compromised IIS server was able to find
    and compromise 1.8 other servers per hour.
    
    You will notice an important implication of using this model to explain the
    data.  The probe level due to the theoretical worm was negligible in the early
    hours of the morning (see column D of the spreadsheet).  Therefore, if the RS
    model is reasonably valid, the worm that caused this rapid upsurge in traffic
    *cannot* have been a worm that was running around for days.  It must have been
    released in the morning of the 17th.
    
    Ok, now turning to the CS theory (constrained spread, as according to the Eeye
    disassembly of the worm).  If indeed it's the case that all copies of the worm
    use the same seed and RNG, then all copies of the worm attempt to compromise
    exactly the same sequence of IP addresses.  If we assume that all copies work at
    the same rate (a simplifying assumption we made in the RS analysis), then no
    copy will ever catch up with the first instance of the worm.  In practice, other
    copies may get a little ahead of the first one, but basically all copies are
    working through the exact same list in parallel so progress will be linear to
    begin with.
    
    The rate at which such a worm will be able to compromise systems will be K
    (1-a), total.  Since K is likely to be very small indeed compared to N, a will
    stay negligible for a very long time, and the rise in the number of systems will
    be very slow and linear.  I did not attempt to fit such a model to the data,
    because it seems obviously hopeless.  If Eeye has understood the worm they
    analyzed correctly, and I have understood their analysis, I don't believe there
    is any way that worm can have caused the rapid rise in probes we saw yesterday.
    
    Conclusion:
    
    If I'm rights, this was (and probably still is) a pretty scary worm.  Within a
    few hours of launch, it gained control of almost all vulnerable IIS servers
    (which was probably a sizeable fraction of IIS servers).  That degree of control
    could have been used to do enormous damage in any large number of ways.  It also
    illustrates the incredible power that worms have to rapidly gain control of huge
    numbers of machines on the Internet.
    
    Thanks:
    
    My thanks to my coworker Roel Jonkman, who sat up late in the night with me and
    pored over Eeye's dissassembled code.  Thanks to Laura Tinnel of Teknowledge,
    and Russell Fulton of the University of Auckland for helpful reports of their
    experience of the worm.
    
    Stuart.
    
    -- 
    Stuart Staniford     ---     President     ---     Silicon Defense
             ** Silicon Defense: Technical Support for Snort **
    mailto:stuartat_private  http://www.silicondefense.com/
    (707) 445-4355 x 16                           (707) 445-4222 (FAX)
    
    
    ----------------------------------------------------------------------------
    
    
    This list is provided by the SecurityFocus ARIS analyzer service.
    For more information on this free incident handling, management 
    and tracking system please see:
    
    http://aris.securityfocus.com
    



    This archive was generated by hypermail 2b30 : Fri Jul 20 2001 - 09:48:17 PDT