very interesting 0day tool... http honeypot in action

From: Michal Zalewski (lcamtufat_private)
Date: Tue Mar 12 2002 - 08:17:26 PST

    Hello list[s],
    
    My small, home-brew honeypot was hit by something pretty interesting today
    - an automated, unpublished, not widely used web reconnaissance tool. I
    do not have a better name for it - it appears to gather information about
    the structure of your webserver by recursively downloading the data,
    querying external web crawlers (probably google.com) to include data not
    directly referenced on your webpages, and later trying to brute-force
    certain locations on the server (such as administration scripts, logs,
    misc files, pr0n). It is safe to assume that further client-side
    processing is done to classify the contents, aggregate it and extract
    possibly sensitive information from the noise.
    
    Its "behavioral patterns" are distinctive and uncommon - this is not yet
    another "common cgi scripts" scanner. It seems to be designed to perform
    targeted attacks. I couldn't find any references to this tool, or any
    logs showing this kind of activity in the past. I guess many readers may
    find it interesting to examine their logs or analyze it further.
    
    Such tools are relatively difficult to write (and this one is far from
    being perfect, as you will see later), but are also very valuable for
    potential attackers or pen-testers. As far as I know, there are no
    comprehensive tools of this kind available publicly. I know that many
    people (including myself) have private code of this kind. This is also
    good evidence that sufficiently challenging, customized honeypots can be
    used to capture targeted, smart attacks. I never thought that this
    really trivial installation would provide such results.
    
    The tool is apparently launched by hand against a specific host. This can
    be guessed by analyzing the initial behavior - the attacker first made a
    few regular, slow connections to the server, one of them with a typo. Two
    minutes later, he/she launched the tool, which kept firing 5-10 HEAD and
    GET requests per second or so, approximately 1000 requests in total.
    
    The attack was apparently triggered by curiosity - the server I am
    referring to is running a minimal http honeypot, providing bogus
    "secret" data to visitors. The "secret" URL was "leaked" to certain
    communities (e.g. IRC channels). This is the initial activity (to protect
    my honeypot, I've changed the "secret" URL slightly):
    
    node-d-2425.a2000.nl - - [12/Mar/2002:14:15:59 +0100] "GET /privare HTTP/1.1" 404 788
    node-d-2425.a2000.nl - - [12/Mar/2002:14:16:13 +0100] "GET /private%20stuff/ HTTP/1.1" 200 183
    node-d-2425.a2000.nl - - [12/Mar/2002:14:16:44 +0100] "GET /privare HTTP/1.1" 404 788
    node-d-2425.a2000.nl - - [12/Mar/2002:14:16:48 +0100] "GET /privare%20stuff/ HTTP/1.1" 404 788
    node-d-2425.a2000.nl - - [12/Mar/2002:14:17:34 +0100] "GET /private%20stuff/pass.shtml?pass=blaat HTTP/1.1" 200 399
    node-d-2425.a2000.nl - - [12/Mar/2002:14:17:43 +0100] "GET /private%20stuff/passwd.dat HTTP/1.1" 200 48938
    node-d-2425.a2000.nl - - [12/Mar/2002:14:19:14 +0100] "GET /private%20stuff/index2.shtml HTTP/1.1" 200 23
    node-d-2425.a2000.nl - - [12/Mar/2002:14:19:20 +0100] "GET /private%20stuff/index1.shtml HTTP/1.1" 200 23
    node-d-2425.a2000.nl - - [12/Mar/2002:14:19:24 +0100] "GET /private%20stuff/index3.shtml HTTP/1.1" 404 788
    
    As you can see, there's a gap between 14:17 and 14:19, the time the
    attacker used to examine the passwd.dat file he/she obtained from the
    system.
    
    Then, the scan started. Phase 1 was recursive, rapid suck of the contents:
    
    node-d-2425.a2000.nl - - [12/Mar/2002:14:20:49 +0100] "GET / HTTP/1.0" 200 17421
    node-d-2425.a2000.nl - - [12/Mar/2002:14:20:49 +0100] "HEAD /head.jpg HTTP/1.1" 200 0
    node-d-2425.a2000.nl - - [12/Mar/2002:14:20:49 +0100] "HEAD /lcam.jpg HTTP/1.1" 200 0
    node-d-2425.a2000.nl - - [12/Mar/2002:14:20:49 +0100] "GET /prof.html HTTP/1.0" 200 20479
    node-d-2425.a2000.nl - - [12/Mar/2002:14:20:49 +0100] "GET /soft/ HTTP/1.0" 200 7966
    node-d-2425.a2000.nl - - [12/Mar/2002:14:20:49 +0100] "HEAD /mobp.jpg HTTP/1.1" 200 0
    node-d-2425.a2000.nl - - [12/Mar/2002:14:20:49 +0100] "GET /mobp/ HTTP/1.0" 200 15305
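    
    The excerpt above shows seven requests logged within a single second.
    Below is a minimal sketch for measuring such bursts in an access log,
    assuming entries in the Common Log Format used throughout this post. The
    function name peak_rate and the regular expression are illustrative
    assumptions, not part of any existing tool.

```python
import re
from collections import Counter

# Captures the client host and the bracketed timestamp of a
# Common Log Format entry (assumed format, one-second resolution).
STAMP = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\]')

def peak_rate(lines):
    """Return (host, timestamp, count) for the busiest host/second pair."""
    per_second = Counter()
    for line in lines:
        match = STAMP.match(line.strip())
        if match:
            # Key is the (host, timestamp) pair captured by the regex.
            per_second[match.groups()] += 1
    if not per_second:
        return None
    (host, stamp), count = per_second.most_common(1)[0]
    return host, stamp, count

if __name__ == "__main__":
    import sys
    # Usage: python peak_rate.py < access_log
    print(peak_rate(sys.stdin))
```

    A human browsing by hand rarely produces more than a handful of entries
    per second, so a high peak from one host is a reasonable first filter
    before looking at the requests themselves.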
    
    Note that the fingerprint of this tool is pretty distinctive - HTTP/1.0
    GET on HTML files and directories, and HTTP/1.1 (different version!) HEAD
    on other file types. Interesting... All requests have the 'Referer' field
    set to the server name (http://myhost/), and 'User-Agent' to 'Mozilla/4.0
    (compatible; MSIE 5.0; Windows 98; DigExt)', which is, quite obviously,
    bogus. The remote system appears to run Windows right now, but I am not
    the administrator of this box, so I couldn't run p0f, tcpdump or such.
    Interestingly, this crawler attempts to index every directory even if it
    is not explicitly referenced in HTML code. For example, if I have a link
    to catspace/BIGLOG.txt on my webpage, the crawler will attempt to index
    the catspace/ directory too:
    
    node-d-2425.a2000.nl - - [12/Mar/2002:14:20:59 +0100] "GET /catspace/ HTTP/1.0" 403 720
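    
    The directory-guessing behavior can be sketched as follows. This is only
    an illustration of the idea (derive every parent directory from a linked
    file), not the tool's actual logic, which is unknown; the function name
    parent_directories is an assumption.

```python
import posixpath
from urllib.parse import urlsplit

def parent_directories(url):
    """Return the ancestor directories implied by a URL, deepest first."""
    path = urlsplit(url).path
    # A path ending in '/' already names a directory; otherwise drop
    # the final component, which is taken to be a file.
    directory = path if path.endswith("/") else posixpath.dirname(path)
    found = []
    while directory not in ("", "/"):
        found.append(directory.rstrip("/") + "/")
        directory = posixpath.dirname(directory.rstrip("/"))
    return found

print(parent_directories("http://myhost/catspace/BIGLOG.txt"))
# prints ['/catspace/']
```

    A server that forbids listing on such a directory answers 403, as in the
    log entry above, which still confirms to the client that the directory
    exists.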
    
    The crawler is rather poorly written - one of the URLs on my webpage
    refers to http://myhost:54321/. The crawler incorrectly parses this URL
    into this request:
    
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:05 +0100] "GET /:54123/ HTTP/1.0" 404 748
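    
    For comparison, here is a minimal sketch of splitting such a URL so that
    the port stays with the host and never leaks into the request path. It
    uses urlsplit from Python's standard library; the helper name
    request_target and the default-port handling are assumptions made for the
    example. How the crawler arrives at its malformed request is not known.

```python
from urllib.parse import urlsplit

def request_target(url):
    """Split an absolute URL into (host, port, path) for an HTTP request."""
    parts = urlsplit(url)
    # urlsplit keeps the port separate from both the host and the path.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return parts.hostname, port, path

print(request_target("http://myhost:54321/"))
# prints ('myhost', 54321, '/')
```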
    
    Another bug - URLs taken from certain directory indexes have extra '/'
    appended at the end:
    
    node-d-2425.a2000.nl - - [12/Mar/2002:14:20:52 +0100] "GET /soft/uc.c/ HTTP/1.0" 404 748
    
    This will keep certain files from being indexed, at least with Apache.
    Note that this happens only for certain directories (probably because I
    have different FancyIndexing settings for different directories). This
    suggests that the code is not based on an existing crawler and is custom
    work.
    
    Then, phase 2 is brute-forcing - this phase is partially interleaved with
    phase 1, which suggests a multithreaded application:
    
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:36 +0100] "GET /2/ HTTP/1.0" 404 748
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:36 +0100] "GET /8/ HTTP/1.0" 404 748
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:36 +0100] "GET /5/ HTTP/1.0" 404 748
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:36 +0100] "GET /4/ HTTP/1.0" 404 748
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:36 +0100] "GET /123/ HTTP/1.0" 404 7
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:37 +0100] "GET /a/ HTTP/1.0" 404 748
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:38 +0100] "HEAD /about HTTP/1.1" 404 0
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:38 +0100] "HEAD /account HTTP/1.1" 404 0
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:39 +0100] "GET /accounts/ HTTP/1.0" 404 748
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:39 +0100] "GET /admin/ HTTP/1.0" 404 748
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:39 +0100] "GET /adm/ HTTP/1.0" 404 748
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:39 +0100] "GET /action.asp HTTP/1.0" 404 748
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:39 +0100] "GET /ad/ HTTP/1.0" 404 748
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:39 +0100] "HEAD /accounts HTTP/1.1" 404 0
    
    A first guess is that this tool might be looking for PHP scripts to
    exploit the recent mod_php vulnerability. However, many requests are not
    likely to target scripts - it tries to find certificates, mail, source
    code, default html files, administrative services, or... pr0n.
    
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:40 +0100] "GET /amateurs/ HTTP/1.0" 404 748
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:40 +0100] "GET /amateur/ HTTP/1.0" 404 748
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:40 +0100] "GET /apps/ HTTP/1.0" 404 748
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:40 +0100] "GET /app/ HTTP/1.0" 404 748
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:41 +0100] "GET /archives/ HTTP/1.0" 404 748
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:41 +0100] "GET /arc/ HTTP/1.0" 404 748
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:41 +0100] "GET /archive/ HTTP/1.0" 404 748
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:41 +0100] "GET /asp/ HTTP/1.0" 404 748
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:42 +0100] "GET /bank.asp HTTP/1.0" 404 748
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:42 +0100] "GET /bin/ HTTP/1.0" 404 748
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:42 +0100] "GET /binaries/ HTTP/1.0" 404 748
    
    [...]
    
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:44 +0100] "GET /book/ HTTP/1.0" 404 748
    
    [...]
    
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:45 +0100] "GET /certificates/ HTTP/1.0" 404 748
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:46 +0100] "GET /code/ HTTP/1.0" 404 748
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:47 +0100] "GET /controlpanel/ HTTP/1.0" 404 748
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:47 +0100] "HEAD /codes HTTP/1.1" 404 0
    
    [...]
    
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:49 +0100] "HEAD /data HTTP/1.1" 404 0
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:49 +0100] "HEAD /database HTTP/1.1" 404 0
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:49 +0100] "HEAD /debug HTTP/1.1" 404 0
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:49 +0100] "GET /Default.htm HTTP/1.0" 404 748
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:50 +0100] "GET /dmr/ HTTP/1.0" 404 748
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:50 +0100] "GET /doc/ HTTP/1.0" 404 748
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:50 +0100] "GET /dhtml/ HTTP/1.0" 404 748
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:50 +0100] "GET /door/ HTTP/1.0" 404 748
    
    [...]
    
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:53 +0100] "GET /email/ HTTP/1.0" 404 748
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:53 +0100] "GET /download/ HTTP/1.0" 404 748
    node-d-2425.a2000.nl - - [12/Mar/2002:14:21:53 +0100] "GET /emails/ HTTP/1.0" 404 748
    
    [...]
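    
    The brute-force entries above probe some names twice, once with GET and a
    trailing slash and once with HEAD and no slash (for example /accounts/
    and /accounts). Below is a minimal sketch for distilling the distinct
    probed names from such entries, assuming Common Log Format input; the
    function name probed_names and the regular expression are illustrative
    assumptions.

```python
import re

# Matches GET or HEAD requests that were answered with a 404 status.
NOT_FOUND = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/\d\.\d" 404 ')

def probed_names(lines):
    """Return the sorted, de-duplicated paths that produced a 404."""
    names = set()
    for line in lines:
        match = NOT_FOUND.search(line)
        if match:
            # Treat '/name/' and '/name' as the same probe.
            names.add(match.group(1).rstrip("/") or "/")
    return sorted(names)

if __name__ == "__main__":
    import sys
    # Usage: python probed_names.py < access_log
    print(" ".join(probed_names(sys.stdin)))
```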
    
    The overall list of checked resources that returned 404 code:
    
    /1 /123 /2 /3 /4 /5 /6 /7 /8 /9 /a /abc /about /account /accounts /ad /adm
    /admin /ads /al /amateur /amateurs /ani /ani1 /anime /app /apps /appz /arc
    /archive /archives /asian /asians /asp /b /bin /binaries /binary /bizarre
    /black /book /books /c /cat /catalog /catalogs /certif /certificate
    /certificates /certified /certify /cgi /cgi- /cgibin /cgi-bin /cgi-win
    /code /codes /coding /content /contents /controlpanel /crack /cracks /ctc
    /d /data /database /debug /dhtml /dir /dirs /dmr /dmr1 /doc /docs /door
    /double /download /downloads /downloadz /driver /drivers /e /email /emails
    /entry /en_US /f /file /filez /final /food /forum /free /freepic /freepics
    /front /ftp /fuck /fucks /g /gal /galleries /gallery /galls /game /games
    /gamez /girl /girls /girlz /graph /graphic /graphics /graphs /h /hardcore
    /help /hidden /hide /home /htaccess /htdata /htdoc /htdocs /html /htpasswd
    /htpasswrd /i /id /ids /image /images /images_dir /imagez /index /info /j
    /k /l /lancelot /les /lesb /lesbian /lesbians /lesbo /lez /link /links
    /linkz /list /log /logs /m /mail /mails
    
    ...for some reason, the scan ended around letter 'm', so I can't determine
    what else it would look for, or if there are any later phases. And because
    the scan probably didn't provide the attacker with any useful data in this
    case, I can't tell how he/she would attempt to use any information gained.
    One last thing I noticed:
    
    node-d-2425.a2000.nl - - [12/Mar/2002:14:20:51 +0100] "HEAD /soft/unicorns.tgz HTTP/1.1" 404 0
    
    This file used to be on my server, but is no longer available there. This
    suggests that this tool crawls not only pages found directly, but also
    pages previously indexed and cached by other systems (such as google.com).
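    
    For readers who want to check their own logs for the version mix
    described earlier (HTTP/1.0 GET together with HTTP/1.1 HEAD from the same
    client), here is a minimal sketch. It assumes Common Log Format entries;
    the function name mixed_version_hosts and the regular expression are
    illustrative assumptions, and a match is only a hint, not proof that this
    particular tool was used.

```python
import re
from collections import defaultdict

# Captures client host, request method and protocol version.
REQUEST = re.compile(r'^(\S+) .*?"([A-Z]+) \S+ (HTTP/\d\.\d)"')

# The method/version pairs reported for the crawler in this post.
FINGERPRINT = {("GET", "HTTP/1.0"), ("HEAD", "HTTP/1.1")}

def mixed_version_hosts(lines):
    """Return hosts that sent both HTTP/1.0 GET and HTTP/1.1 HEAD requests."""
    seen = defaultdict(set)
    for line in lines:
        match = REQUEST.match(line.strip())
        if match:
            host, method, version = match.groups()
            seen[host].add((method, version))
    return sorted(host for host, pairs in seen.items() if FINGERPRINT <= pairs)

if __name__ == "__main__":
    import sys
    # Usage: python mixed_version_hosts.py < access_log
    for host in mixed_version_hosts(sys.stdin):
        print(host)
```

    Requests routed through proxies or shared addresses can produce the same
    mix by accident, so it makes sense to combine this check with the request
    rate and the probed names before drawing conclusions.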
    
    Well, ok, enough from me, I could probably write a few more pages, but I
    don't want to insult your intelligence or make blind guesses. Have fun!
    Check your logs, post your hypotheses!
    
    -- 
    _____________________________________________________
    Michal Zalewski [lcamtufat_private] [security]
    [http://lcamtuf.coredump.cx] <=-=> bash$ :(){ :|:&};:
    =-=> Did you know that clones never use mirrors? <=-=
              http://lcamtuf.coredump.cx/photo/
    
    
    ----------------------------------------------------------------------------
    This list is provided by the SecurityFocus ARIS analyzer service.
    For more information on this free incident handling, management 
    and tracking system please see: http://aris.securityfocus.com
    