Log corruption on multiple webservers, log analyzers,...

From: Hugo (overclocking_a_la_abuelaat_private)
Date: Tue Mar 04 2003 - 09:39:52 PST


    
    Hi,
    
    Here is something that could be interesting.
    We have decided not to contact any vendor (many vendors are vulnerable and 
    we do not have enough time, sorry) and are making this advisory public on 
    this list.
    
    ILLC - Inverse Lookup Log Corruption
    
    We are using a technique that we have called “ILLC” (Inverse Lookup Log 
    Corruption) that allows us to corrupt the logs generated by many web 
    servers that are doing inverse address resolution. 
    
    Impact of this technique:
    
    -	“IP spoofing” in the logs
    -	Code execution (XSS) on boxes that run log analyzers (including web 
    servers that have built-in report analysis tools, etc.)
    
    In some specific scenarios, we have been able to hide the entire HTTP 
    request from the log viewer.
    
    Most of these actions were possible because host names are not filtered 
    as they are passed between different applications.
    
    Related RFCs on Internet host name conventions:
    
    RFC 952: 
    “(…)
    1.	A "name" (Net, Host, Gateway, or Domain name) is a text string up 
    to 24 characters drawn from the alphabet (A-Z), digits (0-9), minus sign 
    (-), and period (.). Note that periods are only allowed when they serve 
    to delimit components of "domain style names". (See RFC-921, "Domain Name 
    System Implementation Schedule", for background). No blank or space 
    characters are permitted as part of a name. No distinction is made between 
    upper and lower case. The first character must be an alpha character. The 
    last character must not be a minus sign or period.... Single character 
    names or nicknames are not allowed....
    (…)”
    
    RFC 1034:
    “(…)
    3.5. Preferred name syntax 
     
    ... The labels must follow the rules for ARPANET host names. They must 
    start with a letter, end with a letter or digit, and have as interior 
    characters only letters, digits, and hyphen. There are also some 
    restrictions on the length. Labels must be 63 characters or less. (...)”
    
    RFC 1123: 
    “(…)
    2.1 Host Names and Numbers 
    
    The syntax of a legal Internet host name was specified in RFC-952 [DNS:4]. 
    One aspect of host name syntax is hereby changed: the restriction on the 
    first character is relaxed to allow either a letter or a digit. Host 
    software MUST support this more liberal syntax. (…)”
    
    (Note that under BIND 8, you may need to add "check-names master ignore" 
    to the zone definition when defining such names.)
    
    RFC 2181: 
    “(…)
    11. Name syntax 
     
    Occasionally it is assumed that the Domain Name System serves only the 
    purpose of mapping Internet host names to data, and mapping Internet 
    addresses to host names. This is not correct, the DNS is a general (if 
    somewhat limited) hierarchical database, and can store almost any kind of 
    data, for almost any purpose. 
    The DNS itself places only one restriction on the particular labels that 
    can be used to identify resource records. That one restriction relates to 
    the length of the label and the full name. The length of any one label is 
    limited to between 1 and 63 octets. A full domain name is limited to 255 
    octets (including the separators).(…)”
    
    Regardless of what the legal host name syntax should be, it seems that 
    operating systems allow host names with arbitrary characters.
    For an “ILLC” attack to succeed, two conditions must hold: the web 
    server, log analyzer, etc. must perform inverse address resolution, and 
    the attacker must be able to control, in some way, the responses to 
    those inverse lookup requests.
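
    As a rough illustration of the syntax rules quoted above, the following 
    Python sketch checks a name returned by an inverse lookup against the 
    RFC 952 rules as relaxed by RFC 1123. The function name 
    is_valid_hostname and the idea of calling it before logging are our 
    assumptions for the example, not something any of the tested products 
    is known to do.

```python
import re

# One label: letters, digits and hyphens only, starting and ending with a
# letter or digit, 1 to 63 characters (RFC 952 as relaxed by RFC 1123).
_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_valid_hostname(name: str) -> bool:
    """Return True if `name` follows the RFC 952/1123 host name syntax.

    Hypothetical helper for illustration only.
    """
    if name.endswith("."):
        name = name[:-1]  # accept a single trailing dot (absolute name)
    # 253 characters is the usual text-form equivalent of the 255-octet limit.
    if not name or len(name) > 253:
        return False
    return all(_LABEL.fullmatch(label) for label in name.split("."))
```

    With this check, “nop.infohacking.com” is accepted while names 
    containing “<”, “>” or “=” are rejected. Note that "123.123.123.123" 
    still passes, since RFC 1123 allows labels that start with a digit, so 
    a syntax check alone does not address the log “IP spoofing” case.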
    
    
    
    ---------------------------------------------------------
    Exploiting web server/log analyzers through “ILLC”
    ---------------------------------------------------------
    
    Examples of attacks:
    
    -Log “IP Spoofing”
    (exploited successfully on Apache 2.0.44 on Windows/Linux, and Iplanet 6 
    on Windows)
    Scenario: a machine whose host name is "123.123.123.123" makes a request 
    to an Apache server. If the request doesn't generate any error, the 
    access log will show a request from a client called "123.123.123.123", 
    which looks like a valid request from a client whose address the server 
    was unable to resolve to a host name. So the real IP does not appear in 
    the access log file.
    
    access.log
    
    123.123.123.123 - - [28/Feb/2003:10:39:01 +0100] "GET / HTTP/1.1" 200 1786 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"
    123.123.123.123 - - [28/Feb/2003:10:39:46 +0100] "GET /badrequest.html HTTP/1.1" 404 294 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"
    
    If the request produces an error, there will be an entry in the error log 
    file where the real IP is visible, even though the web server has 
    inverse lookup enabled.
    
    error.log
    
    [Fri Feb 28 10:39:46 2003] [error] [client 172.26.50.45] File does not exist: C:/Archivos de programa/Apache Group/Apache2/htdocs/badrequest.html
    
    So, as long as there are no errors, the real IP is not shown. This can 
    give a client completely anonymous HTTP access during ordinary web 
    surfing, that is, as long as it hits no broken links, etc.
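
    One possible way to avoid losing the real address, sketched below in 
    Python, is to always log the IP and to add the resolved name only when 
    it does not itself look like an IP address and a forward lookup of the 
    name returns the connecting address. The function client_label and its 
    injectable reverse/forward parameters are hypothetical names for this 
    example; the defaults use the standard socket module.

```python
import ipaddress
import socket


def _reverse(ip):
    # socket.gethostbyaddr returns (hostname, aliaslist, ipaddrlist)
    return socket.gethostbyaddr(ip)[0]


def _forward(name):
    # socket.gethostbyname_ex returns (hostname, aliaslist, ipaddrlist)
    return socket.gethostbyname_ex(name)[2]


def looks_like_ip(name):
    """True if the 'host name' is really an IP address literal."""
    try:
        ipaddress.ip_address(name.rstrip("."))
        return True
    except ValueError:
        return False


def client_label(ip, reverse=_reverse, forward=_forward):
    """Return the string to log for a client connecting from `ip`.

    The real IP always comes first; the name is appended only if it is
    forward-confirmed. The name is still untrusted text and must be
    escaped before it is shown in any HTML report.
    """
    try:
        name = reverse(ip)
    except OSError:  # socket.herror and socket.gaierror derive from OSError
        return ip
    if looks_like_ip(name):
        return ip
    try:
        confirmed = ip in forward(name)
    except OSError:
        confirmed = False
    return f"{ip} ({name})" if confirmed else ip
```

    In the scenario above, a client at 172.26.50.45 whose inverse lookup 
    answers "123.123.123.123" would be logged simply as 172.26.50.45. An 
    attacker who controls both the inverse and the forward zone can still 
    pass the confirmation, which is why the IP is kept in every case.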
    
    In the case of Iplanet 6, the real IP does not appear in the “access” 
    preview (see link below):
    http://www.infohacking.com/INFOHACKING_RESEARCH/Our_Advisories/ILLC/cap-access-log1.gif
    
    Nor does it appear in the “errors” preview (see link below):
    http://www.infohacking.com/INFOHACKING_RESEARCH/Our_Advisories/ILLC/cap-errors.gif
    
    -CODE INJECTION
    
    (Successfully exploited on Apache 2.0.44 on Windows/Linux, on IIS 6.0 and 
    Iplanet 6 on Windows)
    Scenario: a machine with a host name such as “<script>alert(‘a’)</script>” 
    makes an HTTP request and leaves JavaScript code in the log.
    
    When a report is generated with some log analyzers (those that show 
    results in HTML), the script will be executed.
    
    *Note: in the IIS 6.0 case we needed to restrict access to the web server 
    by domain name in order to force inverse lookup resolution.
    *Note2: in the Iplanet case we needed to simulate a FQDN client host name 
    like this:
    “<script>alert(‘a’)</script>.infohacking.com”.
    You can also set a host name where the script is only part of the label:
    “nop<script>alert(‘a’)</script>.infohacking.com”
    so that when rendered as HTML it appears as a valid domain name:
    “nop.infohacking.com.”
    
    Meanwhile the script will be executed…
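
    On the report side, the usual defence is to escape every value taken 
    from the log before placing it in HTML. A minimal Python sketch, 
    assuming a report generator that builds table cells as strings 
    (render_host_cell is a hypothetical helper, not part of any analyzer 
    mentioned here):

```python
import html


def render_host_cell(hostname: str) -> str:
    """Return an HTML table cell for a host name read from a log file."""
    # html.escape turns &, <, > and (with quote=True) both quote
    # characters into entities, so the browser shows the text instead of
    # running it.
    return "<td>" + html.escape(hostname, quote=True) + "</td>"


payload = "nop<script>alert('a')</script>.infohacking.com"
cell = render_host_cell(payload)
```

    With escaping in place, the report would display the full, suspicious 
    host name rather than “nop.infohacking.com”, and no script runs.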
     
    Some log analyzers proved to be vulnerable to "ILLC":
    
    WebTrends (see link below):
    http://www.infohacking.com/INFOHACKING_RESEARCH/Our_Advisories/ILLC/cap-webtrends-illc.gif
    
    SurfStats (see link below):
    http://www.infohacking.com/INFOHACKING_RESEARCH/Our_Advisories/ILLC/cap-surfstats_loganalizer.gif
    
    WebLogExpert (see link below):
    http://www.infohacking.com/INFOHACKING_RESEARCH/Our_Advisories/ILLC/weblogexpert_illc.gif
    
    And probably many more…
    
    Iplanet comes with a built-in tool that generates HTML reports based on 
    the access and error logs. This tool is part of the administration web 
    interface. Moreover, the Iplanet log analyzer always uses a web browser 
    to show the results of the report, even if the user selects “Only text 
    output”, so it is always exploitable.
    
    Iplanet Log Analyzer (HTML report, see link below):
    http://www.infohacking.com/INFOHACKING_RESEARCH/Our_Advisories/ILLC/cap-report-html.gif
    
    Iplanet Log Analyzer (“txt” report, see link below):
    http://www.infohacking.com/INFOHACKING_RESEARCH/Our_Advisories/ILLC/cap-report-text.gif
    
    On the other hand, we should note that the access log previewer (“View 
    Access Log”) in the Iplanet web interface does some kind of filtering 
    of certain characters (for example <>).
    
    Iplanet “View Access Log” (see link below):
    http://www.infohacking.com/INFOHACKING_RESEARCH/Our_Advisories/ILLC/cap-iplanet-filtra.gif
    
    -HIDING REQUESTS (Iplanet 6 on Windows)
    
    In the specific case of Iplanet 6, we found that there is a way to 
    trick the server into not showing a request in the log preview (“View 
    Access Log” and “View Error Log”). Requests from boxes whose host name 
    begins with “format=” are not shown; that is, those requests are still 
    visible in the access and error log files, but they are “invisible” in 
    the built-in access and error log viewer of the administration web 
    interface (“Last 25 accesses to…”). As an example, we made requests from 
    a box with this host name:
    “format=.infohacking.com”, and we could see the request in the access 
    log:
    
    format=%Ses->client.ip% - %Req->vars.auth-user% [%SYSDATE%] "%Req->reqpb.clf-request%" %Req->srvhdrs.clf-status% %Req->srvhdrs.content-length% "%Req->headers.referer%" "%Req->headers.user-agent%" %Req->reqpb.method% %Req->reqpb.uri% %Req->reqpb.query% "%Req->reqpb.protocol%" %vsid%
    format=.winmat.com - - [28/Feb/2003:10:22:25 +0100] "GET /evilrequest.html HTTP/1.1" 404 292 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)" GET /evilrequest.html - "HTTP/1.1" https-script.winmat.com
    
    But in “View Access Log” nothing is shown (see link below):
    http://www.infohacking.com/INFOHACKING_RESEARCH/Our_Advisories/ILLC/cap-NOlog-iplanet.gif
    
    We suppose that the server processes the first part of the host name 
    string, “format=”, as if it were the directive that sets the log format, 
    and the rest of the string is not recognized as a valid format, so 
    nothing is shown.
    Combining the ability to hide a request with the cross-site scripting 
    technique, we could execute scripts on the machine that runs the Report 
    Generator of the Iplanet Web Server in a “stealth” way. We did this by 
    setting a host name like this:
    “format=<script>alert(document.cookie)</script>.infohacking.com”
    (See link below):
    http://www.infohacking.com/INFOHACKING_RESEARCH/Our_Advisories/ILLC/cap-iplanet-cookie.gif
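
    The suspected cause of the hidden requests can be illustrated with a 
    toy log viewer in Python. This is only a guess at the kind of logic 
    involved, not Iplanet's actual code; naive_view, stricter_view and the 
    shortened sample lines are all made up for the example.

```python
def naive_view(lines):
    """Toy viewer: drops every line that starts with 'format='."""
    return [ln for ln in lines if not ln.startswith("format=")]


def stricter_view(lines):
    """Treat only the first line of the file as the format directive."""
    if lines and lines[0].startswith("format="):
        return lines[1:]
    return list(lines)


# Shortened, illustrative log: a format header, a request from a host
# named "format=.infohacking.com", and an ordinary (made-up) request.
log = [
    'format=%Ses->client.ip% - %Req->vars.auth-user% [%SYSDATE%]',
    'format=.infohacking.com - - [28/Feb/2003:10:22:25 +0100] '
    '"GET /evilrequest.html HTTP/1.1" 404 292',
    '172.26.50.45 - - [28/Feb/2003:10:23:00 +0100] "GET / HTTP/1.1" 200 1786',
]

hidden = naive_view(log)       # the evil request is missing
visible = stricter_view(log)   # the evil request is kept
```

    Under this assumption, a viewer that recognizes the directive only at 
    the top of the file (or that never lets a client-supplied string start 
    a log line) would keep the request visible.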
    
    Many more evil actions are possible... it only depends on the attacker's 
    imagination.
    
    We haven’t checked “ILLC” against other daemons such as ftp or smtp, or 
    against firewalls, IDSs, etc. We think this technique could probably be 
    used in the same way in other scenarios.
    
    
    
     
    
    ---------------------------------------------------------
    Exploiting http headers for log corruption
    ---------------------------------------------------------
    
    
    Controlling inverse lookup responses is not always possible for the 
    attacker, so we tried to figure out another, more generic attack to 
    corrupt web logs.
    The first idea that came to us was to use faked HTTP headers to achieve 
    the same result: execution of scripts by log analyzers. 
    There are a lot of HTTP headers that can be used to inject code into a 
    log file. We are not going to discuss all of them in this paper, only 
    outline some generic ways to do it.
    The main objective here is to choose the right part of the HTTP request 
    in which to inject the code. For example, the requested resource 
    (“RequestResource”) is always shown in web logs, but it will probably be 
    filtered by many application firewalls or detected by IDSs. On the other 
    hand, the “UserAgent” (User-Agent) header is usually not checked for 
    suspicious sequences of characters, and webmasters usually like to have 
    this info in their log files.
    
    As an example of how to trick a log analyzer into executing a script, we 
    set the “UserAgent” header of our client to:
    “<script>alert(‘UserAgent’)</script>”.
    Requests from our client with this faked “UserAgent” inject code into 
    the web server log. Some log analyzers that read these logs and generate 
    HTML-formatted reports without filtering the output will execute the 
    script.
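
    A complementary measure on the web server side would be to neutralize 
    client-supplied values at the moment they are written to the log. The 
    Python sketch below encodes anything outside a conservative printable 
    set as \xHH; the function name log_safe and the exact character set 
    are our assumptions, not the behaviour of any server tested here.

```python
def log_safe(value: str) -> str:
    """Escape characters that could break or abuse a log line.

    Printable ASCII is kept, except quotes, backslash and the HTML
    metacharacters; everything else becomes a \\xHH escape.
    """
    out = []
    for byte in value.encode("utf-8", errors="replace"):
        ch = chr(byte)
        if 0x20 <= byte < 0x7F and ch not in '"\\<>&\'':
            out.append(ch)
        else:
            out.append(f"\\x{byte:02x}")
    return "".join(out)


agent = log_safe("<script>alert('UserAgent')</script>")
```

    An ordinary value such as "Mozilla/4.0 (compatible; MSIE 6.0; Windows 
    NT 5.0)" passes through unchanged, while the faked header is stored as 
    harmless text. Log analyzers should still escape their output, since 
    they cannot assume every log was written this way.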
    
    -Examples of vulnerable log analyzers-
    
    WebExpert (see link below):
    http://www.infohacking.com/INFOHACKING_RESEARCH/Our_Advisories/ILLC/log_anal_XSS.gif
    
    LoganPro (see link below):
    http://www.infohacking.com/INFOHACKING_RESEARCH/Our_Advisories/ILLC/cap-loganpro-agent.gif
    
    To solve this kind of problem, more aggressive filtering of DNS responses 
    and of all headers in HTTP requests would be welcome.
    
    To finish this short analysis we would like to ask some questions:
    
    Are log analyzers trusting log files too much?
    
    Or should web servers be the ones that filter what they write to log 
    files?
    
    Is the operating system the one that has to filter the values returned 
    by DNS servers?
    
    Is the currently legal host name syntax too liberal?
    
    Sorry for our bad English.
    We would like to thank Martí Domenech, director of "Winmat Catalunya", for 
    letting us do this "research".
    
    Hugo Vázquez Caramés & Toni Cortés Martínez
    Infohacking Research 2001-2003
    Barcelona
    Spain
    


