Re: PointGuard: It's not the Size of the Buffer, it's the Address

From: pageexecat_private
Date: Tue Aug 19 2003 - 09:54:24 PDT

    > >You are wrong (and even self-contradicting) here, in any case, so-called
    > >information leaking can happen without having to corrupt pointers ([1],
    > >[2]). Also, section 3.4.3 sublates the above.
    > >
    > It is true that PointGuard raises new issues with regard to information 
    > leakage: before PointGuard, there was not much significance to leaking 
    > pointer values from a running process, and so this now becomes a new 
    > threat that needs study.
    
    You are wrong: PG is not what raised the issue of information
    leaking; it has been known for quite some time now [1] (i would say
    ever since randomization schemes were implemented, although
    apparently not everyone knew of them).
    
    > Yes, PointGuard only protects pointer values generated by code compiled 
    > with PointGuard.
    
    In other words, does this mean that PG as you described it in your
    paper is vulnerable to the most basic of exploit techniques -
    shellcode injection and execution?
    
    > We are modifying the dynamic linker for Immunix. But that kind of 
    > hacking isn't worthy of a paper, so we omitted it.
    
    I did not expect a full paper about ld.so hacking, but you should
    have at least mentioned the very fact that you needed to do it -
    after all, it is part of the PG system. Also i would like to point
    out that at least on i386 the PLT is generated by the normal linker
    (ld in binutils), so you will have to modify it as well.
    
    > > It would also be interesting to know how you can
    > >handle the saved program counter and frame pointer just after the AST
    > >level where as far as i know these entities do not even exist (and
    > >hence cannot be manipulated/controlled there).
    > >
    > As the paper said, we are going to tag the AST expressions so that 
    > spills are PG-encrypted, but this is not yet implemented.
    
    I was not asking about register spills (which hold local variables
    described at the C language level); i meant the CPU specific
    registers that are used for addressing local variables (frame
    pointer, EBP on i386) and control flow (instruction pointer or
    program counter, EIP on i386). As far as i know, these registers
    are not visible at the AST level and are not (directly) controlled
    by any C language level construct, therefore encrypting/decrypting
    them requires special changes - what will they be?
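
    As an illustration of what i mean (my own sketch, not your code):
    these registers only come into existence in the backend-emitted
    prologue/epilogue, below anything the AST can express:

    void victim(void) {
       /* nothing at the C/AST level names EBP or EIP; on i386 gcc
          emits (AT&T syntax):

            pushl %ebp        # saved frame pointer, stored in plaintext
            movl  %esp, %ebp
            ...
            leave             # restores the caller's %ebp
            ret               # pops the saved EIP - any PG decryption
                              # would have to be inserted right here,
                              # in the code generator
       */
    }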
    
    > >   2). It also begs the question of what kind of performance impact PG
    > >   will have once all these omissions are rectified (more on your
    > >   performance evaluation below).
    > >
    > The only pointer load/stores that are not encrypted right now are 
    > register spills. That is a rare case, so it will not affect performance 
    > much.
    
    If it is true that PG does not protect the saved instruction and
    frame pointers, then indeed the performance impact of the above
    changes will be irrelevant - all an attacker has to do is use the
    good old way of shellcode injection/execution without having to
    worry about PG encrypted pointers. If you are going to encrypt
    these, then i am not convinced that the resulting impact will be
    so small (see [2]).
    
    > > Also what happens
    > >   with functions that take format strings and hence accept arguments of
    > >   variable types (i.e., pointers and non-pointers), do you parse such
    > >   format strings and convert the pointer arguments accordingly or do
    > >   you turn off PG altogether for such code?
    > >
    > There is special case handling for varargs.
    
    I understood that; i was asking about its technical details.
    Consider the following code:
    
    #include <stdio.h>

    union bar {
       char * y; /* char @ y; */
       long z;
    };

    void foo(int x, const char * f, union bar * s, int y) {
       if (x)
         printf(f, s->y[0]); /* reads through the pointer member */
       else
         printf(f, s->z);    /* reads the same bits as an integer */
    }
    
    Here foo() is PG protected, so 'f' and 's' are encrypted. The
    question is: what happens to s->y? Is it supposed to be marked
    with '@' (maybe along with s itself) and hence used unencrypted
    throughout the code (leaving it vulnerable to overflows)? If not,
    when will it be decrypted? The problem is that the compiler cannot
    know in advance whether the format string 'f' will contain a %c or
    a %ld specifier, so neither blind decryption (of s->y) nor leaving
    it (s->z) untouched will produce the expected result (which is
    defined to be the result of non-PG compiled code).
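
    To make the ambiguity concrete, here is a possible caller of the
    code above (my own sketch): the same storage holds encrypted or
    plaintext bits depending on which union member was last assigned,
    and only the runtime knows which.

    void caller(void) {
       union bar b;
       b.z = 1234;            /* plaintext store via z */
       foo(0, "%ld", &b, 0);  /* must not decrypt anything */
       b.y = "A";             /* PG-encrypted store via y */
       foo(1, "%c", &b, 0);   /* s->y must be decrypted before use */
    }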
    
    > That is correct: unencrypted pointers are passed into the kernel.
    
    This means that such pointers (passed as arguments on the stack) do
    leak on the stack (beyond the register spills you mentioned).
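
    To make the leak concrete, a sketch of what any PG-aware syscall
    wrapper has to do (hypothetical names; PG_DECRYPT() stands for
    whatever the real primitive is):

    #include <unistd.h>
    #include <sys/syscall.h>

    /* placeholder: the real primitive would XOR with the per-process key */
    #define PG_DECRYPT(p) (p)

    ssize_t pg_read(int fd, void *buf, size_t count) {
       void *plain = PG_DECRYPT(buf); /* plaintext pointer materialized */
       /* the plaintext value now sits in a register or on the stack
          and is not scrubbed after the call returns */
       return syscall(SYS_read, fd, plain, count);
    }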
    
    > or to have the kernel know the key value of all processes and do
    > the mapping for you (which is feasible, but more intrusive than
    > just hacking glibc).
    
    Did you consider building the kernel itself with PG? In that case
    sharing the (per task) encryption key between the kernel and userland
    could be automatic.
    
    > >   Finally i am wondering how you plan to implement pointer mode tracking
    > >   in the compiler, or more precisely, why you have to do it in the compiler
    > >   only and not at runtime (in the latter case you would have to extend the
    > >   pointer representation and open a whole can of worms).
    > >
    > I have no idea what you are talking about.
    
    Refer to my small example above and imagine that you encrypt
    varargs pointer arguments as well. As i pointed out above, you
    cannot decide at compile time whether to do the decryption or not,
    hence you would have to carry the pointer's mode into the runtime,
    which means augmenting the internal pointer representation and
    thus increasing its size.
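
    What i mean by augmenting the representation, as a sketch (none of
    these names are yours):

    /* hypothetical 'fat pointer' carrying its mode at runtime; note
       that it is twice the size of a native pointer, which breaks
       sizeof(), structure layouts and binary interfaces */
    struct pg_ptr {
       void *value;    /* encrypted or plaintext bits */
       int  encrypted; /* mode tag, consulted at every dereference */
    };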
    
    > >6. In section 5 you admit that you do not indeed have a PG protected
    > >   glibc and hence heap pointers are not protected at all, this calls
    > >   into question the seriousness of your security and performance
    > >   testing (especially since you compare your results to mature
    > >   solutions which cannot be said of PG yet).
    > >
    > All of the code used in our performance testing was statically linked 
    > and compiled with PointGuard to work around the absence of a PG version 
    > of glibc, so the performance figures are valid.
    
    I think i am not following you here. Why does statically linking
    in glibc change the fact that glibc is not protected with PG (and
    that, as a consequence, neither are heap pointers)? By the way,
    can you please make your 'straw man' vulnerable binary (the one
    shown in figure 12 of your paper) available? I would like to take
    a look at it.
    
    > >   "2. Usefully corrupting a pointer requires pointing it at a
    > >    specific location."
    > >
    > >This is false, the hijacked pointer may very well point to a set of
    > >specific values (e.g. any GOT entry that is used later, any member of
    > >a linked list, etc).
    > >
    > Bull: you just specified a specific location that happens to be a range. 
    
    Sorry, i am not a native speaker and misunderstood your use of
    'specific location' as 'fixed/known address'. However, this does
    not change the fact that such a pointer can still be successfully
    attacked without knowing the full encryption key; see below.
    
    > A very small range in the size of an address space. Unlike PaX/ASLR 
    > (which can only jiggle objects a little within a range)
    
    What is your definition of 'little' here? ASLR randomizes the main
    executable, the brk() heap, the libraries (mmap()) and the stack in
    separate 256 MB ranges (the main executable and the brk() regions
    do overlap somewhat). The randomness within these ranges is roughly
    16/24/16/24 bits (all this on 32 bit architectures, and more on
    64 bit ones).
    
    > PointGuard has complete freedom to randomize all 32 bits of the
    > pointer, so the fact that you can craft an exploit that can only
    > approximately hit a target does not affect PointGuard.
    
    Even if PG randomizes all 32 bits, it is vulnerable to partial
    pointer overwrites, as described in [3].
    
    >     * PointGuard provides better randomization than ASLR, because the
    >       randomization ranges are much greater.
    
    Attacking ASLR means that one needs to know addresses from more
    than one (differently) randomized region (e.g., a code address and
    a stack/heap address, because one is forced to do a return-to-libc
    style attack); this quickly adds up (e.g., up to 40 bits in the
    mentioned case) since generally one cannot brute force these
    values independently.
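
    To spell out the arithmetic for the mentioned case: one has to
    guess a library address (16 bits of randomness) and a stack
    address (24 bits) simultaneously, so the expected work is on the
    order of 2^16 * 2^24 = 2^40 attempts, versus at most 2^24 for
    either region alone.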
    
    > >   "3. Under PointGuard protection, a pointer cannot be corrupted
    > >    to point to a specific location without knowing the secret key."
    > >
    > >This is correct provided the implementation is bug-free - something
    > >that cannot be verified until you actually release PG.
    > >
    > I have no idea what you are talking about. If the pointer is hashed, you 
    > *cannot* usefully corrupt it without knowing the secret key.
    
    I can; it is called a partial pointer overwrite and can be useful
    if the 'specific location' you want to aim at is a range. Imagine
    that in the future PG will encrypt the saved program counter as
    well and you have a large enough overflowable stack buffer:
    overwriting the least significant byte of the saved program
    counter will transfer control within a 256 byte range of the
    original return place (or a 64k range for a 2 byte overwrite). If
    there is any byte sequence there that does something like 'jmp
    register' and the register happens to hold a value pointing back
    to the buffer (this can happen since you do not reload all
    registers on function return, therefore a plaintext pointer can
    leak back to the caller), you will execute shellcode. Even a
    non-executable stack is just a workaround here: you may very well
    have a plaintext heap pointer in the register (which in fact you
    do, since PG does not handle heap pointers yet) and return there,
    provided you can ensure that there is a copy of the shellcode
    there.
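
    A sketch of the one byte case (made-up layout, little-endian i386
    assumed):

    #include <string.h>

    struct frame {
       char buf[16];
       char *ptr; /* stands in for an adjacent PG-encrypted pointer */
    };

    void overflow(struct frame *f, const char *input) {
       /* off-by-one: the 17th byte lands on the least significant
          byte of f->ptr (little-endian), steering the decrypted
          value within a 256 byte window without any key knowledge */
       memcpy(f->buf, input, 17);
    }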
    
    > Speculating that any piece of software has bugs without foundation
    > borders on FUD,
    
    It is not FUD; it is a simple reminder that unless your work is
    made public, it cannot be evaluated, let alone trusted for its
    correctness. I will remind you that in [4] you stated:

      "We describe StackGuard: a simple compiler technique
       that virtually eliminates buffer overflow vulnerabilities
       with only modest performance penalties."

    As we know, this did not prove to be true: StackGuard was found
    to be circumventable ([5], [6]). I personally believe that the
    more eyes can scrutinize a system, the lower its error rate will
    be. It is in your best interest to make your work available and
    wait for judgement until others have looked it over - you did not
    prove infallible in the past, so why would you have created a
    bug-free system this time (especially considering how much more
    complex it is to hack the AST than function prologue/epilogue
    generation)? For now i would even appreciate just taking a look
    at PG compiled binaries at the assembly level; that would tell me
    exactly what is randomized and how vulnerable PG is.
     
    > but in this case it isn't even possible: an encrypted pointer cannot be 
    > modified by a plaintext overflow. A bug that accidentally laid a 
    > plaintext pointer would result in a crash when the value is decrypted, 
    > and vice versa: the design specifically resists this problem.
    
    Three words: partial pointer overwrite.
    
    > Speculate away as to what PointGuard will do when we're done integrating 
    > it. On second thought, don't: you've done more than enough flaming 
    > speculation today :)
    
    It seems that my only mistake was an unfortunate consequence of my
    language skills (or lack thereof); otherwise you have only
    confirmed my 'speculations'. In any case, i look forward to trying
    out PG once it is available (or at least your test binaries).
    
    > >   Third, there is related work ([4] and [5], all of which predates PG
    > >   by years and you failed to reference) that appears to show more real
    > >   performance impact of function pointer encryption (something PG does
    > >   not seem to do yet universally).
    > >
    > That work is in fact based directly on PointGuard, having resulted from 
    > this post http://lwn.net/1999/1111/a/stackguard.html
    
    Which of the two works are you referring to here?
    
    > And you're on crack if you think their performance results are more 
    > realistic: the only "pointer" they encrypt is the activation return 
    > address. *None* of the hard work of weaving pointer encryption into the 
    > compiler's type system was done. They published first because we chose 
    > deliberately to not publish an empty idea with no implementation.
    
    Assuming you are talking about [5] above, i am at a loss to
    interpret your words ;-). Their performance results indicate a
    much bigger (yes, positive, i.e., slowdown) impact than what you
    reported in your PG paper. Now, if you think they encrypt even
    less than PG (so far you seem to confirm that PG does *not* in
    fact encrypt the program counter), then one would expect your
    numbers to be even worse than theirs (i.e., even more slowdown and
    definitely no speed-up). Since you imply that their numbers are
    less realistic than yours (i hope you did not seriously expect a
    speed-up from them), you can draw the conclusion that your numbers
    are even less realistic than theirs. Which is what i have been
    saying myself. Also, do you realize that the 'only' pointer they
    encrypt happens to be the one most vulnerable to execution flow
    redirection style attacks as well?
    
    > >9. In section 7.1 you say that:
    > >whereas you admit before that PG requires programmer intervention (as it
    > >is not possible to have a pure PG system right now), i doubt a programmer
    > >can compile (port) millions of lines of code in a day.
    > >
    > You are entitled to your opinion on the numbers and magnitudes, but it 
    > is inescapable that "porting" to PointGuard is far less work than 
    > porting from C to Cyclone or CCured. So what's your point?
    
    The point is that augmenting existing code for PG use consumes
    time (since you do not have a full PG system), and you have shown
    no data that would back up your claim. When PG becomes able to
    recompile the entire userland without any intervention, then you
    can claim your 'millions of lines of code in a day'.
    
    > >Where is this "exec(sh)" supposed to be 'almost always'? Can you
    > > substantiate this claim?
    > >
    > It is in glibc, and most programs link to glibc. This is very well 
    > known, and I didn't think it needed to be justified.
    
    Last time i checked, glibc did not contain any exec(sh) code. What
    it does contain is system(), which will invoke 'sh -c' and fail if
    you do not pass a proper argument, and the various wrappers around
    the execve() system call, which will also fail if you do not pass
    the proper arguments. The point i am making here is that despite a
    handy execve() in glibc, you also have to be able to provide
    arguments (pointers) before you can (ab)use it in an exploit.
    Therefore randomization (ASLR in PaX) prevents this kind of
    exploiting in a probabilistic sense (minus information leaking),
    and you should have pointed out that composing various techniques
    (such as those in PaX) provides more protection than using them
    standalone - it is an important part of the PaX philosophy, as you
    will see once you read the documentation.
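
    To illustrate the point (plain C, nothing PG specific): even with
    execve()'s address in hand, a successful call needs valid pointer
    arguments that the attacker must place or find somewhere:

    #include <unistd.h>

    int main(void) {
       char *argv[] = { "/bin/sh", (char *)0 };
       /* three pointers are needed: the path, argv and envp - under
          ASLR an attacker cannot forge them without an information
          leak */
       return execve("/bin/sh", argv, (char **)0);
    }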
    
    > I have been *trying* to properly cite PaX in various papers for at least 
    > a year, but you don't make it easy. A web URL is not normally considered 
    > a suitable citation.
    
    Is it not? Then how do you explain that you provided web URLs for projects
    like Solar Designer's kernel patch?
    
    > research community is unaware of PaX. I dare say that the PointGuard 
    > paper will do more to raise PaX visibility in the research community 
    > than anything before. That was deliberate, because IMHO PaX is 
    > under-exposed: it's good work, and few have heard of it.
    
    Shall i also thank google, then, for helping out all the lost
    souls who otherwise would not be able to find the PaX webpage
    because, for some yet-to-be-explained reason, you had failed to
    provide it yourself?
    
    > >I am curious to learn why you cited this information
    > >when you have already been made aware of the current situation ([13]).
    > >
    > It was hearsay. Publish something, and I'll cite it. Please.
    
    Is hearsay, then, what can be published in a paper accepted at a
    'strong refereed conference'? How about you test it out yourself?
    You have the source code at your disposal, after all. I also doubt
    that you would cite anything i write, considering how you failed
    to do so in the past, even after you had been made aware of the
    documentation (which you had apparently failed to read). In any
    case, a separate document about PaX performance is not out of the
    question, but i would rather do it when all architectures can be
    properly evaluated (right now, due to some crappy userland code,
    RISC architectures need special emulation that i will get rid of
    eventually).
    
    > Go look up the word "dual": it is a mathematical term. What you're 
    > saying is exactly the same as what I am saying.
    
    Ok, i did and according to [7]:
    
      "Every field of mathematics has a different meaning of dual."
    
    Oops.
    
      "Loosely, where there is some binary symmetry of a theory, the image
       of what you look at normally under this symmetry is referred to as
       the dual of your normal things."
    
    I cannot really say that i understood what you meant by 'dual',
    but let's assume we meant the same thing ;-).
    
    > >This is misleading because Address Obfuscation is vulnerable to the exact
    > >same information leaking problem as ASLR or PG, otherwise an attacker has
    > >to guess addresses (if he needs any, that is), there is no (deterministic)
    > >way around that.
    > >
    > It is *your job*, not mine, to go write a paper explaining how PaX/ASLR 
    > is better than Sekar et al.
    
    Actually, it is not my job (more below); i was merely pointing out
    that your comparison table has flaws beyond the PaX entry.
    
    > In the absence of such a paper, I'm having to guess at the differences,
    > in a very small portion of my paper.
    
    Why do you have to guess at the differences, and why did i not
    have to? I can assure you that we have access to the exact same
    information: the AO paper and the PaX source code and
    documentation (and i am sure that you could have asked questions
    of either party had you needed to).
    
    > I vigorously encourage you to go write a real paper and 
    > submit it to a strong refereed conference such as USENIX Security. 
    
    Thank you but i have other ideas about a 'strong refereed conference'.
    Besides, i am not into academia, my interests lie elsewhere.
    
    > Had you done this two years ago, you would not be having this silly
    > flame war over W^X with Theo.
    
    Thanks for the advice, but i have already seen in this very thread
    ([8]) where publishing papers in this area leads; i do not think i
    need any more of it.
    
    [1] http://www.phrack.org/show.php?p=58&a=4
    [2] http://link.springer.de/link/service/series/0558/bibs/2513/25130025.htm
    [3] http://www.phrack.org/show.php?p=59&a=9
    [4] http://immunix.org/StackGuard/usenixsc98.pdf
    [5] http://www.phrack.org/show.php?p=56&a=5
    [6] http://www1.corest.com/common/showdoc.php?idx=242&idxseccion=11
    [7] http://dictionary.reference.com/search?q=dual
    [8] http://marc.theaimsgroup.com/?l=bugtraq&m=106124676623652&w=2
    


