[ISN] Cracking an algorithm bit by bit conclusion.

From: InfoSec News (isnat_private)
Date: Wed Feb 05 2003 - 22:18:32 PST

  • Next message: InfoSec News: "[ISN] Feds pull suspicious .gov site"

    +------------------------------------------------------------------+
    |  Linux Security: Tips, Tricks, and Hackery                       |
    |  Published by Onsight, Inc.                                      |
    |                                                                  |
    |  05-February-2003                                                |
    |  http://www.hackinglinuxexposed.com/articles/20030205.html       |
    +------------------------------------------------------------------+
    
    This issue sponsored by Onsight, Inc, your source for open-source
    solutions.
    
    Onsight offers on-site training on Basic Perl programming, Advanced
    Perl programming, CGI programming using Perl, Tcl/Tk, XML and
    JavaScript. All courses are hands-on, designed by real-world
    consultants and fully customizable. Because all classes are on-site,
    our overhead is low and our prices consistently beat those of our
    competitors. Every Onsight instructor is a seasoned consultant able
    to provide back-end web programming, network security, system
    administration and other support services.
    
    For more information, visit http://www.onsight.com
    
    --------------------------------------------------------------------
    
    Cracking an algorithm bit by bit conclusion.
    By Brian Hatch
    
    Summary: We complete our reverse engineering of a terribly-lame
    encryption algorithm.
    
    Last week we stripped off the outer layer of our home-grown crypto
    algorithm through some intuition. The data was made pure ASCII using
    a modified uuencode system.
    
    (For those of you keeping score, David Nash was the winner of the
    contest, having written a perl script earily similar to the one at
    the end of this newsletter. One copy of HLEv2 is coming his way.)
    
    Again, here are the sample strings you had to work with:
    
      # String 1
      !!@!1P!=P!?P!=P!?`!>`!<0!?0!;0!?0!=@!B`!,`!>@!A0!,P!>@!B@!A`
    
      # String 2
      !T`!M0!TP!V0!X0!Y0!C@!P0!Y0!W0!UP!Y@!H@
    
      # String 3
      !8@!>@!E`!EP!H`!GP!I0!GP!60!A@!I`!J@!L@!M@!7P!A0!N0!L@!L@!MP!J@!J@
    
      # String 4
      !^`!P0![0![0!IP!]0!H@!Z`!Y0!^0!I@!``![0!]0!]@!^@!`P!K0!`0!_0!_P!"`!P`
    
      # String 5
      !S@!O0!W`!SP!BP!V0!X@!X@!XP!D`!G@!D@!YP!V0![0!Z@!EP!X0![`!F@!W0!X0!\`!\@!K0
    
    
    Using the following perl script, you can reformat and uudecode the
    output to it's original binary state:
    
    
      prompt$ cat decrypt.pl
      #!/usr/bin/perl
      use strict;
    
      my $encrypted =
         '!!@!1P!=P!?P!=P!?`!>`!<0!?0!;0!?0!=@!B`!,`!>@!A0!,P!>@!B@!A`';
    
      # Add the backticks again
      $encrypted =~ s/(...)/\1``/g;
    
      print unpack( "u", $encrypted );
    
      prompt$ perl decrypt.pl
      Gww|xq}m}v_0z_3z__b
    
    
    (Note that, since some letters will be outside the printable range,
    I'll show them as underscores for the rest of this article.)
    
    Using this first string, we have some printable characters as our
    output, but other encrypted strings yielded very ugly results,
    outside the printable ASCII range. So instead, let's see the actual
    numerical values of those characters:
    
    
      prompt$ cat decrypt.pl
      #/usr/bin/perl
      use strict;
    
      my $encrypted =
         '!!@!1P!=P!?P!=P!?`!>`!<0!?0!;0!?0!=@!B`!,`!>@!A0!,P!>@!B@!A`';
    
      # Add the backticks again
      $encrypted =~ s/(...)/\1``/g;
    
      my $uudecoded = unpack( "u", $encrypted);
    
      # Loop through the letters one by one
      for my $char ( split //, $uudecoded ) {
              print ord($char), " ";
      }
      print "\n";
    
      prompt$ perl decrypt.pl
      6 71 119 127 119 124 120 113 125 109 125 118 136 48 122 133 51 122 138 132
    
    
    Hmmn. Now most of those numbers seem to come from a very small range,
    109 -> 138, and the difference (29) is pretty close to the size of
    the English alphabet (26). What if they are all shifted over from the
    actual letters? Let's see what happens if we ASCII-ify our encrypted
    string, and try to bring the letters back into a proper range:
    
    
      prompt$ cat decrypt.pl
      #!/usr/bin/perl
      ...
      # Loop through the letters one by one
      for ( my $offset=0; $offset < 50; $offset++ ) {
              for my $char ( split //, $uudecoded ) {
                      print chr( ord($char) - $offset ), " ";
              }
              print "\n";
      }
    
      prompt$ perl decrypt.pl
      _ G w _ w | x q } m } v _ 0 z _ 3 z _ _
      _ F v ~ v { w p | l | u _ / y _ 2 y _ _
      _ E u } u z v o { k { t _ . x _ 1 x _ _
      _ D t | t y u n z j z s _ - w _ 0 w _ _
      _ C s { s x t m y i y r _ , v _ / v _ _
      _ B r z r w s l x h x q _ + u _ . u _
      _ A q y q v r k w g w p _ * t _ _ t _ ~
      ...
    
    
    Well, this certainly looks more like normal letters here, but it's
    definitely not the answer. Note how the beginning begins with a
    capital, which could make sense.
    
    Then I tried to figure out if it was a substitution cipher from this
    point. The garbage characters toward the end, with two printable
    characters in the middle may be spaces. If that were the case, then
    the two letter word might be something like "an", "if", or "is".
    Unfortunately, my attempts to come up with a direct substitution
    cipher didn't get anywhere.
    
    When comparing the output of this decryption stage against the other
    strings I had to work with, I noticed that subsequent characters
    tended to get larger on the whole. Perhaps there was a bit more
    manipulation going on here. So I tweaked my loop again, adding a
    $count variable which would be subtracted from each ASCII value, with
    $count growing by one each time:
    
      prompt$ cat decrypt.pl
      #!/usr/bin/perl
      ...
      # Loop through the letters one by one
      for ( my $offset=0; $offset < 50; $offset++ ) {
              my $count=0;
              for my $char ( split //, $uudecoded ) {
                      print chr( ord($char) - $offset - $count), " ";
                      $count++;
              }
              print "\n";
      }
    
      prompt$ perl decrypt.pl
      _ F u | s w r j u d s k | # l v # i x q
      _ E t { r v q i t c r j { " k u " h w p
      _ D s z q u p h s b q i z ! j t ! g v o
      _ C r y p t o g r a p h y   i s   f u n
      _ B q x o s n f q ` o g x  h r  e t m
      _ A p w n r m e p _ n f w  g q  d s l
      _ @ o v m q l d o ^ m e v  f p  c r k
      ...
    
    Pay dirt! Look at the fourth line, "_Cryptography is fun" -- seems
    like we've figured out the encoding. This occurred when $offset was
    set to 3. So it would seem that the algorithm to encrypt was
    something like this:
    
      * Take the ASCII value of the character to be 'encrypted'.
      * Add 3 ($offset)
      * Add the position of this character ($count)
      * Uuencode
      * Strip the last two characters of the uuencoded value (``)
    
    I ran the same program, this time with the next encrypted string as
    input, and got the following:
    
      # Using string 2 this time
      prompt$ perl decrypt.pl
      Ð ´ Ñ Ö Ý à _ º Ý Ô Í Û _
      Ï ³ Ð Õ Ü ß _ ¹ Ü Ó Ì Ú _
      Î ² Ï Ô Û Þ _ ¸ Û Ò Ë Ù _
      Í ± Î Ó Ú Ý _ · Ú Ñ Ê Ø _
      Ì ° Í Ò Ù Ü _ ¶ Ù Ð É × _
      Ë ¯ Ì Ñ Ø Û _ µ Ø Ï È Ö _
      Ê ® Ë Ð × Ú _ ´ × Î Ç Õ _
      É ­ Ê Ï Ö Ù _ ³ Ö Í Æ Ô _
      È ¬ É Î Õ Ø _ ² Õ Ì Å Ó _
      Ç « È Í Ô × _ ± Ô Ë Ä Ò _
      ...
    
    Drat. I was hoping to see text on the 4th line, but was just getting
    garbage. Well, not exactly garbage, there were still unprintable
    characters in the same place as if they'd be word breaks, while the
    rest was filled with bytes with the high bit set. I compared the
    ASCII values of this set against the first set. This second string
    was offset much more than 3 characters. I tweaked the script to go
    from an $offset of 0 to 256 to see if anywhere there was a readable
    string. I also prefixed the $offset value so I could keep better
    track of things:
    
      # Using string 2
      prompt$ perl decrypt.pl
      0: Ð ´ Ñ Ö Ý à _ º Ý Ô Í Û _
      1: Ï ³ Ð Õ Ü ß _ ¹ Ü Ó Ì Ú _
      2: Î ² Ï Ô Û Þ _ ¸ Û Ò Ë Ù _
      3: Í ± Î Ó Ú Ý _ · Ú Ñ Ê Ø _
      4: Ì ° Í Ò Ù Ü _ ¶ Ù Ð É × _
      ...
      101: k O l q x { # U x o h v 1
      102: j N k p w z " T w n g u 0
      103: i M j o v y ! S v m f t /
      104: h L i n u x   R u l e s .
      105: g K h m t w  Q t k d r -
      106: f J g l s v  P s j c q ,
      ...
    
    Aha! Look at line #104 - "hLinux Rules.". We're definitely onto
    something.
    
    So, it seems our algorithm was correct, but the offset kept changing.
    Running the rest of the encrypted strings also yields readable output
    with different values for $offset. Although, for some reason, one of
    them was not quite readable:
    
      # Using string 4 this time
      prompt$ perl decrypt.pl
      ...
      124: | D o n ' t   e a t   _ e l l o _ _ _ n o _ .
      ...
    
    Hmmn - those underscores (originally unprintable characters)
    indicated I didn't quite have my algorithm tweaked correctly. Looking
    at the raw ASCII values, I realized I'd not taken into account the
    fact that when you take an original ASCII value and add or subtract
    increasing numbers, eventually you'll go beyond the bounds of an 8
    bit number. So, tweaking a bit further I added the following and
    re-ran it:
    
    
      prompt$ cat decrypt.pl
      #!/usr/bin/perl
      ...
              for my $char ( split //, $uudecoded ) {
                      my $ascii = ord($char) - $offset - $count;
                      $ascii += 256 while $ascii < 0;
                      print chr( $ascii ), " ";
                      $count++;
              }
              print "\n";
      }
    
      prompt$ perl decrypt.pl
      ...
      124: | D o n ' t   e a t   y e l l o w   s n o w .
      ...
    
    
    Excellent. Now we can decrypt the strings correctly. Unfortunately,
    there's still one nagging problem: we have no way to find the right
    $offset value, and need to try many of them and either find the right
    one using our eyes, or write something to check and see when the
    results match our expectations.
    
    I tried seeing if the offset was related to the size of the message,
    any of the other related fields in the database, or the message
    itself.[1] Then it struck me how blind I was - the offset must be
    stored in the mystery first character.
    
    Converting the initial character of each string to it's uudecoded
    ascii value you get:
    
      String 1: !!@`` == 6
      String 2: !T``` == 208
      String 3: !8@`` == 98
      String 4: !^``` == 248
      String 5: !S@`` == 206
    
    The first thing that jumps at you is that each value is even. The
    next thing that jumps out is that if you divide them by two, you get
    the offset that we need. The first message had offset 3, the 4th had
    offset 124, etc. Problem solved in full.
    
    So what problems were built into this custom crypto?
    
      * There was a random offset, but this offset was contained within
        the encrypted strings barely obfuscated.
       
      * The randomization (adding the character number to the ascii value
        before uuencoding) was extremely week, and pretty easy to guess.
       
      * The uuencoding was used to make the encrypted version pure
        printable text, but made the output regular and led to some
        correct assumptions about the plaintext and the method of
        creating the ciphertext.
       
      * The random initializer, in order to only occupy one byte, needed
        to be between 0 and 127, which is only 7 bits of 'security'.[2]
    
    I whipped up a set of perl subroutines to encrypt/decrypt text that
    was consistent with the algorithm that was in use, and presented them
    to the client who was relying on this home-grown cryptography to
    protect their sensitive data. After the panic and four letter words
    subsided, they quickly petitioned to create a committee to asses the
    feasibility of having new cryptographic security systems investigated
    to replace their bad algorithm.[3]
    
    Later they showed me their encryption/decryption C code. It was many
    pages long, not counting the pages it took to re-implement a one-byte
    uuencode/decode algorithm. The amount of mathematics and manipulation
    that went into the algorithm were far more advanced than the simple
    solution I had come across, involving several independent[4]
    variables taken together as per-message keys. Their algorithm did
    require the knowledge of a secret key, only available on their
    server. However the way in which these values were (mis)used, ended
    up cancelling out the need for this key, and everything reduced to
    the much simpler algorithm which I'd reverse engineered.
    
    This, of course, leads us to another tenant of cryptography:
    
      * Complexity does not guarantee security
    
    So, thus ends our tale of one bad crypto system. Unfortunately, there
    are many many more out there, waiting to be broken.[5]
    
    Here are the encryption/decryption routines for this bad crypto.
    Unfortunately, I can't provide the original C sources that were used
    for comparison. Feel free to use this as a learning/investigative
    tool, but for goodness sake don't use them for anything that should
    be secret.
    
    
    
     #!/usr/bin/perl
     # badcrypto.pl
     # copyright 2003, Brian Hatch, released under the GPL.
     #
     # Reverse-engineered algorithms.  Don't use these - they
     # are excellent examples of cryptography done poorly.
     #
     # Usage:  require "VeryBadCrypto.pl";
     #         $encrypted = VeryBadCrypto::encrypt( $string );
     #         $decrypted = VeryBadCrypto::decrypt( $encrypted );
     
     package VeryBadCrypto;
     
     sub encrypt {
            my($message) = join "", @_;
            my $encrypted;
    
            # Pick random initializer.  Must be less than 128
            my $count = int rand(127);
    
            # Break up message into individual characters
            my @characters = (chr($count), split //, $message);
    
            for my $char ( @characters ) {
    
                    my $ascii = $count + ord($char);
    
                    # convert to a 0-255 number
                    $ascii = $ascii % 256;
    
                    my $uuencoded = pack "u", chr($ascii);
    
                    # Strip the trailing `` chars we have from uuencoding
                    $uuencoded = substr($uuencoded,0,3);
    
                    $encrypted .= $uuencoded;
    
                    $count++;
    
                    # Avoid potential problems when integers wrap around
                    $count %= 256;
            }
            $encrypted
     }
    
     sub decrypt {
            my($encrypted) = join "", @_;
            my $decrypted;
    
            # Put back trailing "``" characters
            $encrypted =~ s/(...)/\1``/g;
    
            my $uudecoded = unpack "u", $encrypted;
            my @characters = split //, $uudecoded;
    
            my $initializer = shift @characters;
    
            # Find original initializer value
            $initializer = ord($initializer) / 2;
    
            # add one to compensate since the first one was used
            # to encode the initializer itself.
            $initializer++;
            
            # Ok, we've got our initializer, time to decrypt the rest.
            for my $char ( @characters ) {
                    my $ascii = ord($char);
    
                    $ascii -= $initializer;
                    $ascii += 256 while $ascii < 0;
    
                    $decrypted .= chr($ascii);
    
                    $initializer++;
            }
            $decrypted;
     }
    
    
    NOTES:
    
    [1] Which was pretty foolish, since the message wouldn't be available
    to the decrypt routine, but I wasn't thinking clearly.
    
    [2] If you could even call this algorithm secure.
    
    [3] Ahh, now if that isn't a definitive step, I don't know what is.
    
    [4] But, by their error, these other values were not independent,
    which is why they ended up being irrelevant.
    
    [5] And if you live in the US, you'd better hope they're not being
    used to protect copyrighted works, or your knowledge of rot13 could
    land you in jail for DMCA violations.
    
                                -------------                            
    Brian Hatch is Chief Hacker at Onsight, Inc and author of Hacking
    Linux Exposed and Building Linux VPNs. He developed his first
    (horribly insecure) cryptographic algorithm when he was six. It was
    no better than rot13, but took up a lot more space. Brian can be
    reached at brianat_private
    
    --------------------------------------------------------------------
    This newsletter is distributed by Onsight, Inc.
    
    The list is managed with MailMan (http://www.list.org). You can
    subscribe, unsubscribe, or change your password by visiting
    http://lists.onsight.com/ or by sending email to
    linux_security-requestat_private
    
    Archives of this and previous newsletters are available at
    http://www.hackinglinuxexposed.com/articles/
    
    --------------------------------------------------------------------
    
    Copyright 2003, Brian Hatch.
    
    
    
    -
    ISN is currently hosted by Attrition.org
    
    To unsubscribe email majordomoat_private with 'unsubscribe isn'
    in the BODY of the mail.
    



    This archive was generated by hypermail 2b30 : Thu Feb 06 2003 - 01:33:33 PST