Spam Hunting Practicum (was: Fw: Young lolitas 5-16 years old)

From: Crispin Cowan (crispin@private)
Date: Fri Oct 12 2001 - 09:57:33 PDT

    Alexey Panchenko wrote:
    
    >Speaking of something not being SPAM just because it says "This is not Spam!
    >This e-mail is never sent unsolicited".
    >
    I have been *assuming* that people are not actually taking the "this is 
    not spam" line seriously.  I certainly hope that is the case.
    
    >Here below is something I found in
    >my mailbox this morning -- would I need to try to convince the esteemed
    >CRIME group that it *was* unsolicited?  Normally, I would just hit the
    >delete button, but due to the contents of this website, I am wondering to
    >whom I could/should report Mr. "Daniel Leuenberger" and his web "service".
    >
    There are two fundamental ways to hunt spammers:
    
        * Manually: inspect the headers, extract the real IP addresses of
          the machines involved, use whois and traceroute to find the ISPs
          hosting the offending parties, and have their assets closed.
        * Automatically: use a service such as spamcop.net
    
    I spent a lot of time over the last several years using the manual 
    method.  It is viscerally satisfying, but very time consuming. What you 
    want to do is extract the ISPs that host the spammers, and then forward 
    the spam to "postmaster", "abuse", and "root" @ that ISP. These are the 
    things to look for to try to extract the spammer's ISPs.
    
        * In the full headers, you will find a string of "Received:"
          lines, showing the string of SMTP servers that the spam passed
          through. In each line there may be a claimed name of the machine,
          and then there is always a set of () parentheses that contains the
          true sending machine. The () name may be just an IP number, in
          which case you have to use network tools like whois and traceroute
          to hunt down the hosting ISP. The non-() name is never to be
          trusted unless it matches the () name, which you can verify with
          nslookup.
        * In the body, you will often find URLs. Sometimes the URLs are easy
          to figure out, e.g. http://tripod.com/~stupid_spammers_web_page
          [say].  In other cases, it's harder, e.g. http://stupidspammer.com
          but then whois and traceroute are effective. Sometimes the URL has
          been heavily obscured, such as using a numeric IP address, e.g.
          http://1.2.3.4 or worse a numeric IP encoded as a single decimal
          number, e.g. http://16909060 (which is 1.2.3.4).  In these cases, traceroute is your
          friend, as it will parse variously encoded IP numbers (just as
          your web client will parse them) giving you a real IP address to
          examine. Then go back to whois.  In some cases, you will get a
          REALLY creatively encoded URL that contains a bunch of different
          names interspersed with @ signs. Only the name following the last @
          matters, e.g. http://hey_lookie@private will resolve to google.com
        * Spam may be encoded in HTML, and often the images in the HTML are
          not included in the mail, but are just links that point to some
          web site.  Apply all of the above to locating the ISP hosting the
          images and have them disabled.
        * Spam may use HTML to encode a submission form, which invariably
          ends in a mailto: URL.  Inspect the e-mail addresses used and
          complain to the hosting ISP.
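
The header inspection described in the first bullet can be sketched in Python. This is only an illustration under stated assumptions: it assumes the common "from <claimed name> (<recorded name> [<IP>])" layout, which not every mail server uses, and the function names parse_received and received_hops are made up for this example. It pulls out the claimed name and the parenthesized data so the two can be compared before going to whois.

```python
import email
import re

# Assumed layout of a Received: value (not universal):
#   from <claimed name> (<recorded name> [<IP>]) by <receiving host> ...
RECEIVED_RE = re.compile(
    r"from\s+(?P<claimed>\S+)\s+\((?P<paren>[^)]*)\)", re.IGNORECASE
)
IPV4_RE = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")


def parse_received(value):
    """Split one Received: value into the claimed name and the () data."""
    match = RECEIVED_RE.search(value)
    if match is None:
        return None
    claimed = match.group("claimed")
    paren = match.group("paren")

    # The () part may hold a name and an IP, or just an IP number.
    ip_match = IPV4_RE.search(paren)
    ip = ip_match.group(0) if ip_match else None

    tokens = paren.split()
    recorded_name = None
    if tokens and not tokens[0].startswith("[") and not IPV4_RE.fullmatch(tokens[0]):
        recorded_name = tokens[0]

    # The claimed name is only worth anything if it matches the () name.
    names_match = (
        recorded_name is not None and recorded_name.lower() == claimed.lower()
    )
    return {
        "claimed": claimed,
        "recorded_name": recorded_name,
        "ip": ip,
        "names_match": names_match,
    }


def received_hops(raw_message):
    """Return the parsed Received: lines of a raw message, in header order."""
    message = email.message_from_string(raw_message)
    hops = []
    for value in message.get_all("Received", []):
        hop = parse_received(value)
        if hop is not None:
            hops.append(hop)
    return hops
```

Each entry's "ip" value is what you would then hand to whois and traceroute; an entry whose "names_match" is false has a claimed name that should be ignored.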
    
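The URL tricks from the second bullet can be undone in a few lines as well. The sketch below is an illustration, not a complete decoder: real_host is a hypothetical helper name, the example hosts are placeholders, and only two obfuscations are handled (names stacked in front of @ signs, and an IPv4 address written as one decimal number). Hex or octal encodings are left alone.

```python
import ipaddress
from urllib.parse import urlsplit


def real_host(url):
    """Return the host a URL really points at, or None if it has no host."""
    # urlsplit() discards everything up to the last @ in the network
    # location, so only the name following the last @ survives.
    host = urlsplit(url).hostname
    if host is None:
        return None

    # A host made only of digits is treated here as a 32-bit IPv4 address
    # written as a single decimal number; convert it to dotted form.
    if host.isascii() and host.isdigit():
        value = int(host)
        if value < 2**32:
            return str(ipaddress.IPv4Address(value))

    return host
```

For instance, real_host("http://first@second@spammer.example/page") yields the name after the last @, and the result can be passed to whois as described above. A digit-only host too large to be a 32-bit address is returned unchanged.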
    The above takes 10-15 minutes per spam, and requires network access in 
    most cases. Eventually I got tired of it. Now I use spamcop.net: you can 
    report a spam by just forwarding it to spamcop@private, and they 
    reply with an e-mail containing a URL that takes you to a web page for 
    reporting the spam, with all those headers parsed for you.  I bought 25 MB 
    worth of spam processing in mid-September, and their usage estimate 
    claims that I'm using about $2.50 worth of services per year at my 
    current burn rate.  Recommended.
    
    Crispin
    
    -- 
    Crispin Cowan, Ph.D.
    Chief Scientist, WireX Communications, Inc. http://wirex.com
    Security Hardened Linux Distribution:       http://immunix.org
    Available for purchase: http://wirex.com/Products/Immunix/purchase.html
    



    This archive was generated by hypermail 2b30 : Sun May 26 2002 - 11:27:19 PDT