FC: UK wiretapping "traffic" vs. "contents" a sham, by John Gilmore

From: Declan McCullagh (declanat_private)
Date: Wed Apr 10 2002 - 04:59:24 PDT

    Previous Politech message:
    
    "More on UK firms can't police personal email during office hours"
    http://www.politechbot.com/p-03363.html
    
    Also see responses from Tom Perrine and Matthew Francey toward the end of 
    this message.
    
    -Declan
    
    ---
    
    To: declanat_private, gnuat_private
    Subject: Re: UK wiretapping: "traffic data" versus "contents" a sham
    In-reply-to: <5.1.0.14.0.20020409063451.02246b20at_private>
    Date: Tue, 09 Apr 2002 15:03:48 -0700
    From: John Gilmore <gnuat_private>
    
     > There's a fundamental difference between what employers want to do (look at
     > the contents of e-mail their employees are sending and receiving)
     > and what the government wants to do (record nothing more than the to and
     > from addresses of e-mail and the time it was sent or received).
    
    There is no such fundamental difference.
    
    After 9/11 I wrote a long story explaining how little fundamental
    difference there is between "contents" and "envelopes" in the digital
    world.  I'll append it below (slightly revised).  We lost that battle
    while Congress was in Lemming Mode.  I'm sending this in the hope that
    the citizens of the UK won't get similarly taken advantage of.
    
    In US wiretap law, until last year, there was a clear legal
    distinction.  "Digits dialed on a telephone before the call is
    answered on the other end" was the original definition of the
    addressing information that could legally be captured by a "pen
    register" warrant, issued without probable cause to believe that a
    crime has been committed.  Tens or hundreds of thousands of pen
    register orders are issued every year in the US.  This info is also
    called "addressing and signaling info" or "traffic data"; they are
    all intended to mean the same thing.
    
    Everything else about a communication was "the contents of the
    communication", protected by the US Constitution and by the wiretap
    laws.  Tapping anything that happened "after the call was answered"
    required a cop to prove to a judge FIRST that they already had
    probable cause to believe that THIS PARTICULAR phone line used by THIS
    PARTICULAR person would hold evidence of A PARTICULAR crime.  Only
    about 1500 legal wiretap orders like this are reported every year in
    the US, though there is solid evidence that there are more police
    wiretaps which are not reported.
    
    The USA Patriot Act sponsors lied to and misled Congress into
    believing that there is a similarly clear distinction between the
    "traffic data" and the "content" of email.  They were wrong, and they
    were maliciously wrong, seeking deliberately to undermine the rights
    of citizens in order to make their own jobs easier.  I suspect that
    the UK proponents of expanded wiretapping are similarly misleading
    the public.
    
    I'll give you one example here and leave the rest for the appended note.
    
    Suppose someone is reading their email at a web site, like Hotmail,
    via a dialup call to their ISP.  Assume we already give up the clear
    distinction between "before the call is answered" and after, allowing
    SOME of the bits that are communicated after the modem calls the ISP to
    be wiretapped without probable cause.  Suppose the government has
    the right to wiretap the "addressing info in the email headers".
    Exactly which bits on the phone line are those?
    
    Well, the government can't tell, when you access a web site, whether
    there's going to be an email message in that web page or not.  So they
    have to at least look at every web page you access.  Even at an email
    web site, which bits of the web page might be the email addressing
    info?  Well, every ISP or mail service lays out the page differently,
    so some human is going to have to look at the whole web page.  Doesn't
    that already violate the rule that the government can't watch the
    CONTENT of what you are doing without a warrant that shows probable
    cause?  They just DID watch the content, and (surprise!) you were
    merely checking an auction at Hotmail, rather than reading an email
    there.
    
    Besides the fuzziness of "how can you tell which bits are legal to
    look at until you look at them", let's spend a moment on the social
    problem caused by widespread monitoring of "addressing information" or
    "traffic data".  The wiretap agencies are very good at building up a
    long-term database of who-is-talking-to-who by monitoring the sources
    and destinations of messages.  This is 90% of what the NSA does; it's
    called "traffic analysis".  Even if they can't crack the codes of a
    military organization, they know who issues the orders and who
    receives them, where they are located, their history of communication,
    etc.
    
    If such "traffic analysis" systems can be deployed against our own
    population, by inserting them into every ISP for permanent monitoring
    without any warrants, then everyone's freedom of association is
    violated.  The government knows who all your friends are, who all your
    relatives are, who you work with, who you play with.  If you are ever
    a suspect, everyone who communicates with you becomes a suspect -- and
    vice versa.  You will never find out what connections they have drawn,
    because there is no requirement to notify you that your communications
    were tapped, even if they later prosecute you based on investigations
    that were triggered by knowing your relationships.
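
    A minimal Python sketch of the who-is-talking-to-whom database
    described above.  The records, the names in them, and the helper
    functions are all hypothetical; the point is only that sender,
    recipient, and time are enough to map someone's associates, with no
    message content at all.

```python
from collections import defaultdict, deque

# Hypothetical (sender, recipient, time) records; no message content is stored.
RECORDS = [
    ("alice", "bob", "2002-04-09T10:00"),
    ("bob", "carol", "2002-04-09T10:05"),
    ("carol", "dave", "2002-04-09T11:30"),
    ("erin", "frank", "2002-04-09T12:00"),
]

def build_contact_graph(records):
    """Map each party to the set of parties they exchanged messages with."""
    graph = defaultdict(set)
    for sender, recipient, _time in records:
        graph[sender].add(recipient)
        graph[recipient].add(sender)
    return graph

def within_hops(graph, start, max_hops):
    """Breadth-first search: who is linked to `start` by at most `max_hops` messages."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if distance[person] == max_hops:
            continue  # do not expand beyond the hop limit
        for contact in graph[person]:
            if contact not in distance:
                distance[contact] = distance[person] + 1
                queue.append(contact)
    del distance[start]
    return distance

graph = build_contact_graph(RECORDS)
print(within_hops(graph, "alice", 2))  # {'bob': 1, 'carol': 2}
```

    If "alice" becomes a suspect, "carol" is flagged two hops away
    without ever having exchanged a message with her.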
    
    This would undoubtedly be useful information for exploring the
    connections that a discovered terrorist had -- but there are only
    small numbers of terrorists in the world.  It would be much more
    useful for tracking their POLITICAL OPPOSITION.  Their COMPETITION.
    Their EX-WIFE or their WAYWARD DAUGHTER.  Useful to the IRS.  To the
    party in power.  To drug enforcers who oppose drug legalization.  To
    diplomats who oppose popular reform movements.  To anyone at the
    levers of power, who seeks not to be dethroned.  To J. Edgar Hoover
    and to Senator McCarthy.
    
    The US and UK governments are trying to find and track all the
    connections among all the citizens, while at the same time trying to
    hide and obscure the connections inside the government itself.
    Locking up public documents where the public can't get them, holding
    "closed meetings", refusing to honor open government and
    freedom-of-information laws.  Wouldn't YOU like to know who in the FBI
    and the "Office of Homeland Secrecy" is deploying massive-scale
    wiretaps against the civilian population, so you could oppose them as
    traitors to their society?  The government certainly knows who is
    coordinating *resistance* to those massive wiretaps!
    
    What sparked this message is discussion of UK proposals to expand
    wiretap capability ("record nothing more than the to and from
    addresses of e-mail and the time it was sent or received").  I am not
    up to date on the current UK proposals, but history has shown the UK
    wiretap authorities to be very good students of the worst of the US
    wiretappers.  Perhaps the lesson below about what the US wiretappers
    did to us after 9/11 will be instructive to the UK populace.
    
    	John Gilmore
    
    [The following note was written when the USA Patriot Act was a draft,
    and EFF was reviewing its provisions, along with a coalition of other
    groups, including CDT.  The government's proposed wording passed, with
    minimal changes, with only a tiny number of courageous legislators
    opposing it.]
    
    The DoJ's current practice and their proposed wording drive a huge
    truck through the entire concept of limited extrajudicial wiretaps,
    destroying any semblance of Constitutionality.  I think that any
    wording change that is likely to pass Congress would be far inferior
    to the wording that we have today.  I believe that what they are
    currently demanding under pen register orders (e.g. email header lines
    except for Subject) are in no way permitted under the current statute
    (which is undoubtedly why they are so anxious to amend it).
    
    And I encourage the coalition to promptly bring a case and get
    judicial review of some specific instance of pen register access to
    anything above the physical dialing layer (such as email headers).
    The DoJ is breaking the law and violating the Constitution with every
    such order, and it's time we stopped them cold.
    
    It was clear what "dialing" information was in the phone network when
    the original wiretap law was passed.  It applied only to switched
    phone calls with a beginning and an end.  It applied to what was
    "dialed" before the call began.  It didn't even include anything
    "dialed" after completing the call, like a PIN number.
    
    The DoJ's current practice, as disclosed to Congressman Boucher's
    office, and their proposed statutory definitions in the ATA, stretch
    this whole concept in major philosophical ways.  The biggest expansion
    is to have pen register orders apply at any philosophical "layer" of
    telephony.  DON'T LET THEM GET AWAY WITH THIS!  Their only authority
    today is at physical layer, before the start of a "dialed" switched
    telephone call that begins by picking up the phone, dialing some
    digits, communicating for a particular period of time, and hanging up
    the phone.  They're trying to claim the "addressing" or "signalling"
    information not just at the physical circuit-switched level, where the
    previous law anchored it, but at every level of abstraction available
    inside the *content* of the communication.
    
    The reason you can't just let this concept "float" to higher levels is
    because there is no limit to the "addressing" or "signalling"
    information.  EVERY bit of the communication is just "addressing" or
    "signalling" information, at some level of abstraction.  What is
    "signaling" information to one layer is the CONTENT of the next layer
    down.  That CONTENT is protected by today's wiretap laws and by the
    Constitution.
    
    The way modern communication works is that every "layer" of protocols
    is layered on top of another layer.  This "layering" is why you don't
    have to care whether I read my email via an Ethernet or a wireless
    modem or a dialup phone line; on a Mac, a PC, Unix server, or a
    palmtop; using Eudora or Netscape or a web browser.  It's why I don't
    have to care, when I phone you, whether you are on a wired phone, a
    cordless phone, a cellular phone, or an IP phone.  Early data networks
    did not have these layers, and would only communicate with identical
    devices (like receiving SMS text messages today: it only works on
    cellphones).
    
    This "layering" is an easy concept, but most people who aren't
    specialists haven't thought very hard about it.  It's a very powerful
    concept at the heart of Internet communications.  The idea is that you
    can view the same communication at many different "layers of
    abstraction", and obtain both a large degree of flexibility, and a
    great degree of understanding of the communication, by examining many
    levels of the same message.  I'll work out a detailed example
    for you here, then you'll understand how "addressing and signaling"
    versus "content" are chimera concepts because what they mean depends
    on what "layer" of the communication you are looking at.
    
    For example, as this email message travels from my computer to yours,
    it first travels as VOLTAGE VARIATIONS along a twisted pair of wires
    that use the 100 megabit Ethernet physical layer standards.  The
    "addressing information" of that wire pair is merely the two physical
    endpoints into which it is plugged (my computer and a hub).  As for
    duration, the connection never ends; the wire always carries voltage
    variations from one end to the other.  At this level, the "content" is
    all of the voltages on the wire.
    
    Looking one level higher, the voltage variations actually are encoding
    the email message as a series of BITS.  Again, the address of those
    bits is "the other end of the wire down which they are pushed"; the
    wire isn't switched or dialed and only goes to one place.  These bits
    are only communicated when driver and receiver chips are attached to
    the Ethernet wire (at opposite ends); when the wire is unplugged, it
    no longer carries bits, even though it carries voltage.  Here the
    "content" is all of the bits from plug-in to unplug.
    
    One level above the bits are ETHERNET PACKETS.  These are particular
    sequences of bits, which indicate the start of a chunk of information,
    what Ethernet address it came from, what Ethernet address it's going
    to, the type of information carried, the payload of the packet, and a
    checksum to make sure the message wasn't garbled along the way.  As
    for duration, each Ethernet packet begins just after a particular
    series of bits appears on the wire, and ends with another unique
    series of bits; the whole packet lasts for much less than a thousandth
    of a second.  (The next Ethernet packet might be a few nanoseconds
    after this one, or might come along minutes or hours later.)
    
    As you can see, suddenly at this layer there are physical Ethernet
    addresses, which are 48-bit values (for this email, my computer's,
    which is 8:0:20:11:5e:32, and the Ethernet address of my Internet
    gateway box in the basement, which is 0:80:c8:ca:db:35).  These
    Ethernet addresses are assigned by manufacturers, like serial numbers,
    and are supposedly unique in the world (and as a practical matter,
    they are, unless someone takes pains to garble theirs).  At this layer
    we also have a radical change in duration: each Ethernet packet might
    be going to or from a different address, and on a single wire there
    can be thousands of them every second.  At this layer, the CONTENTS
    are the payload -- whatever bits come after the Ethernet header and
    before the final checksum.
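
    As a rough illustration of this layer, the Python sketch below
    splits a frame into its 14-byte header and its payload.  It assumes
    the preamble and trailing checksum have already been stripped, uses
    the two Ethernet addresses quoted above (zero-padded), and carries a
    placeholder payload; the function names are made up for the example.

```python
import struct

ETHERTYPE_IPV4 = 0x0800  # standard type value for an IPv4 payload

def format_mac(raw: bytes) -> str:
    """Render 6 raw bytes as colon-separated, zero-padded hex."""
    return ":".join(f"{byte:02x}" for byte in raw)

def split_ethernet_frame(frame: bytes):
    """Split a frame into (header fields, payload).

    Assumes the preamble and the trailing checksum were already removed.
    """
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    header = {"dst": format_mac(dst), "src": format_mac(src), "type": ethertype}
    return header, frame[14:]

PAYLOAD = b"(IP datagram bytes would go here)"  # placeholder, not a real datagram
frame = (
    bytes.fromhex("0080c8cadb35")        # destination: the gateway box
    + bytes.fromhex("080020115e32")      # source: the sending computer
    + struct.pack("!H", ETHERTYPE_IPV4)  # type of information carried
    + PAYLOAD
)

header, content = split_ethernet_frame(frame)
print(header)   # the "addressing" at this layer
print(content)  # the "contents" at this layer
```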
    
    One level further up, the payload of the Ethernet packet is an IP
    DATAGRAM.  Now the bits are organized into 8-bit bytes (called
    "octets" in the IP standard, because when IP was invented, bytes
    weren't always 8 bits wide).  The IP datagram includes 20 bytes of
    addressing or signalling information: The 4-byte source IP address,
    4-byte destination IP address, some miscellaneous information, the
    type of contents, and another checksum to detect garbling of the
    addressing info.  (In the case of this email, the source IP address is
    140.174.2.1, and the destination IP address is 206.112.85.50, the
    address of the machine that CDT designated for incoming email.)  As
    Jon Postel's text in RFC 791
    (http://www.rfc-editor.org/rfc/rfc791.txt) says:
    
         A distinction is made between names, addresses, and routes [4].   A
         name indicates what we seek.  An address indicates where it is.  A
         route indicates how to get there.  The internet protocol deals
         primarily with addresses.  It is the task of higher level (i.e.,
         host-to-host or application) protocols to make the mapping from
         names to addresses.   The internet module maps internet addresses to
         local net addresses.  It is the task of lower level (i.e., local net
         or gateways) procedures to make the mapping from local net addresses
         to routes.
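
    A hedged Python sketch of the same idea one layer up: it builds a
    minimal 20-byte IP header around a placeholder payload and splits it
    apart again.  The addresses are the ones given above; the header
    checksum is deliberately left at zero and the helper names are
    invented, so this is for illustration rather than real traffic.

```python
import socket
import struct

IPV4_HEADER_FORMAT = "!BBHHHBBH4s4s"  # 20 bytes when no options are present
PROTOCOL_TCP = 6

def build_ipv4_datagram(src: str, dst: str, payload: bytes) -> bytes:
    """Build a minimal datagram; the checksum field is left as 0 for brevity."""
    version_and_length = (4 << 4) | 5  # IPv4, header of five 32-bit words
    header = struct.pack(
        IPV4_HEADER_FORMAT,
        version_and_length, 0, 20 + len(payload),
        0, 0,                 # identification, flags and fragment offset
        64, PROTOCOL_TCP, 0,  # time to live, type of contents, checksum (omitted)
        socket.inet_aton(src), socket.inet_aton(dst),
    )
    return header + payload

def split_ipv4_datagram(datagram: bytes):
    """Split a datagram into (header fields, payload)."""
    fields = struct.unpack(IPV4_HEADER_FORMAT, datagram[:20])
    header_len = (fields[0] & 0x0F) * 4
    total_len = fields[2]
    header = {
        "src": socket.inet_ntoa(fields[8]),
        "dst": socket.inet_ntoa(fields[9]),
        "protocol": fields[6],
        "header_len": header_len,
    }
    return header, datagram[header_len:total_len]

PAYLOAD = b"(TCP segment bytes would go here)"  # placeholder
datagram = build_ipv4_datagram("140.174.2.1", "206.112.85.50", PAYLOAD)
ip_header, ip_content = split_ipv4_datagram(datagram)
print(ip_header)
print(ip_content)
```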
    
    Let's go up another level.  Inside the IP datagram is a TCP SEGMENT.
    This contains some addressing information (port numbers), sequence
    numbers, acknowledgements of receipt of earlier information, speed
    control information, a checksum to detect garbling, and the data being
    carried to the other end.  In my case, the destination port number will
    be 25 (your mail software); the source port number will be a temporary
    one picked by my machine.  The sequence numbers of the TCP segments
    containing our email will be randomly chosen and then will increment
    to count off the amount of data that's been successfully sent and
    received.  If any segment doesn't reach its destination and have a
    proper checksum, its sender will keep retransmitting it (in a series
    of separate IP datagrams) until it hears an acknowledgement of its
    receipt.  The duration of each TCP segment spans the time from when it
    was first transmitted, until it is received and acknowledged --
    usually a period of a second or so.  The "contents" of the TCP segment
    is whatever bits are being carried inside it.
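
    The TCP layer can be sketched the same way.  In the Python below,
    the destination port is 25 for mail; the source port, sequence
    number, window size, and helper names are arbitrary illustrative
    values, and the checksum is left at zero.

```python
import struct

TCP_HEADER_FORMAT = "!HHIIBBHHH"  # 20 bytes when no options are present

def build_tcp_segment(src_port, dst_port, seq, ack, payload: bytes) -> bytes:
    """Build a minimal segment; the checksum is left as 0 for brevity."""
    data_offset = 5 << 4  # header of five 32-bit words, kept in the top 4 bits
    flags = 0x18          # PSH and ACK
    header = struct.pack(
        TCP_HEADER_FORMAT,
        src_port, dst_port, seq, ack,
        data_offset, flags,
        8192, 0, 0,       # window, checksum (omitted), urgent pointer
    )
    return header + payload

def split_tcp_segment(segment: bytes):
    """Split a segment into (header fields, payload)."""
    (src_port, dst_port, seq, ack, offset_byte,
     _flags, _window, _checksum, _urgent) = struct.unpack(
        TCP_HEADER_FORMAT, segment[:20])
    header_len = (offset_byte >> 4) * 4
    header = {"src_port": src_port, "dst_port": dst_port, "seq": seq, "ack": ack}
    return header, segment[header_len:]

# Source port 1055 and sequence number 1000 are made-up example values.
segment = build_tcp_segment(1055, 25, 1000, 0, b"EHLO toad.com\r\n")
tcp_header, tcp_content = split_tcp_segment(segment)
print(tcp_header)
print(tcp_content)
```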
    
    Let's go up another level.  Extra TCP segments are also used to
    negotiate the beginning and end of a "TCP CONNECTION" (like "dialing a
    phone" negotiates the beginning of a phone call, and "hanging it up"
    ends it).  Such a TCP connection contains two "streams" of bytes, one
    carried in each direction.  Each stream contains from 0 to billions of
    bytes.  TCP guarantees that it will deliver those bytes, without
    change, and in the same order that they were sent in.  The duration of
    such a connection begins after one computer initiates it, once the two
    involved computers mutually agree to begin it; it ends when either of
    them ends it.  TCP connections frequently persist for hours or days,
    though the one used to transfer my email to you will only last for a
    second to ten seconds, because this message is only about twenty
    thousand bytes long.  The addressing information for the TCP
    connection is the same as for the TCP segments that make it up.  The
    "contents" of the TCP connection is the series of ordered bytes that
    are sent and received by whatever program opened up the connection.
    
    One more level up, we have the protocol that's used to transfer
    electronic mail in the Internet, called SMTP for Simple Mail Transfer
    Protocol (RFC 821).  When a computer has email for another one, it
    figures out how to get the email closer to its destination, then
    "opens a TCP connection" to that location.  Once that connection is
    open, the two sides exchange HELO ("hello") messages, and then
    negotiate whether and how and to whom they will transfer the email.
    They may send several email messages, then decide to end the connection.
    The information they exchange is sort of like what appears on the
    outside of an envelope of paper mail, and is by analogy called the
    "envelope information".  In our case, a sample sequence would look
    like this.  The lines that begin with numbers are sent by the machine
    at CDT; the ones that begin ">>> " are sent by my machine.  Normally
    the two machines alternate, one sending a line or a short series of
    lines and then awaiting a response from the other side.
    
       220 cdt.org ESMTP Sendmail 8.11.0/8.11.0; Mon, 1 Oct 2001 05:22:27 -0400
       >>> EHLO toad.com
       250-cdt.org Hello toad.com [140.174.2.1], pleased to meet you
       250-ENHANCEDSTATUSCODES
       250-EXPN
       250-VERB
       250-8BITMIME
       250-SIZE
       250-DSN
       250-ONEX
       250-ETRN
       250-XUSR
       250 HELP
       >>> MAIL From:<gnuat_private> SIZE=43
       250 2.1.0 <gnuat_private>... Sender ok
       >>> RCPT To:<jdempseyat_private>
       250 2.1.5 <jdempseyat_private>... Recipient ok
       >>> DATA
       354 Enter mail, end with "." on a line by itself
       >>> [[[ *****the email message itself***** ]]]
       >>> .
       250 2.0.0 f919MRu20878 Message accepted for delivery
       >>> QUIT
       221 2.0.0 cdt.org closing connection
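
    The envelope information in a transcript like this one can be
    pulled out mechanically.  The Python sketch below is one way to do
    it, assuming the same ">>> " convention for client lines; the
    function name and the shape of the result are invented for the
    example, and the transcript is abbreviated.

```python
import re

TRANSCRIPT = """\
220 cdt.org ESMTP Sendmail 8.11.0/8.11.0; Mon, 1 Oct 2001 05:22:27 -0400
>>> EHLO toad.com
250-cdt.org Hello toad.com [140.174.2.1], pleased to meet you
250 HELP
>>> MAIL From:<gnuat_private> SIZE=43
250 2.1.0 <gnuat_private>... Sender ok
>>> RCPT To:<jdempseyat_private>
250 2.1.5 <jdempseyat_private>... Recipient ok
>>> DATA
"""

def extract_envelope(transcript: str) -> dict:
    """Collect the host name, sender, and recipients announced by the client."""
    envelope = {"helo": None, "mail_from": None, "rcpt_to": []}
    for line in transcript.splitlines():
        if not line.startswith(">>> "):
            continue  # numbered lines are replies from the receiving server
        command = line[4:]
        match = re.match(r"(?:EHLO|HELO)\s+(\S+)", command, re.IGNORECASE)
        if match:
            envelope["helo"] = match.group(1)
            continue
        match = re.match(r"MAIL From:<([^>]*)>", command, re.IGNORECASE)
        if match:
            envelope["mail_from"] = match.group(1)
            continue
        match = re.match(r"RCPT To:<([^>]*)>", command, re.IGNORECASE)
        if match:
            envelope["rcpt_to"].append(match.group(1))
    return envelope

envelope = extract_envelope(TRANSCRIPT)
print(envelope)
```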
    
    At this level we've discovered some new "addresses", like
    <gnuat_private> and <jdempseyat_private>, as well as some "host names"
    like toad.com and cdt.org.  We're starting to get into the human-
    readable stuff here.  The duration of the SMTP connection is the
    same as the duration of the TCP connection, a few seconds.  The
    "contents" of the SMTP connection is the email message itself.
    
    Note that essentially ALL of the information being conveyed in the
    above exchange, except the email message itself, is "addressing and
    signaling" information if you define it in this fuzzy and
    non-layer-specific way.  We aren't even done with levels yet!  But
    even to get to where my machine is about to start sending the actual
    text of my email message (just after receiving the "354 Enter mail..."
    line above), my computer and CDT's will have sent and received twelve
    TCP segments, each contained in an IP datagram, each contained in an
    Ethernet packet, each one a series of bits, encoded as voltage
    variations in twisted pairs of wire.  ***ALL OF THE CONTENTS*** of
    those twelve packets will be "addressing and signaling" information
    for one layer or another.  No email has been sent yet; all the stuff that
    preceded it was just signalling!  Or, looked at from a lower layer, ALL of
    the contents of those packets will be "content", the information that
    is the whole point of the communication, protected against a prying
    government.
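
    The flip between "addressing" and "content" can be made concrete
    with a little arithmetic.  The Python sketch below takes one packet
    carrying the MAIL From line and counts its bytes from the viewpoint
    of each layer.  Header sizes assume no IP or TCP options and ignore
    the Ethernet checksum; the names LAYERS and classify are invented
    for the example.

```python
SMTP_LINE = b"MAIL From:<gnuat_private> SIZE=43\r\n"

# Outermost layer first: (layer name, bytes that belong to that layer's header).
LAYERS = [
    ("Ethernet", 14),        # two 6-byte addresses plus a 2-byte type
    ("IP", 20),              # header without options
    ("TCP", 20),             # header without options
    ("SMTP", len(SMTP_LINE)),
]

def classify(viewpoint: str):
    """Return (addressing_bytes, content_bytes) as seen from one layer.

    From a given layer's viewpoint, its own header and everything wrapped
    around it is addressing; everything nested inside it is content.
    """
    names = [name for name, _size in LAYERS]
    index = names.index(viewpoint)
    addressing = sum(size for _name, size in LAYERS[: index + 1])
    content = sum(size for _name, size in LAYERS[index + 1 :])
    return addressing, content

for name, _size in LAYERS:
    addressing, content = classify(name)
    print(f"{name:8s} addressing={addressing:3d} bytes  content={content:3d} bytes")
```

    The same bytes are almost all "content" from the Ethernet viewpoint
    and all "addressing and signaling" from the SMTP viewpoint.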
    
    So now, stop and answer for yourself the question: Does looking at
    those twelve packets require a wiretap warrant, or a pen register
    order?  If you guess wrong and intercept somebody's packets without a
    warrant, you are breaking the law and violating that person's
    Constitutional rights under the Fourth Amendment.
    
    Here's an even more interesting question: Does looking at the
    thirteenth packet, which contains the beginning of the SMTP layer's
    "content", (the email message), require a wiretap warrant, or a pen
    register order?
    
    Here comes a clue.  Stop here, and think about it before reading the clue.
    
    The clue is this: There's more "addressing and signaling" info to
    come.  The more ways you can "think about" this same set of
    information being conveyed, the more ways you can find to call a
    larger and larger fraction of it "addressing and signaling".  This is
    the scam that the DoJ is silently pulling on us, the public, and on
    Congress.
    
    OK, the next level is the EMAIL MESSAGE FORMAT, called RFC 822.
    This defines the basic structure of email messages, consisting of
    a "header" and some "text".  The header is familiar to every email
    user; it contains lines labeled with a word or phrase, a colon,
    and some more information.  An example:
    
       Date: Sun, 30 Sep 2001 04:10:25 -0700
       From: John Gilmore <gnuat_private>
       To: Jim Dempsey <jdempseyat_private>
       cc: gnu
       Subject: Re: Advice needed: pen registers as applied to the Internet
       In-reply-to: <p043301d2b7dd85c84729@[10.0.1.15]>
    	
    The header ends with a blank line, and what follows is the text of the
    message.  (E.g. this sentence is part of the text of the email message
    that it's in).
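
    A short Python sketch of this split, using the standard library's
    email parser on the sample header above.  The message body is a
    placeholder, and the "everything except Subject" view simply mirrors
    the practice described in this note; it is not any agency's actual
    tooling.

```python
from email import message_from_string

RAW_MESSAGE = (
    "Date: Sun, 30 Sep 2001 04:10:25 -0700\n"
    "From: John Gilmore <gnuat_private>\n"
    "To: Jim Dempsey <jdempseyat_private>\n"
    "cc: gnu\n"
    "Subject: Re: Advice needed: pen registers as applied to the Internet\n"
    "In-reply-to: <p043301d2b7dd85c84729@[10.0.1.15]>\n"
    "\n"                                  # the blank line ends the header
    "Hi Jim,\n"
    "\n"
    "(rest of the message text)\n"        # placeholder body
)

message = message_from_string(RAW_MESSAGE)

# Every header line except Subject, in the order it appeared.
header_without_subject = [
    (name, value) for name, value in message.items()
    if name.lower() != "subject"
]
text = message.get_payload()

for name, value in header_without_subject:
    print(f"{name}: {value}")
print("--- text ---")
print(text)
```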
    
    So, this "header" looks suspiciously like addressing and signaling
    information to me!  It says who it's from, who it's to, what the date
    is, what message it is replying to, etc!  The only thing vaguely
    resembling the "contents" is the Subject: line.  So, since that
    thirteenth packet is likely to only contain this sort of addressing
    gobbledygook, I guess your answer about whether a wiretap warrant or a
    pen register order is needed had better be the same for packet 13 as
    for packets 1-12.
    
    (Indeed, the DoJ claims that today they are demanding this layer of
    information under pen register orders.)
    
    So, let's go up another level.  Inside the RFC 822 standardized text
    of the message, there's another kind of "standard communication" in
    operation.  My message starts with:
    
       Hi Jim,
    
    and ends with:
    
       I hope this helps,
    
    	  John
    
    Just like the headings on preprinted stationery, or a "fax cover
    page", this stuff sure looks like addressing and signaling info to me!
    Yes, even these things that a human directly typed into an email
    message are just stylized forms of addressing, just like the digits
    that a human directly dialed on a rotary telephone.  There's a
    legitimate claim that I intended to address my friend "Jim".  Not some
    dotted IP address, or some forgettable email address, that's for
    certain.  And "Hi" doesn't convey any information, it's just signalling.
    
    OK, up one level from this human-oriented addressing information, we
    clearly have "content".  Well, or do we?  There are human generated
    words in there, but it might well be that there are other levels of
    abstraction to our communication.  The communication is happening in a
    context.  Jim works at CDT, I'm on the board of EFF.  Those
    organizations have a long and detailed history.  The fact that I'm
    taking the time to respond to Jim's query in detail, at 4AM,
    communicates something about the current state of the relationship
    between the two organizations.  The fact that the message relates to a
    government wiretapping initiative also says something about what the
    two organizations feel are important issues to put our time into.
    This is all very relevant "content", of the sort that a CIA analyst or
    a prosecutor might well impute into an intercepted message, but which
    is only implied by the actual text of the communication.
    
    When my ex-girlfriend of years ago phoned me on the night of September
    11th, the real message wasn't what she said; the real message was that
    when the world looked shaky and strange, she thought to call me.  The
    actual words we exchanged were merely signaling information.
    
           --
    
     > The current terms of the statute are not very clear (facility,
     > signaling), but the new ones would be just as vague.  There is
     > concern that "addressing" could include URLs, which can identify the
     > specific page visited or the titles of books browsed or search terms.
    
    You probably don't want me to go through a similar web browsing
    example in detail, but it consists of a similar set of layers inside
    layers inside layers.  What is content at the Ethernet layer is
    addressing information at the IP layer (e.g. an IP address).  What is
    content at the TCP layer is addressing and signaling information at
    the HTTP layer (e.g. a URL).  What is content at the HTTP layer is
    addressing and signaling information at the HTML layer (e.g. a frame
    for holding web pages).  What is content inside the frames may be
    signaling and addressing information about how to arrange images on
    the screen.  What is content inside a JPEG image file may be tags that
    specify who created the image, what program manipulated it, the serial
    number of the camera on which it was taken, etc -- addressing and
    signaling information.  Eventually, you get to the bits of the image,
    which are then interpreted at a different level, perhaps as a series
    of letters or perhaps as a photograph of a building exploding.  Both
    of these have other layers of meaning; those airplane crashes were
    just signals to the US, really; and the light and dark spots that make
    up the shapes of letters in an image are meaningless signals, until a
    human or an OCR program abstracts them (at another layer) into words
    and concepts.
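
    For the web case, a small Python sketch shows how much a single URL
    carries beyond the host name.  The host is CDT's; the path and
    search terms are hypothetical, chosen only to show what else rides
    along in the same "address".

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical URL: only the host name comes from the discussion above.
url = "http://www.cdt.org/search?q=wiretap+law"

parts = urlsplit(url)
host = parts.hostname           # which site was contacted
path = parts.path               # which page on that site
query = parse_qs(parts.query)   # what the reader typed into a form

print(host)   # www.cdt.org
print(path)   # /search
print(query)  # {'q': ['wiretap law']}
```

    Anyone who records the whole URL as "addressing" has also recorded
    the page visited and the search terms.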
    
     > Is my proposal better?  Does it respond to DOJ concerns as outlined
     > below?  (At some level. it is impossible to respond to DOJ's points,
     > since DOJ, typically, argues that it wants new language that covers
     > everything it is already getting.  Basically, it is asking the
     > Congress to amend the statute to authorize what is already happening,
     > but in terms vague enough so that DOJ can argue that further
     > information is covered as technology changes.)
    
    It is easy to respond to DoJ's points -- but not by acquiescing.  By
    insisting that what they are doing today is utterly illegal, and that
    what they seek under the law (the legalization of what they do today,
    plus wiggle room for more later) is utterly unconstitutional.
    
    Let's see:  The response of the federal police force when an emergency
    arises is to IMMEDIATELY BREAK THE LAW AND THE CONSTITUTION in an
    unmistakably massive way.  Why should we, or Congress, give such people
    any deference or any support?
    
     > In some sense this is my question:  is it possible to describe
     > addressing information in a way that covers "www.cdt.org" but nothing
     > thereafter?
    
    So, Jim, in summary, the answer is no.  Once you open the barn door to
    fuzzy layers of abstraction, the interpretation will be fuzzy.  And
    the history of FBI moves to push for more and more is well documented.
    Even with a very clear abstraction about telephone calls and digits
    dialed before the call begins, they now use pen registers which record
    the digits dialed AFTER the call began.
    
    Now you tell me that TODAY they are recording not only phone numbers,
    not only IP addresses using pen register warrants, but the full
    contents of email headers except for the Subject line??!!  And that
    this bill is trying to legalize this practice before someone calls
    them on it in court?  The right answer is to CALL THEM ON IT,
    IMMEDIATELY!!!  Do not agree to ANY wording that even comes close to
    legitimizing this completely illegitimate violation of both the
    statute and the Constitution.
    
    Face it, judges aren't technologists, don't know the paradigms
    involved, and the FBI just wants to catch as many people as possible
    and to hell with civil rights for "perps".  Wasn't it Ed Meese who
    put it most succinctly, they "must be guilty if we suspect them of
    something".  Both of them need the statute to have an utterly clear
    and utterly defensible line beyond which they cannot cross.
    
    The only way to preserve ANY bright-line test between the contents of
    a communication and the "addressing and signaling info" is to tie it
    directly to a physical layer of a physical information switching
    system (*see below) that they DO understand.  Which was what the
    original wiretap law tried to do.  The CALEA made the same call: it
    applied ONLY to the physical layer connectivity provider, NOT to any
    higher layer providers, and only applied to uninterpreted telephony
    signalling.
    
    We should SUPPORT these efforts, that stretch over decades, to restrict
    pen register orders to merely apply to telephonic dialing information.
    What the FBI is doing today is absolutely and completely illegal,
    and we should not lift a finger to make it become legal; in fact
    we should strain with all our might to prevent them from getting away
    with it.
    
    If someone wants to tap my Ethernet (or my fiber, or my leased T1
    line, or my IP router, or my web browser, or my email) and pull ANY
    information out of it, then they are going to have to convince a judge
    that they have probable cause.  The Constitution demands no less.
    The same is true if they do it at my ISP (the Fourth Amendment protects
    people, not places).
    
    I hope this helps,
    
    	John
    
    PS: (*) You will run into serious trouble if you try to tie the
    legislation to non-physical layers.  The conceptual layers can be
    layered in many interesting ways.  For example, email that goes
    between my site and my collaborator Hugh Daniel's site over the
    Internet goes through an additional couple of layers, because we have
    set up a "Virtual Private Network" between our two sites.  Rather than
    the IP datagrams being held in Ethernet packets, they are encrypted
    and transmitted in the "content" field of ESP (Encapsulating Security
    Payload) packets.  These ESP packets are contained in the "content" of
    larger IP datagrams.  Thus when an email goes from me to Hugh, these
    layers are easily visible:
    
    	text
    	headers and text
    	SMTP
    	TCP
    	IP
    	ESP
    	IP
    	Ethernet
    	bits
    	voltages
    
    Suppose just for jollies that you tried to legislate that judicial
    warrants were not needed to get "IP addresses" out of "IP datagrams".
    There are IP datagrams at two levels of the above stack, and one of
    them is fully encrypted.  Would this mean that the FBI could come with
    a non-judicial order to my company and demand to have access to the
    addresses in the upper-level encrypted IP datagrams I'm sending, which
    they could not obtain at a phone company location due to the
    encryption?
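
    As a rough illustration of the stack above, here is a minimal Python
    sketch.  The Layer class, the have_key flag and every address in it
    are hypothetical, invented for this example; it models only the
    nesting, not real packet formats.  It shows that "the IP addresses in
    an IP datagram" has two different answers depending on whether the
    observer can see through the ESP layer.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Layer:
    """One protocol layer: a name, its visible header fields, and a payload."""
    name: str
    header: dict
    payload: Union["Layer", bytes]
    encrypted: bool = False  # True if the payload is opaque without a key


def visible_ip_addresses(packet: Layer, have_key: bool = False) -> list:
    """Walk from the outermost layer inward and collect (src, dst) pairs
    from every IP layer, stopping at an encrypted payload unless the
    observer holds the key."""
    found = []
    layer = packet
    while isinstance(layer, Layer):
        if layer.name == "IP":
            found.append((layer.header["src"], layer.header["dst"]))
        if layer.encrypted and not have_key:
            break  # everything further in is ciphertext to this observer
        layer = layer.payload
    return found


# Hypothetical addresses: private-range hosts inside, documentation-range
# gateways outside.
inner_ip = Layer("IP", {"src": "10.0.1.5", "dst": "10.0.2.9"},
                 Layer("TCP", {"dport": 25},
                       Layer("SMTP", {}, b"headers and text")))
esp = Layer("ESP", {"spi": 0x1234}, inner_ip, encrypted=True)
outer_ip = Layer("IP", {"src": "192.0.2.1", "dst": "198.51.100.1"}, esp)
frame = Layer("Ethernet", {}, outer_ip)

# What a tap at the carrier sees, versus what the endpoints can see.
print(visible_ip_addresses(frame))
print(visible_ip_addresses(frame, have_key=True))
```

    The tap at the carrier gets only the outer gateway addresses; the
    inner pair appears only for someone holding the key, which is exactly
    the access the question above is asking about.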
    
    By the way, the Freedom network by Zero Knowledge, as well as the US
    Navy's own 'Onion Routing' protocol, protects online anonymity in a
    similar way.  They encrypt the higher level IP addressing information,
    while squirting the encrypted packets back and forth among a mesh of
    cooperating anonymity routers (using another layer of IP packets to
    carry them).  Since the Supreme Court has ruled in no uncertain terms
    in the last ten years that anonymous speech is guaranteed by the First
    Amendment, should the DoJ be handed the power to demand that that
    anonymity be breached, without even a judge looking over their order?
    NO.
    
    PPS:
     > And will IP telephony use IP addresses the way we now use telephone numbers?
    
    Nobody knows, since IP telephony is only used in niches today.  My
    guess would be no, because people's IP addresses change all the time,
    depending on what network they are plugged into or dialed into, what
    internet cafe they walked into, or what cell system they roamed into.
    Higher level abstractions such as email addresses, URLs, or user names
    will be the common way to reach someone by IP telephony.
    
    ---
    
    Date: Mon, 8 Apr 2002 22:15:19 -0700
    From: Tom Perrine <tepat_private>
    To: declanat_private
    CC: politechat_private
    In-reply-to: <5.1.0.14.0.20020409063451.02246b20at_private> (message from
    	Declan McCullagh on Tue, 09 Apr 2002 06:53:49 -0700)
    Subject: Re: FC: More on UK firms can't police personal email during office
       hours
    X-Organization: San Diego Supercomputer Center, San Diego, California
    
     >>>>> On Tue, 09 Apr 2002 06:53:49 -0700, Declan McCullagh <declanat_private> said:
    
         Declan> Previous Politech message:
         Declan> http://www.politechbot.com/p-03356.html
    
         Declan> ---
    
         Declan> Date: Tue, 09 Apr 2002 01:12:10 +0000
         Declan> From: Jeremy Barker <jeremy.barkerat_private>
         Declan> To: declanat_private
         Declan> CC: CBeckat_private
         Declan> Subject: Re: FC: UK firms can't police personal email at work during
         Declan> officehours

         Declan> There's a fundamental difference between what employers want to do
         Declan> (look at the contents of e-mail their employees are sending and
         Declan> receiving) and what the government wants to do (record nothing more
         Declan> than the to and from addresses of e-mail and the time it was sent or
         Declan> received).
    
    The following discusses US law and US law enforcement.  UK law is
    likely different, but I haven't been able to find the right laws
    online.
    
    Note that the FBI has stated that Carnivore has been offered to LE
    outside the US.  I would not be surprised to see that the UK was
    offered the software, considering the good working relationship
    between the FBI and New Scotland Yard, as well as the UKUSA monitoring
    agreements.
    
    Actually, the (US) government often wants the contents of the email,
    as well as the list of URLs accessed, the IRC and other "chat logs"
    and lots of other "content" stuff.  Not just the To and From
    addresses.  (See below for more detail.)
    
    Take Carnivore, for example.  I saw it, and I know some of what it is
    capable of.
    
    And even getting the "envelope" addresses is problematical.  Note that
    Carnivore, in some of its "pen register" modes, actually pattern
    matches (filters or triggers) on and captures the Subject line, as
    well as most if not all of the RFC 822 headers that are actually
    within the message itself.  It should be filtering on the RFC 821 SMTP
    transaction (MAIL FROM and RCPT TO), which is the true envelope.
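
    To make the envelope/header distinction concrete, here is a minimal
    Python sketch.  The session transcript, the addresses and the
    split_session helper are all hypothetical, and the parsing is
    deliberately simplified (no dot-stuffing, no server replies).  It
    says nothing about how Carnivore itself is implemented; it only
    separates what SMTP needs for delivery from what travels inside the
    message after DATA.

```python
from email import message_from_string

# A hypothetical SMTP session, client side only.
smtp_client_lines = [
    "HELO client.example",
    "MAIL FROM:<alice@example.org>",
    "RCPT TO:<bob@example.net>",
    "DATA",
    "From: Alice <alice@example.org>",
    "To: Bob <bob@example.net>",
    "Subject: Meeting about the merger",
    "",
    "See you at noon.",
    ".",
    "QUIT",
]


def split_session(lines):
    """Separate the RFC 821 envelope commands from the message sent after DATA."""
    envelope = {"from": None, "to": []}
    message_lines = []
    in_data = False
    for line in lines:
        if in_data:
            if line == ".":          # a lone dot ends the message
                in_data = False
            else:
                message_lines.append(line)
        elif line.upper().startswith("MAIL FROM:"):
            envelope["from"] = line[len("MAIL FROM:"):].strip("<> ")
        elif line.upper().startswith("RCPT TO:"):
            envelope["to"].append(line[len("RCPT TO:"):].strip("<> "))
        elif line.upper() == "DATA":
            in_data = True
    # Everything after DATA, RFC 822 headers included, is message content.
    return envelope, message_from_string("\n".join(message_lines))


envelope, msg = split_session(smtp_client_lines)
print("envelope:", envelope)
print("headers inside the message:", dict(msg.items()))
```

    The envelope holds two addresses and nothing else.  The Subject line
    only shows up once you parse what was sent after DATA, that is, once
    you are already reading the message.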
    
         Declan> Unfortunately a lot of people, perhaps deliberately, have
         Declan> misunderstood the government's monitoring proposals which talk
         Declan> about "traffic data".  "Traffic data" is legally defined as data
         Declan> showing the origin and destination of e-mail but people have been
         Declan> reading it as if it meant "data within traffic" - which is legally
         Declan> termed "content" and can only be monitored with special
         Declan> authorisation.
    
    Not exactly.  In the US, the standard of evidence required by a court
    to grant "pen register" access is substantially lower than that
    required for a "full content" or "Title III" search warrant.  "Traffic
    data" in the terms of a pen register is the telephone number, the time
    (and sometimes the duration?) of the telephone call.  There is no
    concept of origin and destination email addresses in the current "pen
    register" laws.  In the absence of competent guidance from Congress
    (and the Supremes) the FBI built (phone-number =
    email-address-headers) into Carnivore.
    
    The problem with interpreting the pen register laws for email is that
    the Supreme Court decision that set the lower standard for pen
    registers specifically mentioned that the "end points"
    (e.g. telephones) did not specify an individual, but a location and
    perhaps a list of people who might reasonably be expected to have
    access to the instrument.  Because the end point did not identify an
    individual, it was deemed to require a lower standard of protection.
    Additionally, because this was information (telephone number) that the
    communicant was required to present to the telephone company so that
    they could complete the call anyway, there was also a limited
    expectation of privacy.
    
    But the FBI (and most LE) has decided, as implemented in Carnivore
    "Classic", that almost all, if not all, email headers are "end
    points" or "phone numbers", as far as I could tell.  DCS 1000 may be
    different :-)
    
    This was just one of the problems cited by Bellovin, Blaze, myself and
    others in 2000 when the "Carnivore questions" first appeared.
    
    --tep
    
    -- 
    Tom E. Perrine <tepat_private> | San Diego Supercomputer Center
    http://www.sdsc.edu/~tep/     |
    
    ---
    
    To: declanat_private
    Cc: CBeckat_private, jeremy.barkerat_private
    Subject: Re: FC: More on UK firms can't police personal email during office
       hours
    From: Matthew Francey <mdfat_private>
    Date: Tue, 09 Apr 2002 13:00:10 +0000
    X-UIDL: e0f4bd7291756a4f924e541c438d772f
    
    Jeremy Barker <jeremy.barkerat_private>:
    
     >There's a fundamental difference between what employers want to do (look at
     >the contents of e-mail their employees are sending and receiving)
     >and what the government wants to do (record nothing more than the to and
     >from addresses of e-mail and the time it was sent or received).
    
    What is the difference between government X that conducts pervasive,
    massive, email traffic analysis in order to find and kill all members
    of some dissident group, and government Y which observes the content
    of the communication to do _exactly_ the same thing?
    
     >Unfortunately a lot of people, perhaps deliberately, have misunderstood the
     >government's monitoring proposals which talk about "traffic data".
     >"Traffic data" is legally defined as data showing the origin and
     >destination of e-mail but people have been reading it as if it meant "data
     >within traffic" - which is legally termed "content" and can only be
     >monitored with special authorisation.
    
    If anything, Barker has woefully misunderstood the government's intentions:
    he is unaware of or ignores the fact that the plain existence of communication
    between people is generally more useful than the content, particularly
    if the content has been encrypted.  This becomes all the more true as
    the frequency of communication increases ... various contextual clues
    and other side-information are no longer communicated directly within
    the "content", and thus much more difficult to discern unambiguously --
    even if crypto-layers are peeled away.
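
    A minimal Python sketch of the point, assuming nothing but made-up
    traffic records (sender, recipient, hour of day) and two hypothetical
    helpers, contact_graph and associates.  No message content is ever
    examined, yet the records alone show who talks to whom, how often,
    and who is linked to a given person.

```python
from collections import Counter

# Hypothetical traffic records: (sender, recipient, hour of day).
records = [
    ("a@example.org", "b@example.org", 2),
    ("a@example.org", "b@example.org", 3),
    ("b@example.org", "c@example.org", 3),
    ("a@example.org", "b@example.org", 2),
    ("d@example.org", "e@example.org", 14),
]


def contact_graph(records):
    """Count messages exchanged per unordered pair of correspondents."""
    pairs = Counter()
    for sender, recipient, _hour in records:
        pairs[frozenset((sender, recipient))] += 1
    return pairs


def associates(records, seed):
    """Everyone reachable from `seed` by following who-talks-to-whom links."""
    neighbours = {}
    for sender, recipient, _hour in records:
        neighbours.setdefault(sender, set()).add(recipient)
        neighbours.setdefault(recipient, set()).add(sender)
    seen, frontier = {seed}, [seed]
    while frontier:
        for other in neighbours.get(frontier.pop(), ()):
            if other not in seen:
                seen.add(other)
                frontier.append(other)
    return seen - {seed}


graph = contact_graph(records)
print(graph[frozenset(("a@example.org", "b@example.org"))])
print(sorted(associates(records, "a@example.org")))
```

    Starting from a single address, the second call pulls in a
    correspondent the seed never wrote to directly, and encrypting every
    message body would not change the result.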
    
    So while Barker is technically correct that there may be a "legal
    difference" between traffic and content analysis, there is ultimately
    almost no _practical_ difference in the real world.  Why, then, should
    we be impressed by the "special authorisations" and other bureaucratic
    games that governments play to achieve plausible deniability?
    
    
    
    
    -------------------------------------------------------------------------
    POLITECH -- Declan McCullagh's politics and technology mailing list
    You may redistribute this message freely if you include this notice.
    Declan McCullagh's photographs are at http://www.mccullagh.org/
    To subscribe to Politech: http://www.politechbot.com/info/subscribe.html
    This message is archived at http://www.politechbot.com/
    -------------------------------------------------------------------------
    Politech dinner in SF on 4/16: http://www.politechbot.com/events/cfp2002/
    -------------------------------------------------------------------------
    



    This archive was generated by hypermail 2b30 : Tue Apr 09 2002 - 17:22:30 PDT