FC: Data miner replies to Politech, says TIA can ID terrorists

From: Declan McCullagh (declanat_private)
Date: Tue Dec 10 2002 - 12:04:40 PST

    Previous Politech message:
    "What's so bad about Total Information Awareness? by Ben Brunk"
    http://www.politechbot.com/p-04234.html
    
    Gregory Piatetsky-Shapiro's bio is here:
    http://www.kdnuggets.com/gps.html
    
    ---
    
    From: "Braunberg, Andrew" <Abraunbergat_private>
    To: "'Declan McCullagh'" <declanat_private>
    Cc: "'KDnuggets Editor'" <editorat_private>
    Subject: FW: Can TIA work or how can you separate bad coins from good ones? 
    By repetition
    Date: Tue, 10 Dec 2002 14:52:45 -0500
    
    Declan,
         I follow the data mining industry fairly closely and am also a reader 
    of Politech. I passed on Ben's TIA concerns to Gregory Piatetsky-Shapiro, a 
    well-known expert in the industry. His response follows. He is happy to 
    have you post it to Politech if you desire.
    
    Best,
    
    Andrew Braunberg
    Senior Analyst,
    Data Warehousing
    Current Analysis
    
    abraunbergat_private
    912/236-6912
    
    "Never express yourself more clearly than you think."
    --Niels Bohr
    
    
    -----Original Message-----
    From: KDnuggets Editor [mailto:editorat_private]
    Sent: Tuesday, December 10, 2002 12:16 PM
    To: Braunberg, Andrew
    Cc: editorat_private; Farhad Manjoo
    Subject: Can TIA work or how can you separate bad coins from good ones? By 
    repetition
    
    Andrew,
    
    Thanks for the note.
    
    There are serious questions about whether TIA can work and how much privacy 
    it will erode.
    
    There is no doubt that TIA will produce some false positives.
    
    However, the statistical analysis given by Ben Brunk is very naive and 
    shows a lack of understanding of how the system might work. Data mining as a 
    useful technique has not been debunked -- all large companies are using it 
    every day.
    
    The whole idea of finding patterns is that with enough history and 
    repetition the suspicious patterns will stand out, despite noise and errors.
    
    Imagine that you have a thousand coins and that two of them are crooked 
    (i.e. probability of heads is not half but 1/4).
    
    If you flip each coin once, you cannot determine which ones are crooked.
    
    If you flip each coin twenty times, crooked coins will have about 3-7 
    heads, but a few dozen "false positive" honest coins will also have about 7 
    heads.
    
    If you flip each coin a thousand times, crooked coins will have about 230 - 
    270 heads, while honest coins will have 480 to 520 heads. So with a rule --
    
             number of heads < 350
    
    you will catch all crooked coins and no honest coins.
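    
    The coin arithmetic above can be checked with a short Python sketch. It is 
    an illustration only, assuming independent flips and exact binomial 
    probabilities. The function names (prob_below, expected_flags) are 
    hypothetical, and the 20-flip cutoff of "fewer than 8 heads" is an assumed 
    reading of "about 7 heads", not a rule stated in the message.

```python
from math import exp, lgamma, log

N_COINS = 1000     # total coins, from the example in the message
N_CROOKED = 2      # crooked coins, from the example in the message
P_HONEST = 0.5     # probability of heads for an honest coin
P_CROOKED = 0.25   # probability of heads for a crooked coin


def binom_pmf(k, n, p):
    """P(exactly k heads in n flips), computed in log space to avoid overflow."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_choose + k * log(p) + (n - k) * log(1 - p))


def prob_below(threshold, n, p):
    """P(number of heads < threshold) for n flips of a coin with P(heads) = p."""
    return sum(binom_pmf(k, n, p) for k in range(min(threshold, n + 1)))


def expected_flags(n_flips, threshold):
    """Expected number of honest and crooked coins flagged by 'heads < threshold'."""
    honest = (N_COINS - N_CROOKED) * prob_below(threshold, n_flips, P_HONEST)
    crooked = N_CROOKED * prob_below(threshold, n_flips, P_CROOKED)
    return honest, crooked


if __name__ == "__main__":
    # (20, 8) is an assumed cutoff for the 20-flip case; (1000, 350) is the
    # rule given in the message.
    for n_flips, threshold in [(20, 8), (1000, 350)]:
        honest, crooked = expected_flags(n_flips, threshold)
        print(f"{n_flips} flips, heads < {threshold}: "
              f"expect {honest:.3g} honest and {crooked:.3g} crooked coins flagged")
```

    Comparing the two rows shows how the same kind of rule goes from flagging 
    many honest coins at 20 flips to flagging essentially none at 1000 flips.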
    
    How many times will you need to flip each coin to find at least one crooked 
    coin?
    
    That depends on the level of certainty you want, the number of crooked 
    coins, and how crooked each coin is.
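    
    One way to make that dependence concrete is to search for the smallest 
    number of flips at which some cutoff meets a chosen error budget. The sketch 
    below is a rough illustration under assumed targets: max_miss (chance a 
    given crooked coin escapes) and max_false (expected number of honest coins 
    flagged) are hypothetical parameters, not figures from the message.

```python
from math import exp, lgamma, log


def binom_pmf(k, n, p):
    """P(exactly k heads in n flips), computed in log space to avoid overflow."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_choose + k * log(p) + (n - k) * log(1 - p))


def min_flips(n_honest=998, p_honest=0.5, p_crooked=0.25,
              max_miss=0.01, max_false=0.01, limit=2000):
    """Smallest n for which a rule 'heads < t' satisfies both error targets.

    Returns (n, t), or None if no such rule exists for n <= limit.
    """
    for n in range(1, limit + 1):
        below_honest = 0.0   # P(heads < t) for an honest coin
        below_crooked = 0.0  # P(heads < t) for a crooked coin
        for t in range(n + 2):
            miss = 1.0 - below_crooked            # crooked coin not flagged
            false_flags = n_honest * below_honest  # expected honest coins flagged
            if miss <= max_miss and false_flags <= max_false:
                return n, t
            if t <= n:
                below_honest += binom_pmf(t, n, p_honest)
                below_crooked += binom_pmf(t, n, p_crooked)
    return None


if __name__ == "__main__":
    print("p_crooked = 0.25:", min_flips())
    # A coin that is only slightly crooked needs many more flips.
    print("p_crooked = 0.35:", min_flips(p_crooked=0.35))
```

    Tightening either target, or moving p_crooked closer to p_honest, raises 
    the number of flips required.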
    
    Applying this to terrorists, if there is ONE terrorist that does ONE thing 
    that is a LITTLE suspicious, no system can find it.
    
    However, if there are MANY terrorists that do MANY things that are STRONGLY 
    suspicious, the system will find them.
    
    How many are needed? We don't know, but that is one of the questions that 
    TIA wants to investigate.
    
    Of course the real system will be using much more complex reasoning than 
    what I presented above.
    
    Gregory Piatetsky-Shapiro
    President, KDnuggets
    The Source of Expertise in Data Mining and Knowledge Discovery
    
    At 08:30 AM 12/10/2002 -0500, you wrote:
    
    Gregory,
    
    Good morning. I thought you would find the attached analysis of TIA 
    interesting. It comes from the Politech mailing list which is moderated by 
    Declan McCullagh, who was until recently Washington bureau chief of Wired 
    and is now at Cnet. The list is very well respected in the civil liberties 
    community. I thought you might have some unique insight or might wish to 
    pass the discussion on to your wider readership.
    
    Best Regards,
    
    Andrew Braunberg 
    
    
    
    
    -------------------------------------------------------------------------
    POLITECH -- Declan McCullagh's politics and technology mailing list
    You may redistribute this message freely if you include this notice.
    To subscribe to Politech: http://www.politechbot.com/info/subscribe.html
    This message is archived at http://www.politechbot.com/
    Declan McCullagh's photographs are at http://www.mccullagh.org/
    -------------------------------------------------------------------------
    Like Politech? Make a donation here: http://www.politechbot.com/donate/
    Recent CNET News.com articles: http://news.search.com/search?q=declan
    -------------------------------------------------------------------------
    


