Re: [logs] Signatures

From: Devdas Bhagat (devdas@private)
Date: Fri Aug 20 2004 - 10:23:04 PDT


On 20/08/04 09:51 -0400, Marcus J. Ranum wrote:
> [I updated the subject: line]
> 
> Stephen P. Berry wrote:
> >Well, part of this is obviously the heavily-entrenched `signature'
> >mentality which characterises the (vast) majority of system monitoring/log
> >analysis/intrusion detection/antivirus/spam handling products today.
> >Let us imagine that I have just ranted for several paragraphs about this.
> 
> :)
> 
> Let me point out that:
> a) you're right
> b) signatures have their place
> 
> "Signatures" have become a marketing buzz-word for "bad thing you don't
> want in a security product" to the point where vendors are bending over
> backwards to claim that their systems are "signatureless" for some very
> limited definition of the term "signature." Most of them are, in fact,
> signature-based anyhow.
> 
> A "signature' is a matching rule attached to a identifying diagnosis.

The term "signature", as I understand its use in the IDS industry, means
"a pattern of bits in a packet (or group of packets) that signifies an
attack". The antivirus industry uses a similar definition.
Of course, I may be wrong.

A signature triggers an action, and what action should be triggered is
defined by the rule. 

In the sense that you defined it above, all detections are caused by 
signatures. Even heuristics are signatures, with more wildcard
behaviour.
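Under that broad definition, a signature engine is just a list of
(matching rule, diagnosis) pairs. A minimal sketch in Python, using two
of the example rules from this thread (the patterns are illustrative,
not real IDS rules):

```python
import re

# Each signature pairs a matching rule with an identifying diagnosis.
SIGNATURES = [
    (re.compile(r"^DEBUG"), "SMTP Wiz attack"),
    (re.compile(r".{101,}"), "user name too long"),  # strlen > 100
]

def diagnose(data):
    """Return the diagnosis of the first signature that matches, or None."""
    for pattern, diagnosis in SIGNATURES:
        if pattern.search(data):
            return diagnosis
    return None
```

The point is that the diagnosis is fixed at rule-writing time: a match
tells you not just that something happened, but what it was.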

>         Thus:   IF ( strlen(username) > 100 ) THEN PRINT ("user name too long!\n");
>                 port 25: /^DEBUG/ -> "SMTP Wiz attack"
>                 if more than 20 RSTs come from a machine warn("scanning activity?")
> 
> I saw one "signatureless" security product that counts the number of ARP requests
> from a given host that are not fulfilled and, if the number goes higher than a set
> value, concludes that scanning is taking place. That's not signatureless. The
> signature is:
>         IF ( bad ARPs > X ) THEN PRINT ("scanning from $src")
> 
> Anyhow, if you accept my definition of "signature" (it is admittedly broad) then
> you'll notice that it's ONLY through a signature that you can get ANY KIND of
> fixed diagnosis about the significance or type of event that took place. That
> is incredibly useful!!

The general point that you are making is:
IF (condition) THEN (action) ELSE (see_next_input);

The hard part, of course, is determining the condition.
The action part is determined by business needs (or should ideally be).
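The ARP-counting rule quoted above fits the same IF/THEN mold. A toy
version, where the threshold stands in for the "X" in the rule (the
value 20 is arbitrary, chosen for illustration):

```python
from collections import Counter

ARP_THRESHOLD = 20  # the "X" in the rule; an arbitrary illustrative value

def scan_suspects(unanswered_arp_sources, threshold=ARP_THRESHOLD):
    """Given an iterable of source hosts that sent unanswered ARP
    requests, return the hosts whose count exceeds the threshold."""
    counts = Counter(unanswered_arp_sources)
    return {src for src, n in counts.items() if n > threshold}
```

The condition (the count and threshold) is the hard part to get right;
what to do with the returned set is the business-driven action part.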

> True signatureless systems generate results like: "the ratio of SYN to FIN
> packets is 2 standard deviations from the norm for this time of the day."
> They leave it entirely up to you to figure out the significance. By the way,
> one could maybe even argue that knowing that there IS significance to
> the relationship of SYN:FIN packets might be a form of signature. If, in the
> example above, the system said:
>         IF ( SYN : FIN > X ) THEN PRINT ("possible SYN flood in progress")
> that's a signature.
> 
> Let's not be quick to dismiss signatures!! They're the only tool we have
> for encoding knowledge into our security systems.
> 
> That's 1/2 of the problem!! The OTHER 1/2 the problem is how to encode
> ignorance (anti-knowledge) into our security systems!!!!!!!

Ignorance is not anti-knowledge. What we explicitly ignore is knowledge too.
Rather than defining what is bad, we define that everything except the
following patterns is bad. Negative matching, if you will.

> Nobody has tried this, yet. But what if someone tried to do "artificial ignorance"
> in an IDS: model what everything that's OK looks like and alert whenever
> traffic occurs that doesn't fire an "ignore this" signature. Note to readers: I
> hereby disclose this as prior art so if some idiot patents the idea, we can
> all point to this posting. ;)
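The "artificial ignorance" idea maps naturally onto log filtering: keep
a pile of "ignore this" patterns and report only the lines that match
none of them. A rough sketch (the ignore patterns here are invented
examples, not a recommended set):

```python
import re

# Patterns describing log lines we have decided are uninteresting.
# Anything that matches nothing here is, by definition, worth a look.
IGNORE = [
    re.compile(r"sshd\[\d+\]: session opened for user \w+"),
    re.compile(r"CRON\[\d+\]:"),
]

def interesting(lines):
    """Yield only the lines not covered by any 'ignore this' signature."""
    for line in lines:
        if not any(p.search(line) for p in IGNORE):
            yield line
```

Note the inversion: the signatures encode the known-good, and the
residue is what gets a human's attention.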
> 
> >Pause for thought:  how many monitoring widgets (system monitors, log
> >monitors, IDSes, or whatever) allow you to enunciate a risk analysis or
> >threat model in their configuration?  My contention:  if you can't
> >enunciate such a thing, then the concept of `anomaly' is almost certainly
> >poorly defined.
> 
> You're 100% correct. This is where the industry is sloooooooowly heading.
> What you're describing is 'just' "default deny" writ large. Know your policy
> and alert to deviations from it. This concept is ancient in security and has
> been long-considered to be "too expensive" or "too hard" etc.

Ok, how would you get a generic system to enunciate a risk analysis?
Syslog makes a very rough attempt at this by classifying logs by
facility and severity. Keep in mind that syslog was not originally
designed for security logs, but as a generic information mechanism for
an administrator to find out what was wrong with the system.
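Syslog's classification is literally arithmetic: the priority value in a
message encodes facility and severity together (per RFC 3164/5424,
priority = facility * 8 + severity), which is about as coarse as a risk
statement can get:

```python
# Syslog priority encoding (RFC 3164 / RFC 5424):
# PRI = facility * 8 + severity.

def pri(facility, severity):
    """Combine a facility and severity into a syslog priority value."""
    return facility * 8 + severity

def split_pri(pri_value):
    """Recover (facility, severity) from a priority value."""
    return divmod(pri_value, 8)
```

For example, facility 4 (auth) at severity 2 (critical) yields the
`<34>` seen at the front of an auth.crit message on the wire.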

"Alert to deviations from it" is nice in theory, but often enough there
are too many deviations to manage. Chatty network protocols, web
services, tunneled services and protocols, TLs wrapped connections....

What we are trying to do is prioritise the deviations and then say "All
deviations below this level are acceptable".

Something like:

"Under normal circumstances, there should be 0.25 Mb/s of NetBIOS
traffic. This should be ignored in reporting. However, there may be 
spikes of 5 Mb/s occasionally. This should not be alerted, but should 
be recorded for further analysis. A spike of more than 5Mb/s or a
sustained increase in traffic over 0.25 Mb/s is to be alerted on. Both
the occasional spikes and sustain increase should be reported in summary
reports."

Now, what kind of logging system supports this level of input? Also, the
example above gives baseline numbers, but in many scenarios the baseline
data is not available, or varies a lot over time. An example given in a
much earlier post, about where anomaly detection systems break down, was
a university setting: a host sees low activity for five months, and then
suddenly a large amount of activity for the next month. The cycle is
roughly consistent, but the anomaly-based detection system had no
capacity to learn over that long a period.

> Here's a theory:
>         - It's reaching the stage where the costs are going to flip-flop
>         so that it'll actually be CHEAPER to know what's good than
>         to look for what's bad. This is thanks to the huge proliferation
>         of hack-ware.

It always was. The trouble was that connectivity trumped security. Now
the costs of unbridled connectivity are becoming greater than the costs
of security plus some loss of connectivity, so organisations need to
decide whether connectivity trumps security, or vice versa. Only this
time, the cost of connectivity is possibly higher than its benefits.

> >My point:  either you tell a widget what is Known Bad (with the assumption
> >that everything else is Good---or at least Acceptable) or you tell a widget
> >what is Known Good (with the assumption that everything else is Bad).
> 
> Right! 100% on the money!
> 
> Put another way: it's easier to know who your friends are, than to keep
> track of all your enemies IF and ONLY IF you have fewer friends than
> enemies. ;)

A few good friends are a lot better than many average ones :)
Documented, standardised protocols which can be proxied are good. There
are few of those :).

Devdas Bhagat
_______________________________________________
LogAnalysis mailing list
LogAnalysis@private
http://lists.shmoo.com/mailman/listinfo/loganalysis



This archive was generated by hypermail 2.1.3 : Fri Aug 20 2004 - 10:53:38 PDT