Re: [logs] NBS

From: Marcus J. Ranum (mjr@private)
Date: Thu Sep 02 2004 - 08:55:16 PDT

Jim Prewett wrote:
>How do you envision this being used?  I'm struggling with coming up with 
>an application.

This list recently had a protracted (and interesting) discussion of the
kinds of things that you might want to report. A lot of those revolved
around "top N instances of X" type reports or "new thing" type reports.
NBS is intended to be a general purpose driver for pre-computing both
of those types of values simultaneously. Back when I used to write
"top N instances of X" reports for my firewall(s) I did it the old way:
        - collect a bunch of stuff
        - sort | uniq -c | sort -r -n | head -N
that works fine but it turns out that it's a bit slow when your data
sets get large. And one of the reasons I believe a lot of people ignore
their firewall logs, etc, is because the data sets get large and it's
painful to wait for the multiple passes through "sort" to complete.
So NBS was written to give me that data as close to _instantly_
as possible. If you precompute event counts in a B-tree you
can just walk the tree and retrieve ordered results any time you
want 'em. Of course there is some computing cost associated
with maintaining the trees but it's far less than you get with a
SQL database.
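
For reference, the "old way" pipeline above can be written out as a small runnable sketch. The function name top_n and the sample addresses are made up for illustration; this shows only the multi-pass sort approach that NBS is meant to replace, not anything NBS does itself.

```shell
# Hypothetical helper: count identical lines on stdin and keep the
# $1 most frequent ones, using the classic multi-pass sort pipeline.
top_n() {
    sort | uniq -c | sort -r -n | head -n "$1"
}

# Made-up sample data: one client address per line.
sample='10.0.0.1
10.0.0.2
10.0.0.1
10.0.0.3
10.0.0.1
10.0.0.2'

report=$(printf '%s\n' "$sample" | top_n 2)
printf '%s\n' "$report"
```

Every report rereads and re-sorts the whole data set, which is the cost that grows painful as the logs get large.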

So, how do I envision this thing being used?

First, a simple example: say you take your firewall and tell it
to log all PERMITted operations. Let's say you're logging
the source, destination, and service and let's suppose for the
sake of argument that your firewall logs them in some
relatively simple format, e.g.:


So what you'd do is write some little perl script to strip
off the 1st, 2nd, and 3rd fields and stick each field into
a different NBS database. Now what you're doing is
maintaining a list of: most frequently seen clients permitted
through the firewall, most frequently seen servers accessed
through the firewall, and most frequently permitted services
through the firewall. You might also build an NBS database
of source/dest/service.
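
A rough sketch of the field-splitting step, done here with awk rather than perl. It assumes each log line begins with whitespace-separated source, destination, and service fields; that layout, the function name split_fields, and the file and database names are all assumptions for illustration, not the format of any particular firewall.

```shell
# Assumption: each log line starts with "source destination service",
# separated by whitespace. The layout is illustrative only.
split_fields() {
    # $1 = directory to receive one file per field; log lines on stdin
    awk -v dir="$1" '{
        print $1 > (dir "/sources")
        print $2 > (dir "/dests")
        print $3 > (dir "/services")
        print $1, $2, $3 > (dir "/combined")
    }'
}

workdir=$(mktemp -d)
printf '%s\n' \
    '10.0.0.1 192.0.2.10 smtp' \
    '10.0.0.2 192.0.2.10 http' \
    '10.0.0.1 192.0.2.20 http' | split_fields "$workdir"

# Each file could then be fed to its own NBS database on stdin, in the
# invocation form used elsewhere in this message, for example:
#   nbs -d sources-db -o sources-new < "$workdir/sources"
# (sources-db and sources-new are hypothetical names.)
```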

Remember that since the NBS database just takes a single
string, you can just concatenate whatever you like and stuff
it in there, i.e.:
echo "$1 $2 $3" | nbs -d database -o output
if [ -s output ]; then
        mail -s "never before seen service/target combo" root < output
fi
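
To make the "never before seen" idea concrete without NBS installed, here is a stand-in sketch that keeps seen strings in a flat file. It is only an illustration of the behavior: the function name never_seen is hypothetical, a flat file with grep is far slower than a B-tree, and this is not how NBS is implemented.

```shell
# Stand-in for the idea only: a flat file of seen strings, not a B-tree.
never_seen() {
    # $1 = file of previously seen strings; candidate strings on stdin.
    # Prints each string the first time it appears and records it.
    touch "$1"
    while IFS= read -r entry; do
        if ! grep -q -x -F -e "$entry" "$1"; then
            printf '%s\n' "$entry" >> "$1"
            printf '%s\n' "$entry"
        fi
    done
}

seen=$(mktemp)
# First batch: the same combination twice, so it is reported once.
first=$(printf '%s\n' '10.0.0.1 192.0.2.10 smtp' \
                      '10.0.0.1 192.0.2.10 smtp' | never_seen "$seen")
# Second batch: only the combination not already recorded is reported.
second=$(printf '%s\n' '10.0.0.1 192.0.2.10 smtp' \
                       '10.0.0.2 192.0.2.10 http' | never_seen "$seen")
```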

Now you can get an instant monthly report of the top firewall users with:
nbsdump -R -C -c 25 -d field-one-db
in this case you'd be less interested in detecting never before seen
entries and would mostly be using it for rapid counting. I wonder
if I should add a time range to the nbsdump utility so you can
tell it to restrict its output to things seen in the last X time? That
would save having to keep multiple databases.

Some of the databases (my guess is the firewall source/target ones)
will change frequently, so you might not want to get alerts
from those NBS databases. But services should (emphatically!)
not change often. Depending on the size of your network, internal
clients should not either. You might want to wrap NBS in a shell
script that matches and divides addresses into "internal" or
"external", etc.
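
One possible shape for such a wrapper is sketched below. It assumes "internal" means the RFC 1918 private IPv4 ranges, which may not match a given network; the function name classify_addr is hypothetical.

```shell
# Assumption: "internal" means an RFC 1918 private IPv4 address.
classify_addr() {
    case "$1" in
        10.*|192.168.*) echo internal ;;
        172.1[6-9].*|172.2[0-9].*|172.3[01].*) echo internal ;;
        *) echo external ;;
    esac
}

classify_addr 10.1.2.3
classify_addr 172.32.0.1
```

The result could be used to pick which database a given address goes into, so that alerts are only raised for the set that is expected to stay stable.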

Another use for NBS might be for looking at web server
logs. Suppose your web server is like mine and only
serves a number of static files and pages. You could
push your URL logs through NBS and get:
a) rapid popularity counts
b) alerts when someone tries to access a page that
        nobody has accessed before - you might detect
        a new worm or attack or type of spider
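
A small sketch of the URL extraction step, assuming the web server writes Common Log Format, where the requested path is the 7th whitespace-separated field. The function name extract_urls and the sample log lines are made up for illustration.

```shell
# Assumption: Common Log Format, request path in field 7.
extract_urls() {
    awk '{ print $7 }'
}

weblog='192.0.2.1 - - [02/Sep/2004:08:00:01 -0700] "GET /index.html HTTP/1.0" 200 1043
192.0.2.2 - - [02/Sep/2004:08:00:05 -0700] "GET /index.html HTTP/1.0" 200 1043
192.0.2.3 - - [02/Sep/2004:08:00:09 -0700] "GET /scripts/cmd.exe HTTP/1.0" 404 210'

urls=$(printf '%s\n' "$weblog" | extract_urls)
printf '%s\n' "$urls"
```

The extracted paths are single strings, so they could be piped into NBS in the same way as the firewall fields.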

I see the NBS driver as kind of a back-end engine that can
be harnessed into a lot of different shell scripts for log analysts.
I hope it's useful to the community; I find it useful myself. :)

>Is this because of my view that an admin should obsessively document their 
>environment in his/her log analysis tool?

I think they should, too. :) I think admins should be able to answer
questions like:
"when did this IP first appear on the network?"
"what are all the IPs this MAC has ever been issued?"
Keeping that kind of info in a lightweight, fast, simple
database seems like a good thing to me.
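
As an illustration of the first question, here is a sketch of a "first seen" lookup over a plain list of observations. The input layout (timestamp, then address, oldest first), the function name first_seen, and the sample data are all assumptions; a real setup would query whatever database holds the records.

```shell
first_seen() {
    # Input on stdin: "timestamp address" per line, oldest first
    # (a made-up layout). Prints the earliest timestamp recorded for
    # the address given as $1. The "" forces a string comparison.
    awk -v addr="$1" '($2 "") == (addr "") { print $1; exit }'
}

observations='2004-08-30T09:15:00 10.0.0.7
2004-08-31T11:02:00 10.0.0.9
2004-09-01T14:40:00 10.0.0.7'

first=$(printf '%s\n' "$observations" | first_seen 10.0.0.7)
printf '%s\n' "$first"
```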

>I'm really curious as to how you use this tool.

Me too, and thanks for asking!! I think the tool can be used
for all kinds of stuff. I'm hoping that people will come up with
ideas for it that I haven't, yet! :)


LogAnalysis mailing list
