2003-01-20T10:59:00 Buck Buchanan:

> In an ideal world, all systems would have WWV(B), GPS, and other
> radio broadcast time signal receivers built in along with network
> synchronization for those systems that can not sync to a broadcast
> signal.

I believe Best Practice these days is to have one or more timebase
references, either "local" references like GPS, or NTP repeaters
(Stratum 1 servers, I believe they call them) slaved directly off such
references (some of which are publicly available on the internet).
This is believed to give you absolute precision to within a few tens
of milliseconds once it has had time to stabilize completely.  All
other systems are synced off your time master.

> Also in an ideal world, there would be sub-microsecond delay
> between an event and the timestamp generation of the event.

I don't think absolute time precision in the microsecond range is
achievable on current commodity hardware.

> Reality is most systems are not synchronized, and there can be
> significant delays in timestamping events.  Timestamps in logs can
> be tampered with.

All of the above are true.  People who care about their timestamps
can synchronize their systems to within a few tens of milliseconds
using commonly-available tools and technologies, and that's believed
to be as close as is practical today; that's on the same order as the
jitter of latency in network propagation.

Re tampering with logs: if you wish to make that harder, use
encryption if needed for transit, and forward all your logs to a
tightly secured log server.  There's no possibility in principle of
preventing a compromised logging host from injecting bogus entries,
but the log server can at least apply its own timestamp and add its
own notation about the source from which the messages were received.
I don't know of off-the-shelf software used for this today myself,
but I suspect some of the enhanced syslogds offer this capability.

> I envision a time synchronization tool being given logs covering
> a minimum of several hours of logs from multiple systems that are
> supposed to cover the time period of interest.  The tool would
> scan the logs to look for events that would be common to two or
> more systems.  For any pair of systems, the tool would identify
> events that originate on each system and result in a log entry on
> the other.

I think, if this sort of tool is desired, the way to work it is to
deliberately inject timestamp log entries for the sole purpose of
establishing these relationships, rather than heuristically looking
for existing entries capable of proxying for this functionality.
(There's a rough sketch of what I mean further down.)

> By analyzing these events the tool should be able to estimate the
> time difference and network delays with error bounds on those
> estimates.

I'd be very interested to learn that a log analysis tool could
discover anything useful this way --- unless of course the system
clocks aren't synchronized with the best available tools; and if
they aren't, I have trouble seeing why the operators would be
expected to care about such analysis.  A priori clock correction
really seems to be the more fruitful line, and if that's done, all
such an analysis tool could discover is that network propagation
delay jitter (in which I include the network stacks of the endpoints
--- I'm referring to the irregularities in propagation delays from
user-space code on one machine, via the network, to user-space code
on another) is of the same order as any skew left between the system
clocks.
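To make the marker-injection idea concrete, here is a rough, untested
sketch.  It assumes the peer (the hypothetical loghost.example.com
below) runs a syslogd that accepts UDP on port 514; the marker text
and identifiers are made up for illustration.

  #!/usr/bin/perl -w
  # timemark.pl -- emit a matched pair of timestamp markers: one into
  # the local syslog (stamped by the local clock) and one into a remote
  # host's syslog (stamped by *that* machine's clock).  Comparing the
  # two entries offline gives clock offset plus one-way network delay.
  use strict;
  use Sys::Syslog;
  use Sys::Hostname;
  use IO::Socket::INET;
  use Time::HiRes qw(gettimeofday);

  my $peer = 'loghost.example.com';            # hypothetical peer host
  my ($sec, $usec) = gettimeofday();
  my $marker = sprintf("TIMEMARK src=%s t=%d.%06d",
                       hostname(), $sec, $usec);

  # 1. record the marker locally, stamped by the local syslogd
  openlog('timemark', 'pid', 'daemon');
  syslog('info', '%s', $marker);
  closelog();

  # 2. ship the same marker to the peer's syslogd, which will prepend
  #    its own timestamp on arrival ("<14>" is user.info in the syslog
  #    priority encoding)
  my $sock = IO::Socket::INET->new(PeerAddr => $peer, PeerPort => 514,
                                   Proto    => 'udp')
      or die "cannot reach $peer: $!";
  $sock->send("<14>timemark: $marker");

Run something like that in both directions, and repeatedly from cron,
and you get NTP-style pairs of offset-plus-delay measurements to feed
whatever analysis you like --- though, per the above, I'd expect the
residue to be mostly propagation jitter if the clocks are already
disciplined.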
> Analyzing events several hours later to compute another time
> difference estimate would give some idea of the clock drift
> between the systems.

Squashing clock drift out of existence is what NTP and clockspeed do
best.

> Development of this tool would also require determining the
> characteristics of timestamping delays for various common
> operating systems to further aid in computing the uncertainty
> bounds on an event.

I suspect that developing this tool would be impractical except with
systems having really decoupled clocks --- no NTP, nothing like it.
As for folks without the interest to use such commonly-available sync
code, I'd expect them to be uninterested in this log analysis tool as
well.

Simple manual timestamp rewriting is easy, trivial with timestamp
parsing/formatting code (e.g. the Perl modules TimeDate or
Date::Manip) plus simple arithmetic.  Automated analysis to determine
clock skew and jitter seems like a problem isomorphic to clock
synchronization, so why not fix the clocks if you need that code
complexity anyway?

> The logs given to the tool to analyze should be given
> trustworthiness ratings relative to the other logs.

I think the easiest way to get a measure of clock precision included
in the logs would be to have a cron job that asks the sync code (ntp,
clockspeed) what the current skew looks like and logs that
periodically.  (A rough sketch of such a script follows my sig.)

-Bennett
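P.S.  Something like the following, run from cron, is roughly what I
have in mind for the skew logging.  Treat it as a sketch: the column
positions in ntpq's peer billboard are from memory, so check them
against your own "ntpq -pn" output before trusting it.

  #!/usr/bin/perl -w
  # log-skew.pl -- run from cron, e.g.:
  #   */15 * * * * /usr/local/sbin/log-skew.pl
  # Scrapes the current offset to the selected NTP peer out of
  # "ntpq -pn" and records it via syslog, so the logs carry their own
  # note about how tight the clock was at the time.
  use strict;
  use Sys::Syslog;

  my ($peer, $offset);
  open(NTPQ, "ntpq -pn |") or die "ntpq: $!";
  while (<NTPQ>) {
      next unless /^\*/;                  # '*' marks the selected peer
      my @f = split;
      ($peer, $offset) = ($f[0], $f[8]);  # offset column, milliseconds
      $peer =~ s/^\*//;
      last;
  }
  close NTPQ;

  openlog('clockskew', 'pid', 'daemon');
  if (defined $offset) {
      syslog('info', 'ntp peer %s offset %s ms', $peer, $offset);
  } else {
      syslog('warning', 'no ntp peer selected; clock may be drifting');
  }
  closelog();

Grepping the resulting entries then gives a per-host record of clock
quality around the time of whatever event you're investigating.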