[logs] Re: Centralized Logging + large number of active hosts

From: Taneli Otala (taneli@private)
Date: Wed May 10 2006 - 10:50:58 PDT

A thousand Linux hosts, on syslog-ng (I'm assuming TCP connections), 
means a thousand open connections on the collector... Generally speaking 
you'll find subtle problems once you reach a thousand open sockets. You 
would need to tune the host carefully.
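
One concrete thing to check when tuning is the per-process file 
descriptor limit, since every TCP connection costs the collector one 
descriptor. A minimal sketch, assuming a Unix host; the function name 
fd_headroom and the reserve of 64 descriptors (for log files, pipes and 
so on) are illustrative assumptions of mine, not anything syslog-ng 
requires:

```python
import resource


def fd_headroom(expected_connections, reserve=64):
    """Descriptors left under the soft limit after the expected connections.

    `reserve` is an assumed allowance for log files, pipes, etc.
    Returns None when the soft limit is unlimited.
    """
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return None
    return soft - expected_connections - reserve


headroom = fd_headroom(1000)
if headroom is not None and headroom < 0:
    print("soft limit too low for 1000 connections; short by", -headroom)
```

The soft limit is often 1024 by default, in which case a thousand 
senders plus the collector's own files already do not fit.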

Once the host gets the slightest hiccup, you're losing data from all 
thousand hosts... not a pleasant thought. Imagine a denial of service 
(insider) timed with an attack -- that's what I would do if I wore my 
black hat...

Consider hierarchical collection... not a single wham-bam solution. 
Buffering would also be good. Basically, break the problem into smaller 
pieces. A thousand (and more) hosts into a single aggregation point does 
not sound good.
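
To illustrate what buffering at an intermediate collector buys you, here 
is a minimal sketch. The class name BufferingRelay, the send callback 
and the 10000-message cap are all hypothetical; a real relay would use 
whatever buffering the syslog daemon itself provides. The point is only 
that messages wait in a bounded queue while the upstream is unreachable 
and drain in order once it comes back:

```python
from collections import deque


class BufferingRelay:
    """Hold messages in a bounded queue while the upstream is down."""

    def __init__(self, send, max_buffered=10000):
        self.send = send  # callable that raises OSError on failure
        # When the queue is full, the oldest message is dropped.
        self.buffer = deque(maxlen=max_buffered)

    def submit(self, message):
        self.buffer.append(message)
        self.flush()

    def flush(self):
        """Forward buffered messages in order; stop at the first failure."""
        while self.buffer:
            try:
                self.send(self.buffer[0])
            except OSError:
                return False  # upstream still down, keep what we have
            self.buffer.popleft()
        return True
```

With a tier of such collectors, a hiccup at the central host costs 
nothing until a buffer fills, and an outage at one collector only 
affects the hosts behind it.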

A thousand hosts, at 1 MB a day each... is only about 11.6 KB/sec, if 
syslog traffic were evenly distributed... which it isn't... the peaks 
tend to easily be 100x the valley.
So, we're talking about 1.2 MB/sec sustained during the peak hours. The 
good news is that this is fairly easy to digest. My experience is that 
about 20 MB/sec is what you can expect (sustained) from the file system.
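
The arithmetic behind those figures, as a small sketch. It assumes 
1 MB means 1,000,000 bytes; the 100x peak factor and the 20 MB/sec file 
system figure are the rough estimates above, not measurements:

```python
HOSTS = 1000
BYTES_PER_HOST_PER_DAY = 1_000_000   # "1 MB a day", decimal megabytes assumed
SECONDS_PER_DAY = 24 * 60 * 60       # 86400
PEAK_FACTOR = 100                    # peaks roughly 100x the valley
DISK_BYTES_PER_SEC = 20_000_000      # sustained file system estimate

average = HOSTS * BYTES_PER_HOST_PER_DAY / SECONDS_PER_DAY
peak = average * PEAK_FACTOR
headroom = DISK_BYTES_PER_SEC / peak

print(f"average: {average / 1e3:.1f} KB/sec")   # about 11.6 KB/sec
print(f"peak:    {peak / 1e6:.2f} MB/sec")      # about 1.16 MB/sec
print(f"disk headroom at peak: {headroom:.1f}x")
```

So even at peak, the write load is a small fraction of what the file 
system can sustain; the disk is not where the trouble will come from.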

NFS? Have not had good experiences trying to aggregate into an NFS 
server... I would use SAN (if I was serious), or local disks (with RAID 
or replication).
Putting everything directly into NFS introduces another network-related 
point of vulnerability, plus it doubles (or triples) your network 
traffic, the latter obviously alleviated with a switch.

LVS... with NFS? I would steer clear of that approach.

Keywords for design: traffic analysis, peak traffic flow, reliability, 
buffering, staging, sustained throughput, fail-over, fail-back, 
acceptable outage duration


ScottO wrote:

>Okay, so here is the current task I am working on and was looking to see 
>how people have tackled it, basically any ideas out there to ponder. Any 
>thoughts, comments, etc. will be appreciated. Thanks.
>Key Highlights:
>- Centralized logging setup for over 1000 Linux hosts.
>- Need it to be scalable to even more eventual hosts.
>- Estimate less than 1MB of data per host per day. Want to do 
>summarization with syslog-ng to reduce network traffic, to make this 
>even less.
>- Need it setup so that the network isn't saturated.
>- Rollout syslog-ng to the hosts, for using filtering etc.
>Two ways I'm considering doing the backend right now:
>- Potentially some sort of Linux LVS cluster with an NFS backend. So a 
>pair of Linux load balancers that will hand off the syslog data to 
>centralized syslog servers in a cluster, that then dump into some shared 
>NFS server/solution.
>- Or, maybe having distributed "collector" syslog servers that somehow 
>dump back to a central syslog server. So a distributed architecture.
>The LVS setup seems appealing to me for the scalability potential, but 
>not sure if it is overkill.  What I am currently most concerned with is 
>the amount of traffic over the network.
>Thanks for any help.
>LogAnalysis mailing list

This archive was generated by hypermail 2.1.3 : Wed May 10 2006 - 16:26:14 PDT