I want to chime in with a vote, and an example, of why exposing the internal data structures might be an acceptable idea.

Let's suppose that I have a program that is going to do logging. It is a big program, and to simplify the logging, different parts of the program are going to log differently. For simplicity, I want to be able to create logging instances quickly and easily, so I have a custom logger that takes two arguments for the "core" information, plus some other arguments for the data to be logged. The code could look like this:

    static logcontrol_t loginit = {
        LOGCONTROL_INIT,            /* this structure has never been used */
        "Initialization Errors",    /* human readable name for the structure */
        LOGCONTROL_PRIORITY_ERROR,  /* default priority */
        LOGCONTROL_CHANNEL_INIT,    /* predefined "initialization" channel */
        LOGCONTROL_SOME_FLAG,       /* set a flag, such as blocking, or encryption */
    };

    logcontrol_t logruntime = {
        LOGCONTROL_INIT,            /* this structure has never been used */
        "Runtime Errors",           /* human readable name for the structure */
        LOGCONTROL_PRIORITY_ERROR,  /* default priority */
        "Random Channel",           /* user defined channel */
        LOGCONTROL_NO_FLAGS,        /* no flags are set here */
    };

So now, when I call the syslogreplacement() function, I have a choice of two "styles" of log message to generate. One, loginit, is only available in a certain module of the program, but the other is global and can be called from anywhere.

-----init.c-----
init()
{
    syslogreplacement(loginit,<something to syslog>);
    syslogreplacement(logruntime,<another something to syslog>);
}
-----end-----

-----main.c-----
main()
{
    init();
    syslogreplacement(logruntime,<something else to syslog>);
}
-----end-----

So now, from init.c I can easily syslog in two different ways, but only one way from main.c.

The next problem is: what if I want to override something?
The only thing I can think of to override is the priority, so it can be defined as a special number and passed as an argument to a slightly different function.

    syslogreplacement2(logruntime,LOGOVERRIDE_CRITICAL, <another something else>);

The benefits of this system are that I can create a new "log" instance on the fly, even as a locally scoped variable in a for() loop.

    for (i=0;i<MAXINT;i=nextprime(i))
    {
        logcontrol_t l={LOGCONTROL_INIT,
                        "prime for loop",
                        LOGCONTROL_PRIORITY_ERROR,
                        LOGCONTROL_CHANNEL_KERNEL,
                        LOGCONTROL_FLAG_EPHEMERAL}; /* don't cache an fd in me */
        syslogreplacement(l,<something to log>);
    }

By having all the "guts" in a place where the programmer can get to them, he is encouraged to use them. We should learn from the mistakes of syslog() and its hardcoded "facility" by letting facilities be arbitrary strings, and providing lots of predefined (and REGISTERED, to prevent collisions) facilities.

One option I didn't complicate things with was the idea that we could have a "facility" and a "sub-facility":

    logcontrol_t log={
        LOGCONTROL_INIT,
        "Yendor Lives!",
        LOGCONTROL_PRIORITY_ERROR,
        LOGCONTROL_CHANNEL_NETHACK,
        "pet movement",             /* a finer granularity of the NETHACK log */
        LOGCONTROL_FLAG_BLOCKING|LOGCONTROL_FLAG_TCP
    };

There are lots more policy items that could go into logcontrol_t as well, such as automatic transmission of the hostname, uid, gid, local time, mac address, CPUid, current PID, i-ching hexagrams, etc.

A downside, of course, is that if everyone can stick random things into the "facility" and "sub-facility" fields, we could have a proliferation of them. Or is that a downside? If the logging system were designed from the beginning with the idea that this will happen, then maybe it would cope just fine.

OK, so now I have theorized what I think is a pretty cool logging paradigm. What about the actual data? I want data that I can easily program, and not have to muck with.
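To make the logcontrol_t idea concrete, here is a minimal sketch in C of the kind of type the initializers above seem to assume. The message shows only the initializers, so the field names, the numeric values of the constants, LOGOVERRIDE_NONE, and the helper logcontrol_format() are all hypothetical. The sketch formats into a buffer instead of writing to a real log channel.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical layout: every field name and constant value below is an
 * assumption made for illustration, not part of the original proposal. */
typedef enum { LOGCONTROL_INIT = 0, LOGCONTROL_READY = 1 } logcontrol_state_t;

#define LOGCONTROL_PRIORITY_ERROR   3
#define LOGOVERRIDE_NONE            (-1)    /* hypothetical: keep the default */
#define LOGOVERRIDE_CRITICAL        2

#define LOGCONTROL_CHANNEL_INIT     "init"  /* predefined channels as plain strings */
#define LOGCONTROL_NO_FLAGS         0u

typedef struct {
    logcontrol_state_t state;   /* LOGCONTROL_INIT until first use */
    const char *name;           /* human readable name */
    int priority;               /* default priority */
    const char *channel;        /* predefined or user defined channel */
    unsigned flags;             /* bitwise OR of flag constants */
} logcontrol_t;

/* Format one record into buf. The override replaces the default priority
 * unless it is LOGOVERRIDE_NONE. Returns whatever snprintf() returns. */
int logcontrol_format(char *buf, size_t len, logcontrol_t *lc,
                      int override, const char *text)
{
    assert(buf != NULL && lc != NULL && text != NULL);
    int prio = (override == LOGOVERRIDE_NONE) ? lc->priority : override;
    lc->state = LOGCONTROL_READY;   /* the structure has now been used */
    return snprintf(buf, len, "<%d> [%s] %s: %s",
                    prio, lc->channel, lc->name, text);
}
```

One design choice differs from the calls in the message: the sketch takes a pointer so that the "never been used" state can be updated in place. Passing the structure by value, as written in the message, would need a macro wrapper or some other way to record that state.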
Making an object that has to be created, filled, and then passed is too annoying. Instead, we have string keys and string data (kinda like gnudb).

    syslogreplacement(log,LOGKEY_USERNAME,"root",LOGKEY_TEXT,"is a dork");

Since keys are just little strings, we can make them up as we go.

    syslogreplacement(log,"koo-koo","kachoo");

Now I have free text! But also a terrible varargs problem. The last argument of the call could be a special token.

    syslogreplacement(log,"varargs","sucks",LOGKEY_TERM);

I admit, that isn't pretty, but assuming I don't make a mistake I have a pretty spiffy log system. Things are passed around as strings, and the underlying structure doesn't care much what they are. I can create channels, and facilities, and datatypes on the fly without worrying whether or not the implementors planned for them to exist. As long as the final logfile manages to preserve the key and data (and protect against quoting problems) then my data has structure, even though it is just text.

If the analysis software wants to analyze the logs, it needs to know what the keys mean, but with a registered and populated default listing of keys, plus a design paradigm that expects them to be added freely, it shouldn't be a problem.

On Thu, Dec 19, 2002 at 01:32:20PM -0500, Marcus J. Ranum wrote:
> Darren Reed wrote:
> >initlogging(name,options);
> >logitems[0].type = STRING;
> >logitems[0].value = "marcus login: from";
> >logitems[1].type = HOSTNAME;
> >logitems[1].value = where;
> >addlogmessage(logtype,priority,logitems,2);
>
> This API has problems - mostly because it's exposing
> the internal data structure to programmers who will
> either get it wrong or mess with it. Thus it'd be
> impossible to change the structure in the future. For
> all that the API I was suggesting was butt-ugly, you
> could replace it completely without changing user-land
> code since it's all done through calls rather than
> direct assignments.
>
> >Maybe this is good, maybe it's bad, but it gets away from
> >varargs and is hopefully clear about relationship between type and
> >object data.
>
> Typing log data's a problem I think it's best to ignore.
> Systems aren't going to always have the best information
> and if they can't type it right we need to give them a
> chance to send something else - whatever they have. Which
> means that a lot of this stuff is going to get promoted
> to strings eventually. So you may as well just make it
> official and treat everything as string data since that's
> where it'll wind up. How do you deal with a machine address
> that is variously "amnesiac" 127.0.0.1 "127.0.0.1" and
> "burfle.ranum.com" (not really in DNS) and "www.ranum.com"
> (is in DNS)
>
> Must keep it simple and stupid or it'll be ASN.1 before
> we know what hit us..
>
> mjr.
> ---
> Marcus J. Ranum http://www.ranum.com
> Computer and Communications Security mjrat_private
>
> _______________________________________________
> LogAnalysis mailing list
> LogAnalysisat_private
> http://lists.shmoo.com/mailman/listinfo/loganalysis

--
William Colburn, "Sysprog" <wcolburnat_private>
Computer Center, New Mexico Institute of Mining and Technology
http://www.nmt.edu/tcc/ http://www.nmt.edu/~wcolburn
_______________________________________________
LogAnalysis mailing list
LogAnalysisat_private
http://lists.shmoo.com/mailman/listinfo/loganalysis
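The terminated key/value call from the message above (the one ending in LOGKEY_TERM) could be sketched in C roughly as follows. This is only an illustration under stated assumptions: the function name logkv_format(), the helper escape_value(), the value chosen for LOGKEY_TEXT, the use of a null pointer as LOGKEY_TERM, and the key="value" output framing are all hypothetical. The sketch writes into a caller-supplied buffer and leaves out the logcontrol_t argument to stay short.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical definitions: a typed null pointer marks the end of the
 * argument list, and keys are ordinary strings. */
#define LOGKEY_TERM ((const char *)0)
#define LOGKEY_TEXT "text"

/* Copy src into dst, putting a backslash before '"' and '\\' so a value
 * cannot break the key="value" framing. Assumes room >= 1. */
static void escape_value(char *dst, size_t room, const char *src)
{
    size_t n = 0;
    while (*src != '\0' && n + 2 < room) {
        if (*src == '"' || *src == '\\')
            dst[n++] = '\\';
        dst[n++] = *src++;
    }
    dst[n] = '\0';
}

/* Format string pairs as: key="value" key="value" ...
 * The argument list must end with LOGKEY_TERM.
 * Returns the number of complete pairs written. */
int logkv_format(char *buf, size_t len, ...)
{
    va_list ap;
    size_t used = 0;
    int pairs = 0;

    if (len == 0)
        return 0;
    buf[0] = '\0';

    va_start(ap, len);
    for (;;) {
        const char *key = va_arg(ap, const char *);
        if (key == LOGKEY_TERM)
            break;                      /* end of the key/value list */
        const char *val = va_arg(ap, const char *);

        char esc[256];                  /* long values are cut short here */
        escape_value(esc, sizeof esc, val);

        int n = snprintf(buf + used, len - used, "%s%s=\"%s\"",
                         pairs ? " " : "", key, esc);
        if (n < 0 || (size_t)n >= len - used)
            break;                      /* output truncated: stop adding pairs */
        used += (size_t)n;
        pairs++;
    }
    va_end(ap);
    return pairs;
}
```

A call such as logkv_format(buf, sizeof buf, "user", "root", LOGKEY_TEXT, "is a dork", LOGKEY_TERM) fills buf with user="root" text="is a dork" and returns 2. Forgetting LOGKEY_TERM is undefined behavior, which is exactly the varargs hazard the message admits to. Keys are not escaped in this sketch, so a registry of well-formed keys is assumed.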