On Configuration Files

I meant to write this almost a week ago when people were more active in discussing the format of configuration files.

For one, I think that ideally we would put our main configuration file at /etc/blobd.conf by default, but make the path configurable with an optional command-line argument. This is in keeping with Unix conventions. That is only the config file for blobd, of course. Any client apps ought to have their own config files, modulo the default choices for the blobd socket path (/var/run/scimp.sock probably) and TCP/UDP port numbers (I picked 42000).
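To make the "default path plus optional override" idea concrete, here is a minimal sketch in C. The -c flag and the function name config_path_from_args are assumptions made for illustration; nothing has been decided about blobd's actual command line. A real daemon would more likely use getopt(3); the plain loop just keeps the sketch to ISO C.

```c
#include <assert.h>
#include <string.h>

#define DEFAULT_CONFIG_PATH "/etc/blobd.conf"

/*
 * Return the config file path to use: the argument following "-c" if one
 * was given (hypothetical flag), otherwise the default path.
 * Returns NULL if "-c" appears without a path after it.
 * Other arguments are ignored in this sketch.
 */
const char *config_path_from_args(int argc, char *argv[])
{
    const char *path = DEFAULT_CONFIG_PATH;
    int i;

    assert(argv != NULL);
    for (i = 1; i < argc; i++) {
        if (strcmp(argv[i], "-c") == 0) {
            if (i + 1 >= argc)
                return NULL;      /* -c given without a path */
            path = argv[++i];     /* consume the path argument */
        }
    }
    return path;
}
```

With this, running blobd with no arguments would read /etc/blobd.conf, and "blobd -c /tmp/test.conf" would read the given file instead.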

It sounds like we've already picked our file format, which is kind of the one I preferred. I'd like to reiterate my thoughts on XML though.

In my opinion, XML is rather heavyweight. We would need to pull in a library to parse it, because writing an XML parser ourselves would be too much work. That in turn requires us to learn that library. Typically, the traversal functions in these libraries resemble tree traversal functions. That may feel like overkill when we may not have many configuration options in the first place.

Plus, XML is very over-engineered for what we need. I get the impression that it is designed for generic data interchange, not just config files. So it has a mini-language for transforming our particular XML schema to another XML schema and validating files against our schema. These features are useful, but we would not use them. We would be living with the complexity of the system built to support those features, though.

I consider the human readability to be a tiny bit of a myth -- just a little. Yes, the files are ASCII, but if you are doing all the validation above and you have a very complex schema, then it is NOT fun at all to edit by hand, in my opinion. I hate it whenever I have to do this for other tools. Compare that with a simpler key=value paradigm.

Key=value is what I was imagining for our format. Even if the above sounds negative, I'm overall neutral on the format. I would just ask people to weigh the time to get up and running with XML against the benefits. It seems like that has kind of already happened.
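For a sense of how little code a key=value format needs, here is a rough sketch of a one-line parser in C. It is not Kevin's parser. The '#' comment character, the key names in the usage note, and the names parse_line and trim are all assumptions made for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

enum line_kind { LINE_BLANK, LINE_PAIR, LINE_ERROR };

/* Strip leading and trailing whitespace in place; return the new start. */
static char *trim(char *s)
{
    char *end;

    while (isspace((unsigned char)*s))
        s++;
    end = s + strlen(s);
    while (end > s && isspace((unsigned char)end[-1]))
        end--;
    *end = '\0';
    return s;
}

/*
 * Parse one line of a hypothetical key=value config file.
 * The line is modified in place; *key and *value point into it.
 * Everything from '#' to the end of the line is treated as a comment
 * (an assumed syntax), so values cannot contain '#' in this sketch.
 */
enum line_kind parse_line(char *line, char **key, char **value)
{
    char *hash, *eq;

    assert(line != NULL && key != NULL && value != NULL);

    hash = strchr(line, '#');
    if (hash != NULL)
        *hash = '\0';             /* cut off the comment */

    line = trim(line);
    if (*line == '\0')
        return LINE_BLANK;        /* empty or comment-only line */

    eq = strchr(line, '=');
    if (eq == NULL)
        return LINE_ERROR;        /* no '=' separator */

    *eq = '\0';
    *key = trim(line);
    *value = trim(eq + 1);
    if (**key == '\0')
        return LINE_ERROR;        /* "=value" with no key */
    return LINE_PAIR;
}
```

A line such as "port = 42000  # tcp/udp port" would come back as LINE_PAIR with key "port" and value "42000". Comment-only lines come back as LINE_BLANK, and a line with no '=' is reported as LINE_ERROR.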

Oh, and one more thing. In my compiler class, we learned (and I subsequently kind of forgot) how to use some nice tools for generating scanners and parsers. By now, Kevin has pretty much written the parser already. With the tools we used in CS104, I can just write a couple of files specifying the special characters of our "language" and the syntax for it, and the tools will generate C code to handle the actual parsing. There is a lot of CS theory involving state machines behind writing good parsers, and these tools (yacc/bison specifically) handle it nicely. Jas' C++ book has the Yacc grammar for C++. Here is the one for C: http://www.quut.com/c/ANSI-C-grammar-y.html

I think it might be fun to go back and do it for our config file. But if Kevin already has something that works, then I would just move on until it becomes an issue. I'm sure there are innocent bugs in his, whereas a machine-generated parser is much less likely to be buggy.

1 comment:

  1. I deliberately hustled on getting the parser done so that Eddie wouldn't spend time learning the XML library for pretty much the reasons you mentioned above.

    I programmed comments into the parser too, so the key=value format is actually pretty readable.

    I still have to write the output file interface, but that should be fairly easy, particularly compared to the task of handling calibration.