I picked up three of these 9800 GTs at Fry's on my way back from Maker Faire for $120 (after tax) each. Notice the connector on the right? It is component output, but I wonder whether that can be converted back to composite. If so, we could run three projectors from one card.

Also, they come with Molex to 6-pin power converters, so now *ONLY* the power supply wattage ought to be an issue.

I was choosing between these and some 1024MB Geforce 9500 GTs which were about $75 before taxes. The 9800 has way more processing power, but I wasn't sure if the video memory would be a performance issue with a 3D desktop.

There used to be issues with video memory on Nvidia cards and a compositing window manager like Compiz, because each window needed to be backed by video memory. I just checked though, and it looks like it may have been fixed in this driver release, which is a bit older than even the old driver I'm installing on the final table.

Moment of truth

Compiz Plugin

My plugin may be doing the distortion now. I left last night when it seemed to be drawing a proper unmodified screen.

I need to reread my graphics book a little bit about display transforms and make up a simple homography to try. The homographies the plugin uses are just OpenGL transformation matrices, nothing particularly special. I'm crossing my fingers and hoping most everything is in order now.
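
As a rough sketch of what "a simple homography to try" could look like (Python for brevity, not the plugin's code; every name here is made up for illustration): a 3x3 homography can be embedded in a 4x4 matrix of the kind OpenGL uses, with z passed through and the bottom row producing w for the perspective divide.

```python
def homography_to_gl(h):
    """Embed a 3x3 homography (row-major nested lists) in a 4x4 matrix.

    x, y and w come from the homography; z is passed through unchanged.
    """
    return [
        [h[0][0], h[0][1], 0.0, h[0][2]],
        [h[1][0], h[1][1], 0.0, h[1][2]],
        [0.0,     0.0,     1.0, 0.0],
        [h[2][0], h[2][1], 0.0, h[2][2]],
    ]


def apply_matrix(m, x, y):
    """Transform screen point (x, y) by the 4x4 matrix m, then divide by w."""
    v = (x, y, 0.0, 1.0)
    out = [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]
    return out[0] / out[3], out[1] / out[3]


# Illustrative test matrices (assumptions, not measured calibration data).
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# A mild keystone: w grows with x, so points further right shrink toward the origin.
KEYSTONE = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.5, 0.0, 1.0]]
```

The matrix above is written row by row for readability; OpenGL's glLoadMatrixf and glMultMatrixf expect the 16 values in column-major order, so it would need transposing when flattened.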


Multithreading blobd's networking

I partly multithreaded blobd's netcode tonight, which will give proper behavior for multiple clients. It sort of seems to run ok on my computer, in that it seems to kind of print interesting things on stdout when I wave my hand in front of the camera, but I have no idea what appropriate output should be. This is what it looks like when I wave my hand in front:
0x7fff5cdb9fb0BlobList size is: 33
0x7fff5cdb9fb0BlobList size is: 2
0x7fff5cdb9fb0BlobList size is: 1
0x7fff5cdb9fb0BlobList size is: 10
0x7fff5cdb9fb0BlobList size is: 0
BlobList size is: 1
0x7fff5cdb9fb0BlobList size is: 0
BlobList size is: 27
0x7fff5cdb9fb0BlobList size is: 0
BlobList size is: 3
0x7fff5cdb9fb0BlobList size is: 20
0x7fff5cdb9fb0BlobList size is: 2
0x7fff5cdb9fb0BlobList size is: 14
0x7fff5cdb9fb0BlobList size is: 0

I doubt that memory address should be there -- but does it just list blob counts now, not coordinates? If so, then perhaps mine is behaving correctly. I haven't committed it because I don't want to mess people up, and also because it is a huge hack.

There are two big problems. First, when a client goes away, there is no thread managing the camera video anymore -- it just hangs waiting for clients. Second, if two people connect, they are both grabbing frames, which is probably not what we want.

I need to rework it, I suppose, so there is one thread handling the video (and Kevin will expand that later), and N threads feeding the data out on the network.
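
A minimal sketch of that split, in Python rather than blobd's own code (the class and method names are hypothetical): the single video thread publishes each new blob list, and every network thread waits for a list newer than the last one it sent.

```python
import threading


class LatestBlobs:
    """Holds the most recent blob list; one writer, many readers."""

    def __init__(self):
        self._cond = threading.Condition()
        self._seq = 0          # incremented once per published frame
        self._blobs = []

    def publish(self, blobs):
        """Called by the single video thread after each frame."""
        with self._cond:
            self._seq += 1
            self._blobs = list(blobs)
            self._cond.notify_all()

    def wait_newer(self, last_seq, timeout=None):
        """Wait for a frame newer than last_seq (or the timeout).

        Returns (seq, blobs) for the most recent frame seen.
        """
        with self._cond:
            self._cond.wait_for(lambda: self._seq > last_seq, timeout)
            return self._seq, list(self._blobs)
```

Each network thread would keep its own last_seq and loop on wait_newer(), so only the video thread ever grabs frames, and a client going away does not stop the capture.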

I think I would like to rework the network API also. So if/when I get into this, it'll be a good segue into that task as well.



I got trimming to work on my laptop with Compiz:

Need to see if Xgl won't ruin it though..

.. and after some testing, it doesn't. There are some other caveats though. The regions I am blowing away still represent logical pixels, so if I remove them from the middle of the screen, I leave a giant hole.

On the other hand, I figured out how to draw DUPLICATE screen areas. So we could maybe draw duplicated images in the overlapped areas, which would look coherent (assuming rotation is good..) albeit being brighter as well.

And actually, this is WITHOUT a plugin. I found it after talking to another Compiz developer.


Compiz plugins

I've been spending all day digging further into the Compiz code. Cscope in Vim has been pretty helpful as far as enabling me to jump around and look at things quickly. I don't think it has fundamentally improved my process, only made me faster. I've basically been looking through the code that does the cube rotation to try and see where it sets up the cube geometry and texture maps the composited desktop.

I feel like I'm getting close, but there is a weird lack of OpenGL in spots where I would have expected it to be. Obviously, I need to read more..

Tonight I was talking on IRC to the guy who wrote MPX. I've been talking to him periodically since maybe week 7 or 8 last quarter. He suggested I re-email the Austrians who did Compiz work similar to mine and drop his name. I also noticed that I had emailed the professor, who probably just deletes all his email every morning anyway.

So I emailed the students who worked on the project instead, this time. I hope they come back with something interesting. I'm a little nervous because the whole project depends on me now since we cannot resolve the projector overlaps mechanically.


More Netcode

It is a good thing I threaded the netcode earlier. We probably need to use it soon.

I debugged a problem in blobd today. The apparent problem was that when two clients were connected, the server would crash as soon as one of them disconnected. First I will discuss networking, and then I will address the nature of the crash.

On networking:
I was surprised that two clients could connect at all. If this were a TCP connection, that would not work. I suppose that in this case, the semantics of Unix sockets allowed two clients to share the same socket since neither of them had a named client socket. I am guessing it is even more accidental since the clients do not talk to the server at all right now.

The reason I did not expect it to accept multiple people at first is because for a connection, there are 2 sockets: the server socket and the client socket. The server binds an address to a socket and then waits for people to connect on his socket. When someone connects, the server gets a second socket file descriptor which is associated with the client: if he reads or writes to it, the operating system makes that interact with the client who connected.

So the model is that the server waits, a client connects, and the server uses that newly forged socket to talk to the other program. Normally, the wrinkle here is that at this point the server is no longer waiting -- so a new connection would be denied (or queued by the OS for the next time the server gets around to waiting.)

It appears that when using the Unix sockets, and maybe this is because the way it is written the clients all have the same "address" right now, if the second guy tries to connect after the server has taken the first guy, the OS just hooks it up anyway. The socket file is only an address in the namespace of the filesystem -- really just an inode. It identifies a communications channel between programs which the operating system maintains, and it isn't completely unintuitive that this behavior may happen. UDP behaves in a similar way. I think both of these go away if the server chooses to inspect his peers' addresses. Anyway, that behavior is a pleasant surprise, I guess -- although I don't want to keep it.

The better way to handle it, which my test code does, is spawn a new thread for servicing a client when someone connects. The main thread then returns to waiting for new clients. It isn't too bad in the way of restructuring code, and has two big payoffs: you can spread processing load better and you can more easily send different data to different clients. We will want to be able to send different data if we ever implement a window manager plugin which can inform the blob server about the window geometries of clients and thus which blobs they ought to receive.
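
The thread-per-client model described above might be sketched like this (Python for brevity; the function names and the fixed max_clients bound are assumptions for illustration, and a real server would loop forever):

```python
import socket
import threading


def serve_client(conn, payload):
    """Service one client on its own socket, then clean up."""
    with conn:
        conn.sendall(payload)


def accept_loop(server, payload, max_clients):
    """Accept max_clients connections, handing each to its own thread."""
    workers = []
    for _ in range(max_clients):
        conn, _addr = server.accept()   # new per-client socket
        worker = threading.Thread(target=serve_client, args=(conn, payload))
        worker.start()                  # main thread goes straight back to accept()
        workers.append(worker)
    for worker in workers:
        worker.join()
```

Because each client has its own thread and socket, sending different data to different clients only means passing a different payload (or data source) to each serve_client call.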

The downside is that we have to have more consideration of data manipulation. There are now N threads reading the blobLists to send over the network and 1 thread writing to that list. Mutexes or semaphores would need to be implemented to synchronize those accesses so sane results come out. I stopped short of implementing semaphores in my test code because I would also need to implement some dynamic data generation too (it only sends the same blobList over and over).

On the crash:
With some pointers from Zach, I figured out the problem of the crashing. My test server happens not to crash, as Zach and Davide noticed, because only the service threads terminate and not the main program. Also because maybe they used my test client, which is friendlier and would not quite tickle this issue.

What happens is that a client goes away. The server doesn't know when a client is going to go away. So when a client goes away, the server is almost always in the middle of sending blob data over the socket. The server is probably inside, or will enter, blobList::send() which will write() to the socket. Because the client quit, his socket was implicitly closed (I doubt there was an explicit close() call!) Writing to a closed socket/pipe generates a SIGPIPE signal, which has the default action of causing the application to terminate. Suppressing the signal (right now I wrote a handler that tells us the signal happened and then returns) allows us to instead pick up the return code of the write() calls and inspect errno.
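
The same idea as a small sketch (Python, not blobd's code; send_blobs is a made-up name): with SIGPIPE ignored, a write to a departed client fails with an error code that the application logic can check instead of killing the process.

```python
import errno
import signal
import socket


def send_blobs(conn, payload):
    """Return True if the payload was sent, False if the client has gone away."""
    try:
        conn.sendall(payload)
        return True
    except OSError as e:
        # EPIPE (or ECONNRESET) is what the write reports once the peer is closed.
        if e.errno in (errno.EPIPE, errno.ECONNRESET):
            return False        # caller can drop this client and keep running
        raise


# Python already ignores SIGPIPE at startup; in C the equivalent would be
# signal(SIGPIPE, SIG_IGN). It is set explicitly here to make the dependency visible.
signal.signal(signal.SIGPIPE, signal.SIG_IGN)
```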

I think that is a better behavior for now, because in any case we want to recover from the error in the application logic and not in a signal handler. So now, if someone disconnects, it says something nice about it and then exits. A slight improvement (maybe I will write this now) will be to make it go back up to the accept() call -- essentially wiping the slate and waiting for a whole new client. That would still mess up the current multi-client situation though.

I think the thing to do now is to multithread the client handling code and to also consider adding some synchronization messages like "I am ready for blobs", "I am leaving now" to the network protocol.
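
One possible shape for those synchronization messages, purely as an illustration (the READY/BYE spellings and newline framing are invented here, not part of the existing protocol):

```python
READY = b"READY"   # hypothetical wire form of "I am ready for blobs"
BYE = b"BYE"       # hypothetical wire form of "I am leaving now"


class ClientState:
    """Tracks what one client has told the server so far."""

    def __init__(self):
        self.wants_blobs = False
        self.gone = False
        self._buf = b""

    def feed(self, data):
        """Consume bytes from the client; messages are newline-terminated."""
        self._buf += data
        while b"\n" in self._buf:
            line, self._buf = self._buf.split(b"\n", 1)
            if line == READY:
                self.wants_blobs = True
            elif line == BYE:
                self.wants_blobs = False
                self.gone = True
```

The buffering matters because a stream socket can deliver a message in pieces; feed() only acts once a full line has arrived.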


Threaded Netcode

Today I threaded skt_server so it can service multiple clients at once. I've tested it to verify that it can service multiple people at once, and (mostly) clean up after itself. I need to think harder about how to make it service people with changing data streams as time goes on. That is one of the more complicated concurrency problems.

I wrote it today because I think we may soon reach the point where we will want to service multiple clients at once, and it is either this or forking processes. The netcode has actually been upgraded a bit since it was put into blobd. It is a little bit simplified (could stand to be more simplified, probably) and can be easily told to run over Unix sockets or TCP.
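
How a server can be "told" to use either transport, as a hedged sketch (make_listener and its address convention are assumptions, not skt_server's real interface):

```python
import os
import socket


def make_listener(address):
    """Return a listening stream socket.

    A str address is treated as a Unix socket path; a (host, port) tuple as TCP.
    """
    if isinstance(address, str):
        if os.path.exists(address):
            os.unlink(address)          # remove a stale socket file from a past run
        srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    else:
        srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(address)
    srv.listen(5)
    return srv
```

Everything after this point (accept, read, write) is identical for both transports, which is what makes the switch cheap.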

I am wondering if soon we might start encountering cases where the protocol is inadequate as well -- Davide's program is encouraging since it appears to maintain an arbitrarily long blob subscription from the server.

It occurred to me that it may behoove me, depending on where I get with Compiz, to work on a proxy from our protocol to Tuio. Tom is programming against Tuio, I assume, and it will be a little rough for him if we don't deliver that. I was kind of hoping that our mouse driver, if we made one, would consume Tuio as well -- although in the short term, it is probably simpler to implement it with our current protocol.

On Compiz

The Compiz codebase is pretty intimidating. Last night I had the idea that I should begin documenting their code as I unravel it while trying to figure out what I need to do with it.

I don't think it is entirely a sure thing I can pull off all the screen correction effects that we want. I reread the paper of the people in Austria who did something similar, and for two paragraphs they describe their Compiz code at a high level. The description is helpful, but it may be a little bit beyond me. If not because of the concepts, because of the large (to me) codebase.

One thing I am especially worried about is pixel loss. It seems like if I straight away "trim" pixels, they will get lost. Writing this sentence just now, I think I may have had an idea how that might work out though: OpenGL texturing.

In the paper, it sounds like they create some number of polygons to represent their corrective surface and then texture-map the composited desktop onto that. I don't know the ins and outs of texturing, but maybe some of the OpenGL options will let that texturing algorithm span gaps in the polygons without losing the pixels which would normally fall in. I do not really know, though. I'm still just unraveling the Compiz framework to let me get at that functionality, let alone brushing off my OpenGL knowledge and taking a crack at the more advanced parts.

Another depressing part is that Compiz puts a pretty archaic set of constraints on the software versions in our system. To span a display across multiple GPU cores, we need something called the Xinerama extension. To use Compiz, we need something called the Composite extension. Unfortunately, right now, they are not compatible. Maybe if we were a year ahead in the future, they would work together in the "current" X server.

To get around that incompatibility, we use a third program: Xgl. Xgl essentially sits on top of Xinerama and beneath Composite so they work together. Unfortunately, Xgl is deprecated, so we can't easily use it past Ubuntu 8.04 (specifically, this is to say X.org server 1.4, I think).

Ok, that is doable I suppose. But we are already installing boost libraries outside of apt for Davide, and this precludes us using MPX also: release-quality MPX only exists in the 1.6 X.org server.

I think Xgl is the worst part of the setup because it kind of holds up everything else. But if we don't use Xgl, we can't do the display correction across multiple GPUs because Compiz won't run because the Composite extension would be absent.

I think doing the display correction outside of Compiz would be an order of magnitude more difficult, because Compiz exposes this functionality and sits at the right spot in the graphics pipeline. To do the work in another spot would possibly mean rearchitecting many other components -- and I'm barely capable of doing it in Compiz anyway.

On Configuration Files

I meant to write this almost a week ago when people were more active in discussing the format of configuration files.

For one, I think that ideally we would drop our main configuration in /etc/blobd.conf by default, but have the path configurable with an optional command line argument. This is in keeping with Unix conventions. That is only the config file for blobd of course. Any client apps ought to have their own config file, modulo default choices for the blobd socket path (/var/run/scimp.sock probably) and TCP/UDP port numbers (I picked 42000).
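
The default-path-plus-override convention could look like this (a sketch in Python; the -c/--config flag name is an assumption, only the default path comes from the text above):

```python
import argparse

DEFAULT_CONFIG = "/etc/blobd.conf"


def parse_args(argv):
    """Parse blobd's command line; the config path is optional."""
    parser = argparse.ArgumentParser(prog="blobd")
    parser.add_argument("-c", "--config", default=DEFAULT_CONFIG,
                        help="path to the configuration file")
    return parser.parse_args(argv)
```

Running with no arguments picks up /etc/blobd.conf, while something like `blobd -c ./test.conf` points the daemon at a local file for testing.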

It sounds like we've already picked our file format, which is kind of the one I preferred. I'd like to reiterate my thoughts on XML though.

In my opinion, XML is rather heavyweight. We would need to pull in a library to parse it because it would be too much work to do ourselves. This then requires us to learn that library. Typically, the traversal functions in these libraries resemble tree traversal functions. That may or may not feel like overkill to deal with when we may not have too many configuration options in the first place.

Plus, XML is very over-engineered for what we need. I get the impression that it is designed for generic data interchange, not just config files. So it has a mini-language for transforming our particular XML schema to another XML schema and validating files against our schema. These features are useful, but we would not use them. We would be living with the complexity of the system built to support those features, though.

I consider the human readability to be a tiny bit of a myth -- just a little. Yes, the files are ASCII, but if you are doing all the validation above and you have a very complex schema, then it is NOT fun at all to edit by hand, in my opinion. I hate it whenever I have to do this for other tools. As opposed to a simpler key=value paradigm..

Key=value is what I was imagining for our format. Even if the above sounds negative, I'm overall neutral on the format. I would just ask people to consider the time to get up and running with XML vs the benefits. Seems like it kind of already happened.
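
For a sense of scale, a key=value reader is only a few lines. This is a sketch, not Kevin's parser; the '#' comment syntax and the key names in the usage note are assumptions.

```python
def parse_config(text):
    """Parse key=value lines; blank lines and '#' comments are skipped."""
    settings = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"line {lineno}: expected key=value, got {raw!r}")
        settings[key.strip()] = value.strip()
    return settings
```

A file containing `socket = /var/run/scimp.sock` and `port=42000` would come back as a dictionary of two string values, leaving type conversion to whoever reads the setting.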

Oh, and one more thing. In my compiler class, we learned (and I subsequently kind of forgot) how to use some nice tools for generating scanners and parsers. By now, Kevin has pretty much written the parser already. With the tools we used in CS104, I can just write a couple of files specifying the special characters of our "language" and the syntax for it, and the tools will generate C code to handle the actual parsing. There is a lot of CS theory involving state machines behind writing good parsers, and these tools (yacc/bison specifically) handle it nicely. Jas' C++ book has the Yacc grammar for C++. Here is the one for C: http://www.quut.com/c/ANSI-C-grammar-y.html

I think it might be fun to go back and do it for our config file. But if Kevin already has something that works, then I would just move on until it became an issue. I'm sure there are innocent bugs in his, whereas a machine-generated parser is much less likely to be buggy.

Possible Calibration Programming Interface

Note: I am writing several posts tonight. I've had things to say saved up, but never quite sat down to write them.

On Sunday night, Davide and I implemented 1-camera calibration manually inside of his keyboard app. It was not too hard to work out the math, and it took less time than that to empirically discover our calibration parameters.

Earlier when I took my shower, I was thinking about how the calibration class might look software-wise. It may have been because Davide and I discussed calibration a little the night before, or because I had been thinking about OpenGL for Compiz.

My thought is that it may be neat if the Calibrator/Stitcher/Whatever class wrapped or extended the blobList class. It could then expose an interface where we feed it some kind of transform. Then when we request blobs from it using the same interfaces we do now, the blobs pop out with the transformation applied. This is not very different from how OpenGL behaves or how some of Compiz behaves.
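
A sketch of that wrapping idea, assuming a 2D affine transform and a stand-in BlobList (both are illustrative; the real blobList interface and the eventual transform type may differ):

```python
class BlobList:
    """Stand-in for the project's blobList: just holds (x, y) points."""

    def __init__(self, blobs):
        self._blobs = list(blobs)

    def blobs(self):
        return list(self._blobs)


class Calibrator:
    """Wraps a BlobList and applies a 2D affine transform on the way out."""

    def __init__(self, source, transform):
        # transform is ((a, b, tx), (c, d, ty)):
        #   x' = a*x + b*y + tx
        #   y' = c*x + d*y + ty
        self._source = source
        self._transform = transform

    def blobs(self):
        """Same call as BlobList.blobs(), but with the transform applied."""
        (a, b, tx), (c, d, ty) = self._transform
        return [(a * x + b * y + tx, c * x + d * y + ty)
                for x, y in self._source.blobs()]
```

Callers that only use blobs() cannot tell a Calibrator from a plain BlobList, so the transform stays in one place instead of being spread through application code.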

Implementing a hardcoded calibration in Davide's app validated this for me a little bit, because we are pretty much manually transforming all the blobs in a single spot in his code. Of course, the question is really whether it is useful to be able to have arbitrary transforms like that. If there is ever only going to be one transform (the one to stitch/fix the cameras), then that can be hidden altogether inside something that wraps or extends blobList.



More to come.


On Networking

I really want to go to bed, but I felt I should write about this now since I probably will not be awake for a while.

I don't like the idea of using RTP or SIP. My gut as a network guy tells me it is a strange pick of a protocol. Honestly I don't know enough about the protocols to argue it down technically, although I have the RTP RFC open on my workstation right now. I'm not sure if I want to invest time researching the protocols very far.

Instead, let me point out a few things.

First, RTP and SIP are typically associated with VoIP apps. Apparently some programs use RTP for video as well, but the dominant use is VoIP, I am pretty sure. I think it is maybe a little goofy to classify our touch data as multimedia. That could be a small discussion on its own, though.

What are our goals for choosing a protocol? Is it ultimate correctness and scalability? Or are we trying to just build something that works? We wouldn't even need a protocol if we hadn't chosen to split the blob and gesture programs (I still think that is a good design choice).

If we wanted ultimate correctness, I think we would probably use a Tuio variant (which is apparently out of the picture) or think long and hard about a good protocol design ourselves (which we do not have time for.)

Since I think we are going for a standard of just building something "good enough", then we can put far simpler things on the wire. So why aren't we just doing the stupidest, rawest, simplest stuff over our socket? Choosing SIP or RTP doesn't totally save us from protocol design: we still have to devise a system of messages we would like to pass in the RTP stream -- some kind of protocol for messages, you might say. We just would not have to worry about error detection -- instead we get to learn some huge non-standard library.

If we are aiming for good enough, then we should just use TCP. Over loopback, there should be few or no dropped packets. Most of the latency from TCP is going to be from it managing reliability on an imperfect line, but when we talk to ourselves the line WILL be perfect. The only latency then is an extra few dozen bytes of memory copies, which hopefully is not that big of a deal. If we really, really cared, we could benchmark it ourselves and choose.

In contrast to RTP, TCP is definitely simple to program against, and the same programming interface will exist on any system we could imagine. Ditto for UDP. For UDP though, we would probably want some degree of error detection. Simply computing some kind of checksum of the data sent/received before we do something important with it would be enough.
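
The checksum idea for UDP could be as small as this (a sketch; CRC-32 and the 4-byte prefix layout are choices made for the example, not something we have agreed on):

```python
import struct
import zlib


def pack_datagram(payload):
    """Prefix payload with its CRC-32 (4 bytes, network byte order)."""
    return struct.pack("!I", zlib.crc32(payload)) + payload


def unpack_datagram(datagram):
    """Return the payload, or None if the datagram is short or corrupted."""
    if len(datagram) < 4:
        return None
    (expected,) = struct.unpack("!I", datagram[:4])
    payload = datagram[4:]
    return payload if zlib.crc32(payload) == expected else None
```

The receiver simply drops anything for which unpack_datagram() returns None and waits for the next blob update.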

At the very least, if/when we discuss this subject, we should do it in light of updated assumptions: how forward-looking should the design be, how important is latency, do we care very much about this iteration having good performance BETWEEN different machines or not, etc.