Friday, 18 March 2011

Makefile Madness (probably part 1)

I've visited a number of alien planets now, and one of the things that always astonishes me is the way people will abuse Makefiles.

On some worlds where the happy optimistic colonists have accidentally built a fascist totalitarian regime, they'll have carefully churned out acres and acres of paperwork about the way that code has to be written. This often includes details down to the number of spaces one must place between tokens and how many blank lines must be in files and so on, and carries sanctions of death or worse for violation. These people still allow their makefiles to look like the demented native amoeba-analogues wrote them; the glorious code of the Spacefaring People's Republic will still be built by a ramshackle collection of recursive, copy-and-pasted makefiles full of the most eyewatering convolutions[1].

So, here's the first of a cut-out-and-keep guide on how to avoid your Makefiles being dodgy.

It's a simple rule; "Make has a looping construct. Don't use shell loops, use Make's"

I'm forever seeing makefiles that carefully construct a list of stuff -- often files -- and then in a target do something like this;

           for i in $(foo-target-inputs); do dosomestuffto $$i; done

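Now, the problem with this is that if anything goes wrong, it tells you at best that it failed while building foo-target. In general though, it will just succeed -- as in the example given. Why? Because a shell for loop returns the exit status of the last command it ran, so a dosomestuffto that fails on anything but the final item is quietly forgotten.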

So when Make has completed executing foo-target, it may or may not have built it and there's no trivial way to tell.
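
For the sceptical, here's a minimal sketch of the failure vanishing. The target name and the a/b/c list are made up for illustration, and "test" stands in for a dosomestuffto that falls over on item b only:

```make
# Hypothetical demo target, not from any real build.
# The loop "fails" on b, but the last iteration (c) succeeds, and the
# exit status of a shell for loop is that of the last command it ran.
swallowed-failure:
	for i in a b c; do test "$$i" != b; done
```

Run "make swallowed-failure" and make should report nothing wrong at all, despite the middle item failing.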

Usually, what happens on alien worlds at the far edge of civilised space, is that some bright cadet figures that this might fail and hence adds checks to subsequent steps where they all look for the file they expect to exist and bail if it doesn't...

Honest to goodness, if you are in that place, it's time to stop digging.

I want to emphasise this isn't the ONLY thing wrong with this approach; it's not trivially possible (for instance) to run one single part of that "dosomestuffto" cycle. Even if you get it to fail, make will halt with an error report saying that the loop system as a whole failed. It won't tell you where.

Cadets, Junior Grades of the Starfleet Corps of Software Engineering, of course, will at this point add voluminous amounts of "Starting to do X!!", "Finished X!!!" type logging. Which makes the logs bigger, which means compiles take up more space... and more time to wade through when doing failure analysis... and then people get into the whole "what can I prefix my log messages with to make them stand out best" competition that ultimately ends up with your compilation scrolling ASCII-art banners at you.

{It's amusing how often engineering cadet practices ultimately end up in the "try and be the loudest voice" trap.}

So what should people be doing?

Well, the simplest way of doing this is to realise you have the list of things to do, and make will quite happily iterate it for you.

foo-target: $(patsubst %,%.dosomestuffto,$(foo-target-inputs))

%.dosomestuffto:
	dosomestuffto $*

That's nicely taken care of the iteration. Purists will argue that there's a problem here, because every time foo-target is asked for, the dostuffs will happen. That's true, but crucially that was true to start with -- solving that is actually a different problem.

If something fails, it will fail on a commandline with the variables substituted so you can even tell which dostuff operation failed. Your starship crew will thank you for that small bit of plumbing in the way starship crew always thank their engineers -- by finding another triviality to whine like hell about.
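
As a sketch of what that looks like in practice, assuming GNU make: the input names here are invented, and "test" is a stand-in for a dosomestuffto that happens to dislike beta.

```make
# Hypothetical input list; in a real build this is whatever you constructed.
foo-target-inputs := alpha beta gamma

# One pseudo-target per input: alpha.dosomestuffto, beta.dosomestuffto, ...
.PHONY: foo-target
foo-target: $(patsubst %,%.dosomestuffto,$(foo-target-inputs))

# $* is the stem, i.e. the input name with .dosomestuffto stripped off.
# Stand-in for dosomestuffto: succeeds for every item except beta.
%.dosomestuffto:
	test "$*" != beta
```

"make foo-target" should stop at beta and name beta.dosomestuffto as the target that failed, and "make -k foo-target" should carry on to gamma before reporting the failure, which is more than the shell loop ever managed.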

If you're really cunning about this, you can then start putting flagfiles into place. The obvious thing to do is store the logfile;

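.PRECIOUS: %.dosomestuffto.log
%.dosomestuffto: %.dosomestuffto.log
	@:
%.dosomestuffto.log: %
	dosomestuffto $* >$*.dosomestuffto.log 2>&1 || ( cat $*.dosomestuffto.log; rm -f $*.dosomestuffto.log; false )

The bracketty junk at the end means (if I've typed it from memory right) that if the dostuff returns false, the second part will be evaluated, which dumps the logfile, deletes it so that a failed run doesn't leave a flagfile lying about pretending to be a success, and then fails, meaning the line fails. If the first part works, the shell's short-circuit evaluation skips the second bit and we just carry on. The do-nothing "@:" recipe is there because GNU make treats a pattern rule with no recipe at all as cancelling a rule rather than defining one, and the .PRECIOUS line stops make tidying the logfile away as an intermediate file once it's finished.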
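
If you want to watch the "log quietly, shout on failure" idiom on its own, here is a small sketch. Both target names and demo.log are invented, and "true" and the failing "sh -c" are stand-ins for a real command:

```make
# Hypothetical demo targets for the "|| ( cat log; false )" idiom.

# The command succeeds, so the part after || never runs and nothing is dumped.
quiet-success:
	true >demo.log 2>&1 || ( cat demo.log; false )

# The command fails, so the log is dumped and "false" makes the line fail.
noisy-failure:
	sh -c 'echo boom; exit 3' >demo.log 2>&1 || ( cat demo.log; false )
```

One thing to be aware of: the line fails with the exit status of "false", so the original exit code (3 in this sketch) is not what make sees.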

Even better, you can test out a single one of the operations by saying "make myfile.dosomestuffto" and it will run it on its own.

The moral of this story is that if you have a stack of systems which include things like Turing-complete languages (make in this case) reasoning about systems which include Turing-complete languages (the shell commands), get the one at the top of the stack to do as much of the work as possible, because it's always easier to talk to it than to talk through it to one of the worker languages.

[1] The exception being anywhere that uses Qt. Sometimes they'll try and use makefiles, but it's the case that you can hypertravel to an alien world using Qt and from orbit make the single correct prediction that their build system will at best involve make being invoked by shonky scripting of some sort but more usually not involve lazy compilation at all. The reason for this is that although make is completely capable of doing both the dependency graph generation AND solving it, the recipes for this are non-obvious and for some reason the Qt people seem to actively discourage you from doing it; I personally suspect this is because they've always seen their development environment as some sort of "Visual Studio for the rest of the world" and are following the same strategy of pretending you get a "compile" button and that's the only option. Certainly as soon as you realise you don't have to use their "IDE", you don't and that's a moment as bright as when all three suns rise at the same time, so they've got some justification in not telling people how to leave.
