Testing messy bash scripts

Read this book too!

I am reading the book “Refactoring” by Martin Fowler, and I am just brimming with ideas about improving software, as well as about solving problems I banged my head against during my “software” development career.

At my last job, there was this huge, messy heap of bash scripts that were “The Installation” of the main software product. It was a remarkable amount of bad-smelling bash code, yet it somehow managed to work. My work at the time was rewriting this or that functionality from scratch and then somehow plugging it into the existing framework (damn, I call it a framework now).

It would have been just amazing if I could have taken the existing dung pile of bash scripts and refactored it into something that is readable, sort of.

Today, while I was riding the bus and reading “Refactoring”, p. 110, it struck me: it can actually be extremely easy to test bash scripts! All it takes is a collection of all the familiar commands, like “cat”, “rm”, etc., and one sneaky PATH environment variable that lists them first. These commands would be fakes, stubs: they all just print their name and parameters into a log file. In fact, there is just one actual script, and the rest are links to that single one.
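Here is a minimal sketch of how such a stub directory could look. Everything in it is made up for illustration: the STUB_LOG variable, the list of faked commands, and the tiny install.sh standing in for the real installation scripts. The stub recovers its own name from $0, so one file plus a few symlinks is enough.

```shell
#!/usr/bin/env bash
# Sketch only: STUB_LOG, the stub list and install.sh are hypothetical names.
set -eu

work=$(mktemp -d)
stub_dir="$work/stubs"
export STUB_LOG="$work/calls.log"
mkdir -p "$stub_dir"

# The single real stub: it logs the name it was invoked as, then its arguments.
cat > "$stub_dir/stub" <<'EOF'
#!/bin/bash
{
    printf '%s' "${0##*/}"
    for arg in "$@"; do printf ' %q' "$arg"; done
    printf '\n'
} >> "$STUB_LOG"
EOF
chmod +x "$stub_dir/stub"

# Every faked command is just a symlink to that one script.
for cmd in cat rm ln cp; do
    ln -s stub "$stub_dir/$cmd"
done

# A stand-in for the script under test.
cat > "$work/install.sh" <<'EOF'
#!/bin/bash
rm -f /opt/product/old.conf
ln -s /opt/product/v2 /opt/product/current
EOF

# The sneaky part: the stub directory comes first in PATH, for this run only.
PATH="$stub_dir:$PATH" bash "$work/install.sh"

# Show what the script tried to do; nothing was actually removed or linked.
cat "$STUB_LOG"
```

Because PATH is changed only for that one invocation, the surrounding shell keeps using the real commands.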

That log file can be compared before and after a refactoring. It takes less time to write names and parameters to a file than to execute the actual commands, so while running the actual script might take hours, with the stubs it should run in much less time. Finally, by comparing your pre- and post-refactoring log files, you get a really nice test suite that can help you refactor. I might even call it replacing unreadable code with readable code without breaking much of anything.
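The before/after comparison could be sketched roughly like this. The record_calls helper, the two toy scripts and the long wrapper name are all invented for the example. The "after" script is the "before" script with its ln wrapper inlined.

```shell
#!/usr/bin/env bash
# Sketch only: record_calls, before.sh and after.sh are hypothetical.
set -eu

work=$(mktemp -d)
mkdir "$work/stubs"

# Minimal logging stub, linked under the name of the command to fake.
cat > "$work/stubs/stub" <<'EOF'
#!/bin/bash
echo "${0##*/} $*" >> "$STUB_LOG"
EOF
chmod +x "$work/stubs/stub"
ln -s stub "$work/stubs/ln"

# record_calls SCRIPT LOG: run SCRIPT with the stubs first in PATH, logging to LOG.
record_calls() {
    STUB_LOG="$2" PATH="$work/stubs:$PATH" bash "$1" > /dev/null
}

# Before: ln hidden behind a chatty wrapper.
cat > "$work/before.sh" <<'EOF'
create_symbolic_link_with_logging() {
    echo "Creating link $2"
    ln -s "$1" "$2"
    echo "Done"
}
create_symbolic_link_with_logging /opt/product/v2 /opt/product/current
EOF

# After: the wrapper inlined.
cat > "$work/after.sh" <<'EOF'
ln -s /opt/product/v2 /opt/product/current
EOF

record_calls "$work/before.sh" "$work/before.log"
record_calls "$work/after.sh" "$work/after.log"

# Identical logs mean the refactoring kept the same external commands.
if diff -u "$work/before.log" "$work/after.log"; then
    result="same external commands"
else
    result="behaviour changed"
fi
echo "$result"
```

One caveat with this sketch: only commands found through PATH end up in the log. Shell builtins such as echo are not recorded, so the dropped messages in the example go unnoticed by the comparison.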

In the particular case of the scripts that I’ve mentioned earlier, I guess that the most used refactoring to improve readability would be Inline Method. Whoever wrote the original scripts was very fond of wrappers for commands like “ln”, even though such commands don’t really need one. A wrapper with at least two echo commands and a very long name is quite redundant and adds unnecessary complexity.

There are several problems with this simplistic approach, though. One of them is that the script might use full paths in command names (like /bin/ln) instead of relying on the PATH environment variable, in which case the stubs are bypassed. Such things can be taken care of relatively easily, one by one, until the testing solution is perfect (I guess). One of the things to try, for example, is running bash in restricted mode (bash -r), which, if I remember correctly, refuses command names that contain a slash.
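A cheap first step against the full-path problem might be to simply list the offending lines before running anything. This is a rough sketch with a made-up install.sh. The pattern only looks at the first word of each line, so a hard-coded path after “;”, “|” or “&&” would slip through.

```shell
#!/usr/bin/env bash
# Sketch only: install.sh is a hypothetical stand-in for the real scripts.
set -eu

work=$(mktemp -d)
cat > "$work/install.sh" <<'EOF'
#!/bin/bash
rm -f /opt/product/old.conf
/bin/ln -s /opt/product/v2 /opt/product/current
cp /etc/product.conf /opt/product/
EOF

# List lines whose first word is an absolute path; these would bypass the stubs.
# grep exits non-zero when nothing matches, hence the "|| true" under set -e.
hits=$(grep -nE '^[[:space:]]*/[^[:space:]]+' "$work/install.sh" || true)

echo "$hits"
```

Each reported line can then be fixed by hand or rewritten in a copy of the script that is used only for testing.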

I’ll try that on some new messy scripts that I got at my new job!


PS: Google docs rocks! (even publish-to-blog kinda works)
