Testing messy bash scripts

Read this book too!

I am reading the book “Refactoring” by Martin Fowler, and I just reek with ideas about improving software, as well as about solving problems I banged my head against during my “software” development career.

At my last job, there was this huge messy heap of bash scripts that were “The Installation” of their main software product. It was a remarkable amount of bad-smelling bash code, and it somehow managed to work. My work, at the time, was from-scratch rewrites of this or that functionality and then somehow plugging them into the existing framework (damn, I call it a framework now).

It would have been just amazing if I could have taken the existing pile of dirt-dung bash scripts and refactored it into something that is readable, sort of.

Today, while I was riding the bus, reading “Refactoring” p. 110, it struck me. It can actually be extremely easy to test bash scripts! All it takes is a collection of all the familiar commands, like “cat”, “rm”, etc., and one sneaky PATH environment variable. These commands would be fakes, stubs: all they do is print their name and parameters into a log file. In fact, there is just one actual script, and the rest are links to that single one.
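Here is a minimal sketch of that idea, not a finished tool. The names STUB_LOG, stubs/stub and install.sh are made up for the illustration, and the four-line install.sh only stands in for a real messy script. The stub writes its shebang with the absolute path of bash and uses only shell builtins, because once PATH contains nothing but the stub directory, no other program can be found.

```bash
#!/usr/bin/env bash
# Sketch: one stub script, many symlinks, one sneaky PATH.
set -eu

work=$(mktemp -d)                  # scratch area (hypothetical layout)
stub_dir="$work/stubs"
export STUB_LOG="$work/calls.log"  # STUB_LOG is a made-up variable name
bash_bin=$(command -v bash)        # absolute path, needed once PATH is replaced
mkdir -p "$stub_dir"

# The single real script: log the name it was invoked as, plus its arguments.
# It uses only shell builtins, so it still works when PATH holds only stubs.
cat > "$stub_dir/stub" <<EOF
#!$bash_bin
printf '%s' "\${0##*/}" >> "\$STUB_LOG"
for arg in "\$@"; do printf ' %q' "\$arg" >> "\$STUB_LOG"; done
printf '\n' >> "\$STUB_LOG"
EOF
chmod +x "$stub_dir/stub"

# Every faked command is just a link to that one script.
for cmd in cat rm ln cp mv mkdir; do
    ln -s stub "$stub_dir/$cmd"
done

# A stand-in for the messy installation script.
cat > "$work/install.sh" <<'EOF'
mkdir -p /opt/product
cp product.tar /opt/product
ln -s /opt/product/bin/run /usr/local/bin/run
rm -f /tmp/product.lock
EOF

# Run it with PATH pointing only at the stubs, so nothing real is executed.
PATH="$stub_dir" "$bash_bin" "$work/install.sh"

cat "$STUB_LOG"
```

The log ends up with one line per external command the script tried to run, in order. Builtins such as echo, cd and test never go through PATH, so they do not show up, and any command without a link in the stub directory fails with "command not found", which is a handy hint about which stubs are still missing.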

That log file can be compared before and after a refactoring. Writing a command's name to a file takes far less time than actually executing the command. So while running the actual script might take hours, with the stubs it should run in much less time. Finally, by comparing your pre- and post-refactoring log files, you get a really nice test suite that can help you refactor. I might even call it replacing unreadable code with readable code without breaking much of anything.
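The before/after comparison could look roughly like this. Again a sketch under assumptions: record_calls, before.sh and after.sh are hypothetical names, and the two small scripts stand in for the original and the refactored version of a real one.

```bash
#!/usr/bin/env bash
# Sketch: record the command log before and after a refactoring, then diff.
set -eu

work=$(mktemp -d)
bash_bin=$(command -v bash)
mkdir "$work/stubs"

# One logging stub, linked under several command names.
cat > "$work/stubs/stub" <<EOF
#!$bash_bin
echo "\${0##*/} \$*" >> "\$STUB_LOG"
EOF
chmod +x "$work/stubs/stub"
for cmd in cp rm mkdir; do ln -s stub "$work/stubs/$cmd"; done

# record_calls SCRIPT LOG: run SCRIPT against the stubs, writing calls to LOG.
record_calls() {
    : > "$2"
    STUB_LOG="$2" PATH="$work/stubs" "$bash_bin" "$1"
}

# Before: repetitive code. After: the same work expressed as a loop.
cat > "$work/before.sh" <<'EOF'
mkdir -p /opt/product/bin
cp run /opt/product/bin
cp stop /opt/product/bin
rm -f /tmp/product.lock
EOF
cat > "$work/after.sh" <<'EOF'
dest=/opt/product/bin
mkdir -p "$dest"
for tool in run stop; do cp "$tool" "$dest"; done
rm -f /tmp/product.lock
EOF

record_calls "$work/before.sh" "$work/before.log"
record_calls "$work/after.sh"  "$work/after.log"

# diff exits 0 when the two logs are identical.
if diff -u "$work/before.log" "$work/after.log"; then
    result="logs match"
else
    result="logs differ"
fi
echo "$result"
```

If the refactoring changed which commands run, or their order or arguments, diff prints the offending lines. Keep in mind that the stubs print nothing to stdout, so a script that branches on the output of a command may take a different path under the stubs than it does in real life.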

In the particular case of the scripts that I mentioned earlier, I guess that the most used refactoring to improve readability would be Inline Method, since whoever wrote the original scripts was too fond of wrappers, even though most commands like “ln” don’t need a wrapper to do what the command already does perfectly well. A wrapper with at least two echo commands and a very long name is quite redundant and adds unnecessary complexity.
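To make that concrete, here is an invented wrapper of the kind I mean, next to its inlined form. The function name and paths are hypothetical, not taken from the real scripts. To keep the example harmless, it shadows ln with a shell function that records calls in an array; that is a lighter trick than the PATH stubs and is used here only for the demonstration.

```bash
#!/usr/bin/env bash
# Sketch of Inline Method applied to a hypothetical wrapper.
set -eu

calls=()
# Shadow ln with a shell function so the example touches nothing on disk.
ln() { calls+=("ln $*"); }

# Before: a long-named wrapper that adds two echo lines around a plain ln.
create_symbolic_link_for_product_binary() {
    echo "Creating link $2 -> $1"
    ln -s "$1" "$2"
    echo "Done creating link"
}
create_symbolic_link_for_product_binary /opt/product/bin/run /usr/local/bin/run

# After: the wrapper body inlined at the call site, echo noise dropped.
ln -s /opt/product/bin/run /usr/local/bin/run

# Both forms issue the same ln call.
if [ "${calls[0]}" = "${calls[1]}" ]; then
    echo "same command issued"
fi
```

The two echo lines disappear from the script's output, but the recorded ln call is identical, which is exactly what the command log is meant to confirm.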

There are several problems with this simplistic approach though. One of them is that a script may call commands by full path (like /bin/ln) instead of relying on the PATH environment variable, in which case the stubs are bypassed and the real command runs. But such things can be taken care of relatively easily, one at a time, until the testing solution is good enough (I guess). One of the things to try, for example, is running bash in restricted mode (bash -r), which, if I remember correctly, refuses command names that contain a slash, so such calls would at least fail instead of slipping past the stubs.
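Before running anything, it may help to simply list the lines that call a command by full path. The sketch below is a rough heuristic, not a parser: the pattern is my own assumption about what such calls look like (an absolute path into a bin or sbin directory at the start of a line or right after ;, &, | or an opening parenthesis), and it will miss some cases, such as a command following "then" or "do".

```bash
#!/usr/bin/env bash
# Sketch: list the lines that call a command through an absolute path.
set -eu

script=$(mktemp)   # stand-in for the real script
cat > "$script" <<'EOF'
/bin/ln -s /opt/product/bin/run /usr/local/bin/run
rm -f /tmp/product.lock
test -d /opt/product && /usr/bin/install -m 755 run /opt/product/bin
EOF

# Heuristic: an absolute path into a bin or sbin directory, at the start of
# a line or right after ; & | ( is probably a command, not an argument.
pattern='(^|[;&|(])[[:space:]]*/([[:alnum:]_.-]+/)*s?bin/[[:alnum:]_.-]+'

# grep exits 1 when nothing matches, so "|| true" keeps set -e happy.
hits=$(grep -nE "$pattern" "$script" || true)
printf '%s\n' "$hits"
```

In this example lines 1 and 3 are reported, while line 2 is not, because its paths are only arguments. The reported lines can then be fixed by hand, or the script can be run against a copy with the path prefixes removed.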

I’ll try that on some new messy scripts that I got at my new job!
