These tests are intended to be run under Cygwin on Windows. They compare not just the run output of the compiled programs but also the debug output of the compiler. This comparison is of course useless if the order in which the files are traversed differs, but comparing the debug output has allowed me to find serious bugs in the compiler before, so I'm keeping it. The order of the debug messages can also differ (they are printed from HashXXX structures in the program), but it generally doesn't change as long as the AST traversals that generate the objects don't change too much. To handle these non-bug changes, do a normal test run using harness, and then run normalizetests. This copies the output files over the default output files. Make sure that all the changes are due only to normal variation before running normalizetests.
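
The review step before normalizing can be sketched roughly as follows. This is only an illustration in Python, not how harness or normalizetests actually work: the directory layout, the "*.out" file naming, and the function names are all assumptions made for the example. Treating a pure reordering of the same lines as "normal variation" is also just a heuristic and does not replace reading the diff.

```python
import difflib
import shutil
from pathlib import Path


def changed_outputs(actual_dir, expected_dir):
    """Map each differing output file to (kind, unified diff).

    Assumes (hypothetically) that fresh outputs and stored defaults are
    "*.out" files with matching names in two separate directories.
    """
    changes = {}
    for actual in sorted(Path(actual_dir).glob("*.out")):
        expected = Path(expected_dir) / actual.name
        old = expected.read_text().splitlines() if expected.exists() else []
        new = actual.read_text().splitlines()
        if old == new:
            continue
        # Same lines in a different order is what hash-ordered printing
        # would produce; anything else deserves a closer look.
        kind = "reordered" if sorted(old) == sorted(new) else "content changed"
        diff = "\n".join(
            difflib.unified_diff(old, new, "default", "fresh", lineterm="")
        )
        changes[actual.name] = (kind, diff)
    return changes


def normalize(actual_dir, expected_dir):
    """Copy every fresh output over its stored default."""
    for actual in sorted(Path(actual_dir).glob("*.out")):
        shutil.copyfile(actual, Path(expected_dir) / actual.name)
```

In this sketch, one would call changed_outputs first, read every diff marked "content changed", and call normalize only once the remaining differences are understood.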