1) Use font-size instead of color.
   This makes it easier to compare reference and test because the values don't change.
2) Actually sort the reference properly.
   This unbreaks the test.