This happened to me yesterday at work, and the reasons had me stumped for a bit. The project I'm working on had a bug, and the comments said it was (temporarily) solved when they removed the optimization flag. We use the GNU C compiler with optimization level 2 (-O2). This level of optimization is supposed to be fairly safe, so I was curious why it had any effect. After some research, I came across the following scenario.
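The code looked something like this (a simplified stand-in; the real names and constants differed, and the 3/4 ratio here is illustrative, but VALUE_B was a sub-1 value and VALUE_C was 150):

```c
#define VALUE_B  (3 / 4)   /* meant to be 0.75, but integer division gives 0 */
#define VALUE_C  150       /* the multiplier */

extern void Delay(unsigned int ticks);   /* the assembly delay routine */

void WaitNs(unsigned int ns)
{
    unsigned int ticks = ns * VALUE_B * VALUE_C;

    /* clumsy round-to-nearest check (inefficient, but not the bug) */
    if (ns * VALUE_B * VALUE_C * 2 != ticks * 2)
        ticks++;

    if (ticks != 0)
        Delay(ticks);
}
```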
First of all, this isn't the most efficient way to do rounding, but the problem isn't with the rounding. If you're clever, you've already seen that VALUE_B is a non-integer number below 1. However, VALUE_B is multiplied up by 150 (VALUE_C), so it should end up greater than zero.
This code happened to be part of a high-resolution delay function, which is why it was so tricky to catch: tricky because it had worked until optimization was turned on. To figure out what happened, we have to go down to the assembly level.
First, here is the assembly output when no optimization is used.
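The listing looked roughly like this (Intel syntax, simplified and abridged from the stand-in code above; the exact registers and constants in the real listing differed, but the shape is the same):

```asm
WaitNs:
        push    ebp
        mov     ebp, esp
        sub     esp, 8
        mov     DWORD PTR [ebp-4], 0     ; VALUE_B: 3/4 was folded to 0 at compile time
        mov     ecx, DWORD PTR [ebp-4]   ; ecx = 0; the damage is already done
        mov     eax, DWORD PTR [ebp+8]   ; eax = ns
        imul    eax, ecx                 ; ns * VALUE_B = 0
        imul    eax, 150                 ; * VALUE_C, still 0
        mov     DWORD PTR [ebp-8], eax   ; ticks = 0
        cmp     eax, 0                   ; first "if": the rounding test (simplified here)
        je      no_round
        add     DWORD PTR [ebp-8], 1     ; ticks++
no_round:
        cmp     DWORD PTR [ebp-8], 0     ; second "if": anything to delay?
        je      done
        push    DWORD PTR [ebp-8]
        call    Delay                    ; present in the listing, never reached
        add     esp, 4
done:
        leave
        ret
```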
Our project doesn't use an x86 CPU, but I read Intel assembly just fine, and I can tell you this looks like what I'd expect from this function. There is some math in the first two-thirds, two jumps for the "if" statements, and a call to "Delay". The problem is here, quite early on, but it doesn't stand out. It didn't stand out in our platform's assembly output either.
Now, the same function compiled with optimization set to level 2.
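It looked essentially like this (again simplified, not the verbatim listing):

```asm
WaitNs:
        push    ebp          ; set up the stack frame...
        mov     ebp, esp
        pop     ebp          ; ...and tear it right back down
        ret                  ; the entire body was optimized away
```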
You don't need to know assembly to see that something is wrong here. All that happens is the stack frame is set up, then torn down, and we return. This is a completely empty function. On our platform, the assembly was even shorter: just a return instruction.
A clever reader will have spotted the problem early on: it's the non-integer number. But what happened? How did optimization "break" this code? The answer is that the code is broken in both assembly versions. In the non-optimized version, "Delay" is never called. It is never called because "ticks" will always be zero, and it is zero because of the first math operation. VALUE_B is zero because it is a number below 1: the fractional part of an integer divide is truncated, so the result is rounded toward zero. It doesn't matter that it was then multiplied by VALUE_C, which would otherwise have brought the value above zero; the arithmetic was done entirely in integers. A more careful read of the non-optimized assembly reveals the problem early on.
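The order of operations is the whole story. With the illustrative constants:

```c
int a = 3 / 4 * 150;   /* (3 / 4) truncates to 0, then 0 * 150 = 0      */
int b = 3 * 150 / 4;   /* 450 / 4 = 112: multiplying first keeps the value */
```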
Register "ecx" is being set to zero (first two instructions). "eax" (the accumulator register) is being set to some seemingly strange number. Then the two are multiplied together. The result: 0 of course!
Since the result is zero, and always zero, the rest of the logic plays out the same way no matter what input the function is given. The compiler knows this, and when optimization is turned on, it eliminates all that dead code and leaves us with an empty function.
Here's where things get tricky and rather project-specific. This function was part of a high-speed timer and had to calculate the number of wait cycles before calling an assembly delay routine. The delays we were requesting were in the hundreds of nanoseconds, on a system where instructions execute in a few nanoseconds, so it doesn't take many instructions to achieve the desired delay. The non-optimized function is just as broken as the optimized one: in both cases, "Delay" is never called. However, there happened to be enough instructions in the non-optimized function to burn through the desired delay time. No one had ever noticed that requesting different amounts of time from this function always produced the same delay; there are not many places one needs a delay like this. Thus it was thought that optimization broke the function.
I didn't question the function too much at first. It had worked; I was looking for why it stopped working, not for what was wrong with the implementation. Right away I saw math that was sure to end up with a non-integer value. However, I assumed the last multiply must have pushed the value back up, and that that was why it had worked. Before getting out the o-scope, I did some calculations to see if the problem was a rounding error (the numbers in our implementation were obviously different), but didn't find anything significant enough to cause our problem. I went ahead and fixed the math so the rounding error wouldn't happen, went home, and forgot I had made the change. The next day I went to reproduce the error and watch it on the scope, and didn't see a problem. I was running my corrected version, which did multiplication before division, and thus had no fractional intermediate values and (the reason I corrected it) no rounding errors due to truncation. After figuring out I had changed the function and restoring the original, I quickly found the empty function in the assembly output. But it wasn't until today that I put it all together as to why. Once I considered that the function had always been broken, it took a simple "printf" statement to prove it. The result, optimized or not, was always: "Value: 0".
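Something along these lines, using the stand-in names from the sketch above:

```c
#include <stdio.h>

#define VALUE_B  (3 / 4)
#define VALUE_C  150

int main(void)
{
    unsigned int ns = 500;                            /* any input gives the same answer */
    printf("Value: %u\n", ns * VALUE_B * VALUE_C);    /* always prints "Value: 0" */
    return 0;
}
```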
The lesson here: Always remember to think in integer math unless you specifically tell the compiler otherwise. There are a few things that could have made this function work.
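For instance, either of these changes to the stand-in code (pick one) would have done it:

```c
/* Option 1: define the constant as floating point */
#define VALUE_B  (3.0 / 4.0)          /* 0.75 instead of 0 */

/* Option 2: force the divide into floating point with a cast */
#define VALUE_B  ((double)3 / 4)      /* also 0.75 */
```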
Both of these cause the value to be evaluated as a floating-point type. However, unless you cast "VALUE_B * VALUE_C" to an integer, you will end up with floating-point math in the generated assembly, optimization or not.
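With the cast, the compiler folds the floating-point work at compile time and the runtime math stays integer-only; a sketch with the stand-in names:

```c
/* 0.75 * 150 = 112.5 is computed and truncated to 112 at compile
 * time; the remaining multiply by ns is pure integer math */
ticks = ns * (unsigned int)(VALUE_B * VALUE_C);
```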
The method I think is most correct is to arrange the math so multiplications happen first.
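With the illustrative 3/4 ratio from the sketch, that looks like this:

```c
/* Multiply first, divide last: nothing is truncated until the
 * final divide, so no precision is lost along the way */
ticks = ns * 3 * VALUE_C / 4;
```

The only thing to watch with this arrangement is that the intermediate product has to fit in the integer type.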
And while I'm at it, let's clean up that rounding code.
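Adding half the divisor before the divide rounds to nearest instead of truncating (again with the illustrative constants):

```c
/* +2 is half the divisor (4): values with a fractional part of
 * .5 or more round up, everything else rounds down */
ticks = (ns * 3 * VALUE_C + 2) / 4;
```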
Although I spent the last two days working this problem out, most of my time went into coding a better overall alternative, not into searching specifically for this problem. But this is a good example of what we embedded programmers do for a living!
On a positive note, I got to spend some time using my favorite oscilloscope. There is just something that feels good about sitting in front of a $20,000+ oscilloscope grabbing at numbers in the nanosecond range. It's gotta be a geek thing!