Personally, I find it painful that the compiler detects this kind of undefined behavior and silently uses it for optimization, rather than stopping and emitting an error. In the printf example, the compiler could trivially emit an error saying "NULL check on p after dereference of p", and that would catch a large class of bugs. (Some static analysis tools check for exactly that.) Similarly, a loop access statically determined to fall outside the bounds of the array should produce an error.
To summarize at least one of them, the compiler doesn't really see it as “detecting undefined behavior and optimizing accordingly”. It sees it as doing the right thing for all defined behaviors. The sort of imprecise analysis it does leads it to consider plenty of possible undefined behaviors, many of which cannot happen in real executions. It ignores these as a matter of course, but reporting them would not tell the programmer anything they don't already know, and would be perceived as noise.
On the example for (int i=1; i==0; i++) …, the compiler does not infer that i eventually overflows (undefined behavior). It infers that i is always positive, and thus that the condition is always false.
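For a concrete sketch of that reasoning (my own variant, not the exact example above; do_work() is just a placeholder for the loop body): in every overflow-free execution the counter stays positive, so the compiler may fold the test away.

void do_work(int i);   /* hypothetical loop body */

void count_up(void)
{
    for (int i = 1; i > 0; i++) {
        do_work(i);
        /* i can only stop being positive by overflowing, which is undefined,
           so the compiler may compile this as if the test were always true. */
    }
}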
How about using statistics/machine learning and showing these spam warnings only when people want them? Yes, it is hard, but this is not an excuse!
Besides, fixing spammy warnings shouldn't be more difficult than fixing actual spam! I mean, come on, with spam you have intelligent adversaries, and compilers haven't reached that level. Not yet, anyway...
Overflows are not undefined. They are overflows. Maybe I want to overflow on purpose. Your for loop (int is signed) will complete assuming the body of the loop doesn't manipulate i, and given enough time.
This has to be one of the most irritatingly pedantic aspects of C, as the vast majority of systems use 2's complement and so would overflow in the same way, but the compiler writers think it's an "opportunity for optimisation" and I think this ends up causing more trouble than the optimisations are worth. The only sane interpretation of something like
if(x + 1 < x)
...
is an overflow check, but silently "optimising away" that code because of the assumption that signed integers will never overflow is just horribly hostile and absolutely idiotic behaviour in my opinion. A sensible and pragmatic way to fix this would be to update the standard to define signed overflow, and maybe add a macro that is defined only on non-2s-complement platforms.
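In the meantime, the check can be written without relying on wraparound at all; a minimal sketch:

#include <limits.h>

/* Well-defined overflow check: x + 1 would overflow exactly when x == INT_MAX,
   so test that directly instead of testing for wraparound after the fact. */
int increment_would_overflow(int x)
{
    return x == INT_MAX;
}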
There is one solution that would keep the crazy semantics of C, but would still allow for 2's complement arithmetic to be well-defined when one wants to.
C99 defines int8_t, if it exists, to be a 2's complement signed integer of exactly 8 bits. Same for 16, 32, etc. The standard could very well define behavior on overflow for these (that is, turn them into actual types instead of typedefs), and leave int, long, etc alone. I think this would be a viable, realistic, solution. Integer conversions would probably still be a pain, though.
That makes sense (sort of). Better to use unsigned if you are trying to do modular arithmetic.
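For example (a sketch): unsigned arithmetic is defined to wrap modulo 2^N, so both the wraparound itself and the overflow test are well-defined.

unsigned wrapping_add(unsigned x, unsigned y)
{
    return x + y;          /* wraps modulo UINT_MAX + 1 by definition */
}

int addition_wrapped(unsigned x, unsigned y)
{
    return x + y < x;      /* well-defined: true exactly when the sum wrapped */
}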
Signed integers have some weirdness attached. The number that's one followed by all zeroes in binary (INT_MIN in limits.h) is defined as negative, because the sign bit is set. But, the rules for 2's complement arithmetic predict that -INT_MIN == INT_MIN. So it's not a normal number.
Similarly amusing, INT_MIN / -1 will raise the same hardware exception as division by zero on Intel CPUs, even though there isn't a zero anywhere in sight. INT_MIN * -1 is fine, of course (according to the CPU, even if not the language spec).
This is a good reminder that "undefined behavior" includes "doing what I want it to do".
It's also a good example of how undefined behavior allows for optimizations. The compiler is able to evaluate your expression at compile time rather than emitting a division instruction, even though this changes how the program behaves.
You can come up with specific cases and decide that they should be handled a different way. You're absolutely right that this NULL check elimination should generate a warning. But it's really really hard to come up with a general algorithm that correctly differentiates between "your code implies that this check can never be true, watch out!" and "your code implies that this check can never be true, so we correctly removed it during optimization".
For a trivial NULL-related example, the standard C free() function does a NULL check on its parameter. free(NULL) is legal and does nothing. A naive "does this check for NULL after dereferencing the pointer?" checker would therefore warn for this code:
printf("the pointer's value is %d", *p);
free(p);
To a human, this obviously shouldn't be warned about, while the other example should be. But how does the computer tell them apart? It's hard.
Super, super pedantic point: that probably wouldn't happen with your example because most of the time, free is defined in a shared library somewhere else, and the compiler wouldn't be able to inspect its code. Even if it's in the same source file, most compilers don't optimize across non-inlined function boundaries.
But! You made a good point, and it would apply to a function that did a null-check which was inlined. It's easy for us to imagine a function which 1) does a null-check, 2) gets inlined, and 3) is used in places in the code which dereference the pointer before calling the function.
While compilers won't optimise functions they don't yet know about, they do know the standard library and the respective guarantees and constraints. This includes removing calls to memset for things that are never read again (horrible for passwords or keys in memory, which is why there is a SecureZeroMemory or related function in operating systems) or other things.
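A sketch of that dead-store problem (read_password and authenticate are hypothetical placeholders): because the buffer is never read after the final memset, the compiler may treat it as a dead store and drop it, which is exactly why SecureZeroMemory-style functions exist.

#include <string.h>

void read_password(char *buf, size_t len);   /* hypothetical */
void authenticate(const char *password);     /* hypothetical */

void handle_login(void)
{
    char password[64];
    read_password(password, sizeof password);
    authenticate(password);
    memset(password, 0, sizeof password);    /* may be elided: never read again */
}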
Great example, however this does not apply to free - or I would be very surprised if it did. It is relatively common for people to override malloc and free at runtime, so I would be very surprised if the compiler treated malloc or free as a compiler intrinsic, and inlined the code. I would not be surprised, however, if the compiler used the semantics of malloc or free to reason about the surrounding code. The original point, however, was about inlined code leading to generated code that no reasonable person would write (null check after use). So I still think that would not happen for free.
I just tested clang, and free(malloc(42)); gets completely optimized out, as does free(NULL);. free(argv) doesn't, so it's not quite that clever, at least.
The standard library thing is interesting. Modern compilers actually understand that stuff a lot. I don't think they'll reach into the implementation, but they understand the guaranteed semantics. For example, they know that this is undefined behavior:
free(ptr);
printf("after free, it contains %d\n", *ptr);
And the nasal demons shall flow freely. The compiler will certainly know that there's a NULL check in there. However, the smarts are different enough that it will also know not to warn you about it. So, yes, this is an example that won't happen in reality, although there's no reason it couldn't.
Sorry, I don't understand what's so hard about this problem? Why not just emit a warning when the compiler exploits undefined behavior to make some line of code unreachable. By "line of code" I mean code that's written by the user, not code after macroexpansions, inlinings or whatever. So the warning would mean that either you have a bug, or you can safely delete some code. Both of these are helpful.
Technically the compiler doesn't exploit the undefined behaviour. It exploits the assumption that it cannot happen and thus it's free to assume everywhere that only defined behaviour happens. Which means, the optimisations are for optimising the defined cases with no regard at all to the undefined behaviour.
You'll notice in a lot of cases that the exploitation of UB looks different for the same cases with different compilers or even compiler versions. This is because the compiler doesn't see »Oh, UB, I can optimise that« but rather »In this case I can do this which remains valid for all defined cases«.
Also, as others have pointed out, even if the compiler emitted a warning, it would be way too much noise because such things happen all the time.
> Also, as others have pointed out, even if the compiler emitted a warning, it would be way too much noise because such things happen all the time.
How so? For example, this code:
printf("the pointer's value is %d", *p);
free(p);
would not cause a warning under my proposal, even if free() contains a NULL check. The source code contains no unreachable lines, only the inlined/macroexpanded code does. On the other hand, most "gotcha" examples proposed so far do have unreachable source lines, and would lead to warnings.
Can you give an example of useful code that contains unreachable lines before macroexpansion and inlining? What's wrong with emitting a warning so the programmer can delete the useless line?
> You'll notice in a lot of cases that the exploitation of UB looks different for the same cases with different compilers or even compiler versions.
That's OK. The problem is with each individual compiler deleting code without warning. If compiler X deletes a line of my code, then it should warn me about it. If compiler Y doesn't delete that line, it doesn't have to warn me.
Issuing an error or warning about this would flood stuff with warnings due to inlining/macros, you name it.
This happens all the time.
Basically, distinguishing between the things that are accidents, and things that are on purpose and expected to be optimized away, is very very very hard.
The unchecked case doesn't add anything that makes it easier, and debugging functions and the like rarely check things.
/* Assumes you have a valid foo */
void printfoo(struct foo *bar)
{
    /* Print the main part of our foo */
    printf("First field: %d\n", bar->first);
    /* Get the substructure value */
    int foosub = get_foosub(bar);
    printf("Second field: %d\n", foosub);
}

/* Doesn't assume you have a valid foo */
int get_foosub(struct foo *bar)
{
    if (bar != NULL)
        return bar->second;
    assert(0);
    return 0;
}
In any case, people have spent a long amount of time trying to make warnings like this work without massive false positive rates. It's just not easy.
If you know that p is not NULL then why are you checking for it? Either way, something is wrong here. Either you made a mistake with where you check for NULL, or you are performing extraneous operations for no reason.
Usually this sort of thing comes up in code that's the result of several rounds of function inlining. Nobody would write that kind of code by hand, but it arises from the indirect results of several chains of function calls. In this regard, it's a very important optimization to be able to perform.
Kent Dybvig's classic response to "Who writes that kind of code?" is "Macros do." Inlining has much the same effect as macros.
I can see how macros can throw a wrench into this, but couldn't compilers tell the difference between when someone wrote code that might have a bug and when it inlines a function that is a bit paranoid for that situation?
Yes, by allowing the optimizer the freedom to exploit undefined behavior, but coupling that with static analysis (or a stricter language semantics) that can catch bugs before they get to the optimizer.
> If you know that p is not NULL then why are you checking for it?
These dummy tests can happen if you have a lot of macros, for example. It's still valid C code as long as p is not NULL. A compiler generating an error on this particular example wouldn't be a correct C compiler.
The compiler can reject this code if it can prove that p can be NULL, but in many cases that's impossible at compile time.
LLVM does something similar if you pass -fsanitize=undefined: it tries to insert code that will crash the program when it invokes something that has undefined behaviour. It cannot emit a compiler error, because it's perfectly fine to have a function with undefined behaviour in your program, as long as you don't call it.
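For instance, this minimal sketch, built with clang -fsanitize=undefined, prints a runtime diagnostic when the signed addition overflows instead of silently doing whatever the optimizer assumed:

#include <limits.h>
#include <stdio.h>

int main(void)
{
    int x = INT_MAX;
    printf("%d\n", x + 1);   /* signed overflow: reported at runtime by UBSan */
    return 0;
}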
I don't understand why you say "it cannot emit a compiler error" -- just because something is allowed doesn't mean it's impossible to emit a compiler error for it (c.f. -Werror). Why can't there be a different option to emit a compiler error whenever -fsanitize=undefined would cause the compiler to add program-crashing code? Personally I would definitely use such an option as I can't imagine a purpose for having undefined behavior anywhere in my code. If I have a function that's never being called then I either forgot to call that function somewhere, or I have a useless function that I should remove from my code. Edit: or perhaps I'm implementing a library -- regardless, I can't imagine why I would want to compile successfully with undefined behavior in my code.
Take a function that simply returns x + 1: that is undefined behavior for x == INT_MAX, and if the compiler is detecting undefined behavior and emitting errors, this would be an obvious candidate.
How about this simple function?
int Divide(int x, int y) { return x / y; }
This is undefined behavior for y == 0, or for x == -INT_MAX - 1 and y == -1. Shall it produce an error?
(Never mind why you're writing such simple functions. Imagine they do something more complex and just do the division or addition or whatever as part of their work.)
Similarly, a function that writes through a char * parameter up to its terminating 0 will invoke undefined behavior if passed a string constant, or if passed an array that doesn't have a 0 in it. There is no portable (i.e. without invoking undefined behavior) way to check whether the parameter is a string constant or lacks a terminating 0, so it is impossible to assert away the undefined behavior for this.
Many real, practical, production-worthy C functions will invoke undefined behavior with some inputs. Turning undefined behavior into a compile-time error will cause virtually all C code to not compile.
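For what it's worth, the most you can usually do is document the precondition, e.g. with an assert on the Divide example above (just a sketch, nothing the standard requires):

#include <assert.h>
#include <limits.h>

int Divide(int x, int y)
{
    assert(y != 0 && !(x == INT_MIN && y == -1));   /* the caller's obligation */
    return x / y;
}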
"There is no portable way to check whether the parameter is a string constant"
Well, the compiler should warn you if you try to pass a (char const *) to a function expecting a (char *). Use -Wwrite-strings to make string literals have type (char const[]) rather than (char[]).
> Why can't there be a different option to emit a compiler error whenever -fsanitize=undefined would cause the compiler to add program-crashing code?
The simple answer is "the halting problem".
If you can build a compiler that knows with certainty what runtime behavior would result from any program (including whether undefined behavior occurs), then you could solve the halting problem, but the halting problem is provably undecidable. So such a compiler cannot exist even in theory for the general case.
Yes, you can template-match a bunch of special cases, but the user can always write new code that doesn't match any of your "known to be defined behavior" patterns but still executes only defined behavior. Guaranteed!
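A tiny sketch of why (everything here is my own example): whether the dereference below is ever undefined depends on whether find() can return NULL for the inputs the program actually receives, and deciding that statically for arbitrary code is the undecidable part.

#include <stdio.h>

static int storage = 42;

/* Stand-in for arbitrary user-written logic; whether it can ever return NULL
   for the inputs that occur at runtime is not something the compiler can
   decide in general. */
static int *find(int key)
{
    return (key % 97 == 13) ? NULL : &storage;
}

int main(void)
{
    int key = 0;
    if (scanf("%d", &key) == 1)
        printf("%d\n", *find(key));   /* UB exactly when find(key) is NULL */
    return 0;
}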
Nick Lewycky submitted this code:
#include <stdio.h>
#include <stdlib.h>
int main() {
    int *p = (int*)malloc(sizeof(int));
    int *q = (int*)realloc(p, sizeof(int));
    *p = 1;
    *q = 2;
    if (p == q)
        printf("%d %d\n", *p, *q);
}
This got my attention for a lot longer than the OP, because it maintains the surprising behavior (prints different values for *p and *q even if p == q) if you move the assignments inside the if-statement: http://codepad.org/PBUAgnQq
I'm told that a pointer passed to realloc has to be assumed to be invalidated, even if it's exactly equal to another pointer that you know is valid, but it's hard to wrap my head around that and I certainly didn't get that out of looking at the C89 standard.
Under what circumstances does it print different values? I just tried your code locally with clang on OS X, and I get `2 2` at all optimization levels.
If the compiler is indeed allowed to assume that the pointer passed to realloc() becomes invalid, then I would expect it to actually optimize out that entire if-check, under the assumption that the `*p` is undefined behavior, and therefore that `p == q` must never be true.
Getting different values for the print if you move the assignments inside the if statement suggests to me that a) it's assuming the pointers don't alias, and therefore b) that it assumes it doesn't have to read the values back out of the pointer when printing them but can just reuse the values it knows it wrote to the pointer. But if it assumes the pointers don't alias, then I would think it would assume that means `p == q` can't be true.
FWIW, inspecting the LLVM IR of `clang -O3`, I get the equivalent of the following:
#include <stdio.h>
#include <stdlib.h>
int main() {
    int *p = (int*)malloc(sizeof(int));
    int *q = (int*)reallocf(p, sizeof(int));
    if (p == q) {
        *q = 2;
        printf("%d %d\n", 2, 2);
    }
    return 0;
}
Note how it removed the write to p and removed the read of the pointer value.
Here, what it's done is assume that because p == q, the two pointers alias, and therefore the write to p will be overwritten by the write to q, and that it doesn't have to read the value again to know what will be printed.
So the optimization here seems to be proceeding under the assumption that realloc() does not necessarily invalidate the pointer. And it behaves the same way with reallocf() as well.
It's undefined behavior, so the compiler is free to do pretty much anything it wants. It can always assume it's true; it can always assume it's false; it can emit code that returns true 50% of the time.
In theory it can do whatever it wants. In practice, it generally just assumes undefined behavior won't happen, and will therefore assume that any conditions that could lead to undefined behavior are false (and prune any dead code that results from those assumptions).
The other thing that can happen is LLVM has the concept of an undefined value, which is distinct from undefined behavior. Undefined values may be unknown, but the compiler can assume that any possible value still results in defined behavior, and optimize accordingly. As an example, an uninitialized stack variable has an undefined value, but various operations on it may still result in defined behavior regardless of the value.
clang -O3 on linux amd64. I'm told that aliasing is unrelated to equality, and I guess the trick is that realloc's returned pointer is attributed noalias. I get assembly that does the check, does the stores, but omits the loads.
Omitting the loads doesn't mean it thinks the pointer is invalid, it just means it thinks the pointer doesn't alias. If it doesn't alias, and it's not volatile, then it can assume that what it just stored is what it would get by loading, so it can skip the load.
I guess the only difference between what you described and what I'm seeing is that my clang recognizes that since p == q, a store to one will overwrite the store to the other, and therefore it can skip storing the 1 and can assume reading both will return 2.
I've had a bug that seemed like time travel before. I was doing something weird with threading and unix pipes. Then I was trying to print out some debug information, but an unrelated string got printed out instead.
This unrelated string never should have been printed to the pipe in question in the first place (!), and also didn't even exist at the point where it got printed out - being calculated a few lines down (!!).
The issue went away when I fixed a seemingly unrelated bug (that didn't look like it involved undefined behavior at all), but it all still gives me nightmares to this day D:
Heisenbugs. They're devilspawn. I encountered something similar in VB.Net of all languages. Visual Studio Express has a bug in that adding a custom control to a Windows Form seems to drop the line from the auto-generated code that initializes that control. The error message said that it tried to assign a string to an integer or something, in code that executes way after the form is initialized. Took me ages to work that one out...!
This is a great article. I'm really enjoying some of the compiler optimizations I'm seeing. It's an area not oft explored for me.
However, I'm having a bit of an issue understanding what the compiler is doing here, at the beginning of the article.
If someone can explain, it'd be appreciated.
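For context, the function being analyzed is (roughly) the following: a four-element table, but a loop that also reads table[4].

int table[4];

bool exists_in_table(int v)
{
    for (int i = 0; i <= 4; i++) {
        if (table[i] == v) return true;
    }
    return false;
}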
FTA:
A post-classical compiler, on the other hand, might perform the following analysis:
The first four times through the loop, the function might return true.
When i is 4, the code performs undefined behavior. Since undefined behavior lets me do anything I want, I can totally ignore that case and proceed on the assumption that i is never 4. (If the assumption is violated, then something unpredictable happens, but that's okay, because undefined behavior grants me permission to be unpredictable.)
The case where i is 5 never occurs, because in order to get there, I first have to get through the case where i is 4, which I have already assumed cannot happen.
Therefore, all legal code paths return true.
As a result, a post-classical compiler can optimize the function to
bool exists_in_table(int v)
{
return true;
}
I see "might return true", "ignored", and "never happens".
I think the idea is that since i=4 is ignored, and the loop goes while i <= 4, you can never reach the return false statement? That's my understanding, I'm just not confident on it.
Basically, since i=4 causes undefined behaviour, the compiler assumes that it can't possibly happen[1] and the only way that it can't happen is if one of the prior (i < 4) checks were true.
Therefore all legal code paths (ie all code paths that don't result in undefined behaviour) return true.
So it optimises it to simply return true. Because otherwise i=4 would have happened, but that's undefined, so impossible[1]
[1] but if it does happen, that's ok, because that would be undefined behaviour, which allows the compiler to do whatever it feels like anyway
if (table[0] == v) return true;
if (table[1] == v) return true;
if (table[2] == v) return true;
if (table[3] == v) return true;
undefined_behavior();
return false;
Since the compiler is allowed to assume undefined behavior never happens, and you certainly can't expect control to pass through undefined behavior and then do something reasonable like execute the next statement, the compiler can just assume that control exits the function before it gets to the undefined behavior, and the only way to do that is to take one of these `return true;` statements. It can't tell you which table element supposedly contains v, but no one asked for that, so all these `return`s serve equally well.
The mental model I'm seeing there is simple - the compiler is allowed to assume a 'contract' of "The caller will ensure that the function is called only such argument values that the undefined condition is never reached".
In this particular case, the 'contract' and the actual code imply that "The only allowed values of 'int v' are those that actually are found in the table"; for those values the function correctly returns true; and for all the possible 'illegal' arguments any and every possible behavior would be correct.
Articles like this and the three-part series about undefined behavior on the LLVM blog [0] ought to be required reading for anyone who still has the impression that C is "portable assembler".
The legitimate use of the notion that "C is portable assembler" is as a reflection of the purposes to which it is put. Unquestionably there can be an arbitrarily large gulf between the C code and the micro-semantics of the generated assembly. Though it doesn't stop there - there's always some space even between the machine code and what actually happens - arguably larger these days, with out-of-order execution and similar optimizations.
I've read this a couple of times, and I can't shake the feeling that a 'post-classical compiler' is, to my way of thinking, broken.
The compiler should, again in my opinion, in the presence of undefined behavior simply spit out an error and say "Behavior here is undefined, fix it." That any compiler would recognize some undefined behavior in the way the code was written, and exploit that as an "optimization", boggles my mind.
I'm wondering if the exception is when the preceding ring_bell() function never returns? Then there is no undefined behavior since the line with undefined behavior is never reached.
So one could conclude that the compiler has to prove termination of ring_bell() before performing the optimization discussed, which is impossible for just any external function.
According to the C++11 standard, the compiler may assume that any loop that performs no I/O, volatile accesses, or atomic/synchronization operations eventually terminates, so unless you mark ring_bell as [[noreturn]], the code after the call will be assumed reachable.
Furthermore, when undefined behaviour is invoked anywhere within a program, the whole program is undefined.
Does this article mix undefined with implementation defined?
It's assuming undefined means the code can never occur (so it removed that code), but aren't most programmers assuming the code can occur but something weird will be done?
No, implementation defined means the standard said, "This is allowed, but we do not specify what will happen"; that's up to the compiler to decide. Undefined means "This is not allowed; all bets are off."
The problem is that programmers probably assume that many things which are undefined are implementation defined.
An example of something that is implementation defined is struct layout: the standard (I think) does say that the order must be the same as in the struct definition, but it allows the implementation to put as much space as it wants in between those members (for optimal architecture alignment). Things that are implementation defined are typically things which must happen, but if the standard defined exactly how, it would unnecessarily constrain the implementation (such as its ability to perform optimizations).
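A small illustration (the printed values are typical, not guaranteed): the member order is fixed, but the padding, and therefore the total size, is up to the implementation.

#include <stdio.h>
#include <stddef.h>

struct s { char c; int i; };

int main(void)
{
    /* Commonly prints 8 and 4, but only the member order is mandated;
       the padding is implementation-defined. */
    printf("sizeof = %zu, offsetof(i) = %zu\n",
           sizeof(struct s), offsetof(struct s, i));
    return 0;
}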
That's not what that means. If a program has undefined behavior, that means the program is allowed to do anything. If a program has implementation-defined behavior, that means the compiler writer must decide a behavior and document it.
Those aren't different. The idea is entirely that "implementation defined" is vague enough to allow definitions such as "do whatever it takes to simulate the undefined behavior as never even having occurred". That opens up new optimization techniques as shown.
"Implementation defined" means that the compiler can do whatever it feels like, but that it must choose something specific and that this must be documented, and any program that invokes implementation defined behavior is perfectly well formed. "Undefined" means that the compiler can do whatever it feels like and it can be completely inconsistent and documented nowhere, and any program which invokes it is ill-formed, and programs can be assumed not to invoke it.
It would make things so much less error-prone if undefined behavior could only affect things touched by the undefined statement, in a cascading fashion, and only forwards.
So if you did the following:
int data[1];
int foo = data[1];
printf("Bar");
foo would be undefined, but you know that "Bar" would be printed regardless.
My question is: are there any legitimate optimizations that would be prevented by this?
Yes, the biggest problem is that the implementation would have to let the program keep going past the undefined operation. For example, if reading data[1] may seg-fault (and there are valid situations where it could), then the compiler would need to prevent that seg-fault or else "Bar" wouldn't print.
It's also worth noting that your trivial example would result in basically everything being removed, but most non-trivial examples don't do that. The most common 'optimization' from undefined behaviour is that the compiler doesn't need to check for those conditions and can let whatever happens happen, and that only works if undefined behaviour is defined in a program-wide, anything-goes sense. If it were defined in a local sense, then if, say, 'data' was passed in as a parameter instead of declared, the compiler would have to insert a NULL check to make sure no undefined behaviour happens and the program doesn't crash (so that "Bar" prints). By defining undefined behaviour the way it is, there's no requirement for the compiler to do a NULL check; it can instead just assume the programmer will never let it happen and produce code with that in mind. Same thing with integer overflow and similar cases (though things get a bit hairier there).
For the argument checking case, the compiler can turn the function into two functions, a wrapper function that checks arguments and calls the internal function, and an internal function that doesn't check its arguments, calling the two as appropriate. Then only export the wrapper function, but allow code that the compiler knows to not do things that might be undefined to call the inner function directly.
(This was actually how I'd always assumed compilers optimized publicly-accessible functions, and was quite surprised when I found out they didn't.)
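A sketch of that split (all names here are made up): export the argument-checking wrapper, and let internal callers that are known to pass valid pointers call the unchecked version directly.

#include <stddef.h>

/* Internal version: assumes data is valid and non-NULL. */
static int frobnicate_unchecked(const int *data)
{
    return data[0] + 1;
}

/* Exported wrapper: validates its argument before delegating. */
int frobnicate(const int *data)
{
    if (data == NULL)
        return -1;
    return frobnicate_unchecked(data);
}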
If you're willing to get weird, you can even optimize it into one function with two entry points on some platforms.
Also, I personally think segfaults shouldn't exist, or rather not in their current uncatchable form. Everything that is potentially recoverable should be catchable. So the compiler would wrap the access to data[1] in a try/catch block, which doesn't hinder performance in the common case, while retaining "good" behavior in the bad one. (It can do so because it is not writing anything, just reading.) Haven't ever used it, but look at https://code.google.com/p/segvcatch/ for something similar.
It's generally easy to catch segfaults. On UNIX-like systems, you can just set a signal handler for SIGSEGV. Other systems generally provide similar functionality.
The problem isn't that they're hard to catch, it's that it's virtually impossible to proceed in any sort of sane manner once a segfault has happened. You have no idea how much state got corrupted before the segfault actually happened. You have no idea what cleanup the functions currently on the stack expect to be able to accomplish before they return. You have no idea what kind of inconsistent state the data structures in memory are in.
If you're really lucky, everything is fine and you can keep on going. If you're not so lucky, stuff is corrupted and you just crash again the moment you try to resume, and again, and again, in an infinite segfault loop. If you're really unlucky, your program doesn't crash again, but proceeds with corrupted data, saving it out to disk and displaying it to the user and causing all sorts of havoc.
I actually helped out a little bit with a similar system, although instead of throwing an exception, it simply tried to proceed to the next instruction.
The whole thing was done as a joke for April Fools' Day, because it's a completely awful idea. Making it throw an exception instead of continuing immediately doesn't really make it better.
I agree in general that segfaults shouldn't exist, but your proposed solution is frightening. Segfaults shouldn't exist because the compiler enforces bounds checking, safe memory management, and other such things that ensure that your program never attempts to access memory it can't access. Once the attempt is made, it's far too late to do anything but crash.
Yes. One of the major ways undefined behavior lets compilers optimize is by outsourcing the proofs to humans. If the compiler detects undefined behavior, it can interpret that as a guarantee that the execution path leading into that behavior cannot happen. The simplest example I can think of is a NULL check that follows a dereference, along the lines of the sketch below.
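(A reconstruction of the sort of thing I mean; first_element is a made-up name.)

int first_element(int *p)
{
    int v = *p;          /* the dereference is a human-supplied proof that p != NULL */
    if (p == NULL)       /* therefore assumed never true... */
        return -1;       /* ...and this branch can be optimized away */
    return v;
}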
I don't think the compiler is supposed to treat an entire function's behavior as undefined just because a pointer was dereferenced without checking for NULL. The array indexing example may be valid, but the second one is probably not. It is possible to have a pointer to zero, and failing to check for that condition should not cause the compiler to assume the pointer is non-zero in subsequent lines of code.
If you have a function that dereferences a pointer without checking it, then it's valid for the compiler to assume that the pointer cannot be null in subsequent lines of code. That's the whole point: if it sees that the pointer is used without checks, that implies that all your other code somehow ensures it won't ever be null at that point, that you've made sure null pointers are checked somewhere else.
The compiler often can't verify it (halting problem and friends), but it allows for nice optimizations by assuming that the code as written is actually correct, and the check was skipped intentionally.
It's not just the function that becomes undefined, it's the entire program. There's literally nothing that the compiler is supposed to do following undefined behavior.
That's the exact thing that the compiler is doing - if there are code paths that might access an array out of bounds, then the compiler is assuming that in runtime it actually never would happen.
With this assumption you might even deduce that
int f(int x) {
    if (x != 42) undefined_behavior;
    return x;
}

is the equivalent of

int f(int x) {
    return 42;
}
since (according to the assumption) in runtime it would always be called in a way that doesn't reach the undefined behavior, i.e., with x=42. And, of course, the many similar assumptions about array boundaries, pointer nullabilities, numbers not reaching overflow, etc.
Does the optimization performed on `unwitting` require the compiler to determine that `ring_bell` will return (as opposed to calling `exit`) or is there something in the spec that allows it to assume that functions return?
Which seems like a good rule of thumb: when working on x86 and x64 and doing things with numbers that you think might overflow, do it with unsigned and cast back to what you need.
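A sketch of that idiom: do the arithmetic in unsigned, where wraparound is defined, and convert back. The conversion of an out-of-range value back to int is implementation-defined (or raises an implementation-defined signal) rather than undefined, and on common 2's complement targets it simply gives the wrapped result.

int wrapping_add_int(int a, int b)
{
    return (int)((unsigned)a + (unsigned)b);
}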
Since dereferencing 0 is undefined, the compiler can assume that a = 3 never needs to be executed??
But b may legitimately be 0, and then the second branch SHOULD be entered.
How does this fit with what the author said? The compiler can't just go back and assume b is never zero just because it's being dereferenced, since the dereference is guarded.
That's why his last part doesn't make sense -- that even if you try to prevent a bad dereference, undefined behavior is triggered.
No, that's not true at all. You check before you dereference, which is completely legitimate. If b is NULL then the dereference never happens, exactly as it should be.
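In other words, the guarded form being discussed is presumably something like this, and it is perfectly well-defined:

if (b)
    a = *b;      /* only reached when b is non-NULL */
else
    a = 3;       /* handles the NULL case; nothing here is undefined */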
To invoke undefined behavior and strange optimizations, you'd need to rearrange the code a bit:
a = *b;
if (!b)
    a = 3;
Here, the compiler can omit the if statement and its contents entirely, because b cannot be NULL, because the first line would invoke undefined behavior if it were.
A check for NULL before you dereference is always safe. It's when you do it the other way around that the compiler can start doing strange things.