First, Dijkstra is a computer science professor who made several theoretical contributions in the programming languages area (and probably many other areas :-). For our purposes, he is best known for his "invention" of semaphores in the computing sense (which have nothing to do with traffic lights of any kind).
Second, in my opinion, the easiest way to think about thread protection is to view a multithreaded program as a PARALLEL program. Imagine ALL threads executing simultaneously (really simultaneously), even though that is not necessarily the case, and you will spot the annoying cases much more easily. (And let the Linux kernel schedule the threads in whatever order it wants. Thinking in parallel is a better way of exposing the problems.)
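To make the "really simultaneously" view concrete, here is a minimal sketch of what goes wrong when two threads both execute `counter = counter + 1` at the same instant. It does not start real threads: it simply writes out, step by step, one interleaving that a parallel execution could produce. The names `shared_counter` and `simulate_lost_update` are made up for this illustration.

```c
#include <stdio.h>

/* Hypothetical shared variable, for illustration only. */
static int shared_counter = 0;

/*
 * Spell out one possible parallel execution of two threads, A and B,
 * that each want to do "shared_counter = shared_counter + 1".
 * reg_a and reg_b stand for each thread's private register.
 */
int simulate_lost_update(void)
{
    int reg_a, reg_b;

    shared_counter = 0;

    reg_a = shared_counter;   /* A reads 0                    */
    reg_b = shared_counter;   /* B reads 0 at the same moment */

    reg_a = reg_a + 1;        /* A computes 1 */
    reg_b = reg_b + 1;        /* B computes 1 */

    shared_counter = reg_a;   /* A writes 1                         */
    shared_counter = reg_b;   /* B writes 1 and A's update is lost  */

    printf("two increments, final value = %d\n", shared_counter);
    return shared_counter;
}
```

Two increments were requested, but the function returns 1: one update has been lost. Once you picture both threads running at the same time, this kind of case becomes easy to see.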
Third, there are two relatively different problems addressed by the use of semaphores and mutexes: synchronization between threads, and protection of shared data.
The latter is probably the easiest way of seeing the problem, and the most common case. Synchronization may also be an issue, but... well, let's talk about data protection.
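As a rough sketch of data protection, the following uses a POSIX mutex so that only one thread at a time can touch a shared counter. The names `shared_total`, `total_lock`, `add_many`, `run_protected_count` and the `MAX_THREADS` limit are assumptions made for this example, not part of any existing program.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 16   /* arbitrary limit chosen for this sketch */

/* Hypothetical shared data and the mutex that protects it. */
static long shared_total = 0;
static pthread_mutex_t total_lock = PTHREAD_MUTEX_INITIALIZER;

/* Thread body: increment the shared counter, taking the lock each time. */
static void *add_many(void *arg)
{
    long iterations = *(long *)arg;

    for (long i = 0; i < iterations; i++) {
        pthread_mutex_lock(&total_lock);    /* enter the protected region */
        shared_total++;                     /* only one thread is here    */
        pthread_mutex_unlock(&total_lock);  /* leave the protected region */
    }
    return NULL;
}

/*
 * Start nthreads threads that each add `iterations` to the counter.
 * Returns the final total, or -1 if the arguments are out of range
 * or a thread could not be created.
 */
long run_protected_count(int nthreads, long iterations)
{
    pthread_t threads[MAX_THREADS];
    int started = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS)
        return -1;

    pthread_mutex_lock(&total_lock);
    shared_total = 0;
    pthread_mutex_unlock(&total_lock);

    for (int i = 0; i < nthreads; i++) {
        /* &iterations stays valid: every thread is joined before we return. */
        if (pthread_create(&threads[i], NULL, add_many, &iterations) != 0)
            break;
        started++;
    }

    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    if (started != nthreads)
        return -1;

    /* All threads have been joined, so reading without the lock is safe. */
    return shared_total;
}
```

With the lock in place, `run_protected_count(4, 100000)` should return 400000. If you remove the lock/unlock pair, the result may come out smaller, because increments can be lost exactly as in the parallel picture. With gcc, this would typically be compiled with the `-pthread` option.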