It might seem strange that we’ve included the probability of what doesn’t happen. Consider tossing a coin, and let x denote the occurrence of “heads.” Unless we have a one-sided coin (whatever that is), or a coin in which heads always occurs, we have to consider the other possibility, of not-heads, or tails. There would be no information conveyed in tossing a coin that always lands with heads showing—no news. For a system to have non-zero information, there must be at least two outcomes for experiments performed on it.
Thermodynamics by James Luscombe, page 179
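
The point about a coin that always lands heads can be checked numerically. The sketch below assumes the passage refers to the standard Shannon measure for a two-outcome experiment, H = -[p log p + (1 - p) log(1 - p)], where the second term is the probability of "what doesn't happen." The function name `binary_information` and the choice of base-2 logarithms (bits) are illustrative assumptions, not taken from the book, which may use a different constant or logarithm base.

```python
import math


def binary_information(p_heads: float) -> float:
    """Shannon information, in bits, of an experiment with two outcomes.

    Assumption: outcomes are "heads" with probability p_heads and
    "not-heads" with probability 1 - p_heads. Both terms enter the sum.
    """
    if not 0.0 <= p_heads <= 1.0:
        raise ValueError("p_heads must lie in [0, 1]")

    total = 0.0
    for p in (p_heads, 1.0 - p_heads):
        if p > 0.0:  # usual convention: a zero-probability outcome contributes 0
            total -= p * math.log2(p)
    return total


if __name__ == "__main__":
    for p in (1.0, 0.9, 0.5):
        print(f"p(heads) = {p}: {binary_information(p):.4f} bits")
```

With `p_heads = 1.0` the result is zero, which matches the "no news" case in the quotation; a fair coin (`p_heads = 0.5`) gives one bit, and a biased coin gives something in between.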