I originally posted this as a series of tweets. With the state of Twitter, I decided to convert it to a blog post.
So… A floating point question.
How many items are expected in a set (Python’s set, C++’s std::set, Go’s map[float64]bool, etc.) when I fill it with NaN values?
This seems to work differently in different languages.
C++
If we run the following C++ code:
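A minimal sketch of such a program, assuming the set is filled in a simple loop. The helper name nan_set_size and the count of 10 are assumptions made for illustration:

```cpp
#include <cmath>
#include <cstddef>
#include <set>

// Hypothetical helper: insert NAN into a std::set<double> `count` times
// and report how many elements the set ends up holding.
std::size_t nan_set_size(int count) {
    std::set<double> s;
    for (int i = 0; i < count; ++i) {
        s.insert(NAN);  // NAN comes from <cmath> and converts to double
    }
    return s.size();
}
```

Calling nan_set_size(10) and printing the result shows the size of the set after ten insertions.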
We get a single value in the set.
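This is a bit unexpected, as NAN is not equal to itself. The catch is that std::set never uses ==: it treats two keys as equivalent when neither compares less than the other, and every < comparison involving NaN is false.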
If we try to add more items to the set after the NAN, we’ll also see that the set is effectively broken:
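A sketch of that experiment, under the assumption that a NaN goes in first and a few ordinary values follow. The helper name size_after_nan_first and the values 1.0, 2.0 and 3.0 are illustrative, and the behaviour described in the comments is what the common standard library implementations do in practice:

```cpp
#include <cmath>
#include <cstddef>
#include <set>

// Hypothetical helper: insert NAN first, then a few ordinary values,
// and report the resulting size of the set.
std::size_t size_after_nan_first() {
    std::set<double> s;
    s.insert(NAN);
    // Each later insert is compared against the NaN node. Both
    // `key < NaN` and `NaN < key` are false, so the new key is treated
    // as already present and is not stored.
    s.insert(1.0);
    s.insert(2.0);
    s.insert(3.0);
    return s.size();
}
```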
This returns 1 as well.
Note that if we first add non-NaN values to the set, it seems to work OK.
Python
In Python code, things behave a little differently.
Filling a set with the same NaN object will give us a set with a single element:
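A minimal sketch, assuming the NaN is created once with float("nan") and added ten times (the variable names and the count are illustrative):

```python
# One NaN object, added to the set ten times.
nan = float("nan")

same_nan_set = set()
for _ in range(10):
    same_nan_set.add(nan)

print(len(same_nan_set))  # 1
```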
While filling it with different NaN objects will give us a set with multiple objects:
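A matching sketch, this time assuming a fresh NaN object is created on every iteration (again, the names and the count of 10 are illustrative):

```python
# A new NaN object is created on every iteration.
different_nan_set = set()
for _ in range(10):
    different_nan_set.add(float("nan"))

print(len(different_nan_set))  # 10
```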
You see, objects in Python sets are only required to be hashable.
There is no requirement to implement equality explicitly.
When two values land on the same hash, Python first checks whether they are the same object (spelled a is b, or id(a) == id(b)). If they are, they’ll only appear once in a set.
If they are not the same object, equality is checked with ==, and since NaN != NaN, every distinct NaN object gets its own entry.
Totally expected behaviour.
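The same identity-then-equality order shows up in membership tests. A small sketch, with variable names chosen only for illustration:

```python
nan = float("nan")
s = {nan}

print(nan in s)           # True: same object, so the identity check succeeds
print(float("nan") in s)  # False: a different object, and NaN != NaN
print(nan == nan)         # False: equality alone would never find it
```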
Go
Go code seems to be the only one that behaves as expected:
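A minimal sketch of such code, assuming a map[float64]bool used as a set and filled in a loop. The helper name nanSetLen is hypothetical:

```go
package main

import "math"

// nanSetLen is a hypothetical helper: it adds math.NaN() to a
// map-backed set n times and returns the number of keys stored.
func nanSetLen(n int) int {
	set := make(map[float64]bool)
	for i := 0; i < n; i++ {
		// Map keys are compared with ==, and NaN != NaN,
		// so every assignment creates a new entry.
		set[math.NaN()] = true
	}
	return len(set)
}
```

Calling nanSetLen(10) and printing the result shows how many keys the map holds.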
We insert “the same” NaN values, and still get 10 values in the set.
NaN Bonus!
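Go is now getting a built-in clear function to clear maps (it shipped in Go 1.21).
This is mostly required because there’s no other way to remove NaN keys from a map: delete looks the key up with ==, so a NaN argument never matches anything.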
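A short sketch of the difference, assuming Go 1.21 or later for the clear built-in. The helper name nanKeysLeft is hypothetical:

```go
package main

import "math"

// nanKeysLeft is a hypothetical helper: it stores one NaN key, tries to
// remove it with delete, and then empties the map with clear.
// It returns the map length after each step.
func nanKeysLeft() (afterDelete, afterClear int) {
	m := map[float64]bool{math.NaN(): true}

	// delete finds keys with ==, and NaN != NaN, so nothing is removed.
	delete(m, math.NaN())
	afterDelete = len(m)

	// clear (Go 1.21+) drops every entry without comparing keys.
	clear(m)
	afterClear = len(m)
	return afterDelete, afterClear
}
```

Here the NaN key survives the delete call and only disappears once clear runs.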
More Words
Floating point numbers are weird. They are weird when they work correctly (thanks Go) and weirder when they don’t.
If you have any choice in the matter, never use them as keys. If you do, well, be prepared to have some interesting bugs.