An unordered list of things I miss in Go

SkiFire13 · on Aug 17, 2024

> While I understand the reason for this (i.e.: to avoid developers relying in a specific iteration order), I still find it weird, and I think this is something unique for Go.

Rust's `HashMap` and `HashSet` also do the same with the default hasher (you can plug your own deterministic hasher though). The reason for this choice, which I think also applies to Go, is to be resistant against HashDOS attacks, which can lead to hashmap lookups becoming O(n) on average. This is especially important for maps that will contain data coming from untrusted sources, like data submitted in a GET/POST request.

I do agree though that an ordered map would nicely solve the issue.

masklinn · on Aug 18, 2024

> Rust's `HashMap` and `HashSet` also do the same with the default hasher

Technically no.

Rust does use per-hashmap seeds for its keyed hashes (where many languages use per-process seeds).

Go does that, but it also randomises the order of each iteration (it randomises the start offset of the iteration). This latter operation has nothing to do with hashdos mitigation, and exists exclusively to frustrate accidental reliance on iteration order.
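To make the "random start offset" idea concrete, here is a minimal sketch in Go. It is a simplified model under stated assumptions, not the Go runtime's actual code: the slice stands in for a table's slots, and `visitFromRandomOffset` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"math/rand"
)

// visitFromRandomOffset walks items in a fixed cyclic order but begins at a
// random position, so repeated walks see rotations of the same sequence.
// This is an illustrative model, not how the Go runtime is written.
func visitFromRandomOffset(items []string, visit func(string)) {
	if len(items) == 0 {
		return
	}
	start := rand.Intn(len(items)) // hypothetical per-iteration offset
	for i := range items {
		visit(items[(start+i)%len(items)])
	}
}

func main() {
	slots := []string{"foo", "bar", "baz", "qux", "quux"}
	for i := 0; i < 3; i++ {
		visitFromRandomOffset(slots, func(k string) { fmt.Print(k, " ") })
		fmt.Println()
	}
}
```

Each pass prints a rotation of the same underlying sequence, which is why nothing here changes how keys hash or how resistant the table is to collisions.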

znkr · on Aug 18, 2024

You would have that frustration anyway during toolchain upgrades if the map implementation changed from one go version to the next.

kibwen · on Aug 17, 2024

> I do agree though that an ordered map would nicely solve the issue.

To be precise, Rust does provide an ordered map in the standard library, it's std::collections::BTreeMap. However, iteration order is key comparison order, not insertion order.

yarg · on Aug 18, 2024

If you want insertion order you need something like a LinkedHashMap.

Java's had that forever, but it's not really a common use case.

SkiFire13 · on Aug 18, 2024

I personally prefer something like Rust's `indexmap` instead, which is basically a hash table mapping from key to an index into a `Vec` containing the values. AFAIK this is also the approach taken by C#'s Dictionary.
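A rough sketch of that layout in Go, assuming a hypothetical `IndexMap` type (the names are made up for illustration and this is not the `indexmap` crate's API): a hash table maps each key to a position in a slice that holds the entries in insertion order.

```go
package main

import "fmt"

// entry is one key/value pair, stored in insertion order.
type entry[K comparable, V any] struct {
	key   K
	value V
}

// IndexMap is a hypothetical insertion-ordered map: a hash table from key to
// an index into a slice of entries.
type IndexMap[K comparable, V any] struct {
	index   map[K]int
	entries []entry[K, V]
}

func NewIndexMap[K comparable, V any]() *IndexMap[K, V] {
	return &IndexMap[K, V]{index: make(map[K]int)}
}

// Set overwrites an existing key in place or appends a new entry.
func (m *IndexMap[K, V]) Set(k K, v V) {
	if i, ok := m.index[k]; ok {
		m.entries[i].value = v
		return
	}
	m.index[k] = len(m.entries)
	m.entries = append(m.entries, entry[K, V]{k, v})
}

// Get looks the key up in the hash table, then reads the slice.
func (m *IndexMap[K, V]) Get(k K) (V, bool) {
	if i, ok := m.index[k]; ok {
		return m.entries[i].value, true
	}
	var zero V
	return zero, false
}

// Keys returns the keys in insertion order.
func (m *IndexMap[K, V]) Keys() []K {
	keys := make([]K, 0, len(m.entries))
	for _, e := range m.entries {
		keys = append(keys, e.key)
	}
	return keys
}

func main() {
	m := NewIndexMap[string, int]()
	m.Set("foo", 1)
	m.Set("bar", 2)
	m.Set("foo", 3)       // updates in place, position unchanged
	fmt.Println(m.Keys()) // [foo bar]
}
```

Deletion is left out of the sketch; it is the awkward part of this design, since removing an entry from the middle of the slice means either shifting later entries or leaving a tombstone.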

KMag · on Aug 18, 2024

I believe one rewrite of Python's dict was the first mainstream use of this sort of hash map as a default implementation.

I wish they provided a sort method to re-sort and re-index the vector to change the iteration order without the space overhead of creating and sorting a separate vector/list of keys (or key-value pairs, depending on use case; you might want to change iteration order based on the currently held value).

masklinn · on Aug 18, 2024

> I believe one rewrite of Python's dict was the first mainstream use of this sort of hash map as a default implementation.

Technically pypy and I believe php implemented this model first, though it's probably most well known from cpython (for which it had been proposed first, by raymond hettinger).

lifthrasiir · on Aug 18, 2024

I believe PHP is one of the (much) earlier languages with them.

andrewshadura · on Aug 18, 2024

Tcl was much earlier, in fact.

rezaprima · on Aug 18, 2024

The first time I came across LinkedHashMap was while reading a JSON library (Jason, iirc).

matwood · on Aug 18, 2024

If you spend any amount of time programming Java, reviewing this list is a big help.

https://www.geeksforgeeks.org/collections-in-java-2/

meling · on Aug 18, 2024

Since Go 1.23 you can do this: slices.Sorted(maps.Keys(m))
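For context, a minimal runnable sketch of that one-liner. It assumes Go 1.23 or later, where `maps.Keys` returns an iterator and `slices.Sorted` collects it into a sorted slice; `sortedKeys` is just an illustrative wrapper.

```go
package main

import (
	"fmt"
	"maps"
	"slices"
)

// sortedKeys returns the keys of m in ascending order (requires Go 1.23+).
func sortedKeys(m map[string]bool) []string {
	return slices.Sorted(maps.Keys(m))
}

func main() {
	m := map[string]bool{"foo": true, "bar": false, "baz": true}
	// The range below is over a sorted slice, so the output is stable.
	for _, k := range sortedKeys(m) {
		fmt.Println(k, m[k])
	}
}
```

Note that this gives sorted order, not insertion order, and it allocates a new slice of keys on every call.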

masklinn · on Aug 18, 2024

Ordering generally denotes the preservation of insertion ordering, and possibly the alteration of that order. Sorted iteration is a completely different beast (and generally you’d use a tree-based collection if you need that).

akira2501 · on Aug 17, 2024

> lookups becoming O(n) on average.

If you're using a central dictionary then it should be worst case O(log n), but the price you pay for that is attacker controlled allocated memory growth.

knome · on Aug 18, 2024

there are various ways to structure dictionaries. one was to essentially create an array with each entry pointing into a list of values. you hash the key, then add the key/value to the list in that slot. the hash attacks would ensure everything hashed to the same slot, forcing the dict to effectively become an alist.

it would do something similar to implementations that keep jumping X slots mod size looking for an open slot, using the hash as the start slot. since everything hashed to the same value, everything you stuck in the dict would have to walk over every slot that had already been filled, again effectively making the dict into an overly complex association list for all practical purposes.
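The first structure described above (an array of slots, each holding a list) can be sketched in Go as follows. This is a toy model under stated assumptions: `chainedTable`, `spreadHash` and `collidingHash` are hypothetical names, and the constant hash stands in for an attacker who has found keys that all collide.

```go
package main

import "fmt"

type pair struct{ key, value string }

// chainedTable is a toy separate-chaining hash table with a pluggable hash
// function, so that a worst case can be simulated.
type chainedTable struct {
	buckets [][]pair
	hash    func(string) uint
}

func newChainedTable(size int, hash func(string) uint) *chainedTable {
	return &chainedTable{buckets: make([][]pair, size), hash: hash}
}

func (t *chainedTable) slot(k string) int {
	return int(t.hash(k) % uint(len(t.buckets)))
}

// put updates an existing key or appends to the list in the key's slot.
func (t *chainedTable) put(k, v string) {
	i := t.slot(k)
	for j := range t.buckets[i] {
		if t.buckets[i][j].key == k {
			t.buckets[i][j].value = v
			return
		}
	}
	t.buckets[i] = append(t.buckets[i], pair{k, v})
}

// get also reports how many key comparisons the lookup needed.
func (t *chainedTable) get(k string) (value string, comparisons int, ok bool) {
	for _, p := range t.buckets[t.slot(k)] {
		comparisons++
		if p.key == k {
			return p.value, comparisons, true
		}
	}
	return "", comparisons, false
}

// spreadHash is a simple polynomial string hash.
func spreadHash(s string) uint {
	var h uint
	for i := 0; i < len(s); i++ {
		h = h*31 + uint(s[i])
	}
	return h
}

// collidingHash models keys chosen by an attacker to land in one slot.
func collidingHash(string) uint { return 0 }

// fill inserts n keys named key0, key1, ...
func fill(t *chainedTable, n int) {
	for i := 0; i < n; i++ {
		t.put(fmt.Sprintf("key%d", i), "v")
	}
}

func main() {
	good := newChainedTable(1024, spreadHash)
	bad := newChainedTable(1024, collidingHash)
	fill(good, 1000)
	fill(bad, 1000)
	_, g, _ := good.get("key999")
	_, b, _ := bad.get("key999")
	fmt.Println("comparisons with spread hash:", g)
	fmt.Println("comparisons with colliding hash:", b) // 1000
}
```

With every key in one slot, a lookup for the last inserted key has to walk the whole list, which is the association-list behaviour the comment describes. A secret per-map or per-process seed makes it hard for an attacker to predict which keys collide.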

coldtea · on Aug 17, 2024

>While I understand the reason for this (i.e.: to avoid developers relying in a specific iteration order), I still find it weird, and I think this is something unique for Go. This decision means that even if you don't care about a specific order, you will still need to sort the map before doing something else if you want reproducibility

Even if they didn't randomize, unless they also explicitly guaranteed stable map order across versions, you DO need to sort the map if you want reproducibility.

Because if you relied on it being conveniently stable within the same Go version, with no guarantees, your program would still be broken (reproducibility wise), if they changed the hashmap implementation.

dwattttt · on Aug 17, 2024

There was a very insightful comment here on HN a while back; someone noted how terrible some SAP UI for travel was, because it provided a huge list of destinations in random order.

The insightful comment said they were sure that whoever was implementing the UI noted the list of destinations was always ordered, so they never bothered to sort it themselves. And I guess the data source never guaranteed it was ordered, it was just a coincidence, and one day they stopped providing the list in-order.

tialaramex · on Aug 18, 2024

Yeah, that's a frequent cause of card creation in our systems.

A relatively junior (e.g. graduated less than a year ago) engineer is tasked to make a List of Rooms. They write some EF (despite an actual formal course on SQL, none of them seems to know the first thing about databases, but they can figure out EF apparently). The EF gives them a list of Rooms and it appears (at a glance) to be right, so they commit their work and this ships to our users.

The people using the software ask why "8340B" is between "8340D" and "8340E" and the technical answer is "During database changes a year ago, the underlying data was reordered and that's where this record is" but that's not what they mean, they mean "Why didn't you idiots sort the data?" and the answer is well, junior engineer.

Of course, with a layer of people to decide requirements, by the time the card is in my next item pile it doesn't say "Just sort the EF query" - now it's about how we need to adjust the column widths, change to a new colour scheme, and add a checkbox nobody will use... oh, but also can this list be sorted? So there's an excellent chance that if another junior engineer takes that card, the sorting gets overlooked and the users assume we can't (rather than don't) fix it.

kokada · on Aug 18, 2024

> Even if they didn't randomize, unless they also explicitly guaranteed stable map order across versions, you DO need to sort the map if you want reproducibility.

Author here.

The particular case where this behaviour took me by surprise was when I wanted to print all available options from a toy CLI program that I wrote. I defined the options in a map (that had a function handler as a value) and didn't care about the order at all, but at least I expected the print order to be stable.

So it took me by surprise when I ran the program 3 times and the order changed. First I thought I did something wrong, then I found the issue after some search.

Now, it can be argued that this behaviour was good because I ended up sorting the values before printing, and now they have a predictable order. But I still think this was that kind of code that I didn't have to write (thanks to how Go works, I need to create a separate list and sort it instead; I think it is better with iterators in Go 1.23 though).
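The "separate list and sort it" workaround looks roughly like this. It is a sketch with made-up option names and placeholder handlers, not the author's actual program.

```go
package main

import (
	"fmt"
	"sort"
)

// optionNames collects the map's keys into a slice and sorts them, so the
// caller can iterate in a stable order.
func optionNames(options map[string]func()) []string {
	names := make([]string, 0, len(options))
	for name := range options {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	// Hypothetical CLI options; the handlers are placeholders.
	options := map[string]func(){
		"serve":   func() { fmt.Println("serving") },
		"build":   func() { fmt.Println("building") },
		"version": func() { fmt.Println("v0.0.0") },
	}
	fmt.Println("available options:")
	for _, name := range optionNames(options) {
		fmt.Println(" ", name)
	}
}
```

The listing now prints in the same (alphabetical) order on every run, at the cost of the extra helper the comment complains about.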

> Because if you relied on it being conveniently stable within the same Go version, with no guarantees, your program would still be broken (reproducibility wise), if they changed the hashmap implementation.

Well, not in this particular case. I just wanted it to print in a stable way between multiple runs of the program. Again, maybe this whole thing helped me write a better program. But I still feel that it is unnecessary, and would prefer to just use an ordered map instead.

lelanthran · on Aug 18, 2024

> I defined the options in a map (that had a function handler as a value) and didn't care about the order at all, but at least I expected the print order to be stable.

But that means that you did care about the order. You may not have cared what it was, but you cared that it existed.

IOW, you relied on the order.

In any hashmap, it's a given that you cannot rely on the order. Some languages may explicitly give you those guarantees but that is separate from the prescription for 'hashmap'.

kokada · on Aug 18, 2024

I think you're nitpicking about semantics, but ok.

> In any hashmap, it's a given that you cannot rely on the order.

Again, any order was fine, as long as it was stable (it could even be stable between compiled versions if Go, say, added a salt during compilation). The weird part here is that it changed between runs.

lelanthran · on Aug 18, 2024

> Again, any order was fine, as long as it was stable

Understood, but that is not a guarantee of hashmaps. Hashmaps, by themselves, are a concept independent of any programming language.

So, sure, I get that your expectations of a hashmap's behaviour were set by some programming language $FOO, and that it is reasonable to expect that someone who learned on $FOO has the expectation that a hashmap has a specific behaviour.

But, that being said, after you learn that a hashmap is a concept independent of any programming language, and after you learn that the concept does not guarantee the behaviour you expect, it's significantly easier to change your expectations for hashmap behaviour than to change the definition of the concept's guarantees.

pests · on Aug 18, 2024

It was stable for any one run :D

Is this not just human desires projecting extra requirements?

You say you don't care about order, but then you clarify that you do want them to be stable between runs.

Those are not the same thing. To a computer without judgement, any order is fine. The first run is somehow blessed by you, so any runs after seem off or weird. If it didn't actually matter to a computer, it wouldn't care if it was different each run.

kokada · on Aug 18, 2024

> It was stable for any one run :D

Not even that, because if you iterate multiple times you get random order in each different iteration:

    package main
    
    func main() {
      m := map[string]bool{"foo": true, "bar": false, "baz": true, "qux": false, "quux": true}
    
      for range 5 {
        for k := range m {
          println(k)
        }
        println()
      }
    }
    
    
    $ go run ./main.go
    bar
    baz
    qux
    quux
    foo
    
    foo
    bar
    baz
    qux
    quux
    
    qux
    quux
    foo
    bar
    baz
    
    baz
    qux
    quux
    foo
    bar
    
    quux
    foo
    bar
    baz
    qux

> You say you don't care about order, but then you clarify that you do want them to be stable between runs.

Being stable between runs doesn't mean I want a specific order. Slightly different things.

> Those are not the same thing. To a computer without judgement, any order is fine. The first run is somehow blessed by you, so any runs after seem off or weird. If it didn't actually matter to a computer, it wouldn't care if it was different each run.

If you want to argue about how computers work, remember that computers are bad at doing random things, this is why they need special hardware to do so.

The fact that Go goes as far to actually randomise each iteration is the surprising thing here, making iteration unstable, and not the fact that iterating over a hash map's values yields them in a random order.

lelanthran · on Aug 18, 2024

> The fact that Go goes as far to actually randomise each iteration is the surprising thing here,

I gotta be honest, having come to Go after having first programmed professionally for around 25 years, I have never run into this problem. My expectation for hashmaps was set decades ago, and it was set as "Thou Shalt Not Depend On The Order Of Hashmaps!" :-)

Having used hashmaps via some obscure C library in the 90s, I quickly came to realise that I couldn't rely on the order - simply adding an element and then immediately removing it with no lines of code in between those two calls was enough, in some cases, to completely mess up whatever order I thought there was.

My own OSS hashmap library, on github, specifically doesn't mention order, because I (maybe naively) assumed that everyone reaching for a hashmap already knows that the order isn't guaranteed to be stable at all, even if the hashmap is never modified by the caller in between reads.

kokada · on Aug 18, 2024

> simply adding an element and then immediately removing it with no lines of code in between those two calls was enough, in some cases, to completely mess up whatever order I thought there was.

Yes, this is what I expect too. But this is NOT what Go is doing here. Go is actually randomising the order EVERY time you try to iterate, even if you do NO manipulation of the items inside the map. See this comment thread for more details: https://news.ycombinator.com/item?id=41274850#41276274.

This is the surprising part, not that the iteration order may change if you add/remove items.

> My own OSS hashmap library, on github

Does your hashmap library, say, have a `iterate()` method that randomises the order of items every time it is called? If yes, ok, we are talking about the same thing here. But I am almost sure that it does not, and this is what Go is doing here.

lelanthran · on Aug 18, 2024

> Go is actually randomising the order EVERY time you try to iterate

Which is 100% perfectly within specification of the hashmap concept. It may not match your expectation of what a hashmap is, but it is what it is, and that's what a hashmap is.

IOW, there is no guarantee, and a library that randomises the order that it returns hashmap elements when iterating is perfectly correct!

kokada · on Aug 18, 2024

I never said once that the implementation is incorrect, just that it is weird and uncommon.

coldtea · on Aug 18, 2024

Yes, but the expectation for non-weirdness is not based on the actual specs a hashmap must have, just on contingencies ("but no other language I know does it like this") that apply to quite narrow criteria (iterating over statically added hashmap entries, if you added entries from a dynamic source, or added them in parallel, etc, they could very well change for every run without the language randomizing anything explicitly).

pests · on Aug 19, 2024

I agree with you but I do see his perspective.

"Hashmaps make no guarantee about iteration order" does not imply the implementation is going to do _EXTRA WORK_ to make that true. It being ordered, or providing a guarantee of some order, is still within spec of the hashmap concept.

To expend extra cycles to force no order, when any order is fine (esp if it falls out of the data structure), is extra work. It's an extra constraint, one a hashmap does not require.

I wonder how much energy we have collectively spent sorting lists for no reason.

kokada · on Aug 18, 2024

That is still a valid expectation; just look at the other commenters in this thread who also find the behavior surprising.

coldtea · on Aug 18, 2024

>Yes, this is what I expect too. But this is NOT what Go is doing here. Go is actually randomising the order EVERY time you try to iterate, even if you do NO manipulation of the items inside the map.

The parent's point is that this is still within the guarantees hashmap (the abstract data structure) gives about order: none.

All the rest are details that are based on your particular conditions, which are very very very specific:

- you only had a set fixed list of items, added always with a particular order (whereas for a program the items or order can also be totally dynamic, e.g. coming from a REST call, or a DB query of data that changes, or from parallel parsing something which doesn't guarantee any order etc).

- you only cared about specific runs within the same build or within the same Go compiler version to be stable. People who care for reproducibility normally care for reproducibility across builds and Go versions too. This "I care for reproducibility and order, but only very partially" is quite rare. E.g. if your tests depended on the order being stable, one would want the tests to keep working if Go changes their salt or hashmap algorithm or whatever in a subsequent version, no?

>Does your hashmap library, say, have a `iterate()` method that randomises the order of items every time it is called? If yes, ok, we are talking about the same thing here. But I am almost sure that it does not, and this is what Go is doing here.

That's an implementation detail that shouldn't concern the user though. Go just goes out of the way to instill this, whereas other languages/libs do not.

kokada · on Aug 18, 2024

> The parent's point is that this is still within the guarantees hashmap (the abstract data structure) gives about order: none.

That is not the parent's point; at least, nothing else in the text suggests so.

> All the rest are details that are based on your particular conditions, which are very very very specific:

They're not "very very specific", they're based on my experience in other programming languages. And when I say this, I am not talking about Python (that has ordered maps by default), but languages like Ruby or Java.

> That's an implementation detail that shouldn't concern the user though. Go just goes out of the way to instill this, whereas other languages/libs do not.

I think this thread isn't going anywhere. Again, I am not saying Go or any other hash map implementation that does the same is wrong here. It is just that it is uncommon, that I had "one case in the past that if Go didn't randomise I wouldn't need to write extra code" (and I didn't say this case was common either), and that I would like ordered maps in the stdlib not just because of this particular case but because they are really versatile (and would help in this case, but I never said it was the only case where ordered maps matter either).

Keep in mind that I wrote this post on a sleepless night and didn't give it much thought. My argument was never "this is the reason why we should have ordered maps in Go", it was mostly "I had this issue once, I think ordered maps are cool and if Go had them I wouldn't have had this issue".

I am still surprised how much (good) discussion this post ended up generating considering how little thought I gave this post.

fsckboy · on Aug 20, 2024

>My expectation for hashmaps was set decades ago, and it was set as "Thou Shalt Not Depend On The Order Of Hashmaps!"

"when chasing a bug during development, it is nice to have a program demonstrate the bug every time you run it with the same inputs and not randomise things to make the bug intermittent" is what people mean when they say they want something to "be stable" or "the same". It's not a crazy thing to want control over, and for some reason this whole thread yammers on and on and doesn't address this point.

> :-)

:-(

wizzwizz4 · on Aug 18, 2024

Without semantics, our programs are nothing.

randomdata · on Aug 18, 2024

> But I still feel that it is unnecessary

So did Go in the early days, which quickly proved to be problematic. Randomization was eventually added to try and call attention to buggy (in the typical case; yours may be an exception) code.

thcoldwine · on Aug 18, 2024

SQL does not guarantee the order either, just like Go, but we don’t complain about this in SQL :)

Some things need to be accepted and I am actually glad Go randomizes the order — it makes programs more robust if the backing storage of the map changes in the future.

Having better data structures in stdlib will be beneficial, but we just got iterators so this will follow in 1-3 years time I believe.

I would prefer to not have generics in the language personally, but luckily they’re not that common in the codebases anyways.

garyrob · on Aug 17, 2024

Borgo is an interesting attempt to address some of these issues. I would love it to get real traction.

https://github.com/borgo-lang/borgo

0cf8612b2e1e · on Aug 17, 2024

This looks incredible. I used to write Go, but it always felt like wearing a straitjacket. Too many missing features relative to alternatives, but this fixes a big list of my complaints.

I am leery to tap into a new ecosystem, but the risk in this case might not be terrible. Theoretically, you could always take the transpiled code and port to idiomatic Go if the project died.

garyrob · on Aug 17, 2024

Yeah. That's similar to my thoughts. But only 2 contributors and no update in 3 months. And there are a number of open issues with zero responses. I'm not in a position to contribute myself. I don't know enough about compiling/transpiling and I have too much on my plate already. So I can't complain, but I'd be happy if more people joined the project and it had more activity. It's exactly what I'm looking for.

jfoudfwfayasd · on Aug 17, 2024

Go with a competent type system would be wonderful

nickm12 · on Aug 18, 2024

It looks like Borgo is to Golang what Typescript is to Javascript. It's kind of ridiculous that Golang would need this, but it actually makes sense to me as something you'd want to use.

antonvs · on Aug 18, 2024

> It's kind of ridiculous that Golang would need this

It's because Go is a ridiculous language. People don't want to admit this, but it was designed by people stuck in the last millennium when it comes to language design.

Bognar · on Aug 18, 2024

I've always wondered where the love for Go comes from because this is exactly my take.

anothername12 · on Aug 18, 2024

This exactly. I’ve been doing Go for a few years now after coming from Java, Lisp, Ruby etc. I’ve often wondered while coding in it if it was the result of a time traveller explaining garbage collection and lambdas to a 70’s C programmer living under a rock.

qaq · on Aug 18, 2024

Was really excited when I first found it but it looks abandoned

neonsunset · on Aug 18, 2024

Come here to us in .NET land, we have even smaller AOT binaries, nullability and better systems programming story. And type unions, whenever they come around in one of the next releases.

garyrob · on Aug 18, 2024

How small can a .NET binary for a CLI application be? Any special instructions for compiling one? (Sorry that this is offtopic but I had the impression that .NET binaries were big so I thought I'd take this opportunity to ask... I like F#)

neonsunset · on Aug 18, 2024

Short answer: 1-1.3MiB AOT, down to ~800KiB if you add flags to really push it (impractical). Will grow as you add dependencies. ~130KiB for runtime-less JIT, ~13MiB for JIT+runtime. All these imply a single runnable executable you can ship to user as is.

To get this just type `dotnet new console --aot && dotnet publish -o .` in a folder of choice. The binary and .csproj will have the same name as the folder they are placed in. You can rename Program.cs too if it bothers you.

Long answer:

There are 3 main ways to publish a binary for a CLI (as well as back-end and often GUI too). .NET is quite flexible about this, which comes down to what you need:

- AOT which starts at 1-1.3MiB, this is what you get out of `dotnet new console --aot` template compiled with `dotnet publish -o .`

- JIT+runtime which starts at 12-13MiB, this is a combination of flags (which frankly would make a good default) trimmed, self-contained and single-file. Normally you get those by specifying them in .csproj or just doing `dotnet publish -o . -p:PublishTrimmed=true -p:PublishSingleFile=true`.

- JIT without runtime (expects the host to have .NET runtime installed) which starts at 120-140KiB. Notably, it's just a thin runtime launcher with pure CIL assemblies embedded in it. This can be achieved with `dotnet publish -o . -p:PublishSingleFile=true --sc false`.

All these have their own use cases that determine which one is the best. Usually, for CLI you either want to use the first or the third one. My personal preference for all kinds of one-off utilities is AOT as it has the best startup time. There are other ways to publish a binary, including the historical default which dumps all assemblies separately, but I think they are not useful given the nature of your question, nor something you need to deal with in practice.

For a more comprehensive comparison of what to expect from .NET AOT, you can look at https://github.com/MichalStrehovsky/rt-sz/issues/63 which is a job that tracks binary size improvements/regressions from runtime contributions with the exact data for different templates and sample use cases (full vs stripped down console hello world, asp.net core webapiaot template, avalonia template, etc.)

Overall, what I meant by "smaller binary sizes" is that .NET's AOT tooling has become quite advanced over the last two releases, and provides better scalability as you add dependencies than Go due to metadata compression, dehydrated binary sections, flow analysis, etc.

To give you an example, there's https://github.com/codr7/sharpl that was on HN not so long ago, when compiled with .NET 9 RC.1 it takes about 2.6MiB on my machine.

On F# - 'FSharp.Core' has quite a few dated bits inside, and custom metadata, both of which are not very friendly to linking and AOT compilation size - it will produce trim warnings which means that there is code that might have been "trimmed away" but might be dynamically accessed at runtime, causing an exception. This is normally addressed by using one of the JIT options. Mind you, they still have good startup latency, just not the <100ms one.

garyrob · on Aug 18, 2024

Thanks for all this info! I appreciate it.

qaq · on Aug 22, 2024

Honestly, I decided to focus on Swift. The server ecosystem is lacking, but I really like the language.

kitd · on Aug 18, 2024

I disagree with having the ordering embedded into the map implementation. That imposes an unnecessary performance overhead to support a small subset of use cases.

I think what the author requires is iterating over a sorted list of keys. That is pretty easy to implement using the standard library, and imposes the performance penalty only when it is needed.

tialaramex · on Aug 18, 2024

There are three distinct types being discussed here, let me try to briefly explain them.

1. Just a hash table, Rust's std::collections::HashMap, C++ std::unordered_map, Go's map

This type is not about the "order" of its contents. If you want the "order" in any sense, that's not what this is for and you have the wrong type just as surely as if you were surprised that your integer type can't store a half. Types of this kind can be optimised to provide extremely fast indexing by key which is why they exist as this is useful in many problems.

2. A container arranged by the value of the keys, Rust's BTreeMap, C++ std::map

This type is about the order of its contents by value. It doesn't matter when you put a 4 into this container, it goes between 3 and 5 anyway. This type is good when you need to work in that "by value" order later, for example to take the "Most important" item or the "Soonest". It doesn't remember the order in which things were added, and it is relatively slow to find items by their key.

3. A container forever arranged by order of insertion, Python's OrderedDict (and dict), in Rust that's https://crates.io/crates/linked-hash-map LinkedHashMap

This type remembers the order in which you inserted items into the container and can give them all back in that order efficiently. In other ways it's like the first container, but it compromises performance significantly to deliver this "order" promise.

It is problematic when people talk past each other on this. It gets in the way of a useful discussion on HN, but it is much worse in a Software Engineering team if you thought you were being given an OrderedDict but it was actually a BTreeMap, for example.

Python chooses to provide (3) because Python is slow anyway so why not at least provide the least surprising container given how slow the language is. The existing Python dict was so awful that OrderedDict is actually faster (not fast in the wider scheme of things, but faster than that) so that's good enough.

bostik · on Aug 18, 2024

> Python chooses to provide (3) because Python is slow anyway so why not at least provide the least surprising container

I believe this misses quite an important bit of history. Python's dict retains key creation order since 3.6, and the behaviour was made official from 3.7 onwards. But before that, the keys were in unspecified order. Not random, because the order was stable: if you iterated through the same dict twice within the same process, you got the keys in the same order.

The property of retaining key creation order was a side effect from the underlying implementation. In the 3.6 release, Python switched over to their new dictionary implementation, lifted from the PyPy project. From what I recall, the reason for the change was that the new implementation had a notably lower per-key overhead. That decrease in memory use resulted, I believe, in the slightly faster performance as well. Order retention came "for free".

Personally I believe that order retention as a default property is a mistake. Now, I admit that OrderedDict semantics are often more convenient, but they break from the expected dict/map semantics with other languages. And since we are stating our opinions, in my mind Go choosing to forcibly randomise maps' key traversal order is a good safeguard. It guarantees that no-one can even accidentally depend on key iteration order. (Yes, I have seen production outages thanks to someone's code implicitly relying on key creation/traversal order when processing RESTful payloads.)

As should be apparent, I disagree with the current behaviour being "least surprising".

fuzztester · on Aug 18, 2024

>(Yes, I have seen production outages thanks to someone's code implicitly relying on key creation/traversal order when processing RESTful payloads.)

Interesting. Example of that?

bostik · on Aug 18, 2024

Sure. This is from the previous job. And for background: the service in question would routinely have several thousand concurrent, live user sessions. All stateful.

Two teams, let's call them Team A and Team B, would have their respective services handling user traffic. The service maintained by Team A would handle the traffic, while the service maintained by Team B would handle the more complex background state transitions that Team A would not have to care about during the day.

Service A would hold a complete client session state. Service B was stateless. Messages sent from service A to service B would contain all the necessary data to build up the correct state for every message received. Team A wrote their service in "not Python" language. Team B wrote theirs in Python. Communication between the services was RESTful, so essentially "JSON payload in a HTTP POST message".

Service B had a construction in their code that in simplified terms looked a bit like this:

    data = json.loads(msg.data)
    for key_, val_ in data.items():
        do_stuff(key_, val_)

And then inside the do_stuff() routine, there was a piece of logic that used an implicit state machine. It wasn't written like one, but it happened to rely on the processing order... Like this:

    def do_stuff(field, vals):
        # ACTION_TRIGGERS: placeholder for the set of possible action triggers
        if field in ACTION_TRIGGERS:
            ...  # do something based on field
        else:
            ...  # other things

Because service B was written in Python, and this was post python 3.6 days, the 'data' read from the message created a dictionary with keys in the same order they happened to come off the wire. Everything worked fine, because the way the on-the-wire JSON payload at service A was constructed also happened to put field keys in a specific order. Service B could process the fields in the order they came through and could build a larger state internally based on each of the fields.

Then, as happens to every well used service, requirements change. In order to support new use cases, service A would need to include a new field - and for the maintainers of that service, the most logical place in their internal structure was between existing fields. This change was known, and service B had added support for this additional field. In 'do_stuff' internals, they had added the new field to the end of the possible action triggers. They also had unit tests - written by themselves - to ensure their service would work correctly whether it received old or new payloads.

The unit tests had added the new field after the existing fields in their test inputs. Their internal state machine was coherent and correct.

And then, Team A ships their new service. Service B promptly starts to crash. Every crash triggers the Sentry client to serialise the full stack trace and send it over. In order to prevent a cascade failure, Sentry itself has been configured with a throttle, so once enough in-flight requests are lined up, it starts to apply backpressure and response delays. Sentry clients within service B end up blocking their respective workers. Service A cannot reliably send its messages over, because service B is bogged down waiting for N+1 Sentry client submissions to complete. In order to capture the error situations properly, service A also has a Sentry client within it...

It takes about 10 minutes for the teams to figure out what's going on before team A rolls back their deployment. But that was nonetheless visible downtime during live trading hours.

The root cause was obviously a logic bug, but it was only possible to build up to having such a logic bug due to the key iteration order semantics.
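
A minimal sketch of this failure mode, in Python like service B. The field names (`session`, `action`, `limit`) and the `process` helper are hypothetical, invented for illustration; the real code is not shown in this comment. The handler for the new field quietly assumes an earlier field has already been seen:

```python
import json


def process(payload):
    """Feed each field to do_stuff() in the order it came off the wire."""
    state = {}

    def do_stuff(field, val):
        if field == "limit":
            # Hidden assumption: "action" has already been processed.
            state["limit"] = (state["action"], val)  # KeyError otherwise
        else:
            state[field] = val

    # json.loads yields a dict whose keys keep their on-the-wire order.
    for key, val in json.loads(payload).items():
        do_stuff(key, val)
    return state


# Order used in the receiver's own unit tests: new field appended at the end.
test_payload = '{"session": "s1", "action": "buy", "limit": 10}'
# Order actually sent by the producer: new field placed between existing ones.
prod_payload = '{"session": "s1", "limit": 10, "action": "buy"}'

print(process(test_payload))
try:
    process(prod_payload)
except KeyError as exc:
    print("crash: missing", exc)
```

Both payloads are the same JSON object as far as the JSON format is concerned; only the key order differs, and only the second one crashes.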

Zamiel_Snawley · on Aug 18, 2024

What a great example, thank you for writing it!

nbadg · on Aug 18, 2024

Not to take away from your broader point (that different data types are appropriate in different scenarios), but:

> Python chooses to provide (3) because Python is slow anyway so why not at least provide the least surprising container given how slow the language is. The existing Python dict was so awful that OrderedDict is actually faster (not fast in the wider scheme of things, but faster than that) so that's good enough.

The python dict implementation is actually extremely optimized, and used in very critical hot paths throughout the interpreter and object model (for example, ``object.__dict__``). Additionally, python dicts (in cpython) are implemented in C, so any "slowness" there is going to be the result of the python code written to use the dictionary, and not the dict itself.

Up until python 3.6, cpython dictionaries were not ordered. At version 3.6, cpython dicts were made ordered, but only as an implementation detail. And at version 3.7, the preserves-insertion-order property of dicts was officially made part of the language spec, so that all python implementations need to support it.

The 3.6 change was made purely for performance reasons (and the stdlib already included an OrderedDict anyways). It was then made part of the language spec in 3.7 for several reasons: reduced maintenance burden for OrderedDict, convenience to developers using python, reducing the chance of accidental footguns of people relying on the implementation detail as if it were actually part of the language (and it then being removed later and breaking things), etc.

The decision was made as part of this thread[1], if you're curious.

[1] https://mail.python.org/pipermail/python-dev/2017-December/1...
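
As a quick illustration of what the 3.7 guarantee means in practice (a minimal sketch, nothing here is taken from the linked thread): iteration follows insertion order, updating a key keeps its position, and deleting then re-inserting a key moves it to the end.

```python
d = {}
d["b"] = 1
d["a"] = 2
d["c"] = 3
print(list(d))  # insertion order, not sorted order

d["b"] = 10     # updating an existing key keeps its position
after_update = list(d)
print(after_update)

del d["b"]
d["b"] = 1      # delete + re-insert moves the key to the end
after_reinsert = list(d)
print(after_reinsert)
```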

tialaramex · on Aug 18, 2024

> The python dict implementation is actually extremely optimized

It was upgraded from "optimized" terrible garbage to a sane attempt to do the same thing but smaller and faster. In the Python world I'm sure that's "extremely optimized". In the rest of the world we know it's not optimisation unless you measure and when you measure the Python dict is mediocre (but used to be much worse)

> used in very critical hot paths throughout the interpreter and object model

The old even worse one was used in the very same Python "critical hot paths" for many years.

Actually the earlier Python dict reminds me of "I can't believe it can sort" which is a weird sort algorithm which looks like it's a defective Insertion Sort that won't work, but is actually a working (but O(n^2) best case) sort algorithm. The old dict does in fact provide a hash table type for Python. It's much bigger than it needs to be, in order to enable an "optimization" which also makes it much slower than it needs to be.

fuzztester · on Aug 18, 2024

>The python dict implementation is actually extremely optimized, and used in very critical hot paths throughout the interpreter and object model (for example, ``object.__dict__``). Additionally, python dicts (in cpython) are implemented in C, so any "slowness" there is going to be the result of the python code written to use the dictionary, and not the dict itself.

Yes. Raymond Hettinger has one or more videos on YouTube titled something like "Python dictionaries" or "Modern Python dictionaries" that talk about the optimisations done on them.

masklinn · on Aug 18, 2024

> reduced maintenance burden for OrderedDict

The maintenance burden for ordered dict was not changed: ODict supports constant time moving to or removing from the start or end, so it has to be a linked hashmap, regardless of the ordering of the underlying map.
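
For reference, these are the operations in question, shown with the standard `collections.OrderedDict` API. This is only a usage sketch; it says nothing about how CPython implements them internally.

```python
from collections import OrderedDict

od = OrderedDict.fromkeys("abcd")   # keys a, b, c, d (values are None)
od.move_to_end("b")                 # move an existing key to the back
od.move_to_end("d", last=False)     # move an existing key to the front
order_after_moves = list(od)
print(order_after_moves)

first = od.popitem(last=False)      # remove from the front
last = od.popitem()                 # remove from the back
print(first, last, list(od))

# The builtin dict has no move_to_end(), and its popitem() only pops the
# most recently inserted item.
print(hasattr(dict, "move_to_end"))
```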

masklinn · on Aug 18, 2024

> Python chooses to provide (3) because Python is slow anyway so why not at least provide the least surprising container given how slow the language is. The existing Python dict was so awful that OrderedDict is actually faster (not fast in the wider scheme of things, but faster than that) so that's good enough.

That is completely incorrect. The builtin dict is similar to https://docs.rs/indexmap/latest/indexmap/ not a linked hashmap, it was used because it significantly improves iteration speed and uses less memory.

tialaramex · on Aug 18, 2024

The crucial thing about IndexMap is that it is not actually order preserving.

If it gets inconvenient to preserve order, it's just not preserved. For example if I put sixty items in, then remove thirty and add forty more, IndexMap doesn't put all those forty items "after" the remaining thirty from the removal because that's more work.

Python does preserve order, the fact that internally it looks somewhat like IndexMap is an implementation detail.

masklinn · on Aug 18, 2024

> the fact that internally it looks somewhat like IndexMap is an implementation detail.

It really is not. IndexMap was directly inspired by Python’s naturally ordered dicts. It’s spelled out right in the readme.

And while indexmap has weaker ordering guarantees for performance reasons (though also additional features aplenty), “ordermap” was revived as a wrapper which does conserve ordering on removal.
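
To make the ordering trade-off concrete, here is a toy Python model of an index-backed map. It is a sketch under my own assumptions, not the crate's implementation: a plain `dict` stands in for the hash index, and the class name is made up. The method names mirror indexmap's `swap_remove` and `shift_remove`.

```python
class IndexBackedMap:
    """Toy map: key -> position index, plus a dense list of (key, value)."""

    def __init__(self):
        self._pos = {}       # stands in for the hash table of indices
        self._entries = []   # dense, in insertion order

    def insert(self, key, value):
        if key in self._pos:
            self._entries[self._pos[key]] = (key, value)
        else:
            self._pos[key] = len(self._entries)
            self._entries.append((key, value))

    def swap_remove(self, key):
        """O(1): move the last entry into the hole; order is disturbed."""
        i = self._pos.pop(key)
        last = self._entries.pop()
        if i < len(self._entries):      # removed key was not the last entry
            self._entries[i] = last
            self._pos[last[0]] = i

    def shift_remove(self, key):
        """O(n): close the hole by shifting; order is preserved."""
        i = self._pos.pop(key)
        del self._entries[i]
        for k, _ in self._entries[i:]:
            self._pos[k] -= 1

    def keys(self):
        return [k for k, _ in self._entries]


def build():
    m = IndexBackedMap()
    for n, k in enumerate("abcd"):
        m.insert(k, n)
    return m


swapped = build()
swapped.swap_remove("a")
shifted = build()
shifted.shift_remove("a")
print(swapped.keys(), shifted.keys())
```

In this model new insertions always land at the end of the dense list; it is only removal that forces the choice between speed and order.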

assbuttbuttass · on Aug 18, 2024

The downside of something like indexmap is now removal is O(n)

masklinn · on Aug 18, 2024

If you do the remove naively yes, but you can use tombstoning, which amortises the cost (at the price of an increased iteration overhead), or use a non-order-preserving remove (like indexmap itself, though `remove` has been deprecated and you now get to pick your poison).

38 · on Aug 18, 2024

> This type is about the order of its contents by value.

No, by key.

tialaramex · on Aug 18, 2024

I deserved that, maybe I should write up a whole blog post about this topic where I can include a diagram so that we're clear it's the value (as opposed to its age or any other characteristic) of the key

38 · on Aug 18, 2024

if you want your comments to make any sense, you need to use "key value" with every usage of "value", otherwise people are going to think "container value"

DandyDev · on Aug 18, 2024

The author is not saying that ordering needs to be added to the _current_ map implementation. He suggests adding an additional map implementation that has ordering built in. That way, you can choose between functionality and (hypothetical) better performance.

The author does not seem to require iterating over a sorted list. Sorting is not the same as ordering. An ordered map is a map in which the insertion order is preserved when iterating over the elements. A sorted map outputs the elements in an order defined by a comparison function when iterating, regardless of their insertion order. You can for example sort alphabetically in case of string keys.
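
The distinction can be shown in a few lines of Python (used here only because its builtin dict happens to be insertion-ordered; the keys are arbitrary examples):

```python
inserted = {}
for key in ["pear", "apple", "fig"]:
    inserted[key] = len(key)

ordered_view = list(inserted)             # ordered: as inserted
sorted_view = sorted(inserted)            # sorted: by key comparison
sorted_by_len = sorted(inserted, key=len) # sorted: by a custom comparison

print(ordered_view)
print(sorted_view)
print(sorted_by_len)
```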

gizmo · on Aug 18, 2024

In most cases the performance penalty of having an extra internal array to keep track of insertion order is minimal, and the whole point of built-in collections is so people can quickly write correct programs. When optimizing for performance default collections are likely to get replaced with hand-rolled versions anyway. But in all other cases a dictionary that “just works” is preferable to one that has such an annoying footgun that the Go team had to randomize the iteration order in an attempt to treat the symptom instead of choosing correctness. Go isn’t even a high-performance language and many language design choices (channels!) explicitly prioritize correctness over performance.

It’s like having an unstable sort as the default standard library sort function. People reasonably expect that when calling sort twice, the second sort does nothing, but you can always find people who will passionately argue that people deserve to get burned if they assume a sort function is stable.
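
To illustrate the expectation being described: with a stable sort (Python's `sorted` is documented as stable), records with equal keys keep their relative order, so sorting an already sorted list by the same key changes nothing. The records below are made-up examples.

```python
records = [("bob", 2), ("alice", 1), ("carol", 2), ("dave", 1)]


def score(record):
    return record[1]


once = sorted(records, key=score)
twice = sorted(once, key=score)

print(once)
print(twice == once)
```

An unstable sort would be free to reorder `("bob", 2)` and `("carol", 2)` on either pass.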

icholy · on Aug 18, 2024

Yeah, who needs O(1) deletes anyway? /s

lifthrasiir · on Aug 18, 2024

In case you haven't realized yet, a hash table that maintains the insertion order can still do O(1) deletes as long as the key order doesn't change arbitrarily after the initial insertion.

icholy · on Aug 18, 2024

I'm commenting on the proposed implementation of using an array to keep track of insertion order.

lifthrasiir · on Aug 18, 2024

An array can be used to efficiently simulate a linked list and other data structures, however. (Or an intrusive linked list may be embedded into the bucket structure like PHP, but this is less efficient with an open addressing scheme, which is nowadays better for cache locality.)
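
One possible reading of "an array simulating a linked list", sketched in Python. Everything here is an assumption of mine (class name, parallel-list layout, free list); a plain `dict` stands in for the hash index. Slots are chained by integer prev/next indices, so unlinking a key is O(1) and insertion order survives deletes.

```python
class ArrayLinkedMap:
    """Insertion-ordered toy map backed by index-linked parallel arrays."""

    NIL = -1

    def __init__(self):
        self._slot = {}                  # key -> slot index
        self._keys, self._vals = [], []
        self._prev, self._next = [], []  # integer links instead of pointers
        self._free = []                  # recycled slot indices
        self._head = self._tail = self.NIL

    def insert(self, key, value):
        if key in self._slot:
            self._vals[self._slot[key]] = value
            return
        if self._free:                   # reuse a freed slot
            i = self._free.pop()
            self._keys[i], self._vals[i] = key, value
            self._prev[i], self._next[i] = self._tail, self.NIL
        else:                            # grow the arrays
            i = len(self._keys)
            self._keys.append(key)
            self._vals.append(value)
            self._prev.append(self._tail)
            self._next.append(self.NIL)
        if self._tail == self.NIL:
            self._head = i
        else:
            self._next[self._tail] = i
        self._tail = i
        self._slot[key] = i

    def remove(self, key):
        i = self._slot.pop(key)
        p, n = self._prev[i], self._next[i]
        if p == self.NIL:
            self._head = n
        else:
            self._next[p] = n
        if n == self.NIL:
            self._tail = p
        else:
            self._prev[n] = p
        self._keys[i] = self._vals[i] = None
        self._free.append(i)

    def items(self):
        i = self._head
        while i != self.NIL:
            yield self._keys[i], self._vals[i]
            i = self._next[i]


m = ArrayLinkedMap()
for n, k in enumerate("abc"):
    m.insert(k, n)
m.remove("b")
m.insert("d", 3)      # reuses b's slot but is linked at the tail
print(list(m.items()))
```

The price is that iteration follows links rather than walking a dense array, so it loses some of the locality a purely dense layout would have.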

icholy · on Aug 18, 2024

> An array can be used to efficiently simulate a linked list

That's obviously not what the OP meant. Also, I don't think there's an efficient way of implementing deletes with an array backed linked list.

mjevans · on Aug 18, 2024

That depends on what someone is willing to compromise. Extra space to point back at exactly that key (but that also needs to be updated each compaction?); personally I'd normally rather pay the lookup or key sort on iterator snapshot fee. An 'insert, or re-sorted order' side index which allows for nodes to be marked as 'deleted' (maybe nil / null, maybe a sentinel value meaning skip?); I might propose that to see if it fit the requirements well enough.

icholy · on Aug 18, 2024

... or just use a normal linked list with the existing entries like a sane person.

masklinn · on Aug 18, 2024

> I disagree with having the ordering embedded into the map implementation.

Good thing that's not what they are asking at all. They just want an ordered map to be in the standard library.

> That imposes an unnecessary performance overhead to support a small subset of use cases.

Naturally ordered hash maps generally have a small performance hit on lookup and a performance gain on iteration, as iteration goes through a dense array.

Linked hash maps do tend to have worse performance in all cases.

> I think what the author requires is iterating over a sorted list of keys.

Had they needed that, they'd have said that. But they did not. And they specifically refer to an ordered map, and to Python's built-in and Ordered dicts, which are not sorted.

akdor1154 · on Aug 17, 2024

Nillability is the biggest thing that drives me to write 'unidiomatic go': there are a few Optional libs around, they work ok.

I write with the following rule: if a pointer is passed, it shouldn't be nil. If it might be nil, code it as an Optional<> instead.

Un-golike but works great.

Neikius · on Aug 18, 2024

This is actually terrible. I tried that a while ago, but after a short while removed all of the Optional[] code and went back to pointers. Why? The default ser/des in go just cannot play. It instead works great with pointers.

I now use nillable pointers whenever I want to have an optional value. How did I solve the usability problem? I just introduced 2 simple methods: Or and Of. The first one resolves a ptr to its value, or to a default (provided as a param) if it is nil, and the second one makes a ptr from a value type. That is actually all you need! Don't have the code here or I'd post it but it's easy enough to make your own (with generics).

jjdhxhbe · on Aug 19, 2024

Using a pointer in Go can mean two things, and I don't like it:

- Value can be nil
- Value is mutable

It's not possible to have a const but nilable value, nor is it possible to have a non-nilable mutable value.

autarch · on Aug 18, 2024

We have started doing this at work as well. It makes the code a lot easier to understand, and I think it's worth the small hassle that this entails (because without a `match` statement, dealing with an Option type is verbose).

hellcow · on Aug 18, 2024

I define my own Null[T] for this purpose. There's sql.Null in the stdlib already, so that seems plenty go-like.

cempaka · on Aug 17, 2024

> While I understand the reason for this (i.e.: to avoid developers relying in a specific iteration order), I still find it weird, and I think this is something unique for Go.

Haha well, fun fact, Java did this as well after a bunch of code was broken by a JDK upgrade which changed HashMap iteration order that programmers had been relying upon. Java does at least have ordered maps in the standard lib though. IMO it is a questionable decision to spend CPU resources on randomization in order to mollycoddle programmers with a flawed understanding of an API like this, but then again I'm not the one who gets the backlash when their stuff breaks.

Also, on the subject of nullability, while JSR305 may be considered dead, there's still pretty active work on the Java nullability question both from the angle of tooling (https://www.infoq.com/news/2024/08/jspecify-java-nullability...) and language design (https://openjdk.org/jeps/8316779).

kevindamm · on Aug 17, 2024

I was about to comment the same. Also happened in absl (Google's alternative to the STL template library for C++ which started its life as GTL) -- when changing the default hash function for unordered maps it led to breaking changes in both test and production code that had depended on it.

OrderedDict types are nice sometimes but shouldn't be the default behavior, IMO.

There are good reasons for the second point (about default/named parameters) -- any calling code is making some assumptions based on that default value so there's a risk it can't be changed or added to. If you really want a default value, make a wrapper function for it. In the example it would be a simple matter of defining ReplaceAll with three arguments that always passes -1 to Replace(...)

jsnell · on Aug 17, 2024

Perl has done per-run randomization of the iteration order as well for a while (more than a decade).

    % repeat 5 perl -e '%h=(a => 1, b  => 2, c=> 3); print for keys %h; print "\n"'
    bca
    cba
    bac
    bac
    bac

okwhateverdude · on Aug 18, 2024

Cute story on why that came to be. At Booking.com, a degenerate case that led to a DoS that you could cause with specially crafted URLs (I think, memory is a bit foggy) spurred Yves Orton to do that hacking. And it broke a lot of code where the ordering had been consistent enough that people relied on it.

darrenf · on Aug 18, 2024

Way longer than a decade. You can see it in the perlfunc man page way back in 1996[0].

Of the same vintage is `Tie::IxHash`, which retains insertion order

    % repeat 5 perl -MTie::IxHash -E 'tie my %foo, "Tie::IxHash"; @foo{qw{ c a b }} = (1)x3; printf("%s%s%s\n",keys %foo)'
    cab
    cab
    cab
    cab
    cab

[0] https://metacpan.org/release/NI-S/perl5.003_02a/view/pod/per...

rurban · on Aug 21, 2024

No, Yves added randomization per iteration as described in the parent.

Before it was only dependent on the per-process seed. So the seed could be computed given enough attempts to iter a hash (eg a public JSON API)

darrenf · on Aug 23, 2024

But the parent is invoking `perl` itself 5 times, rather than iterating 5 times within one process.

h4ck_th3_pl4n3t · on Aug 18, 2024

[flagged]

Dylan16807 · on Aug 18, 2024

Notice how almost all of those are just a variable name or a number on a line by itself. Those being valid programs says very little about Perl as a language.

A few semicolons being okay in a mix with the above doesn't say much either, and the couple gnarliest examples happened to hit #, the comment character, so the rest isn't even being treated as code.

vessenes · on Aug 18, 2024

Counterpoint - Perl 5 should be (and frankly probably is) required reading for all language designers. You may not like its opinionated point of view, but it glued together the early internet, and occupied a niche so large for so long that calling it a niche is underselling it. Misunderstanding why Perl worked, and what worked about it, is a miss in any serious software engineer’s education.

IshKebab · on Aug 17, 2024

I wouldn't call it mollycoddling. I can understand why people make that mistake. Interfaces should be designed to reduce the chance of mistakes as much as possible, under the knowledge that people (including you!) make mistakes. They shouldn't be designed under the assumption that people always read and understand the manual and never make mistakes.

That's why we don't do `load` and `load_safe`; we do `load` and `load_unsafe`.

layer8 · on Aug 18, 2024

> Java did this as well after a bunch of code was broken by a JDK upgrade which changed HashMap iteration order that programmers had been relying upon.

This is incorrect (or I’m misunderstanding you). OpenJDK’s HashMap doesn’t use randomization, and the iteration order is thus deterministic under that implementation, although the API specification does not guarantee it. To mitigate DoS attacks, keys in the same hash bucket are stored as a balanced tree. For keys that implement Comparable (strings in particular), this guarantees O(n log n).

cempaka · on Aug 18, 2024

Ah you're right, what I had in mind was the iteration order of the immutable/unmodifiable maps created by Collections.unmodifiableMap() or Map.of(): https://docs.oracle.com/en/java/javase/20/core/creating-immu...

barsonme · on Aug 17, 2024

> IMO it is a questionable decision to spend CPU resources on randomization …

It takes about 3 ns to choose a random starting bucket, which is basically free relative to the iteration itself.

tgv · on Aug 17, 2024

Isn't the trick that the runtime picks another hash-constant every time?

masklinn · on Aug 18, 2024

Aside from keying the hash function, Go specifically randomises the start offset of each map iteration.
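
A deliberately simplified model of "random start offset" iteration, in Python. This is not Go's runtime code; the bucket layout and function name are assumptions for illustration only. The point is that every entry is still visited exactly once, and the only extra work is picking one random number per iteration.

```python
import random


def iterate_from_random_offset(buckets, rng=random):
    """Yield every entry once, starting at a random bucket and wrapping."""
    if not buckets:
        return
    start = rng.randrange(len(buckets))    # the only randomisation cost
    for step in range(len(buckets)):
        yield from buckets[(start + step) % len(buckets)]


# Hypothetical bucket contents; an empty bucket is fine.
buckets = [["a"], [], ["b", "c"], ["d"]]
for _ in range(3):
    print("".join(iterate_from_random_offset(buckets)))
```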

kbolino · on Aug 17, 2024

They just added custom iterators ("range over func") to the language in 1.23 but somehow missed the obvious opportunity to add e.g. maps.SortedByKeys. It's clunky to write:

    for _, k := range slices.Sorted(maps.Keys(m)) {
        v := m[k]
        _ = v // do something with k and v
    }

though, it's not as bad as before:

    keys := make([]string, 0, len(m))
    for k := range m {
        keys = append(keys, k)
    }
    slices.Sort(keys)
    for _, k := range keys {
        v := m[k]
        _ = v
    }

38 · on Aug 17, 2024

> maps.SortedByKeys

thats what you call overfitting kids

kbolino · on Aug 17, 2024

Whatever you think the best name for it is, it's still missing from the library.

jerf · on Aug 17, 2024

I think the problem with putting this into the standard library is that while Go may not be super focused on absolutely top-tier performance, it does generally try to avoid offering things that are unexpectedly slow or unexpectedly allocate large things. A sort-by-keys on the standard map would require allocating a full slice for the keys in the sort routine to do the sort, which would surprise people who expect built-in iterators to not immediately do that.

Plus it's in the class of things that's pretty easy to implement yourself now. There's always a huge supply of "but the library could just compose these two things for me". If you stick them all in things get bloated. You could literally have written it in the time it took to write the complaint. You got 80% of the way there as it is, I just tweaked your code a bit to turn it into an iterator: https://go.dev/play/p/agBGl_rT7XS
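
The playground code is not reproduced here. As a rough analogue in Python (the language of the earlier snippet in this thread), a generator with the same shape looks like the sketch below; `sorted_items` is a made-up name. It also shows the hidden cost being discussed: the whole key set is collected and sorted before the first pair is produced.

```python
def sorted_items(mapping):
    """Yield (key, value) pairs in key order."""
    keys = sorted(mapping)        # O(n) extra memory, O(n log n) time up front
    for key in keys:
        yield key, mapping[key]   # one extra lookup per key


m = {"b": 2, "c": 3, "a": 1}
for k, v in sorted_items(m):
    print(k, v)
```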

38 · on Aug 17, 2024

its completely pointless to make this an iterator, because you have to loop the entire map to do so, which kills any benefit of using iterators

randomdata · on Aug 18, 2024

Not completely pointless. It avoids the need to retain a full copy of all the values.

But a good demonstration of why this kind of thing isn't a good fit for the standard library.

kbolino · on Aug 18, 2024

And yet slices.Sorted was added which does exactly this already, but only for single-valued iterators.

randomdata · on Aug 18, 2024

> And yet slices.Sorted was added which does exactly this already

It does not. slices.Sorted accepts an iterator, but returns a slice.

Like the earlier comments point out, Go tries its best to give a reasonable idea of what kind of complexity is involved at the API level. slices.Sorted returning an iterator would mask that. By returning a slice, it makes clear that the entire collection of data needs to be first iterated over as the parent described.

kbolino · on Aug 18, 2024

This is a good point which likely explains why maps.Sorted doesn't exist (yet): what would it even return?

I think returning an iterator is acceptable, the docs could explain the expense of the operation, and the implementation could change in the future as needed. But that does hide some complexity.

If it ought to return a slice of entries, that opens up new problems. What is an entry? A two-member generic struct? Ok, fine, but then how do I ergonomically iterate over them, pulling out both members? There's no clear solution to that problem yet.

38 · on Aug 18, 2024

> I think returning an iterator is acceptable, the docs could explain the expense of the operation, and the implementation could change in the future as needed. But that does hide some complexity.

if a function returns an iterator, it should be iterating the input. thats impossible in this situation. you'd need to loop the entire map, then return an iterator that tricks the user into thinking they are getting better performance when they are getting the worst possible performance.

randomdata · on Aug 18, 2024

There is another, perhaps more important, reason: If you need sorted keys, the map is almost certainly the wrong data structure.

Sure, there may be some edge case situations, like where you are dealing with someone else's code where you don't have control over the structures you've been given, but:

1. The standard library doesn't appeal to edge cases.

2. The "noisier" solutions to deal with the edge case serve as a reminder that you aren't working in the optimal space.

kbolino · on Aug 18, 2024

This is a bridge that the standard library has already crossed, though. Off the top of my head, both encoding/json and text/template guarantee sorted iteration order of maps. I don't think it's an edge case at all.

Whether in particular cases, a properly ordered data structure (like a tree) should be used instead, is a valid question to ask, and thanks to the custom iterators, it'll now be more ergonomic to use. But if I usually use a particular map for its O(1) operations and only occasionally iterate over the whole thing, yet need consistent iteration order, then the built-in map still seems like the right choice, and having a standard way to iterate it is a reasonable request.
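
For comparison, Python's standard `json` module makes the same trade-off opt-in through the `sort_keys` parameter of `json.dumps`. A small sketch of why a serialiser might want it: without sorting, two equal maps built in a different order serialise differently.

```python
import json

a = {"b": 1, "a": 2}
b = {"a": 2, "b": 1}

print(a == b)                                  # equal as maps
print(json.dumps(a), json.dumps(b))            # output follows insertion order
canonical_a = json.dumps(a, sort_keys=True)
canonical_b = json.dumps(b, sort_keys=True)
print(canonical_a == canonical_b)              # sorted output is canonical
```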

randomdata · on Aug 18, 2024

> both encoding/json and text/template guarantee sorted iteration order of maps. I don't think it's an edge case at all.

That is literally the edge case example I gave. Perhaps there is a better way to describe it than "edge case", but semantics is a silly game.

> then the built-in map still seems like the right choice, and having a standard way to iterate it is a reasonable request.

And, indeed, the standard library provides slices.Sorted(maps.Keys(m)) for exactly that. Ergonomic enough, while making the compromise being made reasonably explicit to help with readability – which is far more important than saving a few keystrokes. If typing is your bottleneck, practice will quickly solve that problem.

kbolino · on Aug 18, 2024

It's never really been about saving keystrokes, but about re-writing the same (fairly common) operation over and over again (and not necessarily the same way each time), and not being able to benefit from future optimizations.

However, as examined in a sibling thread, there doesn't seem to actually be any missed optimization which could potentially be applied here.

randomdata · on Aug 18, 2024

In what way is the operation common? We obviously would never say that there is never a use for such thing as there are clear edge cases where it is necessary, but as jerf points out, it is probably not what you actually need in most cases.

Even ignoring that in the most common case the map isn't the right structure to begin with, what even is the general case for the situations that remain? You mentioned the marshalling of arbitrary data case, but in that case you also have reflection details to worry about, and which you can optimize for with a custom implementation, and thus wouldn't likely use the built-in anyway. A sibling thread discussed the cache benefits of colocating the values with the keys if a map is exceedingly small, but as soon as the map is of any reasonable size the added overhead of the values is almost certainly going to blow the cache even where the keys alone might still fit.

All of which is to say that the best approach is highly context dependent. How do you even begin to choose which is the general case if you were to include such a function?

jerf · on Aug 18, 2024

I endorse this, as I commented in another reply under my post that the correct cache-aware answer is another data structure entirely.

But I'd also suggest that if you think you need sorted keys, double-check. I program an awful lot of things without sorted keys, and I am quite aware of the issues around sorting, and I suspect without proof that a lot of people swearing by sorted maps are imposing false ordering requirements on their code more often than they realize. The ideal solution is to not need order at all.

(I am especially suspicious of extensive use of maps where the keys are sorted by insertion order. That smells... antipatternish to me.)

kbolino · on Aug 18, 2024

As I mentioned in another reply, this simple solution is not cache-friendly.

jerf · on Aug 18, 2024

The cache friendly alternative is to use a different data structure. There is no cache-friendly iterate-in-order on the standard map.

But I've got plenty of cases where this in fact is cache friendly, because the entire map fits into L2 or even L1 anyhow because it's going to have maybe 4 keys in its lifetime. Not every map has fifty million values in it. I'm always keeping at least a little bit of track about such details when I'm using maps.

kbolino · on Aug 18, 2024

I did some benchmarks, and it seems you are right that there's no (more) cache-friendly solution in general (at least, not that I could come up with). Memoizing the full entries (key and value) into a slice and then sorting that slice by key has basically the same cache-thrashing characteristics as randomly accessing the values, and is no faster (sometimes slower).

38 · on Aug 17, 2024

the point is you dont need it. by your own admission, it would save literally 0 lines of code from your current example. you need discipline when adding sugar otherwise you can ruin a language.

kbolino · on Aug 17, 2024

I said no such thing.

First, it would save one line of code (v := m[k]). Second, it would also allow an optimization. When iterating a map directly, you have both the key and the value at the same time. However, since we iterate only the keys here, we then have to look up the value anew for each key. That takes extra time and, for large maps, will thrash the CPU cache.

So the following would be both fewer lines of code and faster:

    for k, v := range maps.Sorted(m) {
        // do something with k and v
    }

Making common operations clear and concise is not mere sugar in my opinion. It not only improves the developer experience, it also clarifies the developer's intent, which enables better optimizations, and allows bugs and traps to be addressed in one place, instead of languishing in far-flung places.

jillesvangurp · on Aug 18, 2024

It's one of the things I like in Kotlin: it defaults to using ordered maps. It also has default arguments in functions, lambda functions, and of course nullable types.

worik · on Aug 17, 2024

> Java does at least have ordered maps

Weird

An "ordered map" is not a hash table. I think they want a tree.

But you can get the keys and sort them.

I really do not see the problem

Use a tree if order matters, Hash if not.

(Since I am not a Go programmer, maybe I missed something)

layer8 · on Aug 17, 2024

Java’s LinkedHashMap is a hash table with an additional linked-list structure on the entries that records the insertion order. The map is thus ordered by insertion order, an order that is independent from the keys.

A map ordered by keys is a SortedMap in Java. While ordered, LinkedHashMap is not a SortedMap. In other words, unordered < ordered < sorted.

masklinn · on Aug 18, 2024

> An "ordered map" is not a hash table. I think they want a tree.

An ordered map is absolutely a hashmap.

> But you can get the keys and sort them.

That gives you a sorted thing, which is completely different.

> Use a tree if order matters, Hash if not.

That is incorrect. “Ordered” in the context of maps generally denotes the preservation of insertion order, and more rarely the ability to change that order. Trees don’t help with that, quite the opposite.

worik · on Aug 18, 2024

> An ordered map is absolutely a hashmap

I have never heard of O(1) insert and retrieval from anything ordered.

So, no. An ordered map is not a hashmap

masklinn · on Aug 18, 2024

> I have never heard of O(1) insert and retrieval from anything ordered.

Then you’ve not gotten out much. Here’s one: https://docs.python.org/3/library/stdtypes.html#dict

Here’s an other one: https://docs.python.org/3/library/collections.html#collectio...

Here’s a third one: https://docs.rs/indexmap/latest/indexmap/map/struct.IndexMap...

And a fourth: https://docs.oracle.com/en/java/javase/22/docs/api/java.base...

> So, no. An ordered map is not a hashmap

Still wrong.

worik · on Aug 19, 2024

That is not O(1) for insert and read

Sort is O(log N)

Insert into sorted list is O(log N)

I am correct

If you need sorted keys that is easy but you cannot get O(1) which HASH gets on a good day

Not in this universe

The person who wants a HASH table with sorted keys actually wants a tree. Maths

masklinn · on Aug 19, 2024

> That is not O(1) for insert and read

Of course it is.

> Sort is O(log N)

Sort is irrelevant, as I already told you ordered != sorted.

> I am correct

No my dude, you’ve got no idea what you’re talking about and you apparently can’t read.

> If you need sorted keys

Then you’re in the wrong place because that’s not what ordered maps do.

> The person who wants a HASH table with sorted keys

Is not germane to the discussion.

> Maths

Maths have nothing to do with your apparent inability to understand basic English or intake new information.

worik · on Aug 19, 2024

How are you maintaining a sorted list (required) O(1)?

masklinn · on Aug 19, 2024

Again, for the fourth time, you are not. An ordered collection is not a sorted collection.

worik · on Aug 19, 2024

So you want a HASH table and a stack?

With the Hash table keep a stack of keys.

How do you delete them? Oh. Same problem.

You have O(1) insert and O(log N) deletion

Or the stack grows for ever.

When you ask for ordered keys, at zero cost, in a Hash it is like asking the Tooth Fairy. You can ask for anything you want, but you cannot have anything you want!

masklinn · on Aug 20, 2024

Man you’re a lost cause. It takes you two days to understand a simple idea and when you finally do you’re incapable of even acknowledging it, and instead have to move the goalposts to an irrelevant aside only to be wrong again.

Is this a kink? Do you get off on appearing incompetent? If so good job.

worik · on Aug 20, 2024

It is a kink of mine to argue with people who wish for the impossible

The goal post was a HASH table with ordered keys. Such a thing cannot exist and retain the desirable properties of a HASH table

Do you think that statement is untrue?

Do you understand order analysis?

Groxx · on Aug 17, 2024

Because it came up recently for me (triggered a starvation bug), a caution for people expecting Go to randomize everything like this:

Blocked channel reads and writes are unblocked in FIFO order, not random. Mutexes are similar (afaict not quite identical, but the intent is the same).

Go randomizes a lot and I am thrilled they do that, and I knew about mutexes already, but the chan part was a surprise to me. It's reasonable and matches mutexes, so that's probably for the better, but still a bit oof.

akira2501 · on Aug 17, 2024

> Blocked channel reads and writes are unblocked in FIFO order, not random

This should be obvious since the buffer is optional.

> but the chan part was a surprise to me

With multiple receivers it is effectively random _which_ receiver gets the wakeup.

Groxx · on Aug 18, 2024

Yeah, I don't mean buffered data. I mean blocked queueing operations.

> With multiple receivers it is effectively random _which_ receiver gets the wakeup.

That's why I brought it up: no it isn't. It's ordered. Intentionally.

https://github.com/golang/go/issues/11506

It's pretty easy to prove to yourself too, it only takes a couple dozen lines to write a test. I'm not confident it's guaranteed in all scenarios (mutexes are not, for example), but I've yet to see it do anything but perfect FIFO when not racing on starting both read and write simultaneously.
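A rough sketch of what such a test could look like. `wakeOrder` is a made-up helper, the sleeps are a crude way to make each receiver block before the next one starts, and FIFO wake-up is observed runtime behaviour rather than something I'd treat as a documented guarantee.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// wakeOrder starts n receivers one at a time on an unbuffered channel,
// then sends 0..n-1 and reports which value each receiver got.
func wakeOrder(n int) []int {
	ch := make(chan int)
	got := make([]int, n) // got[i] is the value receiver i received
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			got[id] = <-ch
		}(i)
		// Crude: give receiver i time to block before starting i+1.
		time.Sleep(10 * time.Millisecond)
	}
	for v := 0; v < n; v++ {
		ch <- v
	}
	wg.Wait()
	return got
}

func main() {
	// If blocked receivers are woken in FIFO order, receiver i gets
	// value i and the slice comes out in ascending order.
	fmt.Println(wakeOrder(5))
}
```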

---

A single select statement with multiple eligible channel operations is random, which is part of why I expected blocked channel operations themselves to be random. But nope.
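For contrast, a small sketch of the select case. The Go spec says that when several communications can proceed, one is chosen by uniform pseudo-random selection. `pickCounts` is a hypothetical helper that keeps both channels ready and counts which case wins.

```go
package main

import "fmt"

// pickCounts runs a two-way select n times with both channels ready
// and counts how often each case is chosen.
func pickCounts(n int) (a, b int) {
	chA := make(chan struct{}, 1)
	chB := make(chan struct{}, 1)
	for i := 0; i < n; i++ {
		chA <- struct{}{}
		chB <- struct{}{}
		select {
		case <-chA:
			a++
			<-chB // drain the other channel for the next round
		case <-chB:
			b++
			<-chA
		}
	}
	return a, b
}

func main() {
	a, b := pickCounts(10000)
	fmt.Println(a, b) // the split varies from run to run
}
```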

hombre_fatal · on Aug 17, 2024

Meh, what if people rely on randomness. They should flip between random and ordered at random too.

ithkuil · on Aug 17, 2024

Well, every once in a while the random order will look like it's ordered

therein · on Aug 17, 2024

Yeah, gotta keep the programmer on his toes. He needs to embrace non-determinism, and second guess everything.

In case a Go developer switches to a different language, we don't want to build bad habits. Map key iteration should be non-deterministically deterministic.

akira2501 · on Aug 17, 2024

> Meh, what if people rely on randomness.

Then they are creating fragile software. They're always one language version upgrade away from disaster.

hombre_fatal · on Aug 18, 2024

Yes, but that criticism already applies to the original iteration order before Go introduces a performance penalty by intentionally randomizing order on every iteration just so people can't rely on it.

donatj · on Aug 17, 2024

The lack of default argument values initially annoyed me, but I kind of came to like it. It makes me put more thought into my function interfaces.

In the rare cases I do want a default it's usually reasonable to just add a second function that calls the first with the default value.

I don't end up doing this a lot, but I certainly have in a handful of cases. A lot of Go libraries for HTTP related activities do this with the default context. They'll have a function that accepts a context and a function that has the default context.

Example

https://github.com/slack-go/slack/blob/242df4614edb261e5f4f4...

Honestly, with good naming, I think this is just generally more readable, and the expected behavior only takes three lines of code.
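A minimal sketch of the pattern, with made-up names (`GetUser` and `GetUserContext` are hypothetical, not taken from the linked library): the short function supplies the default and delegates to the full one.

```go
package main

import (
	"context"
	"fmt"
)

// GetUserContext is the full-control variant: the caller supplies the context.
func GetUserContext(ctx context.Context, id string) (string, error) {
	if err := ctx.Err(); err != nil {
		return "", err // respect cancellation and deadlines
	}
	return "user-" + id, nil // stand-in for a real lookup
}

// GetUser is the "default argument" variant: it fills in
// context.Background() and delegates.
func GetUser(id string) (string, error) {
	return GetUserContext(context.Background(), id)
}

func main() {
	name, err := GetUser("42")
	fmt.Println(name, err) // user-42 <nil>
}
```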

wild_egg · on Aug 17, 2024

FWIW a lot of those packages with default context wrapper functions have that as a backwards compatibility measure for code written before the context package existed. It's almost always preferable to call the newer function with your own context instead

mrj · on Aug 17, 2024

Sure, one can make a new function. That's what I do, too, but then I end up greatly missing function overloading. In the example's case I might end up making a ReplaceAll function. It creates namespace clutter.

Function overloading could easily handle context with and without arguments.

enneff · on Aug 17, 2024

I think if there are a lot of functions (or a function with a lot of arguments, common to python) you have clutter regardless. I would rather have a bunch of different names than a bunch of different functions with the same name. Having to choose names for each of the functions is a gentle push back asking “do you really need all these flavours of function or are you abstracting this in the wrong way?”

bediger4000 · on Aug 17, 2024

I've seen the python ordered dict thing bite ex-python programmers over and over, in two different ways.

First, assuming that the keys iterate in the order they're inserted, the cliche problem.

Second, marshalling JSON and unconsciously relying on the order in the JSON as hidden semantics. This makes it hard to understand the JSON as a human, as well as making what ought to be a portable format with other languages hard to reuse.

I've decided that Python is in the wrong here, not technically, but rather for encouraging humans to assume too much.

kokada · on Aug 18, 2024

> First, assuming that the keys iterate in the order they're inserted, the cliche problem.

Author here.

I never assume that hash maps can be iterated in the same order as the insertion in any language, however the fact that in Go each iteration results in a different ordering is surprising.

bediger4000 · on Aug 19, 2024

Good practice. My experience says that Python programmers consciously or unconsciously make that assumption often, which is bad practice.

zarzavat · on Aug 17, 2024

As the article says, reproducibility is important. If I have a bug on one run, I want to get that bug again on the second run. I want to be able to run the program again and again and have the same breakpoints hit in the same order with the same variables. If I run tests, I want them to give the same results each time.

Randomness is bugs. Adding randomness to a language is adding bugs.

dwattttt · on Aug 18, 2024

You can't avoid randomness and entropy, it's just something that's needed in too many places. For test repeatability though, whatever seeds a test run uses should be saved in order to repeat that test exactly.

Best of both worlds: you get to cover weird bugs that only show up when stuff is in a certain order, and you can repeat the tests exactly.
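As a sketch of that idea in Go, assuming the randomness goes through an explicitly seeded generator (`shuffled` is a hypothetical helper): log the seed on every run, and feed the same seed back in to replay a failure.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// shuffled returns a permutation of 0..n-1 that is fully determined by seed.
func shuffled(seed int64, n int) []int {
	r := rand.New(rand.NewSource(seed))
	return r.Perm(n)
}

func main() {
	// In a real test you would let a flag or environment variable
	// override this so a failing run can be replayed.
	seed := time.Now().UnixNano()
	fmt.Println("seed:", seed) // save this with the test output
	fmt.Println(shuffled(seed, 5))
}
```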

largbae · on Aug 18, 2024

I wish that an unhandled error would crash its way up the stack automatically returning error if the next function up can do so, until it is either caught into a variable or can't be returned (panic if error can't be returned).

This would get close to python try/catch with even lighter syntax.

This would cut so much boilerplate hand carrying error up the stack.

kokada · on Aug 18, 2024

I mentioned in my previous post (https://kokada.capivaras.dev/blog/go-a-reasonable-good-langu...) that nowadays I just implement a generic `must*()` family of functions that can be used as:

    func must(err error) {
        if err != nil {
            panic(err)
        }
    }
    func must1[T any](v T, err error) T {
        must(err)
        return v
    }
    func maybeError() (bool, error) { ... }

    result := must1(maybeError())

And this generally works fine 99% of the time when I just want a stack trace on error.

So this is why I don't care that much about having a syntax sugar for this operation anymore in Go.

daghamm · on Aug 18, 2024

This is not error-handling :)

In fact, this is the opposite of error handling. In any serious application this might be worse than ignoring errors

kokada · on Aug 18, 2024

I didn't say this is error handling, this is a stack trace for debugging on error.

Similar to an uncaught exception in other languages. And yes, I will not use this on production applications or libraries, I mostly use this in scripts where this kind of thing makes sense.

Neikius · on Aug 18, 2024

The thing that bothers me is... golang actually has exceptions. It is just that nobody will dare mention it. Panic and recover eh? It is just that by default error is the thing you should use and everything and everyone does. Then the exception mechanism comes along and you now have 2 ways of handling errors. Making things quite confusing.

assbuttbuttass · on Aug 18, 2024

panic() already exists

MarkMarine · on Aug 18, 2024

“I don't think the language needs to support the generic solution for nullability, that would be either having proper Union or Sum types.”

- then goes on to describe some of the problems sum types would solve. Why. Why doesn’t go need this? It was just presented as a blanket statement without a reason.

Personally, I see missing sum types as a major gap, and I reach for them all the time in other languages.

kokada · on Aug 18, 2024

I didn't develop too much because as I said in another thread, I wrote this post without giving much thought. Sorry for not being at the HN standards, but this post wasn't even supposed to be here, and here we are ;).

That said, I think given the current state of the language, adding nullability would be much easier to do than adding Union or Sum types. And from my experience with Kotlin, nullability already gives much of the benefits.

I am not saying I will not change my opinion in the future. I got my first job working with a language that has algebraic types (Scala), so my opinion may change in future. However, even if it changes I think getting Union or Sum types in a language like Go is impossible, while nullables are unlikely to happen but at least not impossible.

MarkMarine · on Aug 18, 2024

I don’t follow. Why would getting sum types in go be impossible?

There is already a proposal for this: https://github.com/golang/go/issues/57644

The generic type params already supports a sum type like interface, it’s almost there.
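For anyone who hasn't seen it, a small sketch of what that looks like today (`Number` and `Sum` are illustrative names): a union of types inside an interface, usable as a constraint but not as the type of a value.

```go
package main

import "fmt"

// Number is a union-style constraint: any type whose underlying type
// is int or float64 satisfies it.
type Number interface {
	~int | ~float64
}

// Sum works for every type in the union.
func Sum[T Number](xs []T) T {
	var total T
	for _, x := range xs {
		total += x
	}
	return total
}

// Not allowed today: an interface containing a type union can only be
// used as a constraint, so this does not compile:
//
//	var n Number

func main() {
	fmt.Println(Sum([]int{1, 2, 3}))      // 6
	fmt.Println(Sum([]float64{0.5, 1.5})) // 2
}
```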

So, if you could, please expound on the “why”.

Is there something in the type system, something in the compiler that prevents it?

Anyway, re: not up to HN standards… I’m not sure what you’re talking about. You made it to the front page, you called out some legitimate issues, I liked reading what you had to say, I was just asking for more

kokada · on Aug 18, 2024

I recommend reading the proposal then. I don't claim to be enough of a specialist in Go to say whether it is possible or not, but I find it highly unlikely given how the language is.

The reason I think nullability is more likely than sum types is because Go tries to be a simple language (whatever the developers of the language think simple is), and nullability is simpler than sum types (both to implement and use). But again, I don't have any special insight

pansa2 · on Aug 18, 2024

I've heard this several times - "Go would be great if only they added <my-favourite-feature>"...

Go's philosophy is that a coherent, curated feature set is as valid an approach to language design as the C++/Python/... approach of adding every possible language feature.

In particular I doubt Go will ever add null-safety - given the above philosophy, the language's pervasive use of "zero values", and its strong commitment to backwards compatibility.

pkolaczk · on Aug 18, 2024

I can’t see how Go’s feature list is any more curated or more coherent than in other languages. Seriously, to me it feels like many Go features were rushed and some were added despite the evidence that they are a bad idea. Like, why is an unused import / unused variable a hard error but an unused function is not? Or why, for a very long time, were maps and channels special by being generic while you could not use generics in your own types (that has fortunately changed)? Why add nil / default values at a time when virtually everybody knows this is a bad feature and there are better solutions established?

EdwardDiego · on Aug 18, 2024

How do you distinguish between a value set to zero value explicitly, and one not set?

kzs0 · on Aug 18, 2024

In the cases where this is important, you can use pointers. Go allows you to make pointers to primitives as well (making the zero value nil) so you can explicitly define those edge cases.

You’d be surprised how infrequently that’s actually a concern though.
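A minimal sketch of the pointer approach, using a hypothetical `Settings` struct: nil means "not set", while a pointer to 0 means "explicitly zero".

```go
package main

import "fmt"

// Settings is a hypothetical struct; Retries is a pointer so that
// "not set" (nil) and "explicitly zero" (pointer to 0) are different.
type Settings struct {
	Retries *int
}

// describe reports which of the two states the field is in.
func describe(s Settings) string {
	if s.Retries == nil {
		return "not set"
	}
	return fmt.Sprintf("set to %d", *s.Retries)
}

func main() {
	zero := 0
	fmt.Println(describe(Settings{}))               // not set
	fmt.Println(describe(Settings{Retries: &zero})) // set to 0
}
```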

deergomoo · on Aug 18, 2024

Using pointers feels horrible as a workaround though because it completely muddies the intent. Is it a pointer because it could be zero or <not set>, is it a pointer for performance reasons, is it a pointer because the function mutates it? To me it flies directly in the face of Go's desire to keep things simple and easy to reason about.

> You’d be surprised how infrequently that’s actually a concern though

I admittedly don't write a huge amount of Go, but I run into this fairly often any time I'm dealing with user input. Something like an optional numeric input is not at all uncommon.

Neikius · on Aug 18, 2024

What? That is actually an everyday concern and devs assuming that information is not needed will bring pain down the road.

Just two cases off the top of my head: working with databases and json interop with other systems that actually do differentiate. And the second one should not happen, but people do make assumptions and sometimes those are inherently incompatible with how golang works.

metaltyphoon · on Aug 18, 2024

How do you have a PUT on an API to “unset a field”?

jochem9 · on Aug 18, 2024

There is no difference. You need to handle that based on the context you're in.

EdwardDiego · on Aug 18, 2024

I was wondering if there's a pattern of using Option types or something to signal the difference.

I ask because I hit this issue a fair few years ago with protobuf v3, we ended up using a wrapper type pattern. Can't recall why it was important to know at the time, but it was.

the_gipsy · on Aug 18, 2024

Sometimes you can abuse pointers as a (very) poor man's Option<T> type.

But it's turtles all the way down, you can't distinguish between for example `null` and "missing" when parsing JSON.
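A short sketch of that limitation with `encoding/json` and a pointer field (`Patch` and `nameIsNil` are made-up names): a missing key and an explicit `null` both leave the pointer nil, so the two cases look identical after decoding.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Patch is a hypothetical request body with one optional field.
type Patch struct {
	Name *string `json:"name"`
}

// nameIsNil decodes raw and reports whether Name ended up nil.
func nameIsNil(raw string) bool {
	var p Patch
	if err := json.Unmarshal([]byte(raw), &p); err != nil {
		panic(err)
	}
	return p.Name == nil
}

func main() {
	fmt.Println(nameIsNil(`{}`))              // true: key missing
	fmt.Println(nameIsNil(`{"name": null}`))  // true: explicit null
	fmt.Println(nameIsNil(`{"name": "bob"}`)) // false
}
```

Telling the two apart needs something extra, such as a custom `UnmarshalJSON` or decoding into a `map[string]json.RawMessage` first.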

mervz · on Aug 18, 2024

Yikes...

wild_egg · on Aug 17, 2024

Been writing Go since 2012 and consider the status quo on all of these to be _features_. I may be in the minority there though

kbolino · on Aug 17, 2024

Being unable to access struct fields a.b.c without risking a panic sucks (and which caused the panic: a.b or b.c?). There's no remotely ergonomic solution to the problem, because there's no nil-safe struct member operator, there's no ternary operator, and if-statements can't be used as expressions.

This wouldn't be such a problem, since of course you can just "choose" to not use pointer- or interface-typed fields in "your" structs, but as soon as serialization, databases, or other people's APIs are involved, you don't have that "choice" anymore.

In the same vein, being unable to write e.g. &true or &"foo" is annoying too. If I can write &struct{...} why can't I write &true?

jen20 · on Aug 17, 2024

Not being able to take a pointer to a literal is one of my pet peeves with Go - especially when using the AWS SDK, which requires pointers to literals everywhere. At least with generics, a wrapper function doesn't require a separate function per type though, and can be simply:

    package ptr

    func To[T any](v T) *T {
        return &v
    }

deergomoo · on Aug 18, 2024

> especially when using the AWS SDK, which requires pointers to literals everywhere

On the plus side, at least the AWS SDK provides `aws.String()` and friends.

jen20 · on Aug 18, 2024

At least they did _something_, but it would be better if they moved into the modern age of Go and provided the version above as well.

anothername12 · on Aug 18, 2024

lol We have this function implemented at least a dozen times through our mono repo. All named slightly different.

kbolino · on Aug 17, 2024

Yeah, that particular case got better with generics. It's still more verbose to write ptr.To(true) than &true though.

Unfortunately, there's no generic type constraint for "is an interface" or even "is nilable" so you can't use generics to solve nil-safety issues in general.

jen20 · on Aug 18, 2024

Indeed, I see no reason why taking the address of a literal couldn’t be added though, that’s a cheap win.

isodude · on Aug 17, 2024

Horrid, but this works.

  &([]bool{true}[0])

But at least it allows me to write it without declaring a variable first.

postgressomethi · on Aug 17, 2024

I always thought there should be a two-arg overload of new, so you could write new(bool, true) or new(int, 20). Would solve the problem without any trickery.

seabrookmx · on Aug 17, 2024

C# also solved the nullability problem after the fact. It integrates well with the "?." operator also found in Typescript.

dexwiz · on Aug 17, 2024

Optional chaining is now in vanilla JS.

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...

KronisLV · on Aug 17, 2024

This is really pleasant, when combined with the nullish coalescing operator: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...

  const isStatusValid = myStore.someObject?.isStatusValid ?? false;

(e.g. if some data is not initialized along the way, or you put in some new object which just isn't meant to have that field but some related logic needs to check for it)

neonsunset · on Aug 17, 2024

Yup, it is Kotlin-style nullability analysis that is a default everywhere now. The last place that does not have ideal behavior is System.Text.Json, but that is fixable with a flag, maybe two, and there are edge cases where static analysis can't see through some expressions and assumes null so you have to specify it isn't with '!'. Nonetheless, works great with '??', '??=' expressions, required fields/props and pattern matching.

From a practical standpoint, a project with Nullable: enable, WarningsAsErrors: nullable almost never has to think about unexpected nulls again - a solved problem.

devjab · on Aug 18, 2024

Having nullability has its own set of issues. I prefer explicit error handling to null checks. That being said, I don’t think there is really that much of a difference in that you might use Interface as a JS Any sort of thing in Go (it’s obviously different), and where you would deal with nil pointers is often where you would perform a null check anyway.

I prefer the go style of doing a check every time rather than some of the time with the risk of missing a time. Or on the flip-side simply defining your own default.

I know that a lot of people like how C# is becoming more and more reliant on the compiler to handle type checks, but it also opens up the risk of more developer mistakes. It’s mostly down to opinion and preference. So I don’t think you can really call it a “problem” as such. Having nullability will be a feature to some, and not having it will be a feature to others. I think both decisions were right for their respective languages. C# needs it for its close integration with mssql and its exception handling. Go doesn’t want it because it favours simplicity in types and direct error handling.

masklinn · on Aug 18, 2024

> Having nullability has its own set of issues. I prefer explicit error handling to null checks.

Nullability does not preclude explicit checks. In fact it requires them. Which Go currently does not: you can dereference any pointer or use any interface and it will blow in your face if either is nil.

> C# is becoming more and more reliant on the compiler to handle type checks, but it also opens up the risk of more developer mistakes.

That literally makes no sense. The entire point of moving checks to the compiler is that it catches developer errors upstream and prevents them.

> C# needs it for its close integration with mssql and its exception handling.

That's a nonsensical just-so story. Go on, explain how Haskell or Swift or Zig have explicit nullability for their close integration with mssql and exception handling (and while at it, do explain the relationship between exception handling and nullability)

moomin · on Aug 17, 2024

I am constantly criticising the C# community for not looking at other languages enough, but in this case it cuts both ways: C# has every feature mentioned here. Including a pretty good model for nullable.

Quothling · on Aug 17, 2024

I don't think having nullables can be called a feature as such. It adds complexity to your code and it directly goes against Go's philosophy of explicit error handling and simple types. Of course it's completely opinion based but I think having people handle errors directly instead of dealing with them through exceptions and checks makes for much more predictable code. I'm also not sure I really see the difference that GP does. I think that in a lot of code you're going to do a lot of null checks anyway, maybe even more than how often you'll check if your pointers are nil, but at least with Go you're not risking missing one.

Maybe the Go engineers didn't think about nullables, but I think there is a good chance they simply decided against them for various reasons.

moomin · on Aug 20, 2024

We could argue about the Go philosophy till the cows come home, but maybe it would be better to just consider that not having a nullable int makes modelling database tables painful.

coffeebeqn · on Aug 18, 2024

> C# has every feature

Yes it does. I just don’t understand why everyone wants every language to be the same language. If you want all the sugar then C#, Java, latest C++ all are perfectly mature and usable and fit that niche

Xeamek · on Aug 18, 2024

Because languages are more than just syntax, so of course you'd rather have your favorite language copy some syntax you like than be forced to switch your entire stack to another language just to be able to use that syntax

jmyeet · on Aug 17, 2024

How is Go randomizing the map iteration order?

In Java, objects are responsible for their equals/hashCode implementations. The contract they must abide by is:

1. If two objects are equal, they must produce the same hash code; and

2. If they are not equal, they may produce the same hash code.

So if you had a list of 10 Strings and put them in a map in Java, it's likely you'll get a deterministic order for iterating over them unless you added a random factor. That factor could be a random seed tied to the map that you XOR the hash code with.

You can't really change the hash code itself to avoid a Hash DoS attack because you might break that contract. So how does Go (and Rust?) deal with that? Is Go adding a random seed to each hash map? If not, what is it doing?

As for nullability, there's no going back once you use a type system that expresses nullability.

Lastly, PHP arrays are incredibly convenient, ignoring the weirdness with them being array and hash map hybrids. But the key aspect is that they maintain insertion order when you use them like a map. This is so often what you want. Yes, other languages do this too (eg Java's LinkedHashMap) but it's (IMHO) such a useful default.

jhgg · on Aug 17, 2024

> Is Go adding a random seed to each hash map?

Yes: https://github.com/golang/go/blob/27093581b2828a2752a6d2711d...

masklinn · on Aug 18, 2024

> How is Go randomizing the map iteration order?

Go does seed hashes on a per-hashmap basis, as does rust. But that still gives you consistent iteration order for a given map: https://play.rust-lang.org/?version=stable&mode=debug&editio...

Go also randomly offsets the start point of every map iteration: https://go.dev/play/p/SlfDjGKi77L
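A tiny sketch for observing this locally (`iterationOrders` is a hypothetical helper): it ranges over the same unmodified map several times and records the key order seen on each pass. The Go spec only says the order is unspecified and may differ between iterations, so the exact output is not predictable.

```go
package main

import (
	"fmt"
	"strings"
)

// iterationOrders ranges over m the given number of times and returns
// the concatenated key order observed on each pass.
func iterationOrders(m map[string]int, rounds int) []string {
	orders := make([]string, 0, rounds)
	for i := 0; i < rounds; i++ {
		var sb strings.Builder
		for k := range m {
			sb.WriteString(k)
		}
		orders = append(orders, sb.String())
	}
	return orders
}

func main() {
	m := map[string]int{"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}
	for _, o := range iterationOrders(m, 5) {
		fmt.Println(o) // the same map, often in a different order each time
	}
}
```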

conradludgate · on Aug 17, 2024

What you do is have a "family" of hash functions. The random seed value chooses a new hash function. The same properties apply to each individual map's hash function, but each map has a different hash function

Secondly, Go map iteration starts from a random position in the hashmap. The order on subsequent iterations is the same, but rotated as a result of the random start index

chowells · on Aug 17, 2024

You don't use the object hash as the key directly. You combine it with a value chosen randomly per-map using a function that works hard to erase correlations between the input object hash and the output table location.
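A sketch of the per-map seed idea using the standard `hash/maphash` package as a stand-in; this is not a claim about what the Go runtime's built-in map does internally, and `hasher`, `newHasher` and `bucket` are made-up names.

```go
package main

import (
	"fmt"
	"hash/maphash"
)

// hasher plays the role of one map instance holding its own random seed.
type hasher struct {
	seed maphash.Seed
}

// newHasher picks a fresh random seed, as a new map would.
func newHasher() hasher {
	return hasher{seed: maphash.MakeSeed()}
}

// bucket maps a key to one of n buckets using the seeded hash.
// Equal keys always land in the same bucket for a given hasher.
func (h hasher) bucket(key string, n uint64) uint64 {
	return maphash.String(h.seed, key) % n
}

func main() {
	a, b := newHasher(), newHasher()
	for _, k := range []string{"alpha", "beta", "gamma"} {
		// Same key, two "maps": the buckets usually differ, so an
		// attacker cannot precompute colliding keys.
		fmt.Println(k, a.bucket(k, 8), b.bucket(k, 8))
	}
}
```

The equals/hash contract holds within each map, because the seed is fixed for that map's lifetime; it just isn't the same function across maps.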

karmakaze · on Aug 17, 2024

When I read 'unordered list', I was thinking it was more than 3 things:

  - Keyword and default arguments for functions
  - Nullability (or nillability)
  - Ordered maps in standard library

It was a play on the hash map iteration.