Golang vs. C# (.NET 5.0) at Benchmarks Game

2021-08-24 18:39 101110 benchmarksgame-team.pages.debian.net

Always look at the source code. These are only the fastest programs. Do some of them use manually vectorized SIMD? Look at the other programs. They may seem more-like a fair comparison to you.

Comments

  • By throwaway894345 2021-08-24 19:05 (4 replies)

    I can't imagine a better setup for a language flame war :). I really like debating languages, so I hope it doesn't go that direction.

    One of the standard caveats with this particular benchmark game with respect to Go is that idiomatic optimizations are prohibited. To use the btree example, Go's memory management is low latency and non-moving, so allocations are expensive--any Go programmer writing a performance-sensitive btree implementation would pre-allocate the nodes in a single allocation--an absolutely idiomatic and trivial optimization--but the benchmark game requires that the nodes are allocated one at a time. In other words, the C# version is idiomatic, but the Go version is expressly contrived to be slower--not a very useful comparison.
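To make the two allocation strategies in this comment concrete, here is a minimal Go sketch. It is not taken from any Benchmarks Game submission; `Node`, `buildPerNode`, `buildPrealloc`, and `count` are hypothetical names, and the pre-allocated variant assumes the tree size is known up front.

```go
package main

import "fmt"

// Node is a minimal binary-tree node (hypothetical, for illustration only).
type Node struct {
	Left, Right *Node
}

// buildPerNode allocates every node separately, one heap allocation per node.
func buildPerNode(depth int) *Node {
	if depth == 0 {
		return &Node{}
	}
	return &Node{Left: buildPerNode(depth - 1), Right: buildPerNode(depth - 1)}
}

// buildPrealloc makes one slice allocation large enough for a perfect tree
// of the given depth and hands out nodes from that slice.
func buildPrealloc(depth int) *Node {
	total := (1 << (depth + 1)) - 1 // a perfect tree has 2^(depth+1) - 1 nodes
	nodes := make([]Node, total)
	next := 0
	var build func(d int) *Node
	build = func(d int) *Node {
		n := &nodes[next]
		next++
		if d > 0 {
			n.Left = build(d - 1)
			n.Right = build(d - 1)
		}
		return n
	}
	return build(depth)
}

// count walks the tree and returns the number of nodes.
func count(n *Node) int {
	if n == nil {
		return 0
	}
	return 1 + count(n.Left) + count(n.Right)
}

func main() {
	// Both strategies produce trees of the same shape and size.
	fmt.Println(count(buildPerNode(10)), count(buildPrealloc(10)))
}
```

Both builders return the same tree shape; the difference is only in how many allocations the runtime sees. Whether the second form counts as acceptable under the benchmark's rules is exactly what the thread goes on to dispute.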

    Mad respect for .Net though; it's really impressive, I like the direction it's going, I'm glad it exists, etc.

    • By lalaithion 2021-08-24 20:14 (1 reply)

      The point of the btree example is to test how good programming languages are at allocating tree-like structures that can't be preplanned. It's a valid argument that this is a rare real-world requirement, but it's not contrived to be slower.

      • By throwaway894345 2021-08-24 20:51 (1 reply)

        > The point of the btree example is to test how good programming languages are at allocating tree-like structures that can't be preplanned.

        Forcing allocations for every node isn't justified by a desire to demonstrate dynamically sized binary trees. A naive dynamically-sized tree would just keep a list of node buffers and allocate a new node buffer every time the previous one fills up (perhaps with subsequent buffers doubling in size). The benchmark is, by all appearances, contrived to be slower.
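The "list of node buffers" idea described here could look roughly like the following Go sketch. The `Arena` type, its starting size of 64, and the doubling policy are assumptions made for illustration; nothing here comes from an actual submission.

```go
package main

import "fmt"

// Node is a minimal binary-tree node (hypothetical, for illustration only).
type Node struct {
	Left, Right *Node
}

// Arena hands out nodes from a list of buffers. Old buffers are never
// reallocated, so pointers into them stay valid as the arena grows.
type Arena struct {
	chunks [][]Node // every buffer allocated so far
	used   int      // nodes already handed out from the newest buffer
}

// New returns a zeroed node, allocating a fresh buffer (twice the size of
// the previous one) whenever the current buffer is full.
func (a *Arena) New() *Node {
	if len(a.chunks) == 0 || a.used == len(a.chunks[len(a.chunks)-1]) {
		size := 64 // assumed starting size
		if len(a.chunks) > 0 {
			size = 2 * len(a.chunks[len(a.chunks)-1])
		}
		a.chunks = append(a.chunks, make([]Node, size))
		a.used = 0
	}
	cur := a.chunks[len(a.chunks)-1]
	n := &cur[a.used]
	a.used++
	return n
}

// build creates a perfect tree of the given depth using arena nodes. The
// arena itself does not need to know the final size in advance.
func build(a *Arena, depth int) *Node {
	n := a.New()
	if depth > 0 {
		n.Left = build(a, depth-1)
		n.Right = build(a, depth-1)
	}
	return n
}

// count walks the tree and returns the number of nodes.
func count(n *Node) int {
	if n == nil {
		return 0
	}
	return 1 + count(n.Left) + count(n.Right)
}

func main() {
	var a Arena
	root := build(&a, 10)
	fmt.Println("nodes:", count(root), "buffers:", len(a.chunks))
}
```

With doubling, the number of slice allocations grows logarithmically with the node count rather than linearly. The trade-off is that a whole buffer stays reachable as long as any node in it is referenced.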

        • By igouy 2021-08-24 22:41 (1 reply)

          > … a desire to demonstrate dynamically sized binary trees…

          Cart before horse — the binary trees are justified by a desire to demonstrate memory allocation.

          http://hboehm.info/gc/gc_bench/

          • By throwaway894345 2021-08-25 12:48 (2 replies)

            1. That’s plainly not the case here since other languages are allowed to use custom allocators

            2. Why use a binary tree benchmark in the first place if you’re going to limit the implementation to certain naive implementations (and again, only for one language)? Why not just measure allocations outright or at least call the benchmark “allocator performance”?

            3. Showing allocation performance doesn’t help anyone understand the actual performance of the language, which is transparently what everyone uses these benchmarks for. If they wanted a general idea for language performance they would allow trivial, idiomatic optimizations. A benchmark that shows allocation performance is worthless, and a suite of benchmarks that includes a benchmark for allocation performance but not GC latency is worse than worthless: it’s misleading because latency is the more important concern and it’s what these bump allocators trade in order to get their fast allocation performance.

            • By lalaithion 2021-08-25 14:50 (1 reply)

              Looking at the fastest Java, Haskell, Racket, OCaml, JavaScript, C#... they're all doing per-node allocation using the standard allocator, and all beating Go. The limit is not just for Go. I don't know why you think that Go is the only one being disadvantaged here.

            • By igouy 2021-08-25 16:00 (1 reply)

              1. Please be specific.

              2. Again, not limited to only one language.

              3. You are allowed an opinion.

              • By throwaway894345 2021-08-25 18:04 (1 reply)

                1. Rust, C++, C

                2. It is in practice, but regardless of the size of the cohort there’s no compelling reason for these limitations.

                3. And you’re entitled to ignore reason. It cuts both ways.

                • By igouy 2021-08-25 18:29 (1 reply)

                  1. Which program?

                  2. Again, not limited to only one language. You are allowed an opinion about what is or is not compelling.

                  3. As before.

                  • By throwaway894345 2021-08-25 18:37 (1 reply)

                    1. btree; see the Rust version which uses a bump allocator for example

                    2. Doesn't matter whether it's exactly one language.

                    > You are allowed an opinion about what is or is not compelling.

                    It's not a matter of opinion. The definitional purpose of benchmarks is to indicate something about reality; if you contrive rules that cause the benchmarks to deviate from reality, they lose their utility as benchmarks. I've demonstrated that the rules are contrived (i.e., they prohibit real-world, idiomatic optimizations), so I think we can say as a matter of fact that these benchmarks aren't useful.

                    Of course, no one can force anyone else to see reason (but I don't have any interest in talking with unreasonable people).

                    • By igouy 2021-08-25 19:22 (1 reply)

                      1. bumpalo: Star 586, Fork 52 — a library, not implement your own custom allocator.

                      2. You have repeatedly claimed "only for one language".

                      > I think we can say as a matter of fact…

                      Apparently that is your opinion.

                      • By throwaway894345 2021-08-25 19:40 (1 reply)

                        1. See all of the other arguments in this thread about "contrived rules"

                        > You have repeatedly claimed "only for one language".

                        How many languages are in practice prevented from using pre-allocation? How big is the cohort? Does it matter if it's exactly one or if it's two or three? Why are you fixating on this relatively irrelevant point rather than the more substantial point that has been reiterated a dozen times?

                        > Apparently that is your opinion.

                        In the same sense that "the sky is blue" is merely my opinion.

                        • By igouy 2021-08-25 22:50 (1 reply)

                          > How many languages are in practice prevented from using pre-allocation?

                          How many provide GC?

                          The substantial point is that you wish special treatment for Go lang.

                          > … "the sky is blue" is merely my opinion…

                          When all can see the vibrant orange red sunset.

                          • By throwaway894345 2021-08-26 13:15

                            Not special treatment, just rules that allow for idiomatic programs. Of course I’ve said as much a dozen times now and you won’t engage with it, so I don’t expect you to now. ¯\_(ツ)_/¯

    • By _ph_ 2021-08-24 20:28 (1 reply)

      That is one reason I don't consider the language benchmark to be really relevant: a lot of benchmarks are tainted by the exact rules of the competition.

      • By throwaway894345 2021-08-25 12:56 (2 replies)

        And these rules are particularly bizarre. Rust, C, and C++ are all allowed to use custom allocators while Java and C# have GCs which are optimized for this particular micro benchmark but not for real world applications (although I hear Java’s GCs are making good headway on latency lately, and clearly all of the GCs are suitable for general application development). So it’s really just Go which is forbidden from an idiomatic optimization as far as I can tell.

        • By Jweb_Guru 2021-08-26 20:17

          Go optimizes ridiculously insanely for latency because it's driven by Hacker News articles about GC latency, not because it's better for "real world applications." Its atrocious throughput is entirely a consequence of that decision and is one of the very few useful things that the benchmark games do actually demonstrate. Java's unwillingness to provide an allocator with such an extreme tradeoff has proven a pretty good idea, and now they are able to provide only moderately worse latency than Go for "real world applications" with far better throughput.

        • By igouy 2021-08-25 18:36 (1 reply)

          > … allowed to use custom allocators…

          No. They are allowed a library memory pool.

          As-it-says: 'Please don't implement your own custom "arena" or "memory pool" or "free list" - they will not be accepted.'

    • By abledon 2021-08-24 19:40 (1 reply)

      reminds me of the 'im tired of being a hipster' post on frontpage recently.... "go learn 'unhip' tech/languages and live your life"

    • By igouy 2021-08-24 20:11 (1 reply)

      > absolutely idiomatic and trivial optimization

      Which is not accepted for the C# programs either.

      sync.Pool is accepted —

      https://benchmarksgame-team.pages.debian.net/benchmarksgame/...

      > the Go version is expressly contrived to be slower

      The requirements were contrived in April 2008.

      afaict Go initial release was March 2012.

      • By throwaway894345 2021-08-24 21:06 (1 reply)

        > Which is not accepted for the C# programs either.

        Because C# doesn't benefit from this kind of optimization. Its GC is generational, which means that it has very fast allocations at the expense of high latency. In most applications, lower latency is more important than slower allocations (not least of all because these batch-allocating optimizations are nearly trivial), but these benchmarks don't reflect that at all.

        > The requirements were contrived in April 2008. afaict Go initial release was March 2012.

        Contrived = "the rules artificially prohibit idiomatic optimizations". It doesn't require that the maintainers have a prejudice against Go (although as you point out, the maintainers have had a decade to revisit their rules).

        • By igouy 2021-08-24 22:53 (1 reply)

          > Because C# doesn't benefit…

          C# does provide a memory pool implementation.

          • By throwaway894345 2021-08-25 13:04 (1 reply)

            So what about C, Rust, and C++? They’re all allowed to use bespoke pools and custom allocators. The fastest Rust implementation imports an allocator crate. No doubt you can lawyer the rules to make sure Go still appears slow, but in reality this benchmark doesn’t tell you anything about the language’s general performance because while Go’s allocator is slower, idiomatic Go allocates much less frequently than other languages, but this benchmark prohibits idiomatic optimizations. Moreover, there isn’t a benchmark that shows gc latency, which is the flip side of the allocator coin. So if you really want to die on the hill of “worthless but well-lawyered benchmark rules” be my guest.

            • By igouy 2021-08-25 16:17 (1 reply)

              What-about-ism.

              https://golang.org/doc/faq#garbage_collection

              > They’re all allowed to use bespoke pools and custom allocators.

              No. They are allowed a library memory pool.

              As-it-says: 'Please don't implement your own custom "arena" or "memory pool" or "free list" - they will not be accepted.'

              > … while Go’s allocator is slower…

              So that tiny tiny program shows it's slower because it's slower.

              • By throwaway894345 2021-08-25 18:29 (1 reply)

                > What-about-ism.

                Not sure what you're referring to here.

                > No. They are allowed a library memory pool.

                Yes, this is a contrived rule. In reality, a Go developer would write the extra ~dozen lines (all of the heavy lifting done by the builtin slice implementation) and call it a day.

                > So that tiny tiny program shows it's slower because it's slower.

                Tautology. It's slower because the contrived rules preclude idiomatic optimizations.

                My point is that these contrived benchmarks don't indicate anything about the relative performance of these languages, but you keep responding with some variation of "but Go is slower in these benchmarks!" which everyone already agrees with. So unless you're going to actually address the point, I don't see the point in continuing on. It feels like you're hell-bent on using this thread for your personal programming language holy war, which is disinteresting to me (see again my first post) and against the site rules.

                • By igouy 2021-08-25 18:44 (1 reply)

                  > … a Go developer would write the extra ~dozen lines…

                  And a C# developer could write a program that would avoid GC, and a Java developer…, and…

                  It feels like you're hell-bent on using this thread for your personal programming language holy war…

                  • By throwaway894345 2021-08-25 19:44

                    > And a C# developer could write a program that would avoid GC, and a Java developer…, and…

                    Are those idiomatic? If so, then they should be permitted to apply those optimizations. Again, the whole point of benchmarks is to indicate real-world use.

                    > It feels like you're hell-bent on using this thread for your personal programming language holy war…

                    I've reiterated my substantial point over and over again ("contrived rules don't indicate real world use, which is the definitional purpose of benchmarks") and you still haven't addressed it. But in any case, perhaps if both of us think the other is waging a holy war, it's an indicator that the thread has run its course.

  • By Thaxll 2021-08-24 18:59 (5 replies)

    When you see how much effort it takes for C# and Java to optimize the runtime, there are a lot of people working on that. C# is fast but you see that it uses between 2 and 32 times the memory that Go needs.

    Overall you can see how fast Go is: it has little optimization compared to C# and it's as fast. Compare this: https://benchmarksgame-team.pages.debian.net/benchmarksgame/... and the overly complicated C# version: https://benchmarksgame-team.pages.debian.net/benchmarksgame/... ( avx, Intrinsics etc ... )

    • By JamesSwift 2021-08-24 19:03 (2 replies)

      I definitely have run into this, even when using 'server mode' in asp.net core. I never was able to figure out why the C# version of my POC was using so much memory, but rewriting to golang ended up using a very predictable, minimal amount of memory in comparison.

      • By Mattish 2021-08-24 19:33 (1 reply)

        Server mode is much less likely to incur GC. Were you causing enough memory usage to force your app to actually free memory?

        It will intentionally use more memory for the sake of throughput, which is why all the .NET programs in this post set the flag for it: it's primarily a _speed_ benchmark.

        • By JamesSwift 2021-08-25 13:10

          I can't say for certain, but I'm pretty sure I manually GC'ed as a test and it didn't seem to help. It's been a while and the C# ended up being a temporary approach until I saw how much less memory the golang version used.

          They performed almost equivalently for RPS I believe.

      • By azth 2021-08-24 19:07 (2 replies)

        My guess is that golang's GC is optimized for latency at the expense of throughput, limiting the max memory size.

        • By xh-dude 2021-08-24 19:50

          It is, and it’s not directly tunable … the opinion is ‘we’re IO-bound, not compute-bound’.

          I spent a chunk of time recently (10s of hours) doing dumb C# vs Go benchmarks - files and networking, and nothing worth taking seriously - just, usually the part about being IO-bound was true. C# is really impressive and was just a little slower with the best async solutions I could come up with. The machinery for async has overhead, so do Go routines and channels … the first-pass, not very performant code was just a little faster and IMHO clearer with Go (but I’m much better with Go /shrug).

        • By merb 2021-08-24 19:37 (1 reply)

          nah he just compares two different things. basically his golang hello world probably had basically nothing while his dotnet version used the "Microsoft.NET.Sdk.Web" which basically pulls the shared framework which will load a ton of stuff. BUT even after that the memory usage might be bigger. however does it really matter? I mean dotnet is not a big memory hog. it's pretty lightweight for what it is. compare it to java and the golang number would be insane.

          • By JamesSwift 2021-08-25 13:07 (1 reply)

            I tried to do the most minimal, idiomatic design for both. I didn't save the C# code (I think it probably was just asp.net core and newtonsoft.json), but here is the go version [1].

            It basically just loads a JSON file into memory then allows you to query the data with 2 API endpoints.

            [1] - https://github.com/J-Swift/GamesDbMirror-go

            • By merb 2021-08-25 22:03

              well asp.net core is not really minimal or idiomatic. it pulls a whole framework. your code doesn't do a lot of things that asp.net core would do. sadly since nancyfx died there aren't that many c# http frameworks that are as lightweight as the golang ones. asp.net core is more like java spring btw. nowadays most c# http frameworks do call `<FrameworkReference Include="Microsoft.AspNetCore.App" />` which is really really big compared to just Microsoft.NETCore.App. most often the default aspnetcore also configures a "secure" application (working cors, etc.)

    • By joelfolksy 2021-08-24 22:05

      "( avx, Intrinsics, etc ... )"

      I have to give you credit for trying to apply the Rule of Three to a single criticism.

      Of course, I don't really understand how the fact that someone took the time to vectorize the C# submission is supposed to be a mark against C#...

    • By agumonkey 2021-08-24 19:58 (2 replies)

      and afaik, .net has record types unlike the jvm (yet) which means java is even worse

      • By kaba0 2021-08-24 20:14

        I think you mean structs (value types. Will be called primitive types in Java). Records are not too interesting from a performance pov (and java has them, and I think they actually predate c#’s), though java will likely be able to optimize serialization/deserialization of records better.

      • By ternaryoperator 2021-08-24 20:45 (1 reply)

        Java recently added record types.

        • By agumonkey 2021-08-25 5:43 (1 reply)

          did it land in a release ? i thought it was still at review proposal

          • By scns 2021-08-25 20:29

            In openjdk 16 IIRC.

  • By Guillaume86 2021-08-24 18:57

    The focus on performance since Core is really nice to see, dotnet 6 is continuing the trend as well: https://devblogs.microsoft.com/dotnet/performance-improvemen...

HackerNews