> The idea of sovereignty is a cornerstone of how we organize our global society.
It is, but it's kind of a thin lie.
How's sovereignty going for Ukraine? Hong Kong? Chechnya, South Ossetia, and Abkhazia? Puerto Rico? Western Sahara? Parts of Sudan? Border regions of Bhutan? South American fisheries? People trying to set up micronations?
No, the whole point is to eliminate dependencies that they have to maintain. "not obligate" really doesn't mean anything if it's available as a backend--the obligation is on the Zig developers to keep it working, and they want to eliminate that obligation.
And the original question was "how will they reinvent the wheel on the man-years of optimization work that went into LLVM in their own compiler infrastructure?" -- the answer is that Andrew naively believes that they can recreate comparable optimization.
There are a whole lot of misstatements about Zig and other matters in the comments here by people who don't have much knowledge about what they are talking about--much of the discussion of using low-level vs high-level languages for writing compilers is nonsense. And one person wrote of "Zig and D" as if those languages are comparable, when D is at least as high level as C++, which it was intended to replace.
> the answer is that Andrew naively believes that they can recreate comparable optimization.
That's exactly wrong.
> There are a whole lot of misstatements about Zig and other matters in the comments here by people who don't have much knowledge about what they are talking about.
To clarify, my statement was based on comments I have seen and heard from Andrew Kelley when discussing this subject. I can't locate those at the moment, but here is https://news.ycombinator.com/item?id=39156426, a comment by mlugg, a primary member of the Zig development team (emphasis added):
"To be clear, we aren't saying it will be easy to reach LLVM's optimization capabilities. That's a very long-term plan, and one which will unfold over a number of years. The ability to use LLVM is probably never going away, because there might always be some things it handles better than Zig's own code generation. However, trying to get there seems a worthy goal; at the very least, we can get our self-hosted codegen backends to a point where they perform relatively well in Debug mode without sacrificing debuggability."
The current interim plan (which I think was developed after the comments that I heard from Andrew, perhaps in recognition of their naivete) is for Zig to emit LLVM bitcode files that can be passed to a separate LLVM instance as part of the build process. Is that "a first-class supported backend target for compilation"? I suppose it's a matter of semantics, but it certainly won't be the current LLVM backend that makes LLVM API calls.
They are not your keys, they're a scoped agreement; using CC keys with OpenClaude meant misrepresenting the client to trick the other party into a discount.
Given the lack of clear communication and the fact that their primary competitor openly supports the use of bespoke harnesses, I highly doubt this is an incorrect announcement.
Anthropic is destroying goodwill that is hard-won in this space. At the end of the day, people just need to do their work in a way that makes sense for them. In my case (someone who has been building ML/AI tools for 25 years @ MS & Apple), I have much better results using my bespoke harness. If I'm paying $200/month for compute, I should be able to use it in a way that works for me. Given the push back, I'm not alone.
It's saying something about the announcement, it's not saying something about the correctness of the announcement.
I used the word hearsay to imply that flip-flopping should only be a judgement on the comms of the entity accused of flip-flopping, not on information living on some third-party source.
And I was referring to all of the historical flip-flopping. This new flip is just proving the point.
Of course, you're simply being pedantic. Everyone knows why they are making this change (which is more important than your silly take on what constitutes flip-flopping).
The point: Anthropic is losing subscribers because it has no idea what it actually wants to be.
> However you will never convince someone anti-automatic resource management from ideological point of view.
It's generally accepted that 'explicit is better than implicit' and what you want in the end is deterministic, machine checked resource management. Automatic resource management is a subset of machine checked resource management. There is a large, somewhat less explored space of possibility (for example seL4 lives in this space) where you have to manually write the resource declarations and either the compiler or some other static analysis checks your work.
Except that languages like Rust and C++ are full of implicit behaviour, so it is kind of an interesting argument.
Even C has its implicit moments: type conversions, signal handling, traps, setjmp/longjmp possibly hidden in libraries, thread handling across forks, and so on.
> In reality it is a pretty deterministic (modulo compiler options, CPU flags, etc)
IIRC this was not ALWAYS the case, on x86 not too long ago the CPU might choose to put your operation in an 80-bit fp register, and if due to multitasking the CPU state got evicted, it would only be able to store it in a 32-bit slot while it's waiting to be scheduled back in?
It might not be the case now in a modern system if, based on load patterns, the software decides to schedule some math operations on the GPU vs the CPU, or maybe in some sort of corner case where you are horizontally load balancing across two different GPUs (one AMD, one Nvidia) -- I'm speculating here.
I was bitten by this years ago when our test cases failed on Linux but worked on macOS. pdftotext was behaving differently (deciding whether or not to merge two lines) on the two platforms -- both were gcc and Intel at the time. When I looked at it in a debugger or tried to log the values, it magically fixed itself.
Eventually I learned about the 80-bit thing and that macOS gcc was automatically adding -ffloat-store to make == more predictable (they use floats everywhere in the UI library). Since pdftotext was full of == comparisons, I ended up adding -ffloat-store to the gcc command line and calling it a day.
> IIRC this was not ALWAYS the case, on x86 not too long ago the CPU might choose to put your operation in an 80-bit fp register, and if due to multitasking the CPU state got evicted, it would only be able to store it in a 32-bit slot while it's waiting to be scheduled back in?
I don't think the CPU was ever allowed to do that, but with your average compiler you were playing with fire.
Did any actual OS mess up state like that? They could and should save the full registers. There's even a builtin instruction for this, FSAVE.
This is the kind of misinformation that makes people more wary of floats than they should be.
The same series of operations with the same input will always produce exactly the same floating point results. Every time. No exceptions.
Hardware doesn't matter. Breed of CPU doesn't matter. Threads don't matter. Scheduling doesn't matter. IEEE floating point is a standard. Everyone follows the standard. Anything not producing identical results for the same series of operations is *broken*.
What you are referring to is the result of different compilers doing a different series of operations than each other. In particular, if you are using the x87 fp unit, MSVC will round 80-bit floating point down to 32/64 bits before doing a comparison, and GCC will not by default.
Compilers don't even use 80-bit FP by default when compiling for 64-bit targets, so this is not a concern anymore, and hasn't been for a very long time.
There's just so many "but"s to this that I can't in good faith recommend people to treat floats as deterministic, even though I'd very much love to do so (and I make such assumptions myself, caveat emptor):
- NaN bits are non-deterministic. x86 and ARM generate different sign bits for NaNs. Wasm says NaN payloads are completely unpredictable.
- GPUs don't give a shit about IEEE-754 and apply optimizations ranging from DAZ to -ffast-math.
- sin, rsqrt, etc. behave differently when implemented by different libraries. If you're linking libm for sin, you can get different implementations depending on the libc in use. Or you can get different results on different hardware.
- C compilers are allowed to "optimize" a * b + c into an FMA when they wish to. The standard only technically allows this contraction within a single expression, but GCC enables it everywhere by default under some `-std`s.
You're technically correct that floats can be used right, but it's just impossible to explain to a layman that, yes, floats are fine on CPUs, but not on GPUs; fine if you're doing normal arithmetic and sqrt, but not sin or rsqrt; fine on modern compilers, but not old ones; fine on x86, but not i686; fine if you're writing code yourself, but not if you're relying on linear algebra libraries, unless of course you write `a * b + c` and compile with the wrong options; fine if you rely on float equality, but not bitwise equality; etc. Everything is broken and the entire thing is a mess.
Yes, there are a large number of ways to fall into traps that cause you to do a different series of operations when you didn't realise that you did. But that's still ultimately what all your examples are. (Except the NaN thing!)
I still think it's important to fight the misinformation.
Programmers have been conditioned to be so afraid of floats that many believe that doing a + b has an essentially random outcome when it doesn't work that way at all. It leads people to spend a bunch of effort on things that they don't need to be doing.
I have had Claude read a usbpcap capture to reverse engineer an industrial digital camera link. It was like pulling teeth, but I got it done (I would not have been able to do it alone).
You're missing the philosophical principle that the more laws you have, the wider the domain those laws cover becomes, and that laws generally accrue. This is not by design, and there are jurisdictions which explicitly curtail it by having sunset laws.
I think the implication is slightly weaker -- it implies some immutable law of training datasets?