The better part of this is having a local-first AI, particularly because it has tool calling and structured output built in.
I haven't pushed out a full version[1] which uses ducklake-wasm + this to make a completely local SQL answering machine, but for now all it does is retype prompts in the browser.
Flickr was the coolest thing Yahoo had when I worked there (Brickhouse was a close second).
I really loved all the places where they snuck in "Game Never Ending" in the product, because they didn't set out to make a photo sharing product, but steered hard into that.
Flickr was the only property which was allowed their own version of PHP and despite having PHP inside, every single URL said ".gne" (Game Never Ending). I worked for the PHP team and that was my only excuse to show up to work in the SF office instead of being stuck in Sunnyvale when visiting the US.
They had all the right bits of architecture built out - the rest of Yahoo had great code too (like Vespa or the graph behind Yahoo 360), but everything was more complex than it should have been.
Flickr had the simplest possible approach that worked and they tried it before building anything more complex - the image urls, the resize queues, the way albums were stored, machine-tags, gps co-ordinates.
I also took a lot of photos to put up on flickr, trying to get featured on the Explore page - it was like getting published in a magazine.
Every presentation I made had CC images backed by flickr, it was a true commons to share and take.
+1 on Flickr being the best acquisition and product Yahoo! had.
I still have my account and old photos there. And because I licensed most of them as CC, a couple of them landed on Wikipedia because of that - felt nice.
I had everything set as CC until I noticed a photo of my very pregnant wife was getting many more views than anything else, and I found it cited in a paper on training AI. That was somehow less endearing than someone getting good use out of my images (which also happened at least once with one of my images).
When I was doing more graphics-rich presentations, the CC photo resource on Flickr was really useful. (In case someone asks, I usually wasn't being paid directly for giving presentations so I convinced myself I could feel comfortable using CC content in general even with strings like non-commercial attached.)
From my point of view Yahoo destroyed Flickr. I was a happy user for many years and lost access to my photos due to authentication changes. At least Google had the decency to just shut down Reader as opposed to Yahoo's enshittification of a product that sparked joy.
Strong agree that Flickr went downhill rapidly when acquired by Yahoo - but also happy to report that it has since bounced back.
The community isn’t the same of course, but the platform itself is a joy to use again - especially as someone who got tired of Instagram when it stopped being about photography.
As the PR clearly points out, you can do this in a register but not inside vectors.
I don't think fastdiv has had an update in years, which is what I've used, because compilers can't do "this is a constant for the next loop of 1024" the way columnar SQL needs.
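For anyone curious, the precompute-once trick such libraries use is easy to sketch. This is a Python toy of the 64-bit reciprocal method for 32-bit operands; the function names are mine, not fastdiv's API:

```python
# Magic-number division: precompute a multiplier once for a divisor that
# stays constant across a whole batch (e.g. one 1024-row columnar chunk),
# then replace each per-row division with a multiply and a shift.

def make_divider(d: int) -> int:
    """Precompute M = ceil(2**64 / d) for a 32-bit divisor d >= 1."""
    return (2**64 - 1) // d + 1

def fast_div(n: int, M: int) -> int:
    """Exact n // d for any 32-bit n, using only a multiply and a shift."""
    return (M * n) >> 64

# Hoist the precomputation out of the hot loop:
d = 7
M = make_divider(d)
assert all(fast_div(n, M) == n // d for n in range(10_000))
```

A compiler can only do this when the divisor is a compile-time constant; doing it at runtime, once per batch, is exactly the case columnar execution hits.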
> Multiplication alone requires depth-8 trees with 41+ leaves i.e. minimal operator vocabulary trades off against expression length.
That is sort of comparable to how NAND simplifies scaling.
Division is hell on gates.
The single component was the reason scaling went the way it did.
There was only one gate structure that had to improve to make chips smaller - if a chip had used 3 different kinds, scaling would have required more than one parallel innovation to land (sort of like how LED lighting had to wait for blue).
If you need two or more components, then you have to keep switching tools instead of hammer, hammer, hammer.
I'm not sure what you mean by this? It's true that any Boolean operation can be expressed in terms of two-input NAND gates, but that's almost never how real IC designers work. A typical standard cell library has lots of primitives, including all common gates and up to entire flip-flops and RAMs, each individually optimized at a transistor level. Realization with NAND2 and nothing else would be possible, but much less efficient.
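To be fair, the universality claim itself is easy to verify exhaustively; a toy truth-table sketch, nothing like how real standard cells are actually built:

```python
# Toy illustration of NAND2 universality: every common Boolean primitive
# built from a single two-input NAND. (Real cell libraries don't do this;
# each cell is optimized at the transistor level.)

def nand(a: int, b: int) -> int:
    return 1 - (a & b)

def not_(a):    return nand(a, a)
def and_(a, b): return nand(nand(a, b), nand(a, b))
def or_(a, b):  return nand(nand(a, a), nand(b, b))

def xor_(a, b):
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))

# Exhaustive check against Python's own operators.
for a in (0, 1):
    for b in (0, 1):
        assert and_(a, b) == (a & b)
        assert or_(a, b)  == (a | b)
        assert xor_(a, b) == (a ^ b)
        assert not_(a)    == (1 - a)
```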
Efficient numerical libraries likewise contain lots of redundancy. For example, sqrt(x) is mathematically equivalent to pow(x, 0.5), but sqrt(x) is still typically provided separately and faster. Anyone who thinks the eml() function is supposed to lead directly to more efficient computation has missed the point of this (interesting) work.
Yeah, what you're going to get is more efficient proofs: you can do induction on one case to get results about elementary functions. Not sure where anyone's getting computational efficiency thoughts from this.
Adapted from a statement found in Roman author Publius Flavius Vegetius Renatus's tract De Re Militari (fourth or fifth century AD), in which the actual phrasing is Igitur qui desiderat pacem, præparet bellum ("Therefore let him who desires peace prepare for war").
>> It replicates data across multiple, independent DRAM channels with uncorrelated refresh schedules
This is the sort of thing that was done before in the NUMA world, but that part is easy: just taskset and mbind your way around it to keep your copies in both places.
The crazy part of what she's done is determining that the two copies don't get hit by refresh cycles at the same time.
Particularly by experimenting on something proprietary like Graviton.
Access optimization or interleaving happens at a lower level than linearly mapping DIMMs and channels. The x86 cache line size is 64 bytes, so the interleave granule must be a multiple of that. Probably 64*2^n bytes.
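As a purely hypothetical sketch of what such a mapping could look like (real memory controllers hash several address bits, and the granule exponent here is a made-up parameter):

```python
# Hypothetical channel-interleave mapping, for illustration only: with a
# 64-byte cache line and an interleave granule of 64 * 2**n bytes, the
# channel is picked by the address bits just above the granule offset.

LINE = 64

def channel_of(addr: int, n: int, num_channels: int) -> int:
    granule = LINE << n  # 64 * 2**n bytes of contiguous data per channel
    return (addr // granule) % num_channels

# With n=0 (per-line interleave) and 2 channels, consecutive cache lines
# alternate channels:
assert [channel_of(line * LINE, 0, 2) for line in range(4)] == [0, 1, 0, 1]
```

The point being: two "replicas" placed naively could land on the same channel, so the mapping has to be reverse-engineered before replication helps.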
EPYC chips have multiple levels of NUMA - one across CCDs on the one chip, and another between chips in different motherboard sockets. As a user under Linux you can treat it as if it was simple SMP, but you’ll get quite a bit less performance.
Home PCs don’t do NUMA as much anymore because of the number of cores and threads you can get on one core complex. The technology certainly still exists and is still relevant.
> Surely those are at least an order of magnitude larger than Tolkien's prose and might still benefit from a RAG.
At some point, this is a distributed system of agents.
Once you go from 1 to 3 agents (1 router and two memory agents), it slowly ends up becoming a performance and cost decision rather than a recall problem.
> Increased speed only gets us where we want to be sooner if we are also heading in the right direction.
This is a real problem when the "direction" == "good feedback" from a customer standpoint.
Before, we had a product person for every ~20 people generating code; now we're all product people, and the machines are writing the code (not all of it, but enough of it that I will -1 a ~4000 line PR and ask someone to start over instead of digging out of the hole in the same PR).
Feedback takes time on the system by real users to come back to the product team.
You need a PID-like smoothing curve over your feature changes.
Like you said, speed isn't velocity.
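A toy sketch of what PID-like smoothing of a rollout could look like (the gains, the error metric, and the function names are all made up for illustration):

```python
# Adjust a feature's rollout fraction from a lagging feedback signal
# instead of jumping straight from 0% to 100%. P and I terms only;
# kp/ki values are arbitrary here.

def step_rollout(rollout, error, integral, kp=0.5, ki=0.1):
    """One control step. `error` = target satisfaction - measured."""
    integral += error
    rollout += kp * error + ki * integral
    return min(1.0, max(0.0, rollout)), integral

rollout, integral = 0.1, 0.0
for error in (0.2, 0.1, 0.0, -0.3):  # feedback trickling in from users
    rollout, integral = step_rollout(rollout, error, integral)
```

The clamping matters: the feedback lag is exactly why you don't want the controller overshooting in either direction.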
Specifically, if you have a decent experiment framework to keep the rollout progressive across the customer base, going in the wrong direction isn't the huge penalty it used to be.
I liked the PostHog newsletter about the "Hidden dangers of shipping fast", I can't find a good direct link to it.
Don't wait for feedback from "real users", become a user!
This Tayloristic idea (which has now been reincarnated in "design thinking") that you can observe someone doing a job and then decide better than them what they need is ridiculous and should die.
Good products are built by the people who use the thing themselves. Doesn't mean though that choosing good features (product design and engineering) isn't a skill in itself.
Too often that isn't possible. There is a lot of domain knowledge in making a widget, and a lot of domain knowledge in doing a job. When a complex job needs a complex widget, there often isn't enough overlap to be an expert in both.
Sure, 'everyone' drives, so you can be a domain expert in cars. However, not everyone can be an astronaut - rockets are complex enough to need more people than astronauts, so most people designing spaceships will never have the opportunity to use one.
I am not asking anybody to be an expert in both (although I am sure such people exist, however rare); I am saying people should ideally have some skill in both. Also, people can collaborate, and learn new skills.
If you're bottlenecked by waiting for the users of your product to give feedback, you clearly need to spend more time learning how to be a user yourself. Or hire people with some domain skill who can also code.
Have been there: we got pushback from users and had to back off with releases. Users hunted the product owner with pitchforks and torches.
As the dev team we were able to crank the speed even more, and silly product people thought they were doing something good by demanding even more from us. But that was one of the instances where users were helpful :).
People use dozens of apps every day to do their work. Just think about how you are going to make time to give feedback to each of them.
> Just think about how you are going to make time to give feedback to each of them.
That's pretty much solved by the size of the audiences. You won't give feedback on 12 apps, but 11 other people will probably do so on 11 different apps.
Of course, the issue with my domain is that there's plenty of feedback, and product owners just dismiss it. Burn down your entire portfolio to get that boosted shareholder value for the next earnings report.
And how do you solve that when you are one of those 11 apps when no one wants to talk to you because they have their work to do? Where you don’t have power to say that kind of thing.
Well, by asking repeatedly, of course, but then you just piss people off.
Have you ever given feedback to Atlassian, Google, Microsoft?
[1] - https://notmysock.org/code/voice-gemini-prompt.html