> I think there is a section of programmers who actually do like the actual typing of letters, numbers and special characters into a computer...
Reminds me of this excerpt from Richard Hamming's book:
> Finally, a more complete, and more useful, Symbolic Assembly Program (SAP) was devised—after more years than you are apt to believe during which most programmers continued their heroic absolute binary programming. At the time SAP first appeared I would guess about 1% of the older programmers were interested in it—using SAP was “sissy stuff”, and a real programmer would not stoop to wasting machine capacity to do the assembly. Yes! Programmers wanted no part of it, though when pressed they had to admit their old methods used more machine time in locating and fixing up errors than the SAP program ever used. One of the main complaints was when using a symbolic system you do not know where anything was in storage—though in the early days we supplied a mapping of symbolic to actual storage, and believe it or not they later lovingly pored over such sheets rather than realize they did not need to know that information if they stuck to operating within the system—no! When correcting errors they preferred to do it in absolute binary addresses.
I think this is beside the point, because the crucial change with LLMs is that you don’t use a formal language anymore to specify what you want, and get a deterministic output from that. You can’t reason with precision anymore about how what you specify maps to the result. That is the modal shift that removes the “fun” for a substantial portion of the developer workforce.
It's not just about fun. When I'm going through the actual process of writing a function, I think about design issues: about how things are named, about how the errors from this function flow up, about how scheduling is happening, about how memory is managed. I compare the code to my ideal, and this is the time when I realize that my ideal is flawed or incomplete.
I think a lot of us don't get everything specced out up front; we see how things fit and adjust accordingly. Most of the really good ideas I've had were not formulated in the abstract, but were realizations had in the process of spelling things out.
I have a process, and it works for me. Different people certainly have other ones, and other goals. But maybe stop telling me that instead of interacting with the compiler directly it's absolutely necessary that I describe what I want to a well-meaning idiot, and patiently correct them, even though they are going to forget everything I just said in a moment.
> ... stop telling me that instead of interacting with the compiler directly it's absolutely necessary that I describe what I want to a well-meaning idiot, and patiently correct them, even though they are going to forget everything I just said in a moment.
This perfectly describes the main problem I have with coding agents. We are told we should move from explicit control and writing instructions for the machine to pulling the slot-machine lever over and over, "persuading the machine" and hoping for the right result.
I'd be interested in learning more about your workflow. I've certainly used plaintext files (and similar such things) to aid in project planning, but I've never used paper beyond taking a few notes here and there.
Not who you’re replying to, but I do this as well. I carry a pocket notebook and write paragraphs describing what I want to write. Sometimes I list out the fields of a data structure. Then I revise. By the time I actually write the code, it’s more like a recitation. This is so much easier than trying to think hard about program structure while logged in to my work computer with all the messaging and email.
> because the crucial change with LLMs is that you don’t use a formal language anymore to specify what you want, and get a deterministic output from that
You don't just code, you also test, and your safety is only as good as your test coverage and depth. Think hard about how to express your code to make it more testable. That is the single way we have now to get back some safety.
But I'd argue that manual inspection of code, thinking it through in your head, is still not strict checking; it is vibe-testing as well. Only code backed by tests is not vibe-based. If needed, use TLA+ (generated by the LLM) to test, or go as deep as necessary.
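To make "only code backed by tests is not vibe-based" concrete, here's a minimal property-based sketch in Python using the hypothesis library (the deduping function is a made-up example, not anything from this thread):

```python
# A minimal sketch of property-based testing with hypothesis
# (pip install hypothesis). The function under test is hypothetical.
from hypothesis import given, strategies as st

def dedupe_preserving_order(items):
    """Remove duplicates while keeping first-occurrence order."""
    seen = set()
    return [x for x in items if not (x in seen or seen.add(x))]

@given(st.lists(st.integers()))
def test_idempotent(xs):
    once = dedupe_preserving_order(xs)
    # Properties hold for *any* input, LLM-written or hand-written alike.
    assert dedupe_preserving_order(once) == once

@given(st.lists(st.integers()))
def test_membership_preserved(xs):
    assert set(dedupe_preserving_order(xs)) == set(xs)
```

Properties like these survive a rewrite of the implementation, which is exactly what you want when the implementation came out of a slot machine.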
I don't know what book you're talking about, but it seems that you intend to compare the switch to an AI-based workflow to using a higher-level language. I don't think that's valid at all. Nobody using Python for any ordinary purpose feels compelled to examine the resulting bytecode, for example, but a responsible programmer needs to keep tabs on what Claude comes up with, configure a dev environment that organizes the changes into a separate branch (as if Claude were a separate human member of a team) etc. Communication in natural language is fundamentally different from writing code; if it weren't, we'd be in a world with far more abundant documentation. (After all, that should be easier to write than a prompt, since you already have seen the system that the text will describe.)
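(Though to be fair, Python will show you the bytecode if you ask; a quick sketch with the standard-library dis module:)

```python
# Inspecting CPython bytecode with the standard library's dis module.
# Possible, but nobody reviews this the way they'd review an agent's diff.
import dis

def add(a, b):
    return a + b

dis.dis(add)
# Prints something like:
#   LOAD_FAST    a
#   LOAD_FAST    b
#   BINARY_OP    0 (+)    <- BINARY_ADD on older CPython versions
#   RETURN_VALUE
```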
> Nobody using Python for any ordinary purpose feels compelled to examine the resulting bytecode, for example,
The first people using higher level languages did feel compelled to. That's what the quote from the book is saying. The first HLL users felt compelled to check the output just like the first LLM users.
But there is no reason to suppose that responsible SWEs would ever be able to stop doing so for an LLM, given the reliance on nondeterminism and a fundamentally imprecise communication mechanism.
That's the point. It's not the same kind of shift at all.
Assembly was a "high level" language when it was new -- it was far more abstract than entering in raw bytes. C was considered high level later on too, even though these days it is seen as "low level" -- everything is relative to what else is out there.
The same pattern held through the early days of "high level" languages that were compiled to assembly, and then the early days of higher level languages that were interpreted.
Read the famous "Story of Mel" [1] about Mel Kaye, who refused to use optimizing assemblers in the late 1950s because "you never know where they are going to put things". Even in the 1980s you used to find people like that.
I don't think that counts against the narrative. The narrative is just that each time we've moved up the abstraction chain in generating code, there have been people who were skeptical of the new level of abstraction. I would say that it's usually the case that highly skilled operators at the previous level remain more effective than the new adopters of the next level. But what ends up mattering more in the long run is that the higher level of abstraction enables a lot more people to get started and reach a basic level of capability. This is exactly what's happening now! Lots of experienced programmers are not embracing these tools, or are embracing them but are still more effective just writing code. But way more people can get into "vibe coding" with some basic level of success, and that opens up new possibilities.
The narrative is that non-LLM adopters will be left behind, lose their jobs, are Luddites, yadda yadda yadda, because they are not moving up the abstraction layers by adopting LLMs to improve their output. There is no point in the timeframe of the story at which Mel would have benefitted from a move to a higher abstraction level by adopting the optimizing assembler, because its output will always be drastically inferior to his own using his native expertise.
That's not the narrative in this thread. That's a broader narrative than the one in this thread.
And yes, as I said, the point is not that Mel would benefit, it's that each time a new higher level of abstraction comes onto the scene, it is accessible to more people than the previous level. This was the pattern with machine code to symbolic assembly, it was the pattern with assembly to compiled languages, with higher level languages, and now with "prompting".
The comment I originally replied to implied that this current new abstraction layer is totally different than all the previous ones, and all I said is that I don't think so, I think the comparison is indeed apt. Part of that pattern is that a lot of new people can adopt this new layer of abstraction, even while many people who already know how to program are likely to remain more effective without it.
> you intend to compare the switch to an AI-based workflow to using a higher-level language.
That was the comparison made. AI is an eerily similar shift.
> I don't think that's valid at all.
I don't think you made the case by cherry-picking what it can't do. This is exactly the same situation as when SAP appeared. There weren't symbols for every situation binary programmers were handling at the time, but that didn't change the obvious and practical improvement that the abstraction provided. Granted, I'm not happy about it, but I can't deny it either.
Contra your other replies, I think this is exactly the point.
I had an inkling that the feeling existed back then, but I had no idea it was documented so explicitly. Is this quote from The Art of Doing Science and Engineering?
I didn't see it referenced directly anywhere in this post. However, for those interested, automatic music transcription (i.e., audio->MIDI) is actually a decently sized subfield of deep learning and music information retrieval.
There have been several successful models for multi-track music transcription - see Google's MT3 project (https://research.google/pubs/mt3-multi-task-multitrack-music...). In the case of piano transcription, accuracy is nearly flawless at this point, even for very low-quality audio:
He's trying to solve a second (also hard-ish) problem as well: deriving an accurate musical score from MIDI data. It's a "sounds easy but isn't" problem, especially since audio-to-MIDI transcribers are great at pitch and onset times, but rather less reliable at duration and velocity.
I agree that the audio->score and MIDI->score problems are quite hard. There has been research in this area too, but it is far less developed than audio->MIDI.
That's because MIDI doesn't contain all the information that was in a score.
Scores are interpreted by musicians to create a performance, and MIDI is a capture of (some of) the data about that performance. Music engraving is full of implicit and explicit cultural rules, and getting it _right_ has parallels with handwritten kanji script in terms of both the importance of correctness to the reader and the number of traps for the unwary or uncultured.
All of which can be taken to mean "classical musicians are incredibly picky and anal about this stuff", or, "well-formed music notation conveys all sorts of useful contextual information beyond simply 'what note to play when'".
A lot of modern scores are written with MIDI in mind (whether or not the composer knows it - that's how they hear it the first 50 or so times). That should make it somewhat easier to go MIDI -> score for similar pieces. Current attempts I have seen still make a lot of stupid errors like making note durations too precise and spelling accidentals badly. There's probably still a lot of low-hanging fruit.
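To make "note durations too precise" concrete: the naive fix is snapping everything to a grid, and even that exposes the problem. A minimal sketch in plain Python (the PPQ and grid values are assumptions, not taken from any tool discussed here):

```python
# Sketch: snap raw MIDI ticks to a sixteenth-note grid.
# PPQ (ticks per quarter note) of 480 is common but assumed here;
# real engraving also needs tuplets, grace notes, voice separation...
PPQ = 480
GRID = PPQ // 4  # sixteenth-note grid

def quantize(ticks: int, grid: int = GRID) -> int:
    """Round a tick position or duration to the nearest grid line."""
    return max(grid, round(ticks / grid) * grid)

# A performed quarter note: played slightly late, released early.
onset, duration = 247, 419
print(quantize(onset), quantize(duration))  # -> 240 360
```

Note the failure mode: a quarter note released a bit early (419 of 480 ticks) rounds to 360 ticks, i.e. a dotted eighth, which is exactly the "too precise" error a human engraver would never make.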
This is absolutely not easy, though, given all the cultural context. Things like picking up a "legato" or "cantabile" marking and choosing an accent vs a dagger or a marcato mark are going to be very difficult no matter what.
I ported their Colab to a local runtime so I could use it more easily.
The MIDI output is... puzzling?
I've tried feeding it even simple stems and found the output unusable for some tracks, i.e. the MIDI output and audio were not well aligned and there were timing issues. On other audio it seemed to work fine.
Multi-track transcription has a long way to go before it's seriously useful for real-world applications. Ultimately I think that converting audio into MIDI makes a lot more sense for piano/guitar transcription than it does for complex multi-instrument works with sound effects, etc.
Luckily for me, audio-to-seq approaches do work very well for piano, which turns out to be an amazing way of getting expressive MIDI data for training generative models.
I developed https://pyaar.ai, which uses MT3 under the hood. I realized that continuous string instruments (guitar), with things like slides and bends, are quite difficult to capture in MIDI. Piano works much better because it's more discrete (the keys abstract away the strings), so the MIDI file is a better representation.
> I realized that continuous string instruments (guitar), with things like slides and bends, are quite difficult to capture in MIDI.
It's just pitch bend?
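Mechanically yes, but bend is per channel, not per note, which is where guitar gets awkward: two strings bending independently need separate channels (the MPE convention). A minimal sketch with the mido library (the channel-per-string layout and the 2-semitone bend range are assumptions):

```python
# Sketch: a one-semitone slide up as MIDI pitch bend, using mido
# (pip install mido). Pitch bend applies to the whole channel, so
# independent per-string bends need one channel per string (MPE-style).
import mido

track = mido.MidiTrack()
track.append(mido.Message('note_on', note=64, velocity=90, channel=1, time=0))

BEND_RANGE = 2   # semitones covered by the full wheel range (assumed)
STEPS = 8
for i in range(1, STEPS + 1):
    bend = int(8191 * (i / STEPS) / BEND_RANGE)  # ramp up one semitone
    track.append(mido.Message('pitchwheel', pitch=bend, channel=1, time=30))

track.append(mido.Message('note_off', note=64, velocity=0, channel=1, time=60))
```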
I think trying to transcribe as MIDI is just a fundamentally flawed approach that has too many (well known) pitfalls to be useful.
A trained human can listen to a piece and transcribe it in seconds, but programming it as MIDI could take minutes/hours. If you're not trying to replicate how humans learn by ear, you're probably approaching this wrong.
Essentially, the leading way to do automatic music transcription is to train a neural network on supervised data, i.e., paired audio-MIDI data. In the case of piano recordings, there is a very good dataset for this task which was released by Google in 2018:
Most current research involves refining deep learning based approaches to this task. When I worked on this problem earlier this year, I was interested in adding robustness to these models by training a sort of musical awareness into them. You can see a good example of it in this tweet:
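For a rough idea of what that supervised setup looks like, here's a toy sketch in PyTorch: spectrogram frames in, per-frame key activations out, binary cross-entropy against MIDI-aligned labels. The shapes and model are illustrative assumptions; published systems are far larger and typically predict onsets separately as well.

```python
# Toy sketch of supervised piano transcription training: frames of a
# mel spectrogram map to 88 per-key on/off probabilities per frame.
import torch
import torch.nn as nn

N_MELS, N_KEYS = 229, 88   # 229 mel bins is a common choice; 88 piano keys

model = nn.Sequential(
    nn.Linear(N_MELS, 512), nn.ReLU(),
    nn.Linear(512, N_KEYS),            # logits per key, per frame
)
opt = torch.optim.Adam(model.parameters(), lr=6e-4)
loss_fn = nn.BCEWithLogitsLoss()

# Stand-ins for a real loader over paired (audio, MIDI) examples:
spec = torch.randn(32, 640, N_MELS)                        # batch, frames, mels
labels = torch.randint(0, 2, (32, 640, N_KEYS)).float()    # aligned key states

logits = model(spec)
loss = loss_fn(logits, labels)
opt.zero_grad(); loss.backward(); opt.step()
```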