1. This is impressive debugging work by the author. No individual step is rocket science - especially when the story is the success path and not the forking paths of possible failures - but they kept their eyes on the prize and figured it out.
2. This reminds me why I no longer use linux on the desktop
This is mostly true* and beyond irrelevant to the point GP was making (that they no longer use Linux on the desktop because of annoyances like this).
From the original article "It works great on both Mac and Windows, but on Linux it displays just a black panel" and, amusingly, after the fix: "One issue is that now my second monitor isn't displaying anything. When I go into Display Settings, it seems like my computer thinks that both monitors are sending this EDID so it's put the second monitor into the wrong mode."
Seems like 2023 isn't the Year of the Linux Desktop either.
* I say "mostly" because to 99+% of the people, telling them "this monitor works great on both Mac and Windows and shows nothing but a black screen on Linux, but don't worry; it has nothing to do with Linux" will get you some pretty quizzical looks.
If you tell Linux to replace the EDID firmware with a custom binary, without telling it which output to override, I'm not surprised Linux acts the way it does. I can confirm on my machine that overriding EDID for a specific connector works as expected. https://wiki.archlinux.org/title/kernel_mode_setting#Forcing... says that you can do it for multiple connectors using commas to separate values, though I have not tried. Though one gotcha is that on my RX 570, `ls -ld /sys/class/drm/*/edid` prints HDMI-A-1 and -2, which is also the name of the display outputs on Wayland, but on X11 xrandr identifies these outputs as HDMI-1 and HDMI-2. (I think you have to use HDMI-A-1 in the kernel command line, but I have not tried.)
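For reference, the override described above ends up looking something like this on the kernel command line (the connector and file names here are illustrative, and the EDID binary has to be available at boot, e.g. under /usr/lib/firmware/edid/ or inside the initramfs):

```
# override a single connector
drm.edid_firmware=HDMI-A-1:edid/my_fixed_edid.bin

# multiple connectors, comma-separated (per the wiki; untested)
drm.edid_firmware=HDMI-A-1:edid/a.bin,DP-1:edid/b.bin
```

Omitting the connector prefix (drm.edid_firmware=edid/my_fixed_edid.bin) applies the override to every output, which matches the behavior described above.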
Yep, as much as we want hardware and software to be conformant to whatever standards exist, in practice an important part of the job of a production OS is to work around the myriad quirks of hw/sw users will likely encounter. Standards are the input, experience is the output, and users judge on output. Related is yesterday's post about ensuring DOS app compatibility in Win95 [1]
There certainly isn't any big philosophical point to make here; Linux doesn't take a hard stance of expecting correct HW.
Ultimately this is a symptom of (1) bugs happen (here, at LG) and (2) Linux has a lower market share and therefore sees less testing. It's certainly fine to say "then I will use Windows to run into statistically fewer problems", so long as you're aware the same argument applies to any entrenched incumbent. As mentioned earlier, it'd have applied to MSIE in 2005 as well.
> As mentioned earlier, it'd have applied to MSIE in 2005 as well.
What's funny is, I was around in 2005 and had already adopted Firefox well before it was called Firefox. (I was also around for the release of IE 4, and spent half a day downloading it on our 56k modem on release day! Exciting times.)
That's because the web is what I work on, and I am OK taking on buggy/beta stuff in the web domain because I learn useful things.
In the OS space, I was a linux user for a decade before I realized that I was wasting tremendous amounts of time and energy debugging stuff very like this monitor issue, and getting no transferable benefit out of it.
I switched to mac at the time, and have experienced vastly less of this sort of configuration nightmare since.
I love linux and I root for it, and occasionally I still try to switch again - until I end up having to figure out this sort of issue that just empirically doesn't exist on my mac, and I get sad and switch back.
I think that's fine and a respectable choice. I think it's even OK to argue that Windows has a value prop as a paid product because MS spends more effort on HW-specific quirks handling, or arguments along those lines.
Still, I think it does matter what's technically going on and where the fault lies. I also think that just like browser diversity, OS diversity is a net positive for the enforcement of good standards that makes things work better for users overall.
For example, I'm willing to bet (and it's because I know cases of it :-) that many PC peripherals work better on Macs because Linux existing has made HW-manufacturers more standards-conscious than a Windows-only world would have, especially since so many HW/embedded engineers run it.
> I was wasting tremendous amounts of time and energy debugging stuff very like this monitor issue, and getting no transferable benefit out of it
You got me there - in my case it led to a career of making cars, phones, game consoles and other stuff running Linux, so I guess the over-under on the direct utility shakes out a bit differently here :-)
That said, it's been a very long time since I've done any fiddling/debugging to make any HW work privately. Ironically, I've had a lot more issues making HW work correctly on the M1 Mac I also have, e.g. my (quirky) Bluetooth earbuds work a lot better on PipeWire than macOS ...
Linux* actually does have a tendency towards more hardline stances, simply because it's not answerable to market forces in quite the same way. For example, say 0.5% of monitors out there have dodgy edid values like this. For a commercial OS, that's a lot of unhappy people, many of whom have social media accounts, so "When displaying a new resolution, always start at max(60Hz, lowest available refresh rate) so if something's wrong people can at least see what's going on" might be a good rule of thumb for a commercial OS.
For Linux, however, the equivalent might be "report the bug so an exception can be made for this specific monitor". In other words, it's less important that everybody be immediately happy, as long as bugs can be reported and eventually fixed.
* Shorthand for any OSS kernel and its associated ecosystem
I’ve seen many more monitor compatibility issues under MacOS than Linux, and have plenty of older hardware that works with reverse engineered Linux drivers, but no longer works with Windows.
I think the common case is that vendors test with the current version of Windows, MacOS and/or Linux (in decreasing priority order), then hope that the hardware is EOL’ed before the drivers bit rot.
This reminds me of a recent complaint about some software which didn't work properly with a certain sync service, and the developer response was that the third-party service was buggy so it wasn't their problem.
Ok, but personally quite a lot of my job involves working around the bugs and quirks of old unmaintained Windows OSes. The customer rightly doesn't care, they just want your software to work. It is almost always possible to at least mitigate the problems of whatever buggy garbage hardware/software you have to deal with.
It would be nice to figure out why Windows and MacOS evidently didn't have the problem. If there's some additional probing that they're doing to catch the monitor out on its lies, Linux could do that too.
One thing I could imagine is that Linux is giving preference to using the values in the DisplayID block as it's the newer standard, and since EDID/DisplayID compliance has improved over time the logic may be "the newer one is more likely to be correct". In the meantime perhaps Win/Mac continue to look at the classic EDID data, and if they do, it likely gets less test coverage from manufacturers.
That's odd, I was definitely able to add a DisplayID 1.3 extension block (128 extra bytes appended) to a CRT monitor's 128-byte EDID data using CRU, then have Windows 11 see those extra resolutions (when the CRT was plugged into a DP-to-VGA adapter). Though I ended up switching to CTA-861 (HDMI extension blocks) with all the HDMI YCbCr/audio nonsense turned off, because 010Editor and edid2json could understand CTA-861 but not DisplayID (though I learned from this article that git://linuxtv.org/edid-decode.git, or https://git.linuxtv.org/edid-decode.git, has better support for EDID standards than the other tools).
Another oddity is that when I run edid-decode on the DisplayID 1.3 file created by CRU, edid-decode reports the block as "Version: 1.2" instead. Wonder what's up with that.
If that's also true of MacOS, that would mean LG made the effort of adding extra data that their tested systems didn't actually use and then got it wrong anyway, which would be funny.
I work in the automotive industry, where we often use fancy unusual screen hardware a couple of years before it turns up in home consumer electronics or phones. For example special multi-axis curved stuff, dynamic angular privacy filters, or haptic feedback using electrostatic modulation of resistance instead of vibration motors (that allows you to make the screen feel rough and scaly or glidy, give UI elements a feel-able shape, etc.).
One time, we were told to use earplugs at work for a few days, because of a pre-release firmware bug that could in theory, if other safety mechanisms also failed, cause the haptics to potentially emit an ear-piercing low-frequency tone ...
Temporary EDID bugs, otoh, I've seen so many times. :)
> One time, we were told to use earplugs at work for a few days, because of a pre-release firmware bug that could in theory, if other safety mechanisms also failed, cause the haptics to potentially emit an ear-piercing low-frequency tone ...
Wow, on-demand tinnitus is one hell of a failure mode.
What I imagine is that the engineers assigned to it started off from an EDID from some other display they have, made changes to it, tested on Mac and Windows, never tested on Linux.
Mere speculation on my part, since I have no familiarity with the HDMI or DisplayPort protocols:
1. Windows/macOS might have a "quirks table" that hardcodes fixes for spec-violating devices.
2. Windows/macOS might ignore some reported EDID values and derive their own when they determine that the display behaves differently from what is reported (e.g. by recording response packet timings).
Even modern packetized display interfaces like DisplayPort are still fundamentally a one-way firehose for the pixel data. There's no provision for acknowledging or retransmitting data packets, and forward error correction is optional. The bidirectional sideband channels used for stuff like EDID are not timing-sensitive like the pixel data stream.
It could be the failure mode is identifiable on the OS and Linux didn't add detection and fallback support.
Honestly though it could also be workarounds. Windows is king at making stuff like this work by hard coding overrides. E.g. a custom driver for this display.
I ran into the same problem as TFA in 2017 with an AOC g2460pf monitor. The display port would advertise binary garbage for the EDID that wouldn't checksum, so no OS would try and use it. AOC registered some "drivers" with Windows, so it would automatically download and apply a patch that would make that port usable, but they also included a CD with the stuff on it.
However they patched this on the Windows side was quite fragile, because it kept breaking after OS updates. After I could no longer get it working with the reinstall and pray method, I switched to Linux because of this issue. I fixed it once with an EDID in the initramfs, and haven't had any issues for the last 6 years.
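The "wouldn't checksum" failure above is the base sanity check every OS applies before trusting an EDID: a fixed 8-byte header, plus a checksum byte that makes each 128-byte block sum to zero mod 256. A toy sketch (the block here is fabricated, not a real EDID):

```python
# EDID validity boils down to two cheap checks per 128-byte block:
# the well-known 8-byte header magic, and a byte sum of 0 mod 256
# (byte 127 is a checksum chosen to make that true).

EDID_MAGIC = bytes([0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00])

def edid_block_ok(block: bytes) -> bool:
    return len(block) == 128 and sum(block) % 256 == 0

def edid_base_ok(edid: bytes) -> bool:
    return edid[:8] == EDID_MAGIC and edid_block_ok(edid[:128])

# Build a fake block and fix up its checksum byte.
block = bytearray(128)
block[:8] = EDID_MAGIC
block[127] = (256 - sum(block) % 256) % 256

print(edid_base_ok(bytes(block)))                 # True
print(edid_base_ok(bytes(block[:-1]) + b"\x00"))  # False: bad checksum
```

A single flipped byte anywhere in the block fails the check, which is why an OS seeing garbage like the AOC's simply discards the whole thing rather than guessing.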
Sure, but this is exactly equivalent to saying "This is why I stopped using Firefox for web browsing" in 2005 and sticking to MSIE 4/5 and its take on standards, because the websites always work.
Reasonable and practical, but which direction did get us more progress for the web?
EDID exists for a reason and is a good thing; monitors providing reliable, useful EDID data is something to strive for.
It's gotten a lot better over the years as the result of operating systems using and enforcing it more. Monitors with bogus/garbage EDID used to be a lot more common 10+ years ago.
It's not at all hard to find EDID bugs that affect Windows and macOS, especially where variable refresh rate or HDR or 10-bit color or DSC are involved. And in those situations, working around the monitor bug tends to be just as hard (Windows) or impossible (Mac).
Indeed my Dell is treated with some absurd sharpening filters because EDID says it’s a TV and macOS wants to "help". IIRC, there was a similar story posted to HN about it, involving debugging EDID and applying an override.
Basically, they tested it on these 2 and then shipped it. If it had failed on win or osx, LG would not have shipped it.
At first, this seems a reason not to use Linux, but a future upgrade of win/osx will break your hardware. Tons of hardware gets obsoleted this way. Meanwhile, Linux will just keep on working.
> Basically, they tested it on these 2 and then shipped it. If it had failed on win or osx, LG would not have shipped it.
Yep. I first and foremost look for Linux support explicitly called out in the system requirements. If I can't find it, I look for reviews mentioning Linux.
But explicitly supported is always the best, ideally backed by reviews that confirm it.
I don’t think that’s actually correct either. The monitor is advertising Freesync/GSync variable refresh rate. The Ultragear monitors are high end gaming monitors so they come with 48-240Hz VRR on some monitors. That’s the 48-144hz extension in the EDID of this monitor.
Based on debugging some issues I was having in macOS, I discovered from various Reddit threads and forums that Linux doesn’t handle that well depending on your driver stack/how you connect. If you’re on very modern hardware, kernels, and the OEM drivers it’s a better situation. macOS also has partially broken support for Freesync (or ProMotion as Apple brands it) at the moment on Intel machines.
I have an Asus GSync monitor with 48-144hz compatibility. This works over DisplayPort on my M2 Pro MacBook. On my RX6800 equipped Intel Mac I get 100Hz over DisplayPort 1.4 and 120Hz over HDMI 2.0 due to a DSC bug affecting Intel Macs. Even without that bug, 120Hz sometimes presents the OP’s issues between startup and login so I use 100Hz. The betas for the next macOS apparently fix this.
Huh. This reminds me of why the world so needs Linux. That they could just take off-the-shelf tools & poke around for a bit, learning & understanding a complex situation with crude & fast debugging, knowing only a little of the internals, and in the end clearly improve their own situation. And then they could share that knowledge with others in such a clear manner.
Nothing else in computing is like this. We just cannot help ourselves & each other in most realms of computing: we must be content with what we are given, as it is.
In all probability the linux system either doesn't have a GPU capable of running this high a pixel clock, or the cable/connector can't handle it. I'd love to know what the Mac & windows machines do; do they run 60Hz too, or are they pushing all 144Hz here successfully? This seems very likely to be a cable issue, one I don't expect windows or Mac to handle particularly well either.
I have experienced pixel clock errors on Linux, but can't say in this case if the monitor, cable, or GPU is unable to handle full resolution at 144hz. The CTA-861 (HDMI metadata) block contains a mode "3440x1440 99.990 Hz", or 1440p100 with a pixel clock of 543.5 MHz. I don't know if this 100hz mode functions on Windows or not. The DisplayID block instead contains a 144hz mode with a pixel clock of 799.750 MHz. Both of these modes may or may not be within DisplayPort bandwidth limits, depending on the link rate and bits per pixel (this EDID says "Bits per primary color channel: 10"), and may or may not be supported by the display.
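A back-of-envelope check of those two modes against DisplayPort link rates, assuming 4 lanes and 8b/10b encoding (which applies up through HBR3); DP packing overhead is ignored, so treat the numbers as rough:

```python
# Raw bandwidth needed by a mode vs. usable DisplayPort link bandwidth.
# Pixel clocks are the EDID figures quoted above; everything else
# (4 lanes, 8b/10b at 80% efficiency) is a simplifying assumption.

def required_gbps(pixel_clock_mhz: float, bits_per_channel: int = 10) -> float:
    """Raw RGB video bandwidth in Gbit/s (3 color channels, no overhead)."""
    return pixel_clock_mhz * bits_per_channel * 3 / 1000

# Usable data rate = lanes * per-lane line rate * 0.8 (8b/10b)
HBR2 = 4 * 5.4 * 0.8   # 17.28 Gbit/s
HBR3 = 4 * 8.1 * 0.8   # 25.92 Gbit/s

print(f"100 Hz mode: {required_gbps(543.5):.2f} Gbit/s")   # fits in HBR2
print(f"144 Hz mode: {required_gbps(799.75):.2f} Gbit/s")  # needs HBR3
```

By this estimate the 100hz mode (16.3 Gbit/s) fits an HBR2 link, while the 144hz mode (24.0 Gbit/s) needs HBR3 at 10 bpc - and even at 8 bpc (19.2 Gbit/s) it still wouldn't fit HBR2, consistent with "depending on the link rate and bits per pixel".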
I do know that Linux X11 (amdgpu kernel driver, modesetting X11 driver) tends to drive my DVI 1080p display with too high of a pixel clock (too large blanking intervals) when connected over a HDMI-to-DVI cable from my GPU. I believe this is because there's actually a duplication of mode selection logic between the amdgpu kernel driver and X11. I've reported another (system hang) amdgpu/X11 resolution bug at https://gitlab.freedesktop.org/xorg/driver/xf86-video-amdgpu... with no progress towards being resolved so far. Neither bug appears on Wayland, but mainstream Wayland desktop environments (KDE/GNOME) do not allow adding custom resolutions through xrandr without overriding EDID files and either rebooting for the kernel to see it, or touching files in /proc/ (untested).
Funny... I am trying to learn how to patch my own debian stable kernel with an (incomplete) kernel patch submitted for adjusting the backlight on an Apple Studio Display. The monitor - to my surprise - works really great with my Debian 12 workstation via a displayport->usb-c converter cable. But (in typical Apple fashion) there is not a single button on the display to adjust brightness and the kernel doesn't support this monitor.
But I haven't been able to make the time to sit down and 1. respond to the feedback (original author hasn't yet) and 2. try and apply it to the otherwise standard Debian kernel.
Aside from this issue, linux on the desktop has been quite pleasant, and of course this issue is not the fault of Linux per se. Deb12/KDE/Wayland/AMDGPU
The kernel doesn't necessarily need to know about every monitor specifically, there's lots of standards that divide this space into broad categories. But some hardware does custom things that need individual drivers, e.g. use the USB channels to announce a custom device with its own proprietary protocol.
Sometimes this may be intentional design to lower compatibility and cause these problems intentionally, but often it is just bad engineering, not malice.
> The Apple Studio Display does not have any physical buttons and the only
> way to get or set the brightness is by sending USB control transfers to a
> HID device exposed by the display.
Sounds like an abstraction is appropriate. We could call it a "device driver". Is there a reason so many things get baked into the kernel, rather than using, say, kernel modules?
Problem is, regular DDC utils don't know how to communicate with this particular display.
Joke is on me, I shouldn't have spent $1,500 on a display that is pretty much exclusively designed to be connected to a Mac. But I was using it with a Mac for a while before building this new Linux box.
Relatively few things need to actually be baked into the kernel. Most features in the kernel codebase can be built as a module at will, especially drivers.
But it's the codebase that you patch, so "patching the kernel" doesn't mean the OP ruled out building it as a module in the end. Even though they may well be able to just build it against kernel headers and even load it at runtime.
Also it should be trivial to run this as an out-of-tree module until it's merged.
With DKMS, or put hid_bl.c into an empty directory and add the following makefile:
```
ifneq ($(KERNELRELEASE),)
# kbuild part of makefile
obj-m := hid_bl.o
else
# normal makefile
KDIR ?= /lib/modules/`uname -r`/build

default:
	$(MAKE) -C $(KDIR) M=$$PWD modules

clean:
	$(MAKE) -C $(KDIR) M=$$PWD clean
endif
```
Then `make` in that directory builds hid_bl.ko, which you can load with insmod/modprobe.
Julius has responded to comments as recently as a week ago and is now working on a (hopefully mergable) v4.
You will probably see it in 6.7 and probably backported to stable kernels in a few months.
It will support any monitor which uses this USB control scheme (presumably a few other apple monitors).
So don't worry about making the time to address the comments. The author seems to be on top of it. Two to three months for a medium sized driver patch (such as this) to be merged seems about right to me.
If you’re on a Mac and using a Dell monitor (I know it does it for the Dell Ultrasharp, not sure if it’s for all Dells), check your monitor settings to see if the computer is actually sending RGB to the monitor. Chances are, even over HDMI, it’s using YPbPr. On every mac I’ve ever used (20 or more over the past 10 years), with every Dell monitor (at least 5), the mac sends the display signal as YPbPr instead of RGB. Just Google “Macos Dell YPbPr” and you can find complaints going back 10+ years.
I get a lot of computers to use for a brief period of time (consultant), and fixing that on Macbooks every couple months is non-trivial. Sometimes it’s impossible if it’s an older version of MacOS because older versions required overriding system files and booting into single-user mode.
This debugging and hacking kind of reminds me of Windows, and dealing with various versions of drivers, service packs, and so on.
A normal, sane Linux user would just force the resolution in the bootloader and create a global xorg.conf. There are like 10 far easier ways to solve this problem.
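Those two quick fixes sketched out (the connector name and mode here are illustrative, and note the earlier caveat that X11 may call the output HDMI-1 while the kernel calls it HDMI-A-1):

```
# kernel command line (drm modedb format): force a mode on one output at boot
video=HDMI-A-1:3440x1440@60

# /etc/X11/xorg.conf.d/10-monitor.conf: pin the preferred mode in X
Section "Monitor"
    Identifier "HDMI-1"
    Option     "PreferredMode" "3440x1440"
EndSection
```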
Point is - I don't want to do any of this on my personal computer after work. I just want to plug a monitor in and... It works. Don't care why, don't care if windows/Mac has some quirks table. I debug enough stuff at work.