Hacker News | CaseFlatline's comments

I am trying to find how the synthetic data was created (looking through the repo) and didn't find it. Maybe I am missing it. I would love to see the prompts and process for that aspect of the training data generation!


It's here:

https://github.com/arman-bd/guppylm/blob/main/guppylm/genera...

Uses a sort of mad-libs templatized style to generate all the permutations.
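
For anyone who wants a feel for what "mad-libs templatized" generation looks like, here is a minimal sketch. It is not the code from the linked repo: the templates, the slot names, and the `expand` helper are all made up for illustration. It just takes the Cartesian product of the slot fillers for each template.

```python
from itertools import product

# Hypothetical templates and slot fillers, for illustration only.
# None of these strings come from the linked repository.
TEMPLATES = [
    "the {animal} likes to {verb} in the {place}",
    "why does the {animal} {verb} near the {place}?",
]
SLOTS = {
    "animal": ["guppy", "snail"],
    "verb": ["swim", "hide"],
    "place": ["tank", "pond", "reeds"],
}


def expand(templates, slots):
    """Yield every template filled with every combination of slot values."""
    names = list(slots)
    for template in templates:
        for values in product(*(slots[name] for name in names)):
            yield template.format(**dict(zip(names, values)))


sentences = list(expand(TEMPLATES, SLOTS))
```

With 2 templates and 2 x 2 x 3 filler combinations this produces 24 sentences, which shows how quickly a small vocabulary multiplies into a sizable dataset.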


Oh my! Thanks for the memories - HPUX was my first workstation-class Unix operating system (sili-g's were too expensive). I remember downloading and compiling gcc on HPUX. The idea of compiling a compiler with itself blew my mind!


Very nice. It's great to see how fast it boots, and it can run Doom (framebuffer): https://www.youtube.com/watch?v=Ce1pMlZO_mI (also nice to see the dev take the time to reply to an aspiring CS student on what it takes to grow in this field, in the YouTube comments)


Nice and short video. Demonstrates Vim, which seems like a sizable piece of software to get compiled with a subset of all Linux syscalls.


A feel-good movie to watch with the kids : Paper Planes - https://www.imdb.com/title/tt3328716/


Could someone explain why/how NBD is better than just using a Linux host as an iSCSI target? Googling NBD vs iSCSI shows old articles with no real solid conclusion.


It's not really "better" or "worse".

NBD is an extremely simple protocol. Read range, write range, delete range, sync -- that's it. If you want to implement an NBD server from scratch, you can totally do so in an afternoon. I have done this and use it in production: https://github.com/sandstorm-io/blackrock/blob/master/src/bl...
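
To give a sense of how small the transmission phase is, here is a rough sketch of a request handler over an in-memory "disk". This is not the code from the link above. It assumes the classic 28-byte request header and 16-byte simple reply from the NBD protocol document, skips the handshake/negotiation phase entirely, and does no bounds checking. The `handle_request` name and the bytearray backing store are my own choices for illustration.

```python
import struct

NBD_REQUEST_MAGIC = 0x25609513
NBD_SIMPLE_REPLY_MAGIC = 0x67446698
# Command numbers: read, write, disconnect, flush (sync), trim (delete range).
NBD_CMD_READ, NBD_CMD_WRITE, NBD_CMD_DISC, NBD_CMD_FLUSH, NBD_CMD_TRIM = range(5)

REQUEST = struct.Struct(">IHHQQI")  # magic, flags, type, handle, offset, length
REPLY = struct.Struct(">IIQ")       # magic, error, handle


def handle_request(disk: bytearray, header: bytes, payload: bytes = b"") -> bytes:
    """Apply one NBD request to an in-memory disk and return the reply bytes."""
    magic, _flags, cmd, handle, offset, length = REQUEST.unpack(header)
    if magic != NBD_REQUEST_MAGIC:
        raise ValueError("bad request magic")
    ok = REPLY.pack(NBD_SIMPLE_REPLY_MAGIC, 0, handle)
    if cmd == NBD_CMD_READ:
        # A successful read reply is followed by the requested bytes.
        return ok + bytes(disk[offset:offset + length])
    if cmd == NBD_CMD_WRITE:
        if len(payload) != length:
            raise ValueError("payload does not match requested length")
        disk[offset:offset + length] = payload
        return ok
    if cmd == NBD_CMD_TRIM:
        # Assumption: model a trimmed range as zeroes.
        disk[offset:offset + length] = bytes(length)
        return ok
    if cmd == NBD_CMD_FLUSH:
        return ok  # nothing to flush for an in-memory store
    raise ValueError(f"unsupported command {cmd}")
```

A real server would read these headers off a socket in a loop and handle NBD_CMD_DISC by closing the connection, but the core dispatch is about this size.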

iSCSI is comparatively far more complex. It's a TCP-based adaptation of the SCSI protocol, which has existed for decades as a way to talk to hard drives. As I understand it, you can pass arbitrary SCSI commands over iSCSI; see: https://en.wikipedia.org/wiki/SCSI_command iSCSI is enterprise-y and has a bigger ecosystem. You can netboot a diskless machine into Windows over iSCSI (I do this: http://kentonsprojects.blogspot.com/2011/12/lan-party-house-...).

Personally I like NBD a lot better because the simplicity means you can build new, cool things with it. But there are others who would say that NBD is a toy compared to iSCSI.


You might enjoy a bunch of toys I wrote to play around with NBD a little while ago: https://github.com/regularfry/tinynbd

I would apologise for the code, but "how small can I make this" was sort of the point...


Did you encounter any problems with NBD caching, where it acknowledges the write to the application but hasn't yet passed it to your "backend", leaving no room for error handling if that backend goes away?


NBD provides a virtual block device, so all the normal filesystem caching the kernel does above a hard drive applies to NBD as well. This is good: this is what makes it so fast.

Just because `write()` returned successfully does not mean that the data has been written to disk (whether you're using NBD or otherwise). The application needs to call `fsync()` to force writes to disk and get confirmation of success. An `fsync()` will send all pending NBD_CMD_WRITEs followed by NBD_CMD_FLUSH and will only return success when all of these have completed successfully.
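
On the application side, the pattern looks roughly like this. It is a minimal sketch using Python's `os` wrappers around the same syscalls. The `durable_write` helper name and the temp-file path are hypothetical, and nothing here is specific to NBD.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Write data to path and return only after fsync() reports success."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write() may write fewer bytes than asked; loop until done.
            written = os.write(fd, view)
            view = view[written:]
        # Up to here the data may live only in the page cache.
        # fsync() raises OSError if the device reports a failure.
        os.fsync(fd)
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "example.bin")
    durable_write(target, b"hello")
```

The point is that error handling belongs around the `fsync()` call: a failing backend shows up there as an exception, not at `write()` time.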


Gut feeling: NBD vs. iSCSI is like NFS vs. SMB. It's not that it has magical features, but it is more integrated, more purpose-built, and built-in (as in, in the kernel).


Development-wise it's a much simpler protocol. iSCSI has a lot of its own complexity plus the SCSI complexity to implement, while NBD has a reasonably short RFC-style document.


Don't see it offhand so asking:

1) How/where are you storing the index?

2) Have you tried this on large (30+ TB) filesystems?


even without an index, having a way to project declaratively instead of relying on cut/sed is giving me hot flashes.


Truly an impressive feat:

"When you do things right, people won't be sure you've done anything at all." - Futurama


Dear https://en.wikipedia.org/wiki/The_Asylum, please use this department for your next SyFy end-of-the-world B movie.


Slightly misleading on the source of the profits. It's important to read this section of the article:

"But the company's real profits are derived from a lesser-known side of the business: property development."

and

"Here's how it works: MTR enjoys a special relationship with the Hong Kong government, which is also its majority shareholder. The government provides land -- at no cost -- for use by the train operator, and MTR is then allowed to develop the areas above and around its stations."

So the government loans out land (which is crazy expensive in Hong Kong) for free, and the MTR gets to keep the profits from leasing out that land via malls, etc.


I vote for Weird Al to sing the song at the end.

http://techreport.com/review/27909/the-ssd-endurance-experim...

