
This seems to be the best here. As a side note: if someone does something more complicated and pipes find output into xargs, two options are very important: -print0 for find and -0 for xargs. They delimit names with a binary zero (NUL), so names containing spaces or newlines are not split apart.
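To illustrate the side note, here is a minimal sketch that counts how many arguments xargs sees with and without NUL delimiting. It assumes a find and xargs that support -print0 and -0 (GNU and BSD both do); the scratch directory and file names are made up for the demo.

```shell
#!/bin/sh
# Hypothetical demo: a scratch directory holding three awkward file names.
demo_dir=$(mktemp -d)
touch "$demo_dir/plain.txt" "$demo_dir/with space.txt" "$demo_dir/with
newline.txt"

# NUL-delimited: every name reaches the child shell as exactly one argument.
safe_count=$(find "$demo_dir" -type f -print0 | xargs -0 sh -c 'echo "$#"' sh)

# Default delimiting: xargs splits names on blanks and newlines.
unsafe_count=$(find "$demo_dir" -type f | xargs sh -c 'echo "$#"' sh)

echo "with -print0/-0: $safe_count arguments"
echo "without:         $unsafe_count arguments"

rm -rf "$demo_dir"
```

The first count should match the number of files created; the second should come out higher because the space and the newline each split a name in two.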

Very interesting article: https://www.dwheeler.com/essays/fixing-unix-linux-filenames.....



I've been writing an `sh`-based tool to check up on my local Git repos, and it uses \0-delimited paths and a lot of `find -print0` + `xargs -0`:

https://gitlab.com/willemmali-sh/chegit/blob/master/chegit#L...

I admit the code can look a little weird, but that is because I had some rather tight constraints: 1 file, all filenames `\0`-separated internally, and just POSIX `sh`. I still wanted to reuse code and properly quote variables inside `xargs` invocations (because `sh` does not support `\0`-separated reads), so I ended up having to basically paste function definitions into strings and use some fairly extensive quotation sequences.
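For readers unfamiliar with the trick, this is roughly what pasting a function definition into a string for `xargs` can look like. It is only a sketch of the general technique, not the code from chegit; `describe` and `describe_def` are hypothetical names.

```shell
#!/bin/sh
# Hypothetical helper, kept as text so it can be pasted into a child shell.
describe_def='describe() { printf "<%s>\n" "$1"; }'

# Define it in the current shell as well, so the same code is reused here.
eval "$describe_def"

# xargs cannot call shell functions, so each batch starts a new sh,
# redefines the function there, and loops over the NUL-delimited arguments.
result=$(printf 'a b\0c\0' |
    xargs -0 sh -c "$describe_def"'
for item in "$@"; do
    describe "$item"
done' sh)

printf '%s\n' "$result"
```

The trailing `sh` fills `$0` in the child shell, so the real items start at `$1` and `"$@"` covers all of them with their whitespace preserved.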


Nice plug for gitlab ;).

\0 is an insanely useful separator for this sort of thing and yeah, it definitely gets messy. I'm working on a similar project that uses clojure/chef to read proc files in a way that causes as little overhead as possible. \0 makes life so much easier. The best example I can think of off the top of my head is something similar to:

  bash -c "export FOO=1 ; export BAR=2 && cat /proc/self/environ | tr '\0' '\n' | egrep 'FOO|BAR'"
  FOO=1
  BAR=2


I was so freaked out at the news. I normally have local backups of my projects, but I just happened to be in the middle of a migration where my code was only on Gitlab, and then they went down... Luckily it all turned out OK.

\0 is very useful but I really wish for an updated POSIX sh standard with first-class \0 support.

On your code: why do you replace \0's with newlines? egrep (GNU grep, at least) has the -z flag, which makes it accept \0-separated input. A potential downside is that -z also makes the output \0-separated. That is part of -z itself, not the -Z flag, which only changes the byte printed after file names.
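A small sketch of the -z approach, assuming GNU grep. The sample variables are hypothetical stand-ins for the contents of /proc/self/environ, and the final tr is only there to make the \0-terminated output readable.

```shell
#!/bin/sh
# GNU grep assumed: -z treats input and output records as NUL-terminated.
# The printf data is a made-up stand-in for /proc/self/environ.
matches=$(printf 'FOO=1\0BAZ=3\0BAR=2\0' |
    grep -z -E '^(FOO|BAR)=' |
    tr '\0' '\n')

printf '%s\n' "$matches"
```

Filtering before the conversion means `^` anchors to the start of each \0-terminated record, so a value that happens to contain a newline is not mistaken for a separate variable.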

I solved the "caller might use messy newline-separated data"-problem by having an off-by-default flag that makes all input and output \0-separated; this is handled with a function called 'arguments_or_stdin' (which does conversion to the internal \0-separated streams) and 'output_list' (which outputs a list either \0- or \n-separated depending on the flag).
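As a rough idea of how such a pair of functions could look, here is a hypothetical reconstruction. The function names come from the comment above, but the bodies and the `zero_mode` variable are assumptions rather than the actual chegit code.

```shell
#!/bin/sh
# Hypothetical reconstruction; not the code from chegit.
# zero_mode is an assumed name for the off-by-default flag.
zero_mode=0

# Produce the internal NUL-separated stream from arguments if any are given,
# otherwise from stdin (converting newlines unless zero_mode is on).
arguments_or_stdin() {
    if [ "$#" -gt 0 ]; then
        printf '%s\0' "$@"
    elif [ "$zero_mode" -eq 1 ]; then
        cat
    else
        tr '\n' '\0'
    fi
}

# Print the internal NUL-separated stream in the format the caller asked for.
output_list() {
    if [ "$zero_mode" -eq 1 ]; then
        cat
    else
        tr '\0' '\n'
    fi
}

listing=$(arguments_or_stdin "repo one" "repo two" | output_list)
piped=$(printf 'a\nb\n' | arguments_or_stdin | output_list)
printf '%s\n' "$listing" "$piped"
```

With the flag off, callers see ordinary newline-separated lists while everything between the two functions stays \0-separated.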



