Well, I wouldn't. But people do. It makes more sense to use a SAN IMHO. I'm told it's not uncommon to use NFS for Oracle. One interesting thing is that Oracle has its own NFS client implementation instead of trusting the kernel (it also does direct IO by default, though I'm not actually sure whether its NFS or DIO support came first).
https://www.postgresql.org/message-id/CAEepm=1FGo=ACPKRmAxvb...
You can also tweak that test so that ENOSPC is only discovered at close() time. At that point the system has thrown away data that PostgreSQL has already evicted from its own buffers, and the only way to get it back is to replay the WAL. That is what PANIC achieves, unpleasant as it is, especially if it just happens again, and again, ...
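To make the close() point concrete, here is a minimal sketch of a write path that treats write(), fsync() and close() as three separate places where an error such as ENOSPC can show up. It is not PostgreSQL's code: `durable_write` and `DurableWriteError` are hypothetical names made up for illustration, and the only real calls are the `os` wrappers around the system calls.

```python
import os


class DurableWriteError(Exception):
    """Hypothetical error type recording which stage reported the failure."""

    def __init__(self, stage, err):
        super().__init__(f"{stage} failed: {err}")
        self.stage = stage
        self.err = err


def durable_write(path, data):
    """Write data to path, checking write(), fsync() and close() separately."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    stage = "write"
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        stage = "fsync"
        os.fsync(fd)
    except OSError as e:
        try:
            os.close(fd)
        except OSError:
            pass  # the earlier error is the one worth reporting
        raise DurableWriteError(stage, e) from e
    try:
        # close() can fail too; ignoring its result hides a late ENOSPC.
        os.close(fd)
    except OSError as e:
        raise DurableWriteError("close", e) from e
```

A caller that sees `stage == "close"` is in the situation described above: the data has already left the application's buffers, so the error cannot be handled by simply retrying the write.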
The recent change in 11.2 adds a PANIC on error there. But I'm not sure it's sufficient on Linux NFS, because even on the tip of the master branch of Linux (by my inexpert drive-by reading, at least), the errseq_t stuff doesn't seem to have made it into the NFS client code, so it's still using the old single AS_EIO flag. That probably exposes at least one race that is discussed in this thread:
https://www.postgresql.org/message-id/flat/CA%2BhUKGKa-HtBHJ...
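A toy model may help show why a single shared error flag is racy while a per-descriptor error cursor is not. This is a deliberate simplification, not kernel code: the class names are invented, the "flag" model only mimics a test-and-clear bit like AS_EIO, and the "sequence" model ignores the finer details of the real errseq_t.

```python
class SingleFlagMapping:
    """Toy model: one shared error flag, cleared by whoever reports it first."""

    def __init__(self):
        self.error = False

    def writeback_failed(self):
        self.error = True

    def open(self):
        return SingleFlagHandle(self)


class SingleFlagHandle:
    def __init__(self, mapping):
        self.mapping = mapping

    def fsync(self):
        failed = self.mapping.error
        self.mapping.error = False  # test-and-clear: later callers see nothing
        return not failed


class SequencedMapping:
    """Toy model: an error counter; each handle remembers what it has seen."""

    def __init__(self):
        self.seq = 0

    def writeback_failed(self):
        self.seq += 1

    def open(self):
        return SequencedHandle(self)


class SequencedHandle:
    def __init__(self, mapping):
        self.mapping = mapping
        self.seen = mapping.seq  # sampled when the handle is opened

    def fsync(self):
        failed = self.mapping.seq != self.seen
        self.seen = self.mapping.seq
        return not failed


def race(mapping):
    """Two handles are open, writeback fails, then both call fsync in turn."""
    first, second = mapping.open(), mapping.open()
    mapping.writeback_failed()
    return first.fsync(), second.fsync()
```

In this model `race(SingleFlagMapping())` gives `(False, True)`, meaning the second caller is told everything is fine, while `race(SequencedMapping())` gives `(False, False)`. Note that in the sequenced toy a handle opened after the failure reports nothing, which is the intuition behind wanting the checkpointer's descriptor to be old enough.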
I think we need three things. First, we need to do something to make space allocation eager for NFS clients (a couple of concrete approaches are discussed), so that ENOSPC is excluded as a possibility after we have evicted data from PostgreSQL's buffers. Second, we need Linux NFS to adopt errseq_t behaviour. Third, we need PostgreSQL to adopt the "fd passing" design discussed on the pgsql-hackers mailing list, to make sure the checkpointer's file descriptor is old enough to see all relevant IO errors. Or we need direct IO.
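As a rough illustration of two of those ideas, the sketch below reserves space before writing and then hands the already-open descriptor to a second party that does the fsync. Everything here is an assumption for illustration: `preallocate`, `send_fd`, `recv_fd_and_fsync` and `demo` are made-up names, `posix_fallocate` is just one possible way to ask for space up front (not necessarily either approach from the thread, and whether it truly reserves space on a given NFS version and server is an open question), and both ends of the socket live in one process to keep the example short. It needs Python 3.9+ on a Unix system for `socket.send_fds`.

```python
import os
import socket


def preallocate(fd, size):
    """Ask for `size` bytes up front so ENOSPC can surface here (size > 0)."""
    os.posix_fallocate(fd, 0, size)


def send_fd(sock, fd):
    """Pass an open file descriptor to the peer over a Unix socket."""
    socket.send_fds(sock, [b"F"], [fd])


def recv_fd_and_fsync(sock):
    """Receive a descriptor and fsync it; an OSError here is a real IO error."""
    _msg, fds, _flags, _addr = socket.recv_fds(sock, 1, 1)
    fd = fds[0]
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def demo(path, payload):
    """Stand-in 'backend' writes, stand-in 'checkpointer' fsyncs the same file."""
    backend, checkpointer = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        preallocate(fd, len(payload))
        os.write(fd, payload)  # a single write is enough for a small demo payload
        # The received descriptor refers to the same open file description,
        # which was opened before the write took place.
        send_fd(backend, fd)
        recv_fd_and_fsync(checkpointer)
    finally:
        os.close(fd)
        backend.close()
        checkpointer.close()
```

The point of passing the descriptor rather than the file name is that the receiver does not open the file afresh, so it cannot miss an error that was recorded before its own open().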
TL;DR We are not out of the woods on NFS.