Learning with Not Enough Data: Semi-Supervised Learning (lilianweng.github.io)
145 points by picture on Dec 7, 2021 | 19 comments


Time and time again, this blog does not fail to impress. I especially liked her piece on diffusion models from earlier this year; it was a very nice, simplified treatment of a complex topic that named some of the most important papers and contributions of the last few years. All the while, the blog wasn't overly simplified the way other blogs all too often are (not providing key derivations of formulas, discussing topics only at a glance, reading more like a PR piece than an actual informational blog).


Agree. She had a very informative tutorial session yesterday on self-supervised learning at NeurIPS-2021. While I don't think the recording is publicly available [1], the slides are [2].

[1] https://nips.cc/virtual/2021/tutorial/21895

[2] https://nips.cc/media/neurips-2021/Slides/21895.pdf


> Time and time again, this blog does not fail to impress

"This is an impressive blog" (I agree!)

I just wanted to make sure everyone else glancing through gets your intended message, because I had to read it twice.


Interesting, I also initially read it with a negative impression, e.g. "this blog constantly fails to impress me", even though that's the opposite of what the sentence says.

Not to derail the topic, but does anyone have any insight on why that might be? Pretty sure it's fine, idiomatic English. Am I just primed to expect negative criticism in HN comments? :/


https://www.grammarly.com/blog/3-things-you-must-know-about-... Apparently it's an "unnatural aberration", so you were right to read it with a negative impression.


I had to re-read that line as well. I have huge respect for Lilian's work, so this line just didn't make any sense to me, forcing me to read it carefully!


On second glance, the double negative in my comment does make it a bit hard to read. Sorry about that!


Here's a list of her other interesting posts.

https://lilianweng.github.io/lil-log/archive.html


As someone who worked with these techniques a lot in the past, I can say that SSL definitely makes sense in theory, but in practice the gains rarely pay off the complexity, except in rare cases such as pseudo-labelling, which is very simple. Usually you tune a lot of hyperparameters and tricks to make it work, and the gains are minimal if you have a reasonable amount of labeled data.
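For readers unfamiliar with pseudo-labelling: scikit-learn ships a self-training wrapper that implements exactly this loop (fit on labeled data, pseudo-label confident unlabeled predictions, refit). A minimal sketch on synthetic data, purely illustrative:

```python
# Minimal pseudo-labelling sketch using scikit-learn's SelfTrainingClassifier.
# Unlabeled samples are marked with -1; on each iteration, predictions above
# the confidence threshold are added back as pseudo-labels.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

X, y = make_classification(n_samples=500, random_state=0)
y_partial = y.copy()
rng = np.random.default_rng(0)
y_partial[rng.random(len(y)) < 0.9] = -1   # hide ~90% of the labels

model = SelfTrainingClassifier(LogisticRegression(max_iter=1000), threshold=0.9)
model.fit(X, y_partial)
print(model.score(X, y))                   # accuracy against the true labels
```

The `threshold` knob is the kind of hyperparameter the parent comment is talking about: set it too low and the model reinforces its own mistakes, too high and no pseudo-labels are ever added.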


That's true when your task is relatively simple and you can create labels covering most of the data cases. For long-tail problems like vision or NLP you can't label everything, and semi-supervised learning helps a lot.


A simple way to do SSL with not enough data is Lipschitz learning. It avoids the necessity for pseudo-labeling and computes an absolutely minimizing Lipschitz extension of the labels instead.

https://arxiv.org/abs/2111.12370

https://arxiv.org/abs/2012.03772

https://arxiv.org/abs/1901.05031

https://arxiv.org/abs/1710.10364
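To make the idea concrete: on an unweighted graph, the absolutely minimizing Lipschitz extension satisfies the discrete infinity-Laplace equation, u(i) = (max over neighbours + min over neighbours) / 2 at every unlabeled node, which can be solved by fixed-point iteration. A toy sketch on a five-node path graph (the graph and labels below are illustrative, not taken from the papers):

```python
# Toy Lipschitz learning: compute the absolutely minimizing Lipschitz
# extension of two boundary labels on a path graph 0-1-2-3-4 by iterating
# the discrete infinity-Laplace update u(i) = (max_j u(j) + min_j u(j)) / 2
# over the neighbours j of each unlabeled node i.
import numpy as np

neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
labels = {0: 0.0, 4: 1.0}          # only the endpoints are labeled

u = np.zeros(5)
for i, v in labels.items():
    u[i] = v

for _ in range(200):               # Jacobi-style fixed-point iteration
    new = u.copy()
    for i in range(5):
        if i not in labels:
            vals = [u[j] for j in neighbours[i]]
            new[i] = (max(vals) + min(vals)) / 2
    u = new

print(u)  # on a path graph this recovers linear interpolation of the labels
```

On a path graph the result is just linear interpolation between the two labels, which illustrates the "extension" behaviour; the papers above work with k-NN graphs built from the data and study consistency at very low label rates.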


I think the part of this that surprised me the most was learning that Self-Teaching actually... works? Not entirely sure why, but my first instinct when I was first getting into AI was that training a model on its own predictions would just... not provide any benefit for some reason. Well, today I learned otherwise! I love being proven wrong about stuff like this.


The simplest form of unsupervised learning or self-teaching is clustering. And yes, it works without labels.
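A tiny demonstration of that point: k-means on two well-separated synthetic blobs recovers the class structure without ever seeing a label (data and parameters here are illustrative):

```python
# Clustering recovers structure without labels: k-means on two
# well-separated Gaussian blobs (synthetic data, for illustration only).
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, y_true = make_blobs(n_samples=200, centers=2, cluster_std=0.5, random_state=0)
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Cluster ids are arbitrary, so compare agreement up to a label flip.
agree = (km.labels_ == y_true).mean()
print(max(agree, 1 - agree))  # close to 1.0 on well-separated blobs
```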


Very excited to read this series. Semi-supervised learning seems currently under-appreciated, especially in medicine.


>Semi-supervised learning seems currently under-appreciated, especially in medicine.

In medicine it would be appreciated more if it were more effective. Many times the right answer to "I don't have enough data to do X" is: don't do X.

I'm not entirely pessimistic on this, by the way. I think principled semi-supervised approaches are likely to work much better than some of the hail marys you see people try in the space with transfer learning, generative models, etc. But it's still hard, and often it just isn't going to work with the kind of practical sample sizes some people want to work with in medicine.


You're not wrong. My hunch, however, is that semi-supervised learning will help with some human-biased priors that are being implicitly used.


It is actually used a lot in the biomedical domain; however, the gains are minimal, quite different in practice from what you see in papers.


The abstract should read

Semi-supervised learning is one candidate, utilizing a large amount of unlabeled data in conjunction with a small amount of labeled data.


Anyone know what software/package she uses to make the diagrams?



