Learning with Not Enough Data: Semi-Supervised Learning (lilianweng.github.io)
145 points by picture on Dec 7, 2021 | 19 comments


Time and time again, this blog does not fail to impress. I especially liked her piece on diffusion models from earlier this year; it was a very nice, simplified treatment of a complex topic that named some of the most important papers and contributions of the last few years. All the while, the blog wasn't overly simplified the way other blogs all too often are (not providing key derivations of formulas, discussing topics only at a glance, reading more like a PR piece than an actual informational blog).


Agree. She had a very informative tutorial session yesterday on self-supervised learning at NeurIPS-2021. While I don't think the recording is publicly available [1], the slides are [2].

[1] https://nips.cc/virtual/2021/tutorial/21895

[2] https://nips.cc/media/neurips-2021/Slides/21895.pdf


> Time and time again, this blog does not fail to impress

"This is an impressive blog" (I agree!)

I just wanted to make sure everyone else glancing through gets your intended message, because I had to read it twice.


Interesting, I also initially read it with a negative impression, e.g. "this blog constantly fails to impress me", even though that's the opposite of what the sentence says.

Not to derail the topic, but does anyone have any insight on why that might be? Pretty sure it's fine, idiomatic English. Am I just primed to expect negative criticism in HN comments? :/


https://www.grammarly.com/blog/3-things-you-must-know-about-... Apparently it's an "unnatural aberration", so you were right to read it with a negative impression.


I had to re-read that line as well. I have huge respect for Lilian's work, so this line just didn't make any sense to me, forcing me to read it carefully!


On second glance, the double negative in my comment does make it a bit hard to read. Sorry about that!


Here's a list of her other interesting posts.

https://lilianweng.github.io/lil-log/archive.html


As someone who worked with these techniques a lot in the past, I can say that SSL definitely makes sense in theory, but in practice the gains rarely pay off the complexity, except in rare cases such as pseudo-labelling, which is very simple. Usually you tune a lot of hyperparameters and tricks to make it work, and the gains are minimal if you have a reasonable amount of labeled data.
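For readers unfamiliar with pseudo-labelling: scikit-learn ships a self-training wrapper that implements exactly this loop (fit on labeled data, pseudo-label confident unlabeled predictions, refit). A minimal sketch on synthetic data, purely illustrative:

```python
# Minimal pseudo-labelling sketch using scikit-learn's SelfTrainingClassifier.
# Unlabeled samples are marked with -1; on each iteration, predictions above
# the confidence threshold are added back as pseudo-labels.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

X, y = make_classification(n_samples=500, random_state=0)
y_partial = y.copy()
rng = np.random.default_rng(0)
y_partial[rng.random(len(y)) < 0.9] = -1   # hide ~90% of the labels

model = SelfTrainingClassifier(LogisticRegression(max_iter=1000), threshold=0.9)
model.fit(X, y_partial)
print(model.score(X, y))                   # accuracy against the true labels
```

The `threshold` knob is the kind of hyperparameter the parent comment is talking about: set it too low and the model reinforces its own mistakes, too high and no pseudo-labels are ever added.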


That's true when your task is relatively simple and you can create labels covering most of the data cases. For long-tail problems like vision or NLP you can't label everything, and semi-supervised learning helps a lot.


A simple way to do SSL with not enough data is Lipschitz learning. It avoids the necessity for pseudo-labeling and computes an absolutely minimizing Lipschitz extension of the labels instead.

https://arxiv.org/abs/2111.12370

https://arxiv.org/abs/2012.03772

https://arxiv.org/abs/1901.05031

https://arxiv.org/abs/1710.10364
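To make the idea concrete: on an unweighted graph, the absolutely minimizing Lipschitz extension satisfies the discrete infinity-Laplace equation, u(i) = (max over neighbours + min over neighbours) / 2 at every unlabeled node, which can be solved by fixed-point iteration. A toy sketch on a five-node path graph (the graph and labels below are illustrative, not taken from the papers):

```python
# Toy Lipschitz learning: compute the absolutely minimizing Lipschitz
# extension of two boundary labels on a path graph 0-1-2-3-4 by iterating
# the discrete infinity-Laplace update u(i) = (max_j u(j) + min_j u(j)) / 2
# over the neighbours j of each unlabeled node i.
import numpy as np

neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
labels = {0: 0.0, 4: 1.0}          # only the endpoints are labeled

u = np.zeros(5)
for i, v in labels.items():
    u[i] = v

for _ in range(200):               # Jacobi-style fixed-point iteration
    new = u.copy()
    for i in range(5):
        if i not in labels:
            vals = [u[j] for j in neighbours[i]]
            new[i] = (max(vals) + min(vals)) / 2
    u = new

print(u)  # on a path graph this recovers linear interpolation of the labels
```

On a path graph the result is just linear interpolation between the two labels, which illustrates the "extension" behaviour; the papers above work with k-NN graphs built from the data and study consistency at very low label rates.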


I think the part of this that surprised me the most was learning that Self-Teaching actually... works? Not entirely sure why, but my first instinct when I was first getting into AI was that training a model on its own predictions would just... not provide any benefit for some reason. Well, today I learned otherwise! I love being proven wrong about stuff like this.


The simplest form of unsupervised learning or self-teaching is clustering. And yes, it works without labels.
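A tiny demonstration of that point: k-means on two well-separated synthetic blobs recovers the class structure without ever seeing a label (data and parameters here are illustrative):

```python
# Clustering recovers structure without labels: k-means on two
# well-separated Gaussian blobs (synthetic data, for illustration only).
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, y_true = make_blobs(n_samples=200, centers=2, cluster_std=0.5, random_state=0)
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Cluster ids are arbitrary, so compare agreement up to a label flip.
agree = (km.labels_ == y_true).mean()
print(max(agree, 1 - agree))  # close to 1.0 on well-separated blobs
```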


Very excited to read this series. Semi-supervised learning seems currently under-appreciated, especially in medicine.


>Semi-supervised learning seems currently under-appreciated, especially in medicine.

In medicine it would be appreciated more if it were more effective. Many times the right answer to "I don't have enough data to do X" is: don't do X.

I'm not entirely pessimistic on this, by the way. I think principled semi-supervised approaches are likely to work much better than some of the hail marys you see people try in the space with transfer learning, generative models, etc. But it's still hard, and often it just isn't going to work with the kind of practical sample sizes some people want to work with in medicine.


You're not wrong. My hunch, however, is that semi-supervised learning will help with some human-biased priors that are being implicitly used.


It is actually used a lot in the biomedical domain; however, the gains are minimal, quite different in practice from what you see in papers.


The abstract should read

Semi-supervised learning is one candidate, utilizing a large amount of unlabeled data in conjunction with a small amount of labeled data.


Anyone know what software/package she uses to make the diagrams?



