Hacker News | weeb_throwaway's comments

This list is similar: https://github.com/dimpurr/awesome-acg-machine-learning

It has a bit more focus on manga datasets like http://www.manga109.org/en/index.html


Tentacles.

You couldn't have discovered it as a fetish through scientific experimentation. It probably wasn't one until it was used to bypass Japanese censorship laws, which also accidentally conditioned the popular psyche to associate it with porn.

There are still a lot of ongoing breakthroughs of this kind in hentai. For example the genre of monster/anthropomorphic girls is currently exploding. I recently beat my meat to a world war 2 battleship in the shape of a girl.


Please don't do this here.


Selection bias is also apparent in ranking sites for shows where seasons/sequels are ranked individually.

For example in anime, Gintama appears 8 times within the top 50: https://myanimelist.net/topanime.php

It's not because it's a popular show. It's just that people who didn't like the first few episodes have already stopped watching! And it's polarizing enough that the only people who stick around for so many seasons are the ones who really love it. So it will get rated a 10 even for mediocre content.
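To make the dropout effect concrete, here is a minimal simulation sketch. Everything in it is an assumption for illustration (the taste distribution, the drop threshold, the function name `simulate_season_ratings`); it is not based on MyAnimeList data. The show's quality never changes between seasons, yet the average rating climbs because only viewers who rated the previous season highly are still around to vote.

```python
import random

def simulate_season_ratings(n_viewers=10_000, n_seasons=5, drop_below=6.0, seed=0):
    """Toy model: constant show quality, viewers drop out after a low rating."""
    rng = random.Random(seed)
    clamp = lambda x: min(10.0, max(1.0, x))
    # Hypothetical assumption: each viewer has a fixed taste for the show (1-10).
    viewers = [clamp(rng.gauss(6.0, 2.0)) for _ in range(n_viewers)]
    results = []
    for season in range(1, n_seasons + 1):
        if not viewers:
            break
        # Every remaining viewer rates the season: taste plus some noise.
        ratings = [clamp(t + rng.gauss(0.0, 1.0)) for t in viewers]
        results.append((season, len(ratings), sum(ratings) / len(ratings)))
        # Viewers who rated this season below the threshold stop watching.
        viewers = [t for t, r in zip(viewers, ratings) if r >= drop_below]
    return results

for season, n, mean in simulate_season_ratings():
    print(f"season {season}: {n} raters, mean rating {mean:.2f}")
```

In this toy model the number of raters shrinks every season while the mean goes up, which is the same pattern as a long-running sequel outranking its own first season.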


This barely scratches the surface of IME complexity. For example, people in Hong Kong mainly speak Cantonese, a different variety of Chinese, so pinyin (Mandarin phonetics) is useless there.

Most people type with Cangjie instead: https://en.wikipedia.org/wiki/Simplified_Cangjie


Another issue with CJK is vertical right-to-left writing: https://www.w3.org/International/articles/vertical-text/

It's pretty popular in Japanese novels/manga and in traditional Chinese text, but it's really hard to do without bugs on the web.


Hadn't fully registered till this comment, the degree to which the modern web is anchored to horizontal (usually left-to-right) writing and the design patterns of vertical scrolling that come with that assumption.


It's not just the web, it's all of computing, even down to the hardware design. A vertical scroll wheel is standard on all mice. A horizontal scrolling method of some sort is not, and even on mice that include one, it's usually not as good as the vertical one (e.g. leaning the wheel left and right).


I think I refused to believe it too until Reddit, a San Francisco company, started banning people for posting fictional high school anime characters in bikinis (not even nudes).


I'm not quite sure what point you're trying to make here. But it sounds like you're trying to say the sexual objectification of children is okay, as long as they are fictional children with a tiny amount of body covering. Is that what you meant, or did you mean something else?


What do we mean by "okay"? How do we differentiate virtual murder (for example) from virtual pedophilia in a sufficiently rigorous way? You can say that rigor isn't required because Reddit can make whatever choices they like, but it's not at all clear that the necessary connection for sexual objectification of children is made when these images are posted. That presupposes that the viewer sees the image and real children in the same light, which current evidence gathered from Japanese fan communities does not support (see Galbraith and McLelland's work on this). This is why researchers in the field are sometimes skeptical about calling this material "child pornography".

Morally, one can differentiate between virtual murder and virtual pedophilia and condemn virtual pedophilia while consistently enjoying games and other media depicting murder - but as Gary Young pointed out in his piece on the Gamer's Dilemma, it requires us to accept moral relativism.


I don't know Danbooru, but on another NSFW Japanese comic site, 11500 of the 61973 English-translated works are tagged as lolicon. The majority of the remaining works feature teenagers at best (because most anime are in a high school setting).

So you're not crazy. These are really young looking. It is probably just representative of the dataset. The dataset itself wasn't chosen for any malicious reason, the people who like this stuff just happen to be very prolific artists so there's a machine learning scale amount of it.


Danbooru requires one to have a 'gold' account to see pictures tagged as lolicon or shotacon. That's a $20 one-time purchase. It's not clear to me that they are using such an account.


They used an open dataset (danbooru2018) which was scraped with a gold account.


Worth noting that according to the stats listed, neither lolicon nor shotacon make it into the top 19 tags.


I saw the section about not training from scratch (via transfer learning) in https://www.gwern.net/Faces#transfer-learning. The Holo example is really impressive!

How expensive is it in terms of labelled data and compute? Do you know if anyone tried this for just ahegao faces?


> How expensive is it in terms of labelled data and compute?

All the stuff you see in that page (except the BigGAN ones) is unconditional, no labels. You just dump the images in and it figures it out. StyleGAN does support labels via one-hot embedding as I understand it, but I don't know how to use it so none of my experiments use it. A few people have mentioned or used it, but there's no good documentation about how to make it work, so... For unconditional samples, it depends on how many you have and how different they are. You can see in the various examples transfer learning with a few hundred to a few thousand (with and without data augmentation).
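For anyone unfamiliar with the term, a one-hot label is just a vector with a single 1 at the index of the class. The sketch below only shows that encoding in plain Python; it is not StyleGAN's actual label-loading code or API, and the `classes` list is a made-up tag set.

```python
def one_hot(label, classes):
    """Return a vector with 1.0 at the position of `label` and 0.0 elsewhere."""
    index = classes.index(label)  # raises ValueError for an unknown label
    return [1.0 if i == index else 0.0 for i in range(len(classes))]

# Hypothetical tag set, purely for illustration.
classes = ["happy", "annoyed", "disgusted"]
print(one_hot("annoyed", classes))
```

A conditional model would receive a vector like this alongside each image; an unconditional one, as used for the samples on that page, takes images only.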

> Do you know if anyone tried this for just ahegao faces?

It's funny you ask that because I was corresponding with an anonymous who was using it for just that (and ball gags). He'd run into some issues with the encoder/editing functionality and wanted advice, but the regular transfer learning worked fine. He'd compiled a small dataset of a few hundred to a few thousand examples on his own, and it worked disturbingly well: sufficiently so I didn't want to write it up. (I try to keep my site SFW.)


I am trying out the app at https://waifulabs.com/ and the art style is kind of one-note. Most of the expressions are the same and the face shape skews towards loli.

I am more into "disgusted anime girl that looks at you like you're trash" type and I couldn't find a waifu (even with their refinement steps).

Really impressed that this is even possible though!


I've played with it a few times. It seems like they have you choose features in the space in the order of base -> palette -> art style (loosely) -> pose, but have locked some emotion controlling vector to be happy. Probably a reasonable step for their audience.


I think a large majority of anime art has happy expressions, so it might not have been anything that they had to do.

Edit: "disgusted anime girl that looks at you like you're trash" is the theme of a recent book that got adapted into anime, so it's a bit of a fad currently. I thought that was worth mentioning.


Sounds like there needs to be more tsundere in the data set.


https://imgur.com/a/ZFrliiO Here's one that's somewhat "annoyed" but not quite "disgusted" hm.

