
With SD you can use LoRA, and "artist style" is, to my understanding, the easiest thing to make a model of. With LoRA (Low-Rank Adaptation) you caption everything in each training image that is not what you want the model to learn. For a style, you can just use BLIP/CLIP or deepbooru to caption all of your images. At worst, you might have to remove other styles/artists from the tags. The model will then only learn the style, supposedly. I don't follow art enough to have a favorite style, I don't have enough source material on hand to do a style model, and I don't know enough to go out and grab a dataset.
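For the "remove other styles/artists in the tags" step, here is a rough sketch of how I'd clean up auto-generated captions. It assumes comma-separated tag captions like a booru-style tagger emits; `strip_style_tags` and the `BLOCKED` list are made-up names for illustration, not part of any tagger or trainer.

```python
from __future__ import annotations


def strip_style_tags(caption: str, blocked: set[str]) -> str:
    """Drop blocked tags from a comma-separated caption, keeping the order."""
    tags = [t.strip() for t in caption.split(",")]
    # Compare case-insensitively; entries in `blocked` should be lowercase.
    kept = [t for t in tags if t and t.lower() not in blocked]
    return ", ".join(kept)


# Hypothetical blocklist: fill in whatever artist/style tags your tagger emits.
BLOCKED = {"impressionism", "by some artist"}

caption = "portrait, impressionism, outdoors, by some artist, tree"
print(strip_style_tags(caption, BLOCKED))  # portrait, outdoors, tree
```

You would run something like this over each caption file before training, so the only thing left undescribed is the style you actually want.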

As an aside, textual inversion also made some great inroads into this sort of thing. So, smallest size to largest: textual inversion (megs), LoCon/LyCORIS/LoRA (tens of megs), full model (gigs!). Accuracy across all compositions follows the same order as well.



LoRAs are 1 MB to hundreds of MB for SD 1.5 (I've seen a 1 MB LoRA and also 700 MB LoRA merges), and I think around 1 GB+ for SDXL.
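The spread makes sense if you do the arithmetic: for each adapted linear layer of shape (d_in, d_out), a rank-r LoRA stores two small matrices, d_in x r and r x d_out, so size grows linearly with rank and with how many layers you adapt. A back-of-the-envelope sketch follows; the layer list is a made-up stand-in, not the real SD 1.5 layout, and `lora_size_mb` is just an illustrative helper.

```python
def lora_size_mb(layer_shapes, rank, bytes_per_param=2):
    """Approximate LoRA size: each adapted (d_in, d_out) linear layer
    adds rank * (d_in + d_out) parameters (fp16 = 2 bytes each)."""
    params = sum(rank * (d_in + d_out) for d_in, d_out in layer_shapes)
    return params * bytes_per_param / 1e6


# Hypothetical: 100 square 768x768 projections, NOT an actual model layout.
shapes = [(768, 768)] * 100

for rank in (4, 32, 128):
    print(f"rank {rank}: {lora_size_mb(shapes, rank):.1f} MB")
# rank 4: 1.2 MB
# rank 32: 9.8 MB
# rank 128: 39.3 MB
```

Real files also vary with which layers are targeted and the precision they are saved in, so treat these numbers as order-of-magnitude only.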

Hypernetworks exist as well, for SD 1.5 at least; I think they go up to around 80 MB?



