Hacker News

I've been playing with this for a few hours. It's slow going -- you really need a fast GPU with a lot of RAM to make this very usable.

I ended up paying the $10 for Google Colab Pro and that's how I've been using this. Maybe I'll figure out how to get this working on my old 1080 Ti to see if it's faster.

Anyway, for the one that I'm using which has a web UI, you can use this Colab link. It's pretty great! https://colab.research.google.com/drive/1KeNq05lji7p-WDS2BL-...

What I really wish was that the img2img tool could be used to take a text2img output and then "refine" it further. As it is, the img2img tool doesn't seem particularly great.

People on Reddit are talking about "I just generate 100 images and pick the best one"... but this is incredibly slow on the P100 GPU that Google has me on. Does this just require a monster GPU like a 3080/3090 in order to get any decent results?



You can feed a txt2img output into the img2img pipeline as an init; that's something I do quite often, e.g. https://twitter.com/SteWaterman/status/1563872748161613826
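A minimal sketch of that workflow, assuming the CompVis reference scripts (flag names are from that repo; the exact output path under --outdir may differ by version):

```shell
# 1) Generate a base image with txt2img:
python scripts/txt2img.py \
    --prompt "a castle on a cliff at sunset" \
    --outdir outputs/base --seed 42

# 2) Feed the result back through img2img as the init image.
#    --strength controls how far the result may drift from the init
#    (lower = closer to the original, higher = more reinterpretation).
python scripts/img2img.py \
    --prompt "a castle on a cliff at sunset, highly detailed" \
    --init-img outputs/base/samples/00000.png \
    --strength 0.4
```

Keeping the same prompt (or a lightly edited one) between the two steps is what makes this feel like "refinement" rather than a fresh generation.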

Also, how slow is your P100? I'm usually getting around 3 it/s. Maybe it's just because I'm used to Disco Diffusion, where a single image took over an hour, but this is ungodly fast to me.
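For a rough sense of scale (assuming the default 50 sampling steps and ignoring decode/setup overhead), 3 it/s works out to about 17 seconds per image:

```python
# Back-of-the-envelope: wall time per image from iteration rate and step count.
# Assumes one "it" per denoising step and the default of 50 steps.
def seconds_per_image(it_per_s: float, steps: int = 50) -> float:
    return steps / it_per_s

print(seconds_per_image(3.0))  # ~16.7 s per image at 3 it/s
```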


Colab can feel slow for other reasons, such as throttled download speeds making it very slow to download weights on a cold boot.


FWIW, I'm using an old GTX 1080 Ti to play around; it takes about 21 seconds per image. You can make it go even faster by lowering the number of timesteps from the default 50 (--ddim_steps). Both lowering and raising the value can produce quite different first-iteration images (though they tend to be similar), and it seems to guarantee totally different images on further iterations (as counted by --n_iter). I'm with you on the feeling that it's hard to control, whether in refinement or in other ways, but I suspect that'll get a lot better in the next couple of years (if not weeks, or dare I say days).
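Since wall time is dominated by the per-step denoising loop, generation time scales roughly linearly with --ddim_steps; a rough estimate (actual overhead varies):

```python
# Rough linear scaling of generation time with step count.
# Assumes the denoising loop dominates; fixed costs are ignored.
def estimated_time(base_seconds: float, base_steps: int, new_steps: int) -> float:
    return base_seconds * new_steps / base_steps

# 21 s at 50 steps on the 1080 Ti above -> about 12.6 s at 30 steps.
print(estimated_time(21.0, 50, 30))
```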


With a 3090 it's about 10s to generate a 512x512 image from another, maybe less.


You're probably using the default PLMS sampler with 50 steps. There are better samplers; the best seem to be Euler (more predictable with regard to the number of steps) and Euler ancestral (gives more variation). Both typically need far fewer steps to converge, speeding up generation.
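For intuition about the difference (a toy sketch of the step rules, not the actual sampler code): an Euler sampler takes deterministic ODE steps, so the result depends predictably on the step count, while the ancestral variant injects fresh noise at every step, which is where the extra variation comes from:

```python
import math
import random

def euler_step(x, t, dt, f):
    # Deterministic Euler ODE step for x' = f(x, t).
    return x + f(x, t) * dt

def euler_ancestral_step(x, t, dt, f, noise_scale, rng):
    # Euler step plus fresh Gaussian noise (the "ancestral" part),
    # which is why repeated runs give more varied outputs.
    return euler_step(x, t, dt, f) + noise_scale * math.sqrt(abs(dt)) * rng.gauss(0, 1)

# Toy "denoising" ODE x' = -x: the state decays smoothly toward 0.
f = lambda x, t: -x
x = 1.0
dt = 0.1
for i in range(50):
    x = euler_step(x, i * dt, dt, f)
# x is now close to the exact solution exp(-5), up to Euler error.
print(x)
```

The real samplers step through the diffusion model's noise schedule rather than a scalar ODE, but the deterministic-vs-noisy distinction carries over directly.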


My understanding was that PLMS was the current state of the art. I'd be interested if you have a citation for this "Euler" sampling method.


Yes, I'm using PLMS. Thanks for the tip, will try.



