reproduce layout results as in the original stable diffusion paper?

@CreamyLong, have you reproduced the layout results from the original Stable Diffusion paper?
I used your layout config and trained for 10 epochs (COCO dataset, batch size = 16), but the images logged during the training phase are shown below,
and I don't think they are correct training results. The training loss stayed around 0.27 and did not decrease at any point during training.

input bbox
![bbox_image_gs-080000_e-000011_b-002829](https://github.com/user-attachments/assets/7b2bf635-5e15-4da7-9a9e-dcf8ce36cbff)

input image
![inputs_gs-080000_e-000011_b-002829](https://github.com/user-attachments/assets/dc22929d-8db1-4c84-acd5-caea203a74b0)

reconstruction decoded directly from the first-stage embedded latent
![reconstruction_gs-080000_e-000011_b-002829](https://github.com/user-attachments/assets/2ffc63f8-4fba-4d73-9b7e-cfbeb59c6abc)

sample image from the latent diffusion model (ddim_step=200, eta=0.)
![samples_gs-080000_e-000011_b-002829](https://github.com/user-attachments/assets/dd1e966c-1125-4f2c-b710-152b8d135fea)

_Originally posted by @Tonsty in https://github.com/CreamyLong/stable-diffusion/issues/14#issuecomment-2481910762_
            
