latent-diffusion-inpainting

This repository is based on CompVis/latent-diffusion, with modifications for classifier conditioning and architecture improvements.

The original codebase is very large and complex, and it lacks documentation on how to fine-tune the original autoencoder and diffusion model.

As a result, it is extremely difficult to fine-tune the existing pre-trained models to produce good results.

Issues in the original repository

How to finetune inpainting? #151

how to train Inpainting model using our own datasets? #280

Details about training inpainting model #9

how to train inpainting model with my own datasets #265

Training inpainting model on new datasets #298

Reproduction problem while training inpainting model #159

Hardware requirements

Without pre-training, it would take 8 V100 GPUs to produce satisfactory results.

With fine-tuning, a single 3090 is enough for transfer learning to medical images (in my case).

This repository makes fine-tuning setup and inference easy by fixing some of the bugs in the original repo.

Major Changes

Load and fine-tune the autoencoder (very important for transfer learning)
Load and fine-tune the latent diffusion model
Combine the trained autoencoder with the latent diffusion model
Inference examples for both models
Simplified data and mask loading
Fixed some bugs when training the inpainting model

The original inpainting model is meant to remove objects from an image.

However, the model can also be turned into one that creates objects!

Result

Figures: original image; result with one polyp; result with two polyps.

Requirements

If you already have the ldm environment, you can skip this step.

A suitable conda environment named ldm can be created and activated with:


conda env create -f ldm/environment.yaml
conda activate ldm

Data Loader

In my experiments with medical images, a square mask works better than a polygon mask.

If you want to change this, feel free to modify /ldm/ldm/data/PIL_data.py to change the data loading format.

All the data loaders used in training are in that .py file, which has been simplified.
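To illustrate what turning a polygon mask into a square (rectangular) mask can look like, here is a minimal sketch. It assumes masks are single-channel arrays where nonzero pixels mark the object. The helper name `rectangle_from_mask` is hypothetical and is not the code in PIL_data.py; it only shows the general idea of covering the mask with its bounding box.

```python
import numpy as np


def rectangle_from_mask(mask: np.ndarray) -> np.ndarray:
    """Return a binary mask whose foreground is the bounding box of `mask`.

    Hypothetical helper for illustration; not the repository's implementation.
    """
    binary = mask > 0
    out = np.zeros(mask.shape, dtype=np.uint8)
    if not binary.any():
        return out  # an empty mask stays empty
    # Indices of rows and columns that contain at least one foreground pixel.
    rows = np.flatnonzero(binary.any(axis=1))
    cols = np.flatnonzero(binary.any(axis=0))
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    out[top:bottom + 1, left:right + 1] = 255
    return out


# A small irregular (L-shaped) mask on a 6x6 grid.
polygon = np.zeros((6, 6), dtype=np.uint8)
polygon[1:4, 2] = 255
polygon[3, 2:5] = 255
square = rectangle_from_mask(polygon)
```

The rectangle always contains the original mask, so the region to be inpainted can only grow, never shrink.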

Usage

For most ordinary images, you DO NOT need to fine-tune the autoencoder.

You will want to fine-tune the autoencoder ONLY when your data is very different from the pre-training dataset, for example endoscopic images.

If your data is not that different, you can skip parts 1 and 2.

1. Fine-tune the autoencoder

Since the autoencoder used for the pre-trained inpainting model is vq-f4-noattn, we have to stick with it.

First, prepare the images and masks in the same format as in the Kvasir-SEG folder (masks are NOT needed to fine-tune the autoencoder).

Second, modify the data path in config.yaml (it should be in ldm/models/first_stage_models/vq-f4-noattn/config.yaml).

Then, run the following command:

CUDA_VISIBLE_DEVICES=0 python main.py --base ldm/models/first_stage_models/vq-f4-noattn/config.yaml --resume ldm/models/first_stage_models/vq-f4-noattn/model.ckpt --stage 0 -t --gpus 0,

The autoencoder is trained on a mix of 50% original images and 50% randomly masked images.
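The 50/50 mix can be pictured with a short sketch. Everything here is an assumption made for illustration: the helper name `maybe_mask_image`, the choice to zero out a random rectangle, and the rectangle size range are hypothetical and may differ from what the repository's data loader does.

```python
import numpy as np


def maybe_mask_image(image, rng, mask_prob=0.5):
    """With probability `mask_prob`, zero out a random rectangle of `image`.

    Returns the (possibly masked) copy and a flag saying whether it was masked.
    Hypothetical helper; the real data loader may mask differently.
    """
    out = image.copy()
    if rng.random() >= mask_prob:
        return out, False  # keep the original image
    h, w = image.shape[:2]
    # Rectangle side lengths between a quarter and a half of the image size.
    mh = int(rng.integers(h // 4, h // 2 + 1))
    mw = int(rng.integers(w // 4, w // 2 + 1))
    top = int(rng.integers(0, h - mh + 1))
    left = int(rng.integers(0, w - mw + 1))
    out[top:top + mh, left:left + mw] = 0
    return out, True


rng = np.random.default_rng(0)
image = np.full((64, 64, 3), 255, dtype=np.uint8)
samples = [maybe_mask_image(image, rng) for _ in range(1000)]
masked_fraction = sum(was_masked for _, was_masked in samples) / len(samples)
```

Over many samples, `masked_fraction` should come out close to 0.5, which matches the mix described above.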

2. Combine the autoencoder with the diffusion model

Please refer to combine.ipynb.
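As a rough picture of what combining the two models involves, the sketch below swaps autoencoder weights into a diffusion checkpoint's state dict. It assumes the diffusion checkpoint stores the autoencoder under the key prefix `first_stage_model.`, which is the usual layout for CompVis-style checkpoints but is not confirmed here for this repository. The function name `merge_first_stage` is hypothetical, and the notebook remains the authoritative reference.

```python
def merge_first_stage(ldm_state, autoencoder_state, prefix="first_stage_model."):
    """Return a copy of `ldm_state` with autoencoder weights replaced.

    Every key `k` in `autoencoder_state` overwrites `prefix + k` in the copy.
    Keys that have no counterpart in the diffusion checkpoint are reported
    in `missing` instead of being added.
    """
    merged = dict(ldm_state)
    missing = []
    for key, value in autoencoder_state.items():
        target = prefix + key
        if target in merged:
            merged[target] = value
        else:
            missing.append(key)
    return merged, missing


# Toy stand-ins for real state dicts (values would normally be tensors).
ldm_state = {
    "model.diffusion_model.w": "unet",
    "first_stage_model.encoder.w": "old-enc",
    "first_stage_model.decoder.w": "old-dec",
}
autoencoder_state = {"encoder.w": "new-enc", "decoder.w": "new-dec", "loss.w": "x"}
merged, missing = merge_first_stage(ldm_state, autoencoder_state)
```

With real checkpoints, the two dictionaries would typically come from something like `torch.load(path, map_location="cpu")["state_dict"]`, and the merged dictionary would be saved back under the same `"state_dict"` key. Checking `missing` helps catch a wrong prefix before training starts.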

3. Fine-tune the latent diffusion model

Note that the masks here are converted to square masks. You can disable the draw_rectangle_over_mask function in /ldm/ldm/data/PIL_data.py to use the original masks.

First, download the pre-trained weights and prepare the images in the same format as in the Kvasir-SEG folder.

Download the pre-trained weights:

wget -O ldm/models/ldm/inpainting_big/last.ckpt https://heibox.uni-heidelberg.de/f/4d9ac7ea40c64582b7c9/?dl=1

Second, modify the data path in config.yaml (it should be in ldm/models/ldm/inpainting_big/config.yaml).

Then, run the following command:

CUDA_VISIBLE_DEVICES=0 python main.py --base ldm/models/ldm/inpainting_big/config.yaml --resume ldm/models/ldm/inpainting_big/last.ckpt --stage 1 -t --gpus 0,

4. Load and run inference

Please refer to the inference notebooks (inference_autoencoder.ipynb and inference_latent_diffusion_model.ipynb).
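For orientation, here is a minimal sketch of a typical preprocessing step for inpainting inference. It assumes the model expects channel-first inputs scaled to [-1, 1] and a binary mask in which 1 marks the region to fill. These conventions and the helper name `make_inpainting_batch` are assumptions for illustration, so check the notebooks for what this repository actually does.

```python
import numpy as np


def make_inpainting_batch(image, mask):
    """Build model inputs from an RGB uint8 image (H, W, 3) and a uint8 mask (H, W).

    Hypothetical helper; the key names and scaling are assumed conventions.
    """
    img = image.astype(np.float32) / 255.0
    img = img.transpose(2, 0, 1)[None]                 # (1, 3, H, W)
    m = (mask.astype(np.float32) / 255.0)[None, None]  # (1, 1, H, W)
    m = (m >= 0.5).astype(np.float32)                  # binarize the mask
    masked = (1.0 - m) * img                           # blank out the masked region
    batch = {"image": img, "mask": m, "masked_image": masked}
    # Rescale everything from [0, 1] to [-1, 1].
    return {k: v * 2.0 - 1.0 for k, v in batch.items()}


image = np.full((8, 8, 3), 255, dtype=np.uint8)
mask = np.zeros((8, 8), dtype=np.uint8)
mask[2:6, 2:6] = 255
batch = make_inpainting_batch(image, mask)
```

In this layout, the masked image and the mask would be the conditioning inputs, and the model fills in the region where the mask is set.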
