DDPM and Segmentation

date: Nov 27, 2022
Last edited time: Mar 27, 2023 08:41 AM
status: Published
slug: DDPM_and_Segmentation
tags: DL, DDPM
type: Post
Field: Plat

SegDiff: Image Segmentation with Diffusion Probabilistic Models

notion image
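The noise-prediction network $\epsilon_\theta$ is modeled as $\epsilon_\theta(x_t, I, t) = D(E(F(x_t) + G(I), t), t)$, where the decoder $D$ is conventional and the encoder is split into three networks: $E$, $F$, and $G$. $G$ encodes the input image $I$ and $F$ encodes the current-step segmentation map $x_t$; their outputs have the same spatial dimensions and number of channels. We sum $F(x_t)$ and $G(I)$ and pass the result to the rest of the U-Net encoder $E$.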
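The summation-based conditioning can be sketched in NumPy as follows. This is a minimal illustration, not the paper's implementation: `conv1x1`, the weight matrices `W_F` and `W_G`, and all sizes are hypothetical stand-ins (in SegDiff, $G$ is an RRDB-based encoder rather than a single pointwise convolution). The point is only that $F(x_t)$ and $G(I)$ must share a shape so that they can be added before entering the U-Net encoder $E$.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1x1(x, weight):
    """Pointwise (1x1) convolution: mixes channels, keeps the spatial size.
    x: (C_in, H, W), weight: (C_out, C_in) -> (C_out, H, W)."""
    return np.einsum("oc,chw->ohw", weight, x)

C, H, W = 8, 16, 16                      # hypothetical feature width and image size
x_t = rng.standard_normal((1, H, W))     # noisy segmentation map at step t
image = rng.standard_normal((3, H, W))   # RGB conditioning image I

W_F = rng.standard_normal((C, 1))        # stand-in for F (single-channel input)
W_G = rng.standard_normal((C, 3))        # stand-in for G (an RRDB encoder in the paper)

f_out = conv1x1(x_t, W_F)                # F(x_t): (C, H, W)
g_out = conv1x1(image, W_G)              # G(I):   (C, H, W)
assert f_out.shape == g_out.shape        # summation needs matching shapes

h = f_out + g_out                        # signal passed on to the U-Net encoder E
print(h.shape)                           # (8, 16, 16)
```

Summation keeps the channel count at `C`, whereas concatenating the two feature maps would double it and change the input width of the first layer of $E$.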

Implementation Details

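  1. The input image encoder $G$ is built from Residual in Residual Dense Blocks (RRDBs), which combine multi-level residual connections and use no batch normalization layers.
  1. $F$ is a 2D convolutional layer with a single-channel input and a $C$-channel output.
  1. $E$ and $D$ form a standard U-Net.
  1. We use 100 diffusion steps to reduce inference time.
  1. The number of generated instances differs per dataset and is chosen to increase mIoU.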
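Because each sampling run starts from different noise, several generated instances can be combined into one prediction. The sketch below assumes they are combined by averaging and then thresholding at 0.5 (the threshold is my assumption). `sample_mask` is a hypothetical stand-in for one full diffusion run: it simply flips a random 20% of the pixels of a toy ground-truth mask.

```python
import numpy as np

def ensemble_mask(samples, threshold=0.5):
    """Average N sampled masks of shape (N, H, W) and binarize the mean."""
    samples = np.asarray(samples, dtype=float)
    mean_map = samples.mean(axis=0)
    return mean_map, (mean_map > threshold).astype(np.uint8)

def sample_mask(rng, clean, flip_prob=0.2):
    """Hypothetical stand-in for one diffusion run: the clean mask with random pixel flips."""
    flips = rng.random(clean.shape) < flip_prob
    return np.where(flips, 1 - clean, clean)

def iou(a, b):
    """Intersection over union of two binary masks."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union

rng = np.random.default_rng(0)
clean = np.zeros((32, 32), dtype=np.uint8)
clean[8:24, 8:24] = 1                     # toy ground-truth square

samples = np.stack([sample_mask(rng, clean) for _ in range(25)])
mean_map, mask = ensemble_mask(samples)

single_iou = iou(samples[0], clean)       # quality of one generated instance
ensemble_iou = iou(mask, clean)           # quality after averaging 25 instances
print(single_iou, ensemble_iou)
```

In this toy setting the per-pixel errors are independent, so averaging removes most of them. Real diffusion samples are correlated, so the gain from extra instances is smaller and has to be measured per dataset.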

Experiments

notion image
notion image

Ablation Study

  1. Diffusion Step
    1. notion image
      notion image
  1. Number of generated Instances
    1. notion image
  1. Number of RRDB blocks
    1. notion image
  1. Variant of fusion
    1. 💡
      The first variant concatenates $[F(x_t); G(I)]$ along the channel dimension. The second variant employs FCHarDNet-70 V2 instead of RRDBs. The third variant concatenates $I$ channel-wise to $x_t$, without using an encoder. The last alternative is to propagate $F(x_t)$ through the U-Net module and add it to $G(I)$ after the first, third, and fifth downsample blocks (variants four to six), instead of performing $F(x_t) + G(I)$.
      Result:
      The summation we introduce as a conditioning approach outperforms concatenation (variant one) on Vaihingen by a large margin, while on Cityscapes "Bus" the difference is small. The RRDB blocks are preferable to the FCHarDNet architecture on both datasets (variant two). Removing the encoder affects the metrics significantly (variant three), slightly more so on Vaihingen. Changing the position at which the signal is integrated (variant four) leads to a negligible difference on Vaihingen and even outperforms our full method on Cityscapes "Bus". Variants five and six lead to a decrease in performance as the distance from the first layer increases.
      notion image

MedSegDiff: Medical Image Segmentation with Diffusion Probabilistic Model

notion image
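We note that in medical image segmentation, lesions and organs are often ambiguous and hard to distinguish from the background. In such cases, an adaptive calibration process is key to obtaining fine-grained results.
For adaptive regional attention, we integrate the current-step segmentation into the Image Condition Encoder at every step. Concretely, the current-step segmentation mask is fused with the image prior at the feature level in a multi-scale manner. In this way, the corrupted current-step mask helps to dynamically enhance the conditional features, which improves reconstruction accuracy. To eliminate the high-frequency noise that the corrupted mask introduces in this process, we further propose the Feature Frequency Parser (FF-Parser), which filters the features in the frequency domain.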
💡
In order to achieve the segmentation, we condition the step estimation function on the raw image prior, which can be represented as:
$$\epsilon_\theta(x_t, I, t) = D((E_t^I + E_t^x, t), t)$$
where $E_t^I$ is the conditional feature embedding, in our case the raw image embedding, and $E_t^x$ is the segmentation map feature embedding of the current step. The two components are added and sent to a UNet decoder $D$ for the reconstruction.

Method

Dynamic Conditional Encoding
💡
In the raw image encoder, we enhance its intermediate feature with the current-step encoding features. Each scale of the conditional feature map $m_I^k$ is fused with the $x_t$ encoding features $m_x^k$ of the same shape, where $k$ is the index of the layer. The fusion is implemented by an attentive-like mechanism $\mathcal{A}$:
$$\mathcal{A}(m_I^k, m_x^k) = (LN(m_I^k) \otimes LN(m_x^k)) \otimes m_I^k$$
where $\otimes$ denotes element-wise multiplication and $LN$ denotes layer normalization.
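A minimal NumPy sketch of this fusion follows, assuming it takes the form $(LN(m_I^k) \otimes LN(m_x^k)) \otimes m_I^k$ as I read the MedSegDiff paper. The helper names are hypothetical, and `layer_norm` here normalizes over all elements of one `(C, H, W)` feature map with no learnable scale or shift, which is a simplification.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalize one feature map of shape (C, H, W) to zero mean and unit
    variance over all of its elements (no learnable affine parameters here)."""
    return (x - x.mean()) / np.sqrt(x.var() + eps)

def attentive_fusion(m_i, m_x):
    """(LN(m_I) * LN(m_x)) * m_I, with every product taken element-wise."""
    affinity = layer_norm(m_i) * layer_norm(m_x)   # affinity map between the two features
    return affinity * m_i                          # re-weight the image-condition features

rng = np.random.default_rng(0)
m_i = rng.standard_normal((16, 8, 8))   # image-condition features at one scale
m_x = rng.standard_normal((16, 8, 8))   # current-step mask features, same shape
fused = attentive_fusion(m_i, m_x)
print(fused.shape)                      # (16, 8, 8)
```

The affinity map is large in magnitude where both normalized features are far from their mean, so regions that the image features and the current mask estimate both emphasize are amplified in the conditional features.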
FF-Parser
notion image
💡
The function of FF-Parser is to constrain the noise-related components in the features. Our main idea is to learn a parameterized attentive (weight) map that is applied to the Fourier-space features.
Unlike spatial attention, it globally adjusts the components of specific frequencies. Thus it can learn to constrain the high-frequency components for the adaptive integration.
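The idea of re-weighting features in Fourier space can be sketched with NumPy's FFT. In the real model the weight map is a learned parameter; here `low_pass_weight` is a hypothetical fixed map that keeps only low frequencies, and a real-valued weight is my simplification. The toy input is a smooth cosine pattern corrupted by a checkerboard, which is the highest-frequency pattern a grid can hold.

```python
import numpy as np

def ff_parser(m, weight):
    """Filter a feature map m of shape (C, H, W) in Fourier space.
    weight has the same shape and scales each frequency component."""
    spectrum = np.fft.fft2(m, axes=(-2, -1))        # 2D FFT per channel
    filtered = weight * spectrum                     # element-wise frequency re-weighting
    return np.fft.ifft2(filtered, axes=(-2, -1)).real

def low_pass_weight(shape, cutoff):
    """Hypothetical fixed weight map: 1 for low frequencies, 0 elsewhere."""
    c, h, w = shape
    fy = np.abs(np.fft.fftfreq(h))[:, None]          # cycles per pixel along height
    fx = np.abs(np.fft.fftfreq(w))[None, :]          # cycles per pixel along width
    keep = (fy <= cutoff) & (fx <= cutoff)
    return np.broadcast_to(keep.astype(float), (c, h, w))

yy, xx = np.mgrid[0:32, 0:32]
clean = np.cos(2 * np.pi * xx / 32)[None]                      # smooth feature, (1, 32, 32)
noise = 0.5 * np.where((xx + yy) % 2 == 0, 1.0, -1.0)[None]    # checkerboard noise
noisy = clean + noise

weight = low_pass_weight(noisy.shape, cutoff=0.1)
filtered = ff_parser(noisy, weight)

print(np.abs(noisy - clean).max())      # error before filtering
print(np.abs(filtered - clean).max())   # error after filtering
```

A single entry of the weight map affects every spatial position at once, which is what distinguishes this from spatial attention. With an all-ones weight map the function returns its input unchanged.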

Experiments

notion image
notion image

Ablation Study

notion image

Diffusion Models for Implicit Image Segmentation Ensembles

notion image
💡
Let $b$ be the given brain MR image of dimension $(c, h, w)$, where $c$ denotes the number of channels, and $(h, w)$ denote the image height and image width. The ground truth segmentation of the tumor for the input image $b$ is denoted as $x_b$, and is of dimension $(1, h, w)$. We train a DDPM for the generation of segmentation masks. We induce the anatomical information present in $b$ by adding it as an image prior to $x_{b,t}$. We do this by concatenating $b$ and $x_{b,t}$, and define $X_t := b \oplus x_{b,t}$. Consequently, $X_t$ has dimension $(c+1, h, w)$.
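The channel-wise concatenation of the image prior and the noisy mask can be sketched as below. The function name `add_image_prior` and the sizes are illustrative assumptions; the random arrays only stand in for an MR image and a noisy mask at some step $t$.

```python
import numpy as np

def add_image_prior(b, x_bt):
    """Concatenate the MR image b (c, h, w) with the noisy mask x_bt (1, h, w)
    along the channel axis, giving an array of shape (c + 1, h, w)."""
    assert b.shape[1:] == x_bt.shape[1:], "spatial sizes must match"
    return np.concatenate([b, x_bt], axis=0)

rng = np.random.default_rng(0)
c, h, w = 4, 64, 64                       # illustrative channel count and image size
b = rng.standard_normal((c, h, w))        # stand-in for the brain MR image
x_bt = rng.standard_normal((1, h, w))     # stand-in for the noisy mask at step t
X_t = add_image_prior(b, x_bt)
print(X_t.shape)                          # (5, 64, 64)
```

Only the mask channel is noised and denoised during diffusion; the image channels are passed through unchanged at every step, so the network always sees the anatomy it is segmenting.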

Experiments

notion image

Ablation Study

  1. Number of ensemble members
    1. 💡
      We implicitly generate an ensemble of segmentation masks without having to train a new model. This ensemble can then be used to boost the segmentation performance.
      notion image

© Lazurite 2021 - 2024