1. Introduction

  1. Much research has been devoted to improving image inpainting, using either image self-similarity or deep generative models.

    These methods either draw semantic information from the non-hole regions or learn it from large collections of images.

    They fail when the holes are large, or when the expected content inside the hole regions has complicated semantics, depth, or texture.

  2. These problems can be addressed if there happens to be a second reference image of the same scene that exposes some desired image content.

    This is referred to as reference-guided image inpainting.

    • target image: image with holes
    • source image: used as references
  3. Why does the reference-guided problem remain challenging?

    • the hole regions could be very large

    • the camera is uncalibrated and may move freely between the source and target images,

      which induces large parallax

    • assumption: no more than two photos

    • there may exist regions in the source image that do not exist in the target image

      because images collected from the web or by other means differ in exposure time and lighting conditions

  4. multi-homography fusion pipeline

    • Assumption: there may be multiple depth planes inside the hole.

    Proposal

    Given a target and a source image:

    1. estimate the matched feature points between the 2 images
    2. cluster the inliers according to their estimated depths in the target image
    3. for each cluster estimate a single homography

3. Method

system pipeline

Note that M is the hole mask: it is one inside the hole regions and zero elsewhere.

  1. Apply the mask to the target image.

  2. Propose multiple global homographies using the multi-homography proposal module, and locally adjust color and spatial misalignments in each proposal using our Color-Spatial Transformer (CST).

  3. Then we merge each proposal with the output $I_g$ of a single-image inpainting model using Single-Proposal Fusion (SPF), and finally selectively blend all the proposals.
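
The mask convention and the merge in step 3 can be sketched roughly as below. This is a minimal illustration and not the paper's SPF module: the `confidence` weight map and all function names are hypothetical, and the convex blend is only an assumed form of "merging" a proposal with the single-image inpainting result.

```python
import numpy as np


def mask_target(target, hole_mask):
    """Zero out the hole. hole_mask is 1 inside the hole and 0 elsewhere."""
    return target * (1.0 - hole_mask[..., None])


def fuse_proposal(proposal, inpainted, confidence):
    """Per-pixel convex blend of one warped proposal with the single-image result.

    `confidence` (values in [0, 1]) is a hypothetical weight map, assumed here
    purely for illustration.
    """
    c = confidence[..., None]
    return c * proposal + (1.0 - c) * inpainted


def composite(target, filled, hole_mask):
    """Keep known pixels from the target and take hole pixels from `filled`."""
    m = hole_mask[..., None]
    return m * filled + (1.0 - m) * target
```

Images are assumed to be float arrays of shape (H, W, 3) and masks of shape (H, W).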

3.1 multi-homography proposals

multi-homography

  • compute the monocular depth $D_t$ of the non-hole region $I_t^M$, and cluster the feature matching points into N sub-groups using the depth values.

  • Each estimated homography $H_i$ aligns a different region within the hole.

  • SIFT: extract features

  • OANet: outlier rejection

  • estimate the depth map $D_t$ from $I_t^M$ using a deep-learning-based monocular depth estimator.

  • We then cluster those points into a partition of N subsets $\{P_t^j\}$ by their depth.

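  • RANSAC: compute a homography matrix for each subset and for the full set, giving N+1 homography matrices.

    Warping the source image with each of them then yields a series of transformed source images.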
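
The clustering and per-cluster estimation above can be sketched as follows. This is a simplified illustration under stated assumptions: matches are assumed to be already filtered (SIFT + OANet), the depth clustering is an equal-count quantile split (the notes do not say which clustering method is used), and a plain least-squares DLT stands in for RANSAC. All function names are hypothetical.

```python
import numpy as np


def fit_homography(src, dst):
    """Least-squares DLT homography mapping src -> dst; both are (K, 2) with K >= 4."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The homography is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]


def cluster_by_depth(depths, n_clusters):
    """Label each match with one of n_clusters depth bins (quantile split, an assumption)."""
    edges = np.quantile(depths, np.linspace(0, 1, n_clusters + 1)[1:-1])
    return np.digitize(depths, edges)


def propose_homographies(pts_src, pts_tgt, depths_tgt, n_clusters):
    """Return N per-cluster homographies followed by one fitted on all matches (N + 1 total)."""
    labels = cluster_by_depth(depths_tgt, n_clusters)
    proposals = [
        fit_homography(pts_src[labels == j], pts_tgt[labels == j])
        for j in range(n_clusters)
    ]
    proposals.append(fit_homography(pts_src, pts_tgt))
    return proposals
```

Here `depths_tgt` holds the value of $D_t$ sampled at each matched point in the target image. Each returned matrix would then be used to warp the source image into one proposal.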

3.2 Color-Spatial Transformation Module

  • we propose to learn the transformations at a lower resolution and obtain the full-resolution coefficients by up-sampling.

  • Color Transformation

    • learn an affine transformation from $I_s^i$ to $I_{sc}^i$

    $$A_{c}^{i}=\left[\begin{array}{ll} K_{c}^{i} & b_{c}^{i} \end{array}\right] \in \mathbb{R}^{W \times H \times 3 \times 4}$$

    Formally, for each pixel at location p,

    $$I_{s c}^{i}(p)=K_{c}^{i}(p) I_{s}^{i}(p)+b_{c}^{i}(p)$$

    • deep bilateral filtering
  • Spatial Transformation (ST)
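
For the color transformation, the per-pixel affine map $I_{sc}^i(p) = K_c^i(p) I_s^i(p) + b_c^i(p)$ and the low-resolution-then-up-sample idea can be sketched as below. This is a minimal illustration, not the CST network: the coefficients are given rather than learned, nearest-neighbour repetition stands in for the actual up-sampling, and arrays are laid out as (H, W, ...) rather than W × H. Function names are hypothetical.

```python
import numpy as np


def upsample_nearest(coeffs, scale):
    """Nearest-neighbour up-sampling of an (h, w, 3, 4) coefficient grid.

    A stand-in for the real up-sampling step, assumed here for illustration.
    """
    return np.repeat(np.repeat(coeffs, scale, axis=0), scale, axis=1)


def apply_color_affine(image, coeffs):
    """Apply I_sc(p) = K(p) I_s(p) + b(p) at every pixel.

    image:  (H, W, 3) source proposal
    coeffs: (H, W, 3, 4), each pixel holding [K | b] with K 3x3 and b 3x1
    """
    K = coeffs[..., :3]   # (H, W, 3, 3) per-pixel color matrix
    b = coeffs[..., 3]    # (H, W, 3) per-pixel offset
    return np.einsum("hwij,hwj->hwi", K, image) + b
```

For example, a low-resolution grid with identity K and a constant b brightens every pixel by b after up-sampling and application.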