Learned Image Signal Processing Pipeline for Mobile Cameras

Elezabi, Omar

Master thesis

Permanent link

https://hdl.handle.net/11250/3092861

Publication date

2023

Collections

Institutt for datateknologi og informatikk

Abstract

The image signal processing (ISP) pipeline is a crucial part of the image creation process. It consists of a complex, handcrafted sequence of image-processing tasks that convert the raw image from the camera sensor into the final RGB image. Because the compact size of mobile cameras imposes hardware limitations, mobile phone ISPs have become more advanced and complex to compensate. In recent years, a new research direction has proposed replacing this complex handcrafted pipeline with an end-to-end learned ISP based on deep learning. This is achieved by training a deep network to process the raw image from a phone camera so that its output imitates that of a DSLR camera. The approach has shown promising results without the long and complex process of handcrafting a conventional ISP. However, it is still at the research stage and has many limitations and problems compared to the conventional ISPs used in mobile cameras today. Considerable work is needed to address these issues before it can reach production-level accuracy and robustness.

In this work, we tried to improve the current state of learned ISPs by addressing some of their main problems. We worked on night image rendering using a learned ISP network and proposed an efficient network that is trained without the need for annotated data. Our approach was one of the top 10 solutions in the NTIRE 2023 Challenge on Night Photography Rendering.

We also worked on problems with ISP datasets, such as alignment and availability. We proposed a novel method for creating a fully aligned, high-quality synthetic ISP dataset using a weakly aligned ISP dataset. Our experiments show that training on our synthetic dataset gives better performance than training directly on the weakly aligned dataset, which demonstrates the effectiveness of our pipeline. We also showed that our pipeline can generate a new synthetic dataset from DSLR RGB images alone.

Lastly, we addressed the problem of missing global information in learned ISP networks. We proposed a novel color module that uses global information from the full raw image in addition to local information from the input raw patch. The module is general and can be integrated with any ISP network to improve its color reproduction accuracy. By combining our simple and efficient color module with a simple ISP network, we achieved state-of-the-art performance. We showed that simply using global information from the full image can substantially improve the performance of ISP networks.

Publisher

NTNU