UDC-VIT

Overview

Despite extensive research on UDC images and their restoration models, UDC videos remain largely unexplored. Two UDC video datasets exist, but they primarily focus on unrealistic or synthetic UDC degradation rather than real-world degradation. In this paper, we propose a real-world UDC video dataset called UDC-VIT. Unlike existing datasets, only UDC-VIT includes human motions specifically targeting face recognition.

Ideally, we would like to compare UDC-VIT with the two existing UDC video datasets, PexelsUDC and VidUDC33K. However, since PexelsUDC is not publicly available, we use the P-OLED dataset that was used to create it. The table below summarizes the eight previous UDC datasets.


Transmittance decrease and digital noise

Transmittance decrease and digital noise in the UDC setting

In low-light conditions, the camera sensor amplifies both the desired signal and unwanted noise. In the UDC setting, where the sensor sits beneath the display panel, transmittance decreases further, leading to amplified noise. The P-OLED dataset, captured in a controlled setting, exhibits unrealistic noise and an excessive transmittance decrease. Similarly, in the VidUDC33K dataset, the degraded frame's noise level is somewhat lower than that of the ground truth. In contrast, UDC-VIT accurately captures the actual transmittance decrease and the digital noise that arises when quantizing digital image signals.
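The amplification chain described above can be illustrated with a toy numerical sketch. The simulation below is an illustration with made-up parameters (transmittance, noise levels), not the paper's capture pipeline: it attenuates a scene by a hypothetical panel transmittance, adds shot and read noise, applies digital gain to restore brightness, and quantizes to 8 bits, showing how the lowered transmittance amplifies noise.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_udc_capture(scene, transmittance=0.2, read_noise_std=2.0, full_scale=255):
    """Toy sketch of UDC low-light degradation (hypothetical parameters):
    the display panel cuts the light reaching the sensor, gain is raised
    to compensate, amplifying noise, and the signal is quantized."""
    # Light loss through the display panel (transmittance < 1).
    attenuated = scene * transmittance
    # Shot noise follows the (now weaker) signal; read noise is constant.
    noisy = rng.poisson(attenuated) + rng.normal(0.0, read_noise_std, scene.shape)
    # Digital gain restores brightness but amplifies the noise with it.
    amplified = noisy / transmittance
    # Quantization to 8-bit integer levels adds a final digital-noise step.
    return np.clip(np.round(amplified), 0, full_scale).astype(np.uint8)

scene = np.full((64, 64), 100.0)   # flat mid-gray test patch (noise-free)
captured = simulate_udc_capture(scene)
# Mean brightness is roughly restored, but the patch is now visibly noisy.
print(captured.mean(), captured.std())
```

Lowering the `transmittance` argument makes the noise amplification correspondingly worse, mirroring the excessive degradation observed in the P-OLED dataset.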

Visual comparison

[Figure: VidUDC33K vs. UDC-VIT — ground-truth (GT) and degraded input frames]

Flares

The UDC’s unique flare characteristics

UDC flares arise from the diffraction of light as it passes through the display panel above the camera lens. Thus, it is essential for each frame in a UDC video dataset to precisely depict the UDC's unique flare characteristics, including spatially variant, light-source-variant, and temporally variant flares. The P-OLED dataset rarely exhibits flares, as it captures images displayed on a monitor in a controlled environment.
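The diffraction origin of UDC flare can be illustrated with a toy simulation. The sketch below is an illustration, not the paper's method: it convolves a dark scene containing one bright point light with a hypothetical cross-shaped point spread function (PSF) standing in for the display panel's diffraction pattern. Note that real UDC flares are spatially and temporally variant, which a single fixed PSF cannot capture.

```python
import numpy as np

def spike_psf(size=15):
    """Hypothetical cross-shaped diffraction kernel: the display's
    regular pixel grid spreads a point light into horizontal and
    vertical spikes (a simplified, spatially invariant stand-in)."""
    psf = np.zeros((size, size))
    psf[size // 2, :] = 1.0
    psf[:, size // 2] = 1.0
    return psf / psf.sum()  # normalize so total light energy is conserved

def diffraction_flare(frame, psf):
    """Model flare as convolution of the scene with the panel's
    diffraction PSF (FFT-based, with wrap-around borders)."""
    return np.real(np.fft.ifft2(np.fft.fft2(frame) * np.fft.fft2(psf, frame.shape)))

# A dark night scene with a single bright point light source.
scene = np.zeros((64, 64))
scene[32, 32] = 1000.0
flared = diffraction_flare(scene, spike_psf())
# The point light's energy is redistributed along the diffraction spikes,
# lowering its peak while spreading streaks across the frame.
```

Because the light-source-variant behavior depends on the source's color, intensity, and position, a realistic model would need a different PSF per source and per frame, which is precisely why captured real-world flares matter.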

Visual comparison

[Figure: VidUDC33K vs. UDC-VIT — ground-truth (GT) and degraded input frames]

Face recognition

UDC-VIT stands out from other datasets by featuring videos tailored for face recognition. Some datasets, such as T-OLED/P-OLED, SYNTH, and VidUDC33K, include only limited human representations, often too small or captured from angles unsuitable for face recognition. Zhifeng et al. introduce still-image datasets for face recognition. However, these datasets are generated using a GAN-based model trained on the P-OLED dataset, which does not adequately simulate realistic UDC degradation, notably lacking flare; moreover, they are not publicly available. In contrast, UDC-VIT prominently features humans in 64.6% of its videos (approved by the Institutional Review Board (IRB)), showing various motions (e.g., hand waving, thumbs-up, body swaying, and walking) performed by 22 carefully selected subjects captured from different angles. Users of the UDC-VIT dataset must obtain IRB approval according to their country's laws and use the dataset solely for research.

[Figure: UDC-VIT ground-truth (GT) and input frames — hand-waving and thumbs-up motions]

Implausible and uninformative scenes in a synthetic dataset

The VidUDC33K dataset often presents unrealistic scenarios; two examples are shown below.

For a detailed explanation, please see the supplementary material (pdf).

[Figure: VidUDC33K ground-truth (GT) and input frames — unrealistic scenarios, cases 1 and 2]