NPTEL Deep Learning for Computer Vision Week 12 Assignment Answers 2024
1. What is the effect of increasing the guidance scale in classifier-free guidance? (See the sketch below the options.)
- The generated images become more random
- The generated images become more realistic
- The generated images become more stylized
- The generated images become less diverse
Answer :-
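To ground Question 1, here is a minimal sketch of how classifier-free guidance combines the unconditional and conditional noise predictions under one common parameterization (the function name and toy values are illustrative assumptions, not the course's exact formulation):

```python
import numpy as np

def cfg_noise_estimate(eps_uncond, eps_cond, guidance_scale):
    """Blend the unconditional and conditional predictions; the guidance
    scale controls how strongly the conditional direction is weighted."""
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

# Toy one-element predictions, just to show the trend as the scale grows.
eps_uncond = np.array([0.10])
eps_cond = np.array([0.50])
for scale in (0.0, 1.0, 3.0, 7.5):
    print(scale, cfg_noise_estimate(eps_uncond, eps_cond, scale))
```

Larger scales weight the conditioning signal more heavily, which in practice trades sample diversity for stronger adherence to the conditioning input.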
2. In the forward process of a diffusion model, the data is incrementally corrupted by adding noise across T timesteps. If the variance of the noise added at step t is βt, and βt decreases linearly from 0.03 at the first step to 0.02 at the last step over 150 timesteps, calculate the total variance of the noise added over the 150 timesteps. (Hint: compute the average variance per step, then multiply by the number of steps. A quick check of this arithmetic follows below the options.)
- 3.25
- 4.5
- 7.5
- 3.75
Answer :-
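Following the hint in Question 2, a quick arithmetic check (the linear schedule below simply mirrors the question's description):

```python
import numpy as np

# Per-step variance decreasing linearly from 0.03 to 0.02 over 150 steps.
betas = np.linspace(0.03, 0.02, 150)
avg_variance = betas.mean()            # average variance per step
total_variance = avg_variance * 150    # average per step times number of steps
print(avg_variance, total_variance)
```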
3. Consider the reverse process in a diffusion model, where the goal is to reconstruct the original data from the noise. If the model correctly reduces the noise variance by 0.02 in each reverse step and starts from a noise variance of 1.0 at timestep T=50, how many steps are required to reduce the noise variance to 0.1? (A quick check of this arithmetic follows below the options.)
- 45
- 50
- 40
- 30
Answer :-
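A quick arithmetic check for Question 3 (treating the variance reduction as a fixed 0.02 per reverse step, exactly as stated):

```python
start_variance, target_variance, reduction_per_step = 1.0, 0.1, 0.02
steps = round((start_variance - target_variance) / reduction_per_step)
print(steps)  # reverse steps needed to go from variance 1.0 down to 0.1
```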
4. Classifier-free guidance is a technique used in diffusion models to improve sample quality without the explicit use of a classifier. It works by modifying the sampling process based on a control parameter. If this control parameter, denoted γ, is set to zero, what effect does this have on the generation process? (See the note below the options.)
- It fails to generate realistic samples
- It removes all guidance, effectively making the process equivalent to unconditional generation
- It maximizes the influence of the classifier, leading to highly detailed generations
- It can lead to more diverse samples compared to higher values of γ, as the generation process is less constrained by the conditional information
Answer :-
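Relating Question 4 back to the sketch under Question 1: with that parameterization, setting the control parameter to zero simply drops the conditional term (conventions for γ differ between papers, so treat this as one reading of the setup):

```python
eps_uncond, eps_cond, gamma = 0.10, 0.50, 0.0
combined = eps_uncond + gamma * (eps_cond - eps_uncond)
assert combined == eps_uncond  # with gamma = 0 only the unconditional prediction remains
```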
5. Which of the following statements are FALSE for self-supervised learning (SSL) techniques? (See the SimCLR sketch below the options.)
(Select ALL possible correct options)
- Bootstrap Your Own Latent (BYOL) method does not depend on negative samples to achieve state-of-the-art results
- MoCo maintains the dictionary as a stack of data samples, thus enabling the use of encoded keys from the immediately preceding mini-batches
- In SimCLR, the number of negative samples is limited by batch size
- In image rotation-based SSL, the task typically involves generating the correct image for the given rotated input image
- In the image inpainting task, the goal is to fill the gaps of an image based on surrounding information
Answer :-
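One of the statements in Question 5 concerns where SimCLR's negatives come from; the sketch below (names and counts are illustrative) makes the batch-size dependence concrete:

```python
def simclr_negative_count(batch_size):
    # SimCLR builds two augmented views per image, giving 2N embeddings.
    # For each anchor, the positive is its other view and the negatives are
    # the remaining 2N - 2 embeddings from the *same* mini-batch.
    return 2 * batch_size - 2

for n in (32, 256, 1024):
    print(n, simclr_negative_count(n))
```

MoCo, by contrast, keeps a queue of encoded keys from preceding mini-batches, which decouples the number of negatives from the batch size.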
6. What is the purpose of the reverse process in DDPMs?
- To add noise to the image
- To remove noise from the image
- To generate new images
Answer :-
7. What is the purpose of the forward process in DDPMs?
- To add noise to the image
- To remove noise from the image
- To generate new images
Answer :-
8. What is the diffusion process in a diffusion model? (The sketch below the options illustrates Questions 6-8.)
- A stochastic process that gradually adds noise to an image
- A deterministic process that gradually removes noise from an image
- A neural network-based process that generates images
- A generative adversarial network-based process that generates images
Answer :-
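A compact sketch covering Questions 6-8: the forward process gradually corrupts an image with Gaussian noise according to a variance schedule, and the reverse process tries to undo that corruption step by step. The toy denoising step below uses the true noise, so it inverts exactly; a real DDPM replaces it with a network's noise prediction (names and the simplified update are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
T = 1000
betas = np.linspace(1e-4, 0.02, T)  # per-step noise-variance schedule

def forward_step(x_prev, t, noise):
    """Forward (diffusion) step: q(x_t | x_{t-1}) = N(sqrt(1 - beta_t) * x_{t-1}, beta_t * I)."""
    return np.sqrt(1.0 - betas[t]) * x_prev + np.sqrt(betas[t]) * noise

def reverse_step(x_t, t, predicted_noise):
    """Reverse (denoising) step: subtract the predicted noise and rescale.
    Simplified; the full DDPM update also involves the cumulative alpha terms."""
    return (x_t - np.sqrt(betas[t]) * predicted_noise) / np.sqrt(1.0 - betas[t])

x0 = rng.standard_normal((8, 8))      # stand-in for an image
noise = rng.standard_normal(x0.shape)
x1 = forward_step(x0, 0, noise)       # forward: add noise
x0_hat = reverse_step(x1, 0, noise)   # reverse: remove it (here with the true noise)
print(np.allclose(x0, x0_hat))        # True: this single step inverts exactly
```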
9. What is the primary goal of CLIP?
- To generate images from text descriptions
- To translate text into different languages
- To learn a joint embedding space for text and images
- To perform image classification
Answer :-
10. Which technique does CLIP use to learn a joint embedding space? (See the sketch below the options.)
- Reinforcement learning
- Supervised learning
- Contrastive learning
- Unsupervised learning
Answer :-
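For Questions 9 and 10: a minimal sketch of the contrastive objective CLIP uses to pull matching image and text embeddings together in a joint space (the random embeddings and the 0.07 temperature here are placeholders; in CLIP the embeddings come from an image encoder and a text encoder):

```python
import numpy as np

rng = np.random.default_rng(0)
N, D = 4, 16                                        # 4 image-text pairs, 16-dim embeddings
img = rng.standard_normal((N, D))
txt = rng.standard_normal((N, D))
img /= np.linalg.norm(img, axis=1, keepdims=True)   # L2-normalize both modalities
txt /= np.linalg.norm(txt, axis=1, keepdims=True)

logits = img @ txt.T / 0.07                         # scaled cosine similarities
labels = np.arange(N)                               # the i-th image matches the i-th caption

def cross_entropy(logits, labels):
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

# Symmetric contrastive loss: match images to captions and captions to images.
loss = 0.5 * (cross_entropy(logits, labels) + cross_entropy(logits.T, labels))
print(loss)
```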
11. Which of the following is a common self-supervised learning task?
- Image inpainting
- Image colorization
- Image denoising
- All of the above
Answer :-
12. What task does BLIP primarily excel at?
- Image captioning
- Image classification
- Text-to-image generation
- Object detection
Answer :-