NPTEL Computer Vision Week 12 Assignment Answers 2024

1. Which of the following statements are true for Pooling?

a) It is common to use zero padding in pooling layers.
b) It progressively reduces the spatial size.
c) It operates over each activation map independently.
d) It partitions the image into overlapping rectangles, taking the maximum (MaxPool) or average (AvgPool) value from each of these sub-regions.

Answer :-
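To make the pooling statements concrete, here is a minimal sketch of 2 × 2 max pooling with stride 2 in plain Python. The two 4 × 4 activation maps and the name `max_pool2d` are made up for illustration; they are not part of the assignment. The sketch assumes no padding and non-overlapping windows, which is the usual configuration.

```python
def max_pool2d(channel, k=2, stride=2):
    """Max-pool one 2D activation map (a list of rows) with a k x k window.

    Assumes no padding; windows do not overlap when stride == k.
    """
    rows = (len(channel) - k) // stride + 1
    cols = (len(channel[0]) - k) // stride + 1
    return [
        [
            max(
                channel[r * stride + dr][c * stride + dc]
                for dr in range(k)
                for dc in range(k)
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Two hypothetical 4 x 4 activation maps (channels) of one feature volume.
feature_maps = [
    [[1, 3, 2, 0],
     [4, 6, 1, 2],
     [7, 2, 9, 4],
     [1, 0, 3, 5]],
    [[0, 1, 0, 1],
     [2, 0, 3, 0],
     [1, 1, 1, 1],
     [5, 0, 0, 8]],
]

# Each channel is pooled on its own, so the channel count stays at 2
# while the spatial size shrinks from 4 x 4 to 2 x 2.
pooled = [max_pool2d(ch) for ch in feature_maps]
print(pooled)  # [[[6, 2], [7, 9]], [[2, 3], [5, 8]]]
```

The function has no weights to learn; it only selects values from each window.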

2. Up-convolve the following input x using the filter h. What is the value at (2, 2) in the up-convolved feature map? Assume the first cell of any 2D matrix has index (0, 0).

Given: x(0, 0) = 1, x(0, 1) = 2, x(1, 0) = 3, x(1, 1) = 1.
h(0, 0) = 0, h(0, 1) = 1, h(0, 2) = 0, h(1, 0) = 2, h(1, 1) = 2, h(1, 2) = 1, h(2, 0) = 0, h(2, 1) = 1, h(2, 2) = 0.

Answer :-
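The question does not state a stride, so the following is only a sketch of up-convolution (transposed convolution) with the stride left as a parameter. It assumes no padding and no cropping of the output: each input value scales a copy of h, and the copies are summed where they overlap. The function name `up_convolve` is hypothetical.

```python
def up_convolve(x, h, stride=1):
    """Transposed convolution of 2D input x with 2D filter h.

    Assumptions (not given in the question): no padding, no output
    cropping, and the stride passed in by the caller.
    """
    in_h, in_w = len(x), len(x[0])
    k_h, k_w = len(h), len(h[0])
    out_h = (in_h - 1) * stride + k_h
    out_w = (in_w - 1) * stride + k_w
    out = [[0] * out_w for _ in range(out_h)]
    for i in range(in_h):
        for j in range(in_w):
            # Stamp x[i][j] * h onto the output at offset (i*stride, j*stride).
            for a in range(k_h):
                for b in range(k_w):
                    out[i * stride + a][j * stride + b] += x[i][j] * h[a][b]
    return out


x = [[1, 2],
     [3, 1]]
h = [[0, 1, 0],
     [2, 2, 1],
     [0, 1, 0]]

print(up_convolve(x, h, stride=1)[2][2])  # 7  (4 x 4 output)
print(up_convolve(x, h, stride=2)[2][2])  # 0  (5 x 5 output)
```

With stride 1, the cell (2, 2) collects x(0,1)·h(2,1) + x(1,0)·h(1,2) + x(1,1)·h(1,1) = 2 + 3 + 2 = 7, since x(0,0)·h(2,2) contributes 0. Which stride the assignment intends is not recoverable from the text here.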

3. The softmax predictions for four different classes are [0.1 0.3 0.2 0.4]. Calculate the cross-entropy loss for this sample (use the natural logarithm) if the true labels are [0 0 0 1]. Round the answer to 4 decimal places.

Answer :-
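With a one-hot label, the cross-entropy sum reduces to the negative log of the probability assigned to the true class, here -ln(0.4). A minimal check in Python (the helper name `cross_entropy` is illustrative):

```python
import math


def cross_entropy(probs, one_hot):
    """Cross-entropy -sum(t * ln(p)) using the natural logarithm.

    Terms with a zero label are skipped; they contribute nothing.
    """
    return -sum(t * math.log(p) for p, t in zip(probs, one_hot) if t > 0)


loss = cross_entropy([0.1, 0.3, 0.2, 0.4], [0, 0, 0, 1])
print(round(loss, 4))  # 0.9163
```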

4. What is the main problem in stacking multiple layers to learn a large network with limited data?

a) over-fitting
b) high bias of the model
c) vanishing/exploding gradients
d) possibility of data augmentation

Answer :-

5. What is the dimension of the activation map?

a) 15 × 11 × 64
b) 15 × 11 × 3
c) 9 × 7 × 64
d) 9 × 7 × 3

Answer :-

6. Calculate the number of parameters in the convolutional layer.

Answer :-

7. Assume that the convolutional layer is followed by a MaxPool layer with kernel size 2 × 2. Calculate the number of parameters in the pooling layer.

Answer :-

8. Consider a convolutional layer with 16 input channels. Kernels of size 3 × 3 are used to generate the feature maps, and the layer uses 32 kernels. Calculate the number of parameters.

Answer :-
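Each of the 32 kernels spans all 16 input channels, so it holds 3 × 3 × 16 weights. Whether to count one bias per kernel is not stated in the question, so the sketch below exposes it as a flag; the function name `conv2d_param_count` is hypothetical.

```python
def conv2d_param_count(in_channels, out_channels, k_h, k_w, bias=True):
    """Learnable parameters of a standard 2D convolutional layer.

    Each output kernel has k_h * k_w * in_channels weights, plus one
    bias term if bias is True (an assumption the question leaves open).
    """
    weights = k_h * k_w * in_channels * out_channels
    return weights + (out_channels if bias else 0)


print(conv2d_param_count(16, 32, 3, 3, bias=True))   # 4640
print(conv2d_param_count(16, 32, 3, 3, bias=False))  # 4608
```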

9. Which of the following statements are true?

a) Fast-RCNN selects fewer region proposals from a given image than RCNN.
b) In Fast-RCNN, region proposals are prepared from the convolutional feature map of a given image.
c) RCNN computes CNN features on the whole input image.
d) Region proposal network (RPN) is trained to produce region proposals directly after the last convolutional layer in Faster-RCNN.

Answer :-

10. In an object detection problem, the time taken to generate all proposals is 0.75, and 4 proposals are generated. What is the inference time of a Faster RCNN if the convolution time and the time taken by the fully connected layer are 2.25 and 1.75, respectively? Round the answer to 2 decimal places.
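The question does not say whether the fully connected time is per proposal or in total. The sketch below assumes one common timing model: the convolutional backbone runs once on the whole image, the proposals are generated once, and the fully connected head runs once per proposal. Both this model and the function name `faster_rcnn_inference_time` are assumptions, not something the assignment states.

```python
def faster_rcnn_inference_time(conv_time, proposal_time, fc_time, num_proposals):
    """Inference time under an ASSUMED timing model:

    shared convolution (once) + proposal generation (once)
    + fully connected head (once per proposal).
    """
    return conv_time + proposal_time + num_proposals * fc_time


total = faster_rcnn_inference_time(
    conv_time=2.25, proposal_time=0.75, fc_time=1.75, num_proposals=4
)
print(f"{total:.2f}")  # 10.00 under the assumed model
```

If the 1.75 is instead meant as the total fully connected time for all proposals, the same terms sum to 4.75.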