## NPTEL Deep Learning – IIT Ropar Week Assignment Answers 2024

Common Data Q1-Q2

Consider two models:

f̂_{1}(x) = w_{0} + w_{1}x

f̂_{2}(x) = w_{0} + w_{1}x + w_{2}x^{2} + w_{3}x^{3} + w_{4}x^{4} + w_{5}x^{5}

1. Which of these models has higher complexity?

- f̂_{1}(x)
- f̂_{2}(x)
- It is not possible to decide without knowing the true distribution of data points in the dataset.

Answer :-

2. We generate the data using the following model:

y = 5x^{3} + 2x^{2} + x + 3.

We fit the two models f̂_{1}(x) and f̂_{2}(x) to this data and train them using a neural network.

- f̂_{1}(x) has a higher bias than f̂_{2}(x).
- f̂_{1}(x) has a higher variance than f̂_{2}(x).
- f̂_{2}(x) has a higher bias than f̂_{1}(x).
- f̂_{2}(x) has a higher variance than f̂_{1}(x).

Answer :-

Common Data Q3-Q6

Consider a function L(w,b)=0.5w^{2}+5b^{2}+1 and its contour plot given below:

3. What is the value of L(w^{∗}, b^{∗}), where w^{∗} and b^{∗} are the values that minimize the function?

Answer :-

4. What is the sum of the elements of ∇L(w^{∗}, b^{∗})?

Answer :-

5. What is the determinant of H_{L}(w^{∗}, b^{∗}), where H_{L} is the Hessian of the function?

Answer :-

6. Compute the eigenvalues and eigenvectors of the Hessian. According to the eigenvalues of the Hessian, which parameter is the loss more sensitive to?

- b
- w

Answer :-
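Since L(w, b) = 0.5w² + 5b² + 1 is given explicitly, the quantities asked about in Q3–Q6 can be worked out directly. A minimal sketch: the gradient of this quadratic is (w, 10b), so the minimizer is (w^{∗}, b^{∗}) = (0, 0), and the Hessian is the constant diagonal matrix diag(1, 10).

```python
import numpy as np

def L(w, b):
    return 0.5 * w**2 + 5 * b**2 + 1

# The quadratic is minimized where the gradient (w, 10b) vanishes
w_star, b_star = 0.0, 0.0
grad = np.array([w_star, 10 * b_star])  # gradient at the minimum

# Hessian of 0.5w^2 + 5b^2 + 1 is constant: d2L/dw2 = 1, d2L/db2 = 10
H = np.array([[1.0, 0.0],
              [0.0, 10.0]])

eigvals, eigvecs = np.linalg.eigh(H)
print("L(w*, b*)      =", L(w_star, b_star))
print("sum of grad    =", grad.sum())
print("det(H)         =", np.linalg.det(H))
print("eigenvalues    =", eigvals)
```

The larger eigenvalue (10, paired with the b direction) indicates the parameter along which the loss curves most steeply, i.e. the one the loss is more sensitive to.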

7. Suppose that a model produces zero training error. What happens if we use L_{2} regularization, in general?

- It might increase training error
- It might decrease test error
- It might decrease training error
- Reduce the complexity of the model by driving less important weights close to zero

Answer :-
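To make the effect in Q7 concrete, here is a minimal sketch (not from the assignment) of what an L₂ penalty does: the objective becomes data loss + λ‖w‖², and the penalty's gradient 2λw shrinks every weight multiplicatively each step ("weight decay"), so even a model with zero training error is pulled away from its unregularized optimum.

```python
import numpy as np

def l2_regularized_loss(data_loss, w, lam):
    # Total objective: original loss plus lambda * ||w||^2.
    # Even at zero training error, minimizing this can raise training
    # error slightly while (often) lowering test error.
    return data_loss + lam * np.sum(w**2)

w = np.array([3.0, -2.0, 0.01])
lam, lr = 0.1, 0.5
for _ in range(100):
    # Gradient of the penalty term alone is 2*lam*w -> each step
    # multiplies w by (1 - 2*lr*lam), shrinking it toward zero
    w = w - lr * 2 * lam * w
print(w)
```

After 100 steps every weight has decayed by the same factor, which is why L₂ drives less important weights close to (but not exactly) zero.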

8. Suppose that we apply Dropout regularization to a feedforward neural network, and that the mini-batch gradient descent algorithm is used for updating the parameters of the network. Choose the correct statement(s) from the following.

- The dropout probability p can be different for each hidden layer
- Batch gradient descent cannot be used to update the parameters of the network
- Dropout with p = 0.5 acts as an ensemble regularizer
- The weights of the neurons that were dropped during forward propagation at the t-th iteration will not get updated during the (t+1)-th iteration

Answer :-
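A minimal sketch of the mechanism Q8 asks about, using the common "inverted dropout" formulation (an illustration, not the assignment's code): each unit is dropped independently with probability p during training, survivors are rescaled by 1/(1 − p), and at test time the layer is left untouched. Note the function takes p as an argument, so nothing prevents a different p per hidden layer.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(h, p, training=True):
    """Inverted dropout: drop each unit with probability p,
    scale the survivors by 1/(1-p) so expected activation is unchanged."""
    if not training:
        return h                        # test time: use the full network
    mask = rng.random(h.shape) >= p     # keep each unit with probability 1-p
    return h * mask / (1.0 - p)

h = np.ones((4, 8))            # activations of one hidden layer
out = dropout(h, p=0.5)        # a fresh random subnetwork each call
print(out)
```

Because a fresh mask is sampled every forward pass, each mini-batch effectively trains a different thinned subnetwork, which is the sense in which dropout behaves like an ensemble.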

9. We have trained four different models on the same dataset using various hyperparameters. The training and validation errors for each model are provided below. Based on this information, which model is likely to perform best on the test dataset?

- Model 1
- Model 2
- Model 3
- Model 4

Answer :-

10. Consider the problem of recognizing a letter (in upper case or lower case) of the English language in an image. There are 26 letters in the language, so a team decided to use a CNN to solve this problem. Suppose that the data augmentation technique is being used for regularization. Which of the following transformation(s) on all the training images is (are) appropriate to the problem?

- Rotating the images by ±10^{∘}
- Rotating the images by ±180^{∘}
- Translating the image by 1 pixel in all directions
- Cropping

Answer :-
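A sketch of the label-preserving augmentations Q10 contrasts, assuming SciPy is available (`scipy.ndimage.rotate` / `shift`); the image and parameters here are illustrative stand-ins, not from the assignment. Small rotations and 1-pixel shifts keep a letter recognizable, whereas a ±180° rotation can change the label itself (e.g. "b" becomes "q").

```python
import numpy as np
from scipy.ndimage import rotate, shift

rng = np.random.default_rng(0)

def augment(img, max_angle=10, max_shift=1):
    """Label-preserving augmentation for letter images:
    a small random rotation plus a 1-pixel translation."""
    angle = rng.uniform(-max_angle, max_angle)
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    out = rotate(img, angle, reshape=False, mode="nearest")
    return shift(out, (float(dy), float(dx)), mode="nearest")

img = rng.random((28, 28))   # stand-in 28x28 grayscale letter image
aug = augment(img)
print(aug.shape)
```

Keeping `reshape=False` preserves the input shape, so augmented images can be fed to the same CNN input layer as the originals.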