## NPTEL Deep Learning – IIT Ropar Week 8 Assignment Answers 2024

1. Which of the following activation functions is not zero-centered?

- Sigmoid
- Tanh
- ReLU
- Softmax

Answer :-

2. We have observed that the sigmoid neuron has become saturated. What might be the possible output values at this neuron?

- 0.02
- 0.5
- 1
- 0.97

Answer :-

3. What is the gradient of the sigmoid function at saturation?

Answer :-
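The gradient in question 3 can be checked numerically. The sigmoid derivative is σ′(x) = σ(x)·(1 − σ(x)), which peaks at 0.25 when x = 0 and tends to zero as |x| grows — i.e., it vanishes at saturation. A minimal sketch:

```python
import math

def sigmoid(x):
    # Logistic function: maps any real input into (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    # Derivative of the sigmoid: sigma(x) * (1 - sigma(x))
    s = sigmoid(x)
    return s * (1.0 - s)

print(sigmoid_grad(0))    # 0.25 -- the maximum, at x = 0
print(sigmoid_grad(10))   # ~4.5e-05 -- saturated, gradient effectively zero
print(sigmoid_grad(-10))  # ~4.5e-05 -- saturated on the negative side
```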

4. Which of the following are common issues caused by saturating neurons in deep networks?

- Vanishing gradients
- Slow convergence during training
- Overfitting
- Increased model complexity

Answer :-

5. What are the challenges associated with using the Tanh(x) activation function?

- It is not zero-centered
- Computationally expensive
- Non-differentiable at 0
- Saturation

Answer :-

6. Which of the following activation functions is preferred to avoid the vanishing gradient problem?

- Sigmoid
- Tanh
- ReLU
- None of these

Answer :-
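The vanishing-gradient contrast behind question 6 is easy to demonstrate: backpropagation multiplies one local gradient per layer, and the sigmoid's local gradient is at most 0.25, so the product shrinks geometrically with depth, while ReLU's local gradient is exactly 1 on the active side. A toy illustration (the depth of 20 is an arbitrary choice):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

depth = 20

# Best case for sigmoid: every pre-activation sits at x = 0,
# where the gradient is maximal (0.25). Even then the product vanishes.
sig_grad = 1.0
for _ in range(depth):
    s = sigmoid(0.0)
    sig_grad *= s * (1.0 - s)  # multiply in 0.25 per layer

# ReLU on its active side contributes a local gradient of exactly 1.
relu_grad = 1.0 ** depth

print(sig_grad)   # 0.25**20, roughly 9e-13
print(relu_grad)  # 1.0
```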

7. Given a neuron initialized with weights w1 = 1.5, w2 = 0.5 and inputs x1 = 0.2, x2 = −0.5, calculate the output of a ReLU neuron.

Answer :-
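The arithmetic in question 7 can be verified directly: the pre-activation is 1.5 × 0.2 + 0.5 × (−0.5) = 0.3 − 0.25 = 0.05, which is positive, so the ReLU passes it through unchanged. A quick check (assuming no bias term, since none is given):

```python
def relu(z):
    # ReLU passes positive inputs through and clamps negatives to 0
    return max(0.0, z)

# Weights and inputs from question 7
w = [1.5, 0.5]
x = [0.2, -0.5]

# Pre-activation: weighted sum of the inputs (no bias term given)
z = sum(wi * xi for wi, xi in zip(w, x))  # 1.5*0.2 + 0.5*(-0.5) = 0.05

print(round(relu(z), 2))  # 0.05
```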

8. What makes batch normalization effective in deep networks?

- It reduces the covariance shift
- It accelerates training
- It introduces regularization
- It reduces the internal shift in activations

Answer :-

9. How does pre-training prevent overfitting in deep networks?

- It adds regularization
- It initializes the weights near local minima
- It constrains the weights to a certain region
- It eliminates the need for fine-tuning

Answer :-

10. We train a feed-forward neural network and notice that all the weights for a particular neuron are equal. What could be the possible causes of this issue?

- Weights were initialized randomly
- Weights were initialized to high values
- Weights were initialized to equal values
- Weights were initialized to zero

Answer :-

11. Which of the following best describes the concept of saturation in deep learning?

- When the activation function output approaches either 0 or 1 and the gradient is close to zero.
- When the activation function output is very small and the gradient is close to zero.
- When the activation function output is very large and the gradient is close to zero.
- None of the above.

Answer :-

12. Which of the following methods can help to avoid saturation in deep learning?

- Using a different activation function.
- Increasing the learning rate.
- Increasing the model complexity
- All of the above.

Answer :-

13. Which of the following is true about the role of unsupervised pre-training in deep learning?

- It is used to replace the need for labeled data
- It is used to initialize the weights of a deep neural network
- It is used to fine-tune a pre-trained model
- It is only useful for small datasets

Answer :-

14. Which of the following is an advantage of unsupervised pre-training in deep learning?

- It helps in reducing overfitting
- Pre-trained models converge faster
- It improves the accuracy of the model
- It requires fewer computational resources

Answer :-

15. What is the main cause of the Dead ReLU problem in deep learning?

- High variance
- High negative bias
- Overfitting
- Underfitting

Answer :-

16. How can you tell if your network is suffering from the Dead ReLU problem?

- The loss function is not decreasing during training
- The accuracy of the network is not improving
- A large number of neurons have zero output
- The network is overfitting to the training data

Answer :-
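The symptom asked about in question 16 can be checked by counting how often a unit outputs exactly zero across the data. A dead ReLU outputs zero for every input, so its gradient is zero everywhere and it can never recover. A toy diagnostic sketch (`dead_fraction` and the random data are illustrative, not from the course):

```python
import random

def relu(z):
    return max(0.0, z)

def dead_fraction(weights, bias, inputs):
    # Fraction of inputs for which this ReLU unit outputs exactly zero.
    # A fraction of 1.0 across the data means the unit is "dead".
    outputs = [relu(sum(w * x for w, x in zip(weights, xs)) + bias)
               for xs in inputs]
    return sum(1 for o in outputs if o == 0.0) / len(outputs)

random.seed(0)
data = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(1000)]

# A large negative bias pushes the pre-activation below zero for every
# input -- the "high negative bias" cause asked about in question 15.
print(dead_fraction([0.5, 0.5, 0.5], bias=-10.0, inputs=data))  # 1.0
print(dead_fraction([0.5, 0.5, 0.5], bias=0.0, inputs=data))    # roughly 0.5
```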

17. What is the mathematical expression for the ReLU activation function?

- f(x) = x if x < 0, 0 otherwise
- f(x) = 0 if x > 0, x otherwise
- f(x) = max(0,x)
- f(x) = min(0,x)

Answer :-

18. What is the main cause of the symmetry breaking problem in deep learning?

- High variance
- High bias
- Overfitting
- Equal initialization of weights

Answer :-
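The mechanism behind question 18 (and question 10) can be shown in a few lines: hidden units initialized with identical weights receive identical gradients, so a gradient step keeps them identical and the units remain redundant copies of each other. A minimal sketch with a one-input tanh layer (`forward_backward` is an illustrative helper, assuming the same upstream gradient reaches both units, as it would with equal output weights):

```python
import math

def tanh_grad(y):
    # Derivative of tanh expressed via its output: 1 - tanh(x)^2
    return 1.0 - y * y

def forward_backward(w_hidden, x, upstream):
    # Each hidden unit j computes tanh(w_j * x); return the gradient of
    # the loss with respect to each hidden weight.
    grads = []
    for w in w_hidden:
        y = math.tanh(w * x)
        grads.append(upstream * tanh_grad(y) * x)
    return grads

# Equal initialization: both units get exactly the same gradient,
# so the symmetry is never broken.
g_equal = forward_backward([0.3, 0.3], x=1.5, upstream=1.0)
print(g_equal[0] == g_equal[1])  # True

# Unequal (e.g. random) initialization breaks the symmetry.
g_random = forward_backward([0.3, -0.7], x=1.5, upstream=1.0)
print(g_random[0] == g_random[1])  # False
```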

19. What is the purpose of Batch Normalization in Deep Learning?

- To improve the generalization of the model
- To reduce overfitting
- To reduce bias in the model
- To ensure that the distribution of the inputs at different layers doesn’t change

Answer :-

20. In Batch Normalization, which parameter is learned during training?

- Mean
- Variance
- γ
- ϵ

Answer :-
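Questions 19 and 20 both come down to the batch-norm transform: each feature is normalized to zero mean and unit variance using batch statistics, then rescaled by a learned γ and shifted by a learned β, while ε is just a fixed constant guarding against division by zero. A minimal sketch for a batch of scalar activations:

```python
import math

def batch_norm(xs, gamma, beta, eps=1e-5):
    # Normalize the batch to zero mean / unit variance, then apply the
    # learned scale (gamma) and shift (beta). gamma and beta are trained
    # by gradient descent; eps is a fixed small constant.
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in xs]

batch = [1.0, 2.0, 3.0, 4.0]
out = batch_norm(batch, gamma=1.0, beta=0.0)

# With gamma = 1, beta = 0 the output has (approximately) zero mean
# and unit variance, keeping the input distribution to the next layer stable.
print(round(sum(out) / len(out), 6))  # 0.0
```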