NPTEL Deep Learning – IIT Ropar Week 8 Assignment Answers 2024
1. Which of the following activation functions is not zero-centered?
- Sigmoid
- Tanh
- ReLU
- Softmax
Answer :-
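For intuition, here is a minimal NumPy sketch (not part of the assignment) comparing the output ranges; softmax behaves like the sigmoid in this respect, since its outputs also lie in (0, 1):

```python
import numpy as np

x = np.linspace(-5, 5, 11)

sigmoid = 1 / (1 + np.exp(-x))   # outputs in (0, 1): always positive, so not zero-centered
tanh = np.tanh(x)                # outputs in (-1, 1): symmetric about 0, zero-centered
relu = np.maximum(0, x)          # outputs in [0, inf): never negative

print(sigmoid.min(), sigmoid.max())   # ~0.0067, ~0.9933
print(tanh.min(), tanh.max())         # ~-0.9999, ~0.9999
print(relu.min(), relu.max())         # 0.0, 5.0
```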
2. We have observed that the sigmoid neuron has become saturated. What might be the possible output values at this neuron?
- 0.02
- 0.5
- 1
- 0.97
Answer :-
3. What is the gradient of the sigmoid function at saturation?
Answer :-
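As a hint toward the reasoning, recall that the sigmoid derivative is σ′(x) = σ(x)(1 − σ(x)); when the output saturates near 0 or 1, that product approaches 0. A small numeric check (a sketch with arbitrarily chosen inputs):

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1 - s)          # derivative: sigma(x) * (1 - sigma(x))

print(sigmoid_grad(0.0))        # 0.25  (the maximum, at x = 0)
print(sigmoid_grad(10.0))       # ~4.5e-05 (saturated near 1: gradient ~ 0)
print(sigmoid_grad(-10.0))      # ~4.5e-05 (saturated near 0: gradient ~ 0)
```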
4. Which of the following are common issues caused by saturating neurons in deep networks?
- Vanishing gradients
- Slow convergence during training
- Overfitting
- Increased model complexity
Answer :-
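A toy illustration of the vanishing-gradient effect (assumed pre-activation value, not from the assignment): backpropagating through a stack of sigmoid layers multiplies the gradient by σ′(z) ≤ 0.25 at every layer, so the signal reaching early layers shrinks geometrically and training slows down:

```python
import math

def sigmoid_grad(z):
    s = 1 / (1 + math.exp(-z))
    return s * (1 - s)           # at most 0.25, much smaller once saturated

grad = 1.0                       # gradient arriving from the loss
for layer in range(10):
    grad *= sigmoid_grad(2.0)    # ~0.105 contributed by each sigmoid layer
print(grad)                      # ~1.6e-10 after only 10 layers
```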
5. What are the challenges associated with using the Tanh(x) activation function?
- It is not zero-centered
- Computationally expensive
- Non-differentiable at 0
- Saturation
Answer :-
6. Which of the following activation functions is preferred to avoid the vanishing gradient problem?
- Sigmoid
- Tanh
- ReLU
- None of these
Answer :-
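Reusing the 10-layer chain from the sketch under question 4, but with ReLU (again only the activation-function side of the story; the weights also matter in a real network):

```python
def relu_grad(z):
    return 1.0 if z > 0 else 0.0   # derivative of max(0, z)

grad = 1.0
for layer in range(10):
    grad *= relu_grad(2.0)         # exactly 1 for any positive pre-activation
print(grad)                        # 1.0: no decay from the activation itself
```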
7. Given a neuron initialized with weights w1=1.5, w2=0.5, and inputs x1=0.2, x2=−0.5, calculate the output of a ReLU neuron.
Answer :-
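A worked calculation, assuming the neuron has no bias term (none is given in the question):

```python
w1, w2 = 1.5, 0.5
x1, x2 = 0.2, -0.5

z = w1 * x1 + w2 * x2   # 1.5*0.2 + 0.5*(-0.5) = 0.30 - 0.25 = 0.05
y = max(0.0, z)         # ReLU output: max(0, z)
print(z, y)             # both ~0.05
```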
8. What makes batch normalization effective in deep networks?
- It reduces the covariate shift
- It accelerates training
- It introduces regularization
- It reduces the internal shift in activations
Answer :-
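For intuition, a minimal NumPy sketch of the normalization step (illustrative values): each feature is standardized using statistics of the current mini-batch, which keeps the distribution of activations stable across training steps:

```python
import numpy as np

# Mini-batch of pre-activations: 4 examples x 3 features (illustrative values).
x = np.array([[1.0, 200.0, -3.0],
              [2.0, 180.0, -1.0],
              [0.5, 210.0, -2.0],
              [1.5, 190.0, -4.0]])

mean = x.mean(axis=0)                      # per-feature mean over the batch
var = x.var(axis=0)                        # per-feature variance over the batch
x_hat = (x - mean) / np.sqrt(var + 1e-5)   # standardized activations

print(x_hat.mean(axis=0))                  # ~0 for every feature
print(x_hat.std(axis=0))                   # ~1 for every feature
```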
9. How does pre-training prevent overfitting in deep networks?
- It adds regularization
- It initializes the weights near local minima
- It constrains the weights to a certain region
- It eliminates the need for fine-tuning
Answer :-
10. We train a feed-forward neural network and notice that all the weights for a particular neuron are equal. What could be the possible causes of this issue?
- Weights were initialized randomly
- Weights were initialized to high values
- Weights were initialized to equal values
- Weights were initialized to zero
Answer :-
11. Which of the following best describes the concept of saturation in deep learning?
- When the activation function output approaches either 0 or 1 and the gradient is close to zero.
- When the activation function output is very small and the gradient is close to zero.
- When the activation function output is very large and the gradient is close to zero.
- None of the above.
Answer :-
12. Which of the following methods can help to avoid saturation in deep learning?
- Using a different activation function.
- Increasing the learning rate.
- Increasing the model complexity
- All of the above.
Answer :-
13. Which of the following is true about the role of unsupervised pre-training in deep learning?
- It is used to replace the need for labeled data
- It is used to initialize the weights of a deep neural network
- It is used to fine-tune a pre-trained model
- It is only useful for small datasets
Answer :-
14. Which of the following is an advantage of unsupervised pre-training in deep learning?
- It helps in reducing overfitting
- Pre-trained models converge faster
- It improves the accuracy of the model
- It requires fewer computational resources
Answer :-
15. What is the main cause of the Dead ReLU problem in deep learning?
- High variance
- High negative bias
- Overfitting
- Underfitting
Answer :-
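A toy sketch (with made-up weights and inputs) of how a large negative bias can "kill" a ReLU unit:

```python
import numpy as np

# For typical inputs the pre-activation stays negative, the output is always 0,
# and the gradient through the unit is 0, so its weights never get updated.
w = np.array([0.3, -0.2, 0.5])
b = -10.0                                  # large negative bias (illustrative)

inputs = np.random.randn(1000, 3)          # roughly unit-scale inputs
z = inputs @ w + b                         # pre-activations
out = np.maximum(0, z)                     # ReLU outputs

print((out == 0).mean())                   # 1.0: the unit outputs 0 on every example
```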
16. How can you tell if your network is suffering from the Dead ReLU problem?
- The loss function is not decreasing during training
- The accuracy of the network is not improving
- A large number of neurons have zero output
- The network is overfitting to the training data
Answer :-
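One practical check, sketched below with synthetic activations (assuming you can read out a layer's ReLU outputs for a batch), is to count units that are zero on every example:

```python
import numpy as np

def dead_relu_fraction(activations):
    # activations: (num_examples, num_units) ReLU outputs of one layer.
    # A unit that is zero on every example is a "dead" candidate.
    dead = np.all(activations == 0, axis=0)
    return dead.mean()

# Synthetic example: strongly negative pre-activations make most units silent.
acts = np.maximum(0, np.random.randn(256, 100) - 4.0)
print(dead_relu_fraction(acts))   # close to 1.0 here; a healthy layer is far lower
```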
17. What is the mathematical expression for the ReLU activation function?
- f(x) = x if x < 0, 0 otherwise
- f(x) = 0 if x > 0, x otherwise
- f(x) = max(0,x)
- f(x) = min(0,x)
Answer :-
18. What is the main cause of the symmetry breaking problem in deep learning?
- High variance
- High bias
- Overfitting
- Equal initialization of weights
Answer :-
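A small sketch of why equal initialization causes the problem (assumed two-unit hidden layer): units that start with identical weights compute identical outputs, receive identical gradients, and therefore never diverge from each other:

```python
import numpy as np

np.random.seed(0)
x = np.random.randn(5, 3)          # batch of 5 inputs, 3 features
W = np.ones((3, 2)) * 0.5          # both hidden units initialized identically

h = np.maximum(0, x @ W)           # ReLU hidden layer: both columns identical
upstream = np.ones_like(h)         # pretend gradient from the next layer
grad_W = x.T @ (upstream * (h > 0))

print(np.allclose(grad_W[:, 0], grad_W[:, 1]))  # True: both units get the same update
```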
19. What is the purpose of Batch Normalization in Deep Learning?
- To improve the generalization of the model
- To reduce overfitting
- To reduce bias in the model
- To ensure that the distribution of the inputs at different layers doesn’t change
Answer :-
20. In Batch Normalization, which parameter is learned during training?
- Mean
- Variance
- γ
- ϵ
Answer :-
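A sketch of the full batch-norm transform (assumed shapes, not the assignment's notation): the batch mean and variance are computed from the data, ε is a fixed constant, and γ (scale) together with β (shift) are the parameters learned by gradient descent:

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    # mean and variance are statistics of the current batch, not learned
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)   # eps is a fixed constant
    # gamma and beta are the learnable parameters (updated by backprop)
    return gamma * x_hat + beta

x = np.random.randn(8, 4)                 # batch of 8 examples, 4 features
gamma = np.ones(4)                        # initialized to 1, then learned
beta = np.zeros(4)                        # initialized to 0, then learned
print(batch_norm(x, gamma, beta).shape)   # (8, 4)
```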