## NPTEL Introduction to Machine Learning Week 5 Assignment Answers 2024

1. Given a 3-layer neural network that takes 10 inputs, has 5 hidden units, and produces 10 outputs, how many parameters are present in this network?

- 115
- 500
- 25
- 100

Answer :-
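The count in question 1 can be checked with a short sketch. It assumes the network is fully connected and that every hidden and output unit has a bias term. The question states neither assumption, and `count_parameters` is a hypothetical helper name.

```python
def count_parameters(layer_sizes, use_bias=True):
    """Count weights (and optional biases) in a fully connected network."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out      # one weight per input-output pair
        if use_bias:
            total += n_out         # one bias per unit in the next layer
    return total

# Layer sizes from the question: 10 inputs, 5 hidden units, 10 outputs.
with_bias = count_parameters([10, 5, 10])
without_bias = count_parameters([10, 5, 10], use_bias=False)
print(with_bias, without_bias)
```

Under these assumptions the weights contribute 10×5 + 5×10 and the biases contribute 5 + 10. Dropping the biases gives the weight-only count.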

2. Recall the XOR (tabulated below) example from class, where we transformed the features to make the data linearly separable. Which of the following transformations can also work?

- Rotating $x_1$ and $x_2$ by a fixed angle.
- Adding a third dimension $z = x \cdot y$
- Adding a third dimension $z = x^2 + y^2$
- None of the above

Answer :-
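One way to test a candidate transformation from question 2 is to check whether a single linear threshold on the transformed features reproduces XOR. The sketch below assumes the inputs are encoded as 0/1 and uses hand-picked weights. Both are illustrative assumptions, not necessarily the setup used in class.

```python
# XOR truth table, assuming a 0/1 encoding of the inputs.
points = [(0, 0), (0, 1), (1, 0), (1, 1)]
xor_labels = [0, 1, 1, 0]

def linear_predict(features, weights, bias):
    """Return 1 if the weighted sum plus bias is positive, else 0."""
    score = sum(w * f for w, f in zip(weights, features)) + bias
    return 1 if score > 0 else 0

# Add the third dimension z = x * y.
with_product = [(x, y, x * y) for x, y in points]

# Hand-picked separating plane: x + y - 2z - 0.5 > 0 (hypothetical weights).
product_preds = [linear_predict(p, (1, 1, -2), -0.5) for p in with_product]
print(product_preds)

# With 0/1 inputs, x**2 + y**2 equals x + y, so it is already a linear
# function of the original features and adds no new information.
sum_of_squares = [x**2 + y**2 for x, y in points]
print(sum_of_squares == [x + y for x, y in points])
```

A rotation is itself a linear map, so it cannot change whether two classes are linearly separable.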

3. We use several techniques to ensure the weights of the neural network are small (such as random initialization around 0 or regularisation). What conclusions can we draw if weights of our ANN are high?

- (a) Model has overfitted.
- (b) It was initialized incorrectly.
- At least one of (a) or (b).
- None of the above.

Answer :-

4. In a basic neural network, which of the following is generally considered a good initialization strategy for the weights?

- Initialize all weights to zero
- Initialize all weights to a constant non-zero value (e.g., 0.5)
- Initialize weights randomly with small values close to zero
- Initialize weights with large random values (e.g., between -10 and 10)

Answer :-
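The effect of the initialization choices in question 4 can be seen on a toy network. The sketch below assumes a 2-input, 2-hidden-unit (tanh), 1-output network trained on squared error. The architecture, the data point, and the "small" weight values are all hypothetical, chosen only for illustration.

```python
import math

def hidden_gradients(w_hidden, w_out, x, target):
    """Gradients of 0.5 * (y - target)**2 w.r.t. each hidden unit's weights
    for a tiny 2-input, 2-hidden-unit (tanh), linear-output network."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    y = sum(wo * hi for wo, hi in zip(w_out, h))
    err = y - target
    grads = []
    for wo, hi in zip(w_out, h):
        delta = err * wo * (1 - hi ** 2)   # backpropagate through tanh
        grads.append([delta * xi for xi in x])
    return grads

x, target = [1.0, 2.0], 1.0

# Constant initialization: every weight is 0.5.
const_grads = hidden_gradients([[0.5, 0.5], [0.5, 0.5]], [0.5, 0.5], x, target)

# Small, distinct values standing in for a random draw near zero.
small_grads = hidden_gradients([[0.01, -0.02], [0.03, 0.01]], [0.02, -0.01], x, target)

print(const_grads[0] == const_grads[1])   # hidden units receive identical updates
print(small_grads[0] == small_grads[1])   # hidden units receive different updates
```

With constant weights the two hidden units start identical and receive identical gradients, so they stay copies of each other. Distinct small values break that symmetry.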

5. Which of the following is the primary reason for rescaling input features before passing them to a neural network?

- To increase the complexity of the model
- To ensure all input features contribute equally to the initial learning process
- To reduce the number of parameters in the network
- To eliminate the need for activation functions

Answer :-
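As an illustration of the rescaling in question 5, the sketch below standardizes two features to zero mean and unit variance. Standardization is only one common rescaling choice, and the feature names and values are hypothetical.

```python
import statistics

def standardize(column):
    """Rescale a feature to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    return [(v - mean) / std for v in column]

# Hypothetical features on very different scales.
age_years = [20.0, 30.0, 40.0, 50.0]
income = [20000.0, 30000.0, 40000.0, 50000.0]

age_scaled = standardize(age_years)
income_scaled = standardize(income)
print(age_scaled)
print(income_scaled)
```

Before rescaling, the income column is a thousand times larger than the age column and would dominate the weighted sums in the first layer. After rescaling, both features span the same range.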

6.

Answer :-

7. Why do we often use log-likelihood maximization instead of directly maximizing the likelihood in statistical learning?

- Log-likelihood provides a different optimal solution than likelihood maximization
- Log-likelihood is always faster to compute than likelihood
- Log-likelihood turns products into sums, making computations easier and more numerically stable
- Log-likelihood allows us to avoid using probability altogether

Answer :-
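The numerical side of question 7 can be demonstrated directly. The sketch below assumes 2000 independent observations, each assigned probability 0.5 by the model. The numbers are hypothetical.

```python
import math

# 2000 hypothetical i.i.d. observations, each with probability 0.5 under the model.
probabilities = [0.5] * 2000

likelihood = 1.0
for p in probabilities:
    likelihood *= p          # a product of many values below 1 underflows

# The log turns the product into a sum, which stays well within range.
log_likelihood = sum(math.log(p) for p in probabilities)

print(likelihood)
print(log_likelihood)
```

Because the logarithm is strictly increasing, the parameters that maximize the log-likelihood are the same ones that maximize the likelihood.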

8. In machine learning, if you have an infinite amount of data, but your prior distribution is incorrect, will you still converge to the right solution?

- Yes, with infinite data, the influence of the prior becomes negligible, and you will converge to the true underlying solution.
- No, the incorrect prior will always affect the convergence, and you may not reach the true solution even with infinite data.
- It depends on the type of model used; some models may still converge to the right solution, while others might not.
- The convergence to the right solution is not influenced by the prior, as infinite data will always lead to the correct solution regardless of the prior.

Answer :-
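The role of the prior in question 8 can be explored with a coin-flip model. The sketch below assumes a Beta prior on the coin's bias and idealised data containing exactly 70% heads. Both are illustrative assumptions, and the behaviour shown relies on the prior giving nonzero density to the true value.

```python
def posterior_mean(heads, flips, prior_a, prior_b):
    """Posterior mean of a coin's bias under a Beta(prior_a, prior_b) prior."""
    return (heads + prior_a) / (flips + prior_a + prior_b)

true_bias = 0.7  # hypothetical ground truth

# Beta(10, 90) is a prior centred (wrongly) on a bias of 0.1.
for flips in (10, 1000, 100000):
    heads = round(true_bias * flips)   # idealised data: exactly 70% heads
    print(flips, posterior_mean(heads, flips, prior_a=10, prior_b=90))

estimate_small = posterior_mean(7, 10, 10, 90)
estimate_large = posterior_mean(70000, 100000, 10, 90)
```

As the number of flips grows, the data terms dominate the prior pseudo-counts in this model. A prior that assigns zero probability to the true value cannot be corrected this way, however much data arrives.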

9. Statement: Threshold function cannot be used as activation function for hidden layers.

Reason: Threshold functions do not introduce non-linearity.

- Statement is true and reason is false.
- Statement is false and reason is true.
- Both are true and the reason explains the statement.
- Both are true and the reason does not explain the statement.

Answer :-
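Both parts of question 9 can be probed numerically. The sketch below assumes the threshold function is a unit step at zero, and it estimates the slope with a central difference (a hypothetical helper, not part of the course material).

```python
def threshold(z):
    """Step activation: 1 for non-negative input, 0 otherwise."""
    return 1.0 if z >= 0 else 0.0

def numerical_derivative(f, z, eps=1e-6):
    """Central-difference estimate of df/dz."""
    return (f(z + eps) - f(z - eps)) / (2 * eps)

# A linear function satisfies f(a + b) == f(a) + f(b); the step does not.
print(threshold(1.0 + 1.0), threshold(1.0) + threshold(1.0))

# The slope is zero away from the jump, so gradient-based training
# gets no signal through this activation.
slopes = [numerical_derivative(threshold, z) for z in (-2.0, -0.5, 0.5, 2.0)]
print(slopes)
```

The step is therefore non-linear, but its zero gradient is what makes it awkward to train with backpropagation.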

10. Choose the correct statement (multiple may be correct):

- MLE is a special case of MAP when prior is a uniform distribution.
- MLE acts as regularisation for MAP.
- MLE is a special case of MAP when prior is a beta distribution.
- MAP acts as regularisation for MLE.

Answer :-
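The relationship between MLE and MAP in question 10 can be illustrated with a coin-flip model. The sketch below assumes a Bernoulli likelihood with a Beta(a, b) prior, for which the MAP estimate is the mode of the Beta posterior. The data and prior parameters are hypothetical.

```python
def mle(heads, flips):
    """Maximum-likelihood estimate of a coin's bias."""
    return heads / flips

def map_estimate(heads, flips, a, b):
    """Mode of the Beta(heads + a, flips - heads + b) posterior.
    Assumes a >= 1 and b >= 1 so the mode formula applies."""
    return (heads + a - 1) / (flips + a + b - 2)

heads, flips = 7, 10  # hypothetical data

uniform_map = map_estimate(heads, flips, a=1, b=1)      # Beta(1, 1) is uniform on [0, 1]
informative_map = map_estimate(heads, flips, a=5, b=5)  # prior mode at 0.5

print(mle(heads, flips), uniform_map, informative_map)
```

With the uniform prior, the MAP estimate coincides with the MLE. A non-uniform Beta prior pulls the estimate toward the prior mode, which is the sense in which the prior acts like a regulariser on the MLE.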