## NPTEL Deep Learning Week 4 Assignment Answers 2023

**1. Which of the following cannot be realized with a single-layer perceptron (only input and output layers)?**

a. AND

b. OR

c. NAND

d. XOR

Answer :- d. XOR (it is not linearly separable, so no single-layer perceptron can realize it)
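As a quick sanity check, a brute-force search over a small weight grid (a hypothetical sketch, not a proof) finds weights for AND but none for XOR:

```python
import itertools

def perceptron_out(w1, w2, b, x1, x2):
    # Single-layer perceptron with a step activation.
    return 1 if w1 * x1 + w2 * x2 + b > 0 else 0

def realizable(target):
    # Search a coarse weight grid for weights reproducing the truth table.
    grid = [v / 2 for v in range(-8, 9)]  # -4.0 .. 4.0 in steps of 0.5
    for w1, w2, b in itertools.product(grid, repeat=3):
        if all(perceptron_out(w1, w2, b, x1, x2) == target[(x1, x2)]
               for x1, x2 in itertools.product([0, 1], repeat=2)):
            return True
    return False

AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

print(realizable(AND))  # True: e.g. w1 = w2 = 1, b = -1.5
print(realizable(XOR))  # False: XOR is not linearly separable
```

The grid search is only illustrative, but no grid could succeed for XOR: its positive examples cannot be separated from the negatives by any single hyperplane.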

**2. For a function f(θ₀, θ₁), if θ₀ and θ₁ are initialized at a local minimum, what should the values of θ₀ and θ₁ be after a single iteration of gradient descent?**

a. θ₀ and θ₁ will update as per the gradient descent rule

b. θ₀ and θ₁ will remain the same

c. Depends on the values of θ₀ and θ₁

d. Depends on the learning rate

Answer :- b. θ₀ and θ₁ will remain the same (the gradient is zero at a local minimum, so the update term vanishes)
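This is easy to see numerically. Below is a minimal sketch with a made-up quadratic f(θ₀, θ₁) = (θ₀ − 2)² + (θ₁ + 1)², whose minimum is at (2, −1):

```python
def gradient_descent_step(theta, grad_fn, lr=0.1):
    # theta_new = theta - lr * gradient
    g = grad_fn(theta)
    return [t - lr * gi for t, gi in zip(theta, g)]

# Gradient of f(t0, t1) = (t0 - 2)**2 + (t1 + 1)**2, minimum at (2, -1).
grad = lambda th: [2 * (th[0] - 2), 2 * (th[1] + 1)]

theta = [2.0, -1.0]  # initialized exactly at the minimum
print(gradient_descent_step(theta, grad))  # [2.0, -1.0] -- no change
```

Since the gradient is the zero vector at the minimum, the update subtracts nothing regardless of the learning rate.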

**3. Choose the correct option:**

i) Inability of a model to obtain sufficiently low training error is termed as Overfitting

ii) Inability of a model to reduce large margin between training and testing error is termed as Overfitting

iii) Inability of a model to obtain sufficiently low training error is termed as Underfitting

iv) Inability of a model to reduce large margin between training and testing error is termed as Underfitting

a. Only option (i) is correct

b. Both Options (ii) and (iii) are correct

c. Both Options (i) and (iv) are correct

d. Only option (iv) is correct

Answer :- b. Both Options (ii) and (iii) are correct (a large gap between training and testing error is overfitting; high training error is underfitting)

**4. **

Answer :-

**5. Choose the correct option. The gradient of a continuous and differentiable function:**

i) is zero at a minimum

ii) is non-zero at a maximum

iii) is zero at a saddle point

iv) its magnitude decreases as you get closer to the minimum

a. Only option (i) is correct

b. Options (i), (iii) and (iv) are correct

c. Options (i) and (iv) are correct

d. Only option (ii) is correct

Answer :- b. Options (i), (iii) and (iv) are correct (the gradient is also zero at a maximum, so (ii) is false)

**6. Input to SoftMax activation function is [3, 1, 2]. What will be the output?**

a. [0.58, 0.11, 0.31]

b. [0.43,0.24, 0.33]

c. [0.60, 0.10, 0.30]

d. [0.67, 0.09,0.24]

Answer :- d. [0.67, 0.09, 0.24]
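The output can be verified with a quick computation (a minimal sketch using only the standard library):

```python
import math

def softmax(z):
    # Subtract the max for numerical stability; the result is unchanged.
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

out = softmax([3, 1, 2])
print([round(v, 2) for v in out])  # [0.67, 0.09, 0.24]
```

Each output is e^{z_i} divided by the sum e³ + e¹ + e² ≈ 30.19, which matches option d after rounding.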

**7. **

Answer :-

**8. Which of the following options is true?**

a. In Stochastic Gradient Descent, a small batch of samples is selected randomly instead of the whole data set for each iteration, and too large an update of weight values leads to faster convergence

b. In Stochastic Gradient Descent, the whole data set is processed together for update in each iteration.

c. Stochastic Gradient Descent considers only one sample for updates and has noisier updates.

d. Stochastic Gradient Descent is a non-iterative process

Answer :- c. Stochastic Gradient Descent considers only one sample for updates and has noisier updates.
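A one-sample SGD update can be sketched for a simple linear model (the data and learning rate here are made up for illustration):

```python
import random

# Toy data from a known line y = 2x + 1 (hypothetical example).
data = [(x, 2 * x + 1) for x in [0.0, 0.5, 1.0, 1.5, 2.0]]
w, b, lr = 0.0, 0.0, 0.1

random.seed(0)
for _ in range(1000):
    x, y = random.choice(data)  # one randomly chosen sample per update
    pred = w * x + b
    err = pred - y              # gradient of 0.5 * (pred - y)**2 w.r.t. pred
    w -= lr * err * x           # noisy single-sample update
    b -= lr * err

print(round(w, 2), round(b, 2))  # close to 2 and 1
```

Each update uses only one sample, so individual steps are noisy, but on average they still descend toward the minimum.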

**9. What are the steps for using a gradient descent algorithm?**

1. Calculate error between the actual value and the predicted value
2. Re-iterate until you find the best weights of the network
3. Pass an input through the network and get values from the output layer
4. Initialize random weights and biases
5. Go to each neuron which contributes to the error and change its respective values to reduce the error

a. 1, 2, 3, 4, 5

b. 5, 4, 3, 2, 1

c. 3, 2, 1, 5, 4

d. 4, 3, 1, 5, 2

Answer :- d. 4, 3, 1, 5, 2 (initialize, forward pass, compute error, adjust contributing weights, repeat)
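The ordered steps map directly onto a training loop. Below is a minimal sketch for a one-neuron linear model with hypothetical data:

```python
import random

random.seed(1)
# Step 4: initialize random weight and bias
w, b, lr = random.random(), random.random(), 0.1
data = [(1.0, 3.0), (2.0, 5.0)]  # hypothetical points on y = 2x + 1

for epoch in range(500):          # Step 2: re-iterate until weights are good
    for x, y in data:
        pred = w * x + b          # Step 3: forward pass to the output
        err = pred - y            # Step 1: error between actual and predicted
        w -= lr * err * x         # Step 5: adjust each contributing parameter
        b -= lr * err

print(round(w, 1), round(b, 1))  # approximately 2.0 and 1.0
```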

**10.**

Answer :-

## NPTEL Deep Learning Week 3 Assignment Answers 2023

**1. What is the shape of the loss landscape during optimization of SVM?**

a. Linear

b. Paraboloid

c. Ellipsoidal

d. Non-convex with multiple possible local minimum

Answer :-b. Paraboloid

**2. For a 2-class problem, what is the minimum possible number of support vectors? Assume there are more than 4 examples from each class.**

a. 4

b. 1

c. 2

d. 8

Answer :-c. 2

**3. Choose the correct option regarding classification using SVM for two classes**

**Statement i:** While designing an SVM for two classes, the equation yᵢ(a·xᵢ + b) ≥ 1 is used to choose the separating plane using the training vectors.

**Statement ii:** During inference, for an unknown vector xᵢ, if yᵢ(a·xᵢ + b) ≥ 0, then the vector can be assigned class 1.

**Statement iii:** During inference, for an unknown vector xᵢ, if (a·xᵢ + b) > 0, then the vector can be assigned class 1.

a. Only Statement i is true

b. Both Statements i and ii are true

c. Both Statements i and iii are true

d. Both Statements ii and iii are true

Answer :- c. Both Statements i and iii are true (at inference the label yᵢ is unknown, so Statement ii cannot be applied)

**4. Find the scalar projection of vector b = <-4, 1> onto vector a = <1,2>?**

Answer :- (a·b)/|a| = (1·(−4) + 2·1)/√5 = −2/√5 ≈ −0.894
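The scalar projection of b onto a is the dot product divided by the length of a, which a short sketch can confirm:

```python
import math

def scalar_projection(b, a):
    # comp_a(b) = (a . b) / |a|
    dot = sum(ai * bi for ai, bi in zip(a, b))
    return dot / math.sqrt(sum(ai * ai for ai in a))

print(scalar_projection([-4, 1], [1, 2]))  # -2/sqrt(5), about -0.894
```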

**6. Suppose we have the below set of points with their respective classes as shown in the table. Answer the following question based on the table.**

Answer :-

**7. Suppose we have the below set of points with their respective classes as shown in the table. Answer the following question based on the table.**

Answer :-

**8. Suppose we have the below set of points with their respective classes as shown in the table. Answer the following question based on the table.**

Answer :-

**9. Suppose we have the below set of points with their respective classes as shown in the table. Answer the following question based on the table.**

Answer :-

**10. Which one of the following is a valid representation of hinge loss (of margin = 1) for a two-class problem?**

y = class label (+1 or −1).

p = predicted (not normalized to denote any probability) value for a class.

a. L(y, p) = max(0, 1 – yp)

b. L(y, p) = min(0, 1 – yp)

c. L(y, p) = max(0, 1 + yp)

d. None of the above

Answer :- a. L(y, p) = max(0, 1 − yp)
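The behaviour of max(0, 1 − yp) is easy to check on a few illustrative values:

```python
def hinge_loss(y, p):
    # L(y, p) = max(0, 1 - y*p), margin = 1; y in {+1, -1}
    return max(0.0, 1.0 - y * p)

print(hinge_loss(+1, 2.0))  # 0.0 -- correct side, outside the margin
print(hinge_loss(+1, 0.5))  # 0.5 -- correct side, but inside the margin
print(hinge_loss(-1, 1.0))  # 2.0 -- wrong side: penalized linearly
```

Points classified correctly with margin at least 1 incur zero loss; margin violations and misclassifications grow linearly, which is what makes the SVM objective convex.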

## NPTEL Deep Learning Week 2 Assignment Answers 2023

**1. Choose the correct option regarding discriminant functions g(x) for multiclass classification (x is the feature vector to be classified).**

**Statement i:** Risk value R(αᵢ|x) in a Bayes minimum risk classifier can be used as a discriminant function.

**Statement ii:** Negative of the risk value R(αᵢ|x) in a Bayes minimum risk classifier can be used as a discriminant function.

**Statement iii:** A posteriori probability P(ωᵢ|x) in a Bayes minimum error classifier can be used as a discriminant function.

**Statement iv:** Negative of the a posteriori probability P(ωᵢ|x) in a Bayes minimum error classifier can be used as a discriminant function.

a. Only Statement i is true

b. Both Statements ii and iii are true

c. Both Statements i and iv are true

d. Both Statements i and iv are true

Answer :- b. Both Statements ii and iii are true

**2. Which of the following is true regarding functions of discriminant functions gᵢ(x), i.e., f(gᵢ(x))?**

a. We can not use functions of discriminant functions f(g(x)), as discriminant functions for multiclass classification.

b. We can use functions of discriminant functions, f(g(x)), as discriminant functions for multiclass classification provided, they are constant functions i.e., f(g(x)) = C where C is a constant.

c. We can use functions of discriminant functions, f(g(x)), as discriminant functions for multiclass classification provided, they are monotonically increasing functions.

d. None of the above is true.

Answer :-c. We can use functions of discriminant functions, f(g(x)), as discriminant functions for multiclass classification provided, they are monotonically increasing functions.

**3. The class conditional probability density function for the class ωᵢ, i.e., P(x|ωᵢ), for a multivariate normal (or Gaussian) distribution (where x is a d-dimensional feature vector) is given by**

Answer :-a.
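The answer options themselves are not reproduced here, but for reference, the standard class conditional density of a d-dimensional multivariate Gaussian (a known formula, stated here for context) is:

```latex
P(\mathbf{x}\mid\omega_i) \;=\;
\frac{1}{(2\pi)^{d/2}\,\lvert\Sigma_i\rvert^{1/2}}
\exp\!\left(-\tfrac{1}{2}\,(\mathbf{x}-\boldsymbol{\mu}_i)^{\mathsf T}
\Sigma_i^{-1}(\mathbf{x}-\boldsymbol{\mu}_i)\right)
```

where μᵢ is the class mean vector and Σᵢ the class covariance matrix.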

**4. There are some data points for two different classes given below. Class 1 points: {(2, 6), (3, 4), (3, 8), (4, 6)} Class 2 points: {(3, 0), (1, -2), (5, -2), (3, -4)} Compute the mean vectors μ₁ and μ₂ for these two classes and choose the correct option.**

a. μ_{1} = [2 6] and μ_{2} = [3 -1]

b. μ_{1} = [3 6] and μ_{2} = [2 -2]

c. μ_{1} = [3 6] and μ_{2} = [3 -2]

d. μ_{1} = [3 5] and μ_{2} = [2 -3]

Answer :- c. μ_{1} = [3 6] and μ_{2} = [3 -2]
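The mean vectors are just per-coordinate averages of the given points, which a few lines can verify:

```python
def mean_vector(points):
    # Average each coordinate over all points.
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

class1 = [(2, 6), (3, 4), (3, 8), (4, 6)]
class2 = [(3, 0), (1, -2), (5, -2), (3, -4)]

print(mean_vector(class1))  # [3.0, 6.0]
print(mean_vector(class2))  # [3.0, -2.0]
```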

**5. There are some data points for two different classes given below. Class 1 points: {(2, 6), (3, 4), (3, 8), (4, 6)} Class 2 points: {(3, 0), (1, -2), (5, -2), (3, -4)} Compute the covariance matrices Σ1 and Σ2 and choose the correct option.**

Answer :-b
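The answer options are not reproduced here, but the covariance matrices can be computed directly from the given points (this sketch divides by N, the population convention; dividing by N−1 gives the sample version):

```python
def covariance_matrix(points):
    # 2x2 population covariance (divide by N).
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    cxx = sum((p[0] - mx) ** 2 for p in points) / n
    cyy = sum((p[1] - my) ** 2 for p in points) / n
    cxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return [[cxx, cxy], [cxy, cyy]]

print(covariance_matrix([(2, 6), (3, 4), (3, 8), (4, 6)]))    # [[0.5, 0.0], [0.0, 2.0]]
print(covariance_matrix([(3, 0), (1, -2), (5, -2), (3, -4)])) # [[2.0, 0.0], [0.0, 2.0]]
```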

**6. There are some data points for two different classes given below. Class 1 points: {(2, 6), (3, 4), (3, 8), (4, 6)} Class 2 points: {(3, 0), (1, -2), (5, -2), (3, -4)}**

Answer :-b

**7. Let Σᵢ represent the covariance matrix for the i-th class. Assume that the classes have the same covariance matrix. Also assume that the features are statistically independent and have the same variance. Which of the following is true?**

a. Σᵢ = Σ (diagonal elements of Σ are zero)

b. Σᵢ = Σ (diagonal elements of Σ are non-zero and different from each other; the rest of the elements are zero)

c. Σᵢ = Σ (diagonal elements of Σ are non-zero and equal to each other; the rest of the elements are zero)

d. None of these

Answer :- c. Σᵢ = Σ, with equal non-zero diagonal elements and zeros elsewhere (i.e., Σ = σ²I)

**8. The decision surface between two normally distributed classes ω1 and ω2 is shown in the figure. Which of the following is true?**

Answer :-c

**9. **

Answer :-d

**10. You are given some data points for two different classes. Class 1 points: {(11, 11), (13, 11), (8, 10), (9, 9), (7, 7), (7, 5), (15, 3)} Class 2 points: {(7, 11), (15, 9), (15, 7), (13, 5), (14, 4), (9, 3), (11, 3)} Assume that the points are samples from a normal distribution and a two-class Bayesian classifier is used to classify them. Also assume the prior probabilities of the classes are equal, i.e., P(ω1) = P(ω2). Which of the following is true about the corresponding decision boundary used in the classifier? (Choose the correct option regarding the given statements.) Statement i: The decision boundary passes through the midpoint of the line segment joining the means of the two classes. Statement ii: The decision boundary will be the orthogonal bisector of the line joining the means of the two classes.**

a. Only Statement i is true

b. Only Statement ii is true

c. Both Statements i and ii are true

d. None of the statements are true

Answer :- a. Only Statement i is true
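With equal priors, the boundary passes through the midpoint of the two class means, which can be computed from the given points:

```python
def mean_vector(points):
    # Per-coordinate average of a list of 2D points.
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]

class1 = [(11, 11), (13, 11), (8, 10), (9, 9), (7, 7), (7, 5), (15, 3)]
class2 = [(7, 11), (15, 9), (15, 7), (13, 5), (14, 4), (9, 3), (11, 3)]

m1, m2 = mean_vector(class1), mean_vector(class2)
midpoint = [(a + b) / 2 for a, b in zip(m1, m2)]
print(m1, m2, midpoint)  # [10.0, 8.0] [12.0, 6.0] [11.0, 7.0]
```

The boundary is the orthogonal bisector of the line joining the means only in the special case of equal isotropic covariances, which is why Statement ii is not true in general.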

## NPTEL Deep Learning Week 1 Assignment Answers 2023

1. Signature descriptor of an unknown shape is given in the figure. Can you identify the unknown shape?

- a. Circle
- b. Square
- c. Straight line
- d. Rectangle

Answer:-d. Rectangle

2. Signature descriptor of an unknown shape is given in the figure. If d(θ) is measured in cm, what is the area of the unknown shape?

- a. 120 sq. cm.
- b. 144 sq. cm.
- c. 240 sq. cm.
- d. 100 sq. cm.

Answer:-c. 240 sq. cm.

3. To measure the smoothness, coarseness and regularity of a region, which transformation do we use to extract features?

- a. Gabor Transformation
- b. Wavelet Transformation
- c. Both Gabor and Wavelet Transformation
- d. None of the Above

Answer:-c. Both Gabor and Wavelet Transformation

4. Given the 5 x 5 image I (fig 1), we can compute the gray co-occurrence matrix C (fig 2) by specifying the displacement vector d = (dx, dy). Let the position operator be specified as (1, 1), which has the interpretation: one pixel to the right and one pixel below. (Both the image and the partial gray co-occurrence is given in the figure 1, and 2 respectively. Blank values and ‘x’ value in gray co-occurrence matrix are unknown.)

What is the value of ‘x’?

- a. 0
- b. 1
- c. 2
- d. 3

Answer:-a

5. Given the 5 x 5 image I (fig 1), we can compute the gray co-occurrence matrix by specifying the displacement vector d = (dx, dy). Let the position operator be specified as (1, 1), which has the interpretation: one pixel to the right and one pixel below. What is the value of maximum probability descriptor?

- a. 3/16
- b. 1/4
- c. 3/12
- d. 1/3

Answer:-b
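The original 5 x 5 image is not reproduced here, but the gray co-occurrence computation itself can be sketched on a small hypothetical image; the maximum probability descriptor is simply the largest matrix entry divided by the total count:

```python
def glcm(image, dx, dy, levels):
    # Co-occurrence counts for displacement (dx, dy):
    # dx pixels to the right and dy pixels below.
    rows, cols = len(image), len(image[0])
    C = [[0] * levels for _ in range(levels)]
    for r in range(rows - dy):
        for c in range(cols - dx):
            C[image[r][c]][image[r + dy][c + dx]] += 1
    return C

# Hypothetical 3x3 binary image (the assignment's image is not shown above).
img = [[0, 0, 1],
       [0, 1, 1],
       [1, 1, 0]]
C = glcm(img, dx=1, dy=1, levels=2)
total = sum(sum(row) for row in C)
max_prob = max(max(row) for row in C) / total
print(C, max_prob)  # [[0, 3], [1, 0]] 0.75
```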

6. Which of the following is a region descriptor?

- a. Polygonal Representation
- b. Fourier descriptor
- c. Signature
- d. Intensity histogram.

Answer:-d. Intensity histogram.

7. We use gray co-occurrence matrix to extract which type of information?

- a. Boundary
- b. Texture
- c. MFCC
- d. Zero Crossing rate.

Answer:-b. Texture

8. A single card is drawn from a standard deck of playing cards. What is the probability that a heart or a 5 is drawn? (Hint: A standard deck of 52 cards has 4 suits, namely hearts, spades, diamonds and clubs)

- a. 3/13
- b. 4/13
- c. 17/52
- d. 19/52

Answer:-b. 4/13
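By inclusion-exclusion, there are 13 hearts plus 4 fives, minus the five of hearts counted twice:

```python
from fractions import Fraction

hearts = 13
fives = 4
five_of_hearts = 1  # counted in both sets

favourable = hearts + fives - five_of_hearts  # inclusion-exclusion: 16 cards
prob = Fraction(favourable, 52)
print(prob)  # 4/13
```

`Fraction` reduces 16/52 automatically, confirming option b.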

9. Which of the following is strictly true for a two-class Bayes minimum error classifier? (The two different classes are w1 and w2, and the input feature vector is x)

- a. Choose w1 if P(x|w1) > P(x|w2)
- b. Choose w1 if P(w1) > P(w2)
- c. Choose w2 if P(w1|x) > P(w2|x)
- d. Choose w1 if P(w1|x) > P(w2|x)

Answer:-d. Choose w1 if P(w1|x) > P(w2|x)

10. Consider a two-class Bayes' Minimum Risk Classifier. The probabilities of classes w1 and w2 are P(w1) = 0.2 and P(w2) = 0.8 respectively. P(x|w1) = 0.75, P(x|w2) = 0.5, and the loss matrix values are

Answer:-d