## NPTEL Introduction to Machine Learning Week 2 Assignment Answers 2024

1. State True or False:

Typically, linear regression tends to underperform compared to k-nearest neighbor algorithms when dealing with high-dimensional input spaces.

- True
- False

Answer :-
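One way to build intuition for this question is the concentration of distances in high dimensions, which is a classic reason kNN degrades there. The sketch below (not from the assignment; data and dimensions are arbitrary choices) measures how the spread of distances shrinks relative to their mean as the dimension grows:

```python
import numpy as np

# Sketch: in high dimensions, distances between random points concentrate,
# so the "nearest" neighbors are barely nearer than the farthest ones --
# one reason kNN struggles while linear models can still cope.
rng = np.random.default_rng(0)

def distance_spread(dim, n_points=500):
    """Ratio of (max - min) distance-from-origin to the mean distance."""
    points = rng.uniform(size=(n_points, dim))
    dists = np.linalg.norm(points, axis=1)
    return (dists.max() - dists.min()) / dists.mean()

low_dim_spread = distance_spread(2)      # large relative spread in 2-D
high_dim_spread = distance_spread(1000)  # spread collapses in 1000-D
print(low_dim_spread, high_dim_spread)
```

With the seed fixed, the 1000-dimensional spread comes out far smaller than the 2-dimensional one, which is the distance-concentration effect in action.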

2. Given the following dataset, find the univariate regression function that best fits the dataset.

- f(x)=1×x+4
- f(x)=1×x+5
- f(x)=1.5×x+3
- f(x)=2×x+1

Answer :-
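The dataset for this question is not reproduced above, but fitting a univariate least-squares line takes one call. The sketch below uses a hypothetical dataset generated to lie exactly on f(x) = 1×x + 5 (the choice of line is arbitrary, purely to make the recovered coefficients easy to check):

```python
import numpy as np

# Hypothetical dataset (the assignment's actual table is not reproduced here);
# points are generated to lie exactly on f(x) = 1*x + 5.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1.0 * x + 5.0

# Least-squares fit of a degree-1 polynomial: returns [slope, intercept].
slope, intercept = np.polyfit(x, y, deg=1)
print(slope, intercept)
```

On noise-free data the fit recovers the generating slope and intercept up to floating-point error; on a real assignment table, the same call returns the best-fitting line among the listed options.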

3. Given a training data set of 500 instances, with each input instance having 6 dimensions and each output being a scalar value, the dimensions of the design matrix used in applying linear regression to this data are

- 500×6
- 500×7
- 500×6
^{2} - None of the above

Answer :-
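The extra column in the design matrix comes from the intercept: each 6-dimensional input row is augmented with a constant 1, so an N×d input matrix becomes an N×(d+1) design matrix. A minimal numpy sketch (data is random, purely illustrative):

```python
import numpy as np

# 500 instances, 6 input dimensions each (random data for illustration).
X = np.random.default_rng(1).normal(size=(500, 6))

# Linear regression with an intercept augments each row with a constant 1,
# giving the design matrix one extra column: 500 x 7.
design = np.hstack([np.ones((X.shape[0], 1)), X])
print(design.shape)
```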

4. Assertion A: Binary encoding is usually preferred over One-hot encoding to represent categorical data (e.g., colors, gender, etc.)

Reason R: Binary encoding is more memory efficient when compared to One-hot encoding

- Both A and R are true and R is the correct explanation of A
- Both A and R are true but R is not the correct explanation of A
- A is true but R is false
- A is false but R is true

Answer :-
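The memory claim in Reason R can be quantified: one-hot encoding spends one column per category, while binary encoding needs only enough bits to index every category, i.e. ⌈log₂ k⌉. A small sketch (the helper names are my own, not from any library):

```python
import math

def one_hot_width(n_categories):
    """One-hot encoding: one column per category."""
    return n_categories

def binary_width(n_categories):
    """Binary encoding: enough bits to index every category."""
    return max(1, math.ceil(math.log2(n_categories)))

# e.g. 8 colors: one-hot needs 8 columns, binary encoding only 3.
print(one_hot_width(8), binary_width(8))
```

The gap widens quickly: 1000 categories cost 1000 one-hot columns but only 10 binary bits.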

5. Select the TRUE statement

- Subset selection methods are more likely to improve test error by focusing only on the most important features and by reducing variance in the fit.
- Subset selection methods are more likely to improve train error by focusing only on the most important features and by reducing variance in the fit.
- Subset selection methods are more likely to improve both test and train error by focusing on the most important features and by reducing variance in the fit.
- Subset selection methods don’t help with performance in any way.

Answer :-

6. Rank the 3 subset selection methods in terms of computational efficiency:

- Forward stepwise selection, best subset selection, and forward stagewise regression.
- Forward stepwise selection, forward stagewise regression and best subset selection.
- Best subset selection, forward stagewise regression and forward stepwise selection.
- Best subset selection, forward stepwise selection and forward stagewise regression.

Answer :-
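One way to compare the three methods is to count how many candidate models each must fit. The sketch below assumes the standard textbook counts: best subset selection fits a model for every subset of p features (2^p of them), while forward stepwise fits at most p + (p−1) + … + 1 = p(p+1)/2 candidates; forward stagewise instead takes many cheap incremental coefficient updates, so it is not counted the same way:

```python
# Sketch of why best subset selection is the most expensive method.
def best_subset_models(p):
    """Best subset selection fits one model per feature subset: 2^p models."""
    return 2 ** p

def forward_stepwise_models(p):
    """Forward stepwise fits p + (p-1) + ... + 1 = p(p+1)/2 candidates."""
    return p * (p + 1) // 2

for p in (10, 20):
    print(p, best_subset_models(p), forward_stepwise_models(p))
```

Already at p = 20, best subset needs over a million fits versus 210 for forward stepwise, which is why the exhaustive method sits at the bottom of any efficiency ranking.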

7. Choose the TRUE statements from the following: (Multiple correct choices)

- Ridge regression, since it reduces the coefficients of all variables, makes the final fit a lot more interpretable.
- Lasso regression, since it doesn’t deal with a squared power, is easier to optimize than ridge regression.
- Ridge regression has a more stable optimization than lasso regression.
- Lasso regression is better suited for interpretability than ridge regression.

Answer :-
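The stability claim about ridge regression can be illustrated with its closed form: the penalty adds λI to XᵀX, which keeps the normal equations well-conditioned even when features are nearly collinear. A sketch with synthetic data (λ and the dataset are arbitrary choices for illustration):

```python
import numpy as np

# Two nearly collinear features make X^T X almost singular; adding the ridge
# term lambda * I restores a well-conditioned system to solve.
rng = np.random.default_rng(2)
x1 = rng.normal(size=100)
X = np.column_stack([x1, x1 + 1e-8 * rng.normal(size=100)])
y = x1 + rng.normal(scale=0.1, size=100)

lam = 1.0
ridge_w = np.linalg.solve(X.T @ X + lam * np.eye(2), X.T @ y)

# Condition number drops dramatically once the ridge term is added.
plain_cond = np.linalg.cond(X.T @ X)
ridge_cond = np.linalg.cond(X.T @ X + lam * np.eye(2))
print(plain_cond, ridge_cond)
```

Lasso has no such closed form (the ℓ1 penalty is non-differentiable at zero), which is one sense in which ridge's optimization is the more stable of the two; lasso's advantage is instead interpretability, since it drives some coefficients exactly to zero.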

8. Which of the following statements are TRUE? Let x_{i} be the i-th datapoint in a dataset of N points, and let v represent the first principal component of the dataset. (Multiple answer question)

- v = argmax ∑^{N}_{i=1} (v^{T}x_{i})^{2} s.t. |v| = 1
- v = argmin ∑^{N}_{i=1} (v^{T}x_{i})^{2} s.t. |v| = 1
- Scaling at the start of performing PCA is done just for better numerical stability and computational benefits, but plays no role in determining the final principal components of a dataset.
- The resultant vectors obtained when performing PCA on a dataset can vary based on the scale of the dataset.

Answer :-
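The effect of scaling on the principal components can be checked directly: rescaling one feature changes the leading eigenvector of the covariance matrix, so the first PC is not scale-invariant. A sketch with synthetic 2-D data (the data and scale factors are arbitrary, purely illustrative):

```python
import numpy as np

# The first principal component maximizes sum((v^T x_i)^2) over unit vectors v,
# i.e. the eigenvector of the sample covariance with the largest eigenvalue.
rng = np.random.default_rng(3)
data = rng.normal(size=(200, 2))
data[:, 1] *= 0.5  # column 1 starts with the smaller variance

def first_pc(X):
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / (len(Xc) - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    return eigvecs[:, -1]  # eigenvector of the largest eigenvalue

pc_raw = first_pc(data)
pc_scaled = first_pc(data * np.array([1.0, 100.0]))  # inflate column 1
print(pc_raw, pc_scaled)
```

Before scaling, the first PC points mostly along the higher-variance column 0; after multiplying column 1 by 100, it swings to align with column 1. This is why standardizing features before PCA is a modeling decision, not a mere numerical convenience.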