# NPTEL Introduction to Machine Learning Week 2 Assignment Answers 2024

1. State True or False:
Typically, linear regression tends to underperform compared to k-nearest neighbor algorithms when dealing with high-dimensional input spaces.

• True
• False
`Answer :- `
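The claim in question 1 rests on the curse of dimensionality: in high-dimensional spaces, pairwise distances concentrate, so "nearest" neighbors stop being meaningfully near. A minimal sketch of that effect (random Gaussian data; the specific dimensions and sample size are illustrative, not from the assignment):

```python
import numpy as np

rng = np.random.default_rng(0)

def nn_distance_ratio(dim, n=200):
    """Ratio of the smallest to the largest pairwise distance among random points."""
    X = rng.standard_normal((n, dim))
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # ignore zero self-distances
    return d.min() / d[np.isfinite(d)].max()

# As the dimension grows, nearest and farthest distances concentrate,
# so the ratio approaches 1 and nearest-neighbor methods lose signal.
print(nn_distance_ratio(2), nn_distance_ratio(100))
```

A ratio near 0 means the nearest point is much closer than the farthest (distances are informative); a ratio near 1 means all points are roughly equidistant, which is what degrades k-NN in high dimensions.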

2. Given the following dataset, find the univariate regression function that best fits the dataset.

• f(x)=1×x+4
• f(x)=1×x+5
• f(x)=1.5×x+3
• f(x)=2×x+1
`Answer :- `
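Whichever option is correct for the assignment's table (not reproduced above), the fit itself is a degree-1 least-squares problem. A sketch with hypothetical data points chosen to lie exactly on one of the candidate lines:

```python
import numpy as np

# Hypothetical data (the assignment's actual table is not shown here);
# the same recipe applies to any (x, y) pairs.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([4.5, 6.0, 7.5, 9.0])  # constructed to satisfy y = 1.5*x + 3

# polyfit with deg=1 returns the least-squares slope and intercept.
slope, intercept = np.polyfit(x, y, deg=1)
print(f"f(x) = {slope:.2f}*x + {intercept:.2f}")
```

Plugging the assignment's actual points into `x` and `y` and comparing the recovered slope and intercept against the four options identifies the best-fitting line.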

3. Given a training data set of 500 instances, with each input instance having 6 dimensions and each output being a scalar value, the dimensions of the design matrix used in applying linear regression to this data are

• 500×6
• 500×7
• 500×6²
• None of the above
`Answer :- `
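The key detail in question 3 is that the design matrix for linear regression conventionally prepends a column of ones so the intercept can be learned along with the feature weights, adding one column beyond the input dimension. A quick sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 6))           # 500 instances, 6 input dimensions
design = np.hstack([np.ones((500, 1)), X])  # prepend a column of 1s for the intercept term

print(design.shape)  # the bias column makes it 500 x 7
```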

4. Assertion A: Binary encoding is usually preferred over one-hot encoding to represent categorical data (e.g., colors, gender).
Reason R: Binary encoding is more memory efficient when compared to one-hot encoding.

• Both A and R are true and R is the correct explanation of A
• Both A and R are true but R is not the correct explanation of A
• A is true but R is false
• A is false but R is true
`Answer :- `
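The memory claim in Reason R comes down to column counts: one-hot encoding spends one column per category level, while binary encoding spends roughly log₂ of that. A sketch of the arithmetic (helper names are illustrative, not a library API):

```python
import math

# A categorical variable with k levels needs k indicator columns when
# one-hot encoded, but only ceil(log2(k)) bit columns when binary encoded.
def one_hot_width(k):
    return k

def binary_width(k):
    return max(1, math.ceil(math.log2(k)))

for k in (2, 8, 100):
    print(f"{k} levels: one-hot uses {one_hot_width(k)} columns, binary uses {binary_width(k)}")
```

The gap is what makes R true; whether A is true is a separate question, since one-hot's columns are individually meaningful while binary's bit columns are not.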

5. Select the TRUE statement

• Subset selection methods are more likely to improve test error by only focussing on the most important features and by reducing variance in the fit.
• Subset selection methods are more likely to improve train error by only focussing on the most important features and by reducing variance in the fit.
• Subset selection methods are more likely to improve both test and train error by focussing on the most important features and by reducing variance in the fit.
• Subset selection methods don’t help in performance gain in any way.
`Answer :- `

6. Rank the 3 subset selection methods in terms of computational efficiency:

• Forward stepwise selection, best subset selection, and forward stagewise regression.
• Forward stepwise selection, forward stagewise regression and best subset selection.
• Best subset selection, forward stagewise regression and forward stepwise selection.
• Best subset selection, forward stepwise selection and forward stagewise regression.
`Answer :- `
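The ranking in question 6 follows from how many models each method has to fit for p features. Best subset selection examines every subset (2ᵖ models), forward stepwise fits at most p + (p−1) + … + 1 models, and forward stagewise never refits at all, taking only cheap incremental coefficient updates. The counts, sketched:

```python
# Model counts for p candidate features (standard combinatorial bounds;
# forward stagewise is omitted since it refits nothing, only nudges coefficients).
def best_subset_models(p):
    return 2 ** p                # every subset of the p features

def forward_stepwise_models(p):
    return p * (p + 1) // 2      # p candidates, then p-1, then p-2, ...

for p in (10, 20):
    print(f"p={p}: best subset {best_subset_models(p)}, forward stepwise {forward_stepwise_models(p)}")
```

Even at p = 20 the gap is over a million models versus a few hundred, which is why best subset selection sits at the expensive end of any efficiency ranking.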

7. Choose the TRUE statements from the following: (multiple correct choices)

• Ridge regression, since it reduces the coefficients of all variables, makes the final fit a lot more interpretable.
• Lasso regression, since it doesn’t deal with a squared power, is easier to optimize than ridge regression.
• Ridge regression has a more stable optimization than lasso regression.
• Lasso regression is better suited for interpretability than ridge regression.
`Answer :- `
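The interpretability contrast in question 7 is easiest to see in the orthonormal-design special case, where both penalties act on each coefficient independently in closed form: ridge rescales a coefficient (never reaching zero), while lasso soft-thresholds it (exactly zero once it falls below the penalty). A sketch, assuming that simplified setting:

```python
import numpy as np

def ridge_shrink(beta, lam):
    """Ridge shrinkage of one OLS coefficient under an orthonormal design."""
    return beta / (1.0 + lam)

def lasso_shrink(beta, lam):
    """Lasso soft-thresholding of one OLS coefficient under an orthonormal design."""
    return np.sign(beta) * max(abs(beta) - lam, 0.0)

# Ridge keeps a small coefficient alive; lasso zeroes it out entirely,
# which is what yields sparse, more interpretable fits.
print(ridge_shrink(0.3, 0.5), lasso_shrink(0.3, 0.5))
```

This is also why ridge's optimization is more stable: its penalty is smooth and strictly convex, while lasso's absolute value has a kink at zero.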

8. Which of the following statements are TRUE? Let xᵢ be the i-th datapoint in a dataset of N points, and let v represent the first principal component of the dataset. (multiple correct choices)

• v = argmax ∑ᵢ₌₁ᴺ (vᵀxᵢ)², subject to ‖v‖ = 1
• v = argmin ∑ᵢ₌₁ᴺ (vᵀxᵢ)², subject to ‖v‖ = 1
• Scaling at the start of performing PCA is done just for better numerical stability and computational benefits but plays no role in determining the final principal components of a dataset.
• The resultant vectors obtained when performing PCA on a dataset can vary based on the scale of the dataset.
`Answer :- `
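The scaling claim in question 8 can be checked empirically: the first principal component of raw data can differ from that of standardized data, because PCA chases variance and variance depends on units. A sketch with synthetic correlated features (the data construction is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal(500)
b = 0.5 * a + 0.3 * rng.standard_normal(500)  # correlated with a
X = np.column_stack([a, 10.0 * b])            # feature 2 lives on a 10x larger scale

def first_pc(data):
    """First principal component via eigendecomposition of the covariance matrix."""
    centered = data - data.mean(axis=0)
    vals, vecs = np.linalg.eigh(np.cov(centered.T))
    return vecs[:, np.argmax(vals)]

v_raw = first_pc(X)                      # dominated by the large-scale feature
v_scaled = first_pc(X / X.std(axis=0))   # standardizing first changes the answer
print(v_raw, v_scaled)
```

On the raw data the component points almost entirely along the large-scale feature, while after standardization it weights both features comparably, so scaling does change the resulting principal components rather than being a mere numerical convenience.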