NPTEL Introduction to Machine Learning Week 2 Assignment Answers 2024
1. State True or False:
Typically, linear regression tends to underperform compared to k-nearest neighbor algorithms when dealing with high-dimensional input spaces.
- True
- False
Answer :-
2. Given the following dataset, find the univariate regression function that best fits the dataset.
- f(x)=1×x+4
- f(x)=1×x+5
- f(x)=1.5×x+3
- f(x)=2×x+1
Answer :-
3. Given a training data set of 500 instances, with each input instance having 6 dimensions and each output being a scalar value, the dimensions of the design matrix used in applying linear regression to this data are
- 500×6
- 500×7
- 500×6²
- None of the above
Answer :-
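The design-matrix dimensions above can be checked numerically. Below is a minimal sketch (assuming NumPy and a randomly generated stand-in for the dataset) showing that prepending an intercept column of ones to a 500×6 input matrix yields a 500×7 design matrix:

```python
import numpy as np

# Hypothetical stand-in dataset: 500 instances, each with 6 input dimensions.
X = np.random.rand(500, 6)

# The design matrix for linear regression prepends a column of ones
# so that the model can learn an intercept (bias) term.
design = np.hstack([np.ones((500, 1)), X])

print(design.shape)  # (500, 7)
```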
4. Assertion A: Binary encoding is usually preferred over one-hot encoding to represent categorical data (e.g., colors, gender, etc.)
Reason R: Binary encoding is more memory efficient when compared to One-hot encoding
- Both A and R are true and R is the correct explanation of A
- Both A and R are true but R is not the correct explanation of A
- A is true but R is false
- A is false but R is true
Answer :-
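The memory-efficiency claim in Reason R can be illustrated with a quick count. This sketch (the color list is a made-up example) compares the number of columns each encoding needs for a feature with k distinct values: one-hot uses k columns, while binary encoding uses only ⌈log₂ k⌉:

```python
import math

# Hypothetical categorical feature with 8 distinct values.
categories = ["red", "green", "blue", "cyan", "magenta", "yellow", "black", "white"]
k = len(categories)

# One-hot encoding: one indicator column per category -> k columns.
one_hot_width = k

# Binary encoding: write the category index in base 2 -> ceil(log2(k)) columns.
binary_width = math.ceil(math.log2(k))

print(one_hot_width, binary_width)  # 8 3
```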
5. Select the TRUE statement
- Subset selection methods are more likely to improve test error by only focussing on the most important features and by reducing variance in the fit.
- Subset selection methods are more likely to improve train error by only focussing on the most important features and by reducing variance in the fit.
- Subset selection methods are more likely to improve both test and train error by focussing on the most important features and by reducing variance in the fit.
- Subset selection methods don’t help in performance gain in any way.
Answer :-
6. Rank the 3 subset selection methods in terms of computational efficiency:
- Forward stepwise selection, best subset selection, and forward stagewise regression.
- Forward stepwise selection, forward stagewise regression and best subset selection.
- Best subset selection, forward stagewise regression and forward stepwise selection.
- Best subset selection, forward stepwise selection and forward stagewise regression.
Answer :-
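One way to reason about the ranking is to count how many models each method must fit for p features. The sketch below uses the standard counts: best subset selection fits every one of the 2^p subsets, while forward stepwise selection fits roughly p(p+1)/2 models (forward stagewise regression takes many tiny coefficient updates, each far cheaper than a full model fit, so its cost is iteration-dependent and omitted here):

```python
def best_subset_fits(p):
    # Best subset selection evaluates every subset of the p features.
    return 2 ** p

def forward_stepwise_fits(p):
    # At step k, forward stepwise tries each of the (p - k) remaining features.
    return sum(p - k for k in range(p)) + 1  # +1 for the null model

p = 10
print(best_subset_fits(p))       # 1024
print(forward_stepwise_fits(p))  # 56
```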
7. Choose the TRUE statements from the following: (Multiple correct choice)
- Ridge regression, since it reduces the coefficients of all variables, makes the final fit a lot more interpretable.
- Lasso regression, since it doesn’t deal with a squared power, is easier to optimize than ridge regression.
- Ridge regression has a more stable optimization than lasso regression.
- Lasso regression is better suited for interpretability than ridge regression.
Answer :-
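The interpretability contrast between the two penalties can be seen empirically. This is a minimal sketch (assuming scikit-learn and a synthetic dataset where only 2 of 10 features matter): lasso tends to drive the irrelevant coefficients exactly to zero, while ridge merely shrinks them:

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
# Synthetic data: only the first 2 of 10 features actually influence y.
X = rng.normal(size=(200, 10))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=200)

ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=0.1).fit(X, y)

# Ridge shrinks all coefficients but essentially never sets any exactly to zero;
# lasso typically zeroes out the coefficients of the irrelevant features.
print(np.sum(ridge.coef_ == 0))
print(np.sum(lasso.coef_ == 0))
```

The sparse lasso solution is what makes it the better-suited choice for interpretability.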
8. Which of the following statements are TRUE? Let xᵢ be the i-th datapoint in a dataset of N points, and let v represent the first principal component of the dataset. (Multiple-answer question)
- v = argmax ∑ᵢ₌₁ᴺ (vᵀxᵢ)², subject to ‖v‖ = 1
- v = argmin ∑ᵢ₌₁ᴺ (vᵀxᵢ)², subject to ‖v‖ = 1
- Scaling at the start of performing PCA is done just for better numerical stability and computational benefits but plays no role in determining the final principal components of a dataset.
- The resultant vectors obtained when performing PCA on a dataset can vary based on the scale of the dataset.
Answer :-
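The effect of scaling on the principal components can be demonstrated directly. Below is a minimal sketch (assuming NumPy and a made-up two-feature dataset where the second feature is on a 1000× larger scale): the first principal component of the raw data is dominated by the large-scale feature, but after standardizing, the component changes direction entirely:

```python
import numpy as np

rng = np.random.default_rng(1)
# Two correlated features on very different scales.
x = rng.normal(size=300)
data = np.column_stack([x + 0.1 * rng.normal(size=300), 1000 * x])

def first_pc(X):
    # First principal component = top eigenvector of the covariance matrix.
    Xc = X - X.mean(axis=0)
    vals, vecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    return vecs[:, np.argmax(vals)]

raw_pc = first_pc(data)
scaled_pc = first_pc(data / data.std(axis=0))

print(raw_pc)     # almost entirely along the large-scale second feature
print(scaled_pc)  # weights the two standardized features roughly equally
```

This is why the resultant vectors from PCA can vary with the scale of the dataset, rather than scaling being a purely numerical convenience.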