NPTEL Introduction to Machine Learning Week 6 Assignment Answers 2024

By Sanket


1. Entropy for a 90-10 split between two classes is:

  • 0.469
  • 0.195
  • 0.204
  • None of the above
Answer :-
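As a quick check, the entropy of a 90-10 class split can be computed directly from the formula H = -Σ p·log₂(p). A minimal sketch in plain Python:

```python
from math import log2

def entropy(*probs):
    """Shannon entropy in bits: H = -sum(p * log2(p)), skipping p == 0."""
    return -sum(p * log2(p) for p in probs if p > 0)

h = entropy(0.9, 0.1)
print(round(h, 3))  # -> 0.469
```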

2. Consider a dataset with only one (categorical) attribute. Suppose this attribute takes 8 unordered values; how many possible combinations need to be evaluated to find the best split-point for building the decision tree classifier?

  • 511
  • 1023
  • 512
  • 127
Answer :-
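For k unordered categorical values, a binary split partitions the values into two non-empty subsets, and each partition and its complement describe the same split, giving (2ᵏ − 2)/2 = 2ᵏ⁻¹ − 1 distinct candidates. A quick sketch:

```python
def n_binary_splits(k):
    """Distinct two-way partitions of k unordered categorical values:
    (2**k - 2) / 2 = 2**(k-1) - 1 (a subset and its complement count once)."""
    return 2 ** (k - 1) - 1

print(n_binary_splits(8))  # -> 127
```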

3. Having built a decision tree, we are using reduced-error pruning to reduce the size of the tree, and we select a node to collapse. At this node, the left branch holds three training data points with outputs 5, 7, 9.6, and the right branch holds four training data points with outputs 8.7, 9.8, 10.5, 11. The response of a branch is the average of the outputs of its data points. Let response_left and response_right denote the original responses along the left and right branches, and response_new the response after collapsing the node. What are the values of response_left, response_right and response_new (the numbers in each option are given in that order)?

  • 9.6, 11, 10.4
  • 7.2, 10, 8.8
  • 5, 10.5, 15
  • Depends on the tree height.
Answer :-
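The branch responses follow directly from the definition above: each branch response is the mean of its outputs, and the collapsed node's response is the mean over all seven points pooled together. A minimal sketch:

```python
left = [5, 7, 9.6]
right = [8.7, 9.8, 10.5, 11]

response_left = sum(left) / len(left)                  # mean of left branch
response_right = sum(right) / len(right)               # mean of right branch
response_new = sum(left + right) / len(left + right)   # mean after collapsing

print(round(response_left, 1), round(response_right, 1), round(response_new, 1))
# -> 7.2 10.0 8.8
```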

4. Which of the following is a good strategy for reducing the variance in a decision tree?

  • If improvement of taking any split is very small, don’t make a split. (Early Stopping)
  • Stop splitting a leaf when the number of points is less than a set threshold K.
  • Stop splitting all leaves in the decision tree when any one leaf has less than a set threshold K points.
  • None of the Above.
Answer :- 
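The early-stopping ideas in the options above can be sketched as a small gatekeeper function a tree builder might call before splitting a node. The function name and threshold defaults are hypothetical, chosen only to illustrate the two stopping rules:

```python
def should_split(n_points, impurity_gain, min_gain=0.01, min_points=5):
    """Early-stopping checks (hypothetical thresholds): refuse a split
    when the impurity improvement is tiny, or when the node holds fewer
    points than the threshold K (here min_points)."""
    if impurity_gain < min_gain:   # improvement too small -> don't split
        return False
    if n_points < min_points:      # node below the point-count threshold K
        return False
    return True
```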

5. Which of the following statements about multiway splits in decision trees with categorical features is correct?

  • They always result in deeper trees compared to binary splits
  • They always provide better interpretability than binary splits
  • They can lead to overfitting when dealing with high-cardinality categorical features
  • They are computationally less expensive than binary splits for all categorical features
Answer :- 
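The overfitting risk with high-cardinality features can be seen in a small sketch: a multiway split on an ID-like feature (hypothetical data below) sends every row to its own pure child, driving the training impurity to zero while learning nothing that generalizes:

```python
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    counts = {lab: labels.count(lab) for lab in set(labels)}
    h = -sum((c / n) * log2(c / n) for c in counts.values())
    return h + 0.0  # normalize -0.0 to 0.0

ids    = ["id1", "id2", "id3", "id4"]   # high-cardinality categorical feature
labels = ["A", "B", "A", "B"]

parent = entropy(labels)                  # 1.0 bit before splitting
# multiway split: one child per distinct id -> every child holds one row
children = [entropy([lab]) for lab in labels]
weighted = sum(children) / len(children)  # 0.0: training impurity vanishes
print(parent, weighted)  # -> 1.0 0.0
```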

6. Which of the following statements about imputation in data preprocessing is most accurate?

  • Mean imputation is always the best method for handling missing numerical data
  • Imputation should always be performed after splitting the data into training and test sets
  • Missing data is best handled by simply removing all rows with any missing values
  • Multiple imputation typically produces less biased estimates than single imputation methods
Answer :-
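Whichever imputation method is chosen, the fill statistics should be fitted on the training split only, so no test-set information leaks into preprocessing. A minimal sketch with hypothetical values, using mean imputation as the simplest case (`None` marks a missing entry):

```python
from statistics import mean

train = [2.0, 4.0, None, 6.0]
test  = [None, 5.0]

# "fit": compute the fill value on training data only
fill = mean(v for v in train if v is not None)

# "transform": apply the same fill value to both splits
train_imp = [fill if v is None else v for v in train]
test_imp  = [fill if v is None else v for v in test]
print(fill, test_imp)  # -> 4.0 [4.0, 5.0]
```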

7. Consider the following dataset:

[dataset table "w6q7" — image not reproduced here]

Which among the following split-points for feature2 would give the best split according to the misclassification error?

  • 186.5
  • 188.6
  • 189.2
  • 198.1
Answer :-
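Since the dataset is only referenced as an image above, the sketch below uses hypothetical feature2 values and labels purely to illustrate how a candidate split-point is scored by misclassification error: each side of the split predicts its majority class, and the error is the fraction of points misclassified.

```python
def misclassification_error(values, labels, threshold):
    """Score a binary split at `threshold`: each side predicts its
    majority class; return the fraction of points misclassified."""
    left  = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    errors = 0
    for side in (left, right):
        if side:
            majority = max(set(side), key=side.count)
            errors += sum(lab != majority for lab in side)
    return errors / len(values)

# hypothetical data; the real w6q7 table is not reproduced on this page
feature2 = [180.0, 185.0, 187.0, 190.0, 195.0, 200.0]
labels   = ["no", "no", "no", "yes", "yes", "yes"]
print(misclassification_error(feature2, labels, 188.6))  # -> 0.0
```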