**NPTEL Introduction to Machine Learning IITKGP**

The course is a concise introduction to the fundamentals of data science and popular algorithms. There will be a focus on the implementation of core algorithms, drawing from popular implementations in Python. Understanding how to implement such code allows students to demonstrate mastery of the algorithms at a deeper level and to pursue independent study or research using these methods directly, and it increases the likelihood of being competitive for top jobs involving rigorous technical expertise.

**Introduction to Machine Learning IITKGP** is a MOOC offered by IIT Kharagpur on the NPTEL platform. This course helps students learn the basic clustering algorithms. The course is developed by **Prof. Sudeshna Sarkar**, a Professor and currently the Head of the Department of Computer Science and Engineering at IIT Kharagpur.

**INTENDED AUDIENCE:** Elective course

**Requirements/Prerequisites:** Basic programming skills

**INDUSTRY SUPPORT:** Data science companies and many other industries value machine learning skills.

**CRITERIA TO GET A CERTIFICATE**

Average assignment score = 25% of the average of the best 6 assignments out of the total 8 assignments given in the course.

Exam score = 75% of the proctored certification exam score out of 100

Final score = Average assignment score + Exam score

Students will be eligible for CERTIFICATE ONLY IF AVERAGE ASSIGNMENT SCORE >=10/25 AND EXAM SCORE >= 30/75. If either of the two criteria is not met, the student will not get the certificate even if the Final score >= 40/100.
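The scoring rule above can be sketched as a small calculation. This is only an illustration: the function name `certificate_result` and the sample marks are made up, and each assignment is assumed to be scored out of 100.

```python
def certificate_result(assignment_scores, exam_score_out_of_100):
    """Return (assignment part /25, exam part /75, final /100, eligible)."""
    # Average of the best 6 of the 8 assignments (each assumed to be out of 100).
    best_six = sorted(assignment_scores, reverse=True)[:6]
    assignment_part = 0.25 * (sum(best_six) / len(best_six))
    # 75% of the proctored exam score.
    exam_part = 0.75 * exam_score_out_of_100
    final_score = assignment_part + exam_part
    # Both thresholds must be met; the final score alone is not enough.
    eligible = assignment_part >= 10 and exam_part >= 30
    return assignment_part, exam_part, final_score, eligible


# Hypothetical student: weak assignments, strong exam.
weak_assignments = certificate_result([30] * 8, 60)
# Hypothetical student: good assignments, exam exactly at the threshold.
borderline_exam = certificate_result([80, 90, 70, 60, 100, 50, 0, 85], 40)
print(weak_assignments)
print(borderline_exam)
```

In the first case the final score is above 40, yet the student is not eligible because the assignment part is below 10/25.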

## Introduction to Machine Learning IITKGP ASSIGNMENT WEEK 8 ANSWERS:-


**Q1.** For two runs of K-Means clustering, is it expected to get the same clustering results?

**Answer:-** **a**

**Q2.** Which of the following can act as possible termination conditions in K-Means?

I. For a fixed number of iterations.

II. Assignment of observations to clusters does not change between iterations, except for cases with a bad local minimum.

III. Centroids do not change between successive iterations.

IV. Terminate when RSS falls below a threshold

**Answer:-** **d**
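To see how the stopping rules listed in Q2 could look in code, here is a minimal 1-D K-Means sketch. It is not taken from the course material: the function name `kmeans_1d`, the parameters `max_iters` and `rss_threshold`, and the sample points are all assumptions made for illustration.

```python
def kmeans_1d(points, centroids, max_iters=100, rss_threshold=0.0):
    """Tiny 1-D K-Means that reports which stopping rule fired."""
    assignments = None
    for iteration in range(1, max_iters + 1):
        # Assignment step: index of the nearest centroid for each point.
        new_assignments = [
            min(range(len(centroids)), key=lambda k: (p - centroids[k]) ** 2)
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, a in zip(points, new_assignments) if a == k]
            new_centroids.append(sum(members) / len(members) if members else old)
        # Residual sum of squares for the updated centroids.
        rss = sum((p - new_centroids[a]) ** 2
                  for p, a in zip(points, new_assignments))

        if new_assignments == assignments:
            return new_centroids, "assignments unchanged", iteration
        if new_centroids == centroids:
            return new_centroids, "centroids unchanged", iteration
        if rss < rss_threshold:
            return new_centroids, "RSS below threshold", iteration
        assignments, centroids = new_assignments, new_centroids
    return centroids, "iteration limit reached", max_iters


data = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
converged = kmeans_1d(data, [0.0, 5.0])
capped = kmeans_1d(data, [0.0, 5.0], max_iters=1)
by_rss = kmeans_1d(data, [0.0, 5.0], rss_threshold=10.0)
print(converged, capped, by_rss)
```

Different starting centroids can lead to different final clusters, which is why the stopping rule only tells you that the run has settled, not that the result is the best possible one.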

**Q3.** After performing K-Means Clustering analysis on a dataset, you observed the following dendrogram. Which of the following conclusions can be drawn from the dendrogram?

**Answer:-** **d**

**Q4.** What should be the best choice of no. of clusters based on the following results:

**Answer:-** **c**


**Q5.** Given, six points with the following attributes:

**Answer:-** **a**

**Q6.** Which of the following algorithms is most sensitive to outliers?

**Answer:-** **a**

**Q7.** What is/are the possible reason(s) for producing two different dendrograms using agglomerative clustering for the same data set?

**Answer:-** **d**

## Introduction to Machine Learning IITKGP ASSIGNMENT WEEK 7 ANSWERS:-

**Q1.** Which of the following options is/are correct regarding the benefits of an ensemble model?

- Better performance
- More generalized model
- Better interpretability

**Answer:-** **c**

**Q2.** In AdaBoost, we give more weight to points that were misclassified in previous iterations. Suppose we introduce a limit or cap on the weight that any point can take (for example, a restriction that prevents any point’s weight from exceeding a value of 10). Which among the following would be an effect of such a modification?

**Answer:-** **b,c**

**Q3.** Which among the following are some of the differences between bagging and boosting?

**Answer:- b,c,d**

**Q4.** What is the VC-dimension of the class of circle in a 4-dimensional plane?

**Answer:-** **a**

**Q5.** Considering the AdaBoost algorithm, which among the following statements is true?

**Answer:-** **b,d**

**Q6.** Suppose the VC dimension of a hypothesis space is 6. Which of the following are true?

**Answer:-** **a,d**

**Q7.** Ensembles will yield bad results when there is a significant diversity among the models. Write True or False.

**Answer:-** **b**

**Q8.** Which of the following algorithms is not an ensemble learning algorithm?

**Answer:-** **d**

## Introduction to Machine Learning IITKGP ASSIGNMENT WEEK 6 ANSWERS:-

**Q1.** In training a neural network, we notice that the loss does not increase in the first few epochs. What is the reason for this?

**Answer:-** **d**

**Q2.** What is the sequence of the following tasks in a perceptron?

I) Initialize the weights of the perceptron randomly.

II) Go to the next batch of data set.

III) If the prediction does not match the output, change the weights.

IV) For a sample input, compute an output.

**Answer:-** **d**
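The four tasks in Q2 map onto the usual perceptron training loop. The sketch below is one common way to arrange them, not the course's reference code: the names `train_perceptron` and `predict`, the learning rate, the number of epochs, and the AND-gate data are all assumptions for illustration.

```python
import random


def predict(weights, bias, inputs):
    # Compute an output for a sample input (step activation).
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples, epochs=100, lr=1.0, seed=0):
    rng = random.Random(seed)
    n_features = len(samples[0][0])
    # Initialize the weights (and bias) randomly.
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n_features)]
    bias = rng.uniform(-0.5, 0.5)
    for _ in range(epochs):
        for inputs, target in samples:  # go to the next sample
            error = target - predict(weights, bias, inputs)
            if error != 0:
                # Prediction does not match the target: change the weights.
                weights = [w + lr * error * x for w, x in zip(weights, inputs)]
                bias += lr * error
    return weights, bias


# Hypothetical training data: the logical AND function.
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_gate)
predictions = [predict(weights, bias, x) for x, _ in and_gate]
print(predictions)
```

Because AND is linearly separable, the perceptron update rule is guaranteed to stop making mistakes after a finite number of updates.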

**Q3.** Suppose you have inputs as x, y, and z with values -2, 5, and -4 respectively. You have a neuron ‘q’ and neuron ‘f’ with functions:

**Answer:-** **c**

**Q4.** A neural network can be considered as multiple simple equations stacked together. Suppose we want to replicate the function for the below-mentioned decision boundary.

**Answer:-** **a**

**Q5.** Which of the following is true about model capacity (where model capacity means the ability of neural network to approximate complex functions)?

**Answer:-** **a**

**Q6.** First Order Gradient descent would not work correctly (i.e. may get stuck) in which of the following graphs?

**Answer:-** **b**

**Q7.** Which of the following is true?

Single layer associative neural networks do not have the ability to

**Answer:-** **a**

**Q8.** The network that involves backward links from the outputs to the inputs and hidden layers is called

**Answer:-** **c**

## Introduction to Machine Learning IITKGP ASSIGNMENT WEEK 5 ANSWERS:-

**Q1.** What would be the ideal complexity of the curve which can be used for separating the two classes shown in the image below?

**Answer:-** A

**Q2.** I. Logistic Regression is used for regression purposes.

II. Logistic Regression is used for classification purposes.

**Answer:-** C

**Q3.** Which of the following methods do we use to best fit the data in Logistic Regression?

**Answer:-** B

**Q4.** Consider the following model for logistic regression: P(y=1|x; w) = g(w0 + w1x), where g(z) is the logistic function.

In the above equation, P(y=1|x; w) is viewed as a function of x that we can obtain by changing the parameters w.

What would be the range of P in such a case?

**Answer:-** B
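A quick numerical check of the model in Q4 can help build intuition. The sketch below assumes the standard logistic function g(z) = 1 / (1 + e^(-z)); the function names and the sample parameter values are made up for illustration.

```python
import math


def logistic(z):
    # Standard logistic (sigmoid) function.
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, w0, w1):
    # P(y=1 | x; w) = g(w0 + w1 * x)
    return logistic(w0 + w1 * x)


# Hypothetical parameters and inputs.
probs = [predict_proba(x, w0=-1.0, w1=2.0) for x in (-10, -1, 0, 0.5, 1, 10)]
print(probs)
```

Whatever values of w0 and w1 are chosen, every output stays strictly between 0 and 1, and g(0) is exactly 0.5.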

**Q5.** After training an SVM, we can discard all examples which are not support vectors and can still classify new examples.

**Answer:-** A

**Q6.** Suppose you are dealing with a 3-class classification problem and you want to train an SVM model on the data. For that, you are using the One-vs-all method.

**Answer:-** C

**Q7.** What is/are true about kernel in SVM?

- A kernel function maps low-dimensional data to a high-dimensional space
- It’s a similarity function

**Answer:-** C

**Q8.** Suppose you are using RBF kernel in SVM with high Gamma value. What does this signify?

**Answer:-** B
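The effect of Gamma in Q8 can be explored numerically. This sketch assumes the common parameterization K(x, y) = exp(-gamma * ||x - y||^2); the function name `rbf_kernel` and the sample points are illustrative only.

```python
import math


def rbf_kernel(x, y, gamma):
    # Assumed form: K(x, y) = exp(-gamma * squared Euclidean distance).
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


# Two hypothetical points at distance 1 from each other.
p, q = (0.0, 0.0), (1.0, 0.0)
low_gamma = rbf_kernel(p, q, gamma=0.1)
high_gamma = rbf_kernel(p, q, gamma=10.0)
print(low_gamma, high_gamma)
```

With a high Gamma the kernel value drops towards zero very quickly as the distance grows, so each training point only influences its close neighbourhood.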

**Q9.** Below are the labelled instances of 2 classes and hand-drawn decision boundaries for logistic regression. Which of the following figures demonstrates overfitting of the training data?

**Answer:-** C

**Q10.** What do you conclude after seeing the visualization in the previous question?

**Answer:-** A

## Introduction to Machine Learning IITKGP ASSIGNMENT WEEK 4 ANSWERS:-

**Q1.** A spam filtering system has a probability of 0.95 to correctly classify a mail as spam and 0.10 probability of giving false positives. It is estimated that 1% of the mails are actual spam mails. Suppose that the system is now given a new mail to be classified as spam/ not-spam, what is the probability that the mail will be classified as spam?

**Answer:-** **c**
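The probability asked for in Q1 follows from the law of total probability. The sketch below reads "0.10 probability of giving false positives" as P(classified as spam | not spam) = 0.10, which is an interpretation of the question rather than something it states explicitly; the variable names are made up.

```python
# Quantities taken from the question.
p_spam = 0.01                # prior: 1% of mails are spam
p_flag_given_spam = 0.95     # spam correctly classified as spam
p_flag_given_not_spam = 0.10 # assumed meaning of "false positives"

# Law of total probability:
# P(flag) = P(flag | spam) P(spam) + P(flag | not spam) P(not spam)
p_flag = (p_flag_given_spam * p_spam
          + p_flag_given_not_spam * (1 - p_spam))
print(round(p_flag, 4))
```

Under this reading, most mails that get flagged are false positives, because genuine spam is rare.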

**Q2.** Bag I contains 4 white and 6 black balls while another Bag II contains 4 white and 3 black balls. One ball is drawn at random from one of the bags and it is found to be black. Find the probability that it was drawn from Bag I.

**Answer:-** **c**
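Q2 is a direct application of Bayes' theorem. The sketch below assumes each bag is equally likely to be picked (the question does not say so explicitly) and uses exact fractions; the variable names are illustrative.

```python
from fractions import Fraction

# Assumption: either bag is chosen with probability 1/2.
p_bag1 = Fraction(1, 2)
p_bag2 = Fraction(1, 2)

# Likelihood of drawing a black ball from each bag.
p_black_given_bag1 = Fraction(6, 10)  # 4 white, 6 black
p_black_given_bag2 = Fraction(3, 7)   # 4 white, 3 black

# Bayes' theorem: P(Bag I | black).
p_black = p_black_given_bag1 * p_bag1 + p_black_given_bag2 * p_bag2
p_bag1_given_black = (p_black_given_bag1 * p_bag1) / p_black
print(p_bag1_given_black)
```

Under the equal-prior assumption this works out to 7/12.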

**Q3.** Consider the following Bayesian network, where F = having the flu and C = coughing:

**Answer:-** **a**

**Q4.** Consider the following Bayesian network.

**Answer:-** **a**

**Q5.** Consider the following graphical model, mark which of the following pair of random variables are independent given no evidence?

**Answer:-** **a**

**Q6.** In a Bayesian network a node with only outgoing edge(s) represents

**Answer:-** **a**

**Q7.** it is given that

**Answer:-** **Not yet available**

## Introduction to Machine Learning IITKGP ASSIGNMENT WEEK 3 ANSWERS:-

**Q1.** Suppose you are given the following data, where x and y are the 2 input variables and Class is the dependent variable. Suppose you want to predict the class of a new data point x=1 and y=1 using Euclidean distance in 7-NN. To which class does the data point belong?

**Answer:-** **B**

**Q2.** Imagine you are dealing with a 15-class classification problem. What is the maximum number of discriminant vectors that can be produced by LDA?

**Answer:-** **B**

**Q3.** ‘People who bought this, also bought…’ recommendations seen on Amazon are a result of which algorithm?

**Answer:-** **C**

**Q4.** Which of the following is/are true about PCA?

- PCA is a supervised method
- It identifies the directions in which the data have the largest variance
- Maximum number of principal components <= number of features
- All principal components are orthogonal to each other

**Answer:-** **D**

**Q5.** Consider the figures below. Which figure shows the most probable PCA component directions for the data points?

**Answer:-** **A**

**Q6.** When there is noise in data, which of the following options would improve the performance of the KNN algorithm?

**Answer:-** **A**

**Q7.** Which of the following statements is True about the KNN algorithm?

**Answer:-** **A**

**Q8.** Find the value of the Pearson’s correlation coefficient of X and Y from the data in the following table.

**Answer:-** **Not yet available**
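The table for Q8 is not reproduced here, but the calculation itself is short. The sketch below computes Pearson's correlation coefficient from its definition; the function name `pearson_r` and both sample data sets are hypothetical and are not the values from the question.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r = cov(X, Y) / (std(X) * std(Y))."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: a perfectly increasing and a perfectly decreasing line.
r_up = pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
r_down = pearson_r([1, 2, 3, 4, 5], [10, 8, 6, 4, 2])
print(r_up, r_down)
```

Substituting the X and Y columns from the question's table into `pearson_r` gives the coefficient to compare against the options.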

## Introduction to Machine Learning IITKGP ASSIGNMENT WEEK 2 ANSWERS:-

**Q1.** Identify whether the following statement is true or false?

“Overfitting is more likely when the set of training data is small”

**Answer:-** **A. True**

**Q2.** Which of the following criteria is typically used for optimizing in linear regression?

**Answer:-** **C. Minimize the squared distance from the points.**

**Q3.** Which of the following is false?

- A. Bias is the true error of the best classifier in the concept class
- B. Bias is high if the concept class cannot model the true data distribution well
- C. High bias leads to overfitting
- D. For high bias both train and test error will be high

**Answer:-** **C. High bias leads to overfitting**

**Note:- We never promote copying, and we do not claim 100% surety of these answers. They are based solely on our own knowledge, and we post them only to give students a reference, so we urge you to do your assignment on your own.**

**Q4.** The following dataset will be used to learn a decision tree for predicting whether a person is happy (H) or sad (S), based on the color of shoes, whether they wear a wig, and the number of ears they have. Which attribute should you choose as the root of the decision tree?

**Answer:-** **A. Color**

**Q5.** Consider applying linear regression with the hypothesis as h(x) = θ₀ + θ₁x. The training data is given in the table, what is the value of

**Answer:-** **B**

**Q6.** In a binary classification problem, out of 64 data points 29 belong to class I and 35 belong to class II. What is the entropy of the data set?

**Answer:-** **D. 0.99**
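The entropy in Q6 can be checked with a few lines of code. This is a small sketch using the usual base-2 entropy formula; the function name `entropy` is just a label chosen here.

```python
import math


def entropy(counts):
    """Shannon entropy (in bits) of a class distribution given as counts."""
    total = sum(counts)
    # Skip empty classes, since 0 * log2(0) is taken as 0.
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


# 29 points in class I and 35 points in class II, out of 64.
h = entropy([29, 35])
print(round(h, 2))
```

The value is close to 1 bit because the two classes are almost evenly balanced.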

**Q7.** Decision trees can be used for the following type of datasets:

I. The attributes are categorical

II. The attributes are numeric valued and continuous

III. The attributes are discrete valued numbers

**Answer:- D. In cases I, II and III**

**Q8.** What is true for Stochastic Gradient Descent?

**Answer:-** **B**


## Introduction to Machine Learning IITKGP ASSIGNMENT WEEK 1 ANSWERS:-

**Q1.** Which of the following is not a type of supervised learning?

**Answer:- C. Clustering**

**Q2.** As the amount of training data increases

**Answer:- C. Training error usually increases and generalization error usually decreases**

**Q3.** Suppose I have 10,000 emails in my mailbox out of which 300 are spams. The spam detection system detects 150 mails as spams, out of which 50 are actually spams. What is the precision and recall of my spam detection system ?

**Answer:- B. Precision = 25%, Recall = 33.33%**
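Precision and recall for Q3 can be recomputed directly from their definitions. The sketch below uses the figures exactly as they appear in the question text above (300 spam mails, 150 flagged, 50 of those truly spam); the variable names are illustrative.

```python
# Figures as written in the question text.
actual_spam = 300        # spam mails in the mailbox
flagged_as_spam = 150    # mails the system marks as spam
true_positives = 50      # flagged mails that really are spam

# Precision: fraction of flagged mails that are truly spam.
precision = true_positives / flagged_as_spam
# Recall: fraction of all spam mails that were flagged.
recall = true_positives / actual_spam

print(f"Precision = {precision:.2%}, Recall = {recall:.2%}")
```

With these figures the definitions give a precision of about 33.33% and a recall of about 16.67%, so it is worth checking the numbers in the question against the listed option.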

**Q4.** Which of the following are not classification tasks ?

**Answer:- B. Predict the price of a house based on floor area, number of rooms etc.**


**Q5.** Occam’s razor is an example of:

**Answer:- A. Inductive bias**

**Q6.** A feature F1 can take certain value: A, B, C, D, E, F and represents grade of students from a college. Which of the following statements is true in the following case?

**Answer:- B. Feature F1 is an example of ordinal variables.**

**Q7.** Which of the following is a categorical feature?

**Answer:- C. Mother tongue of a person**

**Q8.** Which of the following tasks is NOT a suitable machine learning task?

**Answer:- A. Finding the shortest path between a pair of nodes in a graph**

**Q9.** Which of the following is correct for reinforcement learning?

**Answer:- A. The algorithm plans a sequence of actions from the current state.**

**Q10.** What is the use of Validation dataset in Machine Learning?

**Answer:- C. To tune the hyperparameters of the machine learning model**
