# NPTEL Deep Learning Assignment 4 Answers 2022

Are you looking for the answers to NPTEL Deep Learning Assignment 4? This article will help you with the answers to the National Programme on Technology Enhanced Learning (NPTEL) course "Deep Learning", Assignment 4.

## What is Deep Learning?

Deep learning is a branch of machine learning that uses artificial neural networks with many layers to learn representations of data directly from examples. The course covers how such networks are trained, including gradient descent and its variants (momentum, Nesterov accelerated gradient, mini-batch and stochastic updates, Adagrad, and RMSProp), which are the focus of the assignment questions below.

## CRITERIA TO GET A CERTIFICATE

Average assignment score = 25% of the average of best 8 assignments out of the total 12 assignments given in the course.
Exam score = 75% of the proctored certification exam score out of 100

Final score = Average assignment score + Exam score

YOU WILL BE ELIGIBLE FOR A CERTIFICATE ONLY IF THE AVERAGE ASSIGNMENT SCORE >= 10/25 AND THE EXAM SCORE >= 30/75. If either criterion is not met, you will not get the certificate even if the final score is >= 40/100.

Below you can find the answers for NPTEL Deep Learning Assignment 4

## NPTEL Deep Learning Assignment 4 Answers:-

Q1. Which of the following is true about movement on a contour map?
i. smaller gradient for a gentle slope
ii. smaller gradient for a steep slope
iii. larger gradient for a gentle slope
iv. larger gradient for a steep slope

Q2. Choose the learning-rate method that needs tuning of two hyperparameters.
I. exponential decay
II. step decay
III. 1/t decay
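As a reference for Q2, here are minimal sketches of the three schedules, assuming their common textbook forms (the names `eta0`, `k`, `drop`, and `step_size` are illustrative, not from the course notes). Step decay needs two hyperparameters, the drop factor and the step interval, while the other two need one decay constant each.

```python
import math

# Common textbook forms of the three learning-rate schedules; the
# hyperparameter names are illustrative, not from the course notes.

def exponential_decay(eta0, k, t):
    # one hyperparameter besides eta0: the decay constant k
    return eta0 * math.exp(-k * t)

def step_decay(eta0, drop, step_size, t):
    # two hyperparameters: the drop factor and the step interval
    return eta0 * drop ** (t // step_size)

def one_over_t_decay(eta0, k, t):
    # one hyperparameter: the decay constant k
    return eta0 / (1.0 + k * t)

print(step_decay(0.1, 0.5, 10, 20))  # 0.025: halved once every 10 epochs
```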

Q3. Pick out the number of steps in one epoch for Mini-batch gradient descent algorithm.
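As background for Q3: with mini-batch gradient descent, one epoch makes one parameter update per batch, so the number of steps is the dataset size divided by the batch size (rounded up). A tiny sketch with illustrative numbers:

```python
import math

# Steps per epoch for mini-batch gradient descent: one update per batch.
# The dataset size and batch size below are illustrative values.
num_points = 1_000_000
batch_size = 100

steps_per_epoch = math.ceil(num_points / batch_size)
print(steps_per_epoch)  # 10000
```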

Q4. Which of the following is an advantage of Nesterov accelerated gradient descent?


Q5. What are the reasons for taking a running average of the gradients, m_t?

Q6. The update rule for momentum-based gradient descent is given by update_t = γ·update_{t−1} + η∇w_t. Which of the following is true for the given rule?
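The momentum rule in Q6 can be sketched on a toy quadratic; the values of gamma, eta, and the starting point below are illustrative assumptions, not from the course.

```python
# A minimal sketch of momentum-based gradient descent on the toy
# objective f(w) = w**2, whose gradient is 2*w.
def grad(w):
    return 2.0 * w

gamma, eta = 0.9, 0.1   # momentum coefficient and learning rate
w, update = 5.0, 0.0
for _ in range(300):
    update = gamma * update + eta * grad(w)  # update_t = gamma*update_{t-1} + eta*grad(w_t)
    w = w - update                           # w_{t+1} = w_t - update_t
print(w)  # converges toward the minimum at w = 0
```

Because each update carries a fraction gamma of the previous one, the iterate accumulates speed along consistent gradient directions, which is the behaviour the question asks about.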

Q7. What is the number of updates for the Vanilla gradient descent algorithm after going over the entire set of 1 million data points thrice?
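Since vanilla (batch) gradient descent computes the gradient over the entire dataset and makes a single parameter update per pass, the count asked for in Q7 is simple arithmetic, sketched here with the numbers from the question:

```python
# Vanilla (batch) gradient descent makes one parameter update per full
# pass (epoch) over the dataset, regardless of how many points it holds.
num_points = 1_000_000  # size of the dataset in the question
epochs = 3              # the data is traversed thrice

updates = epochs        # one update per epoch; num_points does not matter
print(updates)  # 3
```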

Q8. Consider a stochastic gradient descent algorithm run over a million data points. We observe oscillation in the updates. To reduce this oscillation, we implement mini-batch gradient descent instead. Comment on the accuracy of the gradient estimates as the batch size is increased.
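Q8's intuition can be checked numerically. A minimal sketch, assuming toy Gaussian "per-point gradients" (illustrative values, not the course's data): the variance of the mini-batch mean shrinks roughly as 1/batch_size, so larger batches give more accurate gradient estimates and less oscillation.

```python
import random

# Treat each data point's gradient as a noisy sample and measure how the
# variance of the mini-batch mean shrinks as the batch size grows.
random.seed(0)
point_grads = [random.gauss(3.0, 1.0) for _ in range(100_000)]  # toy per-point gradients

def estimate_variance(batch_size, trials=200):
    # variance of the mini-batch mean over many randomly drawn batches
    means = [sum(random.sample(point_grads, batch_size)) / batch_size
             for _ in range(trials)]
    mu = sum(means) / trials
    return sum((m - mu) ** 2 for m in means) / trials

print(estimate_variance(10), estimate_variance(1000))  # larger batch, smaller variance
```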

Q9. Which of the following is a drawback of Line search technique?

Q10. In the toy example discussed for Adagrad, the algorithm got stuck before it could reach convergence because of the decayed learning rate. How does RMSProp overcome this?
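For context on Q10, here is a hedged side-by-side sketch of the two update rules on a toy quadratic (the constants and starting point are illustrative, not the lecture's exact example). Adagrad accumulates every squared gradient, so its effective learning rate only shrinks and the iterate can stall far from the minimum; RMSProp replaces the sum with an exponentially decaying average, so the effective rate stops shrinking.

```python
# Adagrad vs RMSProp on the toy objective f(w) = w**2 (gradient 2*w).
eta, eps, beta = 0.1, 1e-8, 0.9  # illustrative learning rate, epsilon, decay

def grad(w):
    return 2.0 * w

w_ada, v_ada = 5.0, 0.0
w_rms, v_rms = 5.0, 0.0
for _ in range(200):
    g = grad(w_ada)
    v_ada += g * g                             # Adagrad: running sum of squared gradients
    w_ada -= eta / (v_ada + eps) ** 0.5 * g

    g = grad(w_rms)
    v_rms = beta * v_rms + (1 - beta) * g * g  # RMSProp: decaying average of squared gradients
    w_rms -= eta / (v_rms + eps) ** 0.5 * g

print(abs(w_ada), abs(w_rms))  # RMSProp ends much closer to the minimum at 0
```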