
Gradient Descent Practice Questions

📝 Gradient Descent Quiz (20 Questions)

Multiple Choice Questions (MCQs)

Topics Covered: Gradient Descent, Types of Gradient Descent, and the Advantages and Disadvantages of Each Type



Gradient Descent is primarily used for:


(a) Classification


(b) Optimization


(c) Data Cleaning


(d) Visualization

Answer: (b) Optimization


Which of the following update rules corresponds to Batch Gradient Descent?


(a) θ = θ − η ∇J(θ; x(i), y(i))


(b) θ = θ − η ∇J(θ)


(c) θ = θ − η/m Σ ∇J(θ; x(i), y(i))


(d) None of these

Answer: (b) θ = θ − η ∇J(θ)
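For reference, here is a minimal NumPy sketch of this batch update rule applied to a mean-squared-error loss for linear regression; the toy data, learning rate, and iteration count are illustrative choices, not part of the quiz.

```python
import numpy as np

def batch_gradient_descent(X, y, eta=0.1, n_iters=100):
    """Batch GD: theta = theta - eta * grad J(theta), where the
    gradient is computed over the ENTIRE dataset each iteration."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(n_iters):
        grad = X.T @ (X @ theta - y) / m  # full-dataset gradient of the MSE loss
        theta -= eta * grad               # one stable update per full pass
    return theta

# Toy usage: fit y = 2x with a bias column
X = np.c_[np.ones(5), np.arange(5.0)]
y = 2.0 * np.arange(5.0)
print(batch_gradient_descent(X, y))  # approaches [0, 2]
```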


In Stochastic Gradient Descent (SGD), parameter updates are done using:


(a) Entire dataset


(b) Mini-batch of data


(c) Single training example


(d) No data

Answer: (c) Single training example
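For contrast with the batch rule above, a minimal sketch of SGD, where θ is updated after each single (shuffled) training example; the squared-error loss and toy data here are assumptions for illustration.

```python
import numpy as np

def sgd(X, y, eta=0.05, n_epochs=50, seed=0):
    """SGD: theta = theta - eta * grad J(theta; x_i, y_i),
    i.e. one noisy update per single training example."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(n_epochs):
        for i in rng.permutation(m):   # visit examples in random order
            err = X[i] @ theta - y[i]  # error on one example only
            theta -= eta * err * X[i]  # per-example gradient step
    return theta

X = np.c_[np.ones(5), np.arange(5.0)]
y = 2.0 * np.arange(5.0)
print(sgd(X, y))  # noisier path, but also approaches [0, 2]
```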


Which type of Gradient Descent is the most memory-intensive?


(a) Batch GD


(b) Stochastic GD


(c) Mini-Batch GD


(d) All are equal

Answer: (a) Batch GD


Which Gradient Descent is most suitable for online learning?


(a) Batch GD


(b) Mini-Batch GD


(c) Stochastic GD


(d) None

Answer: (c) Stochastic GD


What is the role of the learning rate (η) in Gradient Descent?


(a) Controls step size of updates


(b) Determines model accuracy


(c) Chooses loss function


(d) Increases dataset size

Answer: (a) Controls step size of updates


A very high learning rate can cause:


(a) Faster convergence


(b) Overshooting the minimum


(c) Perfect optimization


(d) Slow convergence

Answer: (b) Overshooting the minimum
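The overshooting effect is easy to verify on the one-dimensional loss J(θ) = θ², whose gradient is 2θ: each update gives θ ← θ − η·2θ = (1 − 2η)θ, so once η exceeds 1 the factor |1 − 2η| is greater than 1 and the iterates diverge. A small sketch (the η values are chosen purely for illustration):

```python
def descend(theta, eta, steps=10):
    """Gradient descent on J(theta) = theta**2, whose gradient is 2*theta."""
    for _ in range(steps):
        theta -= eta * 2 * theta  # theta <- (1 - 2*eta) * theta
    return theta

print(descend(1.0, eta=0.1))  # ~0.107: small steps converge toward the minimum at 0
print(descend(1.0, eta=1.5))  # 1024.0: |1 - 2*eta| = 2, each step overshoots further
```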


Which type of Gradient Descent achieves a balance between SGD and Batch GD?


(a) Mini-Batch GD


(b) Stochastic GD


(c) Online GD


(d) Adaptive GD

Answer: (a) Mini-Batch GD
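A minimal sketch of that middle ground: gradients are averaged over small random batches, so updates are less noisy than SGD yet far cheaper per step than batch GD. The batch size of 2 and the toy data are illustrative assumptions.

```python
import numpy as np

def minibatch_gd(X, y, eta=0.1, batch_size=2, n_epochs=50, seed=0):
    """Mini-batch GD: average the gradient over a small random batch,
    then take one step, a compromise between SGD and batch GD."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(n_epochs):
        idx = rng.permutation(m)               # reshuffle each epoch
        for start in range(0, m, batch_size):
            b = idx[start:start + batch_size]  # indices of one mini-batch
            grad = X[b].T @ (X[b] @ theta - y[b]) / len(b)
            theta -= eta * grad
    return theta

X = np.c_[np.ones(5), np.arange(5.0)]
y = 2.0 * np.arange(5.0)
print(minibatch_gd(X, y))  # approaches [0, 2]
```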


The gradient ∇J(θ) represents:


(a) Loss value


(b) Slope of the loss function with respect to parameters


(c) Accuracy


(d) Prediction error only

Answer: (b) Slope of the loss function with respect to parameters
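One concrete way to see ∇J(θ) as a slope is a finite-difference check: nudge one parameter slightly and measure how much J changes. The sketch below compares that numerical slope with the analytic gradient 2θ of J(θ) = Σ θⱼ², a loss chosen purely for illustration.

```python
import numpy as np

def loss(theta):
    return np.sum(theta ** 2)  # J(theta) = sum_j theta_j^2

def numerical_gradient(J, theta, eps=1e-6):
    """Central differences: the slope of J along each parameter axis."""
    grad = np.zeros_like(theta)
    for j in range(theta.size):
        e = np.zeros_like(theta)
        e[j] = eps
        grad[j] = (J(theta + e) - J(theta - e)) / (2 * eps)
    return grad

theta = np.array([1.0, -2.0])
print(numerical_gradient(loss, theta))  # ~[ 2. -4.]
print(2 * theta)                        # analytic gradient: [ 2. -4.]
```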


Which is a major advantage of Batch GD?


(a) Faster convergence


(b) Stable updates


(c) Handles large datasets well


(d) Very low memory need

Answer: (b) Stable updates


One-Word / Short Answer Questions

Fill in the blank: Gradient Descent minimizes the __________ function.

Answer: Loss (or error)


In SGD, updates are __________ compared to Batch GD.

Answer: Noisier / More frequent


The symbol θ usually represents __________ in Gradient Descent.

Answer: Model parameters (weights, biases)


Which Gradient Descent requires storing the entire dataset in memory?

Answer: Batch Gradient Descent


Give one use case of Batch GD.

Answer: Smaller datasets / Convex optimization problems


Which Gradient Descent is often used when data comes in a streaming form?

Answer: Stochastic Gradient Descent


Mini-batch size is denoted by the letter __________.

Answer: m


A mini-batch size that is too small can lead to __________ performance.

Answer: Suboptimal


The step taken in each iteration is determined by the __________.

Answer: Learning rate


Name two hyperparameters that need careful tuning in Gradient Descent.

Answer: Learning rate, Mini-batch size


✨ This set should give you enough practice to test the concepts, formulas, and pros & cons of each type of Gradient Descent.
