BEU Special Examination – 2023
1. Decision tree is a ………… algorithm.
- (i) supervised learning
- (ii) unsupervised learning
- (iii) Both
- (iv) None of these
Answer: (i) supervised learning
Reasoning: Decision trees use labeled data to learn mapping rules.
2. Decision tree is a flowchart-like …………
- (i) leaf structure
- (ii) tree structure
- (iii) stem
- (iv) none of these
Answer: (ii) tree structure
Reasoning: As the name suggests, it models decisions in a tree-like structure with nodes and branches.
3. Machine learning is a subset of …………
- (i) Deep Learning
- (ii) Artificial intelligence (AI)
- (iii) Internet of Things (IoT)
- (iv) Computer Science
Answer: (ii) Artificial intelligence (AI)
Reasoning: The hierarchy is AI ⊃ Machine Learning ⊃ Deep Learning, so ML is a subset of AI (and Deep Learning is a subset of ML, not the other way around).
4. What is the output of Machine Learning?
- (i) Algorithm
- (ii) Memory
- (iii) Data
- (iv) CPU
Answer: (i) Algorithm (or Model)
Reasoning: ML processes data to produce a model/algorithm that can make predictions, as opposed to traditional programming which outputs data.
5. Which programming language is best for Machine Learning?
- (i) C, C++
- (ii) Java
- (iii) Python
- (iv) HTML
Answer: (iii) Python
Reasoning: Python is the industry standard due to its extensive libraries (Pandas, Scikit-Learn, PyTorch).
6. What is a support vector?
- (i) The distance between any two data points
- (ii) The average distance between all data points
- (iii) The distance between any two boundary data points
- (iv) The minimum distance between any two data points
Answer: (iii) The distance between any two boundary data points
Note: This question is phrased poorly in the exam. Support vectors are actually the data points themselves that lie closest to the decision boundary. Option (iii) likely refers to the concept of the margin defined by these points.
7. Which area of computational learning theory (CLT) asks, “How much computational power do we need to find a good hypothesis?”
- (i) Sample Complexity
- (ii) Computational Complexity
- (iii) Mistake Bound
- (iv) None of these
Answer: (ii) Computational Complexity
8. Which of the following is an application of Case-Based Reasoning?
- (i) Design
- (ii) Planning
- (iii) Diagnosis
- (iv) All of these
Answer: (iv) All of these
9. Neural networks are complex ………… with many parameters.
- (i) Linear functions
- (ii) Nonlinear functions
- (iii) Discrete functions
- (iv) Exponential functions
Answer: (ii) Nonlinear functions
Reasoning: Activation functions introduce non-linearity, allowing NNs to learn complex patterns.
10. Which of the following is a lazy learning algorithm?
- (i) SVM
- (ii) KNN
- (iii) Decision tree
- (iv) All of the above
Answer: (ii) KNN
Reasoning: K-Nearest Neighbors stores the training dataset and performs computation only during prediction (lazy).
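To make “lazy” concrete, here is a minimal from-scratch sketch (plain Python, function and variable names are illustrative): there is no training step at all; the full dataset is stored, and all distance computation is deferred until a query arrives.

```python
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    All work happens at prediction time -- there is no fit/training step,
    which is what makes KNN a "lazy" learner.
    """
    # Squared Euclidean distance from the query to every stored point
    dists = [(sum((a - b) ** 2 for a, b in zip(x, query)), y)
             for x, y in zip(train_X, train_y)]
    dists.sort(key=lambda t: t[0])
    top_k = [y for _, y in dists[:k]]
    return Counter(top_k).most_common(1)[0][0]

# Toy dataset: two clusters labeled "A" and "B"
X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(X, y, (0.5, 0.5)))  # near the first cluster -> "A"
print(knn_predict(X, y, (5.5, 5.5)))  # near the second cluster -> "B"
```

By contrast, an eager learner such as a decision tree or SVM does its heavy computation up front and discards the training data afterward.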
B.Tech 8th Semester Examination – 2024
1. The problem of finding hidden structure in unlabeled data is called…
- (i) Supervised learning
- (ii) Unsupervised learning
- (iii) Reinforcement learning
- (iv) None of the above
Answer: (ii) Unsupervised learning
2. If a machine learning model’s output involves a target variable, then that model is called a
- (i) Descriptive model
- (ii) Predictive model
- (iii) Reinforcement learning
- (iv) All of the above
Answer: (ii) Predictive model
3. K-Nearest Neighbors (KNN) is classified as what type of machine learning algorithm?
- (i) Instance-based learning
- (ii) Parametric learning
- (iii) Non-parametric learning
- (iv) Model-based learning
Answer: (i) Instance-based learning
Note: It is also (iii) Non-parametric, but “Instance-based” describes its fundamental mechanism of comparing new instances to stored ones.
4. Which of the following is not a supervised machine learning algorithm?
- (i) K-means
- (ii) Naïve Bayes
- (iii) Decision tree
- (iv) SVM for classification
Answer: (i) K-means
Reasoning: K-means is a clustering algorithm (Unsupervised).
5. Which algorithm is best suited for a binary classification problem?
- (i) K-nearest Neighbors
- (ii) Decision Trees
- (iii) Random Forest
- (iv) Linear Regression
Answer: (iii) Random Forest (the paper likely intended “Logistic Regression” for option (iv))
Context: Strictly speaking, Linear Regression is a regression technique, not a classifier; the textbook answer for binary classification is Logistic Regression, which is missing from the options and was probably mistyped as (iv). Of the options as printed, Random Forest (iii) is the safest choice, though KNN and Decision Trees are also valid binary classifiers.
6. An artificially intelligent car decreases its speed based on its distance from the car in front of it. Which algorithm is used?
- (i) Naïve-Bayes
- (ii) Decision Tree
- (iii) Linear Regression
- (iv) Logistic Regression
Answer: (iii) Linear Regression
Reasoning: The output (speed) is a continuous numerical value based on input (distance).
7. PCA is
- (i) Forward feature selection
- (ii) Backward feature selection
- (iii) Feature extraction
- (iv) All of the above
Answer: (iii) Feature extraction
Reasoning: PCA creates new features as linear combinations of the original ones rather than selecting a subset of them.
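As a sketch of why PCA is feature extraction rather than selection, the snippet below (NumPy, illustrative function name) builds each new feature as a linear combination of all original columns, using the SVD of the centered data:

```python
import numpy as np

def pca_transform(X, n_components):
    """Project X onto its top principal components (feature extraction).

    Unlike feature selection, each new feature mixes ALL original columns,
    rather than keeping a subset of them.
    """
    Xc = X - X.mean(axis=0)                 # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T         # scores in the new basis

# 2-D data that varies mostly along the y = x direction
rng = np.random.default_rng(0)
t = rng.normal(size=(100, 1))
X = np.hstack([t, t]) + rng.normal(scale=0.1, size=(100, 2))

Z = pca_transform(X, 1)
print(Z.shape)  # (100, 1): two correlated features compressed into one
```

Here one extracted component captures almost all the variance of the two correlated inputs, which is exactly the dimensionality-reduction effect PCA is used for.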
8. Which of the following sentences is FALSE regarding regression?
- (i) It relates inputs to outputs
- (ii) It is used for prediction
- (iii) It may be used for interpretation
- (iv) It discovers causal relationships
Answer: (iv) It discovers causal relationships
Reasoning: Correlation/Regression does not imply causation.
9. A person trained to interact with a human expert in order to capture their knowledge is
- (i) Knowledge programmer
- (ii) Knowledge developer
- (iii) Knowledge engineer
- (iv) Knowledge extractor
Answer: (iii) Knowledge engineer
10. Which is a better algorithm than gradient descent for optimization?
- (i) Conjugate gradient
- (ii) Cost Function
- (iii) ERM rule
- (iv) PAC Learning
Answer: (i) Conjugate gradient
Reasoning: Conjugate gradient is the only optimization algorithm among the options; a cost function, the ERM rule, and PAC learning are concepts, not optimizers.
B.Tech 6th Semester Examination – 2024
1. Which of the following is true about the relationship between model complexity and bias-variance?
- (i) Increasing model complexity decreases bias but increases variance
- (ii) Increasing model complexity increases bias and variance
- (iii) Increasing model complexity decreases both bias and variance
- (iv) Increasing model complexity increases bias and decreases variance
Answer: (i) Increasing model complexity decreases bias but increases variance
2. In cross-validation, what is the purpose of “k” in k-fold cross-validation?
- (i) The number of features used in the model
- (ii) The number of folds used to split the data
- (iii) The number of output layers in a neural network
- (iv) The number of classes in the classification problem
Answer: (ii) The number of folds used to split the data
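A minimal sketch of what k controls (plain Python; the helper name `kfold_indices` is illustrative): the sample indices are split into k non-overlapping folds, and each fold takes one turn as the validation set while the rest form the training set.

```python
def kfold_indices(n_samples, k):
    """Split sample indices into k roughly equal, non-overlapping folds.

    Each fold serves exactly once as the validation set; the remaining
    k-1 folds form the training set for that round.
    """
    indices = list(range(n_samples))
    # Distribute any remainder across the first folds
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, val))
        start += size
    return folds

splits = kfold_indices(10, k=5)
print(len(splits))   # 5 train/validation pairs
print(splits[0])     # ([2, 3, 4, 5, 6, 7, 8, 9], [0, 1])
```

With k = 5 every sample is validated exactly once and trained on four times; larger k gives more training data per round at higher compute cost.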
3. What does Lasso regression primarily do compared to Ridge regression?
- (i) Performs feature selection by shrinking coefficients to zero
- (ii) Enhances the linearity of the model
- (iii) Minimizes the sum of absolute errors
- (iv) Reduces the dimensionality of the input space
Answer: (i) Performs feature selection by shrinking coefficients to zero
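The zeroing behavior can be illustrated with the soft-thresholding update that underlies Lasso, shown here in the simplified single-coordinate (orthonormal-design) case with illustrative function names, contrasted with Ridge’s shrinkage, which scales coefficients but never makes them exactly zero:

```python
def soft_threshold(z, lam):
    """Lasso's proximal update: shrink toward zero and clip at zero.

    Any coefficient with |z| <= lam becomes EXACTLY zero -- this is the
    mechanism behind Lasso's built-in feature selection.
    """
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def ridge_shrink(z, lam):
    """Ridge's closed-form shrinkage (orthonormal case): scales coefficients
    toward zero but never reaches it."""
    return z / (1.0 + lam)

coefs = [3.0, 0.4, -0.2, -5.0]
print([soft_threshold(c, 0.5) for c in coefs])  # [2.5, 0.0, 0.0, -4.5]
print([ridge_shrink(c, 0.5) for c in coefs])    # smaller, but nonzero everywhere
```

The two small coefficients are eliminated outright by Lasso (effectively dropping those features), while Ridge keeps every feature with a reduced weight.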
4. Which evaluation metric is most suitable for classification tasks with imbalanced class distribution?
- (i) Mean Squared Error (MSE)
- (ii) F1-score
- (iii) Accuracy
- (iv) R-squared
Answer: (ii) F1-score
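A small worked example (plain Python, illustrative numbers) of why F1 beats accuracy under class imbalance: a classifier that always predicts the majority class scores 99% accuracy yet has zero true positives, while F1 reflects precision and recall on the rare class.

```python
def f1_score(tp, fp, fn):
    """F1 = harmonic mean of precision and recall. It ignores true negatives,
    so a huge majority class cannot inflate it the way it inflates accuracy."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Imbalanced data: 990 negatives, 10 positives.
# A classifier that predicts "negative" for everything gets
# accuracy = 990/1000 = 99%, yet never finds a single positive case
# (tp = 0, so its F1 is zero/undefined -- the failure accuracy hides).

# A classifier that catches 8 of the 10 positives with 4 false alarms:
print(round(f1_score(tp=8, fp=4, fn=2), 3))  # 0.727
```

Precision here is 8/12 ≈ 0.67 and recall is 8/10 = 0.8; F1 balances the two into a single number that is meaningful even when positives are rare.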
5. Which machine learning algorithm is suitable for solving regression problems?
- (i) Random Forest
- (ii) K-Means clustering
- (iii) K-Nearest Neighbors (KNN)
- (iv) Apriori algorithm
Answer: (i) Random Forest
Note: KNN can also be used for regression, but Random Forest is a more powerful and standard example of a regressor in this list.
6. Which algorithm is used for hierarchical clustering?
- (i) K-Means clustering
- (ii) DBSCAN
- (iii) Principal Component Analysis
- (iv) Agglomerative clustering
Answer: (iv) Agglomerative clustering
7. What is the key idea behind ensemble methods like bagging and boosting?
- (i) To use a single strong learner to improve performance
- (ii) To combine the predictions of multiple weak models to create a strong model
- (iii) To increase the number of features in the dataset
- (iv) To reduce dimensionality for faster computation
Answer: (ii) To combine the predictions of multiple weak models to create a strong model
8. Which optimization algorithm is commonly used to update the weights of neural networks during training?
- (i) Decision Trees
- (ii) K-Means
- (iii) Apriori algorithm
- (iv) Gradient Descent
Answer: (iv) Gradient Descent
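A minimal sketch of the update rule (plain Python, a single weight, illustrative data): each step moves the weight opposite the MSE gradient. The same `w -= lr * grad` rule, applied to every weight via backpropagation, is how neural networks are trained.

```python
def gradient_descent(xs, ys, lr=0.05, steps=200):
    """Fit y = w * x by repeatedly stepping w against the MSE gradient.

    grad = -(2/N) * sum(x * (y - w*x)) is the derivative of the mean
    squared error with respect to w; the update moves w downhill.
    """
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        grad = -(2.0 / n) * sum(x * (y - w * x) for x, y in zip(xs, ys))
        w -= lr * grad
    return w

# Data generated from y = 3x; gradient descent should recover w close to 3
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]
print(round(gradient_descent(xs, ys), 4))  # approximately 3.0
```

In practice, neural networks use variants of this loop (SGD, momentum, Adam), but the core idea, following the negative gradient of a loss, is unchanged.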
9. In the Expectation-Maximization (EM) algorithm, what is the goal of the E-step?
- (i) To estimate the maximum likelihood parameters of the model
- (ii) To initialize the parameters of the model
- (iii) To compute the posterior probabilities of the latent variables given the current parameters
- (iv) To maximize the log-likelihood of the observed data
Answer: (iii) To compute the posterior probabilities of the latent variables given the current parameters
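A sketch of the E-step for a 1-D, two-component Gaussian mixture (plain Python, illustrative parameter values): given the current means, standard deviations, and mixing weights, it returns each point’s posterior probability (responsibility) for each component.

```python
import math

def e_step(xs, means, stds, weights):
    """E-step for a 1-D Gaussian mixture: for each point, compute the
    posterior probability (responsibility) of each component under the
    CURRENT parameters. The M-step would then re-estimate the parameters
    using these responsibilities as soft assignments."""
    def pdf(x, mu, sigma):
        return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

    resp = []
    for x in xs:
        joint = [w * pdf(x, mu, s) for w, mu, s in zip(weights, means, stds)]
        total = sum(joint)
        resp.append([j / total for j in joint])  # normalize: rows sum to 1
    return resp

# Two well-separated components at 0 and 10
r = e_step([0.1, 9.9, 5.0], means=[0.0, 10.0], stds=[1.0, 1.0], weights=[0.5, 0.5])
print([round(p, 3) for p in r[0]])  # point near 0  -> almost surely component 0
print([round(p, 3) for p in r[2]])  # point at 5    -> split evenly between both
```

Note that the E-step never changes the parameters; maximizing the expected log-likelihood with respect to them is the job of the M-step.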
10. In reinforcement learning, what does the term “reward” refer to?
- (i) A value that indicates how far the agent is from its goal
- (ii) A value given to the agent based on its actions, which it tries to maximize over time
- (iii) The set of actions available to the agent
- (iv) The policy that the agent follows
Answer: (ii) A value given to the agent based on its actions, which it tries to maximize over time
B.Tech 6th Semester Examination – 2022
1. Which of the following is false regarding regression?
- (i) It relates inputs to outputs.
- (ii) It is used for prediction.
- (iii) It may be used for interpretation.
- (iv) It discovers causal relationships.
Answer: (iv) It discovers causal relationships.
2. What is the gradient of the mean squared error (MSE) with respect to $\beta_1$, given $\beta_0 = 0$ and $\beta_1 = 1$?
Data: (1, 22), (1, 3), (2, 3)
Model: $y = \beta_1 x$ (since $\beta_0 = 0$)
Current $\beta_1 = 1$
Step 1: Predictions
ŷ values: 1, 1, 2
Step 2: Errors (y − ŷ)
e1 = 22 − 1 = 21
e2 = 3 − 1 = 2
e3 = 3 − 2 = 1
Step 3: Gradient formula
gradient = −(2/N) · Σ[x · (y − ŷ)]
Compute the sum: 1·21 + 1·2 + 2·1 = 25
Now compute the gradient:
gradient = −(2/3) · 25 = −16.666…
Final Answer: −16.67
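The arithmetic above can be double-checked in code (plain Python, illustrative function names): the analytic gradient and a finite-difference estimate of the same derivative should both come out to about -16.67.

```python
def mse_gradient_beta1(data, beta1):
    """Analytic gradient of MSE w.r.t. beta1 for the model y = beta1 * x:
    dMSE/dbeta1 = -(2/N) * sum(x * (y - beta1 * x))."""
    n = len(data)
    return -(2.0 / n) * sum(x * (y - beta1 * x) for x, y in data)

data = [(1, 22), (1, 3), (2, 3)]
print(round(mse_gradient_beta1(data, beta1=1.0), 2))  # -16.67

# Sanity check: central finite-difference estimate of the same derivative
def mse(data, beta1):
    return sum((y - beta1 * x) ** 2 for x, y in data) / len(data)

h = 1e-6
numeric = (mse(data, 1.0 + h) - mse(data, 1.0 - h)) / (2 * h)
print(round(numeric, 2))  # -16.67, matching the analytic value
```

Agreement between the analytic and numeric gradients is the standard “gradient check” used to catch algebra mistakes in hand-derived derivatives.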
3. As you increase the amount of training data, the test error decreases and the training error increases. The train error is quite low, while the test error is much higher than the train error. What is the main reason?
- (i) High variance
- (ii) High model bias
- (iii) High estimation bias
- (iv) None of the above
Answer: (i) High variance (overfitting)
4. The square of the correlation coefficient $r^2$ will always be positive and is called the
- (i) regression
- (ii) coefficient of determination
- (iii) KNN
- (iv) algorithm
Answer: (ii) coefficient of determination
5. The parameter $\beta_0$ is termed as intercept term and the parameter $\beta_1$ is termed as slope parameter. These parameters are usually called as
- (i) regressionists
- (ii) coefficients
- (iii) regressive
- (iv) regression coefficients
Answer: (iv) regression coefficients
6. In order to calculate confidence intervals and hypothesis tests, it is assumed that the errors are independent and normally distributed with mean zero and
- (i) mean
- (ii) variance
- (iii) SD
- (iv) KNN
Answer: (ii) variance (specifically, constant variance)
7. Which of the following is true about residuals?
- (i) Lower is better
- (ii) Higher is better
- (iii) (i) or (ii) depending on the situation
- (iv) None of the above
Answer: (i) Lower is better
8. You found that the correlation coefficient between one of the variables (say X1) and Y is -0.95. Which of the following is true for X1?
- (i) Relation between X1 and Y is weak
- (ii) Relation between X1 and Y is strong
- (iii) Relation between X1 and Y is neutral
- (iv) Correlation cannot judge the relationship
Answer: (ii) Relation between X1 and Y is strong
Reasoning: Strength is judged by the magnitude |r| = 0.95, which indicates a strong (negative) linear relationship.
9. Suppose you plotted a scatter plot between the residuals and predicted values in linear regression and you found that there is a relationship between them. Which of the following conclusions would you make?
- (i) Since there is a relationship, it means our model is not good
- (ii) Since there is a relationship, it means our model is good
- (iii) Cannot say
- (iv) None of the above
Answer: (i) Since there is a relationship, it means our model is not good
Reasoning: Residuals should be random (white noise). A pattern indicates the model failed to capture some signal.
10. What would be the root mean square training error for this data if you run a linear regression model of the form ($Y=A_0+A_1X$)?
- (i) Less than zero
- (ii) Greater than zero
- (iii) Equal to zero
- (iv) None of the above
Answer: (ii) Greater than zero
Reasoning: RMSE is always non-negative. For real-world data, a perfect fit (zero error) is extremely rare, so it will be greater than zero.
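A quick illustration (plain Python, illustrative numbers): RMSE is the square root of averaged squared residuals, so it is zero only when every residual is zero and strictly positive otherwise.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error: sqrt of the average squared residual.
    As a square root of a sum of squares, it can never be negative."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Imperfect fit: error is strictly greater than zero
print(rmse([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]) > 0)  # True

# Only a perfect fit (every residual exactly zero) gives RMSE of zero
print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))      # 0.0
```

For real data with any noise at all, a straight line cannot pass through every point, so the training RMSE of a linear model is greater than zero.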