MultipleChoice
Let’s suppose we model the Indian map boundary using the following equation: y = f(x) = ax + b, and learn the values of scalars a and b using 100,000 coordinates sampled uniformly from the Indian map boundary. Which of the following problems is our model likely to suffer from?
- High bias
- High variance
- Low entropy
- High entropy
Answer - High Bias
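As a rough illustration of why a straight line underfits a closed boundary, the sketch below fits y = ax + b by least squares. It assumes a unit circle as a stand-in for the map boundary (the real outline is not available here), and the sample size n is likewise an illustrative choice.

```python
import math

# Assumption: a unit circle stands in for the closed map boundary.
n = 1000
xs = [math.cos(2 * math.pi * k / n) for k in range(n)]
ys = [math.sin(2 * math.pi * k / n) for k in range(n)]

# Closed-form least-squares fit of y = a*x + b.
mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
a = sxy / sxx
b = mean_y - a * mean_x

# Training error of the fitted line, and the variance of y for comparison.
mse = sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / n
var_y = sum((y - mean_y) ** 2 for y in ys) / n
print(f"training MSE = {mse:.3f}, variance of y = {var_y:.3f}")
```

The training error stays about as large as the variance of y no matter how many points are sampled, which is the signature of high bias rather than high variance.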
Consider an A versus B two-class classifier. Recall(A) = 100% implies which of the following:
- Precision(A) < 100%
- Recall(B) = 100%
- Precision(B) < 100%
- None of the above
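Answer - None of the above. Recall(A) = 100% only means that no A was classified as B. A perfect classifier has Recall(A) = Precision(A) = Recall(B) = Precision(B) = 100%, so neither Precision(A) < 100% nor Precision(B) < 100% follows, and a classifier that labels everything as A has Recall(A) = 100% with Recall(B) = 0%.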
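Recall(A) = 100% constrains only the false negatives of A. The sketch below uses hypothetical counts (50 items per class, not taken from the question) to show that precision can be 100% or lower while recall stays at 100%.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) for one class from its confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall

# Hypothetical test set: 50 items of class A and 50 of class B.
# Perfect classifier: every A is predicted A and every B is predicted B.
perfect_a = precision_recall(tp=50, fp=0, fn=0)
# Classifier that predicts A for everything: all 50 B items are false positives for A.
always_a = precision_recall(tp=50, fp=50, fn=0)

print(perfect_a)  # (1.0, 1.0)
print(always_a)   # (0.5, 1.0)
```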
Normalization is one of the steps of data pre-processing. We do it in order to:
- Scale features
- Improve the performance of the machine learning algorithm applied afterwards
- Remove elements with missing values from the set
- Fill missing values in the input data with the average value
Advantages of Naive Bayes classifier are
- It performs well in multi-class prediction.
- It performs well in the case of dependent predictors.
- It performs well with categorical input variables compared to numerical variable(s).
- It performs well when a categorical value was not observed in the training data set.
Answer - 1 and 3 (multi-class prediction, and categorical input variables)
Decision trees have the following properties
- Over-fitting does not occur when decision trees are used
- They work well with continuous variables
- Type of input data doesn't matter
- It's not a parametric method
Answer - 2, 3, 4 (in particular, it is a non-parametric method)
k-Means clustering method has the following properties
- Each new element refers to a class based on the distance to this class
- It's not a parametric method
- Each element belongs just to one class.
- It's a supervised machine learning method
Answer - 2-3
Support vector machine is used in the following cases
- Binary classification for linearly separable classes
- Multi-class classification for linearly separable classes
- Binary classification for classes that are not linearly separable
- Finding number of clusters that input points belong to
Answer - 3
Logistic regression is used in the following cases
- There is a set of vectors. Each of them belongs to one of two classes. The first class has the label 'Yes', the second 'No'. For a new vector we should say which class it belongs to.
- For an input vector, predict the target value when the dependency between input and output variables is nonlinear.
- There is a set of vectors. Each of them belongs to one of two classes. The first class has the label 'Yes', the second 'No'. For a new vector we should calculate the probability that it belongs to the class with the label 'Yes'.
- For detecting the most significant feature (attribute) in the input vector of attributes.
Linear regression is used in the following cases
- To predict a continuous target value for an input vector
- To detect the degree of influence of an input vector attribute x_i on the target value y
- When we assume that the dependency between input and output data is linear
- For linearly separable classes, to detect whether an input vector belongs to the first or the second class
Answer - 1, 3
You have two attributes, Attribute1 and Attribute2, in a dataset. Attribute2 is always many times larger than Attribute1. What kind of data pre-processing would you apply to ensure that both attributes are given equal importance by the learning algorithm? Choose all that apply.
- Standardization
- Normalization
- Min-Max Transformation
- Log transformation
Answer - 2, 3, 4
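A minimal sketch of the min-max transformation, using made-up values in which Attribute2 is exactly 1000 times Attribute1. The helper name min_max_scale is hypothetical, not from any particular library.

```python
def min_max_scale(values):
    """Rescale a list of numbers linearly onto the interval [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Made-up data: Attribute2 is 1000 times Attribute1.
attribute1 = [1.0, 2.0, 3.0, 4.0]
attribute2 = [1000.0, 2000.0, 3000.0, 4000.0]

scaled1 = min_max_scale(attribute1)
scaled2 = min_max_scale(attribute2)
print(scaled1)
print(scaled2)
```

After scaling, both attributes span the same range, so a distance-based or gradient-based learner no longer sees Attribute2 as dominant purely because of its units.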
You are learning a logistic regression model with 1000 attributes. You would like to get a sparse solution for the coefficients. How could you achieve this?
- By Applying PCA
- Chi-square feature selection
- L1 Regularization
- L2 Regularization
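To make the difference between the two penalties concrete, the sketch below applies one proximal (shrinkage) step for each to a handful of made-up coefficients. The function names, the coefficient values, and the penalty strength lam are illustrative assumptions, not part of the question.

```python
def l1_prox(w, lam):
    """Soft-thresholding: the proximal step for the L1 penalty lam * |w|."""
    if w > lam:
        return w - lam
    if w < -lam:
        return w + lam
    return 0.0

def l2_prox(w, lam):
    """Proximal step for the L2 penalty (lam / 2) * w**2: proportional shrinkage."""
    return w / (1.0 + lam)

weights = [2.0, -0.3, 0.05, -1.5, 0.2]  # made-up coefficients
lam = 0.5                               # assumed penalty strength

l1_weights = [l1_prox(w, lam) for w in weights]
l2_weights = [l2_prox(w, lam) for w in weights]
print(l1_weights)  # small coefficients become exactly 0.0
print(l2_weights)  # every coefficient shrinks, none becomes exactly zero
```

The L1 step zeroes out every coefficient whose magnitude is at most lam, which is where sparsity comes from; the L2 step only scales coefficients down.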
When applied to a different validation dataset composed of 100,000 records, your final model has the following confusion matrix when predicting the value of the churned field: true positives: 15,000; true negatives: 79,000; false positives: 1,000; false negatives: 5,000.
What are the accuracy and the recall of this model?
- Accuracy: 84% Recall: 75%
- Accuracy: 94% Recall: 98.75%
- Accuracy: 94% Recall: 75%
- Accuracy: 84% Recall: 98.75%
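The two metrics follow directly from the confusion matrix counts given in the question; a short check using the standard definitions:

```python
# Counts from the confusion matrix in the question.
tp, tn, fp, fn = 15_000, 79_000, 1_000, 5_000

# Accuracy: fraction of all records classified correctly.
accuracy = (tp + tn) / (tp + tn + fp + fn)
# Recall: fraction of actual positives that the model found.
recall = tp / (tp + fn)

print(f"accuracy={accuracy:.2%}, recall={recall:.2%}")  # accuracy=94.00%, recall=75.00%
```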
What can you try to do to improve this new model?
- Use regularization in your cost function
- Remove one or more feature from the dataset
- Acquire or engineer more features in the dataset
- Collect more data
You are learning a logistic regression model with 1000 attributes. You would like to avoid giving too much weight to very few attributes. How could you achieve this?
- By Applying PCA
- Chi-square feature selection
- L1 Regularization
- L2 Regularization
I have a 3-layer neural network with seven inputs, two hidden layers of 4 neurons each and one neuron in the output layer. When we say N-layer neural network, we do not count the input layer. The layers are fully connected and a bias is used for each node in the hidden layers. What are the total number of neurons and the total number of learnable parameters in this network?
- 16, 57
- 8, 48
- 10, 51
- 9, 57
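Answer - 9, 57 (neurons: 4 + 4 + 1 = 9; weights: 7*4 + 4*4 + 4*1 = 48; biases: 4 + 4 + 1 = 9, which counts a bias on the output neuron as well)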
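The counts can be checked with a short calculation over the layer sizes. Note that the total of 57 assumes the output neuron also has a bias; with biases strictly on the hidden nodes only, the total would be 56.

```python
# Layer sizes: input, hidden 1, hidden 2, output.
layer_sizes = [7, 4, 4, 1]

# Neurons are counted in every layer except the input layer.
neurons = sum(layer_sizes[1:])

# One weight per connection between consecutive fully connected layers.
weights = sum(n_in * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# Biases: hidden layers only, versus hidden layers plus the output neuron.
biases_hidden_only = sum(layer_sizes[1:-1])
biases_all = sum(layer_sizes[1:])

print(neurons, weights + biases_hidden_only, weights + biases_all)  # 9 56 57
```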
Some recommended ways to initialize the weights of a neural network are
- Random initialization
- Small Random initialization
- Zero initialization
- Sparse Initialization
- Batch Normalization
- Random initialization with variance calibration
Answer - Random
Some cases where the problem of vanishing gradients could occur are
- Relu activation
- CNN
- Zero initialization
- Sigmoid activation
- RNN
Answer - Sigmoid
A box contains 731 black balls and 2000 white balls. The following process is to be repeated as long as possible.
Arbitrarily select two balls from the box. If they are of the same colour, throw them out and put a black ball into the box
(enough extra black balls are available to do this). If they are of different colours, place the white ball back into the box and throw the black ball away.
Which of the following is correct?
- The process can be applied indefinitely without any a priori bound
- The process will stop with a single white ball in the box
- The process will stop with a single black ball in the box
- The process will stop with the box empty
- None of the above
Answer - The process will stop with a single black ball in the box
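The reasoning is a parity argument: every step removes exactly one ball in total, and the number of white balls changes only in steps of two, so it stays even and must end at zero. The simulation below is a sketch that assumes the two balls are drawn uniformly at random (the puzzle allows any choice); run_process is a hypothetical helper name.

```python
import random

def run_process(black, white, rng):
    """Repeat the draw-two-balls step until fewer than two balls remain."""
    while black + white >= 2:
        total = black + white
        # Draw two balls without replacement from the current contents.
        first_black = rng.random() < black / total
        if first_black:
            second_black = rng.random() < (black - 1) / (total - 1)
        else:
            second_black = rng.random() < black / (total - 1)

        if first_black and second_black:
            black -= 1   # two blacks out, one black in
        elif not first_black and not second_black:
            white -= 2   # two whites out, one black in
            black += 1
        else:
            black -= 1   # mixed pair: white goes back, black is discarded
    return black, white

results = [run_process(731, 2000, random.Random(seed)) for seed in range(5)]
print(results)  # each entry is (1, 0): one black ball, no white balls
```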
The hour hand and the minute hand of a clock meet at noon and again at midnight. In between they meet N times, where N is:
- 6
- 11
- 12
- 13
- None of the above
Answer - None of the above (N = 10). The hands coincide every 12/11 hours, which gives 10 meetings strictly between noon and midnight.
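The count can be verified by listing the coincidence times exactly. The sketch measures time in hours after noon and uses exact fractions to avoid rounding at the endpoints.

```python
from fractions import Fraction

# The minute hand makes 1 revolution per hour and the hour hand 1/12,
# so the minute hand gains 11/12 of a revolution per hour and the hands
# coincide every 12/11 hours, starting at noon (t = 0).
meeting_times = [Fraction(12 * k, 11) for k in range(12)]

# Keep only the meetings strictly between noon (t = 0) and midnight (t = 12).
strictly_between = [t for t in meeting_times if 0 < t < 12]
count = len(strictly_between)
print(count)  # 10
```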
Given 10 tosses of a coin with probability of head = 0.4 and probability of tail = (1 - probability of head), the probability of at least one head is
- (0.4)^10
- 1 - (0.4)^10
- 1 - (0.6)^10
- (0.6)^10
- 10(0.4)(0.6)^9
Answer - 1 - (0.6)^10
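The answer uses the complement rule: "at least one head" is the opposite of "all ten tosses are tails". A quick numerical check, assuming independent tosses:

```python
p_head = 0.4
tosses = 10

# Probability that every toss is a tail, assuming independent tosses.
p_no_heads = (1 - p_head) ** tosses
# Complement: at least one head.
p_at_least_one = 1 - p_no_heads

print(f"{p_at_least_one:.6f}")  # 0.993953
```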
A cube whose faces are colored is split into 1000 small cubes of equal size. The cubes thus obtained are mixed thoroughly. The probability that a cube drawn at random will have exactly two colored faces is
- 0.096
- 0.12
- 0.104
- 0.24
- None of the above
Answer - 0.096 (each of the 12 edges carries 8 small cubes with exactly two colored faces, so 12 * 8 / 1000 = 0.096)
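The edge count can be confirmed by brute force over the 10 x 10 x 10 grid of small cubes. A small cube has one colored face for each coordinate that lies on the outer surface; colored_faces is a hypothetical helper name.

```python
from itertools import product

n = 10  # the large cube is cut into n * n * n = 1000 small cubes

def colored_faces(x, y, z):
    """Number of colored faces: one per coordinate on the outer surface."""
    return sum(c in (0, n - 1) for c in (x, y, z))

# Count the small cubes that have exactly two colored faces.
two_faced = sum(1 for cell in product(range(n), repeat=3) if colored_faces(*cell) == 2)
probability = two_faced / n ** 3
print(two_faced, probability)  # 96 0.096
```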