Artificial Intelligence Intermediate Level Interview Questions

Q1. How does Reinforcement Learning work? Explain with an example.

Generally, a Reinforcement Learning (RL) system comprises two main components:

  1. An agent
  2. An environment

The environment is the setting that the agent acts upon, and the agent represents the RL algorithm.

  • The RL process starts when the environment sends a state to the agent, which then, based on its observations, takes an action in response to that state.
  • In turn, the environment sends the next state and the corresponding reward back to the agent. The agent updates its knowledge with the reward returned by the environment to evaluate its last action.
  • The loop continues until the environment sends a terminal state, which means the agent has accomplished all of its tasks.

To understand this better, let’s suppose that our agent is learning to play Counter-Strike. The RL process can be broken down into the following steps:

  1. The RL agent (Player 1) collects state S⁰ from the environment (the Counter-Strike game).
  2. Based on the state S⁰, the RL agent takes an action A⁰ (an action can be anything that causes a result, e.g., the agent moving left or right in the game). Initially, the action is random.
  3. The environment is now in a new state S¹ (a new stage in the game).
  4. The RL agent now receives a reward R¹ from the environment. This reward can be additional points or coins.
  5. This RL loop goes on until the RL agent is dead or reaches the destination, continuously outputting a sequence of states, actions, and rewards.
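The state → action → reward loop above can be sketched with a toy environment standing in for the game. Everything here is a hypothetical stand-in, not an actual game interface:

```python
import random

# A toy sketch of the state -> action -> reward loop described above.
# The environment is a hypothetical stand-in, not a real game interface.

class ToyEnvironment:
    """States are integers 0..4; state 4 is the terminal (goal) state."""
    def __init__(self):
        self.state = 0

    def step(self, action):
        self.state = max(0, self.state + action)  # move left (-1) or right (+1)
        reward = 1 if self.state == 4 else 0      # reward only at the goal
        done = self.state == 4                    # terminal state ends the loop
        return self.state, reward, done

env = ToyEnvironment()
done, trajectory = False, []
while not done:
    action = random.choice([-1, 1])               # initially, actions are random
    state, reward, done = env.step(action)
    trajectory.append((state, action, reward))    # the sequence the loop outputs
```

The loop terminates when the environment reports a terminal state, exactly as in the steps above.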

Q2. Explain the Markov Decision Process with an example.

The mathematical framework for mapping a solution in Reinforcement Learning is called the Markov Decision Process (MDP).

The following parameters are used to attain a solution using MDP:

  • Set of actions, A
  • Set of states, S
  • Reward, R
  • Policy, π
  • Value, V

To briefly sum it up, the agent must take actions (A) to transition from the start state to the end state (S). While doing so, the agent receives a reward (R) for each action it takes. The series of actions taken by the agent defines the policy (π), and the rewards collected define the value (V). The main goal here is to maximize rewards by choosing the optimal policy.

To better understand the MDP, let’s solve the Shortest Path Problem using the MDP approach:

Given a graph representation of the problem, our goal is to find the shortest path between ‘A’ and ‘D’. Each edge is labeled with a number denoting the cost of traversing it. The task at hand is to travel from node ‘A’ to node ‘D’ at the minimum possible cost.

In this problem,

  • The set of states is denoted by the nodes, i.e., {A, B, C, D}
  • An action is a traversal from one node to another, e.g., A -> B, C -> D
  • The reward is the cost represented by each edge
  • The policy is the path taken to reach the destination

You start off at node A and take baby steps to your destination. Initially, only the next possible node is visible to you, thus you randomly start off and then learn as you traverse through the network. The main goal is to choose the path with the lowest cost.
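The MDP way of solving such a problem is to compute, for every node, the minimum cost-to-go to the destination. Below is a minimal sketch on a small graph; the edge costs are hypothetical, since the original figure is not reproduced here:

```python
# Sketch: treat each node as a state and compute the minimum cost-to-go
# from every node to 'D' by repeated sweeps (value iteration on a DAG).
# The edge costs are hypothetical placeholders.
graph = {
    'A': {'B': 2, 'C': 5},
    'B': {'C': 1, 'D': 7},
    'C': {'D': 3},
    'D': {},
}

# cost-to-go: V[D] = 0; V[s] = min over neighbors n of cost(s, n) + V[n]
V = {node: float('inf') for node in graph}
V['D'] = 0
for _ in range(len(graph)):                    # enough sweeps for this graph
    for node, edges in graph.items():
        for nxt, cost in edges.items():
            V[node] = min(V[node], cost + V[nxt])

print(V['A'])  # -> 6  (A -> B -> C -> D: 2 + 1 + 3)
```

The optimal policy is then read off by always moving to the neighbor with the smallest cost-plus-value.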

Since this is a very simple problem, I will leave it for you to solve. Make sure you mention the answer in the comment section.

Q3. Explain reward maximization in Reinforcement Learning.

The RL agent works on the principle of reward maximization. This is exactly why the RL agent must be trained in such a way that it takes the best action, so that the reward is maximized.

The collective reward at a particular time t, with the respective actions, is written as:

G(t) = R(t+1) + R(t+2) + R(t+3) + …

The above equation is an ideal representation of rewards. Generally, things don’t work out like this when summing up the cumulative rewards.

Let me explain this with a small game. In the figure you can see a fox, some meat and a tiger.

  • Our RL agent is the fox, and its end goal is to eat the maximum amount of meat before being eaten by the tiger.
  • Since the fox is a clever fellow, it eats the meat that is closer to it rather than the meat close to the tiger, because the closer it is to the tiger, the higher its chances of getting killed.
  • As a result, the rewards near the tiger, even if they are bigger meat chunks, are discounted. This is done because of the uncertainty factor: the tiger might kill the fox.

The next thing to understand is how the discounting of rewards works.
To do this, we define a discount rate called gamma (γ), with a value between 0 and 1. The smaller the gamma, the larger the discount, and vice versa.

So, our cumulative discounted reward is:

G(t) = R(t+1) + γ·R(t+2) + γ²·R(t+3) + … = Σₖ γᵏ·R(t+k+1)
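As a quick sketch, the discounted sum can be computed like this; the reward sequence and gamma values are arbitrary, chosen only to show the effect of discounting:

```python
# A small sketch of discounted reward summation.
def discounted_return(rewards, gamma):
    """Sum r[0] + gamma*r[1] + gamma^2*r[2] + ..."""
    return sum(r * gamma ** t for t, r in enumerate(rewards))

rewards = [1, 1, 10]                     # the big reward (10) arrives late
print(discounted_return(rewards, 0.9))   # ~10.0: late reward barely discounted
print(discounted_return(rewards, 0.1))   # ~1.2: small gamma wipes out the late reward
```

This mirrors the fox example: with a small gamma, the distant (risky) meat chunks contribute almost nothing to the total.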

Q4. What is exploitation and exploration trade-off?

An important concept in reinforcement learning is the exploration and exploitation trade-off.

Exploration, as the name suggests, is about exploring and capturing more information about an environment. On the other hand, exploitation is about using the already-known information to maximize the rewards.

Consider the fox and tiger example, where the fox eats only the (small) meat chunks close to it but doesn’t eat the bigger meat chunks near the tiger, even though the bigger chunks would yield more reward.

  • If the fox only focuses on the closest reward, it will never reach the big chunks of meat. This is exploitation.
  • But if the fox decides to explore a bit, it can find the bigger reward, i.e., the big chunk of meat. This is exploration.
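A standard way to balance the two is an epsilon-greedy rule: with probability epsilon the agent explores, otherwise it exploits its best-known action. A minimal sketch (the value estimates are hypothetical):

```python
import random

# Epsilon-greedy action selection: explore with probability epsilon,
# otherwise exploit the action with the highest estimated value.
def epsilon_greedy(values, epsilon):
    if random.random() < epsilon:
        return random.randrange(len(values))                 # explore: random action
    return max(range(len(values)), key=values.__getitem__)   # exploit: best so far

estimates = [1.0, 0.2, 0.5]     # hypothetical estimated reward per action
action = epsilon_greedy(estimates, epsilon=0.1)
```

With epsilon = 0 the agent is purely exploitative (the fox never leaves the nearby meat); with epsilon = 1 it is purely exploratory.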

Q5. What is the difference between parametric & non-parametric models?

Q6. What is the difference between Hyperparameters and model parameters?

Q7. What are hyperparameters in Deep Neural Networks?

  • Hyperparameters are variables that define the structure of the network and how it is trained. For example, the learning rate defines how quickly the network learns.
  • They also define the number of hidden layers that must be present in the network.
  • More hidden units can increase the accuracy of the network, whereas too few units may cause underfitting.

Q8. Explain the different algorithms used for hyperparameter optimization.

Grid Search
Grid search trains the network for every combination of values from the given hyperparameter sets (e.g., learning rate and number of layers), then evaluates each model using cross-validation.
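A minimal sketch of the idea, where `score()` is a hypothetical stand-in for training the model and cross-validating it:

```python
import itertools

# Grid search sketch: try every combination of the two hyperparameter sets
# and keep the best-scoring one. score() is a hypothetical placeholder for
# "train the model, then cross-validate".
learning_rates = [0.1, 0.01, 0.001]
layer_counts = [1, 2, 3]

def score(lr, layers):
    # placeholder objective; peaks at lr=0.01, layers=2
    return -abs(lr - 0.01) - abs(layers - 2)

best = max(itertools.product(learning_rates, layer_counts),
           key=lambda combo: score(*combo))
print(best)   # -> (0.01, 2)
```

Note the cost: the number of trained models is the product of all set sizes, which is why grid search scales poorly with many hyperparameters.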

Random Search
It randomly samples the search space, evaluating configurations drawn from a particular probability distribution. For example, instead of checking all 10,000 combinations, 100 randomly selected ones can be checked.
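The same idea in code, with hypothetical ranges and the same placeholder objective:

```python
import random

# Random search sketch: sample a fixed budget of configurations from the
# search space instead of enumerating the full grid. Ranges and score()
# are hypothetical placeholders.
random.seed(0)

def score(lr, layers):
    return -abs(lr - 0.01) - abs(layers - 2)   # placeholder objective

samples = [(10 ** random.uniform(-4, -1),      # learning rate, log-uniform
            random.randint(1, 5))              # number of layers
           for _ in range(100)]                # budget: 100 evaluations

best = max(samples, key=lambda combo: score(*combo))
```

Sampling the learning rate on a log scale is a common choice, since plausible values span several orders of magnitude.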

Bayesian Optimization
This involves fine-tuning the hyperparameters through automated model tuning. The model used to approximate the objective function is called a surrogate model (commonly a Gaussian Process). Bayesian Optimization uses the Gaussian Process (GP) to obtain a posterior over the objective function, making predictions based on the prior and the evaluations observed so far.

Q9. How does data overfitting occur and how can it be fixed?

Overfitting occurs when a statistical model or machine learning algorithm captures the noise of the data. This causes an algorithm to show low bias but high variance in the outcome.

Overfitting can be prevented by using the following methodologies:

Cross-validation: The idea behind cross-validation is to split the training data in order to generate multiple mini train-test splits. These splits can then be used to tune your model.

More training data: Feeding more data to the machine learning model can help in better analysis and classification. However, this does not always work.

Remove features: Many times, the data set contains irrelevant features or predictor variables that are not needed for analysis. Such features only increase the complexity of the model, thus leading to possibilities of data overfitting. Therefore, such redundant variables must be removed.

Early stopping: A machine learning model is trained iteratively, which allows us to check how well each iteration of the model performs. But after a certain number of iterations, the model’s performance starts to saturate; further training results in overfitting, so one must know when to stop. This is achieved by a mechanism called early stopping.
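A common patience-based variant of early stopping can be sketched as follows; the validation-loss sequence is made up for illustration:

```python
# Early stopping sketch: stop when the validation loss has not improved
# for `patience` consecutive iterations.
def early_stop(val_losses, patience=2):
    """Return the iteration at which training should stop."""
    best, best_i = float('inf'), 0
    for i, loss in enumerate(val_losses):
        if loss < best:
            best, best_i = loss, i          # new best validation loss
        elif i - best_i >= patience:
            return i                        # no improvement for `patience` steps
    return len(val_losses) - 1

losses = [0.9, 0.6, 0.5, 0.55, 0.57, 0.6]   # starts overfitting after 0.5
print(early_stop(losses))   # -> 4
```

In practice, one would also restore the model weights saved at the best iteration rather than keep the final ones.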

Regularization: Regularization can be done in a number of ways; the method depends on the type of learner you’re implementing. For example, pruning is performed on decision trees, the dropout technique is used on neural networks, and parameter tuning can also be applied to resolve overfitting issues.

Use Ensemble models: Ensemble learning is a technique in which multiple Machine Learning models are created and then combined to produce more accurate results. This is one of the best ways to prevent overfitting. An example is Random Forest, which uses an ensemble of decision trees to make more accurate predictions and to avoid overfitting.

Q10. Mention a technique that helps to avoid overfitting in a neural network.

Dropout is a type of regularization technique used to avoid overfitting in a neural network. It is a technique where randomly selected neurons are dropped during training.

The dropout value of a network must be chosen wisely: too low a value has minimal effect, while too high a value results in under-learning by the network.
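A minimal sketch of the mechanism, using the common "inverted dropout" convention where surviving activations are rescaled so their expected value is unchanged (the input values here are illustrative):

```python
import random

# Dropout sketch: during training, zero each activation with probability p
# and scale survivors by 1/(1-p) ("inverted dropout") so the expected
# activation is unchanged. At test time, the layer is simply disabled.
def dropout(activations, p):
    keep = 1.0 - p
    return [a / keep if random.random() < keep else 0.0
            for a in activations]

random.seed(0)
out = dropout([0.5, 1.0, 0.25, 0.75], p=0.5)   # roughly half are dropped
```

Because a different random subset of neurons is dropped on every training step, no single neuron can rely on any other, which is what combats overfitting.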

Q11. What is the purpose of Deep Learning frameworks such as Keras, TensorFlow, and PyTorch?

  • Keras is an open source neural network library written in Python. It is designed to enable fast experimentation with deep neural networks.
  • TensorFlow is an open-source software library for dataflow programming. It is used for machine learning applications like neural networks.
  • PyTorch is an open source machine learning library for Python, based on Torch. It is used for applications such as natural language processing.

Q12. Differentiate between NLP and Text mining.

Q13. What are the different components of NLP?

Natural Language Understanding includes:

  • Mapping input to useful representations
  • Analyzing different aspects of the language

Natural Language Generation includes:

  • Text Planning
  • Sentence Planning
  • Text Realization

Q14. What is Stemming & Lemmatization in NLP?

Stemming algorithms work by cutting off the end or the beginning of the word, taking into account a list of common prefixes and suffixes that can be found in an inflected word. This indiscriminate cutting can be successful on some occasions, but not always.

Lemmatization, on the other hand, takes into consideration the morphological analysis of the words. To do so, it is necessary to have detailed dictionaries which the algorithm can look through to link the form back to its lemma.
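The contrast can be shown with a toy suffix-stripping stemmer; a real stemmer (e.g. Porter) has many more rules, so this is only a sketch:

```python
# A toy suffix-stripping stemmer illustrating why stemming is indiscriminate:
# it chops recognized endings without any morphological analysis.
SUFFIXES = ('ing', 'ed', 'es', 's')

def stem(word):
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

print(stem('playing'))   # -> 'play'   (works)
print(stem('studies'))   # -> 'studi'  (fails; a lemmatizer would give 'study')
```

The second case is exactly where lemmatization, with its dictionary lookup, outperforms crude suffix removal.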

Q15. Explain Fuzzy Logic architecture.

  • Fuzzification Module − The system inputs are fed into the Fuzzifier, which transforms the crisp inputs into fuzzy sets.
  • Knowledge Base − It stores the IF-THEN rules provided by experts.
  • Inference Engine − It simulates the human reasoning process by making fuzzy inferences on the inputs using the IF-THEN rules.
  • Defuzzification Module − It transforms the fuzzy set obtained by the inference engine into a crisp value.
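The four stages can be sketched end to end for a toy fan controller. The membership functions, rules, and output values below are all hypothetical:

```python
# Fuzzy pipeline sketch: fuzzify a crisp temperature into "cold"/"hot"
# membership degrees, apply one IF-THEN rule each, and defuzzify with a
# weighted average. All numbers are hypothetical.
def fuzzify(temp):
    cold = max(0.0, min(1.0, (25 - temp) / 20))   # fully cold at <=5, zero at >=25
    hot = max(0.0, min(1.0, (temp - 15) / 20))    # zero at <=15, fully hot at >=35
    return {'cold': cold, 'hot': hot}

def infer_and_defuzzify(memberships):
    # Rules: IF cold THEN fan speed 10; IF hot THEN fan speed 90.
    outputs = {'cold': 10.0, 'hot': 90.0}
    total = sum(memberships.values())
    return sum(memberships[k] * outputs[k] for k in outputs) / total

m = fuzzify(20)                 # cold=0.25, hot=0.25 -> equally warm and cool
speed = infer_and_defuzzify(m)  # midway fan speed
```

Note that 20 degrees is partially "cold" and partially "hot" at the same time; reasoning over such partial memberships is the whole point of fuzzy logic.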

Q16. Explain the components of Expert Systems.

  • Knowledge Base − It contains domain-specific and high-quality knowledge.
  • Inference Engine − It acquires and manipulates knowledge from the knowledge base to arrive at a particular solution.
  • User Interface − It provides interaction between the user and the Expert System itself.

Q17. How is Computer Vision and AI related?

Computer Vision is a field of Artificial Intelligence that is used to obtain information from images or multi-dimensional data. Machine Learning algorithms such as K-Means are used for image segmentation, Support Vector Machines are used for image classification, and so on.

Therefore Computer Vision makes use of AI technologies to solve complex problems such as Object Detection, Image Processing, etc.

Q18. Which is better for image classification? Supervised or unsupervised classification? Justify.

  • In supervised classification, the training images are manually labeled and interpreted by a Machine Learning expert to create feature classes.
  • In unsupervised classification, the Machine Learning software creates feature classes based on image pixel values alone.

Therefore, supervised classification is generally the better choice for image classification in terms of accuracy, since the feature classes are built from reliable labeled examples.

Q19. Finite difference filters in image processing are very susceptible to noise. To cope with this, which method can you use so that there would be minimal distortion by noise?

Image smoothing is one of the best methods for reducing noise: it forces pixels to be more similar to their neighbors, which reduces distortions caused by sharp contrasts.
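The simplest smoothing filter is a 3x3 mean (box) filter; a minimal sketch on a tiny grayscale image stored as a list of lists (the pixel values are made up):

```python
# Mean-filter sketch: replace each interior pixel with the average of its
# 3x3 neighborhood. The noisy pixel (99) is pulled toward its neighbors.
def mean_filter(img):
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]              # borders are left unchanged
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            neighborhood = [img[y + dy][x + dx]
                            for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = sum(neighborhood) / 9
    return out

noisy = [[10, 10, 10],
         [10, 99, 10],
         [10, 10, 10]]
smoothed = mean_filter(noisy)
# center pixel: (8*10 + 99) / 9, roughly 19.9 -- much closer to its neighbors
```

This is exactly why smoothing is applied before finite difference filters: averaging suppresses isolated noise spikes that differencing would otherwise amplify.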

Q20. How is Game theory and AI related?

“In the context of artificial intelligence(AI) and deep learning systems, game theory is essential to enable some of the key capabilities required in multi-agent environments in which different AI programs need to interact or compete in order to accomplish a goal.”

Q21. What is the Minimax Algorithm? Explain the terminologies involved in a Minimax problem.

Minimax is a recursive algorithm used to select an optimal move for a player assuming that the other player is also playing optimally.

A game can be defined as a search problem with the following components:

  • Game Tree: A tree structure containing all the possible moves.
  • Initial state: The initial position of the board and showing whose move it is.
  • Successor function: It defines the possible legal moves a player can make.
  • Terminal state: It is the position of the board when the game ends.
  • Utility function: A function that assigns a numeric value to the outcome of a game.
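The components above can be sketched with a hand-built game tree, where internal nodes hold the children reachable via the successor function and leaves hold the utility function’s values. The tree is arbitrary, just to exercise the recursion:

```python
# Minimax sketch over a hand-built game tree. Internal nodes are lists of
# children; leaves are utility values (terminal states).
def minimax(node, maximizing):
    if not isinstance(node, list):             # terminal state: return utility
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# MAX moves at the root; MIN replies at depth 1; leaves are utilities.
tree = [[3, 5], [2, 9], [0, 7]]
print(minimax(tree, maximizing=True))   # -> 3
```

MIN reduces each subtree to its worst case (3, 2, 0), and MAX then picks the best of those, which is why the optimal guaranteed outcome here is 3.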



