Naive Bayes Algorithm
Definition:
The Naive Bayes algorithm is a probabilistic classifier based on Bayes' Theorem with the assumption that the features are conditionally independent given the class label. Despite the "naive" assumption of feature independence, it is highly effective for various real-world applications such as spam filtering, text classification, and recommendation systems.
Characteristics:
- Probabilistic Model: Naive Bayes predicts the class label by calculating the posterior probability of each class from the input features and selecting the class with the highest probability.
- Conditional Independence Assumption: Naive Bayes assumes that each feature is independent of the others given the class label, which simplifies the probability calculations but may not always hold in practice.
- Efficient and Scalable: Naive Bayes is computationally efficient and scales well to large datasets with many features.
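Written out, the decision rule implied by the first two characteristics is the following (standard notation, not taken from the source): the independence assumption lets the joint likelihood factor into a product of per-feature likelihoods.

```latex
% Pick the class c that maximizes the prior times the product of
% per-feature likelihoods (the conditional independence assumption).
\hat{y} = \arg\max_{c} \; P(C = c) \prod_{i=1}^{n} P(x_i \mid C = c)
```

The evidence term P(x) is omitted because it is the same for every class and does not affect the argmax.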
Types of Naive Bayes:
- Gaussian Naive Bayes: used when the features are continuous and can be assumed to follow a Gaussian (normal) distribution.
- Multinomial Naive Bayes: used for discrete data, often in text classification, where the features represent counts or frequencies of words (e.g., spam detection).
- Bernoulli Naive Bayes: applied to binary data, where each feature takes a binary value (e.g., the presence or absence of a word in text classification).
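The three variants map directly onto scikit-learn estimators. A minimal sketch, assuming scikit-learn is installed; the toy feature matrices below are made up for illustration:

```python
# Each variant matches a feature type: continuous -> GaussianNB,
# counts -> MultinomialNB, binary presence/absence -> BernoulliNB.
import numpy as np
from sklearn.naive_bayes import GaussianNB, MultinomialNB, BernoulliNB

y = np.array([0, 0, 1, 1])

# Continuous features (e.g., measurements)
X_cont = np.array([[1.0, 2.1], [0.9, 1.9], [3.0, 4.2], [3.1, 3.9]])
gnb = GaussianNB().fit(X_cont, y)

# Word-count features (e.g., bag-of-words)
X_counts = np.array([[2, 0, 1], [3, 0, 0], [0, 4, 1], [0, 3, 2]])
mnb = MultinomialNB().fit(X_counts, y)

# Binary features: was the word present at all?
X_bin = (X_counts > 0).astype(int)
bnb = BernoulliNB().fit(X_bin, y)

print(gnb.predict([[1.0, 2.0]]))  # a point near the class-0 cluster
```

The only modelling decision is which likelihood P(x_i | C) each estimator assumes; the prior and argmax machinery is identical across all three.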
Steps Involved:
- Input the Data: the algorithm receives labelled training data, where each example consists of a set of features and a corresponding class label.
- Calculate Prior Probabilities: the prior probability of each class is computed from the frequency of each class in the training data.
- Calculate Likelihood: for each feature and class, the likelihood is estimated by determining how likely it is to observe a particular feature value given that class.
- Apply Bayes' Theorem: using Bayes' Theorem, the posterior probability of each class is calculated from the priors and likelihoods:

  P(C | x) = P(x | C) · P(C) / P(x)

  where P(C | x) is the posterior probability of the class given the feature vector, P(x | C) is the likelihood, P(C) is the prior, and P(x) is the evidence.
- Classify New Data: for a new data point, the algorithm computes the posterior probability for each class and assigns the label of the class with the highest probability.
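The steps above can be sketched from scratch for the Gaussian variant. This is a minimal illustration, not a production implementation; the function names `fit` and `predict` are chosen for clarity, not a standard API:

```python
# From-scratch Gaussian Naive Bayes following the listed steps:
# estimate priors and per-class Gaussians, then apply Bayes' theorem
# in log space and pick the argmax class.
import math
from collections import defaultdict

def fit(X, y):
    """Steps 1-3: group data by class, compute priors and feature mean/variance."""
    by_class = defaultdict(list)
    for xi, yi in zip(X, y):
        by_class[yi].append(xi)
    model, n = {}, len(y)
    for c, rows in by_class.items():
        prior = len(rows) / n
        means = [sum(col) / len(col) for col in zip(*rows)]
        # small epsilon avoids zero variance for constant features
        vars_ = [sum((v - m) ** 2 for v in col) / len(col) + 1e-9
                 for col, m in zip(zip(*rows), means)]
        model[c] = (prior, means, vars_)
    return model

def predict(model, x):
    """Steps 4-5: log-posterior per class via Bayes' theorem, return the argmax."""
    best_class, best_logp = None, float("-inf")
    for c, (prior, means, vars_) in model.items():
        logp = math.log(prior)
        for v, m, var in zip(x, means, vars_):
            # log of the Gaussian likelihood N(v; m, var)
            logp += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if logp > best_logp:
            best_class, best_logp = c, logp
    return best_class
```

Working in log space replaces the product of likelihoods with a sum, which avoids numerical underflow when many features are involved; the evidence P(x) is dropped because it is constant across classes.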
Problem Statement:
Given a dataset with multiple features and corresponding class labels, the objective is to train a Naive Bayes classifier that can predict the class label for new, unseen data based on the calculated probabilities.
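An end-to-end version of this problem statement, sketched with scikit-learn and the classic Iris dataset (assumes scikit-learn is installed; the split and seed are arbitrary choices for illustration):

```python
# Train on labelled data, then predict class labels for held-out examples.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

clf = GaussianNB().fit(X_train, y_train)   # estimate priors and likelihoods
accuracy = clf.score(X_test, y_test)       # evaluate on unseen data
print(f"test accuracy: {accuracy:.2f}")
```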
Key Concepts:
- Bayes' Theorem: Bayes' Theorem is a mathematical formula used to calculate conditional probabilities. It is expressed as:

  P(A | B) = P(B | A) · P(A) / P(B)
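A quick numeric check of Bayes' Theorem in a toy spam-filter setting; all the probabilities below are invented for illustration:

```python
# Given a prior P(spam) and likelihoods P(word | class), compute the
# posterior P(spam | word) with Bayes' theorem.
p_spam = 0.2              # prior: 20% of mail is spam (made-up number)
p_word_given_spam = 0.7   # likelihood of seeing the word in spam
p_word_given_ham = 0.1    # likelihood of seeing the word in ham

# Evidence P(word) via the law of total probability
p_word = p_word_given_spam * p_spam + p_word_given_ham * (1 - p_spam)

# Posterior: P(spam | word) = P(word | spam) * P(spam) / P(word)
p_spam_given_word = p_word_given_spam * p_spam / p_word
print(round(p_spam_given_word, 3))
```

Seeing the word raises the spam probability from the 0.2 prior to roughly 0.64, which is exactly the kind of update a Naive Bayes spam filter performs for every feature.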