Dimensionality reduction is a technique used in machine learning to reduce the number of features in a dataset while retaining only the most important components.
- Dimensionality reduction, or dimension reduction, is the transformation of data from a high-dimensional space into a low-dimensional
space so that the low-dimensional representation retains some meaningful properties of the original data, ideally close to its intrinsic
dimension. Working in high-dimensional spaces can be undesirable for many reasons; raw data are often sparse as a consequence of the curse
of dimensionality, and analyzing the data is usually computationally intractable. Dimensionality reduction is common in fields that deal
with large numbers of observations and/or large numbers of variables, such as signal processing, speech recognition, neuroinformatics,
and bioinformatics. Methods are commonly divided into linear and non-linear approaches.[1] Approaches can also be divided into feature
selection and feature extraction.[2] Dimensionality reduction can be used for noise reduction, data visualization, cluster analysis,
or as an intermediate step to facilitate other analyses.

- Reduces computational complexity
- Reduces overfitting
- Helps in visualization by reducing high-dimensional data to two or three dimensions.
- Dimensionality reduction is used in both supervised and unsupervised learning settings.
- PCA can be used in both supervised and unsupervised settings (it does not use class labels), whereas LDA can be used only in supervised settings (it requires class labels).
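The contrast above can be sketched in code, assuming scikit-learn is available (the Iris dataset here is just an illustrative choice): PCA fits on the features alone, while LDA additionally requires the class labels.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)  # 150 samples, 4 features, 3 classes

# PCA is unsupervised: fit uses only the feature matrix X.
X_pca = PCA(n_components=2).fit_transform(X)

# LDA is supervised: fit requires the class labels y.
X_lda = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)

print(X_pca.shape, X_lda.shape)  # both reduce 4 features to 2
```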
PCA -- Principal Component Analysis
- PCA exploits the correlation between features: correlated variables are combined into new components that are mutually uncorrelated.
- The principal components in PCA are linear combinations of the original variables, computed from the eigenvalues and eigenvectors of the covariance matrix.
- The principal components are orthogonal to each other.
- The first principal component represents the direction of maximum variance.
- PCA is sensitive to feature scale, so it performs best on a normalized (standardized) dataset.
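The PCA properties listed above can be verified directly with a minimal NumPy sketch (the synthetic data here is an assumption for illustration): standardize the data, eigen-decompose the covariance matrix, and project onto the leading eigenvectors. The components come out orthogonal, and the first one carries the maximum variance.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:, 1] = 2 * X[:, 0] + rng.normal(scale=0.1, size=200)  # correlated features

# Standardize: PCA is scale-sensitive.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)

# Eigen-decompose the covariance matrix.
cov = np.cov(Xs, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)        # returned in ascending order
order = np.argsort(eigvals)[::-1]             # sort descending by variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Project onto the top-2 principal components.
Z = Xs @ eigvecs[:, :2]

# The eigenvectors are orthonormal, and the variance of the first
# projected component equals the largest eigenvalue.
```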
LDA -- Linear Discriminant Analysis
- LDA tries to reduce the dimensions of the feature set while also retaining the information that discriminates between the output class labels.
- LDA tries to find a decision boundary around each class cluster.
- It then projects the data points into a new dimension such that the clusters are separated from each other as much as possible, and the individual points within a cluster lie close to that cluster's centroid.
- These new dimensions form the linear discriminants of the feature set.
- Choose PCA --> when the data distribution is highly irregular (e.g., skewed).
- Choose LDA --> when the classes are roughly uniformly distributed.
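The LDA projection described above can be sketched with scikit-learn (assumed available; the Iris dataset is an illustrative choice). Note that LDA yields at most (number of classes - 1) discriminants, so three classes allow at most two new dimensions.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)  # 4 features, 3 classes

# With 3 classes, at most 3 - 1 = 2 linear discriminants exist.
lda = LinearDiscriminantAnalysis(n_components=2)
Z = lda.fit_transform(X, y)

# In the projected space, each class forms a cluster around its centroid.
centroids = np.array([Z[y == c].mean(axis=0) for c in np.unique(y)])
print(Z.shape, centroids.shape)
```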