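When analyzing datasets, it’s not enough to know measures of central tendency (mean, median, mode) and variability (variance, standard deviation). The shape of the distribution matters too, and skewness and kurtosis are the two standard measures of it.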
Skewness: The Measure of Asymmetry
Definition: Skewness measures the degree and direction of asymmetry in a distribution around its mean.
Formula:
$$\text{Skewness} = \frac{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^3}{\left(\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2\right)^{3/2}}$$
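As a minimal sketch, the formula above can be computed in plain Python. The function name `skewness` and the two small datasets are illustrative assumptions, not part of the original article. The division by n (rather than n − 1) follows the formula as written, so this is the population version without any bias correction.

```python
def skewness(xs):
    """Moment-based skewness: third central moment over variance**1.5."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n  # second central moment (variance)
    m3 = sum((x - mean) ** 3 for x in xs) / n  # third central moment
    return m3 / m2 ** 1.5

# Made-up data for illustration only
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 2, 2, 3, 3, 3, 20]

print(skewness(symmetric))     # zero: deviations on both sides cancel
print(skewness(right_tailed))  # positive: the value 20 stretches the right tail
```

A result near zero suggests symmetry, a positive result a longer right tail, and a negative result a longer left tail.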
Real-Life Example:
- Income distribution → Often positively skewed: most people earn modest to middling wages, while a small number of very high earners stretch the tail to the right and pull the mean above the median.
- Exam scores → If most students score high but a few score very low, the distribution is negatively skewed, with the tail stretching to the left.
Kurtosis: The Measure of Tailedness
Definition: Kurtosis measures the heaviness of tails in a distribution compared to a normal distribution.
Formula:
$$\text{Kurtosis} = \frac{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^4}{\left(\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2\right)^{2}}$$
- A normal distribution has a kurtosis of 3 (called mesokurtic).
- To make interpretation easier, analysts often use excess kurtosis = kurtosis – 3, so a normal distribution scores 0, heavy tails score above 0, and light tails score below 0.
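The kurtosis formula and the excess-kurtosis adjustment can be sketched the same way. The function names `kurtosis` and `excess_kurtosis` and the sample list are hypothetical, chosen here for illustration. As in the formula, the moments divide by n.

```python
def kurtosis(xs):
    """Moment-based kurtosis: fourth central moment over variance squared."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n  # second central moment (variance)
    m4 = sum((x - mean) ** 4 for x in xs) / n  # fourth central moment
    return m4 / m2 ** 2

def excess_kurtosis(xs):
    """Kurtosis shifted so that a normal distribution scores 0."""
    return kurtosis(xs) - 3

data = [1, 2, 3, 4, 5]  # made-up, evenly spread values
print(kurtosis(data))         # below 3: evenly spread data has light tails
print(excess_kurtosis(data))  # therefore negative
```

For this evenly spaced list the variance is 2 and the fourth central moment is 6.8, so the kurtosis is 6.8 / 4 = 1.7 and the excess kurtosis is −1.3.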
Real-Life Example:
- Stock returns → Usually leptokurtic (heavy-tailed). This means extreme ups and downs occur more frequently than in a normal curve.
- Heights of people → Typically close to mesokurtic, since extreme deviations are rare.
- Uniform distribution → Platykurtic (light-tailed): its values are bounded, so there are no extreme outliers, and its excess kurtosis is −1.2.
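To see the three tail categories side by side, here is a rough simulation using only Python's standard library. It uses synthetic draws, not real stock or height data: a Laplace distribution stands in as an assumed example of a heavy-tailed shape, and the seed and sample size are arbitrary choices.

```python
import random

def excess_kurtosis(xs):
    """Moment-based kurtosis minus 3 (a normal distribution scores about 0)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / m2 ** 2 - 3

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 50_000

uniform = [rng.uniform(0, 1) for _ in range(n)]  # light tails (platykurtic)
normal = [rng.gauss(0, 1) for _ in range(n)]     # reference shape (mesokurtic)
# Laplace draws, built as the difference of two exponential draws:
# an assumed stand-in for heavy-tailed data (leptokurtic)
laplace = [rng.expovariate(1) - rng.expovariate(1) for _ in range(n)]

for name, sample in [("uniform", uniform), ("normal", normal), ("laplace", laplace)]:
    print(f"{name:8s} excess kurtosis = {excess_kurtosis(sample):+.2f}")
```

The printed values should land near the theoretical ones: about −1.2 for the uniform sample, about 0 for the normal sample, and about +3 for the Laplace sample, with some sampling noise.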
Key Difference
- Skewness → Tells us about the direction of asymmetry (left-skewed, right-skewed, or symmetric).
- Kurtosis → Tells us about the weight of the tails (normal, heavy, or light).
Why It Matters
Understanding skewness and kurtosis helps analysts:
- Detect outliers and anomalies.
- Choose suitable statistical models (many assume normality).
- Improve preprocessing before applying machine learning.
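As one illustration of the preprocessing point, a log transform is a common way to reduce right skew in strictly positive data before modeling. The sketch below is an assumption-laden example, not a recommendation from the article: the data is synthetic (log-normal draws with an arbitrary seed), and the helper `skewness` is the same moment-based formula used earlier.

```python
import math
import random

def skewness(xs):
    """Moment-based skewness: third central moment over variance**1.5."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

rng = random.Random(42)  # fixed seed so the run is repeatable

# Synthetic right-skewed feature (log-normal draws are always positive)
raw = [rng.lognormvariate(0, 1) for _ in range(20_000)]

# Log transform: compresses large values more than small ones
logged = [math.log(x) for x in raw]

print(f"skewness before: {skewness(raw):.2f}")
print(f"skewness after:  {skewness(logged):.2f}")
```

Because the logarithm of a log-normal variable is normal, the skewness after the transform should be close to zero here. Real data will rarely be this clean, so it is worth checking the skewness again after any transform.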
by gregory.tech