Data analysis Part 1: Mean and Median. When is the Median recommended?
The mean and the median are two important parameters in data analysis for describing the central tendency of data.
Let's calculate both parameters using India's historical inflation data.
Table 1. India's historical inflation data
| Sl | Year | % inflation |
| 1 | 2021 | 5.13 |
| 2 | 2020 | 6.62 |
| 3 | 2019 | 3.73 |
| 4 | 2018 | 3.94 |
| 5 | 2017 | 3.33 |
| 6 | 2016 | 4.95 |
| 7 | 2015 | 4.91 |
| 8 | 2014 | 6.67 |
| 9 | 2013 | 11.06 |
| 10 | 2012 | 9.31 |
| 11 | 2011 | 8.86 |
| 12 | 2010 | 11.99 |
| 13 | 2009 | 10.88 |
| 14 | 2008 | 8.35 |
| 15 | 2007 | 6.37 |
| 16 | 2006 | 5.8 |
| 17 | 2005 | 4.25 |
| 18 | 2004 | 3.77 |
| 19 | 2003 | 3.81 |
| 20 | 2002 | 4.3 |
| 21 | 2001 | 3.78 |
| 22 | 2000 | 4.01 |
| 23 | 1999 | 4.67 |
| 24 | 1998 | 13.23 |
| 25 | 1997 | 7.16 |
| 26 | 1996 | 8.98 |
| 27 | 1995 | 10.22 |
| 28 | 1994 | 10.25 |
| 29 | 1993 | 6.33 |
| 30 | 1992 | 11.79 |
| 31 | 1991 | 13.87 |
| 32 | 1990 | 8.97 |
| 33 | 1989 | 7.07 |
| 34 | 1988 | 9.38 |
| 35 | 1987 | 8.8 |
| 36 | 1986 | 8.73 |
| 37 | 1985 | 5.56 |
| 38 | 1984 | 8.32 |
| 39 | 1983 | 11.87 |
| 40 | 1982 | 7.89 |
| 41 | 1981 | 13.11 |
| 42 | 1980 | 11.35 |
| 43 | 1979 | 6.28 |
| 44 | 1978 | 2.52 |
| 45 | 1977 | 8.31 |
| 46 | 1976 | -7.63 |
| 47 | 1975 | 5.75 |
| 48 | 1974 | 28.6 |
| 49 | 1973 | 16.94 |
| 50 | 1972 | 6.44 |
| 51 | 1971 | 3.08 |
| 52 | 1970 | 5.09 |
| 53 | 1969 | -0.58 |
| 54 | 1968 | 3.24 |
| 55 | 1967 | 13.06 |
| 56 | 1966 | 10.8 |
| 57 | 1965 | 9.47 |
| 58 | 1964 | 13.36 |
| 59 | 1963 | 2.95 |
| 60 | 1962 | 3.63 |
| 61 | 1961 | 1.7 |
| 62 | 1960 | 1.78 |
1. Mean
1.1. Significance
The mean represents the average value of the data. It is the most commonly used parameter for estimating the central tendency of the data.
1.2. Calculation
The mean is calculated by dividing the sum of the individual measurements by the number of measurements.
For the 62 values in Table 1, the mean is 7.39.
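As a worked check, the calculation for Table 1 can be written out as follows. The symbols $x_i$ (the individual yearly values) and $n$ (the number of values) are notation introduced here for illustration; they do not appear in the original text.

$$
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{458.16}{62} \approx 7.39
$$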
1.3. Drawback
If the data is not uniformly distributed, that is, if it contains a few extreme values, the mean is pulled towards those values and no longer represents the middle value of the data.
2. Median
2.1. Significance of median
The median overcomes this drawback of the mean: when the data is not uniformly distributed, the median can still represent the central tendency. The examples below illustrate this.
2.2. Median vs Mean
Example 1
Suppose there are 6 employees in an office, each in a different salary group.
The lowest salary is ₹ 10,000/-, the highest salary is ₹ 2,00,000/-, and the other salaries fall between these two extremes:
- ₹ 10,000/-
- ₹ 20,000/-
- ₹ 25,000/-
- ₹ 35,000/-
- ₹ 50,000/-
- ₹ 2,00,000/-
Five of the six salaries lie between ₹ 10,000/- and ₹ 50,000/-, but the single salary of ₹ 2,00,000/- pulls the mean up to about ₹ 56,667/-, which is higher than five of the six salaries, so the mean no longer represents the central tendency. The median of ₹ 30,000/- (the average of the two middle values, ₹ 25,000/- and ₹ 35,000/-) still does. That is the beauty of the median: it represents central tendency better than the mean for non-uniformly distributed data, whereas for uniformly distributed data the mean and the median are close to each other.
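The figures in Example 1 can be reproduced with a short Python sketch. It uses the standard library's `statistics` module; the variable names are chosen here for illustration only.

```python
from statistics import mean, median

# Salaries from Example 1, in rupees
salaries = [10_000, 20_000, 25_000, 35_000, 50_000, 200_000]

salary_mean = mean(salaries)      # sum of the values divided by their count
salary_median = median(salaries)  # average of the two middle values (even count)

print(f"mean   = {salary_mean:,.2f}")
print(f"median = {salary_median:,.2f}")
```

Removing the ₹ 2,00,000/- entry from the list and rerunning the script shows how strongly a single extreme value moves the mean compared with the median.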
Example 2
Suppose there are 10 students in a Ph.D. coursework class. Their ages are as follows:
- 22
- 22.1
- 22.3
- 22.4
- 22.5
- 22.9
- 23
- 23.1
- 23.4
- 23.9
Here the mean (22.76) and the median (22.7) are nearly equal. Now suppose one more student, say a 50-year-old in-service person on deputation, joins the class:
- 22
- 22.1
- 22.3
- 22.4
- 22.5
- 22.9
- 23
- 23.1
- 23.4
- 23.9
- 50
Here the mean (25.2) no longer represents the central tendency, but the median (22.9) still does.
Compared with the mean, the median is less susceptible to extreme values and is a more reliable measure of central tendency for both uniform and non-uniform data.
Here, uniform data means data whose values are all of similar magnitude. Non-uniform data means data that, in addition to values of similar magnitude, contains a few extreme values, that is, values that are either very high or very low.
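The effect described in Example 2 can be checked with a small Python sketch that computes both parameters before and after the 50-year-old student joins. The list names are illustrative assumptions, not part of the original example.

```python
from statistics import mean, median

# Ages of the 10 students from Example 2
ages = [22, 22.1, 22.3, 22.4, 22.5, 22.9, 23, 23.1, 23.4, 23.9]

# The same class after one 50-year-old student joins
ages_with_outlier = ages + [50]

for label, data in [("10 students", ages), ("11 students", ages_with_outlier)]:
    print(f"{label}: mean = {mean(data):.2f}, median = {median(data):.2f}")
```

The mean shifts by roughly 2.5 years, while the median moves by only 0.2 years.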
2.3. Calculation
The median is found by first arranging the data in increasing or decreasing order and then identifying the middle value.
If the total number of measurements is odd, the median is simply the middle value of the ordered data.
If the total number of measurements is even, the median is the sum of the two middle values divided by 2.
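The rule above can be sketched as a small Python function. The name `median_of` is hypothetical and chosen only for this illustration; in practice the standard library's `statistics.median` does the same job.

```python
def median_of(values):
    """Return the median of a non-empty sequence of numbers."""
    ordered = sorted(values)  # step 1: arrange the data in increasing order
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of empty data is undefined")
    mid = n // 2
    if n % 2 == 1:
        # odd count: the single middle value is the median
        return ordered[mid]
    # even count: add the two middle values and divide by 2
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_of([3, 1, 2]))     # odd number of values
print(median_of([4, 1, 3, 2]))  # even number of values
```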
Table 1a: India's historical inflation values arranged in increasing order (1960-2021)
| New Sl | Old Sl | Year | % Inflation |
| 1 | 46 | 1976 | -7.63 |
| 2 | 53 | 1969 | -0.58 |
| 3 | 61 | 1961 | 1.7 |
| 4 | 62 | 1960 | 1.78 |
| 5 | 44 | 1978 | 2.52 |
| 6 | 59 | 1963 | 2.95 |
| 7 | 51 | 1971 | 3.08 |
| 8 | 54 | 1968 | 3.24 |
| 9 | 5 | 2017 | 3.33 |
| 10 | 60 | 1962 | 3.63 |
| 11 | 3 | 2019 | 3.73 |
| 12 | 18 | 2004 | 3.77 |
| 13 | 21 | 2001 | 3.78 |
| 14 | 19 | 2003 | 3.81 |
| 15 | 4 | 2018 | 3.94 |
| 16 | 22 | 2000 | 4.01 |
| 17 | 17 | 2005 | 4.25 |
| 18 | 20 | 2002 | 4.3 |
| 19 | 23 | 1999 | 4.67 |
| 20 | 7 | 2015 | 4.91 |
| 21 | 6 | 2016 | 4.95 |
| 22 | 52 | 1970 | 5.09 |
| 23 | 1 | 2021 | 5.13 |
| 24 | 37 | 1985 | 5.56 |
| 25 | 47 | 1975 | 5.75 |
| 26 | 16 | 2006 | 5.8 |
| 27 | 43 | 1979 | 6.28 |
| 28 | 29 | 1993 | 6.33 |
| 29 | 15 | 2007 | 6.37 |
| 30 | 50 | 1972 | 6.44 |
| 31 | 2 | 2020 | 6.62 |
| 32 | 8 | 2014 | 6.67 |
| 33 | 33 | 1989 | 7.07 |
| 34 | 25 | 1997 | 7.16 |
| 35 | 40 | 1982 | 7.89 |
| 36 | 45 | 1977 | 8.31 |
| 37 | 38 | 1984 | 8.32 |
| 38 | 14 | 2008 | 8.35 |
| 39 | 36 | 1986 | 8.73 |
| 40 | 35 | 1987 | 8.8 |
| 41 | 11 | 2011 | 8.86 |
| 42 | 32 | 1990 | 8.97 |
| 43 | 26 | 1996 | 8.98 |
| 44 | 10 | 2012 | 9.31 |
| 45 | 34 | 1988 | 9.38 |
| 46 | 57 | 1965 | 9.47 |
| 47 | 27 | 1995 | 10.22 |
| 48 | 28 | 1994 | 10.25 |
| 49 | 56 | 1966 | 10.8 |
| 50 | 13 | 2009 | 10.88 |
| 51 | 9 | 2013 | 11.06 |
| 52 | 42 | 1980 | 11.35 |
| 53 | 30 | 1992 | 11.79 |
| 54 | 39 | 1983 | 11.87 |
| 55 | 12 | 2010 | 11.99 |
| 56 | 55 | 1967 | 13.06 |
| 57 | 41 | 1981 | 13.11 |
| 58 | 24 | 1998 | 13.23 |
| 59 | 58 | 1964 | 13.36 |
| 60 | 31 | 1991 | 13.87 |
| 61 | 49 | 1973 | 16.94 |
| 62 | 48 | 1974 | 28.6 |
Table 1a contains an even number of measurements (62), so the median is the average of the two middle values, the 31st and the 32nd: (6.62 + 6.67) / 2 = 6.645.
Table 1b: India's historical inflation values, including the projected inflation for 2022, arranged in increasing order (1960-2022)
| New Sl. | Old Sl. | Year | % Inflation |
| 1 | 47 | 1976 | -7.63 |
| 2 | 54 | 1969 | -0.58 |
| 3 | 62 | 1961 | 1.7 |
| 4 | 63 | 1960 | 1.78 |
| 5 | 45 | 1978 | 2.52 |
| 6 | 60 | 1963 | 2.95 |
| 7 | 52 | 1971 | 3.08 |
| 8 | 55 | 1968 | 3.24 |
| 9 | 6 | 2017 | 3.33 |
| 10 | 61 | 1962 | 3.63 |
| 11 | 4 | 2019 | 3.73 |
| 12 | 19 | 2004 | 3.77 |
| 13 | 22 | 2001 | 3.78 |
| 14 | 20 | 2003 | 3.81 |
| 15 | 5 | 2018 | 3.94 |
| 16 | 23 | 2000 | 4.01 |
| 17 | 18 | 2005 | 4.25 |
| 18 | 21 | 2002 | 4.3 |
| 19 | 24 | 1999 | 4.67 |
| 20 | 8 | 2015 | 4.91 |
| 21 | 7 | 2016 | 4.95 |
| 22 | 53 | 1970 | 5.09 |
| 23 | 2 | 2021 | 5.13 |
| 24 | 38 | 1985 | 5.56 |
| 25 | 48 | 1975 | 5.75 |
| 26 | 17 | 2006 | 5.8 |
| 27 | 44 | 1979 | 6.28 |
| 28 | 30 | 1993 | 6.33 |
| 29 | 16 | 2007 | 6.37 |
| 30 | 51 | 1972 | 6.44 |
| 31 | 3 | 2020 | 6.62 |
| 32 | 9 | 2014 | 6.67 |
| 33 | 1 | 2022 | 6.7 |
| 34 | 34 | 1989 | 7.07 |
| 35 | 26 | 1997 | 7.16 |
| 36 | 41 | 1982 | 7.89 |
| 37 | 46 | 1977 | 8.31 |
| 38 | 39 | 1984 | 8.32 |
| 39 | 15 | 2008 | 8.35 |
| 40 | 37 | 1986 | 8.73 |
| 41 | 36 | 1987 | 8.8 |
| 42 | 12 | 2011 | 8.86 |
| 43 | 33 | 1990 | 8.97 |
| 44 | 27 | 1996 | 8.98 |
| 45 | 11 | 2012 | 9.31 |
| 46 | 35 | 1988 | 9.38 |
| 47 | 58 | 1965 | 9.47 |
| 48 | 28 | 1995 | 10.22 |
| 49 | 29 | 1994 | 10.25 |
| 50 | 57 | 1966 | 10.8 |
| 51 | 14 | 2009 | 10.88 |
| 52 | 10 | 2013 | 11.06 |
| 53 | 43 | 1980 | 11.35 |
| 54 | 31 | 1992 | 11.79 |
| 55 | 40 | 1983 | 11.87 |
| 56 | 13 | 2010 | 11.99 |
| 57 | 56 | 1967 | 13.06 |
| 58 | 42 | 1981 | 13.11 |
| 59 | 25 | 1998 | 13.23 |
| 60 | 59 | 1964 | 13.36 |
| 61 | 32 | 1991 | 13.87 |
| 62 | 50 | 1973 | 16.94 |
| 63 | 49 | 1974 | 28.6 |
Table 1b contains an odd number of measurements (63), so the median is the middle (32nd) value: 6.67.
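The results for Table 1, Table 1a and Table 1b can be cross-checked with the Python sketch below. The values are copied from Table 1 (2021 down to 1960), and 6.7 is the projected 2022 figure from Table 1b; the variable names are illustrative assumptions.

```python
from statistics import mean, median

# % inflation from Table 1, ordered from 2021 back to 1960
inflation = [
    5.13, 6.62, 3.73, 3.94, 3.33, 4.95, 4.91, 6.67, 11.06, 9.31,     # 2021-2012
    8.86, 11.99, 10.88, 8.35, 6.37, 5.8, 4.25, 3.77, 3.81, 4.3,      # 2011-2002
    3.78, 4.01, 4.67, 13.23, 7.16, 8.98, 10.22, 10.25, 6.33, 11.79,  # 2001-1992
    13.87, 8.97, 7.07, 9.38, 8.8, 8.73, 5.56, 8.32, 11.87, 7.89,     # 1991-1982
    13.11, 11.35, 6.28, 2.52, 8.31, -7.63, 5.75, 28.6, 16.94, 6.44,  # 1981-1972
    3.08, 5.09, -0.58, 3.24, 13.06, 10.8, 9.47, 13.36, 2.95, 3.63,   # 1971-1962
    1.7, 1.78,                                                       # 1961-1960
]

mean_1960_2021 = mean(inflation)
median_1960_2021 = median(inflation)            # even count: average of two middle values

# Add the projected 2022 value from Table 1b
median_1960_2022 = median(inflation + [6.7])    # odd count: single middle value

print(f"values (1960-2021): {len(inflation)}")
print(f"mean   (1960-2021): {mean_1960_2021:.2f}")
print(f"median (1960-2021): {median_1960_2021:.3f}")
print(f"median (1960-2022): {median_1960_2022:.2f}")
```

`statistics.median` sorts the data internally, so the list does not need to be rearranged as in Tables 1a and 1b.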
Conclusion
Compared with the mean, the median is less susceptible to extreme values and is a more reliable measure of central tendency for both uniform and non-uniform data.
