## Sample essay on misuse of statistical analysis methods

## Introduction

The use of statistics in articles has come under scrutiny in recent years. Although statistical practice in articles has improved, more than half of social science articles adopt statistical methods that contain numerous statistical errors. Common errors include failing to report the statistical methods used and using an inappropriate method to test the statistical hypothesis.

The article analyzed here, on the absence of democracy and gender inequality in education, is reviewed for misuse of statistical analysis and reporting. The review examines the statistical analyses used and identifies the errors in them. Errors are defined with respect to the present author's instructions, professional judgment and accepted statistical approach. The common analysis errors are failure to account or adjust for multiple comparisons, use of statistical tests that assume a normal distribution on data that follow a skewed distribution, and failure to describe the statistical tests performed. The dataset in the study covers the 1991-2008 period and 66 countries from Asia, Africa, the Middle East and South America.

The hypothesis of the research was: “The more limited is democracy, the higher is the bias against educating girls, and, therefore, the greater is gender inequality in education” (Darity, 5).

Errors in a study result in misinterpretation of the data, which leads to faulty conclusions. The most common such error is failure to account for the number of comparisons made. When more than two experimental groups are compared with a general control, it is recommended that the p-values for the dummy variables or the significance level be adjusted to account for the comparisons made. In this particular study the sample is large enough that the central limit theorem holds and the sample mean is approximately normally distributed, so the null hypothesis can be tested with the z statistic:

$$H_0: \mu = \Delta, \qquad z = \frac{\bar{x} - \Delta}{s_x / \sqrt{n}}$$
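The z statistic above can be computed directly. The following is a minimal sketch, not the study's own procedure: the function names and the sample values are hypothetical and serve only to illustrate the formula, and the two-sided p-value assumes the normal approximation holds.

```python
import math
from statistics import mean, stdev


def z_statistic(sample, delta):
    """Return z = (x_bar - delta) / (s_x / sqrt(n)) for H0: mu = delta."""
    n = len(sample)
    x_bar = mean(sample)
    s_x = stdev(sample)  # sample standard deviation (n - 1 denominator)
    return (x_bar - delta) / (s_x / math.sqrt(n))


def two_sided_p_value(z):
    """Two-sided p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Made-up girls-to-boys enrolment ratios, for illustration only.
ratios = [0.82, 0.91, 0.88, 0.95, 0.79, 0.90, 0.86, 0.93]
z = z_statistic(ratios, delta=1.0)
print(z, two_sided_p_value(z))
```

With a sample this small a t distribution would normally replace the normal one; the example is kept short only to show the arithmetic.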

Making multiple comparisons without adjusting the p-values, and comparing two treatments at several different points in time, are the likely scenarios in which statistical methods are misused (Good and Hardin, 236). Table 1 shows the regional difference in the girls-to-boys ratio. The panel data model that describes the empirical strategy has the following form:

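$$EFM_{it} = \alpha + \beta\,\mathit{Democracy}_{ijt} + \gamma\,\mathit{TimeTrend}_{t} + \sum_{k} \delta\,\mathit{Region}_{ik} + \sum_{l} \varepsilon\,\mathit{Religion}_{il} + \sum_{m} \zeta\,x_{imt} + \eta\,\mathit{Colony}_{i} + u_{it}$$

where $i = 1, \dots, 78$; $j = 1, \dots, 3$; $k = 1, \dots, 5$; $l = 1, \dots, 6$; $m = 1, \dots, 3$; $t = 1, \dots, 19$.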

The author calculated a separate t-test for each comparison, using the same significance level of 0.005 for each. In this case the overall error rate is $1 - (1 - 0.005)^k$, where $k$ is the number of comparisons. If three such independent comparisons were made in a single experiment, for example the control at three time points versus school attendance for boys and girls, or three measures of a boy's and a girl's chance of attending school versus the general control, the chance of obtaining a falsely significant result in at least one comparison rises from 0.005 to about 0.015. As the number of groups or time points increases, the rate of false significance rises further (Good and Hardin, 239).
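The overall error rate $1 - (1 - \alpha)^k$ is easy to tabulate. The sketch below assumes independent comparisons, as the formula does; the function name is hypothetical. For $\alpha = 0.005$ and $k = 3$ it evaluates to roughly 0.0149.

```python
def familywise_error_rate(alpha, k):
    """Chance of at least one false positive in k independent tests at level alpha."""
    return 1 - (1 - alpha) ** k


# Show how the overall error rate grows with the number of comparisons.
for k in (1, 3, 8, 20):
    print(k, familywise_error_rate(0.005, k))
```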

For the two-sample t-test used in each comparison, the notation is:

$\bar{x}_1$ = mean of the first set of values

$\bar{x}_2$ = mean of the second set of values

$S_1$ = standard deviation of the first set of values

$S_2$ = standard deviation of the second set of values

$n_1$ = total number of values in the first set

$n_2$ = total number of values in the second set
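The symbols defined above match the usual two-sample t statistic, although the formula itself does not appear in the text. As a hedged sketch, and assuming the unequal-variance (Welch) form $t = (\bar{x}_1 - \bar{x}_2) / \sqrt{S_1^2/n_1 + S_2^2/n_2}$ rather than a pooled-variance form, the statistic and its approximate degrees of freedom can be computed as follows. The function names are hypothetical.

```python
import math


def two_sample_t(x1_bar, x2_bar, s1, s2, n1, n2):
    """Unequal-variance (Welch) t statistic from summary statistics."""
    standard_error = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return (x1_bar - x2_bar) / standard_error


def welch_degrees_of_freedom(s1, s2, n1, n2):
    """Welch-Satterthwaite approximation to the degrees of freedom."""
    v1 = s1 ** 2 / n1
    v2 = s2 ** 2 / n2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))


# Illustrative summary statistics, not taken from the study.
print(two_sample_t(10.0, 8.0, 2.0, 2.0, 8, 8))
print(welch_degrees_of_freedom(2.0, 2.0, 8, 8))
```

If the author used a pooled-variance test instead, the standard error term would differ, so the choice of form should be checked against the original article.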

To correct this inflated significance level, the decision on whether and how to adjust for multiple comparisons depends on several factors, including planning the comparisons before the study begins, determining the number of comparisons to be made, and including them in the design of the study. The Bonferroni adjustment ($1 - (1 - \alpha)^C \le C\alpha$ for $0 \le \alpha \le 1$), also called the simple adjustment, should be made; it multiplies each p-value by the total number of comparisons made. However, the Bonferroni adjustment becomes conservative when the comparisons are not independent, or when it involves comparing boys and girls attending school in the selected countries, and it therefore reduces the ability to detect significant differences. For example, when comparing the control at eight time points by gender, statistical significance at the 5% level is achieved under the Bonferroni adjustment only when the unadjusted p-value of a comparison is less than 0.00625. Other adjustments available in standard statistical software can also be employed, such as the Student-Newman-Keuls or Tukey procedures.
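The Bonferroni adjustment described above can be sketched in a few lines. The function names and the p-values are hypothetical; the threshold for eight comparisons at the 5% level reproduces the 0.00625 figure.

```python
def bonferroni_adjust(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    c = len(p_values)
    return [min(1.0, p * c) for p in p_values]


def bonferroni_threshold(alpha, c):
    """Per-comparison significance level that keeps the overall level at alpha."""
    return alpha / c


# Made-up unadjusted p-values from eight comparisons, for illustration only.
unadjusted = [0.001, 0.004, 0.007, 0.02, 0.03, 0.2, 0.5, 0.9]
threshold = bonferroni_threshold(0.05, len(unadjusted))
print(threshold)
print([p < threshold for p in unadjusted])
print(bonferroni_adjust(unadjusted))
```

Comparing unadjusted p-values with the reduced threshold and comparing adjusted p-values with the original level are equivalent decisions.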

Many variables in research on the absence of democracy and gender inequality in education are not normally distributed in many countries; they can be positively or negatively skewed depending on socio-economic factors. Therefore, these variables should not be analyzed with standard parametric tests such as analysis of variance or the t-test unless the collected data give evidence of a normal distribution (Darity, 35). In this case, the researcher does not show how the data were distributed. Two alternatives are available. If a log or other transformation brings the data to a normal distribution, the transformed data can be analyzed with parametric methods; otherwise parametric tests should not be used and a non-parametric test is needed. In the study, the data on the absence of democracy and gender inequality in education were normally distributed for some countries and positively or negatively skewed for others, yet the authors reported analyses based on parametric tests even for the non-normal variables. In addition, the study did not provide sufficient information on other methods or alternatives that could have been used. The log transformation of variables in the study is also not clearly described.
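One simple way to see whether a variable is positively or negatively skewed, and whether a log transformation helps, is to compute the sample skewness before and after transforming. This is only an illustrative sketch: the function name and data are hypothetical, the moment-based skewness coefficient is one of several definitions, and it assumes the values are positive and not all equal.

```python
import math
from statistics import mean


def sample_skewness(values):
    """Moment-based skewness: m3 / m2**1.5 (positive means a long right tail)."""
    n = len(values)
    x_bar = mean(values)
    m2 = sum((x - x_bar) ** 2 for x in values) / n
    m3 = sum((x - x_bar) ** 3 for x in values) / n
    return m3 / m2 ** 1.5


# Made-up right-skewed values, for illustration only.
raw = [1.0, 1.2, 1.5, 2.0, 2.4, 3.1, 4.0, 9.5, 21.0]
logged = [math.log(x) for x in raw]
print(sample_skewness(raw), sample_skewness(logged))
```

A skewness near zero after transformation is suggestive rather than conclusive; a formal normality test or a plot of the data would normally accompany it.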

When a variable is log-transformed, the analysis applies to the transformed values only; the geometric mean is obtained by taking the antilog of the mean of the logged data (Darity, 50).
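The relationship between the log transformation and the geometric mean can be checked numerically. The sketch below uses a hypothetical helper and made-up values, and compares the result with `statistics.geometric_mean` from the Python standard library (available from Python 3.8).

```python
import math
from statistics import geometric_mean, mean


def geometric_mean_via_logs(values):
    """Antilog of the mean of the natural logs; values must be positive."""
    logs = [math.log(v) for v in values]
    return math.exp(mean(logs))


# Made-up positive values, for illustration only.
data = [1.0, 10.0, 100.0]
print(geometric_mean_via_logs(data))
print(geometric_mean(data))
print(mean(data))  # the arithmetic mean is pulled upward by the large value
```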

The frequencies used in different countries varied. When reporting a geometric mean, however, it is not recommended to report the antilog of the standard error of the mean of the logged data as the measure of variability. In the study, the differences between means are reported as not statistically significant on the basis of a two-sided t-test for independent samples with unequal variances (t = 0.0077, p = 0.0034). The researcher drew a mistaken conclusion from the absolute difference between the means, and this had a significant effect on the overall outcome.

In the research on lack of democracy and gender inequality in education, the author does not summarize each experiment performed. The data are presented in the form of a single experiment, with the absolute value of t statistics in brackets and significance marked at the 10%, 5% and 1% levels. This is an appropriate way of presenting statistical data. However, combining the data does not account for the variability among the experiments, given that democracy varies with each country's legal structure and cultural aspects (Hills, 87). This is a problem because there is no evidence of the basis on which the author chose which experiment to present and which to omit. The author is presumed to have chosen the results that best support the hypothesis.

## Conclusions

The statistics reported in this article on the absence of democracy and gender inequality in education are straightforward comparisons. When various groups and levels of education are studied, simple comparisons are often applied incorrectly. Statistical analysts are advised to avoid simple and common pitfalls by recognizing and understanding them, which allows them to adopt methods that suit the study and the findings. Inappropriate use of statistical data leads to incorrect reporting of findings and conclusions. It is also important for the researcher to decide on analytical tools during the planning stage of the study. Moreover, the statistical method should be chosen before the data analysis, and this choice should be based on the study design and the data distribution, not on the results.

## Works cited

Darity, William A. International Encyclopedia of the Social Sciences. 2nd ed. Macmillan Reference USA, 2008. Print.

Hills, Michael. Statistical Methods. Open University Press, 1986. Print.

Good, Phillip I., and James W. Hardin. Common Errors in Statistics (and How to Avoid Them). Hoboken, NJ: Wiley-Interscience, 2003. Print.