Statistical Methods & Tests

Depending upon where you are in your research, we can handle all of the statistical considerations for your dissertation methods or results chapter.

  • We can use SPSS, Minitab, and SAS software to perform statistical analyses for you
  • We can perform virtually any conventional statistical analysis that may be needed
  • We offer ongoing telephone and email support to ensure that you understand all of the statistical methods and tests used in your dissertation

Common Statistical Methods

Some of the more common statistical methods that we use are listed below. Several of these tests are described in more detail in the sections that follow.

  • T-Test
  • Paired T-Test
  • Chi-Square Test
  • ANOVA
  • ANCOVA
  • MANOVA
  • Repeated Measures ANOVA
  • Factor Analysis
  • Cluster Analysis
  • Linear Regression
  • Logistic Regression
  • Correlation
  • Mann-Whitney Test
  • Kruskal-Wallis Test
  • Wilcoxon Signed-Rank Test
  • McNemar’s Test
  • Friedman’s Test
  • Survival Analysis

T-Test

Hypothesis Testing and the T-Test

The t-test is probably the most commonly used Statistical Data Analysis procedure for hypothesis testing.

Actually, there are several kinds of t-tests, but the most common is the “two-sample t-test,” also known as the “Student’s t-test” or the “independent-samples t-test.”

T-Test Example

The two-sample t-test simply tests whether two independent populations have different mean values on some measure.

For example, we might have a research hypothesis that rich people have a different quality of life than poor people. We give a questionnaire that measures quality of life to a random sample of rich people and a random sample of poor people. The null hypothesis, which is assumed to be true until proven wrong, is that there is really no difference between these two populations.

We gather some sample data and observe that the two groups have different average scores. But does this represent a real difference between the two populations, or just a chance difference in our samples?

T-Test Statistic

The t-test allows us to answer this question by using the t-test statistic to determine a p-value, which indicates how likely it is that we would have obtained these results by chance alone. By convention, if there is less than a 5% chance of getting the observed differences by chance, we reject the null hypothesis and say we have found a statistically significant difference between the two groups. See Statistical Data Analysis for more information about hypothesis testing.
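To make the calculation concrete, here is a minimal sketch of this two-sample t-test in Python using scipy. The quality-of-life scores below are hypothetical and serve only to illustrate the procedure; SPSS, Minitab, or SAS would perform the same underlying test.

    # Two-sample (independent samples) t-test on hypothetical quality-of-life scores
    from scipy import stats

    rich_scores = [72, 65, 80, 77, 68, 74, 81, 70]   # hypothetical sample of rich respondents
    poor_scores = [60, 55, 71, 58, 66, 62, 59, 64]   # hypothetical sample of poor respondents

    t_stat, p_value = stats.ttest_ind(rich_scores, poor_scores)
    print(f"t = {t_stat:.3f}, p = {p_value:.4f}")

    # By convention, reject the null hypothesis when p < .05
    if p_value < 0.05:
        print("Statistically significant difference between the two groups")
    else:
        print("No statistically significant difference detected")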

Chi-Square Test

Pearson Chi-Square Test

There are several kinds of chi-square tests but the most common is the Pearson chi-square test which allows us to test the independence of two categorical variables. All chi-square tests are based upon a chi-square distribution, similar to the way a t-test is based upon a t distribution or an F-test is based upon an F distribution.

Chi-Square Example

Suppose we have a hypothesis that the pass/fail rate in a particular mathematics class is different for male and female students. Say we take a random sample of 100 students and measure both gender (male/female) and class status (pass/fail) as categorical variables.

The data for these 100 students can be displayed in a contingency table, also known as a cross-classification table. A chi-square test can be used to test the null hypothesis (i.e., that the pass/fail rate is not different for male and female students).

Chi-Square Statistic

Just as in a t-test, or F-test, there is a particular formula for calculating the chi-square test statistic. This statistic is then compared to a chi-square distribution with known degrees of freedom in order to arrive at the p-value.

We use the p-value to decide whether we can reject the null hypothesis. If the p-value is less than “alpha,” which is typically set at .05, then we reject the null hypothesis; in this case, we say that our data indicate that the likelihood of passing the class is related to the student’s gender. See Statistical Data Analysis for more about statistical inference.
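As an illustration, here is a minimal sketch of this Pearson chi-square test in Python using scipy. The contingency-table counts are hypothetical, chosen only to show the mechanics of the test.

    # Pearson chi-square test of independence on a hypothetical 2x2 contingency table
    from scipy.stats import chi2_contingency

    # Rows: male, female; columns: pass, fail (hypothetical counts for 100 students)
    table = [[30, 20],
             [35, 15]]

    chi2, p_value, dof, expected = chi2_contingency(table)
    print(f"chi-square = {chi2:.3f}, df = {dof}, p = {p_value:.4f}")

    # Reject the null hypothesis of independence when p < alpha (.05)
    if p_value < 0.05:
        print("Pass/fail status appears related to gender")
    else:
        print("No evidence that pass/fail status is related to gender")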

MANOVA

MANOVA – Multivariate Analysis of Variance

MANOVA (multivariate analysis of variance) is a multivariate procedure used to analyze data that involve more than one dependent variable at a time. MANOVA allows us to test hypotheses regarding the effect of one or more independent variables on two or more dependent variables.

A MANOVA analysis generates a p-value that is used to determine whether or not the null hypothesis can be rejected.  See Statistical Data Analysis for more information.

MANOVA Example

Suppose we have a hypothesis that a new teaching style is better than the standard method for teaching math. We may want to look at the effect of teaching style (independent variable) on the average values of several dependent variables such as student satisfaction, number of student absences and math scores. A MANOVA procedure allows us to test our hypothesis for all three dependent variables at once.

More About MANOVA

As in the example above, a MANOVA is often used to detect differences in the average values of the dependent variables between the different levels of the independent variable. Interestingly, in addition to detecting differences in the average values, a MANOVA can also detect differences in correlations among the dependent variables between the different levels of the independent variable.

MANOVA is simply one of many multivariate analyses that can be performed using SPSS. The SPSS MANOVA procedure is a standard, well-accepted means of performing this analysis.
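For readers who want to see the procedure in code, here is a minimal sketch of the teaching-style example in Python using statsmodels rather than SPSS; the data frame and its column names are hypothetical.

    # One-way MANOVA: teaching style predicting three dependent variables at once
    import pandas as pd
    from statsmodels.multivariate.manova import MANOVA

    df = pd.DataFrame({
        "style":        ["new"] * 4 + ["standard"] * 4,            # independent variable
        "satisfaction": [8.1, 7.9, 8.4, 8.0, 6.5, 7.0, 6.8, 6.6],  # hypothetical DV 1
        "absences":     [2, 1, 3, 2, 5, 4, 6, 5],                  # hypothetical DV 2
        "math_score":   [88, 91, 85, 90, 72, 75, 70, 74],          # hypothetical DV 3
    })

    manova = MANOVA.from_formula("satisfaction + absences + math_score ~ style", data=df)
    print(manova.mv_test())   # Wilks' lambda, Pillai's trace, etc., with p-values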

Linear Regression

Linear regression is a common Statistical Data Analysis technique. It is used to determine the extent to which there is a linear relationship between a dependent variable and one or more independent variables. There are two types of linear regression: simple linear regression and multiple linear regression.

In simple linear regression a single independent variable is used to predict the value of a dependent variable. In multiple linear regression two or more independent variables are used to predict the value of a dependent variable. The difference between the two is the number of independent variables. In both cases there is only a single dependent variable.

Linear Regression – Data Considerations

The dependent variable must be measured on a continuous scale (e.g., a 0-100 test score), and the independent variable(s) can be measured on either a categorical (e.g., male versus female) or continuous scale. There are several other assumptions that the data must satisfy for linear regression to be appropriate.

Correlation and Regression

Simple linear regression is similar to correlation in that the purpose is to measure to what extent there is a linear relationship between two variables. The major difference between the two is that correlation makes no distinction between independent and dependent variables while linear regression does. In particular, the purpose of linear regression is to “predict” the value of the dependent variable based upon the values of one or more independent variables.
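As a sketch of both forms, the code below fits a simple and a multiple linear regression in Python using statsmodels; the test-score data and predictor names are hypothetical.

    # Simple and multiple linear regression on hypothetical test-score data
    import pandas as pd
    import statsmodels.formula.api as smf

    df = pd.DataFrame({
        "score":      [55, 61, 68, 72, 78, 83, 88, 94],   # dependent variable (0-100 test score)
        "hours":      [1, 2, 3, 4, 5, 6, 7, 8],           # continuous independent variable
        "attendance": [60, 65, 70, 75, 82, 85, 90, 95],   # second continuous independent variable
    })

    simple = smf.ols("score ~ hours", data=df).fit()                 # one independent variable
    multiple = smf.ols("score ~ hours + attendance", data=df).fit()  # two independent variables

    print(simple.params)       # intercept and slope used to predict the dependent variable
    print(multiple.summary())  # coefficients, R-squared, and p-values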

Correlation

Understanding the Correlation Coefficient

In statistical data analysis we sometimes use a correlation coefficient to quantify the linear relationship between two variables.

The most commonly used correlation statistic is the Pearson correlation coefficient. This statistic measures both the strength and direction of the linear relationship between two variables.

Correlation Example

Suppose we want to look at the relationship between age and height in children. We select a group of children for study, and for each child we record their age in years and their height in inches. We could plot these values on a graph so that the child’s age would be on the horizontal axis and the child’s height would be on the vertical axis. Each dot on the plot represents a single child’s age and height. This is called a scatter plot.

Since older children are generally taller than younger children, we would expect the dots on the plot to roughly approximate a straight line (a linear relationship between the variables) and the line to slope upward (since age and height tend to increase together).
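To show how this is computed, here is a minimal sketch of the age/height correlation in Python using scipy; the ages and heights below are hypothetical.

    # Pearson correlation between age and height for a hypothetical group of children
    from scipy.stats import pearsonr

    ages_years     = [4, 5, 6, 7, 8, 9, 10, 11]         # horizontal axis of the scatter plot
    heights_inches = [40, 42, 45, 47, 50, 52, 54, 57]   # vertical axis of the scatter plot

    r, p_value = pearsonr(ages_years, heights_inches)

    # r falls between -1 and +1; a positive r means height tends to increase with age
    print(f"Pearson r = {r:.3f}, p = {p_value:.4f}")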

Correlation Coefficient

The Pearson correlation coefficient is a number between -1 and +1 that measures both the strength and direction of the linear relationship between two variables.

The magnitude of the number represents the strength of the correlation. A correlation coefficient of zero represents no linear relationship (the scatter plot does not resemble a straight line at all), while a correlation coefficient of -1 or +1 means that the relationship is perfectly linear (all of the dots fall exactly on a straight line).

The sign (+/-) of the correlation coefficient indicates the direction of the correlation. A positive (+) correlation coefficient means that as values on one variable increase, values on the other variable tend to also increase; a negative (-) correlation coefficient means that as values on one variable increase, values on the other tend to decrease, that is, they tend to go in opposite directions.