Hypothesis Testing In Research


This topic will discuss what a hypothesis is, the place of hypothesis testing in research and what the steps for hypothesis testing are. 

Hypothesis: A prediction of the outcome of a study. Hypotheses are drawn from theories and research questions or from direct observations.

In fact, a research problem can be formulated as a hypothesis. To test the hypothesis we need to formulate it in terms that can actually be analysed with statistical tools.

As an example, if we want to explore whether using a specific teaching method at school will result in better school marks (research question), then the hypothesis could be that the mean school marks of students taught with that specific teaching method will be higher than those of students taught using other methods. In this example, we stated a hypothesis about the expected differences between groups. Other hypotheses may refer to correlations between variables.

Thus, to formulate a hypothesis, we need to refer to the descriptive statistics (such as the mean final marks), and specify a set of conditions about these statistics (such as a difference between the means, or in a different example, a positive or negative correlation). The hypothesis we formulate applies to the population of interest.

Null and Alternative Hypotheses

In order to be able to test the hypothesis, we need to formulate what are called the Null and the Alternative Hypotheses, as all further testing is based on retaining one and rejecting the other.

The example below shows how to formulate the null and alternative hypotheses from a research question.

Research Questions

  • Does teaching method influence students’ performance at school?
  • Do students being taught using Teaching Method A perform better at school than those being taught using Teaching Method B? 

Note: Many different hypotheses can come from a research question. Hypotheses are usually more specific than research questions; in this example, we would specify:

  • Which students (e.g. S1)
  • Which performance (e.g. as reflected in their final exam marks of the year)
  • Which teaching methods

A good hypothesis needs to be sufficiently specific and to carry clear implications for testing the expected relations.

Null Hypothesis (H0)

There is no difference in the final exam marks between S1 students taught using Teaching Method A and those taught using Teaching Method B.

NOTE: H0 is formulated in such a way that no differences or relationships are expected. Participants (or groups of participants) are expected to perform in similar ways.

Alternative Hypothesis (Experimental / Research Hypothesis)

  • H1 (Non-directional, two-tailed hypothesis): There is a difference in the final exam marks between S1 students taught using Teaching Method A and those taught using Teaching Method B.
  • H2 (Directional, one-tailed hypothesis): S1 students taught using Teaching Method A perform worse in their final exam than those taught using Teaching Method B.

Note: Alternative hypotheses can be non-directional (H1), in which case we do not make predictions about the direction of the difference (whether one group will do better or worse than the other).

They can also be directional (H2), as we predict in which direction (better or worse) our two groups of participants (those taught using Method A and those using Method B) will differ (as shown by their final exam marks). Alternative hypotheses similar to H2 above are made based on prior evidence, theoretical argument, or direct observations.
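
In symbolic form (notation added here purely for illustration, with μA and μB denoting the population mean final exam marks under Teaching Method A and Teaching Method B respectively), these hypotheses could be written as:

```latex
\begin{align*}
H_0 &: \mu_A = \mu_B    && \text{(no difference)}\\
H_1 &: \mu_A \neq \mu_B && \text{(non-directional, two-tailed)}\\
H_2 &: \mu_A < \mu_B    && \text{(directional, one-tailed)}
\end{align*}
```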

Task 3 - Reflection

How can you “translate” Research Questions to Hypotheses? Have a look at this link.

Now try to formulate your own Research Questions and “convert” them to relevant Hypotheses including variables such as: performance at school, extracurricular activities, socioeconomic status, motivation, attitude towards homework, birth order, and number of siblings. If you still need some help, look at the relevant slides referring to Quantitative methods.

Why State Hypotheses?

The hypothesis guides the selection of a particular research design, set of observations and methods over others.

Based on previous theory and research, research questions are formulated, which are “translated” into hypotheses, which, in turn, are tested using a sample in order to make inferences about the whole population.
 
If we could test the whole population directly, we would not need to formulate hypotheses, conduct inferential statistics and make inferences about the population based on a sample. However, it is often impossible to test the whole population, and we need to make our observations based on a sample.

If differences (or relationships) between variables are revealed, they are tested for significance against the null hypothesis. This test may determine whether these differences (or relationships) are “real”, in other words, whether they are due to true differences between the groups rather than to, say, sampling error.

Sample results are often subject to sampling fluctuations. These fluctuations alone could account for the differences between the mean exam scores in our example. Since we are researching a sample drawn from a population, we should always expect some variation in the sample statistics (such as the mean exam scores in our example) between the groups of students taught using different methods.
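
To make sampling fluctuation concrete, here is a small illustrative simulation (not part of the original text; the population values are made up). Two samples drawn from the very same population will rarely have exactly equal means:

```python
import numpy as np

rng = np.random.default_rng(42)

# Two samples of 30 marks drawn from the *same* population (mean 70, sd 10)
sample_1 = rng.normal(loc=70, scale=10, size=30)
sample_2 = rng.normal(loc=70, scale=10, size=30)

# The sample means differ somewhat even though the population means are identical
print(f"Mean of sample 1: {sample_1.mean():.2f}")
print(f"Mean of sample 2: {sample_2.mean():.2f}")
```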

Steps in Hypothesis Testing

As we have seen earlier in this Unit, hypothesis testing is all about populations and using a sample on the basis of which we make inferences about the population. We have seen so far how to formulate hypotheses, what the place of hypothesis testing in research is, and some important concepts such as sampling distributions, confidence intervals, critical regions and significance levels. In this topic, we will refer to the steps of hypothesis testing.

  • The first step is to formulate the alternative and null hypotheses.
  • The second step is to test the null hypothesis (rather than seeking to support the experimental hypothesis), by carrying out a statistical test of significance to determine whether it can be rejected, and consequently, whether there is a difference between the groups under investigation.

For our example research question (the effect of teaching method on final exam marks), the researcher would run statistical tests to test whether the difference between the means of the two samples of students (those taught using Method A and those taught using Method B) is zero.
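
As a minimal sketch of such a test (using made-up marks and the SciPy library, neither of which appears in the original example), an independent-samples t-test could be run in Python as follows:

```python
from scipy import stats

# Hypothetical final exam marks (out of 100) for two groups of S1 students
marks_method_a = [68, 74, 71, 80, 77, 69, 73, 75]
marks_method_b = [65, 70, 66, 72, 68, 64, 71, 67]

# Independent-samples t-test: the null hypothesis is that the
# population mean marks of the two groups are equal
t_stat, p_value = stats.ttest_ind(marks_method_a, marks_method_b)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```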

Remember that: When testing the hypothesis of a relationship between two variables we calculate a probability: the probability of obtaining such a relationship as a result of sampling error alone (a conditional probability). It is the probability of obtaining the relationship in our sample through sampling error alone, if there were no such relationship in the population. If this probability is small enough, then it makes more sense to conclude that the relationship observed in our sample also exists in the population.

  • In the third step, the sample statistics appropriate for the sample, variables and hypothesis are calculated (in our hypotheses, the mean exam score).
  • In the fourth step, a significance test is conducted, to see if the null hypothesis can be rejected.

To do this, we start with the assumption that the null hypothesis is true, and proceed to determine the probability of obtaining the sample results. This is a crucial step for understanding hypothesis testing.

If the null hypothesis is true, what is the probability of obtaining the sample results?

Hypothesis testing involves the calculation of the probability of observing the data collected. If this probability (also known as the p-value) is small, it would be very unlikely that the observed results would have occurred if the null hypothesis were true.

The research hypothesis is retained if a test of significance shows that, if the research were repeated many times, similar results would occur in at least 95 out of 100 repetitions; in other words, if the p-value (the probability of obtaining the results under the null hypothesis) is less than 5% (we would then write: p < 0.05). This specific criterion of significance level is a convention. (Sometimes two other probability levels are reported, that is, p < 0.01 (odds of 99 to 1) and p < 0.001 (odds of 999 to 1), as will be mentioned later in this topic.)

  • Therefore, in the final step, the decision is made to reject or retain the null hypothesis:

If the probability is small (that is, less than 0.05), the null hypothesis is rejected and the experimental hypothesis is retained, since we can say with some certainty (95% certainty) that the differences discovered between the groups in our example are not due to sampling error but to other factors.

Looking at the odds, we realise that the criterion strongly favours retaining the null hypothesis (95 to 5 at the 0.05 level of significance). To reject the null hypothesis, we require that differences between the groups would be confirmed in at least 95 out of every 100 repetitions of the study.

If the probability is large, the null hypothesis cannot be rejected.
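
A minimal sketch of this decision rule (the p-value below is hypothetical, purely for illustration):

```python
alpha = 0.05     # conventional significance level
p_value = 0.012  # hypothetical p-value obtained from a significance test

if p_value < alpha:
    print("Reject the null hypothesis: the observed difference is statistically significant.")
else:
    print("Fail to reject the null hypothesis: the difference could be due to sampling error.")
```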

Another example is available here.


Type I and Type II Errors

It may become obvious from what has been discussed so far, that, as the procedure of significance testing is based on probabilities, it is not without errors.

We may sometimes incorrectly reject the null hypothesis (reject it when it is true). This is called a Type I error (α).

Other times, we may fail to reject the null hypothesis when the alternative hypothesis is in fact true; this is called a Type II error (β).
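
As a rough illustration of the Type I error rate (a simulation sketch, not part of the original text; the population values and sample sizes are made up), a test at the 0.05 level will reject a true null hypothesis in roughly 5% of repeated studies:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_studies = 10_000
type_i_errors = 0

for _ in range(n_studies):
    # Both groups are drawn from the same population, so the null hypothesis is true
    group_a = rng.normal(loc=70, scale=10, size=30)
    group_b = rng.normal(loc=70, scale=10, size=30)
    _, p = stats.ttest_ind(group_a, group_b)
    if p < 0.05:
        type_i_errors += 1  # a true null hypothesis was rejected

print(f"Proportion of Type I errors: {type_i_errors / n_studies:.3f}")  # close to 0.05
```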
 
For a summarising table and a further explanation of the two types of errors, follow the links: