Different Types Of Scientific Studies And The Hierarchy Of Evidence


This article will give you a brief overview of the different types of studies health researchers use, the relative merits and drawbacks of each method and where they each sit on the general hierarchy of evidence.

In your Atlas Microbiome Test results, we reference all the scientific papers used to generate your report; this is because transparency is paramount to us, and you deserve to know how we reach the conclusions we do.

Sometimes, however, the scientific jargon used in these papers can be a little bit intimidating at first glance: to take a random reference from my own Atlas Health Dashboard, it begins: “A double-blind, placebo-controlled, cross-over study”.

Worry not! You do not need a degree in the sciences to grasp the meaning behind different study designs, simply a few minutes and perhaps a cup of tea. With this grounding, you will better understand the literature underpinning your results. Beyond this, it will help you cast a critical eye next time a headline shouts “Study shows”.

What is the hierarchy of evidence?


The hierarchy of evidence is essentially a league table for different types of scientific studies, usually represented by a pyramid; the higher up you go, the stronger the conclusions of each study are. The reliability of each study, and therefore its place on the pyramid, is determined by how rigorous it is.

Whilst researchers continue to challenge and revise the pyramid, there is a broad consensus about the value of different study types in clinical and health research.

Experimental and observational studies: what is the difference?

The majority of health research can be divided into two categories: observational and experimental studies. In short, observational studies look at a group of individuals in their real-world setting. In these studies, researchers do not intervene with the study group in any way (no treatment or placebo is given).

Also known as descriptive studies, these do not allow researchers to make predictions or establish causality but only to look at correlations, for instance, between smoking and the risk of cancer. As we all know, correlation does not equal causation.

Let’s say there is a correlation between activity levels and microbiome diversity. Whilst interesting and a good avenue for further research, this may be explained because those who exercise often eat a more balanced diet or smoke less than those who lead sedentary lives. These are known as confounding variables.
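As a minimal sketch of how a confounder can manufacture a correlation, the simulation below uses invented data: a hypothetical "diet quality" score drives both activity and microbiome diversity, whilst activity has no direct effect on diversity at all. The variable names, effect sizes and sample size are assumptions made purely for illustration and do not come from any real study.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (spread_x * spread_y)


def simulate(n=5000, seed=1):
    """Invented data: diet drives both activity and diversity."""
    rng = random.Random(seed)
    diet = [rng.gauss(0, 1) for _ in range(n)]
    activity = [d + rng.gauss(0, 1) for d in diet]
    diversity = [d + rng.gauss(0, 1) for d in diet]
    return diet, activity, diversity


diet, activity, diversity = simulate()

# Raw correlation: activity and diversity move together only via diet.
raw_r = pearson(activity, diversity)

# Remove the part explained by diet (known exactly in this toy model).
activity_adj = [a - d for a, d in zip(activity, diet)]
diversity_adj = [v - d for v, d in zip(diversity, diet)]
adjusted_r = pearson(activity_adj, diversity_adj)

print(f"raw correlation:      {raw_r:.2f}")
print(f"adjusted correlation: {adjusted_r:.2f}")
```

In this toy model the raw correlation is clearly positive, whilst the diet-adjusted one sits close to zero. Real studies rarely know the confounder's effect this precisely, which is why observational findings are treated with caution.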

In other cases, correlation is clearly random. Tyler Vigen, a Harvard law student, comically demonstrates this on his aptly named website, Spurious Correlations. Using public data, he creates graphs showing extremely high correlations between things that are clearly unrelated, like:

  • Per capita cheese consumption and number of people who die by becoming tangled in bedsheets
  • Age of Miss America and murder by steam, hot vapours and hot objects
  • Marriage rate in Mississippi and per capita consumption of whole milk (U.S.)

Observational studies can sometimes provide the impetus needed to fund more expensive experimental studies. They can also be helpful when researchers want to see the effects of something that would be unethical to administer in trials.


Experimental studies are where researchers introduce an intervention, for example, a drug, and study the effects on a group of individuals. In an experimental study, there is a treatment group (those receiving the intervention) and a control group (those receiving a placebo). Participants are often randomly allocated to these groups so that confounding variables are distributed equally.

If the study group is large enough, randomisation means that the confounding variables will have roughly the same average value in each group. The intervention, whether that is a drug or diet, is also known as the independent variable.
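To illustrate why study size matters for randomisation, here is a rough simulation under stated assumptions: each participant has a hypothetical baseline stress score (mean 50, standard deviation 10, both invented), participants are shuffled into two equal groups, and we measure how far apart the groups' average stress ends up. The function name and all numbers are illustrative only.

```python
import random
import statistics


def average_stress_gap(n_participants, trials=200, seed=2):
    """Mean absolute difference in baseline stress between two randomly
    allocated groups, averaged over many simulated trials."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(trials):
        # Hypothetical baseline stress scores (mean 50, SD 10).
        stress = [rng.gauss(50, 10) for _ in range(n_participants)]
        rng.shuffle(stress)  # random allocation to groups
        half = n_participants // 2
        treatment, control = stress[:half], stress[half:]
        gap = abs(statistics.fmean(treatment) - statistics.fmean(control))
        gaps.append(gap)
    return statistics.fmean(gaps)


for n in (20, 200, 2000):
    print(f"{n:>5} participants: average gap {average_stress_gap(n):.2f}")
```

The average gap shrinks as the number of participants grows, which is the sense in which randomisation "balances" a confounder: not perfectly in any one trial, but increasingly well as groups get larger.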

Let’s say that researchers are testing a patch that is supposed to minimise nicotine cravings; the patch is the independent variable, and nicotine cravings are the dependent variable. A confounding variable could be levels of stress, for example. Randomised-controlled trials (RCTs) are considered the most reliable type of experimental study and, unlike observational studies, are able to establish causation.

Observational studies: cohort study, case-control study, case report, cross-sectional study
Experimental studies: randomised-controlled trials, quasi-experiments


Randomised-controlled trial

These are considered the gold standard of studies in clinical research, especially when they are large-scale and double-blinded, meaning neither the researchers nor the participants know who received the intervention.

In an RCT, participants are randomly allocated to two or more different groups, with one being given an intervention and another a placebo. In theory, any differences between the groups will be due to the intervention.

By randomising the groups, the potential confounding variables that might influence the results are equally distributed, making the groups statistically comparable. This is important as it means RCTs can discover causation, unlike observational studies.

TIP☝ When you encounter an RCT, keep your eye out for the study size, whether it is blinded and who funded the study, as these could all impact its reliability.

Blinded: In a blinded (single-blinded) study, the participants do not know whether they have received the intervention or a placebo, whilst the researchers do know this information.
Double-blinded: In a double-blinded study, neither the researchers nor the participants know who was given the intervention and who was given the placebo. This helps to eliminate observer bias, which might occur in a single-blinded study.

Quasi-experimental design

Like in a true experimental study, such as an RCT, a quasi-experiment aims to establish cause-and-effect between an independent and dependent variable. Unlike an RCT, however, patients are not randomly assigned to groups.

These types of studies might be used when an RCT cannot be performed for ethical reasons or because the independent variable is an innate characteristic, like biological sex. If you are studying how someone’s sex affects aggression levels, for instance, you cannot randomly assign participants to male and female groups.

Moreover, if you are studying whether violent video games make children more aggressive, the parents might wish to choose whether their children are placed in the control group or exposed to the independent variable (violent games). This means that selection is not random. However, the researchers are still able to control the variables and can establish causation, though with less certainty than in an RCT.


Observational studies:

Cohort studies

In a cohort study, a group of people (cohort) that is alike in many ways is chosen and separated into two groups based on a specific characteristic they differ on. For example, a prospective cohort study might follow a group of nurses, separated into those who smoke and those who do not.

Cohort studies can be both forward-looking (prospective) and backwards-looking (retrospective). In a prospective study, the exposure is identified before the outcome. For instance, smokers are identified and then followed to see how this impacts their health over time.


In a retrospective cohort study, researchers look back through patient records, identify a cohort and then try to reconstruct its experience as if it had been followed up prospectively. To return to the first example, they might find data about a group of nurses and observe the differences in health outcomes between smokers and non-smokers.

Retrospective cohort studies take less time and money than prospective studies, though they can also fail to account for all risk factors. This is because the records might not capture certain exposures which could affect health outcomes.


Cohort studies can help researchers determine how common a disease is and what might influence your risk of getting it. They can also be a good way to discover promising avenues of research, though by nature they take a long time to complete and do not reveal causation.

One of the longest-running prospective cohort studies is the Framingham Heart Study, which began in 1948; three generations have participated, 15,447 individuals have been involved and 3,698 studies have been written based on the data.

In the 1960s, the Framingham study demonstrated a clear correlation between cigarette smoking and the development of heart disease. More recently, it has found high blood pressure and cholesterol to be major risk factors for coronary heart disease.

When you see bold headlines claiming something reduces or increases mortality, they are often reporting on a cohort study. Unfortunately, some journalists are not as responsible as they should be with research and overstate results.

The British Heart Foundation even compiled a list of the most ridiculous health headlines in 2017. Some of my favourites are:

  • CHILLAX ON HOL IS 'DEADLY' Just two weeks lying on a beach on holiday ‘increases the risk of early death’
  • Do YOU have two or more children? You're at risk of heart disease - because they are so expensive to look after
  • Going grey early increases heart attack risk

Case-control studies

Often referred to as retrospective studies, these work backwards to discover potential causes for a specific outcome. For example, researchers might look at two groups of people in the same city, such as those diagnosed with lung cancer (cases) and those without (controls).

They can then interview the groups about different experiences they have had, or check their health records, to try and discover what might have contributed to the outcome in question, also known as risk factors.


These can be good for coming up with hypotheses in the initial stages of research, which can then be tested further. They are also less costly in both time and money than prospective cohort studies. With that being said, if researchers rely on the participants' memories, there is a danger that historical risk factors may be forgotten or misremembered.
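The comparison between cases and controls described above is commonly summarised as an odds ratio. The article itself does not name this statistic, so treat the following as an illustrative sketch: the function name and the counts are invented, not taken from any real study.

```python
def odds_ratio(exposed_cases, unexposed_cases,
               exposed_controls, unexposed_controls):
    """Odds of exposure among cases divided by odds among controls."""
    odds_in_cases = exposed_cases / unexposed_cases
    odds_in_controls = exposed_controls / unexposed_controls
    return odds_in_cases / odds_in_controls


# Invented counts: 80 of 100 cases and 40 of 100 controls report smoking.
ratio = odds_ratio(exposed_cases=80, unexposed_cases=20,
                   exposed_controls=40, unexposed_controls=60)
print(f"odds ratio: {ratio:.1f}")
```

With these made-up counts the odds of having smoked are about six times higher among cases than among controls. An odds ratio near 1 would suggest no association, and, as with any observational measure, a high value shows correlation rather than causation.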

In 1950, a case-control study run by Richard Doll and A. Bradford Hill helped to discover an important correlation between the number of cigarettes smoked daily and the risk of developing carcinoma of the lung. This was one of the first reports linking smoking to lung cancer, something we now take for granted.

Cross-sectional studies

Most people will be familiar with this type of study; a classic survey or public opinion poll, for example, is a cross-sectional study. In short, researchers take a random sample of people and find out information about them at one set point in time.

These are quick and inexpensive to perform but give little insight into anything beyond prevalence. This is because both exposure and response are being measured at the same time, leading to a chicken-and-egg scenario.

Case reports and series

Case reports and case series record interesting cases in detail, especially if a patient is suffering from something unique or novel. These can be useful in recording rare diseases. In fact, early case reports led to the discovery that mothers taking thalidomide were having children with missing limbs.

A famous case report was also written about Phineas Gage, a railroad construction foreman who suffered a severe frontal lobe brain injury when an iron rod was driven through his skull. There is no control group in these studies and, whilst they cannot establish causation, they can raise the alarm if new and worrying symptoms begin to appear in clusters. They are considered one of the lowest forms of observational evidence.

Systematic reviews and meta-analyses

A systematic review pools all of the relevant research on a specific topic together. One or more researchers will then carefully summarise what this body of evidence suggests. The idea is that looking at a body of research can minimise the bias that might creep into a single study, thereby allowing for a more impartial and well-supported conclusion to be reached.

The review process is often repeated by multiple individuals to ensure the study is as objective as possible. Let’s say you want to look at how effective probiotics are at treating post-antibiotic diarrhoea. A systematic review would find all of the studies that have tested the effectiveness of probiotics for this particular problem and review what these studies suggest as a whole.

A systematic review can also include a meta-analysis; this is where the collected data from lots of studies are statistically analysed, allowing for an “overall” conclusion to be reached. Systematic reviews and meta-analyses are the preferred yardsticks by which to measure evidence, especially when combined.

They sit confidently at the top of the hierarchy of evidence and with good reason. More than any single study, they guard against bias and provide a high level of evidence. A systematic review and meta-analysis of randomised, double-blind, controlled trials is a clinical researcher’s dream, but for some questions, a meta-analysis of prospective cohort studies might be more appropriate.
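To give a feel for how a meta-analysis reaches an "overall" conclusion, the sketch below pools study results with a fixed-effect, inverse-variance weighted average. This is one common approach rather than the only one (random-effects models are also widely used), and the three effect sizes and standard errors are invented for illustration.

```python
import math


def fixed_effect_pool(effects, standard_errors):
    """Inverse-variance weighted average of study effects and its
    standard error (fixed-effect model)."""
    weights = [1 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1 / total_weight)
    return pooled, pooled_se


# Invented effect sizes and standard errors for three hypothetical trials.
effects = [0.30, 0.10, 0.25]
standard_errors = [0.20, 0.05, 0.10]

pooled, pooled_se = fixed_effect_pool(effects, standard_errors)
low, high = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se
print(f"pooled effect: {pooled:.3f} (95% CI {low:.3f} to {high:.3f})")
```

More precise studies (those with smaller standard errors) receive more weight, and the pooled estimate is more precise than any single study in the set, which is part of why meta-analyses rank so highly.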

TIP☝ Not all systematic reviews are made equal; the number of studies looked at, the methodology for selecting studies and the quality of the review all affect a review’s worth.

Of mice and men: the value of animal studies


Animal studies are to thank for numerous breakthroughs in science. One type of mouse that has been invaluable in microbiome research is the germ-free mouse. These mice are born via C-section and then raised in a completely sterile environment.

As such, they are clean slates that allow researchers to colonise them with whichever microbes they wish, even single species if need be. They have advanced our knowledge considerably regarding the microbiome, showing that whilst we influence the microbial world within, it also influences us in an endless cycle.

Mice allow us to discover new avenues of research and identify causal relationships where human studies are not an option, usually due to ethical concerns.

Studies on mice have revealed that the microbiome can induce significant behavioural and weight changes in mammalian hosts. For example, when the bacteria from humans with Major Depressive Disorder are transplanted into germ-free mice, the mice exhibit more despair behaviours (reduced danger-avoidance) and stress.

Researchers have even transplanted the gut microbiota from humans with autism spectrum disorder (ASD) into mice. Interestingly, this resulted in them exhibiting anti-social behaviours typical of ASD.

Despite their usefulness in advancing our knowledge, animal and in-vitro (test tube) studies rank below human studies on the hierarchy of evidence. Nonetheless, these studies suggest important avenues of research and advance our understanding in developing fields. Germ-free mice, in particular, have been instrumental in developing our understanding of the microbiome.

Expert opinion

Expert opinion is a comment or judgement made about a subject by a single expert or group of experts in that field. It might take the form of an editorial or an executive summary. This can be helpful when there is a dearth of empirical evidence but should be displaced if research arises which contradicts it.

Expert opinion is not considered a research method, but it can be invaluable when informing health policy, particularly where research is lacking. Throughout the Covid-19 pandemic, expert epidemiological opinion has been sought to guide regulations.

Key takeaways

Did you jump straight down here? Work smart, not hard, I suppose. Anyway, if you’re in a rush, these are the key points you should take with you:

  • Health studies are generally either observational or experimental.
  • Observational studies can only establish correlation.
  • Experimental and quasi-experimental studies can prove causation.
  • The highest-ranking research methods are systematic reviews and meta-analyses.
  • Not all studies, even of the same type, are created equal; factors like study size, selection and rigour impact the strength of their conclusions.
Ross Carver-Carter Relationship counsellor for humans and their microbes
