How to design a clinical trial—equipoise, bias, and big data
BY DR MICHAEL FEHLINGS
Dr Michael Fehlings explains the fundamentals of research project design, from the merits of randomization to the influence of ‘lived experience’.
Is your research project tackling an important question? Is your question even answerable? And is it relevant to the public? Before you undertake any kind of clinical trial, these are some of the questions you should be asking yourself. After all, they touch on the fundamental issues that underpin successful research.
However, deciding which research question you want to answer, and examining whether it is indeed an appropriate one, is merely the starting point. There are many more components to getting a research project off the ground, and to successfully bringing it to a conclusion.
Chief among these challenges is the design of the research project itself. To ensure that the results you arrive at are as close to the truth as possible, you must consider several aspects, including the principle of clinical equipoise. Bias, and the best approaches to reducing it as much as possible, should also be among your top priorities.
Wherever feasible, I also recommend including input from key stakeholders in study design, especially those with lived experience. The insights we can gain from patients who have personal experience with the condition that we are investigating can be immensely helpful. Data collection and outcome measures are also crucial aspects to plan for from the very outset.
Different approaches to equipoise
As I outlined above, one aspect we must always keep in mind is equipoise. Equipoise exists when a clinical researcher genuinely does not know whether a treatment option administered in one arm of a trial is therapeutically superior or inferior to a treatment option offered in another arm. As soon as it is discovered that one treatment is of superior therapeutic merit, then there is an ethical obligation to offer that treatment to all participants in the study.
As a result, it is only ethically justifiable to randomize participants to receive a control when there is true equipoise. The comparison might be between the best available medical treatment and an alternative approach, a drug, or some other type of therapeutic initiative. Conversely, in cases where there is not perfect equipoise, it may not be as easy to randomize patients. The following three examples of clinical studies in which I was involved highlight the different approaches that can be taken to equipoise:
- I led a North American and then an international effort to examine the role and timing of surgical intervention for Degenerative Cervical Myelopathy (DCM). In this case, it was felt that there was no equipoise to randomize patients to non-operative versus operative treatment. Instead, we elected to accrue patients prospectively and consecutively, and for patients to be treated according to the best available standards of care in that particular unit. This was quite effective in terms of gaining substantial information on the role and timing of surgery.
- In the recently completed CSM-protect trial, we examined the use of Riluzole. This drug is a sodium-glutamate antagonist with an excellent safety profile, which is commonly used in clinical practice for the treatment of Amyotrophic Lateral Sclerosis (ALS). Here I felt that there was equipoise in terms of randomizing patients to receive Riluzole or not, because all patients were receiving the standard medical and surgical treatment, and Riluzole would be given on top of this. So, in this example, a randomized trial was possible.
- The CSMF study is a trial led by Zoher Ghogawala from the Lahey Clinic in the United States to examine the role of anterior versus posterior surgical techniques. Here, patients would be randomized if the surgical team felt that there was equipoise between an anterior and a posterior approach. What was notable about this study's design was that every patient was reviewed by a multidisciplinary panel of surgeons. The panel would first vote on whether they felt equipoise was present. Next there was a vote on whether to randomize the patient, and lastly, the panel voted on the technique they would use. The eligibility rule stipulated that, for a patient to be randomized, there had to be consensus that the patient was randomizable to an anterior versus a posterior approach, but there also had to be a reasonable spread of views on which approach was favored.
So, those are three examples of different trial designs that have been used quite effectively to answer important clinical questions.
Avoiding potential bias
Another aspect you should keep a constant eye on during your study design phase is bias. When we examine a question, we formulate a hypothesis that may or may not turn out to be correct. For example, we have an idea that a drug may be beneficial. Expectations of this kind are a source of bias that is always inherently present, even though researchers and clinicians try to be as objective as possible.
The same is true for patients—they want to get better, so biases can creep in with them, too. For example, it is known that drugs can have placebo effects: a patient who is aware that they have received a particular medication might report that their outcomes are somewhat better, even though there might not in fact be an actual effect.
One standard approach to reducing bias in clinical trials, and to getting as close to the truth and the correct signal as one can, is the randomization of treatment allocation and the blinding of the assessments. However, this approach is not always possible. As I explained above, there are cases where it is not ethical to randomize patients because there is not equipoise. Still, there are ways to get around that and reduce bias in such cases. One of the main ones is to have independent assessors of the outcomes and to use objective outcome measures that are quantifiable and reproducible.
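To make randomization and blinded assessment concrete, here is a minimal sketch in Python. It assumes a simple 1:1 permuted-block scheme, which is only one of several ways to randomize and is not taken from any of the trials discussed here. The function names, participant IDs, block size, and seeds are all hypothetical.

```python
import random

def block_randomize(participant_ids, block_size=4, seed=2024):
    """Assign participants 1:1 to 'treatment' or 'control' in permuted blocks.

    Hypothetical helper: each complete block contains equal numbers of both
    arms, so the arms stay balanced as enrollment proceeds.
    """
    if block_size % 2 != 0:
        raise ValueError("block_size must be even for 1:1 allocation")
    rng = random.Random(seed)
    allocation = {}
    for start in range(0, len(participant_ids), block_size):
        block_ids = participant_ids[start:start + block_size]
        arms = ["treatment", "control"] * (block_size // 2)
        rng.shuffle(arms)  # random order of arms within this block
        for pid, arm in zip(block_ids, arms):
            allocation[pid] = arm
    return allocation

def blinded_view(allocation, seed=7):
    """Replace arm names with neutral codes so assessors cannot tell arms apart."""
    rng = random.Random(seed)
    codes = ["A", "B"]
    rng.shuffle(codes)
    key = dict(zip(["treatment", "control"], codes))
    return {pid: key[arm] for pid, arm in allocation.items()}

# Made-up participant identifiers, for illustration only.
ids = [f"P{i:03d}" for i in range(1, 13)]
allocation = block_randomize(ids)   # held by the unblinded trial statistician
masked = blinded_view(allocation)   # what an independent outcome assessor sees
```

In this sketch the assessor only ever receives `masked`, so the outcome ratings cannot be influenced by knowing who received which treatment.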
Alternatives to randomization
In most cases, the randomized controlled trial is still, and will likely continue to be, considered the gold standard. But there are disadvantages, for instance regarding generalizability, or crossover effects. For example, let us assume that you have a patient population of 1,000 people with DCM, and your plan is to undertake a randomized trial. Let us further assume that you find out that only ten percent of these patients, 100 people, agree to be enrolled in your trial, and that the remaining 90 percent feel that the constraints and circumstances of the trial are not acceptable.
This represents a potential bias: your results are specific to the patients you have tested, but they may not be generalizable to the wider population of patients. At this point, the design of your trial becomes very important, and there are approaches to manage this. For instance, pragmatic randomized trials are somewhat less stringent than the classical randomized trial design but allow for greater generalizability.
In cases of surgery versus no-surgery trials, another potential problem related to randomization is that of crossover. For example, there have been trials of operative versus non-operative management for lumbar disc herniation with very high rates of crossover. This means that patients who are initially randomized to non-surgical treatment end up not doing so well and cross over to surgery. However, many trials feature what is called an intention-to-treat analysis. So, if your patients were randomized to one treatment and they cross over, they will still be analyzed in the arm to which they were originally randomized. With very high crossover rates, this in turn brings in a source of bias, because the measured difference between the arms is diluted.
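The effect of crossover on an intention-to-treat analysis can be illustrated with a small sketch. The eight patient records below are entirely made up for illustration and do not come from any trial. The field names and the `mean_difference` helper are likewise hypothetical.

```python
from statistics import mean

# Made-up records: the arm each patient was randomized to, the treatment
# actually received, and an improvement score (higher is better).
records = [
    {"assigned": "surgery", "received": "surgery", "improvement": 10},
    {"assigned": "surgery", "received": "surgery", "improvement": 8},
    {"assigned": "surgery", "received": "surgery", "improvement": 9},
    {"assigned": "surgery", "received": "surgery", "improvement": 9},
    {"assigned": "non-operative", "received": "non-operative", "improvement": 4},
    {"assigned": "non-operative", "received": "non-operative", "improvement": 5},
    {"assigned": "non-operative", "received": "surgery", "improvement": 9},  # crossed over
    {"assigned": "non-operative", "received": "surgery", "improvement": 8},  # crossed over
]

def mean_difference(rows, group_key):
    """Mean improvement with surgery minus mean improvement without it,
    grouping patients by the given key ('assigned' or 'received')."""
    surgery = [r["improvement"] for r in rows if r[group_key] == "surgery"]
    non_op = [r["improvement"] for r in rows if r[group_key] == "non-operative"]
    return mean(surgery) - mean(non_op)

# Intention-to-treat: analyze patients in the arm they were randomized to.
itt_diff = mean_difference(records, "assigned")

# As-treated: analyze patients by what they actually received.
as_treated_diff = mean_difference(records, "received")

# Share of the non-operative arm that crossed over to surgery.
non_op_arm = [r for r in records if r["assigned"] == "non-operative"]
crossover_rate = sum(r["received"] == "surgery" for r in non_op_arm) / len(non_op_arm)

print(f"crossover rate: {crossover_rate:.0%}")
print(f"intention-to-treat difference: {itt_diff:.2f}")
print(f"as-treated difference: {as_treated_diff:.2f}")
```

With these made-up numbers, the intention-to-treat difference is smaller than the as-treated difference, because the patients who crossed over count toward the non-operative arm. The as-treated comparison is not a remedy in itself, since it gives up the protection that randomization provides.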
To me, one of the big evolutions in clinical trial design is the recognition that different approaches are appropriate depending on the type of clinical question. Of course, where randomization is feasible and ethical, it is a good approach to take. However, wherever it is not, other protocols can be very effective as well.
I learned that lesson during the STASCIS trial more than a decade ago, when I realized that prospective non-randomized designs are valid. During the trial, we examined the role of surgical timing in cases of acute spinal cord injuries, a topic on which I had previously tried for years to conduct a randomized controlled trial. However, I finally realized that there was not equipoise here, and we were subsequently able to achieve our research result with a prospective non-randomized design.
As a result, I believe that people should not be averse to the notion of a prospective non-randomized design. In fact, I think that for spine surgeons in general, this is indeed the gold standard in many cases. I think people should be prepared to embrace this.
Another key aspect you should be considering from the outset is data. Rigorous and prospective data collection, in analyzable databases, is immensely important. If you want your research results to contribute to the enhancement of medical treatment and care in the real world, high-quality data as well as long-term follow-up are critical.
Moreover, I believe that the data being collected now could contribute to additional scientific insights in the future that we may not yet be able to foresee. Key here are the current advances in computing. We are witnessing the emergence of powerful digital technology for the processing of large quantities of data, including advanced analytics and machine-learning technology. These may enable us to better collate data from different studies in the future, and to arrive at conclusions that even well-controlled individual studies may currently be underpowered to reach.
Combining results from several studies in this way can provide a more accurate analysis in many cases. For example, I was involved in a pooled analysis of individual patient data derived from four independent, prospective, multi-center data sources, which showed quite conclusively that patients who receive surgical intervention within the first 24 hours after acute spinal cord injury have better prospects of recovery.
The power of networking
What I have learned to appreciate greatly over the years is the importance of having excellent collaborative networks. Yes, it does take time to build and effort to nurture these, but I have found that in the end, it is work well invested. It is also important for the funding agencies to support such networks and to recognize that the fruits of their investment will sometimes not be seen in the short term, and that one needs to take a longer-term view on these types of outcomes.
In this regard, I am very grateful for having been able to work so closely with AO Spine. To my knowledge, it is the largest international professional body of spine surgeons in the world. In my own work, AO Spine has been critical in facilitating international collaborative efforts in the areas of cervical myelopathy, as well as traumatic spinal cord injury.
As an international organization, the AO is uniquely placed to undertake large prospective studies across borders, because it has such great power and reach. However, I do believe that there is still some work to be done before we can make the greatest possible use of this opportunity: we must come up with improved approaches to gather large quantities of data at a global level. I acknowledge that it is difficult to conduct studies at an international level. But it is worth it, because it tremendously enhances the generalizability of the results, and ultimately their international acceptance.
The value of lived experience
In arriving at those results, I have also grown to understand the importance of including the public. This became clear to me when we were developing clinical guidelines through AO Spine for the management of DCM and traumatic spinal cord injury. We included individuals with those conditions, and their insights were really helpful.
For example, in some instances we might see that a treatment is beneficial, but we might perceive the effect as relatively small. One of the things we learned from involving persons with lived experience is that in spinal cord injury, even small effects can have a pretty dramatic impact on quality of life.
What are you measuring?
Having said that, the selection of endpoints and outcome measures is absolutely critical. How are you going to gauge how the intervention you want to investigate has affected the participants in your study? This represents a huge area of opportunity. However, randomized controlled studies often feature only a single primary efficacy endpoint. At the same time, what is gradually being learned at the biostatistical level is that patients with spinal disorders are quite complex, and therefore no single outcome measure can adequately capture the effect of an intervention.
For instance, during the DCM studies that I led, we widely used and validated a number of outcome measures, including the modified Japanese Orthopedic Association (mJOA) scale. This scale is the current gold standard for the assessment of outcomes in DCM, but it has many limitations. For example, it is relatively insensitive when it comes to detecting changes in the milder end of the spectrum of DCM. It also is not great at picking up pain outcomes. We need improved outcome measures to reflect some of the changes that we are examining.
Under certain conditions, the Global Statistical Test may represent a better-suited approach in studies examining DCM and spinal cord injury. The Global Statistical Test combines different measures and can assess the efficacy of an intervention across different aspects of a condition. In my opinion, applying it more widely could have a positive impact on the way in which we analyze studies.
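As a rough illustration of how several outcome measures can be combined into a single test, the sketch below follows the general idea of a rank-sum global test: rank all patients on each measure, add up each patient's ranks, and compare the totals between groups. This is one common formulation, assumed here, and not necessarily the variant used in any particular study. A simple permutation test is used for the p-value to keep the example self-contained. All data, names, and parameters are made up.

```python
import random

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum_scores(outcomes):
    """Sum each patient's ranks across all measures (higher = better on each)."""
    n_measures = len(outcomes[0])
    per_measure = [average_ranks([row[m] for row in outcomes])
                   for m in range(n_measures)]
    return [sum(per_measure[m][p] for m in range(n_measures))
            for p in range(len(outcomes))]

def global_rank_test(treated, control, n_perm=5000, seed=1):
    """Return the difference in mean rank-sum score (treated minus control)
    and a two-sided permutation p-value."""
    scores = rank_sum_scores(treated + control)
    n_t = len(treated)

    def mean_diff(s):
        return sum(s[:n_t]) / n_t - sum(s[n_t:]) / (len(s) - n_t)

    observed = mean_diff(scores)
    rng = random.Random(seed)
    shuffled = scores[:]
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # randomly reassign group labels
        if abs(mean_diff(shuffled)) >= abs(observed) - 1e-12:
            extreme += 1
    return observed, (extreme + 1) / (n_perm + 1)

# Made-up data: (change in a function scale, pain improvement, quality-of-life score)
treated = [(3, 2, 12), (2, 3, 10), (4, 1, 11), (3, 3, 14), (2, 2, 9), (5, 2, 13)]
control = [(1, 1, 8), (2, 0, 7), (1, 2, 9), (0, 1, 6), (2, 1, 10), (1, 0, 5)]

observed, p_value = global_rank_test(treated, control)
print(f"difference in mean rank-sum score: {observed:.2f}, p = {p_value:.4f}")
```

The appeal of this kind of approach is that a treatment producing modest but consistent gains across several measures can be detected even when no single measure shows a clear difference on its own.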
Inspiring the next generation
We have an opportunity to inspire our trainees to answer these questions. As teachers in residency programs and in fellowship programs we must make them aware of the condition of DCM and the literature around it, to enhance understanding.
In my view, every neurosurgeon and orthopedic surgeon who specializes in the spine should have a basic toolkit in evidence-based medicine. They should be able to evaluate the literature, and perhaps even be able to answer research questions themselves. Then a subset of those individuals may be inspired to undertake specialized research training, clinical trials, or basic laboratory investigation.
In Toronto, we have created a surgeon scientist training program, and the majority of our neurosurgical residents obtain advanced degrees, either master's degrees or PhDs, in research. So that is one model. Other models might involve taking specialized programs or courses. The AO Program for Education and Excellence in Research (AO PEER) provides the essential knowledge and toolkits in a course-based format. Whatever the preferred approach, our ultimate hope must be to expand our understanding of this disease and to keep improving the treatments that will rebuild quality of life for our patients.
About the author:
Dr Michael Fehlings, PhD, MD, FACS, FRCSC, is a professor of neurosurgery at the University of Toronto. He is the Vice Chair Research for the Department of Surgery and Co-Director of the Spinal Program at the University of Toronto and established the Spinal Program at Toronto Western Hospital, University Health Network. Fehlings is a leading surgeon-scientist in the field of spinal cord injury and currently the most published author in Degenerative Cervical Myelopathy (DCM). He is the founder of The Fehlings Lab, a world leader in the field of translational research dedicated to neural repair and regeneration. His main clinical interests are in spinal neurosurgery, and his research focus is on molecular mechanisms underlying spinal cord injury. Michael Fehlings is an Advisory Board member of the AO Spine Knowledge Forum Spinal Cord Injury.
References and additional resources:
- A podcast with Michael Fehlings discussing his personal tips for securing research funding was published as part of a special Research Top Tips series, from Myelopathy Matters (Myelopathy.org) and AO Spine RECODE-DCM. The initiatives worked together to accelerate knowledge discovery that can change outcomes in Degenerative Cervical Myelopathy (DCM). Each episode takes a dive into a fundamental concept of scientific research, with leading spine researchers talking us through their recipes for success in their own words.
- The AO Program for Education and Excellence in Research (AO PEER) is a collection of clinical research resources, reference documents, and learning opportunities for anyone wanting to learn more about doing research. The Grant Writing and Study Protocol module of AO PEER includes useful templates for applying for grants.
- Badhiwala JH, Wilson JR, Witiw CD, et al. The influence of timing of surgical decompression for acute spinal cord injury: a pooled analysis of individual patient data. Lancet Neurol. 2021 Feb;20(2):117-126.