How to choose the right animal model and make appropriate decisions in preclinical research



Non-reproducibility is a major killer of research value in preclinical studies. According to the most recently published estimate (2015), it wastes 28 billion US dollars annually in the US alone. Inappropriate study designs account for almost 28% of that amount.1 That means poor study design choices cost American research funders about 7.8 billion dollars annually. Because the study design depends heavily on the animal model used, the choice of that model is crucial.

In an AO PEER webinar, researchers Stephan Zeiter and Caroline Constant from the AO Research Institute Davos explained some key principles of optimizing the value of preclinical research. This article accompanies and summarizes that webinar’s main points.

Watch the full AO PEER webinar:

Choosing the Right Animal Model
Presenters: Stephan Zeiter, Caroline Constant
Production: 2021

How to choose the most suitable animal model

The starting point

The starting point for deciding on your model is not the animals you have access to, the model you are familiar with, or the available budget. The priority is to formulate a well-chosen research question. But what makes a research question suitable? It must be answerable, relevant, focused, and precise. The more precise your research question is, the higher the chance of selecting an appropriate animal model to answer it. The research question is the basis for the hypothesis and the study aim.

Researchers must also resist the urge to save resources by using one study to answer multiple questions. This can end up diluting the first question. Ideally, a study has one main question to answer and one hypothesis to test.

Is an animal model absolutely necessary?

Before planning a project involving animals, every researcher should remember the first of Russell and Burch's three Rs of animal research (i.e., replace, reduce, refine):2 Replace animal experiments whenever possible. Every possible alternative must be explored. If cell culture experiments or a bioreactor are appropriate methods to answer your question, these methods must be prioritized over animal experiments. Also, if someone else has already done what you want to do, a thorough literature review will avoid using more animals to answer the same research question.

Determine your primary outcome

Once the research question and hypothesis are in place, and no alternative methods to animal experiments are suitable, a primary outcome must be chosen. It should allow the research to answer the research question and to support or refute the hypothesis. Further, the primary outcome will be used to determine the sample size. Additional outcome measures can be defined as secondary outcomes of the study.

Assessing the parameters that best answer the research question requires appropriate methods. For example, if the research question deals with bone healing, the options for observing bones as they heal include CT imaging, mechanical testing, and histology. The researcher's choice will depend on which method is most useful in relation to the research question. Secondary outcomes may require other methods.


Choosing the preclinical model: animal species and strain

Once your primary outcome measure is selected, the next step is to choose the preclinical model that will best allow you to answer your research question. The optimal animal model (species and breed/strain/stock) is the one that will yield the necessary type and range of data to answer the research question, using the smallest sample size possible and considering the associated costs.

To summarize, the research question determines the primary outcome measure, and together they determine the choice of animal model and strain. As with many decisions affecting research projects, the optimal animal model is the one that enables the research team to answer the research question both thoroughly and reproducibly.


How to increase scientific rigor

Control groups

Controls must be included in animal studies. Negative and positive control groups should be considered. Depending on the research question, the type of controls may vary: for example, within-subjects designs involve testing the same subjects before and after an intervention. Other study designs use control groups for comparison with the experimental group. In the case of a vaccine study, for example, the experimental group would receive the vaccine being tested; a positive control group would receive another vaccine (possibly the current gold standard) with a predictable rate of success; and a negative control group would receive injections containing no active ingredient.

Randomization and blinding

Research has shown that randomization and blinding have a major impact on the number of false positive outcomes from preclinical studies involving animals. For small samples, one very simple solution for randomization is to put every animal's number in a hat, then draw the numbers out to allocate the groups. For larger studies, statistical or spreadsheet packages usually offer randomization functions. For example, every version of MS Excel includes the RAND function. Blinding is necessary to avoid bias in the results. In a blinded experiment, information about group allocation is withheld from the people performing study-related tasks, such as the surgery or measurements, until after the investigation is complete. If it is not feasible to blind everyone, at least the researchers performing the outcome analyses should be blinded. Surgeons, for whom total blinding is not always possible, should remain blinded until the point at which the surgical protocol differs between the groups.
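The "numbers in a hat" approach can also be reproduced in a few lines of code. The following is a minimal sketch in Python, not a method described in the webinar: the function name, the animal IDs, the group names, and the seed are all hypothetical and chosen only for illustration.

```python
import random


def randomize_to_groups(animal_ids, group_names, seed=None):
    """Shuffle the animals, then deal them out to the groups in turn."""
    # Recording the seed in the study plan makes the allocation reproducible.
    rng = random.Random(seed)
    shuffled = list(animal_ids)
    rng.shuffle(shuffled)
    allocation = {name: [] for name in group_names}
    for position, animal in enumerate(shuffled):
        group = group_names[position % len(group_names)]
        allocation[group].append(animal)
    return allocation


# Hypothetical example: 12 animals, three groups of equal size.
animals = [f"A{n:02d}" for n in range(1, 13)]
groups = ["treatment", "positive_control", "negative_control"]
allocation = randomize_to_groups(animals, groups, seed=2021)
for name, members in allocation.items():
    print(name, sorted(members))
```

Dealing the shuffled animals out in turn keeps the group sizes equal whenever the number of animals is a multiple of the number of groups. To support blinding, the resulting list would be kept by someone who does not perform the surgery or the outcome analysis.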

Sample sizes and statistical analysis

The number of animals necessary is sometimes based on previous practices or numbers used in published studies. However, a sample size calculation is recommended to ensure that the study can detect a biologically relevant effect size while minimizing the number of animals used. As different calculations are appropriate in different situations, it is essential to learn which one fits your study design, then to apply it as recommended.

Bias can arise even after the study's data acquisition or analysis. It is necessary to define the research aims before running a study, not afterwards. Choosing a study's aims after the research is underway increases the danger of detection bias or so-called P-hacking, where researchers notice statistical significance in their data, then rewrite their research questions accordingly. Such practice should be avoided at all costs.
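As an illustration of what such a calculation looks like, the sketch below uses the common normal-approximation formula for comparing the means of two independent groups of equal size. It is only one of many possible calculations and is an assumption here, not the method recommended in the webinar; the function name and the default values for significance level (0.05) and power (0.80) are illustrative.

```python
import math
from statistics import NormalDist


def animals_per_group(effect, sd, alpha=0.05, power=0.80):
    """Approximate animals per group for a two-sided, two-group comparison of means.

    effect: smallest biologically relevant difference between group means
    sd:     expected standard deviation of the primary outcome
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)           # desired statistical power
    n = 2 * ((z_alpha + z_beta) * sd / effect) ** 2
    return math.ceil(n)


# A difference equal to one standard deviation needs far fewer animals
# than a difference of half a standard deviation.
print(animals_per_group(effect=1.0, sd=1.0))  # 16
print(animals_per_group(effect=0.5, sd=1.0))  # 63
```

The normal approximation slightly underestimates the numbers needed for small groups, so a calculation based on the t-distribution, or one matched to the actual study design, should be used for the final study plan.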
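One way to see why searching the data for significance is dangerous is to compute how quickly the chance of at least one false positive grows with the number of comparisons. The sketch below assumes independent comparisons, each tested at the same significance level; both the assumption and the function name are illustrative and not taken from the webinar.

```python
def chance_of_false_positive(n_comparisons, alpha=0.05):
    """Probability of at least one false positive across independent comparisons."""
    return 1 - (1 - alpha) ** n_comparisons


for k in (1, 5, 10, 20):
    print(f"{k:2d} comparisons: {chance_of_false_positive(k):.0%}")
```

With a single predefined comparison the risk stays at 5%, whereas with 20 comparisons it rises to roughly 64%, even when no true effect exists.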

Recognizing other sources of bias and variation

Many sources of bias and variation can exist in an animal experiment; the following are some examples. Pain impacts most, if not all, biological systems. Therefore, the principle that you should minimize pain is not simply a matter of ethics: pain in test animals can be a significant source of bias. For this reason, every animal enrolled in a preclinical study must receive adequate veterinary care, including the appropriate use of anesthetic, analgesic, or tranquilizing drugs when such use would be proper to ensure that pain and distress are minimized.3,4

Differences in the age of the animals can drive major biological changes, such as changes in bone density. If you are studying bone mass and your youngest mice are only six weeks old, the younger animals' bone mass will increase naturally in ways that are not related to your study outcome. If ages differ between groups, age alone could produce a bone density increase that is unrelated to the intervention being investigated, or age could become a proxy variable.

Another example is the impact of seasonal changes: for larger species, such as sheep, the season (and with it, the amount and quality of the sunlight they receive) can also influence results. In female sheep, for example, seasonal factors trigger the onset of their estrous cycle. The resulting hormonal changes can alter bone growth, induce osteoporosis, or change their physical parameters.

Animal housing can also be a source of stress that influences study outcomes, e.g., through the number of animals in each pen or cage, noise, and, for smaller animals, activity around the cages. Particularly where single-cage samples are used, such stress factors can have very pronounced effects.

Impact of the surgeon

The surgical technique should be as reproducible as possible between all animals. This includes any trauma to the studied structure and the amount of dissection, which should be similar in all animals. For example, if one surgeon uses a minimally invasive approach and another does not, the differences in soft tissue trauma can impact the outcomes of interest.

One effective strategy to minimize these variations’ effects is to randomize how surgeons are assigned to procedures and groups. Otherwise, if the surgeon who performed the first round procedures in January is not available in April, the outcomes of groups planned for that surgeon could be affected. 
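One possible way to implement this is to spread each group's procedures evenly across the available surgeons, in random order, so that no group depends on a single surgeon. The sketch below is an assumption about how this could be done, not the procedure used by the presenters; the function name, animal IDs, group names, and surgeon labels are hypothetical.

```python
import random


def assign_surgeons(allocation, surgeons, seed=None):
    """Within each group, spread procedures evenly across surgeons in random order."""
    rng = random.Random(seed)
    schedule = {}
    for group, animals in allocation.items():
        order = list(animals)
        rng.shuffle(order)      # random order of animals within the group
        rotation = list(surgeons)
        rng.shuffle(rotation)   # random starting surgeon for this group
        for i, animal in enumerate(order):
            schedule[animal] = (group, rotation[i % len(rotation)])
    return schedule


# Hypothetical example: two groups of four animals and two surgeons.
allocation = {
    "treatment": ["A01", "A04", "A07", "A10"],
    "control": ["A02", "A05", "A08", "A11"],
}
schedule = assign_surgeons(allocation, ["surgeon_1", "surgeon_2"], seed=7)
for animal in sorted(schedule):
    print(animal, *schedule[animal])
```

In this example each surgeon operates on two animals per group, so the loss of one surgeon partway through the study affects all groups equally rather than one group entirely.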

Researchers also need to know how to respond to unplanned events such as accidental bone breaks. For any event that causes a deviation from the surgery protocol, even a fixable error, the affected animals should be euthanized and their data excluded from the study. Scientifically and ethically, it is problematic to keep an animal that has received a different or additional procedure in a study.

Many drugs used in perioperative anesthesia can affect outcomes. For example, NSAIDs can affect bone healing;5 therefore, if these are used without a predefined standardized protocol, they can affect outcomes in unexpected ways.

Another factor to consider is that an animal’s body temperature requires close monitoring while under anesthesia. If some become hypothermic during surgery, but not others, that introduces another variable. A 2023 retrospective analysis of data done at AO corroborates the concept that bacterial colonization is affected by abnormal body temperature during general anesthesia at the time of bacterial inoculation in rodents.6 This factor needs to be considered to decrease infection data variability and improve experimental reproducibility. 

Last but not least, every reasonable safeguard must be taken against infection. This includes aseptic technique for all surgical procedures, regardless of the size of the animal.


The bottom line

For researchers to ensure their studies’ lasting value, they must rigorously pursue every available measure not only to achieve usable results but to facilitate those results’ reproducibility. This means choosing appropriate study designs and—where animal models are necessary—the animal species and strains that best fit their needs. From randomizing their samples to managing their animals’ housing conditions, researchers must take every possible precaution to avoid skewing their results.


Useful resources:

Smith et al.'s PREPARE guidelines for planning animal research and testing are an essential resource to facilitate pre-test processes. These are accessible free of charge.

The ARRIVE guidelines (Animal Research: Reporting of In Vivo Experiments) are another helpful resource. They comprise a checklist of recommendations to improve the reporting of research involving animals in ways that maximize the quality and reliability of published research and enable others to scrutinize, evaluate, and reproduce it.7 These are accessible free of charge via the following link: ARRIVE Guidelines

About the authors

Stephan Zeiter

With a background in veterinary medicine and a specialization in lab animal medicine, Zeiter is the manager of AO's Preclinical Services program. He is also a co-founder and past chair of the Preclinical Model Section of the Orthopaedic Research Society (ORS), co-founder of the European Academy of Laboratory Animal Surgery (EALAS), and the current president of the European College of Laboratory Animal Medicine (ECLAM).

Caroline Constant

Constant is board-certified by both the American and European Colleges of Veterinary Surgeons (ACVS-LA, ECVS) and also trained as a biomedical engineer. At the AO Research Institute in Davos, Switzerland, she now works as a project leader and preclinical surgeon.

Further reading and resources:

  1. Freedman LP, Cockburn IM, Simcoe TS (2015) The Economics of Reproducibility in Preclinical Research. PLoS Biol 13(6): e1002165.
  2. Russell WMS and Burch RL. The principles of humane experimental technique. Wheathampstead: Universities Federation for Animal Welfare, 1959.
  3. Lê AD, Ko J, Chow S, Quan B. Alcohol consumption by C57BL/6, BALB/c, and DBA/2 mice in a limited access paradigm. PMID: 8146231. DOI: 10.1016/0091-3057(94)90026-4.
  4. United States Congress. 1970. Animal Welfare Act of 1970, Public Law 91–579.
  5. Institute of Laboratory Animal Research, Committee for the Update of the Guide for the Care and Use of Laboratory Animals, National Research Council. Guide for the care and use of laboratory animals. 8th edition. Washington, DC: National Academies Press; 2011. 220
  6. Percie du Sert N, Hurst V, Ahluwalia A, Alam S, Avey MT, Baker M, et al. The ARRIVE guidelines 2.0: Updated guidelines for reporting animal research. Published July 14, 2020.
  7. Jackson SJ, Andrews N, Ball D, Bellantuono I, Gray J, Hachoumi L, et al. Does age matter? The impact of rodent age on study outcomes. Lab Anim. 2017;51:160-9.
  8. Geusens, P., Emans, P. J., de Jong, J. J., & van den Bergh, J. (2013). NSAIDs and fracture healing. Current opinion in rheumatology, 25(4), 524–531.
