Part 4: Data analysis and report writing

Congratulations, you have now received all your study packets and/or accounted for all subjects in your study. The data have been entered, checked, and transferred to your SPSS program. You are now ready for analysis and write-up!!

  • 1 Getting started
  • 3.1 Study design
  • 3.2 Subjects
  • 3.3 Data collection
  • 3.4 Data analysis
  • 4.1 Descriptive statistics
  • 4.2 Analytical statistics

1 Getting started

The fun part begins! Now is the time to discover the truth behind your study questions. Is plate fixation better than external fixation? Does it depend on what group of patients received the treatment (eg, young versus old)? Is it possible that it doesn’t matter what technique you use? Finding out these answers was the motivating force behind performing your clinical study. Since you paid attention to detail in your study planning and worked hard in ensuring the quality and validity of your data collection methods, there is no reason these questions cannot be answered, at least in your study population.

The data collection process took a little longer than you thought. You now have less than 6 months remaining in your residency. You do not have time to put the data down and rest. It is time to get at it and start the data analysis because your goal is to write this up and submit it for publication before you graduate. You can present your findings at a conference when you are in your fellowship or first staff position. You realize that performing the data analysis takes a combination of expertise, discipline, and patience in carefully handling the data in accordance with the data analysis plan you set forth in your study protocol [Part 2; Chapter 6]. Though you did not have money set aside for hiring an epidemiologist or biostatistician, you want to get some help before jumping into things. You have been using the AOs Knowledge Portal for DOING research and discovered that you were able to get a lot of background information to get you started [link to Part 4 didactic].

You decide to approach your project from a broad perspective. In other words, you plan to begin writing parts of the manuscript as you are analyzing your data. You feel you will be more efficient this way and more likely to finish on time. To help you with this, you follow the outline detailed in Table 10. Though the journal you eventually submit to will have its own requirements, you know that this template is representative of most.

Table 10. Example outline of study report or manuscript
Element: Description
INTRODUCTION: The key is to inform the reader as to why your topic is important. It is not necessary to spend a tremendous amount of effort reiterating what others have already discussed in previous studies on your topic. Try to make it novel and to the point.
METHODS
Study Design: Eg, RCT, prospective cohort, retrospective cohort, etc.
Subjects
  • Explanation of inclusion and exclusion criteria.
  • Institutions in which you identified and recruited your patients.
  • Time period in which you collected your data.
Data Collection
  • A summary schematic and description outlining patients recruited, enrolled, lost to follow-up and completed.
  • Baseline and prognostic factors collected.
  • Outcome measures collected.
Data Analysis
  • Data analysis plan.
  • Power analysis.
RESULTS
  • Descriptive statistics.
  • Analytical statistics.
DISCUSSION
  • Discuss implications of primary analyses.
  • Distinguish between statistical and clinical significance.
  • Discuss study weaknesses and strengths.
  • Discuss results in context with literature.
  • Discuss generalizability of results.
CONCLUSIONS: Limit to only those supported by the results of your study.


You can borrow from the Background and Significance section of your protocol when you write the introduction. It will be important to be relatively brief, likely far shorter than the corresponding section of your protocol. The key is to inform the reader as to why your topic is important. It is not necessary to spend a tremendous amount of effort reiterating what others have already discussed in previous studies on your topic. Try to make it novel and to the point. The introduction should end with a clear and concise description of your study purpose. If you have more than one purpose or a secondary purpose, list them separately. For example, you can state something like this:

“The primary purpose of this study was to compare plate osteosynthesis fixation to external fixation in patients who have undergone surgery for a tibial pilon fracture with respect to a generic patient reported quality of life outcome using the SF-36 and a joint-specific patient reported functional outcome using the Foot and Ankle Outcome Score. The secondary purpose was to look at hospital length of stay and post-operative complications.”

Bottom line - list the objectives you set out to answer. They should be consistent with the specific aims from your protocol [Part 2] . Statistically non-significant findings are still “significant”!


3.1 Study design

You may say something like this:

“We conducted a retrospective cohort study comparing plate osteosynthesis fixation to external fixation in patients who have undergone surgery for a tibial pilon fracture.”    

3.2 Subjects

It is very important to clearly and thoroughly describe your study populations. Readers need to know if your results and conclusions may be generalized to their population. Additionally, it is important to place your study in time. Technological advances, changes in patient care procedures, and differences in reporting practices at different times can affect the outcomes and interpretation of your study [1]. The following items should be included in this section:

  • Explanation of inclusion and exclusion criteria.
  • Institutions in which you identified and recruited your patients.
  • Time period in which you collected your data.

For retrospective studies, give the dates during which the data were originally collected as opposed to when you abstracted them from the study. The statement below is an example of how you may write this section:

“All patients (n=84) with a diagnosis of tibial pilon fractures (AO 43-A1, 43-B2.2, 43-C3), who had surgery at [name your institution] between January 1, 2003 and December 31, 2004, were identified from the [name your institution] Trauma Registry if they were managed with plate osteosynthesis fixation or external fixation. Subject mailing addresses and phone numbers were obtained from the registry, and when not available, from our hospital’s computerized patient records system. We first mailed them study packets which included a protocol synopsis, informed consent, contract for payment, and patient reported outcome questionnaires. For those subjects whose packets were returned by the postal service or who were found to have a disconnected or incorrect phone number, we attempted to ascertain more up-to-date demographic data by querying our hospital’s computerized patient records system. We then contacted each person by phone 1-2 weeks later to determine their interest in participating. All subjects were considered eligible if they were skeletally mature between the ages of 18 and 40 years. Patients were excluded if they had a record in our database as being an alcoholic, if we were unable to obtain a phone number or mailing address, or if they had advanced osteoporosis, cancer, motor function disorder, or a history of vascular insufficiency involving the lower limb. Severely comminuted fractures requiring external fixation and delayed surgery were also excluded. Our investigation, dealing with the study of human subjects, has been approved by the [name your institution] Human Subjects Review Committee.”

3.3 Data collection

It is important that you describe clearly and concisely the process by which you identified and recruited your subjects and whether they finished the study. You were able to obtain full accountability of all your subjects.  To illustrate this in your manuscript you provide a schematic summary of the study showing the number and disposition of participants at each stage, Figure 4. The schematic summary includes:

  • The total number of subjects approached.
  • The total number of subjects found eligible and ineligible.
  • The number of subjects who agreed to participate and signed informed consent.
  • The number of subjects who did not complete the study (eg, lost to follow-up or withdrew).
  • The number of subjects who completed the study and whose data are included in the final analysis.

You then describe the nature and duration of your follow-up effort. This comes directly from your protocol. For example, you may state something like this:

“An attempt was made to evaluate the generic and disease-specific outcome measures at 2-years after injury. We initiated the study first by mailing each individual a study packet approximately 1 month before their 24th month anniversary after surgery. This date was obtained from the Trauma Registry.  We followed up with phone calls 1-2 weeks later to ascertain eligibility, explain the study, and obtain informed consent.  Patients were asked to complete the questionnaires within one week of their 2-year anniversary since their surgery. The actual date of completion of each questionnaire was provided by the patient. Patients were given 3 weeks to complete and return their questionnaires.  Patients received a follow-up phone call at 2-weeks as a reminder and in 2-week increments if questionnaires were not returned."

Data entry was validated (i.e. logic checks including range checks, missing value checks, etc.) both by visual inspection and built-in database programming during the data entry process.
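The logic checks mentioned above (range checks and missing value checks) can be sketched in a few lines of Python. This is only an illustration and not the database programming used in the study: the field names (`age`, `sf36_physical`, `faos_pain`) and the `RANGES` table are hypothetical, with limits borrowed from the 18 to 40 year eligibility window and the 0 to 100 scoring of the questionnaires.

```python
# Hypothetical range limits; the field names are illustrative only.
RANGES = {
    "age": (18, 40),            # eligibility window from the example text
    "sf36_physical": (0, 100),  # subscale scores run from 0 to 100
    "faos_pain": (0, 100),
}


def check_record(record):
    """Return a list of problems found in one data-entry record."""
    problems = []
    for field, (low, high) in RANGES.items():
        value = record.get(field)
        if value is None:
            # Missing value check
            problems.append(f"{field}: missing value")
        elif not low <= value <= high:
            # Range check
            problems.append(f"{field}: {value} outside {low}-{high}")
    return problems


if __name__ == "__main__":
    entry = {"age": 52, "sf36_physical": 88, "faos_pain": None}
    for problem in check_record(entry):
        print(problem)
```

Running a check like this at entry time flags a record for review rather than silently correcting it, which keeps the decision with the person who can go back to the source questionnaire.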

The following prognostic variables were identified in the Trauma Registry and verified by the patient baseline questionnaire:

  • Patient age, gender, occupation
  • AO Fracture classification
  • Gustilo classification
  • Currently receiving Worker’s Compensation
  • Post-operative alignment
  • Smoking status
  • Alcoholic

Furthermore, the following short term outcome measures were obtained from the Trauma Registry:

  • Length of hospital stay
  • Post-operative complications (e.g., superficial and deep infection, pin-site infections, re-operation)

Lastly, we provided in the study packet the following generic and disease specific patient reported outcomes that the patients were instructed in over the phone and by written instruction in the study packet:

  • Generic instrument: SF-36- Composed of 8 subscales measuring physical and mental health (36 items). For each subscale, scores range from 0 to 100 points. The higher the score the higher the function.
  • Disease-specific instrument: Foot and Ankle Outcome Score (FAOS)- Composed of 5 subscales including pain, symptoms, ADLs, sports and recreation, and quality of life. Scores are normalized and range from 0 to 100 points. The higher the score, the higher the function.
Figure 4. Schematic summary of patient follow-up.

As best as you could account for them, you documented reasons for ineligibility, unwillingness to participate, and loss to follow-up or withdrawal.

3.4 Data analysis

Descriptive statistics

You may have stated something like this:

“For categorical variables, proportions and frequency counts were calculated. For continuous variables, means and standard deviations were computed. In addition, minimum, maximum, and range were reported for continuous variables. Missing values, extreme values, and variable distributions were explored. Group comparisons of categorical variables were made using Pearson’s chi-square (χ2) tests and continuous variables using t-tests.”
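As a rough sketch of what these descriptive statistics involve, the Python snippet below computes frequency counts and proportions for a categorical variable and the mean, standard deviation, minimum, maximum, and range for a continuous one. The function names and the sample values are made up for illustration; in practice you would obtain the same summaries from your SPSS program.

```python
from collections import Counter
from statistics import mean, stdev


def describe_continuous(values):
    """Mean, sample standard deviation, minimum, maximum, and range."""
    return {
        "mean": mean(values),
        "sd": stdev(values),
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
    }


def describe_categorical(values):
    """Frequency count and proportion for each category."""
    counts = Counter(values)
    total = len(values)
    return {level: (n, n / total) for level, n in counts.items()}


if __name__ == "__main__":
    ages = [22, 27, 31, 35, 39]                     # made-up ages
    wound = ["closed", "closed", "open", "closed"]  # made-up open/closed status
    print(describe_continuous(ages))
    print(describe_categorical(wound))
```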

Analytical statistics

For primary aims, the differences in SF-36 and FAOS scores between plate fixation and external fixation groups were tested first with t-tests and then with analysis of variance (ANOVA) to control for potential confounding variables. These outcomes were also dichotomized at their mean value and analyzed using a negative binomial regression to estimate the relative risk and the 95% confidence interval of a poor outcome (ie, below the mean) comparing the two groups, while controlling for potential confounders.

For secondary aims, complication rates were compared using negative binomial regression to estimate the relative risk and 95% confidence intervals. Hospital length of stay was compared using analysis of variance.

Sample size calculations

Our estimated sample size (with 80% power) to show a 10-point difference in ankle function (ie, SF-36 and FAOS scores) between the two groups was 27 (18 receiving external fixation and 9 receiving ORIF). We far exceeded this number with a high response rate.
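For readers who want to see the arithmetic behind a statement like this, here is a minimal sketch of a two-group sample size calculation with unequal allocation, using the normal approximation. The standard deviation of 8.7 points is an assumption chosen purely for illustration (the example does not state the value or the software used); it happens to give 9 and 18 subjects under a 2:1 allocation with a two-sided alpha of 0.05.

```python
from math import ceil
from statistics import NormalDist


def sample_size_two_means(delta, sd, power=0.80, alpha=0.05, ratio=1.0):
    """Group sizes needed to detect a mean difference `delta`.

    `ratio` is the size of the larger group divided by the smaller one.
    Uses the normal approximation, so results can differ slightly from
    software that works with the t distribution.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    n_small = (1 + 1 / ratio) * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2
    n_small = ceil(n_small)
    return n_small, ceil(ratio * n_small)


if __name__ == "__main__":
    # sd=8.7 is a hypothetical value, not taken from the study.
    smaller, larger = sample_size_two_means(delta=10, sd=8.7, ratio=2)
    print(smaller, larger)
```

The required numbers grow with the square of the assumed standard deviation, so the estimate is only as good as that assumption.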


Your results section should be quick and to the point. In other words, there should not be too much explaining or justifying your findings. This is left for your discussion. The flow of the results section should begin with descriptive statistics of study groups, findings with respect to measured risk factors or outcomes (analytical statistics) and accompanying tables and figures as necessary.

4.1 Descriptive statistics

A table reporting descriptive or baseline characteristics is typically the first table in your report or manuscript. The following table is a hypothetical example of your baseline data, Table 11.

Table 11. Hypothetical baseline data for tibial pilon fracture patients treated with osteosynthesis plate fixation or external fixation
Characteristic | Plate fixation: No. or mean | Plate fixation: % or range | External fixation: No. or mean | External fixation: % or range
Age (years) | 33 | (18-39) | 39 | (21-40)
Male | 11 | 45 | 26 | 58
Alcoholic | 1 | 4 | 2 | 5
Smoker | 5 | 22 | 8 | 19
Type A | 4 | 16 | 5 | 12
Type B | 8 | 33 | 11 | 25
Type C | 12 | 51 | 28 | 63
Closed | 16 | 66 | 23 | 52
Open | 8 | 34 | 21 | 48
Workman’s comp | 2 | 7 | 2 | 4

From this table, we note that the external fixation group is slightly older and has a greater percentage of males. There was a relatively equal proportion of patients who were considered alcoholic and regular smokers. Those who received external fixation tended to have a greater percentage of Type C fractures (63% versus 51%) and open fractures (48% versus 34%) compared to the plate fixation group. This may put the external fixation group at an unfair disadvantage if these fracture types are more likely to give rise to a poorer outcome. You discover in fact these two variables are associated with a poorer outcome (both complications and FAOS scores). An unequal distribution between treatment groups of an independent factor that is also associated (negatively or positively) with the outcome constitutes a potential confounding variable [Part 2; section 5.6.2]. These variables will be dealt with in your analytical statistics.

4.2 Analytical statistics

You have a combination of categorical outcomes (ie, complications and dichotomized patient reported outcomes) and continuous outcomes (ie, hospital length of stay, SF-36 scores, and FAOS scores).

Primary Outcomes

You proceed with your analysis of the primary outcomes. Below, you generate a table that compares SF-36 scores and FAOS scores between the two groups, Table 12.

Table 12. Hypothetical SF-36 and Foot and Ankle Outcome Scores*
Treatment | N | Mean | SD | Min | Max | P-value†
SF-36 | | | | | | 0.37
  Plate fixation | 24 | 88 | 17.5 | 41 | 98 |
  External fixation | 44 | 85 | 10.2 | 34 | 91 |
FAOS | | | | | |
  Plate fixation | 24 | 86 | 14.1 | 37 | 97 |
  External fixation | 44 | 73 | 17.4 | 31 | 94 |

*Scores normalized to a total possible of 100 points.

†Two-sample t-test.

Note, from your table above, there is no statistically significant difference in SF-36 scores; however, there is in the disease specific ankle score. This difference also appears clinically significant [Part 4; section 11.4]. However, you do note that fracture severity was worse in the external fixation group and so it is important that you control for fracture type and open versus closed. You can do this by running an analysis of variance (ANOVA) regression and adding these variables to the model. You do so and find that the p-value has weakened (p=0.01).  This p-value is still statistically significant allowing you to conclude that there may in fact be a difference favoring the plate fixation group with respect to an ankle patient reported outcome even controlling for fracture severity.
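The unadjusted comparison can be checked by hand from the summary statistics in Table 12. The sketch below assumes the equal-variance (pooled) form of the two-sample t-test, which is an assumption about how the hypothetical p-values were produced, and obtains the two-sided p-value by numerically integrating the t density with Simpson's rule so that only the Python standard library is needed. The function names are illustrative.

```python
from math import exp, lgamma, log, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log(1 + x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on the t density from 0 to |t|."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return max(0.0, 1.0 - 2.0 * area)


def pooled_t_test(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variance two-sample t-test from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / se
    return t, df, two_sided_p(t, df)


if __name__ == "__main__":
    # Means, SDs, and group sizes from the hypothetical Table 12.
    print("SF-36:", pooled_t_test(88, 17.5, 24, 85, 10.2, 44))
    print("FAOS: ", pooled_t_test(86, 14.1, 24, 73, 17.4, 44))
```

With these inputs the SF-36 comparison gives a p-value of approximately 0.37, in line with the table, while the FAOS comparison falls well below 0.05. This covers only the unadjusted test; controlling for fracture type and open versus closed status requires the subject-level data and a regression model.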

You proceed to divide your scores (ie, dichotomize) into “successful” and “unsuccessful” outcomes by separating them at the mean value for all subjects. By doing so, you discover that the plate fixation group has 66% of the patients above the mean and the external fixation group only has 43% above the mean. If you divide these two percentages (ie, 66/43) you end up with a relative risk of 1.53; meaning, patients receiving plate fixation were 53% more likely to have a “successful” outcome than patients receiving external fixation. This is also known as the relative benefit increase (RBI) [see Part 4; section 11.3.2 for a discussion of effect estimates].

[Note: This is purely a hypothetical example with fictitious data. There currently are no data to suggest that plate fixation is better than external fixation with respect to this outcome. This difference was created for teaching purposes].

Secondary outcomes

Often surgeons want to know the proportion of patients who experience a certain outcome, either negative (e.g. non-union) or positive (e.g. those who have an “excellent” clinical outcome). In many studies, the authors simply report these dichotomous outcomes as proportions for the treatment and control groups and a “p-value” to tell us if the results are “significant”. A common way of reporting these outcomes is presented in Table 1.3.

Table 1.3. Hypothetical outcomes comparing the plate fixation to the external fixation group.
Outcome | Plate fixation (N=24): n | Plate fixation (N=24): % | External fixation (N=44): n | External fixation (N=44): % | P-value
Complications | 6 | 25 | 16 | 36 | 0.34

You note from this table that the external fixation group had a higher percentage of complications (36% versus 25%) compared to the plate fixation group. If you divide the two percentages you get what is called a relative risk (RR) [Part 4; section 11.3.2]. In this case, the relative risk would be 36/25 = 1.4, indicating that there is a 40% increase in complications in the external fixation group. You can calculate a confidence interval for this relative risk by using your SPSS program. You do so and get a range from 0.66 to 3.2. Note that this confidence interval includes 1.0. An RR of 1.0 would indicate no difference. So we can’t be sure from these data if there is less of a risk or more of a risk if one receives external fixation. That is consistent with our non-statistically significant p-value of 0.34.
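To make the calculation concrete, the sketch below recomputes the relative risk, a 95% confidence interval, and the p-value from the complication counts in the table (6 of 24 versus 16 of 44). It assumes the common log-scale approximation for the interval and an uncorrected Pearson chi-square test; your SPSS program may use a different method, so treat this as an illustration rather than a description of its output. The function names are hypothetical.

```python
from math import erfc, exp, log, sqrt


def relative_risk(events_a, n_a, events_b, n_b, z=1.96):
    """RR of group A versus group B with a log-scale 95% CI."""
    rr = (events_a / n_a) / (events_b / n_b)
    # Standard error of log(RR)
    se_log = sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = exp(log(rr) - z * se_log)
    upper = exp(log(rr) + z * se_log)
    return rr, lower, upper


def chi_square_p(events_a, n_a, events_b, n_b):
    """Pearson chi-square p-value (1 df, no continuity correction)."""
    a, b = events_a, n_a - events_a
    c, d = events_b, n_b - events_b
    n = n_a + n_b
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-square distribution with 1 degree of freedom
    return erfc(sqrt(chi2 / 2))


if __name__ == "__main__":
    # External fixation (16 of 44) versus plate fixation (6 of 24).
    rr, lower, upper = relative_risk(16, 44, 6, 24)
    print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
    print(f"p = {chi_square_p(16, 44, 6, 24):.2f}")
```

Working from the raw counts gives an RR of about 1.45 rather than the 1.44 obtained from the rounded percentages; the interval of roughly 0.66 to 3.2 and the p-value of about 0.34 agree with the figures quoted above.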

You recall when you did your sample size calculations that you would probably need a much larger sample size to show a difference in complication rates, which probably explains why you note what appears to be a clinically important difference that is not statistically significant. It is important to note again, though, that the patients in your external fixation group had more severe injuries on average. You can make a more realistic comparison of complication rates and get an “adjusted” RR if you control for the two key variables of fracture type and open versus closed. You do an additional analysis using negative binomial regression and the output gives you an RR and confidence interval of (RR = 1.1; 0.58-2.9; p=0.46). So adjusting for these factors brings the RR and confidence interval down, indicating that there probably is little difference in complication rates and that any difference is probably attributable more to the fracture severity.

[Note: If you had the data, you could analyze by complication type. In this hypothetical example, we did not differentiate between minor (eg, pin tract infection) and major (eg, deep infection) complications. Furthermore, the output from the regression above is purely hypothetical and not based on real data].

Because this is a cohort study, other clinically useful estimates of risk difference (RD) and the number needed to treat (NNT) should not be calculated due to the relative weakness of the data compared to a randomized trial [Part 4; section 11.3.2] .


Since the results section is intended to report your findings without interpretation, the discussion section allows you to put your findings in context [Part 4; section 11.4]. However, the discussion section has a tendency to take on a life of its own. It is very common for this section to become long and difficult to read. It is also common for this section to serve as a platform for one to give their opinions with respect to the treatment intervention being evaluated. Some go as far as to discuss public policy changes in their discussion. There are no standard guidelines for this section, which is probably why authors take the liberty to write everything they were not able to write in the previous three sections of the manuscript. You are careful in your discussion to write a very clear and concise section by addressing the following issues in the order presented below:

  1. Discuss the implications of the primary analyses first.
    • Here you will reiterate what you found with respect to SF-36 scores and FAOS scores. This gives you a chance to present the data again and demonstrate that even while controlling for potential confounding factors there was still a statistically significant difference in FAOS scores.
    • Your secondary outcomes can also be briefly discussed here but you will note that your sample size was too small to detect differences in complications but that after controlling for fracture severity, the differences were small.
  2. Distinguish between statistical and clinical significance.
    • Here you will focus on the difference in your FAOS scores. You will need to make an interpretation of whether the true difference in scores is really clinically significant, even though it is statistically significant.
    • For a more in depth discussion of the concepts, see Part 4; section 11.4 .
  3. Discuss the strengths and weaknesses in your research design or problems with data collection, analysis, or interpretation.
    • Obvious strengths in your study are the use of patient reported outcomes for the first time and controlling for potential confounding variables (eg, fracture type and open versus closed). These should be discussed.
    • Weaknesses include the retrospective nature of the study and the use of a Trauma Registry, which may be limited in detail.
  4. Discuss the results in the context of the published literature
    • Describe the similarities and differences of your work with that of other authors who have done similar studies. Make an attempt to explain why your findings may be similar and why they may be different.
    • Be cautious not to attempt to review the whole body of literature on this topic. Again, brevity is important so be selective in which studies you choose to review.
  5. Discuss the generalizability of the results
    • The purpose of your study is to produce results that the entire orthopedic community can apply to their practice.
    • The ability to generalize your findings is dependent on your study population, its inclusion and exclusion criteria, and other factors such as your follow-up rate.


Limit your conclusions to only those supported by the results of your study.  Unsupported conclusions are very common in scientific research. You may state something like this:

“Patients who suffer more severe tibial pilon fractures are more likely to receive external fixation than plate fixation. Controlling for these differences, patients who receive plate fixation have better ankle-specific outcomes than those receiving external fixation.”

Note: This is a hypothetical scenario and does not represent a real conclusion.
