Statistical Analysis in Quantitative Research
Descriptive Statistics: Summarizing Your Data
Before testing hypotheses, researchers must understand the basic characteristics of their dataset. Descriptive statistics provide this overview through measures of central tendency—mean, median, and mode—and measures of variability—range, standard deviation, and interquartile range. Together, these numbers paint a portrait of how participants' scores are distributed.
The choice between mean and median depends on the data's distribution. When scores are symmetrically distributed, the mean is the most informative summary. When the distribution is skewed—as is common with healthcare cost data or length-of-stay figures—the median better represents the typical value because it is not pulled by extreme outliers.
Frequency distributions, histograms, and box plots offer visual complements to numerical summaries. These graphical tools reveal patterns that raw numbers may obscure, such as bimodal distributions, floor or ceiling effects, and the presence of outliers. Students should make descriptive statistics their first analytic step in every project, as the patterns they reveal often inform decisions about which inferential tests are appropriate and whether data transformations are needed.
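The effect of skew on these summaries can be seen in a short sketch using Python's standard `statistics` module (the length-of-stay figures below are hypothetical, chosen to include one extreme outlier):

```python
import statistics

# Hypothetical length-of-stay data (days): right-skewed by one long stay
los = [2, 3, 3, 4, 4, 5, 5, 6, 7, 30]

mean_los = statistics.mean(los)      # pulled upward by the outlier (30)
median_los = statistics.median(los)  # resistant to the outlier
sd_los = statistics.stdev(los)       # sample standard deviation
q1, q2, q3 = statistics.quantiles(los, n=4)  # quartiles; IQR = q3 - q1

print(f"mean={mean_los}, median={median_los}, IQR={q3 - q1:.2f}")
```

Here the mean (6.9 days) sits well above the median (4.5 days), the signature of the right skew described above, which is why the median is the better summary for data like these.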
Foundations of Inferential Statistics
Inferential statistics allow researchers to draw conclusions about a population based on data collected from a sample. The core logic involves estimating the probability that observed results could have occurred by chance alone if the null hypothesis—typically that there is no effect or no difference—were true. This probability is expressed as a p-value.
A p-value below a predetermined threshold, conventionally 0.05, leads researchers to reject the null hypothesis and conclude that the observed effect is statistically significant. However, statistical significance does not automatically imply clinical importance. A very large sample can detect trivially small differences that have no practical relevance for patient care.
Confidence intervals complement p-values by providing a range within which the true population parameter likely falls. A 95 percent confidence interval that is narrow suggests a precise estimate, while a wide interval signals uncertainty. Reporting both p-values and confidence intervals gives readers a more complete picture of the findings. Students should practice interpreting both, as this dual reporting is the standard expectation in healthcare journals.
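As a concrete sketch, a 95 percent confidence interval for a sample mean can be computed by hand (the blood-pressure reductions below are hypothetical; the critical t value of 2.262 is the standard value for 9 degrees of freedom):

```python
import math
import statistics

# Hypothetical systolic BP reductions (mmHg) in a treated sample of 10
reductions = [8, 12, 5, 9, 11, 7, 10, 6, 9, 13]

n = len(reductions)
mean = statistics.mean(reductions)
se = statistics.stdev(reductions) / math.sqrt(n)  # standard error of the mean

# 95% CI using the t critical value for df = n - 1 = 9 (t ~ 2.262)
t_crit = 2.262
ci_low, ci_high = mean - t_crit * se, mean + t_crit * se
print(f"mean = {mean:.1f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

The interval (roughly 7.15 to 10.85 mmHg) excludes zero, consistent with a significant result, and its width conveys how precisely the effect has been estimated.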
Common Statistical Tests in Healthcare Research
Selecting the right statistical test depends on the research question, the number and type of variables, and the distribution of the data. The t-test compares means between two groups—independent samples when the groups are different people, or paired samples when the same individuals are measured twice. Analysis of variance extends this comparison to three or more groups.
Chi-square tests evaluate associations between categorical variables, such as whether the proportion of patients experiencing an adverse event differs between treatment and control arms. When expected cell counts are small, Fisher's exact test provides a more reliable alternative. Correlation coefficients, such as Pearson's r, quantify the strength and direction of linear relationships between two continuous variables.
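To make the chi-square logic concrete, the test statistic for a 2x2 table can be computed directly from observed counts and the expected counts implied by independence (the adverse-event numbers below are hypothetical):

```python
# Hypothetical 2x2 table: adverse events by study arm
#                  event  no event
observed = [[10, 90],   # treatment
            [20, 80]]   # control

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: (row total * column total) / N
chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (obs - expected) ** 2 / expected

print(f"chi-square = {chi2:.3f}")  # compare to 3.841 (df = 1, alpha = 0.05)
```

The statistic here (about 3.92) just exceeds the 3.841 critical value for one degree of freedom, so the difference in adverse-event proportions would be declared significant at the 0.05 level.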
Regression analysis allows researchers to examine the relationship between an outcome and one or more predictor variables while controlling for confounders. Linear regression is used for continuous outcomes, logistic regression for binary outcomes, and Cox proportional hazards regression for time-to-event data. Each method has assumptions—normality, linearity, independence of observations—that must be verified before results can be trusted. Students should view test selection as a deliberate, assumption-driven process rather than a rote matching exercise.
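For a single continuous predictor, the ordinary least squares estimates underlying linear regression reduce to two closed-form expressions, sketched here with hypothetical data:

```python
import statistics

# Hypothetical data: daily exercise minutes (x) vs. systolic BP (y)
x = [10, 20, 30, 40, 50, 60]
y = [140, 136, 133, 128, 126, 121]

# OLS with one predictor:
# slope = cov(x, y) / var(x); intercept = mean(y) - slope * mean(x)
mx, my = statistics.mean(x), statistics.mean(y)
cov_xy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
slope = cov_xy / statistics.variance(x)
intercept = my - slope * mx

print(f"BP ~ {intercept:.1f} + {slope:.3f} * minutes")
```

In practice a statistical package also reports standard errors, p-values, and diagnostics for the assumptions listed above; the point of the sketch is simply that the fitted line is determined by the data's means, variance, and covariance.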
Avoiding Common Statistical Pitfalls
Several mistakes recur in healthcare research. Multiple testing—running many statistical tests on the same dataset without adjusting for the increased probability of false positives—can lead to spurious findings. A Bonferroni correction addresses this by lowering the significance threshold for each individual test, while false discovery rate procedures control the expected proportion of false positives among the results declared significant.
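The Bonferroni adjustment is simple enough to sketch directly (the five p-values below are hypothetical):

```python
# Hypothetical raw p-values from five tests on the same dataset
p_values = [0.003, 0.012, 0.021, 0.040, 0.310]
alpha = 0.05
m = len(p_values)

# Bonferroni: compare each p-value to alpha / m
# (equivalently, multiply each p-value by m, capped at 1.0)
adjusted = [min(p * m, 1.0) for p in p_values]
significant = [p < alpha / m for p in p_values]

print(adjusted)
print(significant)
```

Against the corrected threshold of 0.01, only the first test survives, even though four of the five raw p-values fall below 0.05.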
Confusing correlation with causation is another frequent error. A strong correlation between two variables does not mean one caused the other; a third, unmeasured variable may drive both. Only well-designed experimental or quasi-experimental studies with appropriate controls can support causal claims.
Overfitting occurs when a statistical model is too complex for the data, capturing noise rather than true relationships. A model with too many predictors relative to the sample size may perform well on the current dataset but fail to replicate in new data. Cross-validation and parsimony—using the simplest model that adequately explains the data—guard against this problem. Students who learn to recognize these pitfalls will produce more credible analyses and evaluate published findings with sharper critical judgment.
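The mechanics of k-fold cross-validation can be sketched in a few lines: the observations are split into k folds, and each fold serves exactly once as the held-out test set (a minimal sketch without shuffling; real analyses typically randomize fold assignment first):

```python
def k_fold_indices(n, k):
    """Yield (train_indices, test_indices) for each of k folds over n observations."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        test_set = set(test)
        train = [i for i in range(n) if i not in test_set]
        yield train, test
        start += size

# Each observation appears in a test set exactly once across the 5 folds
folds = list(k_fold_indices(10, 5))
```

A model that performs well on every held-out fold, not just on the data it was fit to, is far less likely to be capturing noise.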
Frequently Asked Questions
When should I use a t-test versus ANOVA?
Use a t-test when comparing means between exactly two groups and ANOVA when comparing means across three or more groups. ANOVA controls the overall error rate that would inflate if you ran multiple separate t-tests instead.
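The inflation is straightforward arithmetic: with k independent tests each run at alpha = 0.05, the chance of at least one false positive is 1 - 0.95^k, as this small sketch shows:

```python
# Family-wise error rate when running k independent tests at alpha = 0.05
alpha = 0.05
for k in (1, 3, 10):
    fwer = 1 - (1 - alpha) ** k
    print(f"{k} tests: P(at least one false positive) = {fwer:.3f}")
```

With ten tests the family-wise error rate already exceeds 40 percent, which is why ANOVA (or an explicit multiple-comparison correction) is preferred over repeated t-tests.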
What does a p-value actually tell me?
A p-value indicates the probability of obtaining results at least as extreme as those observed, assuming the null hypothesis is true. It does not tell you the probability that the null hypothesis itself is true, and it says nothing about the magnitude or clinical importance of the effect.
Why are confidence intervals important alongside p-values?
Confidence intervals show the range of plausible values for the true effect size, giving readers information about both the direction and precision of the estimate. A statistically significant p-value paired with a wide confidence interval suggests the effect's exact magnitude is still uncertain.
What is the difference between linear and logistic regression?
Linear regression predicts a continuous outcome variable, while logistic regression predicts the probability of a binary outcome such as disease presence or absence. The choice depends entirely on the nature of the dependent variable.
How do I know if my data meet the assumptions of a statistical test?
Diagnostic plots, normality tests like Shapiro-Wilk, and variance homogeneity tests like Levene's test help verify assumptions. When assumptions are violated, non-parametric alternatives or data transformations may be appropriate.
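A minimal sketch of these two checks, assuming SciPy is installed (the group scores below are hypothetical; `scipy.stats.shapiro` and `scipy.stats.levene` are the relevant functions):

```python
from scipy import stats  # assumes SciPy is available

# Hypothetical outcome scores for two study groups
group_a = [5.1, 4.8, 5.5, 5.0, 4.9, 5.3, 5.2, 4.7]
group_b = [6.0, 6.4, 5.9, 6.2, 6.1, 6.5, 5.8, 6.3]

# Shapiro-Wilk: the null hypothesis is that the sample is normally
# distributed, so a large p-value gives no evidence against normality
_, p_norm = stats.shapiro(group_a)

# Levene's test: the null hypothesis is that group variances are equal
_, p_var = stats.levene(group_a, group_b)

print(f"Shapiro-Wilk p = {p_norm:.3f}, Levene p = {p_var:.3f}")
```

Note that these tests check the null hypotheses of the assumptions themselves, so it is small p-values that signal a violation; pairing them with diagnostic plots guards against over-reliance on any single check.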