Generating Statistical Summaries

Generating statistical summaries is an essential step in data analysis. Statistical summaries provide a quick overview of the main characteristics of a dataset without examining every individual data point. These summaries help analysts understand the distribution, central tendency, and variability of the data.

In clinical and pharmaceutical data analysis, statistical summaries are commonly used to describe patient characteristics, treatment outcomes, laboratory values, and other important variables.

# Example dataset
data <- data.frame(
  patient_id = 1:6,
  age = c(45, 52, 37, 60, 49, 55),
  weight = c(70, 65, 80, 68, 75, 72),
  treatment = c("Drug", "Drug", "Placebo", "Drug", "Placebo", "Drug")
)

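The summary() function reports, for each numeric column, the minimum, first quartile, median, mean, third quartile, and maximum. For a character column such as treatment (the default type for strings in R 4.0 and later), it reports only the length, class, and mode.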

summary(data)
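As a minimal sketch of how summary() behaves on individual columns, the example below rebuilds the same dataset, summarises the age column on its own, and converts treatment to a factor so that summary() counts patients per group. The factor conversion and the variable names age_summary and treatment_counts are illustrative assumptions, not part of the original example.

```r
# Rebuild the example dataset so this snippet runs on its own
data <- data.frame(
  patient_id = 1:6,
  age = c(45, 52, 37, 60, 49, 55),
  weight = c(70, 65, 80, 68, 75, 72),
  treatment = c("Drug", "Drug", "Placebo", "Drug", "Placebo", "Drug")
)

# Summary of a single numeric column: min, quartiles, median, mean, max
age_summary <- summary(data$age)
print(age_summary)

# Assumption for illustration: store treatment as a factor,
# so that summary() returns the number of patients in each level
data$treatment <- factor(data$treatment)
treatment_counts <- summary(data$treatment)
print(treatment_counts)
```

Storing categorical variables as factors is one possible choice here; table(data$treatment) would give the same counts without changing the column type.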

Common statistical measures are shown in the table below.

| Measure | Description | R Function |
|---|---|---|
| Mean | Average value | mean(data$age) |
| Median | Middle value | median(data$age) |
| Minimum | Smallest value | min(data$age) |
| Maximum | Largest value | max(data$age) |
| Standard Deviation | Measure of data spread | sd(data$age) |
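The measures in the table can also be computed for several numeric columns at once. The sketch below is one way to do that with sapply(); the names numeric_cols and stats are hypothetical and chosen only for this illustration.

```r
# Rebuild the example dataset so this snippet runs on its own
data <- data.frame(
  patient_id = 1:6,
  age = c(45, 52, 37, 60, 49, 55),
  weight = c(70, 65, 80, 68, 75, 72),
  treatment = c("Drug", "Drug", "Placebo", "Drug", "Placebo", "Drug")
)

# Keep only the numeric variables of interest
numeric_cols <- data[, c("age", "weight")]

# Apply the five measures from the table to every selected column;
# the result is a matrix with one row per measure and one column per variable
stats <- sapply(numeric_cols, function(x) {
  c(mean = mean(x), median = median(x), min = min(x), max = max(x), sd = sd(x))
})

print(round(stats, 2))
```

Note that sd() returns the sample standard deviation, which divides by n - 1 rather than n.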

Group-wise statistical summaries are often required in clinical studies to compare treatment groups.

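library(dplyr)

# Summarise age separately for each treatment group
data %>%
  group_by(treatment) %>%
  summarise(
    average_age = mean(age),    # mean age per group
    median_age = median(age),   # median age per group
    sd_age = sd(age),           # sample standard deviation per group
    patient_count = n()         # number of patients per group
  )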
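For environments where dplyr is not available, the same group-wise comparison can be sketched in base R with tapply() and table(). This is an alternative shown under that assumption, not the approach used above; the name group_summary is hypothetical.

```r
# Rebuild the example dataset so this snippet runs on its own
data <- data.frame(
  patient_id = 1:6,
  age = c(45, 52, 37, 60, 49, 55),
  weight = c(70, 65, 80, 68, 75, 72),
  treatment = c("Drug", "Drug", "Placebo", "Drug", "Placebo", "Drug")
)

# tapply() splits age by treatment and applies the function to each group
mean_age <- tapply(data$age, data$treatment, mean)
median_age <- tapply(data$age, data$treatment, median)
sd_age <- tapply(data$age, data$treatment, sd)

# table() counts the number of patients in each group
patient_count <- table(data$treatment)

# Combine the per-group results into one data frame, one row per group
group_summary <- data.frame(
  treatment = names(mean_age),
  average_age = as.numeric(mean_age),
  median_age = as.numeric(median_age),
  sd_age = as.numeric(sd_age),
  patient_count = as.integer(patient_count[names(mean_age)])
)

print(group_summary)
```

With only two patients in the Placebo group, the standard deviation for that group rests on very little data and should be read with caution.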

Statistical summaries provide a concise description of the dataset and help analysts quickly understand key characteristics before performing more advanced statistical analysis.