Mutating and Transforming Variables
In data analysis, it is very common to create new variables or modify existing ones based on calculations or conditions. This process is known as mutating or transforming variables. The dplyr package provides a simple and powerful function called mutate() that allows users to create new columns or change the values of existing columns in a dataset.
Transforming variables is useful when you need to compute new values such as totals, averages, percentages, categories, or flags based on certain conditions. Rather than editing the dataset by hand, you can use mutate() to perform these transformations in a clear and reproducible way.
To use mutate(), first load the dplyr package into the R session.
library(dplyr)
Suppose we have a dataset called employees that contains the columns name, age, department, and salary.
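To make the examples that follow easy to run, a small employees data frame can be built by hand. This is only an illustrative sketch: the names, ages, departments, and salaries below are made-up values chosen for the example, not data from a real source.

```r
library(dplyr)

# Hypothetical sample data with the four columns described above
employees <- data.frame(
  name       = c("Asha", "Ben", "Chen"),
  age        = c(34, 45, 29),
  department = c("Research", "Quality", "Research"),
  salary     = c(50000, 62000, 48000)
)

# Print the data frame to inspect it
employees
```

Any data frame or tibble with a numeric salary column would work equally well for the examples in this section.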
The mutate() function can be used to create a new column. For example, if we want to calculate the annual bonus as 10 percent of the salary:
employees %>%
  mutate(bonus = salary * 0.10)
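As a quick check of what this call returns, the sketch below runs it on a tiny made-up employees data frame (the values are assumptions for illustration only). The result keeps every original column and appends bonus as the last column.

```r
library(dplyr)

# Hypothetical two-row dataset, for illustration only
employees <- data.frame(
  name   = c("Asha", "Ben"),
  salary = c(50000, 62000)
)

# Add a bonus column equal to 10 percent of salary
with_bonus <- employees %>%
  mutate(bonus = salary * 0.10)

names(with_bonus)  # original columns first, then the new bonus column
with_bonus$bonus   # 10 percent of each salary
```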
The mutate() function can also be used to modify an existing column. For example, if we want to increase all salaries by 5 percent:
employees %>%
  mutate(salary = salary * 1.05)
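Note that mutate() returns a new data frame and does not change employees in place, so the raise is lost unless the result is assigned to a name. A minimal sketch of the difference, again using a made-up dataset (the variable unchanged is a hypothetical helper introduced only for this example):

```r
library(dplyr)

# Hypothetical two-row dataset, for illustration only
employees <- data.frame(
  name   = c("Asha", "Ben"),
  salary = c(50000, 62000)
)

# Without assignment, the result is printed and then discarded
employees %>%
  mutate(salary = salary * 1.05)

unchanged <- employees$salary  # still the original salaries

# Assign the result back to keep the 5 percent raise
employees <- employees %>%
  mutate(salary = salary * 1.05)

employees$salary  # the raised salaries
```

Assigning to a new name instead (for example, a separate data frame for the raised salaries) keeps the original data available for comparison.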
Multiple transformations can be performed in a single mutate() call. For example, we can create both a bonus and a total compensation column:
employees %>%
  mutate(
    bonus = salary * 0.10,
    total_compensation = salary + bonus
