AGRON INFO TECH

Rapidly generating publication-ready ANOVA tables in R

Introduction

In this blog post, we will explore an improved method for generating publication-ready ANOVA tables in R. Conducting analysis of variance (ANOVA) is a common statistical technique used to compare means across multiple groups or treatments. The ANOVA table provides valuable information about the sources of variation and the significance of each factor in the analysis.

Setup and Data Import

To begin, we need to load the necessary libraries for our analysis. We will be using the readxl, dplyr, tibble, and flextable packages. The readxl package allows us to import data from Excel files, while dplyr and tibble provide helpful functions for data manipulation. Finally, the flextable package will help us generate the publication-ready ANOVA table.

# Library
library(readxl)
library(dplyr)
library(tibble)
library(flextable)

Next, we will import the data for our analysis. Assuming the data is stored in an Excel file named “Data.xlsx”, we can use the read_excel() function from the readxl package to import the data into R. We set the col_names parameter to TRUE to indicate that the first row of the Excel file contains column names.

# Importing data
data <- read_excel(path = "Data.xlsx", col_names = TRUE)
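If you do not have the Excel file at hand, a small simulated data set can stand in for it while you try the code. The sketch below assumes the layout that the rest of this post relies on: three design columns (Rep, Water, Priming) followed by numeric response columns. The object name demo, the level labels, and the response names and values are all made up for illustration.

```r
# Hypothetical stand-in for Data.xlsx:
# 3 replications x 3 water levels x 3 priming levels = 27 rows.
# Level labels and response values are invented for illustration only.
set.seed(1)
demo <- expand.grid(Rep = 1:3,
                    Water = c("W1", "W2", "W3"),
                    Priming = c("P1", "P2", "P3"),
                    stringsAsFactors = FALSE)
demo$Height <- rnorm(nrow(demo), mean = 90, sd = 4)
demo$Yield  <- rnorm(nrow(demo), mean = 4, sd = 0.5)

# Quick structural checks before running the analysis
str(demo)
stopifnot(identical(names(demo)[1:3], c("Rep", "Water", "Priming")))
stopifnot(all(sapply(demo[-(1:3)], is.numeric)))
```

The two stopifnot() calls fail early if the design columns are not in the first three positions or if a response column was not read as numeric, which is a common problem with spreadsheet imports.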

Data Preparation

Before conducting the ANOVA analysis, it is important to ensure that the categorical variables are correctly identified as factors in R. In our data, the variables “Rep”, “Water”, and “Priming” are categorical. We can use the as.factor() function to convert these variables into factors.

# Convert categorical variables to factor variables
data$Rep <- as.factor(data$Rep)
data$Water <- as.factor(data$Water)
data$Priming <- as.factor(data$Priming)
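The conversion matters most for Rep: if replication numbers stay numeric, aov() treats Rep as a single-slope covariate with one degree of freedom instead of a blocking factor. As a minimal sketch, the same conversion can be written in one assignment. The toy data frame and the name design.cols below are illustrative and not part of the original data.

```r
# Toy data frame with the same three design columns (values are illustrative)
toy <- data.frame(Rep = c(1, 2, 3),
                  Water = c("W1", "W2", "W3"),
                  Priming = c("P1", "P2", "P3"),
                  Height = c(88.1, 91.4, 90.2))

# Convert all design columns to factors in a single assignment
design.cols <- c("Rep", "Water", "Priming")
toy[design.cols] <- lapply(toy[design.cols], as.factor)

# Confirm the conversion: design columns are factors, the response stays numeric
sapply(toy, class)
```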

Analysis of Variance

Now we are ready to perform the ANOVA. The response variables are stored from the fourth column onward, so we collect their names in cols and use lapply() to fit one model per response with the aov() function. The result is a list of fitted models, one for each response variable.

# Analysis of variance
# Response variables are all columns after Rep, Water and Priming
cols <- names(data)[4:ncol(data)]

# Fit one model per response; lapply() returns the models as a list
aov.model <- lapply(X = cols, FUN = function(x)
          aov(reformulate(termlabels = "Rep + Water*Priming",
                          response = x),
              data = data))
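To see what reformulate() produces for a single response, the following sketch builds the formula on its own. The column names "Height" and "Plant height" are assumptions used for illustration. A response name that contains a space is not a syntactic R name, so passing it unquoted may trigger a warning or fail to parse, depending on the R version; wrapping it in backticks is the safer option.

```r
# Build the model formula for one response; "Height" is a hypothetical column name
f1 <- reformulate(termlabels = c("Rep", "Water*Priming"), response = "Height")
print(f1)

# A response name containing a space is wrapped in backticks
# so that it parses as a single symbol
f2 <- reformulate(termlabels = c("Rep", "Water*Priming"),
                  response = "`Plant height`")
print(f2)

# The variables the formula refers to
all.vars(f2)
```

If your column names contain spaces, one option is to apply the same idea inside the model-fitting step by passing paste0("`", x, "`") as the response.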

Next, we will create another list to store the ANOVA table for each response variable. We iterate over the fitted models and use the anova() function to extract the degrees of freedom, mean squares, and p-values from each one. We then assign column names, add asterisks to indicate significance levels (** for p < 0.01, * for p < 0.05, ns otherwise), round the mean squares (MS) to three decimal places, and merge them with the significance markers. Finally, we remove the columns that are no longer needed and label the merged column with the name of its response variable.

# Creating list of ANOVA tables
aov.anova <- list()
for(i in seq_along(aov.model)) {
          # Keep Df, Mean Sq and Pr(>F); as.data.frame() preserves the row names
          aov.anova[[i]] <- as.data.frame(anova(aov.model[[i]]))[, c(1, 3, 5)]
          # Setting column names
          colnames(aov.anova[[i]]) <- c("DF", "MS", "P-value")
          # Assign asterisks from the unrounded p-values
          p <- aov.anova[[i]][["P-value"]]
          aov.anova[[i]]$sign <- ifelse(p < 0.01, "**", ifelse(p < 0.05, "*", "ns"))
          # Replace NAs (the Residuals row has no p-value) with blank
          aov.anova[[i]]$sign[is.na(p)] <- " "
          # Merge MS (rounded to three decimals) and sign columns together
          aov.anova[[i]]$MS_comb <- paste(round(aov.anova[[i]]$MS, digits = 3),
                                          aov.anova[[i]]$sign)
          # Removing the MS, P-value and sign columns; DF and the merged column remain
          aov.anova[[i]] <- aov.anova[[i]][, -c(2:4)]
          colnames(aov.anova[[i]])[2] <- cols[i]
}
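The column positions 1, 3 and 5 used in the loop come from the layout of the table that anova() returns. The sketch below shows that layout on R's built-in npk data set, which is used here only as a convenient stand-in, and wraps the marker rule in a small helper. The helper name sig.marker is hypothetical and not part of any package.

```r
# Fit a two-factor model with blocks on the built-in npk data set
fit <- aov(yield ~ block + N * P, data = npk)
tab <- as.data.frame(anova(fit))

# Columns 1, 3 and 5 hold Df, Mean Sq and Pr(>F)
colnames(tab)

# Hypothetical helper: map unrounded p-values to the markers used in the table
sig.marker <- function(p) {
          out <- ifelse(p < 0.01, "**", ifelse(p < 0.05, "*", "ns"))
          out[is.na(p)] <- " "   # e.g. the Residuals row has no p-value
          out
}

# Apply the helper to the p-value column and inspect the result
tab$sign <- sig.marker(tab[["Pr(>F)"]])
tab[, c("Df", "Mean Sq", "Pr(>F)", "sign")]
```

Working from the unrounded p-values avoids boundary problems: a p-value of 0.0096 would round to 0.01 and lose its second asterisk if the comparison were made after rounding.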

Generating the ANOVA Table

Now we will combine all the ANOVA tables from the list into a single data frame. We will remove any duplicate columns for degrees of freedom (DF) and create the final ANOVA table.

# Combine all ANOVA tables into a single data frame
aov.table <- as.data.frame(do.call(cbind, aov.anova))

# Remove duplicate columns for DF
# (logical indexing also works when no column is duplicated)
aov.table <- aov.table[, !duplicated(names(aov.table)), drop = FALSE]

# Creating the ANOVA table using flextable
aov.ft <- flextable(data = aov.table %>%
                    rownames_to_column("SOV"))

# Formatting the table header in bold and displaying the table
aov.ft <- bold(aov.ft, bold = TRUE, part = "header")
aov.ft

SOV            DF  Plant height  Spikelets    Spike length  Grain per spike  grain weight  Biological yield  Grain yield
Rep            2   0.449 ns      7947.473 **  8.709 **      24.412 ns        79.416 **     1.927 ns          0.265 **
Water          2   29.93 ns      266.547 ns   1.095 ns      21.115 ns        3.372 ns      0.199 ns          0.196 *
Priming        2   0.498 ns      14166.35 **  40.299 **     1374.041 **      836.568 **    46.744 **         8.577 **
Water:Priming  4   15.85 ns      229.535 ns   1.437 ns      30.412 ns        7.004 ns      0.177 ns          0.02 ns
Residuals      16  15.499        709.575      1.273         42.05            5.229         0.703             0.034

Conclusion

In this blog post, we have demonstrated an improved method for generating publication-ready ANOVA tables in R. By automating the process using a for loop and leveraging the flexibility of the flextable package, we can efficiently generate a single ANOVA table that reports the degrees of freedom, mean squares, and significance levels for every response variable.

This method not only saves time but also reduces the risk of transcription errors, because no value is copied by hand. The resulting ANOVA table can be exported to various formats, such as Word, PDF, or HTML, allowing seamless integration into research reports and publications.
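As a minimal sketch of the export step, the flextable functions save_as_docx() and save_as_html() write a table to a file. The small table and the file names below are placeholders; in practice you would pass the flextable built from aov.table. PDF output is typically obtained by rendering an R Markdown document to PDF rather than by a dedicated save function, so it is not shown here.

```r
library(flextable)

# Small illustrative table; in practice use the flextable built from aov.table
ft <- flextable(data.frame(SOV = c("Rep", "Residuals"), DF = c(2, 16)))
ft <- bold(ft, bold = TRUE, part = "header")
ft <- autofit(ft)

# Write to Word and HTML; the file names are placeholders
docx.file <- file.path(tempdir(), "anova_table.docx")
html.file <- file.path(tempdir(), "anova_table.html")
save_as_docx(ft, path = docx.file)
save_as_html(ft, path = html.file)
```

The files are written to a temporary directory here so that the example does not touch your working directory; replace the paths with your own to keep the output.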

We hope you find this method useful in your statistical analyses. Happy researching!

Please note that the above R Markdown code assumes that you have the necessary packages installed and the data file named “Data.xlsx” in your working directory. You may need to modify the code accordingly to match your specific setup.

Download data file: Click_here
Download R program: Click_here
Download R studio: Click_here