Normal distribution
2021-01-22
The data from most food experiments are analyzed for significance. In the methods section of many articles, we state which method was used for the significance analysis. If P is less than 0.05, the difference between groups is considered significant. However, before the significance analysis, some preprocessing should be carried out to make sure the data are normally distributed; otherwise the results of tests that assume normality may be inaccurate. The following summarizes the data preprocessing that can be performed in R to meet the requirements of significance analysis.
Before running a statistical analysis such as a significance test, check that the data are approximately normally distributed.
## load dataset
setwd("C:/blog/Dataset")
data <- read.csv("fruits_Vc.csv")
head(data)
## Number Fruit Repeat Vitamin
## 1 1 Apple A1 4.6
## 2 2 Apple A2 3.9
## 3 3 Apple A3 5.2
## 4 4 Apple A4 6.9
## 5 5 Apple A5 4.8
## 6 6 Apple A6 3.3
Plot
Apple <- data[1:6,4]
shapiro.test(Apple)
##
## Shapiro-Wilk normality test
##
## data: Apple
## W = 0.94874, p-value = 0.7301
library(ggpubr)
ggdensity(Apple,
main = "Density plot of apple",
xlab = "Apple")
ggqqplot(Apple)
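The Apple subset above is selected by row position. As a rough sketch of how the same check could be run for every fruit at once, the example below groups the values by the Fruit column instead. The data frame `demo` and its values are made up for illustration; only the column names Fruit and Vitamin follow the dataset above, and nothing here comes from fruits_Vc.csv.

```r
# Hypothetical stand-in for the real dataset: two fruits, six repeats each.
set.seed(1)
demo <- data.frame(
  Fruit   = rep(c("Apple", "Orange"), each = 6),
  Vitamin = c(rnorm(6, mean = 5, sd = 1), rnorm(6, mean = 50, sd = 5))
)

# Run shapiro.test() on the Vitamin values of each fruit and keep the p-values.
p_values <- tapply(demo$Vitamin, demo$Fruit,
                   function(x) shapiro.test(x)$p.value)
print(p_values)
```

Grouping by name avoids hard-coded row ranges such as 1:6, which silently break if the file is reordered.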
Shapiro-Wilk normality test
data: data$Vitamin
W = 0.96026, p-value = 0.6066
When the P value here is greater than 0.05, the test does not reject the hypothesis of normality, so the data can be treated as normally distributed.
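To make the decision rule concrete, here is a minimal sketch that applies the 0.05 threshold to two simulated samples. The samples and the helper `is_plausibly_normal` are assumptions introduced for illustration, not part of the dataset or of any package. Note that a P value above 0.05 only means normality was not rejected; with small samples the test has little power to detect departures.

```r
# Simulated data (illustrative only, not from fruits_Vc.csv).
set.seed(42)
normal_sample <- rnorm(200, mean = 5, sd = 1)  # drawn from a normal distribution
skewed_sample <- rexp(200, rate = 1)           # strongly right-skewed

# Hypothetical helper: TRUE when Shapiro-Wilk does not reject
# normality at significance level alpha.
is_plausibly_normal <- function(x, alpha = 0.05) {
  shapiro.test(x)$p.value > alpha
}

print(shapiro.test(normal_sample)$p.value)
print(is_plausibly_normal(skewed_sample))
```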
You can also inspect the distribution visually with a density plot and a Q-Q plot:
library("ggpubr")
ggdensity(data$Vitamin,
main = "Density plot of Vitamin",
xlab = "Vitamin")
ggqqplot(data$Vitamin)
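The introduction mentions preprocessing for data that do not pass the normality check, while the steps above only cover the check itself. One common option, shown here as an assumption rather than as the method used in this post, is a log transformation for right-skewed, strictly positive values. The data below are simulated.

```r
# Simulated right-skewed data (log-normal), strictly positive.
set.seed(7)
raw <- rlnorm(100, meanlog = 1, sdlog = 0.8)

# The log transform is only defined for positive values.
transformed <- log(raw)

# Compare the Shapiro-Wilk p-values before and after the transform.
p_raw <- shapiro.test(raw)$p.value
p_log <- shapiro.test(transformed)$p.value
print(c(raw = p_raw, log = p_log))
```

If the transformed values pass the check, the significance analysis would then be run on the transformed scale, and results should be reported accordingly.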