When we analyze data, the variables in a dataset are usually correlated with one another. Sometimes we want to measure the relationship between two variables while controlling for the influence of the other variables. A partial correlation serves this purpose.
For example, the white area in the figure below represents the pure correlation between variables 1 and 2 once the orange area, which is the part shared by variables 1, 2, and 3, has been excluded from the measurement.
In R, the pcor() function from the ggm package can be used to calculate partial correlations.
The following example calculates the partial correlation between score and stress while controlling for the effect of the variable time.
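With a single controlled variable, as in the figure, the partial correlation can be written in terms of the three pairwise correlations. The symbols below are notation introduced here for illustration: r_{12}, r_{13}, and r_{23} are the ordinary correlations between the numbered variables, and r_{12.3} is the correlation between 1 and 2 with 3 controlled.

```latex
r_{12.3} = \frac{r_{12} - r_{13}\, r_{23}}{\sqrt{\left(1 - r_{13}^{2}\right)\left(1 - r_{23}^{2}\right)}}
```

The numerator removes the part of the correlation between 1 and 2 that runs through 3, and the denominator rescales by the variance in 1 and 2 that 3 does not explain.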
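To see what a partial correlation measures, here is a minimal base R sketch that computes one in two ways without the ggm package. The data are simulated and the names x1, x2, and x3 are hypothetical, chosen only for illustration; they are not the exam data used in this post.

```r
# Hypothetical simulated data: x3 influences both x1 and x2
set.seed(1)
n  <- 200
x3 <- rnorm(n)
x1 <- 0.6 * x3 + rnorm(n)
x2 <- -0.5 * x3 + rnorm(n)

# Method 1: first-order formula built from the pairwise correlations
r12 <- cor(x1, x2)
r13 <- cor(x1, x3)
r23 <- cor(x2, x3)
pc_formula <- (r12 - r13 * r23) / sqrt((1 - r13^2) * (1 - r23^2))

# Method 2: regress x1 and x2 on x3, then correlate the residuals
pc_resid <- cor(resid(lm(x1 ~ x3)), resid(lm(x2 ~ x3)))

# The two methods agree up to floating-point error
all.equal(pc_formula, pc_resid)
```

The residual version makes the idea of "controlling" concrete: whatever x3 can explain is removed from both variables before they are correlated.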
#load the ggm package, which provides pcor() and pcor.test()
> library(ggm)
#create a dataframe containing only the three variables of interest
> examData <- read.delim("Exam Anxiety.dat", header = TRUE)
> examData2 <- examData[, c("score","stress","time")]
#show first 6 observations of the data
> head(examData2)
  score stress time
1    40 86.298    4
2    65 88.716   11
3    80 70.178   27
4    80 61.312   53
5    40 89.522    4
6    70 60.506   22
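#partial correlation between score and stress, with the
#variable time controlled; the first two names are the variables
#of interest and any remaining names are the controlled variables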
> pcor(c("score","stress","time"), var(examData2))
[1] -0.2466658
#the result can be assigned to an object
> pc <- pcor(c("score","stress","time"), var(examData2))
> pc
[1] -0.2466658
#the squared partial correlation is the proportion of variance
#shared by the two variables once time is controlled
> pc^2
[1] 0.06084403
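#t-test of the partial correlation; the arguments are the
#coefficient, the number of controlled variables (1), and the
#sample size (103)
#as the p-value < 0.05, the partial correlation is significant
> pcor.test(pc, 1, 103)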
$tval
[1] -2.545307
$df
[1] 100
$pvalue
[1] 0.01244581
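The test output can be reproduced by hand, which shows where the degrees of freedom come from. This is a sketch in base R that assumes pcor.test() applies the standard t-test for a partial correlation, with degrees of freedom equal to the sample size minus 2 minus the number of controlled variables. The names r, n, q, tval, and pval are chosen here for illustration.

```r
# Values taken from the example above
r <- -0.2466658   # partial correlation between score and stress
n <- 103          # sample size
q <- 1            # number of controlled variables (time)

# Degrees of freedom: n - 2 - q
df <- n - 2 - q

# t statistic and two-sided p-value for the partial correlation
tval <- r * sqrt(df / (1 - r^2))
pval <- 2 * pt(-abs(tval), df)

c(tval = tval, df = df, pvalue = pval)
```

Under that assumption, the result matches the $tval, $df, and $pvalue values printed by pcor.test() up to rounding.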