normal Archives - We provide R, Python, Statistics Online-Learning Course

How to calculate normal distributions in R

wilsonzhang746 — Sun, 11 Aug 2024 12:32:41 +0000

We provide effective and affordable online training courses for R and Python. Click here for more details and course registration!

The normal distribution is a continuous probability distribution with a bell-shaped probability density curve. It is widely used in statistical data analysis and is the basis for many other distributions. Its probability density function can be expressed as

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$

where the two parameters μ and σ are its mean and standard deviation, respectively.

The probability under the curve between any two x values x = x1 and x = x2 equals the integral

$$P(x_1 \le X \le x_2) = \int_{x_1}^{x_2} \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) dx$$
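As a quick numerical check of this integral, the sketch below computes the probability between two x values in two ways: as a difference of cumulative probabilities from pnorm(), and by numerically integrating dnorm() with integrate(). The parameter values (mean 0.5, standard deviation 1.2) and the endpoints x1 and x2 are assumptions chosen only for illustration.

```r
# Illustrative parameters and endpoints (assumed for this sketch)
mu <- 0.5
sigma <- 1.2
x1 <- -1
x2 <- 2

# Probability between x1 and x2 as a difference of cumulative probabilities
p_cdf <- pnorm(x2, mean = mu, sd = sigma) - pnorm(x1, mean = mu, sd = sigma)

# The same probability by numerical integration of the density
p_int <- integrate(dnorm, lower = x1, upper = x2, mean = mu, sd = sigma)$value

p_cdf
p_int

# The two results should agree up to numerical integration error
all.equal(p_cdf, p_int, tolerance = 1e-6)
```

In practice the pnorm() difference is the usual choice; the integrate() call is included only to connect the result back to the integral.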

In R, the following four functions are used to work with the normal distribution.

dnorm() – probability density

pnorm() – cumulative probability up to a specified value

qnorm() – quantile, i.e. the value below which the specified cumulative probability lies; this is the inverse of pnorm()

rnorm() – generates random numbers from a specified normal distribution
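To make the inverse relationship between pnorm() and qnorm() concrete, here is a minimal round-trip sketch. The parameter values and the x value are assumptions chosen for illustration.

```r
# Illustrative parameters (assumed for this sketch)
mu <- 0.5
sigma <- 1.2

x <- 1.7                                   # an arbitrary x value
p <- pnorm(x, mean = mu, sd = sigma)       # cumulative probability up to x
x_back <- qnorm(p, mean = mu, sd = sigma)  # quantile for that probability

p
x_back

# Applying qnorm() to the output of pnorm() recovers the original x
all.equal(x, x_back)
```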

The following code examples show how to use these functions.

# create a sequence of values
# from -3 to 3 with a step of 0.1
# (note: the name T masks R's built-in shorthand T for TRUE)
T <- seq(-3, 3, by=0.1)
T
#output
 [1] -3.0 -2.9 -2.8 -2.7 -2.6 -2.5 -2.4 -2.3 -2.2 -2.1 -2.0 -1.9 -1.8
[14] -1.7 -1.6 -1.5 -1.4 -1.3 -1.2 -1.1 -1.0 -0.9 -0.8 -0.7 -0.6 -0.5
[27] -0.4 -0.3 -0.2 -0.1  0.0  0.1  0.2  0.3  0.4  0.5  0.6  0.7  0.8
[40]  0.9  1.0  1.1  1.2  1.3  1.4  1.5  1.6  1.7  1.8  1.9  2.0  2.1
[53]  2.2  2.3  2.4  2.5  2.6  2.7  2.8  2.9  3.0
#calculate probability density for these values
#from a normal distribution with mean 0.5 standard deviation 1.2
dt = dnorm(T, mean=0.5, sd=1.2)
dt
#output
 [1] 0.004725734 0.006005083 0.007577969 0.009496655 0.011818778
 [6] 0.014606917 0.017927867 0.021851574 0.026449710 0.031793853
[11] 0.037953294 0.044992472 0.052968089 0.061925970 0.071897766
[16] 0.082897616 0.094918912 0.107931330 0.121878295 0.136675062
[21] 0.152207571 0.168332238 0.184876796 0.201642270 0.218406128
[26] 0.234926563 0.250947860 0.266206671 0.280439019 0.293387772
[31] 0.304810305 0.314486023 0.322223431 0.327866430 0.331299555
[36] 0.332451900 0.331299555 0.327866430 0.322223431 0.314486023
[41] 0.304810305 0.293387772 0.280439019 0.266206671 0.250947860
[46] 0.234926563 0.218406128 0.201642270 0.184876796 0.168332238
[51] 0.152207571 0.136675062 0.121878295 0.107931330 0.094918912
[56] 0.082897616 0.071897766 0.061925970 0.052968089 0.044992472
[61] 0.037953294
# Plot the graph.
plot(T, dt)

Normal distribution probability density output from RStudio
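As a sanity check, the values returned by dnorm() can be compared with the normal density formula, 1/(σ√(2π)) · exp(−(x−μ)²/(2σ²)), evaluated by hand. The sketch below does this for the same grid and parameters as above; the variable names are chosen for this illustration.

```r
mu <- 0.5
sigma <- 1.2
x <- seq(-3, 3, by = 0.1)

# Density computed directly from the formula
manual <- 1 / (sigma * sqrt(2 * pi)) * exp(-(x - mu)^2 / (2 * sigma^2))

# Density from the built-in function
builtin <- dnorm(x, mean = mu, sd = sigma)

# The two vectors should match up to floating-point tolerance
all.equal(manual, builtin)
```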

#Cumulative probabilities for these values
#from a normal distribution with mean 0.5 standard deviation 1.2
PT <- pnorm(T, mean = 0.5, sd = 1.2)
PT
#output
 [1] 0.001768968 0.002303266 0.002979763 0.003830381 0.004892537
 [6] 0.006209665 0.007831677 0.009815329 0.012224473 0.015130140
[11] 0.018610425 0.022750132 0.027640146 0.033376508 0.040059157
[16] 0.047790352 0.056672755 0.066807201 0.078290204 0.091211220
[21] 0.105649774 0.121672505 0.139330247 0.158655254 0.179658669
[26] 0.202328381 0.226627352 0.252492538 0.279834464 0.308537539
[31] 0.338461120 0.369441340 0.401293674 0.433816167 0.466793248
[36] 0.500000000 0.533206752 0.566183833 0.598706326 0.630558660
[41] 0.661538880 0.691462461 0.720165536 0.747507462 0.773372648
[46] 0.797671619 0.820341331 0.841344746 0.860669753 0.878327495
[51] 0.894350226 0.908788780 0.921709796 0.933192799 0.943327245
[56] 0.952209648 0.959940843 0.966623492 0.972359854 0.977249868
[61] 0.981389575
# Plot the graph for cumulative probabilities
plot(T, PT)

Normal distribution cumulative probabilities output from RStudio

# Create a sequence of values representing cumulative probabilities 
prob_vec <- seq(0, 1, by = 0.05)
prob_vec
#output
 [1] 0.00 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45 0.50 0.55 0.60
[14] 0.65 0.70 0.75 0.80 0.85 0.90 0.95 1.00
#quantile values associated with these cumulative probabilities
#from a normal distribution with mean 0.5 standard deviation 1.2
QT <- qnorm(prob_vec, mean=0.5, sd=1.2)
QT
#output
 [1]        -Inf -1.47382435 -1.03786188 -0.74372007 -0.50994548
 [6] -0.30938770 -0.12928062  0.03761544  0.19598348  0.34920638
[11]  0.50000000  0.65079362  0.80401652  0.96238456  1.12928062
[16]  1.30938770  1.50994548  1.74372007  2.03786188  2.47382435
[21]         Inf
# Plot the graph for quantile values
plot(prob_vec, QT)

Quantile values from a normal distribution, RStudio output

# Randomly generate 10000 numbers from a normal distribution
# with mean=0.5 and standard deviation=1.2
RT <- rnorm(10000, mean=0.5, sd=1.2)
# Plot generated random numbers in a histogram, with 50 bins
hist(RT, breaks=50)

Histogram of random numbers generated from a normal distribution
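Because rnorm() draws are random, the histogram looks slightly different on every run. The sketch below shows one way to make the draw reproducible and to check that the sample behaves as expected. The seed value and the variable name sample_vals are assumptions made for this illustration.

```r
set.seed(123)  # arbitrary seed, chosen only to make this sketch reproducible
sample_vals <- rnorm(10000, mean = 0.5, sd = 1.2)

mean(sample_vals)  # should be close to 0.5
sd(sample_vals)    # should be close to 1.2

# Overlay the theoretical density on a density-scaled histogram
hist(sample_vals, breaks = 50, freq = FALSE)
curve(dnorm(x, mean = 0.5, sd = 1.2), add = TRUE)
```

Setting freq = FALSE puts the histogram on the density scale, so the dnorm() curve can be drawn on the same axes.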

To learn more about R and preview our training course, you can watch the R tutorial videos on our YouTube channel!

The post How to calculate normal distributions in R appeared first on We provide R, Python, Statistics Online-Learning Course.

Using probability functions in R

wilsonzhang746 — Fri, 21 Jun 2024 13:24:26 +0000


R provides a rich set of probability functions. Their names take the form

[dpqr][prob],

where

[dpqr] indicates which quantity the function computes:

d – probability density

p – cumulative probability

q – quantile value

r – random number generation

[prob] denotes the probability distribution being used, for example norm for the normal distribution. The following example code illustrates the case of the normal distribution.

#load library
library(ggplot2)
#create a vector
x <- seq(from = -10, to = 10, by = 0.2)
#generate density values from standard normal distribution
y <- dnorm(x)  
#show first 10 values of result 
y[1:10]
#output
 [1] 7.694599e-23 5.573000e-22 3.878112e-21 2.592865e-20
 [5] 1.665588e-19 1.027977e-18 6.095758e-18 3.472963e-17
 [9] 1.901082e-16 9.998379e-16
#create a data frame
data <- data.frame(x = x, y = y)
#plot relationship using ggplot2
ggplot(data, aes(x, y)) +
  geom_line() +
  labs(x = "x",
       y = "Probability density") +
  scale_x_continuous(breaks = seq(-10, 10, 1))

Standard Normal Probability Density Curve

#the area under the standard normal curve to the left of z=1.25?
pnorm(1.25) 
#output 
[1] 0.8943502
#the value of the 60th percentile of a normal distribution 
#with a mean of 30 and a standard deviation of 8
qnorm(.6, mean=30, sd=8) 
#output 
[1] 32.02678
#Generate 20 random normal variates with a mean of 30 and a 
#standard deviation of 8
rnorm(20, mean=30, sd=8)
#output 
 [1] 18.03202 41.01986 25.47644 16.87644 11.62907 35.38060 21.98076
 [8] 35.86054 25.21310 17.53319 31.04676 35.65328 31.70905 22.68345
[15] 18.20250 34.46939 34.46720 42.65455 14.22156 27.36327
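The [dpqr][prob] naming pattern is not specific to the normal distribution. As a hedged sketch, the block below applies the same four prefixes to the binomial (binom) and exponential (exp) distributions from base R. The parameter values and variable names are arbitrary and chosen only for illustration.

```r
# Binomial distribution: 10 trials, success probability 0.3 (illustrative values)
d_binom <- dbinom(3, size = 10, prob = 0.3)    # d: P(exactly 3 successes)
p_binom <- pbinom(3, size = 10, prob = 0.3)    # p: P(at most 3 successes)
q_binom <- qbinom(0.5, size = 10, prob = 0.3)  # q: median number of successes
r_binom <- rbinom(5, size = 10, prob = 0.3)    # r: 5 random draws

# Exponential distribution with rate 2 (illustrative value)
d_exp <- dexp(1, rate = 2)    # d: density at x = 1
p_exp <- pexp(1, rate = 2)    # p: P(X <= 1)
q_exp <- qexp(0.5, rate = 2)  # q: median
r_exp <- rexp(5, rate = 2)    # r: 5 random draws

c(d_binom, p_binom, q_binom)
c(d_exp, p_exp, q_exp)
```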



How to generate random numbers from Normal, Uniform and Poisson distribution in R

wilsonzhang746 — Tue, 18 Jun 2024 12:30:49 +0000


R is a popular tool for statistical work among data analysts. Working with statistical distributions and generating random numbers from widely used ones are common tasks in data science. In this post, we show how to generate random numbers from the Normal, Uniform and Poisson distributions in R.

1. Normal distribution

Normal variates can be generated with the rnorm() function in R. The basic form of the function is

rnorm(N, mean, sd),

where N is the number of random numbers to generate, and mean and sd are the mean and standard deviation of the distribution; their default values are 0 and 1. The next code block shows several examples of generating random numbers from specified Normal distributions.

#generate 10 random numbers from the
#standard normal distribution (mean=0, sd=1)
vec_norm1 <- rnorm(10)  
vec_norm1
#output
 [1] -0.10300454 -0.49992423 -0.04867705 -0.20479786  0.64047272
 [6]  0.80908181  2.63997308  0.54729061 -1.53948859 -0.56861347
#10 random numbers from normal variates with mean 32
#and standard deviation 2
vec_norm2 <- rnorm(10, mean=32, sd=2)
vec_norm2
#output
 [1] 31.77717 29.00409 33.75556 32.12455 29.60058 33.79212 32.04183
 [8] 33.66872 31.60234 31.13245

# a matrix of 50 normal variates with mean 32 sd 2
mat_norm <- matrix(rnorm(50,mean=32, sd=2),nrow=10)
mat_norm 
#output
          [,1]     [,2]     [,3]     [,4]     [,5]
 [1,] 32.17958 27.05001 36.68007 30.66535 33.00489
 [2,] 31.97359 33.11925 32.96932 30.13305 33.49547
 [3,] 32.30013 32.74311 33.43902 32.04489 32.22167
 [4,] 35.18345 28.34490 32.31010 30.84051 29.25815
 [5,] 28.83627 34.10811 29.39567 30.01753 34.64026
 [6,] 28.89909 27.29548 32.78904 32.68282 32.00039
 [7,] 33.32865 31.11026 32.56352 34.01731 34.44393
 [8,] 32.18550 35.94802 33.88321 30.73226 32.15024
 [9,] 29.06265 30.57460 30.49779 31.30553 30.00966
[10,] 29.73152 30.07022 29.59840 32.81636 35.68608
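The outputs above will differ each time the code is run, because every call to rnorm() produces new random draws. When reproducible results are needed, set.seed() can be called before drawing. A minimal sketch follows; the seed value 42 is an arbitrary assumption.

```r
set.seed(42)          # arbitrary seed value, chosen for this sketch
a <- rnorm(5, mean = 32, sd = 2)

set.seed(42)          # resetting the same seed replays the same sequence
b <- rnorm(5, mean = 32, sd = 2)

identical(a, b)       # TRUE: both draws are the same
```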

2. Uniform distribution

The Uniform distribution describes the situation where the probability density is constant over a range. The function runif(N, min, max) in R generates Uniform variates, where N is the sample size and the min and max parameters specify the range. The following code block shows examples for the Uniform distribution.

#25 uniform variates between 0 and 1
#default range is 0 and 1
vec_uni_1 <- runif(25)   
vec_uni_1
#output
 [1] 0.96642510 0.38370536 0.99770052 0.99907261 0.38873183
 [6] 0.56486749 0.31835318 0.33375417 0.91052761 0.81353888
[11] 0.19505230 0.06403500 0.96022244 0.40504310 0.02559239
[16] 0.66514033 0.96932741 0.55033963 0.52867588 0.13435604
[21] 0.49814712 0.51983895 0.80709462 0.53736817 0.16027540

#10 uniform variates between 3 and 8
vec_uni_2 <- runif(10, min=3, max=8) 
vec_uni_2
#output
 [1] 5.264467 5.507854 5.859587 5.014579 3.052218 7.628316 3.571432
 [8] 4.471559 4.501750 7.780484
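The sketch below checks two properties of such draws: all values stay inside the requested range, and the sample mean is close to the midpoint of the range. It also shows that drawing on a custom range should be equivalent to rescaling standard uniform draws. The seed and sample size are assumptions made for illustration.

```r
set.seed(1)  # arbitrary seed, for reproducibility of this sketch
u <- runif(1000, min = 3, max = 8)

range(u)   # all values lie between 3 and 8
mean(u)    # should be close to the midpoint (3 + 8) / 2 = 5.5

# Equivalent construction: rescale standard uniform draws on [0, 1]
set.seed(1)
u2 <- 3 + (8 - 3) * runif(1000)
all.equal(u, u2)
```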

3. Poisson distribution

The Poisson distribution models the number of events that occur in a fixed time interval when events occur at a constant average rate. For example, if a call center receives calls at a constant average rate of 5 per hour, the expected number of calls in a 2-hour period is 10. The following code example shows how Poisson random numbers (integers) are generated with the function rpois(N, lambda), where lambda is the mean of the distribution.

#a matrix of 100 poisson variates with mean 10
matrix(rpois(100, 10),nrow=20)
#output
      [,1] [,2] [,3] [,4] [,5]
 [1,]   10    9   14    8    8
 [2,]   19   14   19   12   11
 [3,]    6   10    9    5    7
 [4,]   12    9   11   12   12
 [5,]    7   15   15    9    6
 [6,]   12   14   10    7    3
 [7,]    6   14   10    9   13
 [8,]    7    7   17    7   15
 [9,]   12    7    7   12    8
[10,]   11   10   14   10   12
[11,]   10    5    9    6   11
[12,]    9    7    9    7    4
[13,]   13    5   10    6   10
[14,]    7    8    8    6   11
[15,]   11    9    9    7    9
[16,]    8    9    8   13   10
[17,]   12    5   13   11    7
[18,]   14   11   11    6   14
[19,]    7   12   11    9   12
[20,]   16   14    7   12   15
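To tie the call center example to the code, the sketch below derives the Poisson mean from the rate and the length of the period, then checks that both the sample mean and the sample variance are close to that value. The seed, sample size and variable names are assumptions made for illustration.

```r
set.seed(2024)             # arbitrary seed, for reproducibility of this sketch
rate_per_hour <- 5         # average calls per hour, as in the example
hours <- 2
lambda <- rate_per_hour * hours   # expected calls in a 2-hour period: 10

calls <- rpois(10000, lambda)

mean(calls)   # should be close to lambda
var(calls)    # for a Poisson distribution the variance also equals lambda

# Probability of receiving at most 5 calls in the 2-hour period
ppois(5, lambda)
```

The last line uses ppois(), the cumulative counterpart of rpois(), following the same [dpqr] prefix pattern as the normal distribution functions.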

