Using t-distribution and t-test with R
A Student t-distributed random variable models the ratio of a standard normal random variable to the square root of an independent chi-squared random variable divided by its degrees of freedom.
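The ratio construction can be checked by simulation. The following is a minimal sketch rather than a definitive implementation: the seed, the 5 degrees of freedom, the number of draws, and the made-up sample passed to t.test() are all assumptions chosen for illustration.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
df <- 5                          # assumed degrees of freedom
n  <- 1e5                        # assumed number of simulated draws

z <- rnorm(n)                    # standard normal variates
v <- rchisq(n, df = df)          # independent chi-squared variates
t_sim <- z / sqrt(v / df)        # ratio construction of a t variate

# Compare simulated quantiles with the theoretical t quantiles
probs  <- c(0.05, 0.25, 0.5, 0.75, 0.95)
sim_q  <- quantile(t_sim, probs)
theo_q <- qt(probs, df = df)
print(round(cbind(simulated = sim_q, theoretical = theo_q), 3))

# One-sample t-test on made-up data: H0 says the population mean is 0
x   <- rnorm(20, mean = 0.5, sd = 1)
res <- t.test(x, mu = 0)
print(res$statistic)             # observed t statistic
print(res$p.value)               # two-sided p-value
```

The simulated and theoretical quantiles should agree closely, which is what the ratio definition predicts.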
Course registration link:
https://rdatacode.com/contact-us/
The R programming tutorials include two parts. Part 1 focuses on R programming fundamentals and has the following sections: Get started with R and RStudio environment; R data structure and create datasets; R data management basic methods; R data management advanced methods; Data visualization with ggplot2 package; Data analysis with dplyr package; Working with string and text mining. Part 2 focuses on statistical data analysis using R programming.
Kernel density estimation is a nonparametric method for estimating the density curve of a random sample, and it is often used to draw a smoothed curve in data visualization. In R programming with the ggplot2 package, the functions ggplot() and geom_density() are often chained to draw smoothed curves showing the distribution of continuous variables.
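As a rough illustration, the sketch below estimates a density with base R's density() and then draws the same sample with ggplot() and geom_density(). The bimodal sample, the seed, and the fill colour are assumptions. The ggplot2 part runs only if that package is installed.

```r
set.seed(42)                                      # arbitrary seed
x <- c(rnorm(200, mean = 0), rnorm(100, mean = 4))  # made-up bimodal sample

# Base R kernel density estimate (Gaussian kernel by default)
kde <- density(x)

# Trapezoid-rule area under the estimated curve; it should be close to 1
area <- sum(diff(kde$x) * (head(kde$y, -1) + tail(kde$y, -1)) / 2)
print(area)

# The ggplot2 version, run only when the package is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(data.frame(x = x), aes(x = x)) +
    geom_density(fill = "steelblue", alpha = 0.4)
  print(p)
}
```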
In hypothesis testing, the analyst can commit either a Type I or a Type II error. The Type I error rate (α) is the probability of wrongly rejecting a true null hypothesis H0, while the Type II error rate (β) is the probability of failing to reject a false H0. The value 1 - β is called the power of the test. It is the probability of correctly rejecting a false H0, for the specified null hypothesis H0 and alternative hypothesis H1.
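For a t-test, power can be computed with power.t.test() from the stats package. The design values below (30 subjects per group, a true mean difference of 0.5, a standard deviation of 1, and α = 0.05) are assumptions for illustration only.

```r
# Assumed design: two-sample t-test, 30 subjects per group,
# true mean difference 0.5, standard deviation 1, alpha = 0.05
pw <- power.t.test(n = 30, delta = 0.5, sd = 1, sig.level = 0.05)
print(pw$power)            # power = 1 - beta

beta <- 1 - pw$power       # Type II error probability for this design
print(beta)

# Sample size per group needed to reach 80% power under the same assumptions
need <- power.t.test(power = 0.8, delta = 0.5, sd = 1, sig.level = 0.05)
print(ceiling(need$n))
```

Leaving exactly one of n, delta, sd, sig.level, or power unspecified makes power.t.test() solve for that quantity.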
In statistical hypothesis testing, two types of error can occur: Type I and Type II. A Type I error, with probability α, is rejecting a null hypothesis (H0) that is actually true. A Type II error, with probability β, is failing to reject a false null hypothesis when the alternative hypothesis (H1) is true.
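Both error rates can also be estimated by simulation. This is a sketch under assumed settings: samples of size 25 from a normal distribution, a true mean of 0.4 under the alternative, 2000 replications, and α = 0.05.

```r
set.seed(123)              # arbitrary seed
alpha <- 0.05              # assumed significance level
n_sim <- 2000              # assumed number of simulated experiments

# H0 is true: data come from a normal distribution with mean 0
p_null <- replicate(n_sim, t.test(rnorm(25, mean = 0), mu = 0)$p.value)
type1_rate <- mean(p_null < alpha)      # share of wrong rejections

# H0 is false: the true mean is 0.4 (assumed effect)
p_alt <- replicate(n_sim, t.test(rnorm(25, mean = 0.4), mu = 0)$p.value)
type2_rate <- mean(p_alt >= alpha)      # share of missed rejections

print(c(type1 = type1_rate, type2 = type2_rate))
```

The estimated Type I rate should land near α, while the Type II rate depends on the sample size and the assumed effect.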
The Weibull distribution, named after the Swedish engineer and mathematician Waloddi Weibull, is a continuous distribution widely used to model the random time between events. The exponential distribution, which models the random time until the next event occurs, has the so-called memoryless property, which is equivalent to a constant failure rate. To relax this memoryless condition, analysts may use either the Gamma distribution or the Weibull distribution instead.
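The link between the two distributions can be checked numerically. In the sketch below, hazard() is a hypothetical helper written for this example, and the evaluation points, rate, and shape values are assumptions.

```r
x    <- c(0.5, 1, 2, 4)    # assumed evaluation points
rate <- 0.5                # assumed exponential rate

# With shape = 1 the Weibull density equals the exponential density
w1 <- dweibull(x, shape = 1, scale = 1 / rate)
e1 <- dexp(x, rate = rate)
print(all.equal(w1, e1))

# Hypothetical helper: hazard (failure rate) h(t) = f(t) / (1 - F(t))
hazard <- function(t, shape, scale) {
  dweibull(t, shape, scale) / pweibull(t, shape, scale, lower.tail = FALSE)
}

h_const <- hazard(x, shape = 1,   scale = 2)  # constant: memoryless case
h_up    <- hazard(x, shape = 2,   scale = 2)  # increasing failure rate
h_down  <- hazard(x, shape = 0.5, scale = 2)  # decreasing failure rate
print(rbind(h_const, h_up, h_down))
```

A shape parameter above 1 gives a failure rate that rises with time, and a shape below 1 gives one that falls, which is how the Weibull relaxes the constant-rate restriction.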
The lognormal distribution is used in probability and statistics to model a positive random variable X whose natural logarithm Y = ln(X) has a normal distribution with mean μ and standard deviation σ.
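A short simulation shows the relationship between the two scales. The values μ = 1, σ = 0.5, the seed, and the sample size are assumptions for this sketch.

```r
set.seed(7)                # arbitrary seed
mu    <- 1                 # assumed mean on the log scale
sigma <- 0.5               # assumed standard deviation on the log scale

x <- rlnorm(1e5, meanlog = mu, sdlog = sigma)  # lognormal draws, all positive
y <- log(x)                                    # should be normal(mu, sigma)

print(c(mean_log = mean(y), sd_log = sd(y)))

# The mean of the lognormal variable itself is exp(mu + sigma^2 / 2)
print(c(sample = mean(x), theory = exp(mu + sigma^2 / 2)))
```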
The Beta distribution is a family of distributions used to model continuous random variables defined on [0, 1]. It has two shape parameters, α and β. The continuous uniform distribution on [0, 1] is a special case of the Beta distribution, obtained when both α and β equal 1.
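The uniform special case is easy to verify with dbeta() and dunif(). The grid of points and the second parameter pair (α = 2, β = 5) are assumptions for this sketch.

```r
x <- seq(0, 1, by = 0.1)   # assumed grid on [0, 1]

# Beta(1, 1) has the same density as Uniform(0, 1)
same <- all.equal(dbeta(x, shape1 = 1, shape2 = 1), dunif(x, min = 0, max = 1))
print(same)

# For general shape parameters the mean is alpha / (alpha + beta)
set.seed(11)               # arbitrary seed
a <- 2; b <- 5             # assumed shape parameters
draws <- rbeta(1e5, shape1 = a, shape2 = b)
print(c(sample_mean = mean(draws), theory = a / (a + b)))
```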
The read.table() function in R is often used to import a delimited ASCII file (e.g. a text file or CSV file) as a data frame. The basic syntax is
df <- read.table(input, options)
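A concrete call might look like the following sketch. The file contents, column names, and temporary path are made up for illustration. The options header and sep describe this particular file and would change for other inputs.

```r
# Write a small made-up CSV to a temporary file so the example is self-contained
path <- tempfile(fileext = ".csv")
writeLines(c("name,age,income",
             "Ann,34,52000",
             "Bob,45,61000",
             "Cid,29,48000"), path)

# header = TRUE: first line holds column names; sep = ",": comma-delimited
df <- read.table(path, header = TRUE, sep = ",", stringsAsFactors = FALSE)
str(df)                    # inspect column types

unlink(path)               # remove the temporary file
```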
Data frames are the most widely used data structures in R programming. Unlike a vector, matrix, or array, whose elements must all have the same mode, a data frame can store columns of different modes or types in one object. For example, a data frame of family information can hold numeric (e.g. age, income), character (e.g. name), and logical (working/not working) columns. A data frame in R behaves somewhat like a spreadsheet in Microsoft Excel, where each row represents an observation or subject and each column represents a variable or attribute.
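The family example could be built as follows. All names and values are invented for this sketch.

```r
# Made-up family data with character, numeric, and logical columns
family <- data.frame(
  name   = c("Ann", "Bob", "Cid"),
  age    = c(34, 45, 29),
  income = c(52000, 61000, 48000),
  works  = c(TRUE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

col_types <- sapply(family, class)        # one type per column
print(col_types)

# Rows are observations, columns are variables
working <- family[family$works, c("name", "income")]
print(working)
print(mean(family$age))
```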
When you need to store many elements of the same type or mode in one data object in R, you can use an array. Vectors and matrices can be viewed as special cases of arrays with one and two dimensions respectively.
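The sketch below stores the same 24 values as a vector, a matrix, and a three-dimensional array. The dimensions are arbitrary choices for illustration. One caveat: a plain vector has no dim attribute, so is.array() reports FALSE for it until a dimension is set.

```r
v <- 1:24                             # plain vector, no dim attribute
m <- matrix(v, nrow = 4, ncol = 6)    # two dimensions
a <- array(v, dim = c(2, 3, 4))       # three dimensions

print(dim(m))
print(dim(a))

# Arrays are filled column-major, so the last cell holds the last value
last_cell <- a[2, 3, 4]
print(last_cell)

# A matrix is an array; a plain vector becomes one once it has a dim
print(is.array(m))
print(is.array(v))
a1 <- array(v, dim = 24)              # one-dimensional array
print(is.array(a1))
```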