Calculate point-biserial and biserial correlations using R

When a correlation, usually a Pearson correlation, is calculated, both variables are normally expected to be continuous. This requirement does not, however, exclude the case where one of the two variables is dichotomous (binary). Suppose we want to measure the correlation between height and gender for a group of people: the gender variable takes clearly dichotomous values. This kind of Pearson correlation is called the point-biserial correlation, because the gender variable is coded strictly as 0 or 1.
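The idea can be sketched in a few lines. The example below is written in Python rather than R and uses only the standard library; the height and gender values are made-up illustration data, and the helper names `pearson` and `point_biserial` are hypothetical. It checks that the ordinary Pearson formula applied to a 0/1 variable gives the same number as the usual point-biserial formula based on the two group means.

```python
import math

def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def point_biserial(values, groups):
    """Point-biserial correlation from group means; groups must be coded 0/1."""
    n = len(values)
    group1 = [v for v, g in zip(values, groups) if g == 1]
    group0 = [v for v, g in zip(values, groups) if g == 0]
    mean_all = sum(values) / n
    # Population standard deviation (divide by n) of the continuous variable
    sd = math.sqrt(sum((v - mean_all) ** 2 for v in values) / n)
    mean1 = sum(group1) / len(group1)
    mean0 = sum(group0) / len(group0)
    return (mean1 - mean0) / sd * math.sqrt(len(group1) * len(group0) / n ** 2)

# Hypothetical data: heights in cm, gender coded as 0 or 1
heights = [160.0, 165.0, 158.0, 172.0, 180.0, 175.0, 168.0, 183.0]
gender = [0, 0, 0, 1, 1, 1, 0, 1]

r_pearson = pearson(gender, heights)
r_pb = point_biserial(heights, gender)
print(r_pearson, r_pb)  # the two values agree up to floating-point error
```

Because the two results coincide, any routine that computes a Pearson correlation can be used for the point-biserial case once the binary variable is coded as 0 and 1.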

Creating and indexing lists in Python

A list is one of the simplest data structures in Python. It stores an ordered collection of elements, often of the same type (numeric, string, etc.), although elements of different types can be mixed. In Python, a pair of square brackets [] indicates that the data object is a list. For example, the following two statements create two lists, one numeric and the other of string type.
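A minimal sketch of two such statements, followed by a few indexing operations; the variable names and values are hypothetical and chosen only for illustration.

```python
# Create a numeric list and a string list using square brackets
numbers = [1, 2, 3, 4, 5]
names = ["Alice", "Bob", "Carol"]

# Indexing starts at 0; negative indices count from the end
first_number = numbers[0]    # 1
last_name = names[-1]        # "Carol"

# Slicing returns a new list; the end index is excluded
middle = numbers[1:3]        # [2, 3]

print(first_number, last_name, middle)
```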

How to install Anaconda and start programming with Python?

Python is among the most popular programming languages for data science nowadays, and getting started with it is quite easy. You can simply install a free platform such as Anaconda, which gives you direct access to Python together with many preinstalled modules (NumPy, pandas, matplotlib, etc.), an IDE (Spyder, etc.) and easy-to-use package management tools.

How to create factor variables in R programming

In the R programming language, categorical variables, both nominal and ordinal, are called factor variables. For example, gender (male/female) is nominal, while survey results (excellent, good, normal, bad) are ordinal. Categorical variables are useful because many data analysis operations depend on values in different categories, such as contingency tables between two categorical variables for independence analysis and hypothesis tests of homogeneity of variances, to name a few.