How to select variables in a data frame with R

We provide online training for R and Python, click here for more info! When a data frame is created in R, it sometimes contains dozens of variables, of which only a subset will be used in data analysis. Selecting these variables and saving them into a new object makes data management clear and concise. Say we have a data frame of student test scores, ‘grade’, on hand. And now, Read more…

How to drop variables from data frames in R

In this post we show several methods of dropping unwanted variables from a data frame in R. Say we have a data frame ‘grade’ in the current R working session, and we want to remove the variables ‘Math’ and ‘Physics’. The first method is simply to put the minus symbol (-) in front of the column number Read more…

How to sort datasets in R

The data frame is the most widely used object type in R data analysis, because it allows storing data of different modes in tabular form. Each row of a data frame represents one observation, and the columns denote the variables recorded for each observation. When the data is collected and read into a data frame, Read more…

How to rename variables in R

When we create a data frame in R, for example by reading a csv file from the working directory, the first row of the file is used as the column (variable) names of the data frame. We can then use the names() function to rename the variables as we like. In Read more…

How to read an Excel spreadsheet into a data frame in R

Data frames are the most widely used object type in R data analysis, because a data frame can hold variables of different modes (numeric, character, logical, etc.). A data frame in R can be created from many kinds of data sources. Excel files are such Read more…

How to create numerical lists in Python

Although a numerical list can be created manually in Python by typing each element within square brackets, analysts very often use a ready-made function instead. For example, in the following code example, we first use a for loop with the range() function to print out the numbers 1, 2, 3, 4, then use list() Read more…
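As a minimal sketch of the idea described in this excerpt (the variable names and the stepped example are illustrative assumptions, not taken from the original post), range() can drive a for loop and can also be passed to list() to build a numerical list:

```python
# Print the numbers 1 to 4 with a for loop over range().
# range(1, 5) stops before 5, so it yields 1, 2, 3, 4.
for i in range(1, 5):
    print(i)

# Build the same sequence as a list by passing the range to list().
numbers = list(range(1, 5))
print(numbers)  # [1, 2, 3, 4]

# Illustrative extra (an assumption, not from the post):
# a third argument to range() sets the step between elements.
odd_numbers = list(range(1, 10, 2))
print(odd_numbers)  # [1, 3, 5, 7, 9]
```

Note that the stop value passed to range() is excluded, which is why 5 is used to end the list at 4.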

Some simple differences between R and Python

Both R and Python are popular programming languages for data analysts. R is designed mainly for statistical data analysis, while Python is also widely used in deep learning and web applications in addition to general data analysis. Although R and Python have some points in common, i.e. they are both free and open Read more…

Test on an individual coefficient in a Multiple Linear Regression model

When a multiple linear regression model is estimated, the next step is usually checking the significance of each coefficient. Arbitrarily including an unnecessary variable in the model increases the SSR (regression sum of squares), and equivalently reduces the SSE (sum of squared errors), only a little, which does not warrant keeping the variable. So Read more…

How to create a heatmap using ggplot2 in R

A correlation heatmap is a grid-type visualization of the correlation matrix for every pair of variables in a data frame. In R, if you want to create such a heatmap, you first have to create a correlation matrix for the data frame using the cor() function. Then, using the melt() function from the reshape2 package in Read more…

Using a for loop to go through a list in Python

Looping in Python provides a convenient way to apply particular operations to each element of an object using just a single block of statements. A for loop over a list in Python first creates a temporary variable, then performs the expected series of operations in the block. Although the temporary variable name can be Read more…
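The loop pattern described in this excerpt can be sketched as follows. The list contents and the names scores, score, and x are hypothetical, chosen only for illustration. On each pass, the temporary variable takes the value of the next list element, and the indented block runs once per element:

```python
# Hypothetical sample data, not taken from the original post.
scores = [88, 92, 75]

total = 0
for score in scores:      # 'score' is the temporary loop variable
    print(score)          # runs once for each element of the list
    total += score        # accumulate a running sum

average = total / len(scores)
print(average)  # 85.0

# Renaming the temporary variable does not change the result.
doubled = []
for x in scores:
    doubled.append(x * 2)
print(doubled)  # [176, 184, 150]
```

The temporary variable keeps its last assigned value after the loop ends, so a descriptive name such as score tends to be easier to follow than a single letter.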