Working with duplicate values in Pandas Series with Python



When a Pandas Series is created in Python, its values can be checked for duplicates. Pandas provides several handy methods for working with duplicate values in a Series: unique() returns the unique values of the object, value_counts() lists the frequency of each unique value, and isin() returns a boolean Series indicating whether each element of the Series is found in the specified list. The example below shows how to use these methods in Python.

#Import Pandas and Numpy module
import pandas as pd
import numpy as np
#create a Series with duplicate values
S1 = pd.Series([32,19,201,7,32,19])
S1
#output
0     32
1     19
2    201
3      7
4     32
5     19
dtype: int64
#return unique values of the Series
S1.unique()
#result is a NumPy array (depending on platform and NumPy version, the dtype suffix may not be shown)
array([ 32,  19, 201,   7], dtype=int64)
#count frequency of unique values in the Series
S1.value_counts()
#output, result is a new Series
32     2
19     2
201    1
7      1
Name: count, dtype: int64
#check whether each value of the Series is in the specified list
S1.isin([32,19])
#result is a Series with boolean values
0     True
1     True
2    False
3    False
4     True
5     True
dtype: bool
#isin() can be used to filter values, and store to a new Series
S1[S1.isin([32,19])]
#result is a new Series, with fewer elements than original one
0    32
1    19
4    32
5    19
dtype: int64
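
The three methods above can also be combined. The following is a minimal sketch, not part of the original walkthrough: it uses value_counts() to find which values occur more than once, then uses isin() with the ~ operator to keep only the values that are not repeated. The variable names (s1, repeated_values, not_repeated) are chosen here for illustration only.

```python
import pandas as pd

# Same data as the Series used in the example above
s1 = pd.Series([32, 19, 201, 7, 32, 19])

# unique() keeps values in order of first appearance
unique_values = s1.unique().tolist()

# value_counts() returns a Series indexed by value;
# keep only the values that occur more than once
counts = s1.value_counts()
repeated_values = sorted(counts[counts > 1].index.tolist())

# ~ inverts the boolean mask from isin(),
# so this keeps the elements that are NOT in the list
not_repeated = s1[~s1.isin(repeated_values)]

print(unique_values)          # [32, 19, 201, 7]
print(repeated_values)        # [19, 32]
print(not_repeated.tolist())  # [201, 7]
```

Note that filtering with a boolean mask keeps the original index labels, so not_repeated is indexed by 2 and 3 rather than 0 and 1.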

