Pandas in 10 Minutes: Part 1

VARSHITHA GUDIMALLA
4 min read · May 3, 2021
Image by Ilona Froehlich on Unsplash

Pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language.

Pandas is built on top of the NumPy library, and its plotting methods use Matplotlib.

Installing Pandas

If you have Anaconda installed on your system, you can install pandas from your terminal or command prompt using:

conda install pandas

Otherwise, if pip is installed on your system, you can install pandas from your terminal or command prompt using:

pip install pandas

Importing pandas

import pandas as pd

Importing pandas under the alias “pd” lets us write “pd.” instead of “pandas.” whenever we call one of its functions.

Data Structures in pandas

Series

A Series is a 1-dimensional data structure, very similar to an array, in which every value has an index label.

This is a list in Python:

a = [3, 5, 2.71, -9.4, 8.432]
print(type(a))
a

Creating a Series

1. From a list

s = pd.Series(a)
print(type(s))
s

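2. With custom index labels. The index of a Series does not have to be integers, but it needs exactly one label per value.

idx = ['a', 'd', 'f', 'h', 'i']   # five labels for the five values in a
idx
s1 = pd.Series(a, idx)   # pd.Series(data, index)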
s1
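As a minimal sketch of how custom labels behave, the snippet below builds the same kind of Series and reads a value back by its label. It also shows what happens when the number of labels does not match the number of values. The names `values`, `labels`, and `error_message` are illustrative and not part of the original example.

```python
import pandas as pd

values = [3, 5, 2.71, -9.4, 8.432]
labels = ['a', 'd', 'f', 'h', 'i']  # one label per value

s1 = pd.Series(values, index=labels)

print(s1.index.tolist())  # the labels, in the order they were given
print(s1['f'])            # look up a value by its label
print(s1.dtype)           # float64, because the list mixes ints and floats

# Passing six labels for five values is rejected with a ValueError.
error_message = None
try:
    pd.Series(values, index=labels + ['t'])
except ValueError as err:
    error_message = str(err)
print("length mismatch:", error_message)
```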

3. Using a dictionary

dic = {"a" : 6, "b": 7, "c": "Disha", "d" : 30}
dic
s2 = pd.Series(dic)   # the dictionary keys become the index labels
s2
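To make the dictionary case concrete, here is a small sketch that inspects the resulting Series. It assumes nothing beyond the dictionary shown above. Because the values mix integers and a string, pandas stores them with the general-purpose `object` dtype.

```python
import pandas as pd

dic = {"a": 6, "b": 7, "c": "Disha", "d": 30}
s2 = pd.Series(dic)

print(s2.index.tolist())  # keys become labels: ['a', 'b', 'c', 'd']
print(s2.dtype)           # object, since the values are not all numeric
print(s2["c"])            # Disha
```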

Accessing the elements of Series

s[3]      # the value with index label 3
s2["c"]   # the value with index label "c"
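Square brackets look values up by label. As a hedged sketch, the explicit accessors `.loc` (by label) and `.iloc` (by position) make the intent clearer, which helps once the labels are no longer the default 0, 1, 2, and so on. The Series below are rebuilt from the earlier examples so the snippet runs on its own.

```python
import pandas as pd

s = pd.Series([3, 5, 2.71, -9.4, 8.432])
s2 = pd.Series({"a": 6, "b": 7, "c": "Disha", "d": 30})

print(s.loc[3])     # label 3 in the default index -> -9.4
print(s2.loc["c"])  # label "c" -> Disha
print(s2.iloc[0])   # first element by position -> 6
print(s2.iloc[-1])  # last element by position -> 30
```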

Arithmetic Operations on Series

s1 + s2
s2 + s2
s1 - s2

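Pandas aligns the two Series by index label before applying the operation. A label that appears in only one of them has no match, so the result for that label is NaN.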

s - s
s * s
s / s
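The alignment rule is easier to see with two small numeric Series that share only some labels. This is a sketch with made-up values (`x` and `y` are not from the article). It also shows `Series.add` with `fill_value`, one way to treat a missing label as 0 instead of getting NaN.

```python
import pandas as pd

x = pd.Series([1.0, 2.0, 3.0], index=['a', 'b', 'c'])
y = pd.Series([10.0, 20.0, 30.0], index=['b', 'c', 'd'])

total = x + y
print(total)
# 'a' and 'd' exist in only one Series, so they come out as NaN.
# 'b' and 'c' are added label by label: 2 + 10 = 12 and 3 + 20 = 23.

# fill_value=0 substitutes 0 for a label missing on one side.
filled = x.add(y, fill_value=0)
print(filled)
```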

Data Frame

A DataFrame is a 2-dimensional data structure that holds data in tabular form, with rows and columns.

It is the most commonly used data structure in pandas.

Creating a DataFrame

import numpy as np   # NumPy, another widely used Python library
df = pd.DataFrame(np.random.randn(5, 2), columns = list('AB'))
df
sample = {'name' : ['Riya', 'Sandy', 'Tonny', 'Alex'],
          'age' : [14, 24, 30, 38],
          'country' : ['India', 'New Zealand', 'Russia', 'Bangladesh']}
# creating a DataFrame from a dictionary: each key becomes a column
df1 = pd.DataFrame(sample)
df1
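After building a DataFrame it is often useful to check its size and labels. The following is a minimal sketch using the same `sample` dictionary; `shape`, `columns`, `index`, and `head` are standard DataFrame attributes and methods.

```python
import pandas as pd

sample = {
    'name': ['Riya', 'Sandy', 'Tonny', 'Alex'],
    'age': [14, 24, 30, 38],
    'country': ['India', 'New Zealand', 'Russia', 'Bangladesh'],
}
df1 = pd.DataFrame(sample)

print(df1.shape)             # (4, 3): four rows, three columns
print(df1.columns.tolist())  # ['name', 'age', 'country']
print(df1.index.tolist())    # [0, 1, 2, 3], the default row labels
print(df1.head(2))           # the first two rows
```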

Selecting columns from a data frame

df1['name']
df1[['name', 'age']]   # pass a list of column names to select several columns
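The two forms of selection return different types, which is a common source of confusion. As a short sketch on the same data: a single column name gives a Series, while a list of names gives a DataFrame, even when the list holds only one name.

```python
import pandas as pd

df1 = pd.DataFrame({
    'name': ['Riya', 'Sandy', 'Tonny', 'Alex'],
    'age': [14, 24, 30, 38],
    'country': ['India', 'New Zealand', 'Russia', 'Bangladesh'],
})

one_column = df1['name']            # a Series
two_columns = df1[['name', 'age']]  # a DataFrame with two columns
still_a_frame = df1[['name']]       # a DataFrame with one column

print(type(one_column).__name__)     # Series
print(type(two_columns).__name__)    # DataFrame
print(type(still_a_frame).__name__)  # DataFrame
```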

Adding columns

df1['year'] = [2006, 1996, 1990, 1982]
df1
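A new column can also be computed from an existing one instead of typed in by hand. The sketch below assumes that `year` is a birth year measured from 2020. That is only my reading of the numbers in the example, not something the article states. The `is_adult` column is a hypothetical addition for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    'name': ['Riya', 'Sandy', 'Tonny', 'Alex'],
    'age': [14, 24, 30, 38],
})

# Assumption: year = 2020 - age, which reproduces the values used above.
df1['year'] = 2020 - df1['age']

# Hypothetical extra column: a True/False flag derived from age.
df1['is_adult'] = df1['age'] >= 18

print(df1)
```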

Removing or dropping a column

df1 = df1.drop('year', axis = 1)
df1

To drop multiple columns, pass their names in a list, as shown below:

# 'column1' and 'column2' are placeholders for real column names
df1 = df1.drop(['column1', 'column2'], axis = 1)
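Here is a runnable sketch of dropping several columns at once, using real column names from the earlier example in place of the placeholders. It also shows why the result is assigned back: `drop` returns a new DataFrame and leaves the original unchanged. The names `df` and `trimmed` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Riya', 'Sandy', 'Tonny', 'Alex'],
    'age': [14, 24, 30, 38],
    'country': ['India', 'New Zealand', 'Russia', 'Bangladesh'],
    'year': [2006, 1996, 1990, 1982],
})

trimmed = df.drop(['age', 'year'], axis=1)

print(trimmed.columns.tolist())  # ['name', 'country']
print(df.columns.tolist())       # the original still has all four columns
```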

Removing or dropping rows

df1 = df1.drop(3)   # drops the row whose index label is 3
df1

axis = 1 refers to columns and axis = 0 refers to rows. The default value of axis is 0, which is why df1.drop(3) removes a row.
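If the axis numbers are hard to remember, `drop` also accepts the keyword arguments `index` and `columns`, which say directly what is being removed. The sketch below checks that both spellings give the same result on a small table; the variable names are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({
    'name': ['Riya', 'Sandy', 'Tonny', 'Alex'],
    'age': [14, 24, 30, 38],
    'year': [2006, 1996, 1990, 1982],
})

# Rows: axis=0 (the default) is equivalent to index=...
rows_by_axis = df1.drop(3, axis=0)
rows_by_keyword = df1.drop(index=3)

# Columns: axis=1 is equivalent to columns=...
cols_by_axis = df1.drop('year', axis=1)
cols_by_keyword = df1.drop(columns='year')

print(rows_by_axis.equals(rows_by_keyword))  # True
print(cols_by_axis.equals(cols_by_keyword))  # True
```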

Accessing elements

1. Accessing one column
df1['name']

2. Accessing multiple columns

df1[['name', 'age']]

3. Selecting rows of a DataFrame based on a condition

df1[df1['age'] > 18] # df1[condition]
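The condition inside the brackets is itself a Series of True/False values, and only the rows marked True are kept. As a sketch on the full four-row table, the snippet below also combines two conditions. Each one must be wrapped in parentheses and joined with `&` (and) or `|` (or) rather than the Python keywords `and` and `or`.

```python
import pandas as pd

df1 = pd.DataFrame({
    'name': ['Riya', 'Sandy', 'Tonny', 'Alex'],
    'age': [14, 24, 30, 38],
    'country': ['India', 'New Zealand', 'Russia', 'Bangladesh'],
})

mask = df1['age'] > 18
print(mask.tolist())  # [False, True, True, True]

adults = df1[mask]
print(adults['name'].tolist())  # ['Sandy', 'Tonny', 'Alex']

# Two conditions at once: older than 18 and not from Russia.
selected = df1[(df1['age'] > 18) & (df1['country'] != 'Russia')]
print(selected['name'].tolist())  # ['Sandy', 'Alex']
```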
