🔹 Introduction

In today’s world, data is the new oil. But raw data is messy — it has missing values, duplicates, unwanted columns, and more. This is where Pandas comes in.

Pandas is a Python library used for data manipulation, cleaning, and analysis.
Think of it as your personal Excel on steroids 🏋️‍♂️ — faster, more flexible, and made for programmers.

🔹 Why Learn Pandas?

Makes it much easier to work with tabular datasets, as long as they fit in memory.
Works well with CSV, Excel, SQL databases, JSON, etc.
Core library in Data Science & Machine Learning workflows.
Saves time for analysts and developers by simplifying data operations.

🔹 Pandas Data Structures

Series → A one-dimensional labeled array (like a single column).
Example:
```
import pandas as pd
s = pd.Series([10, 20, 30, 40])
print(s)
```
Output looks like:
```
0    10
1    20
2    30
3    40
dtype: int64
```
👉 Notice how Pandas automatically assigns index numbers (0, 1, 2, 3) and reports the data type of the values.
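
The default index can be swapped for labels of your own. Here is a minimal sketch; the labels 'a' to 'd' are made up for illustration:

```python
import pandas as pd

# Hypothetical labels; any hashable values can serve as an index.
s = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])

by_label = s['b']         # look up by label
by_position = s.iloc[1]   # look up by integer position
print(by_label, by_position)
```

Both lookups land on the same element here, because 'b' is the second label.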

DataFrame → A two-dimensional table with rows and named columns (like an Excel sheet).
Example:


```
data = {
    'Name': ['Aarav', 'Isha', 'Raj'],
    'Age': [23, 25, 21],
    'City': ['Delhi', 'Mumbai', 'Pune']
}
df = pd.DataFrame(data)
print(df)
```

Output:


```
    Name  Age    City
0  Aarav   23   Delhi
1   Isha   25  Mumbai
2    Raj   21    Pune
```

🔹 Beginner Level

1. Reading Data

Pandas can read from different file types.


```
df = pd.read_csv("data.csv")     # Read CSV
df = pd.read_excel("data.xlsx")  # Read Excel
```
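
If you don't have a data.csv handy, you can try `read_csv` on an in-memory string instead. The sketch below uses a tiny made-up dataset; `io.StringIO` simply makes the text behave like a file.

```python
import io
import pandas as pd

# Made-up CSV text standing in for a real file such as data.csv.
csv_text = """Name,Age,City
Aarav,23,Delhi
Isha,25,Mumbai
Raj,21,Pune
"""

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))
print(df)
```

The first line of the text becomes the column names, and Pandas infers that Age is numeric.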

2. Exploring Data


```
df.head()       # first 5 rows
df.tail()       # last 5 rows
df.shape        # (rows, columns) tuple; an attribute, so no parentheses
df.info()       # column types and non-null counts
df.describe()   # summary stats for numeric columns
```

Use case: Before analysis, always explore your dataset. Example: In sales data, you may check how many rows (transactions) and columns (features) exist.
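
To see what these calls return, here is a small sketch on the three-row table from earlier (rebuilt here so the snippet runs on its own):

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Aarav', 'Isha', 'Raj'],
    'Age': [23, 25, 21],
    'City': ['Delhi', 'Mumbai', 'Pune'],
})

n_rows, n_cols = df.shape      # shape is a (rows, columns) tuple
print(n_rows, n_cols)
print(df.dtypes)               # data type of each column
print(df['Age'].describe())    # count, mean, std, min, quartiles, max
```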

🔹 Intermediate Level

1. Selecting Columns & Rows


```
df['Name']           # single column (returns a Series)
df[['Name', 'Age']]  # multiple columns (returns a DataFrame)
df.loc[0]            # row by index label
df.iloc[1:3]         # rows by integer position (positions 1 and 2)
```
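
The difference between `loc` and `iloc` is easy to miss when the index is just 0, 1, 2. The sketch below uses made-up ID numbers as the index so that labels and positions no longer coincide:

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['Aarav', 'Isha', 'Raj'], 'Age': [23, 25, 21]},
    index=[101, 102, 103],   # hypothetical ID labels
)

row_by_label = df.loc[102]       # the row whose label is 102
row_by_position = df.iloc[1]     # the second row, whatever its label

label_slice = df.loc[101:102]    # label slices include the end label
position_slice = df.iloc[0:1]    # position slices exclude the end position
print(len(label_slice), len(position_slice))
```

Note that the two slices behave differently at the end point: the label slice keeps two rows, the position slice keeps one.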

2. Filtering


```
df[df['Age'] > 22]                              # rows where Age is above 22
df[(df['Age'] > 20) & (df['City'] == 'Delhi')]  # both conditions must hold
```

👉 Useful in real life: Filtering customers above 30 years old in sales data.
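
Under the hood, a filter is just a column of True/False values used to pick rows. A small sketch on the sample table, with `isin` shown as one possible shorthand for several "or" conditions:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Aarav', 'Isha', 'Raj'],
    'Age': [23, 25, 21],
    'City': ['Delhi', 'Mumbai', 'Pune'],
})

mask = df['Age'] > 22     # boolean Series, one value per row
older = df[mask]          # keeps only the rows where mask is True

# Combine conditions with & (and), | (or), ~ (not); wrap each one in parentheses.
delhi_or_pune = df[(df['City'] == 'Delhi') | (df['City'] == 'Pune')]
same_with_isin = df[df['City'].isin(['Delhi', 'Pune'])]
print(older)
```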

3. Modifying Data


```
df['Salary'] = [50000, 60000, 45000]  # add a column (one value per row)
df = df.drop(columns='Salary')        # remove it again; drop returns a new DataFrame
```
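
New columns are often computed from existing ones rather than typed in by hand. A minimal sketch, where the column names AgeNextYear and FirstName are invented for the example:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Aarav', 'Isha', 'Raj'],
    'Age': [23, 25, 21],
})

# Derive a column from an existing one (vectorized, no loop needed).
df['AgeNextYear'] = df['Age'] + 1

# rename and drop both return new DataFrames and leave df untouched.
renamed = df.rename(columns={'Name': 'FirstName'})
trimmed = df.drop(columns=['AgeNextYear'])
print(trimmed)
```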

4. Handling Missing Values


```
# Both calls return a new DataFrame; assign the result to keep it.
df.dropna()     # remove rows that contain nulls
df.fillna(0)    # replace nulls with 0
```
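
Before dropping or filling, it helps to count what is actually missing. The sketch below uses a made-up table with one missing age and fills it with the column mean, which is one common choice rather than the only sensible one:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['Aarav', 'Isha', 'Raj'],
    'Age': [23, np.nan, 21],
})

missing_per_column = df.isna().sum()            # nulls in each column
dropped = df.dropna()                           # rows with any null removed
filled = df.fillna({'Age': df['Age'].mean()})   # fill Age with its mean

print(missing_per_column)
print(filled)
```

The original `df` still contains the missing value afterwards, since neither call modifies it in place.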

🔹 Advanced Level

1. Grouping & Aggregation


```
df.groupby('City')['Age'].mean()
```

👉 Example: Average age of people in each city.
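
Grouping gets more interesting when a city appears more than once. A short sketch with made-up rows, also showing `agg` for computing several statistics in one pass:

```python
import pandas as pd

df = pd.DataFrame({
    'City': ['Delhi', 'Mumbai', 'Delhi', 'Pune', 'Mumbai'],
    'Age': [23, 25, 31, 21, 35],
})

avg_age = df.groupby('City')['Age'].mean()

# agg takes a list of functions and returns one column per function.
summary = df.groupby('City')['Age'].agg(['count', 'mean', 'max'])
print(summary)
```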

2. Merging Data


```
# df1 and df2 are two DataFrames that share an 'ID' column.
merged = df1.merge(df2, on='ID')  # inner join by default
```

👉 Example: Join customer data with their purchase history.
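
Here is a small sketch of that customer and purchase join. The two tables and their values are invented for illustration; the `how` argument controls which rows survive:

```python
import pandas as pd

# Hypothetical data: customer 2 has no purchases, purchase ID 4 has no customer.
customers = pd.DataFrame({'ID': [1, 2, 3], 'Name': ['Aarav', 'Isha', 'Raj']})
purchases = pd.DataFrame({'ID': [1, 1, 3, 4], 'Amount': [250, 100, 400, 75]})

# Inner join (the default): only IDs present in both tables.
inner = customers.merge(purchases, on='ID')

# Left join: every customer is kept; missing purchases show up as NaN.
left = customers.merge(purchases, on='ID', how='left')
print(left)
```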

3. Pivot Tables


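```
pd.pivot_table(df, values='Age', index='City', aggfunc='mean')
```

👉 Example: Average age in each city, laid out as a table. Swap in values='Salary' and index='Department' to get average salary by department.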
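
To make the salary-by-department idea concrete, here is a sketch on a made-up table. Adding `columns=` spreads a second category across the top, which is what sets a pivot table apart from a plain groupby:

```python
import pandas as pd

# Invented sample data for illustration.
df = pd.DataFrame({
    'Department': ['Sales', 'Sales', 'IT', 'IT'],
    'City': ['Delhi', 'Mumbai', 'Delhi', 'Mumbai'],
    'Salary': [50000, 60000, 70000, 80000],
})

# One row per department, averaged over all cities.
by_dept = pd.pivot_table(df, values='Salary', index='Department', aggfunc='mean')

# Departments as rows, cities as columns.
by_dept_city = pd.pivot_table(
    df, values='Salary', index='Department', columns='City', aggfunc='mean'
)
print(by_dept_city)
```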

4. Time Series


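```
df['Date'] = pd.to_datetime(df['Date'])  # parse strings into datetimes
df = df.set_index('Date')                # resample needs a datetime index
df.resample('M').mean(numeric_only=True) # monthly averages; pandas 2.2+ prefers 'ME' over 'M'
```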

👉 Example: Monthly sales analysis.
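
A minimal sketch of a monthly sales roll-up, using invented dates and amounts. It uses the 'MS' (month start) alias, chosen here because the month-end alias was renamed from 'M' to 'ME' in newer Pandas releases:

```python
import pandas as pd

# Made-up daily sales records.
df = pd.DataFrame({
    'Date': ['2024-01-05', '2024-01-20', '2024-02-10', '2024-02-25'],
    'Sales': [100, 200, 300, 500],
})
df['Date'] = pd.to_datetime(df['Date'])
df = df.set_index('Date')

# 'MS' groups rows by calendar month and labels each group with the month's first day.
monthly = df['Sales'].resample('MS').sum()
print(monthly)
```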

5. Visualization


```
df['Age'].plot(kind='bar')  # needs matplotlib installed
```

👉 Quick charts directly from Pandas.

🔹 Conclusion

Pandas is the backbone of data analysis in Python. From simple data cleaning to complex aggregations, it makes the job easier and faster.

Beginners → Learn how to load and explore data.
Intermediate → Work on selecting, filtering, modifying data, and handling missing values.
Advanced → Dive into grouping, merging, pivot tables, time series, and visualizations.

🚀 Once you master Pandas, you’re ready to step into Data Science, Machine Learning, and AI.

🐼 Pandas A to Z — Beginner to Advanced