Topic 5: What is pandas?
What is Pandas?
[Diagram: Pandas Workflow]
Why Do We Need Pandas?
Without Pandas, working with large datasets in plain Python is difficult: filtering, grouping, and cleaning data all mean writing loops over lists and dictionaries by hand.
Pandas helps by providing:
1. Efficient data handling – Load, modify, and analyze data easily.
2. Flexible data structures – Handle missing values, duplicates, etc.
3. Integration – Works well with NumPy, Matplotlib, Scikit-learn, etc.
4. Data import/export – Easily read and write data in CSV, Excel, JSON, SQL, etc.
5. Faster data processing – Column-wise (vectorized) operations built on NumPy are usually much faster than equivalent Python loops.
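To make the contrast with plain Python concrete, here is a minimal sketch that applies the same operation once with an explicit loop and once on a whole Pandas column. The `ages` list and variable names are made up for this illustration.

```python
import pandas as pd

ages = [25, 30, 35]  # hypothetical sample data

# Plain Python: visit every value explicitly
ages_next_year_py = [age + 1 for age in ages]

# Pandas: the operation is applied to the whole column at once
ages_next_year_pd = pd.Series(ages) + 1

print(ages_next_year_py)           # [26, 31, 36]
print(ages_next_year_pd.tolist())  # [26, 31, 36]
```

On three values the difference is only in readability; the speed benefit shows up on large columns.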
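Core Data Structures in Pandas:
Pandas has two core data structures: the Series, a one-dimensional labeled array, and the DataFrame, a two-dimensional labeled table whose columns are Series.
[Diagram: Core Data Structures in Pandas]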
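As a small illustration of how the two structures relate, the sketch below selects one column of a DataFrame and checks that the result is a Series sharing the same row index. The sample data and the names `people` and `age` are assumptions for this example only.

```python
import pandas as pd

people = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})

age = people['Age']                    # selecting one column gives a Series
print(type(people).__name__)           # DataFrame
print(type(age).__name__)              # Series
print(age.index.equals(people.index))  # True
```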
Common Pandas Built-in Functions with Examples
1️⃣ Creating Series and DataFrame
```python
import pandas as pd

# Series: a one-dimensional labeled array
s = pd.Series([10, 20, 30, 40])
print(s)

# DataFrame: a two-dimensional table built from a dict of columns
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['Delhi', 'Mumbai', 'Chennai']}
df = pd.DataFrame(data)
print(df)
```
2️⃣ head() and tail()
Display the first or last few rows (5 by default).
```python
print(df.head(2))  # First 2 rows
print(df.tail(1))  # Last row
```
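When no argument is passed, both methods return 5 rows. The following sketch shows this on a hypothetical 10-row DataFrame named `numbers`, which is not part of the examples above.

```python
import pandas as pd

numbers = pd.DataFrame({'n': range(10)})  # rows 0..9

print(len(numbers.head()))            # 5
print(len(numbers.tail()))            # 5
print(numbers.tail(3)['n'].tolist())  # [7, 8, 9]
```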
3️⃣ info()
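Shows basic information about the DataFrame: the index range, column names, non-null counts, data types, and memory usage.
```python
df.info()  # Prints the summary directly; no print() needed
```
Output (truncated):
```
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
```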
4️⃣ describe()
Provides a statistical summary of the numeric columns.
```python
print(df.describe())
```
Output:
| | Age |
| ----- | ---- |
| count | 3.0 |
| mean | 30.0 |
| std | 5.0 |
| min | 25.0 |
| 25% | 27.5 |
| 50% | 30.0 |
| 75% | 32.5 |
| max | 35.0 |
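Each statistic that describe() reports can also be computed on its own, which helps when only one number is needed. This is a minimal sketch using the same three ages; the standalone `age` Series is an assumption made so the block runs by itself.

```python
import pandas as pd

age = pd.Series([25, 30, 35], name='Age')

print(age.count())         # 3
print(age.mean())          # 30.0
print(age.std())           # 5.0 (sample standard deviation)
print(age.quantile(0.25))  # 27.5
print(age.median())        # 30.0
print(age.max())           # 35
```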
5️⃣ shape
Returns the dimensions of the DataFrame as a (rows, columns) tuple. It is an attribute, so it is used without parentheses.
```python
print(df.shape)
```
Output: (3, 3)
6️⃣ columns and index
Get the column names and the row index.
```python
print(df.columns)  # Column labels
print(df.index)    # Row labels
```
7️⃣ sort_values()
Sort rows by the values in one or more columns. The result is a new DataFrame; the original is left unchanged.
```python
print(df.sort_values(by='Age', ascending=False))  # Oldest first
```
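Sorting can also use several columns at once, with a separate direction for each. The sketch below uses a hypothetical `people` DataFrame with tied ages to show how the second column breaks ties.

```python
import pandas as pd

people = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'Dev'],
    'Age': [30, 25, 30, 25],
})

# Sort by Age (descending), then by Name (ascending) to break ties
ranked = people.sort_values(by=['Age', 'Name'], ascending=[False, True])
print(ranked['Name'].tolist())  # ['Alice', 'Charlie', 'Bob', 'Dev']

# The original DataFrame keeps its order
print(people['Name'].tolist())  # ['Alice', 'Bob', 'Charlie', 'Dev']
```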
8️⃣ iloc[] and loc[]
Access specific rows and columns: iloc[] selects by integer position, loc[] selects by label.
```python
print(df.iloc[0])         # First row (by integer position)
print(df.loc[1, 'City'])  # Value at row label 1, column 'City'
```
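With the default 0, 1, 2 index, labels and positions happen to coincide, which hides the difference between the two accessors. The sketch below uses a hypothetical `cities` DataFrame with string labels to make the distinction visible.

```python
import pandas as pd

cities = pd.DataFrame(
    {'City': ['Delhi', 'Mumbai', 'Chennai']},
    index=['a', 'b', 'c'],  # hypothetical string row labels
)

print(cities.loc['b', 'City'])  # Mumbai (row labeled 'b')
print(cities.iloc[1, 0])        # Mumbai (second row, first column)

# Label slices include the end label; position slices exclude the end position
print(cities.loc['a':'b', 'City'].tolist())  # ['Delhi', 'Mumbai']
print(cities.iloc[0:1, 0].tolist())          # ['Delhi']
```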
9️⃣ isnull() and dropna()
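Detect and remove missing data. Both methods return a new object and leave df unchanged.
```python
print(df.isnull())  # Boolean DataFrame: True where a value is missing
print(df.dropna())  # Copy of df without the rows that contain missing values
```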
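The sample DataFrame used so far has no missing values, so these calls show nothing interesting on it. Here is a minimal sketch on a hypothetical `scores` DataFrame that does contain a gap.

```python
import numpy as np
import pandas as pd

scores = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Score': [90.0, np.nan, 75.0],  # Bob's score is missing
})

# Count the missing values in one column
print(int(scores['Score'].isnull().sum()))  # 1

# Drop rows with any missing value; scores itself is not modified
cleaned = scores.dropna()
print(cleaned['Name'].tolist())  # ['Alice', 'Charlie']
print(len(scores))               # 3
```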
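🔟 fillna()
Fill missing values with a specified value. The result is a new DataFrame; df itself is not modified.
```python
df_filled = df.fillna(value=0)  # Replace every missing value with 0
```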
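A single fill value rarely suits every column. As a sketch, fillna() also accepts a dict that maps column names to fill values; the `records` DataFrame and the choice of the column mean as a replacement are assumptions for this example.

```python
import numpy as np
import pandas as pd

records = pd.DataFrame({
    'City': ['Delhi', None, 'Chennai'],
    'Age': [25.0, np.nan, 35.0],
})

# Use a placeholder for text and the column mean for numbers
filled = records.fillna({'City': 'Unknown', 'Age': records['Age'].mean()})
print(filled['City'].tolist())  # ['Delhi', 'Unknown', 'Chennai']
print(filled['Age'].tolist())   # [25.0, 30.0, 35.0]
```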
1️⃣1️⃣ groupby()
Group rows by the values in one or more columns and apply an aggregate function to each group.
```python
group = df.groupby('City')['Age'].mean()  # Mean age per city
print(group)
```
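Several aggregates can be computed per group in one call with agg(). The sketch below assumes a hypothetical `staff` DataFrame in which each city appears more than once, so the grouping has something to combine.

```python
import pandas as pd

staff = pd.DataFrame({
    'City': ['Delhi', 'Mumbai', 'Delhi', 'Mumbai'],
    'Age': [25, 30, 35, 40],
})

# One row per city, one column per aggregate
summary = staff.groupby('City')['Age'].agg(['mean', 'max', 'count'])
print(summary)
print(summary.loc['Delhi', 'mean'])  # 30.0
print(summary.loc['Mumbai', 'max'])  # 40
```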
1️⃣2️⃣ merge(), concat(), join()
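Combine multiple DataFrames. merge() joins on shared key columns, concat() stacks DataFrames along an axis, and join() combines on the index. This example uses merge():
```python
df1 = pd.DataFrame({'ID': [1, 2], 'Name': ['A', 'B']})
df2 = pd.DataFrame({'ID': [1, 2], 'Salary': [50000, 60000]})
result = pd.merge(df1, df2, on='ID')  # Inner join on the 'ID' column
print(result)
```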
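The heading also names concat() and join(), which the merge() example does not cover. Here is a minimal sketch of both; the extra DataFrame `df3` and the `salaries` table are hypothetical.

```python
import pandas as pd

df1 = pd.DataFrame({'ID': [1, 2], 'Name': ['A', 'B']})
df3 = pd.DataFrame({'ID': [3], 'Name': ['C']})  # hypothetical extra rows

# concat(): stack rows; ignore_index=True renumbers the result from 0
stacked = pd.concat([df1, df3], ignore_index=True)
print(stacked['ID'].tolist())  # [1, 2, 3]

# join(): aligns on the index, so make 'ID' the index on both sides first
salaries = pd.DataFrame({'ID': [1, 2], 'Salary': [50000, 60000]}).set_index('ID')
joined = df1.set_index('ID').join(salaries)
print(joined.loc[2, 'Salary'])  # 60000
```

As a rule of thumb, merge() fits key columns, join() fits data already indexed by the key, and concat() fits DataFrames with the same columns that need to be stacked.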
1️⃣3️⃣ read_csv() & to_csv()
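Read data from a CSV file and write a DataFrame back to one.
```python
df = pd.read_csv('data.csv')          # Load 'data.csv' into a DataFrame
df.to_csv('output.csv', index=False)  # Save without the row index column
```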
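To try the round trip without creating files on disk, both functions also accept file-like objects. This sketch uses an in-memory `io.StringIO` buffer; the CSV text is made up for the example.

```python
import io
import pandas as pd

csv_text = "Name,Age\nAlice,25\nBob,30\n"  # hypothetical file contents

# Read CSV text as if it were a file
loaded = pd.read_csv(io.StringIO(csv_text))
print(loaded.shape)  # (2, 2)

# Write back to an in-memory buffer instead of a file on disk
buffer = io.StringIO()
loaded.to_csv(buffer, index=False)
print(buffer.getvalue().splitlines())  # ['Name,Age', 'Alice,25', 'Bob,30']
```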
1️⃣4️⃣ value_counts()
Count how often each unique value appears in a column. The most frequent value is listed first.
```python
print(df['City'].value_counts())
```
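Passing normalize=True returns proportions instead of raw counts. The sketch below uses a hypothetical `cities` Series with a repeated value so the counts differ.

```python
import pandas as pd

cities = pd.Series(['Delhi', 'Mumbai', 'Delhi', 'Chennai', 'Delhi'], name='City')

counts = cities.value_counts()
print(counts.index[0], counts.iloc[0])  # Delhi 3

shares = cities.value_counts(normalize=True)
print(shares['Delhi'])  # 0.6
```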
Summary
| Function | Purpose |
| ---------------------------------- | ----------------------- |
| `head()` / `tail()` | Show first/last rows |
| `info()` | Data info summary |
| `describe()` | Statistics summary |
| `shape` | Dimensions of DataFrame |
| `sort_values()` | Sort data |
| `isnull()`, `dropna()`, `fillna()` | Handle missing data |
| `groupby()` | Aggregate data |
| `merge()`, `concat()` | Combine DataFrames |
| `read_csv()`, `to_csv()` | Data import/export |

