Python at Unimelb
pandas 3


Last updated 1 year ago


Table of Contents:

Pandas part 3 covers the following topics:

  1. Index and MultiIndex:

An index is a set of labels that identifies each row of a DataFrame or each element of a Series. It works like an address, enabling fast, label-based access and modification of the data. Labels are usually unique, although Pandas does not require this. An index can be generated automatically as a range of integers, or set explicitly to custom labels such as strings or datetime objects, which makes data manipulation and retrieval more intuitive. A MultiIndex extends the same idea to several levels of labels per row, which suits hierarchical data.
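As a minimal sketch of the ideas above, the following example builds a Series with string labels, a DataFrame with the default numeric index, and a Series with a two-level MultiIndex. All names and numbers here are invented for illustration and are not taken from the workshop files.

```python
import pandas as pd

# Hypothetical fruit prices, invented for illustration.
prices = pd.Series([1.2, 0.8, 2.5], index=["apple", "orange", "mango"])

# Label-based lookup goes through the index.
apple_price = prices.loc["apple"]

# Without an explicit index, Pandas generates integer labels starting at 0.
df = pd.DataFrame({"city": ["Melbourne", "Sydney"], "temp": [21.5, 24.0]})
default_labels = list(df.index)

# A MultiIndex labels each row with a tuple, one entry per level.
multi = pd.MultiIndex.from_tuples(
    [("Melbourne", 2023), ("Melbourne", 2024), ("Sydney", 2023)],
    names=["city", "year"],
)
rain = pd.Series([600.0, 550.0, 1200.0], index=multi)  # made-up values

# Selecting on the first level returns every row under that label,
# indexed by the remaining level ("year").
melbourne_rain = rain.loc["Melbourne"]
```

Selecting with a full tuple, such as `rain.loc[("Sydney", 2023)]`, returns a single value instead of a sub-Series.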

  2. Reindex and Set Index:

In Pandas, reindex conforms a DataFrame or Series to a new set of row or column labels. It can reorder existing labels and introduce new ones; labels with no existing data are filled with NaN or another specified value, which makes it useful for aligning data and handling missing values. By contrast, set_index makes one or more columns of a DataFrame its index, replacing the existing row labels, so that rows can be accessed by those column values. Both methods return a new object by default rather than modifying the original.
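The difference between the two methods can be sketched as follows. The student names and scores are hypothetical, chosen only to show what happens to a label ("Dee") that has no existing data.

```python
import pandas as pd

# Hypothetical scores, invented for illustration.
df = pd.DataFrame({"student": ["Ana", "Ben", "Cy"], "score": [90, 75, 82]})

# set_index returns a new DataFrame; df itself keeps its "student" column.
by_student = df.set_index("student")

# reindex aligns rows to the requested labels, in the requested order.
# "Dee" has no data, so that row is filled with NaN.
reordered = by_student.reindex(["Cy", "Ana", "Dee"])

# fill_value supplies a replacement for the NaN that would otherwise appear.
filled = by_student.reindex(["Cy", "Ana", "Dee"], fill_value=0)
```

Note that "Ben" is dropped from both results because the new label list does not include it.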

  3. Groupby:

In Pandas, groupby is a method for splitting data into groups based on one or more keys and then applying a function to each group independently. The function can aggregate (one result per group), transform (a result aligned with the original rows), or filter (keep or drop whole groups). This technique is particularly useful for analyzing subsets of data with operations such as summing, counting, averaging, or custom functions, in order to understand patterns or relationships within the data.
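The three kinds of per-group operation described above can be illustrated with a small sketch. The sales table is made up for this example, and the threshold of 6 is an arbitrary assumption.

```python
import pandas as pd

# Hypothetical sales records, invented for illustration.
sales = pd.DataFrame({
    "fruit": ["apple", "orange", "apple", "orange", "apple"],
    "qty": [3, 5, 2, 1, 4],
})

# Aggregation: one result per group.
totals = sales.groupby("fruit")["qty"].sum()

# Transformation: the result has one value per original row.
sales["group_mean"] = sales.groupby("fruit")["qty"].transform("mean")

# Filtration: keep only the groups whose total quantity exceeds 6.
big_groups = sales.groupby("fruit").filter(lambda g: g["qty"].sum() > 6)
```

Here `totals` holds 9 for apple and 6 for orange, so only the apple rows survive the filter.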

  4. Count and Value Counts:

In Pandas, value_counts() is typically called on a Series, such as a single DataFrame column, and returns the number of occurrences of each unique value, sorted from most to least frequent. This frequency distribution is useful for understanding categorical data. By contrast, count() returns the number of non-null (non-NA) entries: a single number for a Series, or one number per column (or per row with axis=1) for a DataFrame. It is helpful for identifying missing data and for judging how many valid data points a dataset contains.
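A short sketch of both methods on a table with missing values follows. The column names and records are hypothetical and do not reflect the contents of the workshop's attendance file.

```python
import numpy as np
import pandas as pd

# Hypothetical attendance records with some missing entries.
attendance = pd.DataFrame({
    "status": ["present", "absent", "present", None, "present"],
    "minutes": [50.0, np.nan, 45.0, np.nan, 60.0],
})

# Frequency of each unique value; missing values are skipped by default.
status_counts = attendance["status"].value_counts()

# Number of non-null entries in each column.
non_null_per_column = attendance.count()

# Number of non-null entries in each row.
non_null_per_row = attendance.count(axis=1)
```

Passing `dropna=False` to `value_counts()` would include the missing entry as its own category.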

  5. Sort Values:

In Pandas, the sort_values() method sorts a DataFrame by the values of one or more columns. You specify the column or columns to sort by and the sorting order (ascending, the default, or descending). It returns a sorted copy by default and leaves the original unchanged. This is useful for organizing data in a meaningful order, such as sorting sales data from highest to lowest or arranging dates chronologically, which makes the data easier to read and analyze.
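As an illustration, the sketch below sorts a small, invented sales table first by one column and then by two columns with different directions.

```python
import pandas as pd

# Hypothetical sales figures, invented for illustration.
sales = pd.DataFrame({
    "region": ["north", "south", "north", "south"],
    "amount": [250, 400, 175, 320],
})

# Sort by a single column, highest amount first.
by_amount = sales.sort_values("amount", ascending=False)

# Sort by region (A to Z), then by amount within each region (high to low).
by_region_then_amount = sales.sort_values(
    ["region", "amount"], ascending=[True, False]
)
```

The original row labels travel with the rows, so `by_region_then_amount` keeps the index values 0, 2, 1, 3 in that order.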

Pandas 3 Documents 📚

  • weather_data.csv (33MB)
  • attendance.csv (81KB)
  • apples_and_oranges.csv (596B)
  • pandas3.ipynb (13KB)
Indexes in a DataFrame identify the rows and can be changed or replaced with other values.
Values in a DataFrame can be sorted and grouped.