Handling data with pandas

Overview

You can consult the slides about pandas at the following link

In this folder you will find some laboratories about database management with pandas (a Python library). The purpose is not to present notebooks ready for deployment, but rather to demonstrate some methods and functions.

Dataframe structure

The DataFrame is the main object used to interact with data.
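As a rough illustration of that structure, the sketch below builds a small DataFrame and inspects its shape, column labels, and dtypes. The column names and values are hypothetical, made up for this example; they do not come from the lab datasets.

```python
import pandas as pd

# Hypothetical toy data, made up for illustration only.
df = pd.DataFrame(
    {
        "department": ["A", "B", "C"],
        "population": [100, 250, 80],
        "gdp": [1.5, 3.2, 0.9],
    }
)

n_rows, n_cols = df.shape        # (number of rows, number of columns)
column_names = list(df.columns)  # column labels
column_types = df.dtypes         # one dtype per column
gdp_series = df["gdp"]           # a single column is a Series
first_row = df.iloc[0]           # a single row, selected by position

print(df)
print(column_types)
```

Each column behaves like a Series that shares the same row index, which is why most operations in the laboratories are written column by column.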

Laboratories

The objective is to help you find how to solve specific problems by keyword.

Load data

  1. Load from a URL (video)

  2. Load from a local file (video)

Change name and drop columns

  1. Change name of columns colab

  2. Create and drop columns

Counting missing values

  1. Nans colab

Filter values

  1. Filter lab I

  2. Filter lab II

Groupby

  1. Bankrupt by sector

  2. Introduction to groupby

  3. Exercise using groupby

Dates

  1. Intro dates

  2. Count transactions

Duplicate values

  1. Lab duplicated

  2. Lab duplicated groupby

Summation or mean over columns or rows

Sometimes you need to calculate the mean or sum across a set of columns (one result per row) or over all the values of a single variable (one result per column).

  1. Lab mean-sum
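The two cases can be sketched as follows. This is a minimal example on hypothetical columns (q1, q2, q3), not the data used in the lab; the key point is the axis argument, where axis=1 aggregates across columns for each row.

```python
import pandas as pd

# Hypothetical toy data, made up for illustration only.
df = pd.DataFrame(
    {
        "q1": [10.0, 20.0, 30.0],
        "q2": [1.0, 2.0, 3.0],
        "q3": [4.0, 5.0, 6.0],
    }
)

cols = ["q1", "q2", "q3"]

# Row-wise: combine a set of columns into one value per row.
df["row_sum"] = df[cols].sum(axis=1)
df["row_mean"] = df[cols].mean(axis=1)

# Column-wise: aggregate all the values of one variable.
q1_total = df["q1"].sum()
q1_mean = df["q1"].mean()

print(df)
print(q1_total, q1_mean)
```

Selecting the columns explicitly with df[cols] keeps the newly created row_sum column out of the row_mean calculation.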

Replace

  1. Replace values

Merge and append dataframes

  1. Basic merge

  2. Loop-merge

  3. Variables consistency

Imputing values

  1. Basic imputation

Practical laboratories

The following laboratories review some concepts with practical results. To work through them, download the database pibs-dptos.xlsx (Click here)

  1. Health lab

  2. Gdps lab I

  3. Gdps lab II

  4. World bank data

  5. TMR lab

  6. Hanushek lab

Regular expressions (Regex)

This topic deserves its own section given its importance. At first, regex can seem messy, but with practice its use becomes more intuitive.
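To give a flavor of how regex is typically used with pandas, here is a minimal sketch built on the string accessor methods str.contains, str.extract, and str.replace. The sample strings and patterns are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample strings, made up for illustration only.
s = pd.Series(["lab-01 groupby", "lab-02 merge", "notes", "lab-10 regex"])

# Boolean mask: which entries start with "lab-" followed by digits?
is_lab = s.str.contains(r"^lab-\d+", regex=True)

# Capture the digits after "lab-"; rows without a match become NaN.
lab_number = s.str.extract(r"lab-(\d+)", expand=False)

# Remove the "lab-NN " prefix wherever it appears.
cleaned = s.str.replace(r"^lab-\d+\s*", "", regex=True)

print(is_lab.tolist())
print(lab_number.tolist())
print(cleaned.tolist())
```

The same pattern language serves all three tasks (filtering, extracting, and replacing), which is a large part of why the topic rewards practice.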

Regex lab

regex basic

Laboratories I

Reshaping data

Table one

  1. Table one bankrupt

Visualization

Hands-on Pandas workshop

Solutions Pandas workshop