pandas

What is pandas, and how is it installed?
pandas is a fast, powerful, flexible, and easy-to-use open source data analysis and manipulation tool, built on top of the Python programming language.

Installation:
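
A minimal sketch, assuming pip (or conda) is already available on your machine: pandas is usually installed from the command line, and the install can then be checked from Python. The shell commands are shown as comments because they are run outside Python.

```python
# Install from a shell first (not from inside Python):
#     pip install pandas
# or, with Anaconda/Miniconda:
#     conda install pandas

import pandas as pd  # "pd" is the conventional alias

# If the import succeeds, pandas is installed; print its version.
installed_version = pd.__version__
print(installed_version)
```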

Dataframe Basics:

DataFrame is the main object of pandas. It is used to represent tabular data (with rows and columns).

1) What is a dataframe?

A pandas DataFrame is a two-dimensional, size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns). In other words, the data is aligned in a tabular fashion in rows and columns. A DataFrame consists of three principal components: the data, the rows, and the columns.
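
To make the three components concrete, here is a small sketch. The column names and values are made up for illustration; they are not from any particular dataset.

```python
import pandas as pd

# Hypothetical sample data: columns of different types (heterogeneous).
df = pd.DataFrame(
    {
        "city": ["Pune", "Delhi", "Mumbai"],
        "temperature": [31.5, 35.0, 29.8],
        "rainy": [False, False, True],
    },
    index=["a", "b", "c"],  # row labels
)

print(df.values)   # the data
print(df.index)    # the row labels
print(df.columns)  # the column labels
print(df.shape)    # (number of rows, number of columns)

# Size-mutable: a column can be added after creation.
df["humidity"] = [40, 22, 75]
```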

2) Create a dataframe from a csv file and a python dictionary
3) Dealing with rows and columns
4) Operations: mean, max, std, describe
5) Conditional selection
6) set_index function and its usefulness
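
The row/column handling, the basic operations, conditional selection and set_index from the list above can be sketched on one small dataframe. The weather data here is hypothetical and only serves the illustration.

```python
import pandas as pd

# Hypothetical weather data.
df = pd.DataFrame(
    {
        "day": ["mon", "tue", "wed", "thu"],
        "temperature": [32, 35, 28, 24],
        "event": ["Rain", "Sunny", "Snow", "Snow"],
    }
)

# Rows and columns
first_two_rows = df.head(2)
temps = df["temperature"]        # one column (a Series)
subset = df[["day", "event"]]    # several columns (a DataFrame)

# Operations
mean_temp = df["temperature"].mean()
max_temp = df["temperature"].max()
std_temp = df["temperature"].std()
summary = df.describe()          # count, mean, std, min, quartiles, max

# Conditional selection
hot_days = df[df["temperature"] > 30]
hottest_day = df.loc[df["temperature"] == max_temp, "day"]

# set_index: use "day" as the row labels so rows can be looked up by label.
by_day = df.set_index("day")
wed_row = by_day.loc["wed"]
```

set_index is useful mainly because label-based lookups such as by_day.loc["wed"] replace a boolean filter on the "day" column.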

Different Ways of Creating Data frames:

Create dataframe using read_csv() method
Create dataframe using read_excel() method
Create dataframe using python dictionary DataFrame() method
Create dataframe using tuples list DataFrame() method
Create dataframe using the list of dictionary DataFrame() method
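
A short sketch of these creation routes, building the same two-row table each way. The data is hypothetical, and an in-memory io.StringIO buffer stands in for a CSV file so the example runs without any files on disk. read_excel is shown only as a comment because it needs an Excel engine such as openpyxl and a real file; the file name there is a placeholder.

```python
import io
import pandas as pd

# 1) From CSV. A StringIO buffer stands in for a file path here;
#    pd.read_csv("some_file.csv") is called the same way.
csv_text = "day,temperature\nmon,32\ntue,35\n"
df_csv = pd.read_csv(io.StringIO(csv_text))

# 2) From Excel (needs an Excel engine such as openpyxl installed):
#    df_xlsx = pd.read_excel("some_file.xlsx", sheet_name="Sheet1")

# 3) From a dictionary: keys become column names.
df_dict = pd.DataFrame({"day": ["mon", "tue"], "temperature": [32, 35]})

# 4) From a list of tuples: each tuple is a row, column names passed separately.
rows = [("mon", 32), ("tue", 35)]
df_tuples = pd.DataFrame(rows, columns=["day", "temperature"])

# 5) From a list of dictionaries: each dictionary is a row.
records = [
    {"day": "mon", "temperature": 32},
    {"day": "tue", "temperature": 35},
]
df_records = pd.DataFrame(records)
```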

Read and Write Excel and CSV Files in pandas:

1) Different options for cleaning up messy data while reading csv/excel files
2) Use converters to transform data read from an excel file
3) Export only a portion of a dataframe to an excel file

Topics covered:

Read CSV file using read_csv() method
Skip rows in dataframe using "skiprows"
Import data from CSV file with "null header"
Read limited data from CSV file
Clean up messy data from file: replace "not available" and "n.a." using "na_values"
Supply a dictionary to "na_values"
Write dataframe into "csv" file with to_csv() method
Read excel file using read_excel() method
Converters argument in read_excel() method
Write dataframe into "excel" file with to_excel() method
Use ExcelWriter() class
All properties for Read Write Excel CSV File
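
The CSV options listed above can be sketched as follows. The stock data is made up for illustration, and io.StringIO buffers stand in for real files. The converters argument is demonstrated with read_csv, which accepts it just as read_excel does; the Excel calls themselves are left as comments because they need an engine such as openpyxl, and the file names in them are placeholders.

```python
import io
import pandas as pd

# Hypothetical messy CSV: a junk first line, and "not available" / "n.a."
# used to mark missing values.
raw = (
    "stock prices export\n"
    "ticker,eps,revenue,people\n"
    "GOOGL,27.82,87,larry page\n"
    "WMT,4.61,484,n.a.\n"
    "MSFT,-1,85,bill gates\n"
    "RIL,not available,50,mukesh ambani\n"
)

# skiprows: skip the junk line so the real header row is used.
df = pd.read_csv(io.StringIO(raw), skiprows=1)

# header=None + names: for files without a header row ("null header").
no_header = "GOOGL,27.82\nWMT,4.61\n"
df_named = pd.read_csv(io.StringIO(no_header), header=None, names=["ticker", "eps"])

# nrows: read only the first N data rows.
df_two = pd.read_csv(io.StringIO(raw), skiprows=1, nrows=2)

# na_values as a list: these markers become NaN in every column.
df_na = pd.read_csv(io.StringIO(raw), skiprows=1, na_values=["not available", "n.a."])

# na_values as a dictionary: different markers per column.
df_na_dict = pd.read_csv(
    io.StringIO(raw),
    skiprows=1,
    na_values={"eps": ["not available"], "people": ["n.a."]},
)


# converters: transform a column's raw text while reading.
def clean_people(cell):
    return "unknown" if cell == "n.a." else cell


df_conv = pd.read_csv(io.StringIO(raw), skiprows=1, converters={"people": clean_people})

# to_csv: write without the index, and only some of the columns.
out = io.StringIO()
df_na.to_csv(out, index=False, columns=["ticker", "eps"])
csv_out = out.getvalue()

# Excel works along the same lines but needs an engine such as openpyxl:
#   df_na.to_excel("stocks.xlsx", sheet_name="stocks", index=False)
#   with pd.ExcelWriter("report.xlsx") as writer:
#       df_na.to_excel(writer, sheet_name="stocks")
```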

How to Handle Missing Data:

Missing data in pandas can be handled with the fillna, interpolate and dropna methods. You can fill missing values using a value or list of values, or use one of the interpolation methods.

Convert string column into the date type
Use date as an index of dataframe using set_index() method
Use fillna() method in dataframe
Use fillna(method="ffill") method in dataframe
Use fillna(method="bfill") method in dataframe
"axis" parameter in fillna() method in dataframe
"limit" parameter in fillna() method in dataframe
interpolate() to do interpolation in dataframe
interpolate() method "time"
dropna() method
Drop all the rows which have "na" in dataframe
"how" parameter in dropna() method
"thresh" parameter in dropna() method
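
A sketch of most of these steps on hypothetical daily readings. One assumption to note: recent pandas releases deprecate fillna(method="ffill") and fillna(method="bfill") in favour of the equivalent ffill() and bfill() methods, so the sketch uses those.

```python
import pandas as pd

# Hypothetical daily readings with gaps; "day" starts out as strings.
df = pd.DataFrame(
    {
        "day": ["2017-01-01", "2017-01-04", "2017-01-05", "2017-01-06", "2017-01-07"],
        "temperature": [32.0, None, 28.0, None, 32.0],
        "event": ["Rain", "Sunny", None, None, "Rain"],
    }
)

# Convert the string column to dates and use it as the index.
df["day"] = pd.to_datetime(df["day"])
df = df.set_index("day")

# fillna with a single value, or a dictionary of per-column values.
temp_zero = df["temperature"].fillna(0)
filled_per_column = df.fillna({"temperature": 0, "event": "no event"})

# Forward fill / backward fill (carry the previous / next value).
forward = df.ffill()
backward = df.bfill()
forward_limited = df.ffill(limit=1)  # fill at most one consecutive gap

# Interpolation: linear by position, or weighted by the date index.
linear = df["temperature"].interpolate()
by_time = df["temperature"].interpolate(method="time")

# dropna: drop rows with any NaN, rows that are entirely NaN,
# or rows with fewer than `thresh` non-NaN values.
any_dropped = df.dropna()
all_dropped = df.dropna(how="all")
thresh_kept = df.dropna(thresh=1)
```

The difference between the two interpolations shows up on 2017-01-04: it sits three days after the first reading and one day before the next, so method="time" places the value closer to the later reading than plain linear interpolation does.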

Handle Missing Data - Replace Function:

The replace method can be used to replace specific values with other values. It supports replacement using a single value, a list, a regular expression or a dictionary. Data often arrives in one form and has to be transformed into another as far as its values are concerned; the replace method performs that transformation.

How to use replace method to deal with missing data?
How to handle special values in data?
Use replace() method to replace values in dataframe
Replace values using a dictionary
How "regex" (regular expression) works
Replace data with "regex" using a dictionary
Replace the list of values with another list of values
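
The replace variants above, sketched on hypothetical data in which -99999, -88888 and "no event" are stand-ins for missing values. The sentinel values and column names are assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical data with special "missing" markers.
df = pd.DataFrame(
    {
        "temperature": [32, -99999, 28, -88888],
        "windspeed": ["6 mph", "7 mph", "-99999", "12 mph"],
        "event": ["Rain", "Sunny", "no event", "Snow"],
    }
)

# Replace a single special value, or a list of them, with NaN.
single = df.replace(-99999, np.nan)
several = df.replace([-99999, -88888], np.nan)

# Dictionary keyed by column: a different special value per column.
per_column = df.replace(
    {"temperature": -99999, "windspeed": "-99999", "event": "no event"},
    np.nan,
)

# Dictionary mapping old value -> new value, applied to all columns.
mapped = df.replace({-99999: np.nan, "no event": "Sunny"})

# Regex restricted to one column: strip the unit from values like "6 mph".
no_units = df.replace({"windspeed": r"\s*[A-Za-z]+"}, "", regex=True)

# Replace a list of values with another list, position by position.
grades = pd.DataFrame({"score": ["poor", "average", "good", "exceptional"]})
letters = grades.replace(
    ["poor", "average", "good", "exceptional"],
    ["D", "C", "B", "A"],
)
```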

GroupBy Method:

The groupby method can be used to group your dataset based on some criteria and then apply analytics to each of the groups. This is similar to SQL GROUP BY. It is also called the split-apply-combine strategy in data science.

Use groupby() method
groupby() representation internally
What is split apply combine?
Use describe() function in groupby
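
Split-apply-combine can be sketched on a small, hypothetical weather table: the rows are split by city, a statistic is applied to each group, and the per-group results are combined into one object.

```python
import pandas as pd

# Hypothetical readings for three cities.
df = pd.DataFrame(
    {
        "city": ["new york", "new york", "mumbai", "mumbai", "paris", "paris"],
        "temperature": [32, 36, 90, 85, 45, 50],
        "windspeed": [6, 7, 5, 12, 20, 13],
    }
)

# Split: groupby returns a DataFrameGroupBy object keyed by city.
groups = df.groupby("city")

# Internally it behaves like (key, sub-DataFrame) pairs.
group_sizes = {city: len(city_df) for city, city_df in groups}
mumbai = groups.get_group("mumbai")

# Apply + combine: compute a statistic per group and collect the results.
# Roughly: SELECT city, MAX(temperature) FROM weather GROUP BY city
max_temp = groups["temperature"].max()
mean_by_city = groups.mean()

# describe() gives the usual summary statistics for each group.
stats = groups["temperature"].describe()
```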

Concat dataframes:

The concat function is used to join or append dataframes.

What is concat?
Concat two dataframes using concat() function
ignore_index argument in concat() function
List of arguments for concat() function
What is "keys"?
Pass "keys" to concat() function
"axis" argument in concat() function
Join a dataframe with a Series
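
A sketch of the main concat arguments, using hypothetical per-country weather frames.

```python
import pandas as pd

india = pd.DataFrame({"city": ["mumbai", "delhi"], "temperature": [32, 45]})
us = pd.DataFrame({"city": ["new york", "chicago"], "temperature": [21, 14]})

# Default: stack rows; the original index labels (0, 1, 0, 1) are kept.
stacked = pd.concat([india, us])

# ignore_index=True renumbers the rows 0..n-1.
renumbered = pd.concat([india, us], ignore_index=True)

# keys adds an outer index level so each source can be selected later.
keyed = pd.concat([india, us], keys=["india", "us"])
us_rows = keyed.loc["us"]

# axis=1 places frames side by side, aligning rows on the index.
temps = pd.DataFrame({"city": ["mumbai", "delhi"], "temperature": [32, 45]})
winds = pd.DataFrame({"city": ["mumbai", "delhi"], "windspeed": [7, 12]})
side_by_side = pd.concat(
    [temps.set_index("city"), winds.set_index("city")], axis=1
)

# A named Series can be joined as an extra column the same way.
event = pd.Series(["Humid", "Dry"], name="event")
with_event = pd.concat([temps, event], axis=1)
```

Setting the city as the index before the axis=1 concat matters: rows are matched by index label, not by position, so frames listing the cities in a different order would still line up.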
