Data Science & Machine Learning - 4.1 Pandas & Its Installation ~ Coding Interview Questions With Solutions

Friday, 21 July 2017

Data Science & Machine Learning - 4.1 Pandas & Its Installation

By Krishna Chaurasia. Tags: data science, machine learning, pandas, python

Hello friends,

Welcome to yet another post on Data Science & Machine Learning. In the previous few posts, we learnt about the NumPy library, which is used in Data Science mostly for its arrays. From this post onward, we will learn about Pandas, one of the most important Data Science libraries in Python.

About Pandas

Pandas is an open-source library that provides the following:

Fast data analysis
High performance
Built-in data visualization features
Support for a variety of data sources such as Excel files, CSV files, SQL databases and so on
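As a small illustration of working with a data source, the sketch below reads CSV data into a Pandas DataFrame and computes a simple statistic. The CSV text, column names, and values are invented for this example, and an in-memory string stands in for a real file on disk; it assumes Pandas is already installed.

```python
import io

import pandas as pd

# Hypothetical CSV content; in practice you would pass a file path instead.
csv_text = "name,score\nasha,82\nravi,91\nmeera,77\n"

# read_csv accepts a path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# Basic inspection: table shape and the mean of one column.
n_rows, n_cols = df.shape
mean_score = df["score"].mean()

print(df)
print("rows:", n_rows, "columns:", n_cols)
print("mean score:", mean_score)
```

The same DataFrame could come from other sources as well; only the reading step would change.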

Pandas Installation

Just as with NumPy, it is recommended to use the Anaconda distribution of Python to install Pandas. You can see how to install the Anaconda distribution of Python here. Once you have it installed, you can install Pandas by running the following command in the command prompt:

conda install pandas

If you don't have the Anaconda distribution of Python, you can still install Pandas with pip, although this route is not the recommended one:

pip install pandas
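Whichever command you used, it can be worth confirming that Python can actually find the library. The following is a minimal sketch of such a check; the helper names is_installed and pandas_version are made up for this post and are not part of Pandas.

```python
import importlib.util


def is_installed(package_name):
    """Return True if Python can locate the given top-level package."""
    return importlib.util.find_spec(package_name) is not None


def pandas_version():
    """Return the installed Pandas version string, or None if it is missing."""
    if not is_installed("pandas"):
        return None
    import pandas as pd  # imported only after we know it is available
    return pd.__version__


version = pandas_version()
if version is None:
    print("Pandas is not installed yet.")
else:
    print("Pandas version:", version)
```

If the script reports that Pandas is missing, re-run the install command in the same environment that you use to run Python.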

We will learn about the following topics under the Pandas library:

Series
Data Frames
Filling missing data
Various data input & output techniques
Operations on series and data frames
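As a rough preview of these topics, here is a short sketch that touches each of them once. The values and labels are invented for illustration, and each feature will be covered properly in its own post.

```python
import pandas as pd

# Series: a one-dimensional labelled array.
s = pd.Series([10, 20, 30], index=["a", "b", "c"])

# Data Frame: a two-dimensional labelled table; None marks a missing value.
df = pd.DataFrame({"x": [1.0, None, 3.0], "y": [4.0, 5.0, None]})

# Filling missing data: replace every missing value with 0.
filled = df.fillna(0)

# Operations: arithmetic is applied element-wise.
doubled = s * 2

# Data output: with no path given, to_csv returns the CSV text as a string.
csv_text = filled.to_csv(index=False)

print(doubled)
print(filled)
print(csv_text)
```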

Now that we have installed Pandas successfully on our systems, we will start using the library in the next post, beginning with Series.
