Friday 21 July 2017

Data Science & Machine Learning - 4.1 Pandas & Its Installation

Hello friends,

Welcome to yet another post on Data Science & Machine Learning. In the previous few posts, we learnt about the NumPy library, which is used in Data Science mostly for its arrays. From this post onward, we will learn about Pandas, one of the most important Data Science libraries in Python.

About Pandas

Pandas is an open source library that offers the following:
  1. Fast data analysis
  2. High performance
  3. Built-in data visualization features
  4. Support for a variety of data sources such as Excel files, CSV files, SQL databases and so on.
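To give a quick feel for the last point, here is a minimal sketch that reads CSV data with Pandas. It assumes Pandas is already installed (installation is covered just below), and the column names and values are made up purely for illustration. The CSV text is read from an in-memory string so the example does not depend on any file on disk.

```python
import io

import pandas as pd

# Made-up CSV content used only for this illustration
csv_text = "name,score\nAsha,82\nRavi,91\n"

# read_csv accepts a file path or any file-like object;
# StringIO lets us treat the string as if it were a file
df = pd.read_csv(io.StringIO(csv_text))
print(df)

# A one-line analysis: the average of the "score" column
mean_score = df["score"].mean()
print("Average score:", mean_score)
```

We will look at what `df` actually is, and at other input and output options, in the upcoming posts.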

Pandas Installation:

Just like NumPy, Pandas is best installed through the Anaconda distribution of Python. You can see the installation of the Anaconda distribution of Python here. Once you have that installed, you can install Pandas by running the following command in the command prompt:

conda install pandas

If you don't have the Anaconda distribution of Python, you can still install Pandas using the following command, although this route is not recommended:

pip install pandas
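
Whichever command you used, it is worth checking that Python can actually find Pandas. The snippet below is a small sketch of one way to do that; the helper name `pandas_status` is my own and not part of Pandas. It first checks whether the package can be located, so it prints a friendly message instead of raising an error when the installation did not work.

```python
import importlib.util


def pandas_status():
    """Return the installed pandas version string, or None if pandas cannot be found."""
    # find_spec returns None when the package is not available in this environment
    if importlib.util.find_spec("pandas") is None:
        return None
    import pandas as pd
    return pd.__version__


version = pandas_status()
if version is None:
    print("Pandas is not installed in this environment")
else:
    print("Pandas version:", version)
```

If you see a version number, the installation worked and you are ready for the next posts.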

We will learn about the following topics under the Pandas library:
  • Series
  • DataFrames
  • Filling missing data
  • Various data input & output techniques
  • Operations on Series and DataFrames

Now that Pandas is installed on our systems, we will start using the library in the next post, beginning with Series.