Saturday, 22 July 2017

Data Science & Machine Learning - 4.2 Pandas Series

Hi friends,

In the previous post under Data Science & Machine Learning, we learnt how to install the Pandas library in our Python environment. From this post onward, we will start learning about the Pandas library, which is one of the most important libraries for Data Science. In this post, we will learn about the Series data structure provided by the Pandas library.

Note: All the commands discussed here are run in the Jupyter Notebook environment. See this post on Jupyter Notebook to learn about it in detail.

Pandas Series

So, let's begin to learn about Pandas Series. The first thing we need to do before we can start using any Python library is to import the library. So, let's import the Pandas library using the following command:
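
The import can be sketched as follows. The alias pd is the usual convention and matches the pd.Series call used later in this post; printing the version is only an extra check added here to confirm that the import worked.

```python
# Import the Pandas library under its conventional alias 'pd'.
import pandas as pd

# Optional check (not part of the original post): show the installed version.
print(pd.__version__)
```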


If Pandas was not installed successfully, you will see an error; otherwise, the cell runs without printing any message.

Since a Pandas Series is built on top of NumPy arrays, we also import the NumPy library into our script.
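
A minimal sketch of the NumPy import, using the conventional alias np. The small Series at the end is a hypothetical example added here only to show that the values of a numeric Series come back as a NumPy array.

```python
import numpy as np
import pandas as pd

# Hypothetical numeric Series, used only to inspect its underlying storage.
s = pd.Series([1, 2, 3])

# to_numpy() returns the values of the Series as a NumPy array.
print(type(s.to_numpy()))
```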

In usage, you might find a Pandas Series similar to a Python dictionary, just as a NumPy array is similar to a Python list. However, Pandas Series and NumPy arrays are designed specifically for data science tasks and support many additional features that lists and dictionaries lack.
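
To illustrate the comparison, here is a small sketch with hypothetical data (the names prices_dict and prices are made up for this example). It builds a Series from a dictionary and then uses two features a plain dictionary does not offer: element-wise arithmetic and a built-in mean.

```python
import pandas as pd

# Hypothetical dictionary; its keys become the index of the Series.
prices_dict = {'x': 10, 'y': 20, 'z': 30}
prices = pd.Series(prices_dict)

# Element-wise arithmetic: every value is doubled in one expression.
doubled = prices * 2
print(doubled['y'])    # 40

# Built-in aggregation over all the values.
print(prices.mean())   # 20.0
```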

If you type pd.Series (in your Jupyter Notebook cell) and press Shift + Tab, you can see the variety of parameters that the Series constructor takes. The two most important ones are data and index, which are like the values and the keys in a Python dictionary.


So, let's declare our first Series using Pandas
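
The original code is not reproduced here, so the following is a minimal sketch. The name myLabels and the label 'y' appear in this post; the name myData and all the concrete values are assumptions chosen for illustration.

```python
import pandas as pd

# Two plain Python lists: one for the index labels, one for the data.
myLabels = ['x', 'y', 'z']   # assumed labels
myData = [10, 20, 30]        # assumed values

# Pass the lists through the data and index parameters.
mySeries = pd.Series(data=myData, index=myLabels)
print(mySeries)
```

The printed output lists each label next to its value, followed by the dtype of the data.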


Notice that I first created two Python lists that act as the index (the keys) and the data for the Series. Also notice that the output reports the datatype (dtype) of the data part of the Series as int64.

We can also pass a NumPy array as the data, and it generates the same result:
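
A sketch of the NumPy variant, again with assumed values; the names myArray and arraySeries are hypothetical.

```python
import numpy as np
import pandas as pd

myLabels = ['x', 'y', 'z']          # assumed labels
myArray = np.array([10, 20, 30])    # assumed values, stored in a NumPy array

# The NumPy array is passed as data exactly like a Python list would be.
arraySeries = pd.Series(data=myArray, index=myLabels)
print(arraySeries)
```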


We can even pass myLabels (a list of strings) as the data, and it works just as well. This shows the flexibility of the Series data structure: it accepts many kinds of values as data points.
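
As a sketch, assuming no index is passed (the original call is not shown), Pandas falls back to a default integer index starting at 0. The name labelSeries and the values in myLabels are hypothetical.

```python
import pandas as pd

myLabels = ['x', 'y', 'z']   # assumed list of strings

# The strings are used as data; no index is given, so Pandas numbers the rows.
labelSeries = pd.Series(data=myLabels)
print(labelSeries)
```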


Accessing elements in Series

Accessing an element in a Pandas Series is just like working with dictionaries. So, if we want to access the data corresponding to the index 'y' in the first series we declared, we do the following:
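
A minimal sketch of label-based access, using the same assumed labels and values as the earlier sketches in this post (the value 20 for 'y' is therefore an assumption, not the original output).

```python
import pandas as pd

# Assumed labels and values for illustration.
mySeries = pd.Series(data=[10, 20, 30], index=['x', 'y', 'z'])

# Square brackets with a label work like a dictionary lookup.
print(mySeries['y'])       # 20

# .loc is the explicit label-based accessor and returns the same value.
print(mySeries.loc['y'])   # 20
```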


Now that we have learnt how to declare and use a Pandas Series, we will start working with Pandas DataFrames, another very important data structure supported by Pandas, in the next post.