The opening salvo: Python PANDAS Library
PANDAS - Part 1 - read the doc
What is Python's PANDAS:
Like NumPy, Pandas is one of the most widely used Python libraries in data science and data analysis.
It provides high-performance, easy-to-use data structures and data analysis tools.
Unlike the NumPy library, which provides objects for multi-dimensional arrays, Pandas provides an in-memory 2D table object called a DataFrame.
In this blog I intend to share what I know about PANDAS. PANDAS is a complex library, yet an easy one to start using.
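To make the NumPy comparison concrete, here is a minimal sketch. The column names "id" and "price" and the numbers are made up for illustration; they are not from any real dataset.

```python
import numpy as np
import pandas as pd

# A plain NumPy 2D array: values only, addressed by integer position.
arr = np.array([[1, 2.5], [2, 3.5], [3, 4.5]])

# The same values as a DataFrame: every column gets a name.
# (The names "id" and "price" are hypothetical.)
df = pd.DataFrame(arr, columns=["id", "price"])

print(arr[0, 1])            # NumPy: row 0, column 1, by position
print(df.loc[0, "price"])   # pandas: row label 0, column "price", by name
print(df.shape)             # (number of rows, number of columns)
```

Both objects hold the same numbers; the DataFrame simply adds row and column labels on top of them.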
What is a DATAFRAME:
According to Databricks.com, a DATAFRAME is "the most common Structured API and simply represents a table of data with rows and columns. The list of columns and the types in those columns the schema. A simple analogy would be a spreadsheet with named columns."
A DATAFRAME can be assembled in different ways: by loading Microsoft Excel(tm) spreadsheets, CSV (comma-separated values) files, JSON (a generic data format with a minimal number of value types: strings, numbers, booleans, lists, objects, and null) or simple text; from Python's own dictionaries and lists; or from scratch, among others.
What PANDAS does with this data is convert it into a spreadsheet-like table that can then be manipulated in many different ways. For those who do not know much about Microsoft Excel, this should ease your mind when it comes to learning how to deal with spreadsheets.
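A few of the ways of assembling a DATAFRAME listed above can be sketched like this. The names and ages are invented sample data, and io.StringIO is only a stand-in for a real file on disk, so treat this as an illustration and not as the one right way to do it.

```python
import io
import pandas as pd

# From a Python dictionary: the keys become the column names.
df_dict = pd.DataFrame({"name": ["Ana", "Ben"], "age": [31, 27]})

# From a list of rows, with the column names supplied separately.
df_list = pd.DataFrame([["Ana", 31], ["Ben", 27]], columns=["name", "age"])

# From CSV text; io.StringIO stands in for a file such as "people.csv".
csv_text = "name,age\nAna,31\nBen,27\n"
df_csv = pd.read_csv(io.StringIO(csv_text))

# From JSON text holding a list of records.
json_text = '[{"name": "Ana", "age": 31}, {"name": "Ben", "age": 27}]'
df_json = pd.read_json(io.StringIO(json_text))

# All four end up as a table with the same two columns.
for frame in (df_dict, df_list, df_csv, df_json):
    print(frame)
```

Whatever the source is, the result is the same kind of object, so everything you learn about one DATAFRAME applies to the others.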
How do I install PANDAS on my computer?
1. Open a CMD prompt
2. Assuming that you have Python installed, do this:
pip3 install pandas
(*) If you are in a Jupyter notebook, do this:
!pip3 install pandas
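To check that the install worked, one simple option is to import the library and print its version. This is just a quick sanity check; the version number you see depends on what pip installed.

```python
import pandas as pd

# If this import fails with ModuleNotFoundError, the install did not succeed
# (or it went into a different Python installation than the one you are running).
version = pd.__version__
print("pandas version:", version)
```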
How do I create a dataframe out of my Microsoft Excel spreadsheet located in my current folder?
1. Open IDLE, or in a Jupyter notebook create a new Python file
2. import pandas as pd
3. import numpy as np
4. df = pd.read_excel("myexcelfile.xlsm") # the file name must be a quoted string, and watch out for the Excel extension! ✋
# Reading .xlsx or .xlsm files also needs the openpyxl package (pip3 install openpyxl).
# Now your Excel spreadsheet is loaded into a DATAFRAME named "df". To verify that the Excel file has been read, type this:
5. df.head(3) # this will show the first 3 rows of data.
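Besides head(), a few other attributes give a quick picture of what was loaded. The sketch below builds a small made-up DATAFRAME as a stand-in for the result of pd.read_excel, so that it runs without any Excel file; the column names and values are hypothetical.

```python
import pandas as pd

# Stand-in for df = pd.read_excel(...); the data here is invented.
df = pd.DataFrame({
    "product": ["pen", "book", "lamp", "desk"],
    "units": [10, 4, 2, 1],
    "price": [1.5, 12.0, 30.0, 150.0],
})

first_rows = df.head(3)       # the first 3 rows, as a new DataFrame
print(first_rows)
print(df.shape)               # (number of rows, number of columns)
print(df.columns.tolist())    # the column names
print(df.dtypes)              # the data type pandas chose for each column
```

Checking the shape and the column names right after loading is a cheap way to catch a wrong sheet or a misread header row.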
More coming in Part 2 of my series... I hope you enjoy these notes.