Intro to Data Visualization Using Matplotlib in Python

Sambit Mahapatra
4 min read · Feb 14, 2018

I am going to show how to draw simple line plots of financial data using matplotlib in Python, so that useful insights can be derived from them later.

The data used here are randomly generated revenues and expenses for each of the 12 months (January, February, ..., December) of a year.

#Data of revenue and expenses for each month in a year
revenue = [14574.49, 7606.46, 8611.41, 9175.41, 8058.65, 8105.44, 11496.28, 9766.09, 10305.32, 14379.96, 10713.97, 15433.50]
expenses = [12051.82, 5695.07, 12319.20, 12089.72, 8658.57, 840.20, 3285.73, 5821.12, 6976.93, 16618.61, 10054.37, 3803.96]

First we have to calculate the profit for each month, and then the final profit for each month after a 30% tax on that profit.

import numpy as np
# Element-wise difference: one profit value per month
profit = np.array(revenue) - np.array(expenses)
profit_after_tax = profit - 0.3 * profit
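As a quick check on the arithmetic, here is a small sketch using only the first three months of the data above. The names `revenue_q1`, `expenses_q1`, `profit_q1` and `profit_after_tax_q1` are illustrative and do not appear in the original notebook.

```python
import numpy as np

# First three months of the data above (illustrative subset)
revenue_q1 = np.array([14574.49, 7606.46, 8611.41])
expenses_q1 = np.array([12051.82, 5695.07, 12319.20])

# Element-wise subtraction gives one profit value per month
profit_q1 = revenue_q1 - expenses_q1

# A 30% tax leaves 70% of each month's profit; this simple formula
# scales loss-making months (negative profit) in the same way
profit_after_tax_q1 = profit_q1 - 0.3 * profit_q1

print(np.round(profit_q1, 2))
print(np.round(profit_after_tax_q1, 3))
```

The third month shows a loss, because expenses exceed revenue there, and the same 30% factor is applied to it.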

To plot line graphs of these features, we first have to import the matplotlib module.

import matplotlib.pyplot as plt
# Jupyter notebook magic: render plots inside the notebook
%matplotlib inline

To get a clear picture of how the different financial figures trend over the months of the year, we plot a line for each financial feature. The code snippet looks like this:

# Month labels for the x-axis ticks
months = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]
plt.plot(revenue, color="black", ls="--", marker="o", ms=6, label="revenue")
plt.plot(expenses, color="red", ls="--", marker="+", ms=6, label="expenses")
plt.plot(profit, color="blue", ls="--", marker="s", ms=6, label="profit")
plt.plot(profit_after_tax, color="green", ls="--", marker="^", ms=6, label="profit_after_tax")
plt.legend(loc="upper left", bbox_to_anchor=(1, 1))  # place the legend outside the plot, to its right
plt.xticks(list(range(12)), months, rotation="vertical")
plt.show()

Here, the arguments passed to the plot function are a list or array, color, ls, marker, ms and label. In the first line, ‘revenue’ is the list of revenue values for each month of the year; these are the y-axis values in the final plot. Since no x values are passed, matplotlib uses the indices 0 to 11 as the x positions. color is the color of the line, ls is the line style (here "--", a dashed line), marker is the style of the marker drawn at each data point, ms is the marker size, and label is the name shown in the legend. The xticks function takes the tick positions, the list of tick labels, and the rotation as parameters. Here, rotation is set to vertical so that the month names are drawn vertically and do not overlap.
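The parameters described above can be inspected on the line object that plot returns. This is a minimal sketch on made-up data: `values`, `tick_names` and the "Agg" backend line are assumptions added so the snippet runs as a plain script, and none of them come from the original notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for scripts; drop this line in a notebook
import matplotlib.pyplot as plt

values = [3, 1, 4, 1, 5]                  # hypothetical sample data
tick_names = ["a", "b", "c", "d", "e"]    # hypothetical tick labels

# plt.plot returns a list with one Line2D object per plotted line
(line,) = plt.plot(values, color="black", ls="--", marker="o", ms=6, label="sample")

# xticks takes the tick positions first, then the labels to draw there
locs, labels = plt.xticks(list(range(5)), tick_names, rotation="vertical")

# With no x values given, the x positions are the indices of the list
print(list(line.get_xdata()))
# ls and ms are short aliases for linestyle and markersize
print(line.get_linestyle(), line.get_marker(), line.get_markersize())
print([label.get_text() for label in labels])
```

Writing `linestyle="--"` and `markersize=6` instead of `ls` and `ms` should give the same line; the short forms are only aliases.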

The final graph looks like this:

As shown above, setting the style and drawing the line for each financial feature one by one is tedious. Also, real-life data usually comes in a tabular format, so we have to deal with multidimensional arrays rather than a separate 1-D array for each attribute. So let’s combine all the attributes into one array named ‘financial_stat’. Its 4 rows hold the values of ‘revenue’, ‘expenses’, ‘profit’ and ‘profit after tax’ for each month of the year. The data look like this:

financial_stat =
[[ 14574.49    7606.46    8611.41    9175.41    8058.65    8105.44
   11496.28    9766.09   10305.32   14379.96   10713.97   15433.5  ]
 [ 12051.82    5695.07   12319.2    12089.72    8658.57     840.2
    3285.73    5821.12    6976.93   16618.61   10054.37    3803.96 ]
 [  2522.67    1911.39   -3707.79   -2914.31    -599.92    7265.24
    8210.55    3944.97    3328.39   -2238.65     659.6    11629.54 ]
 [  1765.869   1337.973  -2595.453  -2040.017   -419.944   5085.668
    5747.385   2761.479   2329.873  -1567.055    461.72    8140.678]]
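The article does not show how ‘financial_stat’ was built. One plausible way, sketched here as an assumption rather than the notebook's actual code, is to stack the four 1-D sequences as the rows of a single NumPy array.

```python
import numpy as np

revenue = [14574.49, 7606.46, 8611.41, 9175.41, 8058.65, 8105.44,
           11496.28, 9766.09, 10305.32, 14379.96, 10713.97, 15433.50]
expenses = [12051.82, 5695.07, 12319.20, 12089.72, 8658.57, 840.20,
            3285.73, 5821.12, 6976.93, 16618.61, 10054.37, 3803.96]
profit = np.array(revenue) - np.array(expenses)
profit_after_tax = profit - 0.3 * profit

# Assumed construction: one row per feature, one column per month
financial_stat = np.array([revenue, expenses, profit, profit_after_tax])

print(financial_stat.shape)  # (4, 12)
```

With this layout, `financial_stat[0]` is the revenue row and `financial_stat[3]` is the profit-after-tax row.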

To draw line plots for any number of features in a matrix (an array of arrays), we can write a plotting function. The function looks like this:

featurelist = ["revenue", "expenses", "profit", "profit_after_tax"]
# Map each feature name to its row index in financial_stat
fdict = {"revenue": 0, "expenses": 1, "profit": 2, "profit_after_tax": 3}

def myplot(data, featurelist):
    # Draw one line per requested feature, all in the same style
    for f in featurelist:
        plt.plot(data[fdict[f]], color="blue", ls="--", marker="o", ms=6, label=f)
    plt.legend(loc="upper left", bbox_to_anchor=(1, 1))
    plt.xticks(list(range(12)), months, rotation="vertical")
    plt.show()

myplot(financial_stat, ["revenue", "expenses", "profit"])
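The name-to-row lookup in ‘fdict’ can also be derived from the feature list instead of being typed by hand, which rules out a mistyped index. The sketch below is an alternative under that assumption, not the notebook's code, and the small `data` matrix holds made-up values.

```python
import numpy as np

featurelist = ["revenue", "expenses", "profit", "profit_after_tax"]

# Row index of each feature, taken from its position in featurelist
fdict = {name: row for row, name in enumerate(featurelist)}

# Tiny stand-in matrix (hypothetical values): 4 features x 3 months
data = np.array([[10.0, 20.0, 30.0],
                 [4.0, 5.0, 6.0],
                 [6.0, 15.0, 24.0],
                 [4.2, 10.5, 16.8]])

# Pick out the rows for the requested features by name
selected = {f: data[fdict[f]] for f in ["revenue", "profit"]}

print(fdict)
print(selected["profit"])
```

This only works when the rows of the matrix are stored in the same order as the names in `featurelist`.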

Here, the parameter data is the matrix (array of arrays) to plot from, and featurelist is the list of features (attributes) for which lines will be drawn. Each feature name is looked up in fdict to find its row in data. The graph looks like this:

The problem with the above graph is that every financial feature is drawn with the same color and marker style, so the lines are hard to tell apart. The function can be improved to give each feature its own color and marker. The improved function looks like this:

def myplot(data, featurelist):
    # Per-feature color and marker, keyed by feature name
    cdict = {"revenue": "black", "expenses": "red", "profit": "blue", "profit_after_tax": "green"}
    mdict = {"revenue": "o", "expenses": "+", "profit": "s", "profit_after_tax": "^"}
    for f in featurelist:
        plt.plot(data[fdict[f]], color=cdict[f], ls="--", marker=mdict[f], ms=6, label=f)
    plt.legend(loc="upper left", bbox_to_anchor=(1, 1))
    plt.xticks(list(range(12)), months, rotation="vertical")
    plt.show()

myplot(financial_stat, ["revenue", "expenses", "profit"])

Now the final graph will look like this:

The complete code is available on GitHub: https://github.com/sambit9238/Visualisation/blob/master/PlotOnFinancialData.ipynb

Sambit Mahapatra

Putting ML to Customer Support at CSAT.AI | Natural Language Processing | Full Stack Data Scientist (sambit9238@gmail.com)