Data visualization helps identify patterns and trends in large datasets.
There are three broad categories of visualization.
Univariate Visualization
Univariate visualization is about visualizing a single attribute. First we need to find the data type of the attribute, and then we can visualize it. There are four data types: categorical nominal, categorical ordinal, metric discrete, and metric continuous. Based on the data type, we can choose an appropriate visualization.
Consider the following dataset about t-shirts.
#Below we create the above dataset using the random module. Let us generate a dataframe with 1000 randomly generated t-shirt records.
import random
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import string
import seaborn as sns
print('matplotlib: {}'.format(matplotlib.__version__))
sizes = ["S", "M", "L", "XL", "XXL"]
colors = ["black", "red", "white", "blue"]
#Below we create functions to generate a probable value for each column attribute randomly.
def generate_random_id():
    return str(random.randrange(100, 999)) + random.choice(string.ascii_uppercase)
def generate_random_size():
    return random.choices(sizes, weights=[0.15, 0.32, 0.28, 0.20, 0.05])[0]
def generate_random_color():
    return random.choice(colors)
def generate_random_price():
    return round(random.uniform(0, 10000), 2)
def generate_random_stock():
    return random.randrange(12, 2345)
total_no_of_tshirts = 1000
data = []
for i in range(total_no_of_tshirts):
    row = []
    row.append(generate_random_id())
    row.append(generate_random_size())
    row.append(generate_random_color())
    row.append(generate_random_price())
    row.append(generate_random_stock())
    data.append(row)
df = pd.DataFrame(data, columns=['id', 'size', 'color', 'price', 'stock'])
print(df)
Output:
matplotlib: 3.5.3
       id  size  color    price  stock
0    885J     S    red  1004.74   1642
1    422W   XXL  black   629.46   2065
2    642L     M    red  2224.21   1789
3    231I     S    red  8137.09   1519
4    551D     S   blue  9303.51    132
..    ...   ...    ...      ...    ...
995  530W    XL  black  4792.29    948
996  921V     M  white  6985.76   1266
997  119Q     M    red  8022.32   1008
998  703T     M   blue   853.45   2338
999  233U    XL  white  6272.56   1924
[1000 rows x 5 columns]
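To connect the four data types with the t-shirt columns, here is a minimal sketch that assigns each attribute to one type and encodes the categorical ones in pandas. The `sample` dataframe, the `attribute_types` mapping, and the `size_order` list are illustrative names, not part of the code above, and the mapping itself is an assumption about how these columns would usually be read (for example, treating `stock` as a whole-number count).

```python
import pandas as pd

# Hypothetical five-row sample shaped like the t-shirt dataframe above.
sample = pd.DataFrame(
    {
        "id": ["885J", "422W", "642L", "231I", "551D"],
        "size": ["S", "XXL", "M", "S", "S"],
        "color": ["red", "black", "red", "red", "blue"],
        "price": [1004.74, 629.46, 2224.21, 8137.09, 9303.51],
        "stock": [1642, 2065, 1789, 1519, 132],
    }
)

# Assumed mapping of each attribute to one of the four data types.
attribute_types = {
    "size": "categorical ordinal",   # S < M < L < XL < XXL
    "color": "categorical nominal",  # no natural order
    "price": "metric continuous",    # any value within a range
    "stock": "metric discrete",      # whole-number counts
}

# Encode the ordinal attribute so pandas keeps the size order.
size_order = ["S", "M", "L", "XL", "XXL"]
sample["size"] = pd.Categorical(sample["size"], categories=size_order, ordered=True)

# Nominal attributes can be stored as unordered categories.
sample["color"] = sample["color"].astype("category")

for column, kind in attribute_types.items():
    print(f"{column}: {kind} (dtype={sample[column].dtype})")

# Sorting now follows the declared size order, not alphabetical order.
print(sample.sort_values("size")["size"].tolist())
```

Declaring the order explicitly matters for ordinal data: without it, a plot or sort of `size` would fall back to alphabetical order (L, M, S, XL, XXL), which hides the natural progression of sizes.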