Python Visualizations - Altair - 4 (Heatmap)

Heatmaps are a great way to see how values are spread and what patterns they form. Let's create a heatmap to explore Marriage Rates by Region.

Dataset source: https://data.world/siyeh/state-marriage-rate

Step 1: Data Prep

import pandas as pd
import altair as alt
import numpy as np

rmr = pd.read_csv(r'Census Region.csv')
print(rmr.info())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 18 entries, 0 to 17
Data columns (total 5 columns):
year         18 non-null int64
Northeast    18 non-null float64
West         18 non-null float64
Midwest      18 non-null float64
South        18 non-null float64
dtypes: float64(4), int64(1)
memory usage: 800.0 bytes
None

Unpivot the Regions and clean up

rmr1 = pd.melt(rmr, id_vars=['year'], var_name='Region', value_name='Marriage Rate')
rmr1.rename(columns={'year': 'Year'}, inplace=True)

print(rmr1.head())


   Year     Region  Marriage Rate
0  1999  Northeast           65.3
1  2000  Northeast           66.4
2  2001  Northeast           66.4
3  2002  Northeast           64.9
4  2003  Northeast           63.6
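
To make the unpivot step easier to follow, here is a minimal sketch on a toy frame. The frame `wide` and its West values are hypothetical, made up only for illustration. The Northeast values are the 1999 and 2000 rows shown above.

```python
import pandas as pd

# Hypothetical toy data in the same wide layout as the source file:
# one row per year, one column per region.
wide = pd.DataFrame({
    "year": [1999, 2000],
    "Northeast": [65.3, 66.4],
    "West": [70.1, 69.8],  # made-up values for illustration only
})

# Unpivot: each region column is turned into (Region, Marriage Rate) rows,
# while "year" is kept as the identifier column.
long = pd.melt(
    wide, id_vars=["year"], var_name="Region", value_name="Marriage Rate"
).rename(columns={"year": "Year"})

print(long)
```

Two years and two regions in the wide layout become four rows in the long layout, one per Year and Region pair. This is the shape Altair expects for encoding Region and Year on separate axes.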

Step 2: Create the Heatmap

from altair.expr import datum

Let's first create the rectangular heatmap. The type of aggregation (median here) doesn't matter because each combination of Year and Region has exactly one value.

rect = alt.Chart(rmr1).mark_rect().encode(
    alt.Y('Region:O', scale=alt.Scale(paddingInner=0)),
    alt.X('Year:O', scale=alt.Scale(paddingInner=0)),
    color=alt.Color('median(Marriage Rate):Q', sort='ascending',
                    scale=alt.Scale(scheme='bluepurple'),
                    legend=alt.Legend(title='Marriage Rate', orient='left')),
    tooltip=alt.Tooltip('median(Marriage Rate):Q', title='Marriage Rate')
)

The paddingInner property of each scale specifies the padding between the x and y bands, and it takes values in the range [0, 1]. Since we need contiguous bands to get the look of a heatmap, we set it to 0 for both the X and Y axes. Then we color the heatmap based on the Marriage Rate values. Strictly speaking, we do not need the tooltip specified above: it provides no new information, since we are going to add the text values to the map next.
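
The claim that the type of aggregation does not matter can be checked directly with pandas. The following is a minimal sketch on a hypothetical long-format frame with the same columns as rmr1. The West values are made up for illustration.

```python
import pandas as pd

# Hypothetical long-format frame with the same columns as rmr1.
long = pd.DataFrame({
    "Year": [1999, 2000, 1999, 2000],
    "Region": ["Northeast", "Northeast", "West", "West"],
    "Marriage Rate": [65.3, 66.4, 70.1, 69.8],  # West values are made up
})

# Count rows per (Region, Year) cell; a maximum of 1 means every
# cell holds a single value.
cell_counts = long.groupby(["Region", "Year"]).size()
one_value_per_cell = bool(cell_counts.max() == 1)

# With a single value per cell, median, mean, and max all return it unchanged.
agg = long.groupby(["Region", "Year"])["Marriage Rate"].agg(["median", "mean", "max"])
all_agree = bool(
    (agg["median"] == agg["mean"]).all() and (agg["median"] == agg["max"]).all()
)

print(one_value_per_cell, all_agree)
```

If the same check on the real data showed more than one row per cell, the choice between median, mean, and the other aggregations would start to affect the colors.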

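# Cast to a plain Python float so the threshold is embedded as a bare number in the chart expression
q = float(np.quantile(rmr1['Marriage Rate'], 0.65))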

text = alt.Chart(rmr1).mark_text(baseline='middle').encode(
    alt.Y('Region:O'),
    alt.X('Year:O'),
    text=alt.Text('med_mr:Q', format='.2f'),
    color=alt.condition(datum.med_mr>=q, alt.value('white'), alt.value('black'))
).transform_aggregate(
    med_mr='median(Marriage Rate)',
    groupby=['Region', 'Year']
)

Notice that we use datum to reference the aggregated data values inside the condition. We switch the text color from black to white for values at or above the 65th percentile (roughly the top third), since those sit on the darker background colors of the rectangular map. Finally, we layer the two charts and display the result:

region = alt.layer(rect, text).properties(
    width=750,
    height=300
)
region
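
The quantile threshold used for the label colors can be illustrated on its own. This is a minimal sketch: the array `rates` holds hypothetical values, not the marriage-rate data, and the list comprehension only mimics in plain Python what the alt.condition expression does inside the chart.

```python
import numpy as np

# Hypothetical values, for illustration only.
rates = np.array([50.0, 52.0, 54.0, 56.0, 58.0, 60.0, 62.0, 64.0, 66.0, 68.0])

# 65th percentile: the values at or above it are roughly the top third.
q = float(np.quantile(rates, 0.65))

# Labels at or above the threshold get white text, the rest black.
text_colors = ["white" if r >= q else "black" for r in rates]

print(q, text_colors.count("white"))
```

In this toy sample, four of the ten values fall at or above the threshold and would be drawn in white. Lowering the quantile would turn more labels white, which may help if the chosen color scheme gets dark sooner.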




