Plotly: Bar Charts

Faran Mohammad
7 min read · Jul 13, 2023

This is the 3rd part of my Plotly series. If you haven’t read the previous ones, you can read them here.

In this post, I am going to explain how we can render bar charts using Plotly Express. We are going to use the Gapminder dataset as our source.

# px is the alias used by every snippet in this post
import plotly.express as px
df_in = px.data.gapminder().query("country == 'India'")
px.bar(df_in, x='year', y='pop')

In the code snippet provided, a bar chart is being created using the Plotly Express library in Python. The data used for the chart is from the Gapminder dataset, specifically for the country ‘India’.

Here’s a breakdown of the code:

  1. `df_in = px.data.gapminder().query("country == 'India'")`: This line retrieves the Gapminder dataset using the `px.data.gapminder()` function. The `query()` method is then used to filter the dataset based on the condition `country == 'India'`. The resulting data is assigned to the variable `df_in`.
  2. `px.bar(df_in, x='year', y='pop')`: This line creates a bar chart using the `px.bar()` function from Plotly Express. The `df_in` dataset is passed as the first argument. The `x` parameter is set to `'year'`, which provides the x-axis values, and the `y` parameter is set to `'pop'`, which provides the y-axis values.

The resulting chart will display the population (‘pop’) of India over the years (‘year’) as vertical bars, where each bar represents a specific year, and the height of the bar corresponds to the population value for that year.

Now, let’s say we want to render a stacked bar chart. We are going to plot the tips given by each sex on each day, using the tips dataset.

df_tips = px.data.tips()
px.bar(df_tips, x='day', y='tip', color='sex', title='Tips by Sex on Each Day', labels={'tip':'Tip Amount', 'day': 'Day of the Week'})

In the provided code snippet, a bar chart is being created using the Plotly Express library in Python. The data used for the chart is from the tips dataset, which is included in the Plotly Express library.

Let’s break down the code:

1. `df_tips = px.data.tips()`: This line retrieves the tips dataset using the `px.data.tips()` function from Plotly Express. The dataset contains information about tips given in a restaurant.

2. `px.bar(df_tips, x='day', y='tip', color='sex', title='Tips by Sex on Each Day', labels={'tip':'Tip Amount', 'day': 'Day of the Week'})`: This line creates a bar chart using the `px.bar()` function. The `df_tips` dataset is passed as the first argument.

- `x='day'`: This specifies that the ‘day’ column from the dataset will be used as the values on the x-axis.

- `y='tip'`: This specifies that the ‘tip’ column from the dataset will be used as the values on the y-axis.

- `color='sex'`: This adds color differentiation to the bars based on the ‘sex’ column. Each distinct value in the column gets its own color, and the colored segments are stacked on top of each other for every day.

- `title='Tips by Sex on Each Day'`: This sets the title of the chart to ‘Tips by Sex on Each Day’.

- `labels={'tip':'Tip Amount', 'day': 'Day of the Week'}`: This maps column names to the labels shown on the chart. The ‘tip’ column (the y-axis) will be labeled as ‘Tip Amount’, and the ‘day’ column (the x-axis) will be labeled as ‘Day of the Week’.

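The resulting chart will display vertical bars representing the tips given (‘tip’) on each day of the week (‘day’). The bars will be color-coded based on the sex (‘sex’) recorded for each tip. Because `px.bar()` draws one rectangle per row of the dataset, each bar is built from many small stacked segments, and the height of each colored block is the sum of the tips for that day and sex. The chart’s title will be ‘Tips by Sex on Each Day’, and the x-axis and y-axis will be labeled as ‘Day of the Week’ and ‘Tip Amount’, respectively.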

If we want to place the bars next to each other instead of stacking them, we can pass `barmode='group'`. This example also plots different columns of the same dataset:

px.bar(df_tips, x='sex', y='total_bill', color='smoker', barmode='group')

In the provided code snippet, a grouped bar chart is being created using the Plotly Express library in Python. The data used for the chart is from the tips dataset, which is included in the Plotly Express library.

Here’s a breakdown of the code:

1. `df_tips`: This is the tips dataset loaded earlier with `px.data.tips()`.

2. `x='sex'`: This specifies that the ‘sex’ column from the dataset will be used as the values on the x-axis. This column holds the sex recorded for each row of the dataset.

3. `y='total_bill'`: This specifies that the ‘total_bill’ column from the dataset will be used as the values on the y-axis. This column holds the total bill amount for each row of the dataset.

4. `color='smoker'`: This adds color differentiation to the bars based on the ‘smoker’ column, so smokers and non-smokers are drawn in different colors.

5. `barmode='group'`: This sets the bar mode to ‘group’. Within each x-axis category (‘Male’ and ‘Female’), the bars for the different ‘smoker’ values are placed side by side instead of being stacked.

The resulting chart will display grouped bars where each bar represents the total bill amount (‘total_bill’) for a specific sex (‘sex’). The bars will be color-coded based on whether the individual is a smoker or a non-smoker (‘smoker’). The bars for each sex category will be grouped side by side. This chart allows for easy comparison of the total bill amounts between different sexes and their smoking habits.

Let’s say we want to display population data for the countries in Asia that had more than 2 million people in 2007.

df_asia = px.data.gapminder().query("continent == 'Asia' and year == 2007 and pop > 2.e6")
fig = px.bar(df_asia, y='pop', x='country', text='pop', color='country')
fig

In the provided code snippet, a bar chart is being created using the Plotly Express library in Python. The data used for the chart is from the Gapminder dataset, specifically for the year 2007 and the continent Asia. Additionally, only countries with a population greater than 2 million are included in the chart.

Here’s a breakdown of the code:

1. `df_asia = px.data.gapminder().query("continent == 'Asia' and year == 2007 and pop > 2.e6")`: This line retrieves the Gapminder dataset using the `px.data.gapminder()` function. The `query()` method is then used to filter the dataset based on the conditions `continent == 'Asia'`, `year == 2007`, and `pop > 2.e6`. The resulting data is assigned to the variable `df_asia`.

2. `fig = px.bar(df_asia, y='pop', x='country', text='pop', color='country')`: This line creates a bar chart using the `px.bar()` function from Plotly Express. The `df_asia` dataset is passed as the first argument.

- `y='pop'`: This specifies that the ‘pop’ column from the dataset will be used as the values on the y-axis. This column represents the population of each country.

- `x='country'`: This specifies that the ‘country’ column from the dataset will be used as the values on the x-axis. This column represents the countries in Asia.

- `text='pop'`: This adds the population values as text labels on the bars. Where exactly each label is drawn is left to Plotly unless a text position is set explicitly.

- `color='country'`: This adds color differentiation to the bars based on the ‘country’ column. Each bar will be color-coded to represent a specific country.

3. `fig`: In a notebook, evaluating `fig` as the last expression of a cell displays the resulting bar chart.

The resulting chart will display vertical bars where each bar represents the population (‘pop’) of a specific country (‘country’) in Asia in the year 2007. The height of each bar represents the population value, and the bars will be color-coded based on the respective countries. The population values will be displayed as text labels on the bars, allowing for easy comparison between countries.

If we want to give the above chart a little bit of styling for better readability, we can add a few lines to the code as follows:

df_asia = px.data.gapminder().query("continent == 'Asia' and year == 2007 and pop > 2.e6")
fig = px.bar(df_asia, y='pop', x='country', text='pop', color='country')

fig.update_traces(texttemplate='%{text:.2s}', textposition='outside')
fig.update_layout(uniformtext_minsize=8)
fig.update_layout(xaxis_tickangle=-45)

fig

In the provided code snippet, we are making some modifications to the previously created bar chart using the Plotly Express library in Python. The modifications include formatting the text labels, adjusting the minimum size of the text, and rotating the x-axis tick labels.

Here’s a breakdown of the modifications made:

1. `fig.update_traces(texttemplate='%{text:.2s}', textposition='outside')`: This line updates the text template and position for the data labels on the bars. The `texttemplate` parameter is set to `'%{text:.2s}'`, which formats the text labels to display the population values with two significant figures and an SI-style abbreviation (e.g., 1.5M for 1.5 million). The `textposition` parameter is set to `'outside'`, which positions the text labels outside the bars.

2. `fig.update_layout(uniformtext_minsize=8)`: This line sets the minimum font size for the text labels to 8, so that labels are not shrunk into illegibility when there are many bars. As far as I can tell from the Plotly documentation, this setting is meant to be used together with `uniformtext_mode` (`'hide'` or `'show'`), which decides what happens to labels that would otherwise fall below the minimum size, so it is worth setting both.

3. `fig.update_layout(xaxis_tickangle=-45)`: This line updates the angle of the x-axis tick labels. The `xaxis_tickangle` parameter is set to -45 degrees, which rotates the x-axis tick labels by 45 degrees in a counter-clockwise direction. This can be useful when there are long country names, as it helps prevent overlapping of the tick labels.

Finally, the `fig` object is displayed, which shows the modified bar chart with the updated text labels, adjusted minimum text size, and rotated x-axis tick labels.

These modifications enhance the readability and aesthetics of the bar chart, making it easier to interpret and compare the population values for different countries in Asia in the year 2007.

You can find the complete notebook here.

Faran Mohammad

I am a Software Development Engineer. I like reading and writing about tech. Being a geek, I like to experiment with various technologies and stacks.