Visualizing Data to Gain Insight into Makeup Naming Conventions

Data Visualization Final Project

Creating an interactive data visualization dashboard to explore the distribution of shade ranges across naming conventions and the racialized dimensions of classification

ROLE: DATA VISUALIZATION, DATA CLEANING, DESIGN, CODE (R AND SHINY)

TIMELINE: MARCH 2023

What’s the deal with names?

Explore the dashboard here ↗

Project Background

This dashboard was created in my Data Visualization course, where we could choose to explore any topic of interest for our final project. With an interest in cultural insights and browsing Sephora 😌, I hoped to surface beauty industry trends. I was also excited to build and publish a web app that people can interact with and learn from!

Learning Goals

Personal Goals:

Create a user-driven exploratory experience built around two factors: name category and whether the brand is founded or owned by women of color

Get comfortable with coding interactive visualizations in R!

Research Questions:

What is the overall distribution of foundation shades across different naming conventions?

What are the popular naming conventions used by the beauty industry? Is there a particular naming strategy? Does the naming convention vary by shade?

Process

The visualization was inspired by The Pudding’s data journalism piece, The Naked Truth. I used the same dataset, which was public (and luckily already web-scraped and cleaned!). I recoded several variables to organize the categories more cleanly; for example, I combined the rock, wood, and plant categories into a single 'Nature' category.
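As a rough sketch of that kind of recoding step (not the project's actual code), the base R snippet below folds the rock, wood, and plant categories into 'Nature'. The data frame, its column names (name, category, category_grouped), and the example shade names are all hypothetical stand-ins; the real dataset's columns may differ.

```r
# Hypothetical stand-in data; the real dataset's column names may differ.
shades <- data.frame(
  name     = c("Granite", "Mahogany", "Fern", "Espresso", "Pearl"),
  category = c("rock", "wood", "plant", "drink", "gem"),
  stringsAsFactors = FALSE
)

# The three categories that get folded into the broader 'Nature' label.
nature_groups <- c("rock", "wood", "plant")

# Keep the original column and write the recoded label to a new one,
# so the grouping decision stays visible and reversible.
shades$category_grouped <- ifelse(
  shades$category %in% nature_groups,
  "Nature",
  shades$category
)

# Count shades per grouped category.
print(table(shades$category_grouped))
```

Base R is used here only so the sketch runs without extra packages; the same recoding could be written with dplyr's mutate().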

To add interactive components, I coded in widgets so viewers can develop and investigate their own questions relating to naming conventions and foundation shades. Exact names of shades appear below the plot and in the ‘Table: Shades by Brand’ tab to support more in-depth explorations.
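To give a feel for how a widget can drive both a plot and a list of shade names, here is a minimal Shiny sketch. It is an illustration under assumptions, not the dashboard's real code: the data, the lightness column, and the input and output IDs (category, dist, names) are hypothetical.

```r
library(shiny)

# Hypothetical stand-in data; the real dataset has different columns and far more rows.
shades <- data.frame(
  brand     = c("A", "A", "B", "B", "C"),
  name      = c("Pearl", "Espresso", "Opal", "Mocha", "Fern"),
  category  = c("gem", "drink", "gem", "drink", "Nature"),
  lightness = c(0.85, 0.30, 0.80, 0.35, 0.55)  # assumed scale: 0 = deepest, 1 = lightest
)

ui <- fluidPage(
  # Widget: the viewer picks which naming category to explore.
  selectInput("category", "Name category",
              choices = sort(unique(shades$category))),
  plotOutput("dist"),
  tableOutput("names")
)

server <- function(input, output, session) {
  # Reactive subset that updates whenever the selected category changes.
  selected <- reactive(shades[shades$category == input$category, ])

  # Distribution of shade lightness for the chosen category.
  output$dist <- renderPlot({
    hist(selected()$lightness, xlim = c(0, 1),
         main = input$category, xlab = "Lightness")
  })

  # Exact shade names shown below the plot.
  output$names <- renderTable(selected()[, c("brand", "name")])
}

app <- shinyApp(ui, server)
# To launch it in an interactive R session: runApp(app)
```

Keeping the filtered data in a single reactive means the plot and the table always reflect the same selection.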

Insights

Here’s one insight I found compelling: the distribution of foundation shades in the 'gem' category is skewed towards lighter shades. In contrast, the 'drink' category has a distribution skewed towards deeper shades. This discrepancy is interesting given the history of stereotypes and classifications imposed on people of color. Why are certain shades named after gemstones, while others are more frequently named after objects of consumption, like food and drink? (Just food for thought!)

Check out the dashboard to discover your own insights!

My Learnings

📝  Invest in Planning
Planning is the most important part of the process, especially when you're still exploring the data and figuring out what you want to communicate. I have a habit of jumping into things head first, but investing in thorough planning is so worth it.

📈 Follow the Data
The research questions are the foundation. Let the data guide design and development, not the other way around.

📊  Data isn't Neutral
Creating categories from data can be difficult, and some cases are fuzzy. As the data scientist, you get to decide how observations are classified into broader categories. For example, you choose what counts as ‘nature.’ Accessing, cleaning, and presenting data means you have the ability to control the narrative. I’m currently reflecting on this responsibility of data scientists.

📔  Storytelling
Data visualization is a POV. So what’s your story?
