Jay Grossman
Published on:

My favorite data science interview question

I generally prefer to structure interviews for data candidates with open discussions around topics that are fundamental to the company they are interviewing with, as opposed to generic toy examples or brain teasers. There is a high likelihood that this approach introduces them to types of challenges they may experience and the underlying complexities. I make it very clear that we are not using the interview process to get free solutions to current problems from candidates.

The Question:

Rent the Runway (RtR for short) is an online platform that allows users to rent, subscribe to, or buy designer clothing and accessories. From 2016-2020, I led various data teams there and got to interview + hire lots of smart folks. I would usually ask candidates my favorite question:

What clothes should Rent the Runway buy and at what price?

Ground rules and assumptions:

  • 30-45 minute discussion.
  • I start high level and add context + complexity.
  • Candidates are encouraged to whiteboard ideas.
  • Candidates can ask questions.

Why I love this question:

  1. Real World Problem - Inventory acquisition is core to RtR's business. Most candidates walked away learning new things and more interested in the company after the discussion.

  2. Dealing with something new - I am looking to see how candidates react to something they probably hadn't thought about, since almost all of them didn't have merchandising backgrounds. I want to understand:
    • How do they think through the problem?
    • How do they structure their response & communicate?
    • How well do they understand business, specifically RtR's business?
    • What questions do they ask?
  3. Adapting to new info - I added more details about the business as we progressed, introducing nuanced complexity, often through details they wouldn't know unless they were familiar with RtR's operations. I'm interested in observing how they react as their initial assumptions are challenged.

Some things that make this harder + interesting:

  1. RtR was creating a new category (clothing rental at scale) and it was very different from traditional retail in several ways:
    • The inventory needed to last through many wears and cleanings.
    • Items may need to have a lifetime of several years to reach profitability. This means that capital is tied up for longer and payback timeframes are extended.
    • The fashion and size needs of RtR's customer base may be different than many retailers, so assortment choices and depths of sizes are more nuanced.
    • Retailers generally discount excess inventory toward the end of a season (to recoup capital and free space). End of life considerations/strategies for rentals are likely different.
  2. RtR was in growth mode after significant VC investment, always looking to aggressively expand their customer base and offerings. Inventory acquisition was the largest capital expense for the company, so maximizing this investment area was always a top priority. This meant we had to consider how to balance the needs of their existing customers (minimizing churn) while cost-efficiently attracting new cohorts of customers.

    The finance team did an amazing job building out models + forecasts for many potential growth targets, considering many complex assumptions and trade-offs. The buying and merchandising teams had to balance multiple forecast scenarios with the realities of the fashion industry and the operational constraints of the business.

  3. RtR has created multiple offerings that could serve different customer profiles:
    • One-time rental - you rent a dress for 4 or 8 days for a specific event.
    • Subscription - you pay a monthly fee to rent a certain number of items at a time, allowing you to swap items whenever you want.
    • Try to Buy - you can buy items instead of returning them.
    Each of these offerings may have different customer segments, different pricing strategies, and different inventory needs. A large percentage of the inventory was made available for multiple offerings and customers often switched between offerings. It is especially interesting to consider that the user activity for each offering may directly impact the inventory considerations for the other offerings.

    Please note - RtR has further iterated since 2020 to significantly change their offerings, adding more options and complexity.


Things I looked for in the Interview:

As with many data science related problems, there are no completely right or wrong answers to the question. I generally expect candidates to demonstrate an understanding of the business and how they would consider the following aspects of the problem:

Discussions about our Data:

It's likely going to be pretty hard to do analysis and potentially build models if you don't have relevant data available! Hopefully there is some brainstorming around some of the types of data we can hope to have for this exercise. Some common areas include:

  • Listings of items bought from previous seasons, along with the inventory costs and rental history.
  • Customer feedback for items (RtR has millions of reviews, requiring subscribers to rate each item rented).
  • Customer shopping experience details.
  • Operational details (costs, delivery times, cleaning/repair history, customer service interactions, etc).
  • Current and upcoming trends that should be considered.

Discussions around Demand:

Since part of the question is "What to buy?", pretty much every candidate spends some time talking about how they think about demand for the items we buy.

Using Historical Data:

Most candidates came to the idea that we should look at historical sales totals to understand the overall direction and potentially assume that it will continue. Many drew a diagram similar to the simple example below:

[Figure: demand planning with linear regression]

The basic formula can be represented as:

Predicted Demand = f(historical rentals, time)
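One way to read the formula above is as a straight-line fit of past rentals against time, extrapolated one period forward. The sketch below is only an illustration of that idea, not RtR's actual model: the function names and the monthly rental counts are made up, and it uses ordinary least squares written out by hand so it has no dependencies.

```python
def fit_linear_trend(rentals):
    """Least-squares fit of rentals against the period index 0..n-1.

    Returns (slope, intercept). `rentals` is a list of historical
    rental counts, one per period (hypothetical data).
    """
    n = len(rentals)
    if n < 2:
        raise ValueError("need at least two periods to fit a trend")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(rentals) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, rentals))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_demand(rentals, periods_ahead=1):
    """Extrapolate the fitted line `periods_ahead` periods past the last one."""
    slope, intercept = fit_linear_trend(rentals)
    future_index = len(rentals) - 1 + periods_ahead
    return intercept + slope * future_index


# Made-up monthly rental counts that grow by 10 each month.
monthly_rentals = [100, 110, 120, 130, 140, 150]
print(predict_demand(monthly_rentals))  # the line continues to 160 next month
```

A straight line is the simplest possible choice for f; it ignores seasonality and subscriber growth, which is why the adjustments in the next section matter.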

Trend, Seasonality and Growth

Trend Factor is a multiplier derived by looking at historical sales data to identify the average rate of increase or decrease over a given time period. For instance, we may see sales have increased by 20% over the past 3 months, so Trend Factor = 1.2.

Seasonality Index is calculated similarly to Trend Factor. We can divide the demand for each month by the average monthly demand across all months. For example, if December's demand is 1.4 times the average, its seasonality index would be 1.4.

Subscriber Growth Rate = (Predicted Subscribers - Current Subscribers) / Current Subscribers

Final Demand = (Predicted Demand * Trend Factor * Seasonality Index) * (1 + Subscriber Growth Rate)
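To make the arithmetic concrete, here is a small sketch of the adjustments above. All numbers are invented for illustration (a baseline of 1,000 rentals, subscribers growing from 10,000 to 11,000), and the function names are my own rather than anything RtR used.

```python
def seasonality_index(monthly_demand):
    """Each month's demand divided by the average monthly demand."""
    average = sum(monthly_demand) / len(monthly_demand)
    return [demand / average for demand in monthly_demand]


def subscriber_growth_rate(current_subscribers, predicted_subscribers):
    """(Predicted Subscribers - Current Subscribers) / Current Subscribers."""
    return (predicted_subscribers - current_subscribers) / current_subscribers


def final_demand(predicted_demand, trend_factor, seasonality, growth_rate):
    """Predicted demand scaled by trend, seasonality and subscriber growth."""
    return predicted_demand * trend_factor * seasonality * (1 + growth_rate)


# Hypothetical history, January through December (average is 100 per month).
history = [90, 90, 90, 90, 100, 100, 100, 100, 100, 100, 100, 140]
december_index = seasonality_index(history)[11]       # 140 / 100 = 1.4
growth = subscriber_growth_rate(10_000, 11_000)       # 0.1, i.e. 10% growth

# 1000 * 1.2 * 1.4 * 1.1 is about 1848 rentals
print(round(final_demand(1000, 1.2, december_index, growth)))
```

Because the factors multiply, an error in any one of them scales the whole estimate, so it is worth sanity checking each factor on its own before trusting the product.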

Product Assortment

While short black cocktail dresses in size 4 may be profitable and very popular in the spring, we probably wouldn't want to buy only this category of inventory. So we would probably want to think about the best mix of inventory categories, styles and designers that will be popular with users and are complementary.

[Figure: Rent the Runway offered a variety of fashion and accessories.]

Better candidates talked about some of the following:

  • Utilization Metric - What percentage of days in a month is an item rented (and not sitting in the warehouse)? If an item is out on rental for 10 days in a 30-day month, it has a utilization rate of 10/30 = 33%.
  • Meeting Customer Expectations - The inventory needs to be attractive to prospective customers. This means we need to carry sizes that fit and styles that support various fashion needs (everyday work, formal events, casual, etc.).
  • Depth and Replenishment - We may choose to buy new styles or replenish previous styles from designers. We also need to make decisions about the sizes we want to carry for certain products.
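The utilization metric in the list above can be sketched in a few lines. This assumes utilization means days out on rental divided by days in the period, and that each rental is recorded as an inclusive (start date, end date) pair; both the data layout and the function names are hypothetical.

```python
from datetime import date, timedelta


def days_rented_in_period(rentals, period_start, period_end):
    """Count distinct days within the period on which the item was rented.

    `rentals` is a list of (start, end) dates, inclusive on both ends.
    Days outside the period are ignored and overlapping rentals are
    only counted once.
    """
    rented_days = set()
    for start, end in rentals:
        day = max(start, period_start)
        last = min(end, period_end)
        while day <= last:
            rented_days.add(day)
            day += timedelta(days=1)
    return len(rented_days)


def utilization_rate(rentals, period_start, period_end):
    """Share of days in the period that the item was out on rental."""
    total_days = (period_end - period_start).days + 1
    return days_rented_in_period(rentals, period_start, period_end) / total_days


# Made-up rental history for one unit during April (a 30-day month).
april_start, april_end = date(2019, 4, 1), date(2019, 4, 30)
rentals = [
    (date(2019, 4, 3), date(2019, 4, 6)),    # 4 days
    (date(2019, 4, 12), date(2019, 4, 17)),  # 6 days
]
print(f"{utilization_rate(rentals, april_start, april_end):.0%}")  # 10/30 = 33%
```

In practice one would also have to decide whether shipping and cleaning days count as "rented" or as idle time, which changes the number considerably.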

Some candidates would go on to discuss how they could model a style level portfolio. Some would talk about the relationship between demand and personalized recommendations to customers. Some even talked about segmenting customers into target groups (based on size, geography, fashion persona, etc.) and discussed how RtR needed to have enough desirable inventory to attract + keep users.

Discussions around Unit Economics:

Since the second part of the question is "At what price?", usually there were some discussions about item level profitability and about how to spend our inventory budget.

Having multiple offerings required RtR to consider the profitability of each item in their inventory for each use case. The way we would calculate the revenue associated with an item from a one-time rental for 4 days would be different than one of the items that was rented as part of a subscription for the same 4 days.

One-time rental (simple model):
I would generally start with asking candidates to talk through a methodology to calculate profitability for items rented exclusively via the one-time rental business.

Successful candidates would arrive at some version of this formula:

  • unit revenue = (number of rentals * average revenue per rental of that unit) + final sale revenue
  • unit costs = initial cost of the item + (number of rentals * average incremental logistics costs of each rental)
  • unit profitability = unit revenue - unit costs
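The three formulas above translate directly into code. The sketch below uses invented figures (a $300 item rented 20 times) purely for illustration. The break-even helper is my own addition rather than part of the interview answer; it assumes the per-rental margin is constant and that the final sale revenue is known in advance.

```python
import math
from dataclasses import dataclass


@dataclass
class UnitEconomics:
    """Lifetime economics of one unit rented only via one-time rentals."""
    initial_cost: float
    number_of_rentals: int
    avg_revenue_per_rental: float
    avg_logistics_cost_per_rental: float
    final_sale_revenue: float = 0.0

    @property
    def revenue(self):
        # (number of rentals * average revenue per rental) + final sale revenue
        return self.number_of_rentals * self.avg_revenue_per_rental + self.final_sale_revenue

    @property
    def costs(self):
        # initial cost + (number of rentals * average incremental logistics cost)
        return self.initial_cost + self.number_of_rentals * self.avg_logistics_cost_per_rental

    @property
    def profitability(self):
        return self.revenue - self.costs

    def break_even_rentals(self):
        """Rentals needed before the unit pays for itself (hypothetical helper).

        Returns None when each rental loses money, since no number of
        rentals would ever recover the initial cost.
        """
        margin = self.avg_revenue_per_rental - self.avg_logistics_cost_per_rental
        if margin <= 0:
            return None
        remaining = self.initial_cost - self.final_sale_revenue
        return max(0, math.ceil(remaining / margin))


# Made-up numbers for a single dress.
dress = UnitEconomics(
    initial_cost=300.0,
    number_of_rentals=20,
    avg_revenue_per_rental=40.0,
    avg_logistics_cost_per_rental=15.0,
    final_sale_revenue=50.0,
)
print(dress.revenue, dress.costs, dress.profitability)  # 850.0 600.0 250.0
print(dress.break_even_rentals())                       # 10
```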

RtR offered customers a “free” backup size with each rental to minimize issues when the primary size choice did not fit. When we added this context, candidates would need to consider how to attribute revenue and costs across the multiple items in an order.

Subscription:
If the candidate did well with the simple model, I would introduce the subscription model. This would require candidates to think about how to attribute the revenue contribution for items rented via the subscription model.

Let's illustrate a scenario for subscription:
- Monthly subscription cost: $149
- Number of slots per month: 4
- Total days in the month: 31

Let's say a Ralph Lauren dress (item sku 123) was rented for 8 days in this 31-day subscription period, how would you attribute the revenue for this item?

For our example, we track the number of days the user had an item in each slot during the 31-day period:
- slot 1 = 12 days with items
- slot 2 = 14 days with items
- slot 3 = 22 days with items
- slot 4 = 31 days with items

At the end of the month, we can count up the total number of days items were rented across all 4 slots (12 + 14 + 22 + 31 = 79). Then we can divide the monthly subscription cost by the number of days with items (average daily item revenue attribution = $149 / 79 = $1.89).

To get the revenue for this item from this subscription, we can multiply the number of days the item was rented (8) by the average daily item revenue attribution ($1.89) to get the attributed revenue for the item ($15.12).

We then subtract the blended average unit costs ($5.02 for prorated shipping, repairs, etc.) of a subscription item to get the profitability for the item as part of this subscription ($15.12 - $5.02 = $10.10).

We can calculate the lifetime item profitability of subscriptions by adding up all of the item's profitability from each subscription rental.
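The subscription walkthrough above can be reproduced with a short script. This is a sketch of the attribution described in the example, not RtR's production logic, and the function names are my own. It uses `Decimal` and rounds the daily rate to the cent before multiplying, which is what the worked example does.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def daily_item_rate(monthly_fee, slot_days_with_items):
    """Monthly fee divided by total item-days across all slots, rounded to cents."""
    total_item_days = sum(slot_days_with_items)
    if total_item_days == 0:
        raise ValueError("subscriber had no items during the period")
    return (monthly_fee / total_item_days).quantize(CENT, rounding=ROUND_HALF_UP)


def item_profit(monthly_fee, slot_days_with_items, item_days_rented, blended_unit_cost):
    """Return (attributed revenue, profit) for one item in one subscription period."""
    revenue = daily_item_rate(monthly_fee, slot_days_with_items) * item_days_rented
    return revenue, revenue - blended_unit_cost


def lifetime_profit(profits):
    """Sum an item's profit across every subscription rental it was part of."""
    return sum(profits, Decimal("0"))


# Numbers from the example: $149 plan, 4 slots, item sku 123 rented for 8 days.
slot_days = [12, 14, 22, 31]                      # 79 item-days in total
rate = daily_item_rate(Decimal("149"), slot_days)
revenue, profit = item_profit(Decimal("149"), slot_days, 8, Decimal("5.02"))
print(rate, revenue, profit)  # 1.89 15.12 10.10
```

Rounding the rate first is a simplification: multiplying the unrounded rate (149 / 79) by 8 days gives about $15.09 rather than $15.12, so a real system would need to pick one convention and apply it consistently.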

Inventory Acquisition Strategies:
A few candidates brought up that there may be different ways to acquire inventory. RtR had several avenues for acquiring inventory that affected the economics and inventory decisions:

  • Inventory Purchase: RtR would buy items from designers at wholesale prices. This would require a significant capital outlay and would require a longer time to recoup the investment.
  • Private Label: This is where RtR would design and partner to manufacture their own branded items. It provided a potentially beneficial cost structure and timelines vs. traditional inventory purchase. It was also an opportunity to strategically fill some demand areas in the catalog.
  • Revenue Share: In 2018, my team helped introduce a program where RtR would pay designers a portion of the wholesale cost up front and then share the rental revenue. During the pandemic, a larger percentage of procurement transitioned to this type of agreement.

Discussions around Operational Concerns:

RtR's warehouse was a pretty impressive and complex operation. In addition to more traditional logistics functions (picking, packing, putaway), they had unique processes and workflows for dry cleaning, stain removal and clothing repairs.

[Figure: Racks of clothes hang in Rent the Runway's distribution center in New Jersey.]

Very few candidates brought up operational aspects that may influence decisions around inventory, but here are some very important considerations for RtR's various teams:

Durability
These items need to last and be available for many rentals (sometimes >30). Items with delicate materials or that wore down quickly were often eliminated during buying cycles. The cost of repairs and replacements can be significant.

Shipping & Delivery
Were the items cost effective for shipping to users? Some larger items, footwear and accessories did not fit well in the standard Rent the Runway shipping bags.

Cleaning
RtR has the biggest dry cleaner facility in North America. While the cleaning expertise is likely unparalleled, it is not always efficient to have multiple rounds for cleaning certain types of materials. Also some items tend to have more frequent challenges when trying to treat stains.

Logistics
RtR implemented automation around their warehouses for moving/tracking inventory (via scanning bar codes and later RFID), and not all types of inventory are well suited for the automation requirements. In addition, larger and non-standard items can cause delays in both inbound and outbound warehouse process flows.

General Interview Aspects:

In addition to the mechanics of trying to answer the question, I am also looking for the following behavior I have seen in effective data scientists:

Framing the problem
It is interesting how candidates begin their attempt to come up with their responses. Some specific things I look for:

  • Do they try to define what would constitute success?
  • Do they ask for context and/or look to validate any assumptions, or do they immediately dive into providing a solution?
  • Do they spend time explaining their thought process and/or how they can associate this problem with others they have seen?
  • Do they try to break the hard problem into digestible parts?

It's worth noting that there were quite a few candidates that immediately tried to identify the machine learning models they would try without really understanding the scope of the problem. Once I provided feedback about their solution, some folks continued with that incorrect path while others adjusted their approach.

Communication Style
Assuming I am their stakeholder for this problem, how do they interact with me? I am looking for candidates to be able to explain their thoughts clearly and concisely. How do they react when the parameters of the problem change or their explanations are challenged? I am looking for cultural red flags in their responses, as I am inclined to hire gritty folks over brilliant/condescending jerks. Do they use visual cues effectively to tell a story and/or express their points (like drawing things on a whiteboard)?

Questions I get asked
I generally try to leave the last 10-15 minutes of the interview for the candidate to ask me questions. I am looking for candidates to ask questions that show they are interested in the company, the teams, the problem space and most importantly learning.

The most memorable candidates asked detailed questions that surprised me or led me to learn new things. Some led to interesting discussions about hobbies, career aspirations, side projects, team dynamics, management styles and learnings from being a parent.


Other parts of the interview:

While I like to think that the session with my question was always a highlight, the RtR data interview process generally also included:

  • Phone screen: discuss the role + company + team + opportunities and the candidate's interest + background.
  • Technical exercise: either a take home project or 45-60 minute in person pair programming session. Example exercises were analysis of a customer data set, building data pipelines + data models or coding a recommendation engine.
  • Cultural + team fit: candidates meet with 1-2 stakeholders (often product managers) to talk about how they would interact within project team scenarios.
  • Project deep dive: more senior level candidates would present a previous data project to (roughly 10-20) data team members and stakeholders - sharing the business problem + implementation approach + results.