Measuring, Understanding, and Predicting Subscriber Value

16. June 2014 23:53 by Jay Grossman

In the subscription business model, a customer pays a recurring subscription price for access to a product or service. Below are some examples of popular commercial subscription offerings:

  • has built a popular brand offering CRM functions.
  • Amazon, Microsoft, etc. offer hosting infrastructure. 
  • WeightWatchers and eDiets offer access to their diet information, tools, and community.
  • Getty offers download of stock photography.
  • Netflix and Hulu offer access to view streamed movies.
  • Odeo and offers access to reference data. 

It is becoming more common for people to subscribe to products and services rather than buy them outright. I am such a big fan of the model for web/software-based applications that I implemented it successfully in 2002 on SportsCollectors.Net.

The obvious differentiator for a subscription business vs. selling a SKU’d product is the potential for recurring revenue. Most subscription businesses are engineered to make it easy for subscribers to keep paying for access to a valuable product. This eliminates the need for additional acquisition costs or sales cycles on each renewal.

Measuring Subscription Models

As with any business, I am usually interested in understanding its profitability and growth. In the case of subscription models, I look at the profits and growth at both a company/product level and at a user level. I monitor the following metrics to help evaluate performance:

  1. Recurring Revenue (RR) for time period
    How much subscription-based revenue the business generates.
  2. Retention
    Length of time that users are willing to pay the subscription fees.
  3. Total Lifetime Value (TLV)
    All Revenue opportunities and estimated value associated with the user’s subscription. I break this down into 4 categories:
    •  Subscription Income
    •  Product/Affiliate Sales 
    •  Residual Income (advertising, data/content licensing, etc.)
    •  Brand Enhancement (subscribers can become product evangelists)
  4. Customer Acquisition Cost (CAC)
    Costs (marketing, sales, etc.) associated with onboarding the user.
  5. Operational Costs 
    Non-growth spend, the costs of operating the business. This includes hosting infrastructure, developers/designers, license fees, salaries, cost of goods sold, general/administrative, R&D, etc.

Measuring a Subscription Company/Product

To illustrate, let’s say our company offers a subscription product for $10/month. We currently have 500 active subscribers. Each month we average 55 new subscribers and lose 40, and subscriptions last 5 months on average. We estimate an average of $5 of lifetime value per user from non-subscription income. The CAC is $8 per user and our operations cost $1000 per month.

Profitability is the most obvious thing we can look at. Most people would agree with the definition of profit = revenues – cost. 

Month’s subscription income = (500 subscribers * $10 subscription price) = $5000
Month’s non-subscription income = ($5 non-subscription income / 5 months average length of subscription * 500 subscribers) = $500
Month’s acquisition costs = ($8 CAC * 55 new subscribers) = $440
Operations cost = $1000

Monthly Profit = Month’s subscription income + Month’s non-subscription income - Month’s acquisition costs - Operations cost
Monthly Profit = $5000 + $500 - $440 - $1000 = $4060
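
The arithmetic above can be expressed as a small Python function. This is a minimal sketch: the function and parameter names are illustrative and not from the original post, and the inputs are simply the figures from the example.

```python
def monthly_profit(subscribers, price, non_sub_ltv, avg_months,
                   cac, new_subscribers, operations_cost):
    """Monthly profit = revenues - costs, following the breakdown in the example."""
    subscription_income = subscribers * price
    # Spread each user's non-subscription lifetime value over the
    # average subscription length to get a monthly figure.
    non_subscription_income = non_sub_ltv / avg_months * subscribers
    acquisition_cost = cac * new_subscribers
    return (subscription_income + non_subscription_income
            - acquisition_cost - operations_cost)


profit = monthly_profit(subscribers=500, price=10, non_sub_ltv=5, avg_months=5,
                        cac=8, new_subscribers=55, operations_cost=1000)
print(profit)  # 4060.0
```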

Then I like to look at growth measures. How many people we are bringing in and how many are staying can tell us a lot.

New Subscriber Ratio = New Subscribers / All Subscribers
New Subscriber Ratio = 55 / 500 = 11% 

Monthly Recurring Revenue Growth = (Current Month’s Recurring Revenue - Previous Month’s Recurring Revenue) / Previous Month’s Recurring Revenue
Monthly Recurring Revenue Growth = ($5000 - $4850) / $4850 = 3.09%
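
As a quick check of the two growth measures, here is a short Python sketch. The function names are illustrative, and the previous month's subscriber count (485) is derived from the example's figures (500 now, after gaining 55 and losing 40).

```python
def new_subscriber_ratio(new_subscribers, all_subscribers):
    """Share of the current subscriber base that joined this month."""
    return new_subscribers / all_subscribers


def recurring_revenue_growth(current_rr, previous_rr):
    """Month-over-month growth in recurring revenue, as a fraction."""
    return (current_rr - previous_rr) / previous_rr


price = 10
current_subscribers = 500
# Last month's count: undo this month's 55 additions and 40 losses.
previous_subscribers = current_subscribers - 55 + 40  # 485

ratio = new_subscriber_ratio(55, current_subscribers)
growth = recurring_revenue_growth(current_subscribers * price,
                                  previous_subscribers * price)
print(f"{ratio:.2%}")   # 11.00%
print(f"{growth:.2%}")  # 3.09%
```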

Adding more active subscribers than you lose is almost always a good thing, but I am interested in whether it is cost effective. Growth Efficiency measures how much customer lifetime value you acquire for each $1 of cost:

Growth Efficiency = TLV for average user / (CAC for average user + Operational Costs for average user)
Growth Efficiency = ((5 month subscription * $10 subscription fee) + $5 non subscription income) / ($8 CAC + ($1000 operations cost/500 subscribers * 5 month subscription))
Growth Efficiency = $55 / ($8 + $10) = 3.06

In businesses that scale well, the Operations Cost per user declines as it is spread across more subscribers. Hence growth in subscribers will result in greater profitability and growth efficiency.
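
The Growth Efficiency calculation and the scaling effect can be sketched in Python as follows. The function name and parameters are illustrative, and the loop rests on a simplifying assumption that is not in the original post: operations cost stays fixed at $1000/month while the subscriber count changes.

```python
def growth_efficiency(months, price, non_sub_income, cac,
                      monthly_ops_cost, subscribers):
    """TLV of an average user divided by that user's CAC plus share of operations cost."""
    tlv = months * price + non_sub_income
    ops_cost_per_user = monthly_ops_cost / subscribers * months
    return tlv / (cac + ops_cost_per_user)


# The example's figures: $55 / ($8 + $10)
print(round(growth_efficiency(5, 10, 5, 8, 1000, 500), 2))  # 3.06

# Assumption: operations cost stays at $1000/month as the subscriber base grows.
for n in (250, 500, 1000, 2000):
    print(n, round(growth_efficiency(5, 10, 5, 8, 1000, n), 2))
```

Under that assumption, the per-user share of operations cost shrinks as subscribers are added, so the ratio rises with each step.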

Measuring a Subscriber

Let’s assume SubscriberX is a fairly typical member of the population: he subscribes to the service for 6 months, and we invested $7 in CAC to bring him onboard.

The revenues for the SubscriberX are represented by the TLV : (6 month subscription * $10 subscription fee) + $5 non subscription income = $65

The estimated operations cost per user is calculated by taking the overall operations cost, dividing by the number of subscribers, and multiplying by the number of months the user subscribes to the service. Combining operations and customer acquisition costs gives us: $7 CAC + ($1000 operations cost/500 subscribers * 6 month subscription) = $19.

SubscriberX represents $46 ($65 TLV - $19 cost) of lifetime profitability.

Growth Efficiency = TLV for SubscriberX / (CAC for SubscriberX + Operational Costs for SubscriberX)
Growth Efficiency = ((6 month subscription * $10 subscription fee) + $5 non subscription income) / ($7 CAC + ($1000 operations cost/500 subscribers * 6 month subscription))
Growth Efficiency = $65 / ($7 + $12) = 3.42
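
The same per-subscriber figures can be modeled with a small Python data class. This is only a sketch: the `Subscriber` class and its method names are hypothetical, chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Subscriber:
    months: int                   # length of the subscription
    cac: float                    # acquisition cost for this subscriber
    price: float = 10.0           # monthly subscription fee
    non_sub_income: float = 5.0   # lifetime non-subscription income

    def tlv(self):
        return self.months * self.price + self.non_sub_income

    def cost(self, monthly_ops_cost, subscribers):
        # CAC plus this subscriber's share of operations cost over the subscription.
        return self.cac + monthly_ops_cost / subscribers * self.months

    def lifetime_profit(self, monthly_ops_cost, subscribers):
        return self.tlv() - self.cost(monthly_ops_cost, subscribers)

    def growth_efficiency(self, monthly_ops_cost, subscribers):
        return self.tlv() / self.cost(monthly_ops_cost, subscribers)


subscriber_x = Subscriber(months=6, cac=7.0)
print(subscriber_x.tlv())                                   # 65.0
print(subscriber_x.lifetime_profit(1000, 500))              # 46.0
print(round(subscriber_x.growth_efficiency(1000, 500), 2))  # 3.42
```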

Understanding User Value

Just because we can measure the LTV (the TLV described above) of a subscriber does not necessarily mean we understand it. It’s important to understand why some subscribers have higher LTV than others. This will allow us to identify actions we can take to attempt to increase an individual subscriber’s LTV.

Although each subscription business is unique and may have different variables describing users’ options and actions, we can take machine learning approaches to gain insight. We should hopefully be able to define the key variables that go into solving for LTV as the target feature.

I usually start by getting a high-level understanding of the distribution of the population’s LTV. By computing summary statistics and generating histograms for LTV and log(LTV), I get the shape of the distribution (whether it is approximately normal) and the size of the standard deviation. I then do the same for the components that comprise LTV (Subscription Income, Product/Affiliate Sales, Residual Income, Brand Enhancement).
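
A first look at the distribution could be sketched like this, using only the Python standard library. The LTV values here are synthetic (drawn from a log-normal distribution purely for illustration), and the helper names are hypothetical. In practice the list would come from the subscriber data set.

```python
import math
import random
import statistics


def summarize(values):
    """Basic summary statistics for a list of numbers."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


def histogram(values, bins=10):
    """Return (left_edge, bin_width, counts) for equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value sits on the right edge, so clamp it into the last bin.
        counts[min(int((v - lo) / width), bins - 1)] += 1
    return lo, width, counts


def print_histogram(values, bins=10, bar_width=40):
    lo, width, counts = histogram(values, bins)
    peak = max(counts)
    for i, c in enumerate(counts):
        print(f"{lo + i * width:8.2f} | {'#' * round(c / peak * bar_width)} {c}")


# Synthetic, right-skewed LTV values (an assumption for this sketch only).
random.seed(42)
ltv = [random.lognormvariate(4.0, 0.6) for _ in range(1000)]
log_ltv = [math.log(v) for v in ltv]

for label, data in (("LTV", ltv), ("log(LTV)", log_ltv)):
    print(label, {k: round(v, 2) for k, v in summarize(data).items()})
    print_histogram(data)
```

With skewed data like this, the raw LTV histogram has a long right tail, while the log(LTV) histogram looks roughly symmetric, which is why it is worth inspecting both.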

Then I can use the identified variables and create an optimized decision tree solving for LTV. The partitions of the data identify the most important features and the decision points.
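
One way to sketch this step is with scikit-learn, tuning the tree depth by cross-validation. The library choice, the feature names, and the synthetic data (in which logging in 5+ days a week drives LTV) are all assumptions made for illustration, not details from the original analysis.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeRegressor, export_text

# Synthetic subscriber data with hypothetical features.
rng = np.random.default_rng(0)
n = 400
feature_names = ["login_days_per_week", "logins_per_day", "support_tickets"]
login_days = rng.integers(0, 8, n)         # 0..7 days per week
logins_per_day = rng.uniform(0.5, 4.0, n)
support_tickets = rng.integers(0, 5, n)    # unrelated to LTV in this toy data
X = np.column_stack([login_days, logins_per_day, support_tickets])
ltv = 20 + 30 * (login_days >= 5) + 3 * logins_per_day + rng.normal(0, 3, n)

# "Optimized" here means choosing max_depth by 5-fold cross-validation.
search = GridSearchCV(
    DecisionTreeRegressor(min_samples_leaf=20, random_state=0),
    param_grid={"max_depth": [2, 3, 4, 5]},
    cv=5,
)
search.fit(X, ltv)
tree = search.best_estimator_

# The printed splits are the decision points; importances rank the features.
print(export_text(tree, feature_names=feature_names))
for name, importance in zip(feature_names, tree.feature_importances_):
    print(f"{name}: {importance:.3f}")
```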

I'm not just interested in solving for LTV; I also want a way to estimate the relationships among variables. So I'll use regression analysis with backward elimination to find the set of variables that best explains the variance (highest r-squared) with the lowest error.
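
A minimal sketch of backward elimination with NumPy follows. Two caveats: it uses adjusted r-squared as the stopping criterion (plain r-squared can never improve when a variable is dropped, so it cannot drive elimination on its own), and the data and feature names are synthetic and hypothetical.

```python
import numpy as np


def adjusted_r2(X, y):
    """Fit ordinary least squares with an intercept and return adjusted r-squared."""
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    r2 = 1 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def backward_elimination(X, y, names):
    """Drop one variable at a time while doing so does not lower adjusted r-squared."""
    keep = list(range(X.shape[1]))
    best = adjusted_r2(X[:, keep], y)
    while len(keep) > 1:
        # Score every candidate model that omits exactly one remaining variable.
        scores = [(adjusted_r2(X[:, [j for j in keep if j != d]], y), d)
                  for d in keep]
        score, drop = max(scores)
        if score < best:
            break
        best = score
        keep.remove(drop)
    return [names[j] for j in keep], best


# Synthetic data: two informative features and two pure-noise features.
rng = np.random.default_rng(0)
n = 300
names = ["login_days", "logins_per_day", "noise_a", "noise_b"]
X = np.column_stack([
    rng.integers(0, 8, n),
    rng.uniform(0.5, 4.0, n),
    rng.normal(size=n),
    rng.normal(size=n),
])
y = 20 + 6 * X[:, 0] + 4 * X[:, 1] + rng.normal(0, 5, n)

kept, score = backward_elimination(X, y, names)
print(kept, round(score, 3))
```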

For example, on one site I noticed that members who log into the site 5+ days a week and average 2+ logins per day are the most likely to have the highest LTV. The guidance provided was to take steps to improve user engagement and give users incentives to return regularly. Enhanced alerting features were added, leading to increases in subscriber visitation and LTV.

Predicting User Value

Making predictions is another common machine learning task. Since the structure of our data is defined (we know the target feature and the key variables), there are quite a few supervised learning options that can be used to make predictions (linear regression, CART, support vector machine, K-Nearest Neighbors, etc.).

Predicting LTV

I have found it to be valuable to predict the LTV at different time intervals. For instance, I want to predict the LTV of a subscriber based on the first month’s activity. Then the second month, third month, sixth month, etc.

It can open up interesting possibilities such as:

  • Better understanding behavior as it relates to changes in LTV
  • Categorizing members to make recommendations/introductions
  • Identifying candidates where early intervention methods may increase LTV or retention

I set up templates for a number of predictive algorithms and feed each the subscriber data set. To avoid overfitting, cross-validation is implemented as part of each algorithm. I most often use the one returning the lowest RMSE, or a blend of the most accurate ones.
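
The "templates" idea might look something like the following scikit-learn sketch, which scores several regressors by cross-validated RMSE and picks the lowest. The library, the hyperparameters, the feature names, and the synthetic first-month activity data are all assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic first-month activity and eventual LTV (hypothetical features).
rng = np.random.default_rng(1)
n = 300
login_days = rng.integers(0, 8, n)
logins_per_day = rng.uniform(0.5, 4.0, n)
X = np.column_stack([login_days, logins_per_day])
ltv = 10 + 8 * login_days + 4 * logins_per_day + rng.normal(0, 5, n)

# One "template" per algorithm; distance-based models get feature scaling.
models = {
    "linear regression": LinearRegression(),
    "CART": DecisionTreeRegressor(max_depth=4, random_state=0),
    "k-NN": make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=10)),
    "SVM": make_pipeline(StandardScaler(), SVR(C=100.0)),
}

cv = KFold(n_splits=5, shuffle=True, random_state=0)
rmse = {}
for name, model in models.items():
    scores = cross_val_score(model, X, ltv, cv=cv,
                             scoring="neg_root_mean_squared_error")
    rmse[name] = -scores.mean()  # scikit-learn reports RMSE negated

best = min(rmse, key=rmse.get)
for name, value in sorted(rmse.items(), key=lambda kv: kv[1]):
    print(f"{name}: RMSE {value:.2f}")
print("selected:", best)
```

Running the same loop on features observed after one, two, three, and six months would give one selected model per time interval.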

Predicting Retention

Subscription sites generally want the highest retention possible, as it translates to greater LTV and profits. Another metric I look to gather is the probability that the member will continue to pay for the service during the upcoming billing period. 

This becomes a binary classification problem: predicting whether a subscriber will be retained for the following month. The model considers variables related to each subscriber’s lifetime activities and current month’s activities. My modeling uses a combination of logistic regression, k-Nearest Neighbors, and CART, again selecting by the lowest RMSE.
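
A retention model comparison along these lines could be sketched with scikit-learn as below. Each classifier produces a cross-validated probability of retention, and RMSE is computed between that probability and the actual 0/1 outcome. The data is synthetic, and the feature names and hyperparameters are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic data: lifetime months subscribed and login days in the current month's typical week.
rng = np.random.default_rng(2)
n = 600
lifetime_months = rng.integers(1, 25, n)
login_days = rng.integers(0, 8, n)
X = np.column_stack([lifetime_months, login_days])
p_retain = 1 / (1 + np.exp(-(-2.0 + 0.6 * login_days)))
retained = (rng.random(n) < p_retain).astype(int)  # 1 = paid for the next month

models = {
    "logistic regression": make_pipeline(StandardScaler(),
                                         LogisticRegression(max_iter=1000)),
    "k-NN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=25)),
    "CART": DecisionTreeClassifier(max_depth=3, random_state=0),
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
rmse = {}
for name, model in models.items():
    # Out-of-fold probability of class 1 (retained) for every subscriber.
    proba = cross_val_predict(model, X, retained, cv=cv,
                              method="predict_proba")[:, 1]
    rmse[name] = float(np.sqrt(np.mean((proba - retained) ** 2)))

best = min(rmse, key=rmse.get)
for name, value in sorted(rmse.items(), key=lambda kv: kv[1]):
    print(f"{name}: RMSE {value:.3f}")
print("selected:", best)
```

For reference, always predicting a probability of 0.5 yields an RMSE of exactly 0.5, so a useful model should score below that.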


The framework described in this post has allowed me to gain a better understanding of my subscription business and has guided my activities in the following areas:

  • Product Development
  • Marketing
  • Customer Service

Creating a representation for the Brand Enhancement component of LTV is usually not a straightforward exercise. I have attributed value for:

  • subscribers (or those that have web sites) that refer traffic
  • moderators and admins
  • subscribers that host activities/events

About the author

Jay Grossman

techie / entrepreneur that enjoys:
 1) my kids + awesome wife
 2) building software projects/products
 3) digging for gold in data sets
 4) my various day jobs
 5) rooting for my Boston sports teams:
    New England Patriots, Boston Red Sox, Boston Celtics, Boston Bruins
