Tuesday, May 29, 2012

The New Competitive Front: Time

Does speed pay?

Speed dating, speed pitching and speed networking all work on the same principle:  First matters.  

In the digital age the time it takes to plan and execute is compressing radically as previously distinct processes are smoothed over with the goal of shrinking time.  A recent Boston Consulting Group article discusses how consumer goods firms (FMCG) are competing on time.  

From the article, best in class firms are substantially quicker at getting things to market:

The secrets are a common goal, no functional boundaries and no politics; this in turn contributes to revenue growth, market positioning and transformation. 

At the extreme, the decisions are automated.   Digital advertising, through real-time bidding, targeting and tracking, has already shown the value of eliminating time.  Now new research from Nielsen Catalina Solutions shows that integrating touch points with a consumer's transaction history works even better.   Here is the payback of directly integrating what people buy with the media buy.



The implication is that there are no channels, there are no demos.  There are only needs to satisfy.

Must go faster....
 



Cyborg Anthropology and Marketing

What will we attach next?

At TED Women in late 2010, Amber Case made the point that we are all cyborgs because we're using technology to create or adapt to new environments. The ideas behind the cell phone glued to our ear and a pair of Google Glasses on our head are no different than what the science fiction writers dreamed up decades ago.  However, the key point is that technology isn't used to modify our physical self to be stronger, faster, or immortal - but rather for the first time it extends our mental self across time and space thus making us more human.  

This brings us new challenges:
  • We aren't slowing down, so we have no time for reflection and as a result are losing the ability to create 'self.'
  • We are reinforcing a dependency on technology to remember for us where everything and anything is. 
  • There are now parallel and simultaneous timelines as each device, screen, and application supports varied asynchronous schedules.

So, how can marketers help people deal with alternative selves, panic architecture, and ambient intimacy?
  • Guide us thru this digital adolescence that is defined by competency and understanding rather than age.
  • Facilitate introductions and methods that make us more connected to those with shared interests.  
  • Serve only relevant and interesting content that satisfies our shared needs.
If the digital world creates wormholes that allow consumers to bend time and space to connect then marketing must drop the linear notion of a funnel.

Thursday, May 24, 2012

Risk, Complexity, and Marketing

Why is analysis so misunderstood?

Marketing's goal remains the same: align solutions and needs to everyone's benefit.  A recent McKinsey article described that reality as one of the most complex and riskiest business functions.   Complex because we have to succeed as the tools used to communicate evolve and fragment while at the same time messaging shifts from paid placement to self-expression.  Risky because we're dealing with human emotions and behavior while at the same time we're trying to expand or create new markets as needs shift. 

Analysis, in its simplest form, answers the question: Why are things the way they are?   For businesses, it often starts with the understanding of why we might be deviating from expectations.  Our forecasts are always wrong, what is interesting is why.  Analysis turns the interesting into the useful.

We deal with risk and complexity by one of three means:  ignoring it and forging ahead with a vision, thus changing the world; sticking our heads in the sand and letting the world pass us by; or trying to minimize the associated uncertainty and adapting accordingly.  In fact one can think of analytics as a means of quantifying uncertainty in order to reduce risk to the business. 

Analysis, like marketing, comes in many flavors.   A recent post in Forbes outlined three different uses of analysis:
  • Evaluative:  the review of marketing execution, including ROI, attribution and a host of other 'devil in the details' functions.  By far the most common use of analysis.
  • Instrumental: the development of insights that guide strategy development and implementation.  These analyses focus on understanding why we might be successful, not how we will get there.
  • Conceptual: the challenging (and changing) of pre-conceived notions and assumptions about a market.   Since the mind is the most difficult thing to change and the impact is likely longer term this class has the potential to affect total business performance.

We need to do more of all of the above. 

Just like a brand is not a logo, analysis is not a computation.




Tuesday, May 22, 2012

Methods in Customer Lifetime Value

What is the future value of each customer?

Sales and Marketing needs to make choices about where to invest: whom to encourage or discourage, whom to offer premium support to, and a myriad of other decisions.   The scarcity of resources or simple misalignment means that not every customer is necessarily a good one.   So, how do we define good?

In this context value is defined in terms of profitability – gross, net, or contribution – and not sales.  Total revenue masks too many costs to be useful in investment decisions.  The objective here is to explore ways that create differentiation or spread between customers so that different strategies and tactics can be implemented.

A customer’s lifetime value (CLV) is often used as the basis of discriminating between customers.    The premise is simple: profitable customers yield a profitable business.  While we may know what a customer has done for us lately, the important question is:  what will she do going forward?

Rather than dusting off the crystal ball, this series looks at three different categories for estimating the future value of a customer and describes a total of nine methods using customer behavior and interactions.  Each of the following three areas is described in a separate post.
  1. Time Series – the simplest, and often overlooked, type of forecasting
  2. Segment Migration – the idea that some people migrate from one band of profitability to another over time
  3. Predictive Models – the use of explanatory variables to identify the specific levers that drive value

I'll save the idea of predicting the Stanley Cup and NBA finals for the Ouija Board....

CLV Episode 1: Time Series

What can the passage of time tell us?

Working under the dual assumptions of “history repeats itself” and “we are creatures of habit,” time series approaches leverage the pattern of historic sales or transaction data to estimate the future.

With these methods you are driving down the road looking out the rear view mirror.  They are based on simple assumptions and make no attempt to explain why anyone is or is not profitable.  That said, these techniques might outperform more sophisticated models. 

Let’s imagine a scenario with sufficient historic information to estimate the profit of each customer for each month in the past two years.  That is, we know what he bought and what our cost to market, sell and service has been.

NPV
Given either average or most recent profitability, we can compute the present value of that stream of cash over the foreseeable future.     In its standard form it looks like this:

NPV = R1/(1+i)^1 + R2/(1+i)^2 + ... + RN/(1+i)^N

where
  • N is the number of time periods
  • R is the profit at time t
  • i is the discount rate that addresses uncertainty about the future.

An interesting twist is to vary both time and risk on a per customer basis.

Since customer lifetime value is predicated on transacting in the future we can adjust the current value accordingly.  In certain categories a person’s age is a surrogate answer to the question:  How long will she be a customer?   And if there are insights in terms of attrition, then the risk factor can be raised or lowered accordingly.

This approach works well when average or current profit itself isn't expected to change that much.  It also fits expected ideas about customers:  Young, active customers will be more profitable over the long run. 
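
To make the per-customer twist concrete, here is a minimal sketch. It assumes the standard discounted sum (profit divided by (1 + i)^t, added up over t = 1 to N) and a flat profit per period. The customer names, profit figures, horizons and discount rates are all made up for illustration.

```python
def customer_npv(profit_per_period, n_periods, discount_rate):
    """Present value of a flat profit stream: sum of R / (1 + i)^t for t = 1..N."""
    return sum(profit_per_period / (1.0 + discount_rate) ** t
               for t in range(1, n_periods + 1))

# Hypothetical customers: (average monthly profit, expected months remaining,
# monthly discount rate). Same profit today, different time and risk.
customers = {
    "young_active": (40.0, 60, 0.01),
    "older_active": (40.0, 24, 0.01),
    "at_risk": (40.0, 60, 0.03),
}

clv = {name: customer_npv(*inputs) for name, inputs in customers.items()}
for name, value in sorted(clv.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {value:,.2f}")
```

All three customers earn the same today; the spread comes entirely from how long we expect them to stay and how risky we think that is.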

Trend
Given sufficient history of profit, one could apply time series techniques to predict the trend going forward.  There are two approaches possible here using regression:
  • The simplest way would be to estimate profit as simply the passage of time in order to isolate the trend component.  
Yt = a + bt + et
where the value at time t is a baseline figure (a) plus the monthly change in a customer's profit (b) multiplied by the number of months elapsed (t), and et is the error.    Future profit is simply the baseline plus the number of months since the start of the measurement window times the rate of change.
  • A slight extension would be to estimate profit based on the previous profit levels themselves.  It looks like:
Yt = a + b1*Y(t-1) + b2*Y(t-2) + ... + et
where the value at time t is a weighted value of previous profits.   Future profit is based on estimating t+1, t+2, etc. up to the expected end of the relationship.   

As with NPV, both methods can create the desired spread by varying the length of time for each customer.

This approach fits well over the short-to-medium term when customer behavior, and thus profit, trends one-way or another over time.  However, at some point there is a bend in the road that we won’t see and this type of trending can produce surprises if used too far out into the future.
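
Both regression flavors can be sketched with nothing more than a straight-line fit. The following is an illustration only: the twelve months of profit are invented, the second model uses a single lag (last month's profit) for brevity, and `fit_line` is a helper written for this example rather than anything from a package.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b

# Hypothetical 12 months of profit for one customer.
profit = [20, 22, 21, 24, 26, 25, 28, 30, 29, 32, 33, 35]
months = list(range(1, len(profit) + 1))

# Approach 1: profit as a function of time, Yt = a + b*t.
a, b = fit_line(months, profit)
trend_forecast = [a + b * t for t in range(13, 19)]

# Approach 2: profit as a function of last month's profit, Yt = c + d*Y(t-1).
c, d = fit_line(profit[:-1], profit[1:])
ar_forecast = []
last = profit[-1]
for _ in range(6):
    last = c + d * last  # each estimate feeds the next one
    ar_forecast.append(last)

print([round(v, 1) for v in trend_forecast])
print([round(v, 1) for v in ar_forecast])
```

Changing the number of forecast months per customer is what creates the spread mentioned above.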

Depth of Repeat
“Once a customer, always a customer” is a big fat myth.   Consumers defect and do so at a surprisingly predictable rate.   We can take advantage of this fact and develop a depth of repeat model that estimates the number of purchases over a given time span.

The data requirements are slightly different than above and focus on recency and frequency type metrics.   For each customer we need to answer the following questions:
  • How many repeat transactions did they make?
  • How long did they have to make those transactions?
  • How long since their last transaction?
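
As a rough sketch, those three answers can be pulled straight from a transaction log. The customer ids, dates and end of the observation period below are hypothetical, and time is measured in days purely for convenience.

```python
from datetime import date

def summarize(purchase_dates, period_end):
    """Return (repeat transactions, days available to repeat, days since last transaction)."""
    dates = sorted(purchase_dates)
    first, last = dates[0], dates[-1]
    repeats = len(dates) - 1  # the first purchase is not a repeat
    days_available = (period_end - first).days
    days_since_last = (period_end - last).days
    return repeats, days_available, days_since_last

# Hypothetical transaction log: customer id -> purchase dates.
log = {
    "A": [date(2011, 1, 10), date(2011, 3, 2), date(2011, 11, 20)],
    "B": [date(2011, 6, 1)],
}
period_end = date(2011, 12, 31)

summary = {cid: summarize(dates, period_end) for cid, dates in log.items()}
print(summary)
```
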
Rather than the regression or econometric approach typically used by marketing analysts, this method uses the distribution of past behavior to estimate the probability of a customer doing something.   To provide a basis of discussion, we’ll start with a picture of the results first.   The following comes from a published analysis of repeat purchases for CDNow and it shows good fit with the number of future transactions. 


And since this is a more technical piece, here’s how it looks to the analyst (and yes it can be implemented in Excel).

The above simply turns things marketers believe into useful planning information.  The probability that a customer will make a certain number of transactions given her history is based on two key assumptions.
  • The number of transactions a customer makes varies around their historic average.
  • Customers may stop being customers after a transaction and thus drop out never to be seen again.
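
To get a feel for what those two assumptions imply, here is a small simulation of a single customer type. It is only a sketch: the purchase rate, dropout probability and 12-period horizon are invented, and every simulated customer shares the same rate and dropout odds, whereas the fitted model lets both vary from customer to customer.

```python
import random

def simulate_repeats(rate, dropout_prob, horizon, rng):
    """Count repeat purchases within `horizon` periods for one simulated customer."""
    t, repeats = 0.0, 0
    while True:
        t += rng.expovariate(rate)  # random wait until the next purchase
        if t > horizon:
            return repeats
        repeats += 1
        if rng.random() < dropout_prob:  # customer leaves after this purchase
            return repeats

rng = random.Random(42)
runs = [simulate_repeats(rate=0.5, dropout_prob=0.2, horizon=12, rng=rng)
        for _ in range(10_000)]
average_repeats = sum(runs) / len(runs)
print(round(average_repeats, 2))
```

Without dropout this customer would average about six repeat purchases over the horizon; the chance of leaving after each one pulls that figure down.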

The result is a series of four parameters that generate the best fitting depth-of-repeat curve.   Thus we can now estimate the number of transactions each customer will have in the future.  And given the profitability of a transaction, we have future lifetime value.

This approach (BG/NBD) and its variants have been used for years in estimating the success or failure of new products from survey data, so it has a lot of validity.  But as with other time series approaches it is a bit of a black box from a tactical point of view.

Next up:  Segment Migration
Then: Predictive Models

CLV Episode 2: Segment Migration

What bucket will you be in?

The Time Series methods discussed previously compute lifetime value based on individual performance and behavior without regard to either similarity between customers or specific drivers that impact profitability.  Sometimes we can gain more insight by pooling customer behavior and segmenting customers into cohesive groups.  This allows us either to use more information in estimating future profitability or to reduce uncertainty by focusing on sameness.   These approaches also begin to help marketers think about what strategic and tactical choices they might want to make from an investment point of view.

Segmentation offers two specific advantages for estimating the future:  stability and migration.
  • Stability comes from the safety of understanding how a group behaves rather than a specific individual.  And while ‘average’ is the most dangerous word in marketing, segments provide multiple averages to use.
  • Migration reflects that people can change over time and move from one segment to another.  Life-changing events are clear boundaries between segments. 
A market segment is a group of consumers who share a common need and can be reached effectively using a common set of tactics.   Similarly, a profit segment is a group of consumers whose profitability, behavior or characteristics are likewise common to one another.  Since the time series methods can be applied to segments as well as to individuals, we’ll focus on potential ways to segment and then think about migration.

Profit Only Quantiles
The concept of dividing customers into equal sized groups based on some metric is common in response analysis.   We often set campaign selection rules that ‘target the top two deciles’, that is, the 20% of our database most likely to respond.

In the CLV scenario, we can create segments based simply on the distribution of individual profitability across customers.   Two commonly used methods would be deciles and quartiles. 
  • Deciles: create 10 equal sized segments of customers, each with 10% of the customers
  • Quartiles: create four equal sized segments, each with 25% of the customers.   Sometimes this will be simplified into three segments by combining the middle two.
The boundaries of quantiles are driven purely by first ranking the customers and then dividing them into equally sized buckets.   This is where the Pareto rule derives from:  20% of my customers account for 80% of my profits.  This approach works when profit is a direct result of behavior, product margin is fixed and customer characteristics don’t discriminate between profit levels.
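
A minimal sketch of the rank-and-bucket idea, using ten invented customers whose profits were chosen so that the top 20% carry 80% of the total. With only ten customers the quartiles cannot be perfectly equal in size, so the buckets come out as close as the ranking allows.

```python
def assign_quantiles(profit_by_customer, n_groups):
    """Rank customers by profit (highest first) and cut the ranking into n groups.
    Group 1 is the most profitable."""
    ranked = sorted(profit_by_customer, key=profit_by_customer.get, reverse=True)
    n = len(ranked)
    return {cid: (rank * n_groups) // n + 1 for rank, cid in enumerate(ranked)}

def top_share(profit_by_customer, fraction):
    """Share of total profit contributed by the top `fraction` of customers."""
    values = sorted(profit_by_customer.values(), reverse=True)
    k = max(1, round(len(values) * fraction))
    return sum(values[:k]) / sum(values)

# Hypothetical annual profit per customer.
profits = {f"c{i}": p for i, p in enumerate(
    [900, 700, 120, 90, 60, 50, 30, 25, 15, 10])}

deciles = assign_quantiles(profits, 10)
quartiles = assign_quantiles(profits, 4)
share = top_share(profits, 0.20)
print(deciles["c0"], quartiles["c9"], share)
```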

Descriptive Nodes
Rather than using profit as the sole segmentation measure, we can use all available behavioral and descriptive data to differentiate between customer profitability.   Since we’re interested in assigning people to segments, as opposed to predicting a specific outcome, this approach leverages classification and regression trees.

The following example looks at survival rate of Titanic passengers according to demographic characteristics.  If you were female or a male in a large familial group (siblings or spouses) the odds of survival were in excess of 70%.



Descriptive nodes have a couple of distinct advantages:
  • We begin to understand what characteristics and events relate to profitability because the approach splits customers according to the attribute that does the best job of discriminating between high and low.   
  • All customers are assigned to one of a set of end segments that can be reduced to simple business "if then" rules.   While often mapped to deciles, there is no reason that the business has to fit the data to the 10% rule. 
This approach works when customer attributes or life-stage relate to behavior or when contribution margin varies greatly across the product line or channel.  
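
The heart of a tree is the search for the single split that best separates high from low. Below is a sketch of that one step, scoring each candidate split by the profit variance left inside the two resulting groups (the usual regression-tree criterion). The customer records and attribute names are hypothetical, and a real tree would repeat the search inside each group until the segments are small or pure enough.

```python
def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def best_split(rows, attributes, target):
    """Return the (attribute, threshold) that leaves the least within-group variance."""
    best = None
    for attr in attributes:
        # Every distinct value except the largest is a candidate threshold.
        for threshold in sorted({r[attr] for r in rows})[:-1]:
            left = [r[target] for r in rows if r[attr] <= threshold]
            right = [r[target] for r in rows if r[attr] > threshold]
            score = (len(left) * variance(left) + len(right) * variance(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, attr, threshold)
    return best[1], best[2]

# Hypothetical customers: tenure in months, products held, annual profit.
rows = [
    {"tenure": 3, "products": 1, "profit": 10},
    {"tenure": 30, "products": 1, "profit": 15},
    {"tenure": 5, "products": 3, "profit": 80},
    {"tenure": 28, "products": 4, "profit": 95},
    {"tenure": 12, "products": 2, "profit": 20},
    {"tenure": 18, "products": 3, "profit": 85},
]

attribute, threshold = best_split(rows, ["tenure", "products"], "profit")
print(f"if {attribute} <= {threshold} then low-profit segment, else high-profit segment")
```

In this made-up data the number of products held, not tenure, is what separates the profitable customers, and the result reads directly as an "if then" rule.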

Clusters
Rather than splitting on levels of a specific attribute like profit as Descriptive Nodes does, clustering is a machine-learning approach that groups consumers based on their similarity to or distance from one another.   Those who exhibit similar patterns across a host of variables are grouped together while other consumers are placed into different groups with their neighbors.  

In clustering we’re trying to find homogeneous groups and then see how they differ in terms of profitability.  There are numerous techniques to choose from, but they all have the same objective: reduce customers to groups.  This approach does carry the risk that while distinct segments may emerge they may not have different profitability profiles.

Clustering is appropriate if unusual or complex ways of identifying 'sameness' are required.  And it is particularly useful when one should segment based on implicit or explicit consumer needs.  Because needs come and go or shift in importance, the size of the segment is critical to understand over time. There may be very profitable customers in a segment but if the underlying need evaporates, they will go the way of the carrier pigeon.
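
For illustration, here is a bare-bones k-means, one of the many techniques available. Customers are grouped on two invented behavioral measures (visits per month and average basket size); profit is deliberately left out of the grouping and only compared afterwards. Real data would need the variables put on a common scale first, which this sketch skips.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means: returns a cluster label for each point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each customer to the nearest center (squared Euclidean distance).
        labels = [min(range(k),
                      key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
                  for p in points]
        # Move each center to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centers[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels

# Hypothetical customers: (visits per month, average basket size) and annual profit.
points = [(1, 20), (2, 25), (1, 22), (8, 60), (9, 65), (10, 58)]
profit = [5, 8, 6, 40, 55, 45]

labels = kmeans(points, k=2)
for c in sorted(set(labels)):
    members = [p for p, label in zip(profit, labels) if label == c]
    print(f"cluster {c}: {len(members)} customers, average profit {sum(members) / len(members):.1f}")
```

The last loop is the check described above: distinct groups are only useful here if their profitability actually differs.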

Migration
Since we are interested in the future we need to come up with ways to understand where customers will end up.  All of the segmentation approaches allow us not only to define which segment a customer belongs to, but also to estimate the likelihood of customers migrating from one segment to another over time.

If we look at historic data we can apply a segmentation scheme at the beginning and then again at every subsequent planning period.  This allows us to understand how consumers change over time and it is this information we’re most interested in.    The following is a simple view of what we’re looking for using quantiles. 

The columns represent where someone is today and the rows are the probability she will be in a particular segment next period.  This includes the idea that someone might no longer be a customer, thereby accounting for attrition.  For example, a "Bottom Third" customer is likely to still be in the "Bottom Third" but there is a 10% chance they'll improve and a 20% chance of no longer being a customer.

Because we’re dealing with uncertainty, migration should be simulated a large number of times in order to arrive at a stable estimate for each customer.  If we didn’t simulate the answer would simply be the segment with the largest probability – which in this example is the status quo.   And if we're not sure how stable the migration rates are, we can simulate them as well.
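
A sketch of that simulation follows. Only the "Bottom Third" odds (70% stay, 10% improve, 20% leave) come from the example above; the other migration rates, the profit per segment and the number of runs are assumptions made up for illustration.

```python
import random

SEGMENTS = ["Top Third", "Middle Third", "Bottom Third", "Gone"]

# One-period migration odds: current segment -> odds of each segment next period.
# Only the Bottom Third row follows the example in the text; the rest is hypothetical.
MIGRATION = {
    "Top Third":    [0.70, 0.20, 0.05, 0.05],
    "Middle Third": [0.15, 0.60, 0.15, 0.10],
    "Bottom Third": [0.00, 0.10, 0.70, 0.20],
    "Gone":         [0.00, 0.00, 0.00, 1.00],
}

# Hypothetical profit earned per period in each segment.
PROFIT = {"Top Third": 300.0, "Middle Third": 100.0, "Bottom Third": 20.0, "Gone": 0.0}

def simulate_value(start, periods, rng):
    """One possible future: walk the customer through the segments and add up profit."""
    segment, total = start, 0.0
    for _ in range(periods):
        segment = rng.choices(SEGMENTS, weights=MIGRATION[segment])[0]
        total += PROFIT[segment]
    return total

def expected_value(start, periods, runs=5000, seed=1):
    """Average the simulated futures to get a stable estimate."""
    rng = random.Random(seed)
    return sum(simulate_value(start, periods, rng) for _ in range(runs)) / runs

for segment in SEGMENTS[:3]:
    print(segment, round(expected_value(segment, periods=5), 1))
```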

The key is to focus on the odds of changing segments over time and use that to guide the estimate of profitability.


Previous episode: Time Series
Next episode: Predictive Models

CLV Episode 3: Predictive Models

What drives profit?

The previous two sections, Time Series and Segment Migration, address the problem of future profitability through trends and migration.   Predictive Modeling approaches tackle the problem head on.  That is, they use all possible information to predict the profitability figure itself.  This class of techniques requires the most data and has the greatest risk/benefit profile.   It can be very good or be very wrong.   Predictive modeling is excellent for interpolation within the boundaries of a problem, but it can be suspect when trying to extrapolate to areas – like the future – where it hasn’t seen any data. 

All the ideas presented here look like the following:

y = B0 + B1x1 + B2x2 + ... + Bnxn + e

where the x's are the different variables used, 'B' is their respective contribution to profit 'y', and 'e' is the error term.

These approaches can be applied to individual customers, segments or across the entire customer database depending on objectives and business needs.

Direct Estimation
The easiest method to imagine is to predict profitability directly from a combination of factors – customer characteristics, behavior, market conditions, etc.   This is an extension of the Time Series model shown in the first chapter.

Since we’re dealing with time series, a subset of variables should be the consumer’s previous behavior and profitability.  It is common to see ‘lagged profit’, ‘lagged growth in profit’ or ‘lagged transactions’ in such a model.    Future profit is simply a recursive set of estimates each one period further in the future using the last period’s estimates.   

This approach works given sufficient history of profitability data.  How far in the future one can predict is often a function of how far back one can go to create the baseline trend.
 
Drivers First
Profit is often an outcome of a distinct set of behaviors and characteristics.  Thus, it may make more sense to estimate those factors first and then derive profitability from them.   This is particularly true if product margins and the cost to serve vary substantially over customer segments and product lines.

Profit only exists if a consumer is still a customer.  So, if retention or churn is an issue then focusing on the probability of future purchasing makes sense as a first step.   In this scenario, factors like cross-buying might help since it is often assumed that the more products a customer buys, the more loyal they are.  

An issue with driver estimation is that one model begets another.   For instance, to use cross-buying we need to estimate not only the likelihood of doing so but also its relationship to overall profitability.  It is known, for example, that consumers with high customer service costs, heavy promotion usage and a history of revenue reversal actually become more unprofitable the more they buy.

This approach works when profitability varies greatly by things the business can influence.  The challenge then becomes one of estimating all the inputs in the future and keeping those relationships straight.

Marketing Influence
A special extension of the driver estimation procedure is to explicitly model the impact of marketing on the propensity to purchase in the first place.  Whether this is a marketing mix model approach or an assessment of direct contact is a function of the kind of business involved.

This approach fits scenarios with the following logic:

•    Marketing is allocated to people deemed worthy of investment; i.e. it isn’t random.
•    Marketing directly impacts the probability of a purchase
•    Profitability is a function of purchasing and level of marketing conducted

These statements imply that three separate models are required using transaction and marketing contact information.  Each builds on the previous one by using one set of predictions as inputs to the next model.  

This approach is the most theoretically sound and directly accounts for marketing activity. On the flip side it has substantial data and analytic requirements to implement. It has been shown to do a better job than some of the time series suggestions. 

Dealing with Uncertainty
Usually in predictive modeling we’re interested in interpretation rather than estimation.  As a result we tend to ignore the little ‘e’ at the end of the equation listed above.   The error term represents variance we can’t explain and is assumed to average out to be zero and not related to anything else in the model.  Thus, it has little value to the business in terms of explaining what’s going on.  

However, in forecasting the future that little e implies that there isn’t one predicted profitability but rather a whole bunch of them that vary in amount depending on the size of the error term.  Plus, if we do use some nested form of models where one set of outputs is used later as inputs it is quite possible that the individual error terms play off one another creating huge swings in profitability. 

So, like the migration suggestions above, it is recommended that any predictive model that estimates the future be run thru a number of scenarios that change the error term.  Here’s an example of the distribution of Customer Lifetime Value from a research paper entitled “Will the Frog change into a Prince.”


While the most likely lifetime value for this individual customer is around $200,000 the long tail of high values just might influence the marketing investment or customer support decisions.
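
Here is a rough sketch of running a model thru many error scenarios. It assumes a simple one-lag profit model with made-up coefficients and a normally distributed error, so it produces a roughly symmetric spread rather than the long tail in the paper's example; the point is only that the output is a range of lifetime values, not one number.

```python
import random
import statistics

def simulate_clv(last_profit, intercept, slope, error_sd, periods, rng):
    """One possible future: recursive forecast with a fresh random error each period."""
    total, profit = 0.0, last_profit
    for _ in range(periods):
        profit = intercept + slope * profit + rng.gauss(0.0, error_sd)
        total += profit
    return total

rng = random.Random(7)

# Hypothetical fitted coefficients and residual standard deviation.
draws = sorted(simulate_clv(100.0, 10.0, 0.92, 15.0, 24, rng) for _ in range(5000))

median_clv = statistics.median(draws)
p90_clv = draws[int(0.9 * len(draws))]
print(round(median_clv), round(p90_clv))
```

Because each period's error feeds into the next period's estimate, the spread widens the further out we forecast.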

Summary and Recommendations
Given scarce marketing resources, it makes sense to focus them on the most important or valuable customers.   Targeting customers who will be profitable is probably the most efficient way to spend marketing dollars.   To do so requires the creation of Customer Lifetime Value based on future behavior and potentially marketing investments.

This series covered a wide range of methods from the simple, naïve approach of ‘most recent’ to contemporary advances from academics.   Since the future is unknowable sometimes the simplest models work best.  There is no single, best approach – it all depends.

1.    Define the objective, appetite for risk and uses of Customer Lifetime Value
  • Time Series: stable, abrupt changes not expected, direct marketing not in play
  • Segment Migration: different groups exist that behave very differently, marketing influences medium to long term, market structure may change
  • Predictive Modeling: large number of known drivers, one of which is direct marketing contact
2.    Define multiple techniques for estimating and simulating the future and ‘average the results’

And remember: Forecasts are always wrong, what is interesting is why.



Friday, May 18, 2012

Expressive, Editorial and Intentional Content

What kinds of ads work where?

Julie Shoenfeld of Perfect Market updated her summary of the LUMAscape for display and discusses the directions display advertising will likely take as the industry shakes out.   One key point at the end is that display advertising may be served with emotion (Facebook) or with intent (Google).

Thinking about that point from a content perspective suggests there are three environments where ads appear.
  • Expressive: the communal world of sharing and social networks
  • Editorial: the curated world of brands and publishers
  • Intentional: the direct act of searching for something
Given those different contexts one can begin to imagine different creative briefs, media plans and even measures of success.  The discussion about GM pulling $10m from advertising on Facebook revolves around understanding expectations.  And their pull-out from the Super Bowl clarifies how they view things a bit.

If we think ads served in the world of self-expression surrounded by the banal and the inspiring are going to work the same way as those served in a shopping scenario then we're not digging deep enough into how content works and people decide.  

The proof is in the click-thru rate of all ad types:
Facebook: 0.051%
Google: 0.4%
Average: 0.1%

As the world's largest broker of destinations, Google's number should be the highest.   And the drop down to Facebook reinforces the notion, as does Internet minutes, that people aren't willing to go anywhere else.   In the middle is the traditional, editorial context where brands have a broader range of options to work with.

So, given those performance figures how do we use these media properties and what is the best use for them in our plans?    Some ideas on what attributes might work in each context:
  • Expressive: interesting, entertaining, humorous, startling
  • Editorial: affinity, knowledge, reputation, positioning, differentiation
  • Intentional: features, advantages, benefits, pricing, promotions
Not sure what attributes to use yet for the home page on Kindle where Amazon will now sell ads.

Thursday, May 17, 2012

Foresights vs. Insights

What's the difference?

Insights are typically focused on the here and now looking for opportunities to take advantage of or activities where course corrections are needed that impact the current planning cycle.   

In the case of foresight we try to imagine a new future that might affect long term planning.  Since there are things we simply don’t know the thinking and analysis style is quite different.  Two questions that help frame the future:
What macro events or trends create step changes in consumer needs that we’re attempting to satisfy? 
For instance the Great Recession fundamentally shook our confidence, which in turn reversed the trends in savings rate and disposable income.  This translated into a need to be frugal which in turn led to ‘deal chic’ where promotions are a badge of honor.  
Who would have thought that sub-prime mortgages would give rise to Groupon?
What needs and solutions could emerge that alter the fundamental positioning of our brand in consumers’ minds?  


For instance, disruptive innovation or blue oceans completely reset the landscape, resulting in new winners and losers.  Given that the mind is the most difficult thing to change there is a tendency to be blindsided.  The idea of taking an unimportant feature and making a market out of that is the basis of being blindsided.   Who would want a small drive that is slower, holds less, and costs more per unit stored?  The folks that wanted to make computing distributed rather than efficient; they created totally new markets.

Insights: how do we improve the existing marketing plan?
Foresights: what marketing plan should we have?

Note: this idea came from discussion on "Growth and Foresight Analytics" on LinkedIn.