Wednesday, June 06, 2007

Netflix and Recommendation engines


Given I was talking about recommendation engines, I should really mention one of the most interesting of the moment: the Netflix Prize. This is a big prize where academics and small start-ups attempt to beat the internal team's recommendations. I mention those two groups because the larger companies are most likely not paying any attention: the prize involves handing over all methodology to Netflix, something the larger companies would like to keep for themselves, thank you very much (a million dollars isn't worth SAS giving away their trade secrets).

The problem with the way Netflix set this up is that it's all about horsepower: it's all about trying to show that Netflix can be outscored with some tremendous algorithm. However, what I suspect the competitors are finding is that, much like most data mining situations, it's not the power of your algorithm that matters so much as your understanding of the industry and hence the underlying data. There will be very little between many of the main algorithms, certainly not the 10% Netflix is looking for; it all comes from knowing the area. I think this is what the comment “One of the reasons of our current leading position is that we keep on trying to implement new tricks and ideas, and a reasonable portion of them works” really alludes to.

What will happen is that the solution will probably be one very finely tuned for Netflix (good for them) but not totally translatable and marketable to other industries. However, I'm sure whatever architecture they come up with will indeed be a good starting point, and I think it's really good that Netflix have opened this up.

Why does anyone need a recommendation engine?

Well, in Netflix's case it's all about making sure people select products, and as many as possible - making sure the recommendation precedes a selection at all costs. (In the case of Netflix I bet I could get 70% of their score by clever customer segmentation and only 50 of all of their products - but I've not done this, so you can choose to ignore what I say on the basis I'm an armchair analyst.)

Is this the only goal of a recommendation engine?

Not really. Something like a public service broadcaster should not ultimately be about maximum purchases but about delivering appropriate selections that open up the various offerings to a previously unknowing general public. Particularly in the days of the long tail and thousands of products, it can be about making people aware of what you offer. This is done in conjunction with your regular campaigns to make sure you don't lose the business of people unwilling to take a chance, while those who like a varied mix are dealt with accordingly. In doing this a company would take a hit immediately on the Netflix score, but over the course of time it would gain the trust of customers who will eventually pay it back several times over.

How do you go about making recommendations?

Well, traditionally there are six different ways I'm used to doing it, although I'd love to know more.

Business Rules - you tell the customer what they want and what you are prepared to give them based on business knowledge and current strategy.

Counting - you find out what is most commonly purchased with the other products bought by a customer

Predictive algorithms - you say "what are the characteristics that define a person who buys this product" and then score all the products against each other.

Customer Segmentation - the customer looks like they belong to group X. We know that Xs mostly like children's movies.

Trending/Time series - "we know that action movies sell better on Fridays whereas cerebral offerings tend to do better earlier in the week"

Sequences - Hmm. A person has bought the first 4 series of "24", I wonder what they'll want...
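
To make the "Counting" approach above a bit more concrete, here is a minimal sketch in Python. The basket data and the function names (co_purchase_counts, recommend) are made up for illustration; this is just the simplest "what is most commonly purchased with what you already have" idea, not anything Netflix or a vendor actually does.

```python
from collections import Counter
from itertools import permutations


def co_purchase_counts(baskets):
    """For each product, count how often every other product shares a basket with it."""
    counts = {}
    for basket in baskets:
        # set() so a product repeated within one basket is only counted once
        for a, b in permutations(set(basket), 2):
            counts.setdefault(a, Counter())[b] += 1
    return counts


def recommend(counts, owned, top_n=3):
    """Add up co-purchase counts over everything the customer already owns."""
    scores = Counter()
    for product in owned:
        for other, n in counts.get(product, {}).items():
            if other not in owned:
                scores[other] += n
    return [product for product, _ in scores.most_common(top_n)]


# Hypothetical purchase history, one list per customer
baskets = [
    ["24 S1", "24 S2", "Alias S1"],
    ["24 S1", "24 S2"],
    ["24 S1", "Shrek"],
]
counts = co_purchase_counts(baskets)
print(recommend(counts, {"24 S1"}))
```

Here "24 S2" comes out on top because it shares a basket with "24 S1" twice; the other two titles tie on one co-purchase each, so their order is not meaningful.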

What factors influence these approaches?

Trigger events - "we know loads of people seem to get romantic movies around mid-February"

Discounts - People love discounts

Who's buying - People don't always buy for themselves.

Service levels - no point in recommending stuff that's out of stock, takes time to deliver or even costs more than a person would likely pay.

Current Behaviour - often called Scenarios by a lot of people nowadays. If a person is looking at horror movies, even if they have a high propensity for fluffy romantic numbers based on previous behaviour, it's probably best you stick to recommending the horror genre.

It's a new product! - I have no data, how do I recommend it?
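
As a rough sketch of how a couple of these factors might sit on top of whatever base approach produces the scores, the Python below applies service levels as hard filters and current behaviour as a boost. Everything in it (the Candidate fields, apply_factors, the genre_boost value of 2.0) is an assumption picked for illustration, not a recommended setting.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    title: str
    genre: str
    score: float    # propensity from whichever base approach is in use
    in_stock: bool
    price: float


def apply_factors(candidates, browsing_genre=None, max_price=None, genre_boost=2.0):
    """Drop candidates that fail service levels, then boost the genre being browsed."""
    kept = []
    for c in candidates:
        if not c.in_stock:
            continue  # service levels: no point recommending what we can't supply
        if max_price is not None and c.price > max_price:
            continue  # service levels: costs more than the person would likely pay
        boosted = c.score * genre_boost if c.genre == browsing_genre else c.score
        kept.append((boosted, c.title))
    ranked = sorted(kept, key=lambda pair: pair[0], reverse=True)
    return [title for _, title in ranked]


# Hypothetical candidates for a customer with a history of romantic films
candidates = [
    Candidate("Romance A", "romance", 0.9, True, 10.0),
    Candidate("Horror B", "horror", 0.6, True, 10.0),
    Candidate("Horror C", "horror", 0.8, False, 10.0),
    Candidate("Horror D", "horror", 0.5, True, 40.0),
]
print(apply_factors(candidates, browsing_genre="horror", max_price=20.0))
```

With the customer currently browsing horror, the in-stock, affordable horror title overtakes the romance that previous behaviour alone would have put first.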

So how does all this come together?

Well, that's an interesting point. More often than not, one approach will get selected and then optimised and squeezed until it reaches a plateau; few vendors have the time to invest in a complex architecture. I am a great believer in Ockham's razor, and as such selecting one approach that gives a business clear feedback on how it works is in line with a lot of my thinking.

In fact, this is one of the main points that I suspect will be missed by this scoring competition: any business needs to understand why something gets recommended so it can plan and structure itself to meet these needs better. A black box is good in the short term and bad in the longer term, and the better something gets at predicting activity, the more specialised and harder to understand it becomes.

However, acknowledging Ockham's razor, I think that one of the main things in producing a good recommendation engine is to have a combination of all approaches (each has weaknesses in different areas that I'll go into in another post) so that the system can be adjusted to reflect the way a customer does business. For instance, trigger points will have different impacts in different businesses, current behaviour will be more relevant in some than others, and in some industries trends will be more prevalent.
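
One simple way to picture "a combination of all approaches" is a weighted blend where the weights are what gets adjusted per business. The Python below is only a sketch under that assumption; the approach names, scores and weights are invented, and the explain function is there because the business still needs to see why something was recommended.

```python
def blend(scores_by_approach, weights):
    """Rank products by the weighted sum of each approach's score."""
    combined = {}
    for approach, scores in scores_by_approach.items():
        w = weights.get(approach, 0.0)  # an approach with no weight contributes nothing
        for product, s in scores.items():
            combined[product] = combined.get(product, 0.0) + w * s
    return sorted(combined, key=combined.get, reverse=True)


def explain(scores_by_approach, weights, product):
    """Show how much each approach contributed to one product's blended score."""
    return {
        approach: weights.get(approach, 0.0) * scores.get(product, 0.0)
        for approach, scores in scores_by_approach.items()
    }


# Hypothetical per-approach scores for two products
scores = {
    "counting": {"A": 0.8, "B": 0.2},
    "trending": {"A": 0.1, "B": 0.9},
}
# Two hypothetical businesses: one leans on purchase history, one on trends
history_led = {"counting": 0.8, "trending": 0.2}
trend_led = {"counting": 0.2, "trending": 0.8}

print(blend(scores, history_led))
print(blend(scores, trend_led))
print(explain(scores, trend_led, "B"))
```

The same two products swap places depending on which weights are used, and explain shows which approach drove the result, which is the sort of feedback a black box would not give you.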

As I highlighted to begin with, the goal of the engine may be different, and within the same organisation it could be to push the top ten products in one channel but to be more exploratory in another. As long as the engine can provide feedback on the cost of those decisions to the business, this can then alter strategy going forwards.

Why am I interested? Well, it is fascinating stuff and it has potential for so many areas. Anywhere recommendations or advice can be offered could stand to benefit from greater expertise in this area. For my own part I'm looking at content and product delivery on websites currently, but it certainly doesn't stop there.

Anyway, enough for today. ttfn.

NB. I think I need more pictures, this is a horrid wall of text....
