Thursday, July 30, 2009

An Interesting Week in Predictive Analytics / Data Mining



Well, the big news splashed all over the industry's front pages is that IBM has purchased SPSS for $1.2BN. This is a huge deal for the field.

Consolidation had already been occurring in the more BI-focused market, with Hyperion, Business Objects and Cognos all being bought in recent years, but moves into the "Predictive Analytics" space had yet to happen. SPSS have in recent years made repeated announcements about their closeness to IBM, but I think the giveaway for some sort of takeover was the rebranding of all products so they no longer used the SPSS moniker (everything changed to PASW ...). Still, I'm surprised at the timing (credit crunch) and the price (a 42% premium over the share price)!

This article in Computer World shows some of the more interesting stats about the deal.

For SPSS, it means many things, but I'm always more curious about the shape of things to come for clever technologies in the marketplace, and about the application and technical angles, than about the story of the company's finances. I realise that SPSS can now start tying its tools and applications very closely to a specific technology vendor, hopefully developing bigger, better and more specific applications and allowing it to put predictive analytics into more useful (and no doubt more expensive) situations. I will be curious to see whether this comes at the cost of its open attitude when working with other technologies and databases, which I've always considered one of its strong points.

I'm very curious to see what happens, and I'd also like to congratulate Jack Noonan, not only for becoming a very rich man but also for selling his company above the $1BN mark, which has long been his goal.

And in other news...


The Netflix contest has ended - nearly... There's an interesting review of the ending on KDD. The winners still have to be chosen, and I do wonder how their performances will fare in the final stages. Obviously the secret Test set has to be similar to the Quiz set, but you do have to wonder how much of the magic 10% improvement achieved by the winners is down to overfitting, or to similarities in the public data that are not to be found in the Test data. I've seen many models and situations where their inability to generalise has left me with the quandary of having to dump my "favoured" model.
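To make that overfitting worry concrete, here's a toy Python sketch of the kind of check I mean: comparing a model's error on the public "Quiz" split against a hidden "Test" split and looking at the gap. The numbers below are entirely made up for illustration; they are not the Netflix data or any team's actual results.

```python
# Toy illustration of a Quiz-vs-Test generalisation gap (hypothetical data).
import numpy as np

rng = np.random.default_rng(0)

def rmse(actual, predicted):
    """Root mean squared error, the metric used in the Netflix Prize."""
    return np.sqrt(np.mean((actual - predicted) ** 2))

# Hypothetical true ratings (1-5 stars) for the two hidden splits.
quiz_true = rng.integers(1, 6, size=10_000).astype(float)
test_true = rng.integers(1, 6, size=10_000).astype(float)

# A model tuned hard against Quiz feedback: its Quiz predictions sit a
# little closer to the truth than its Test predictions do.
quiz_pred = quiz_true + rng.normal(0.0, 0.85, size=quiz_true.size)
test_pred = test_true + rng.normal(0.0, 0.95, size=test_true.size)

quiz_rmse = rmse(quiz_true, quiz_pred)
test_rmse = rmse(test_true, test_pred)

print(f"Quiz RMSE: {quiz_rmse:.4f}")
print(f"Test RMSE: {test_rmse:.4f}")
print(f"Generalisation gap: {test_rmse - quiz_rmse:.4f}")
```

If that gap is large, the apparent gain on the public leaderboard hasn't really generalised, which is exactly the quandary I describe above.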

In this case, how will Netflix decide who to pay the money to, and will this yield any interesting lessons for the community about how we should validate our progress? If the current "winners" turn out to be well down the league table in terms of Test performance, and someone else steals a march on them, it could make for interesting conversations given the effort these teams have no doubt put in.

Who knows, but I'm all ears!
