Datagami at JSConf.asia and CSSConf.asia 2014

To end our trip to Singapore this week, I made a stop at JSConf.asia (and its sibling CSSConf.asia).

There was a range of great talks on topics from developer productivity to flexbox to @mikeal‘s take on the future of node.js (foreshadowing io.js!).

And most important of all, I loved the coffee from Jimmy Monkey.

Predictive APIs and Apps – Barcelona

It all started with a harmless tweet. Next thing we know, Louis Dorard had invited us to be sponsors of his conference, Predictive APIs, in Barcelona. Spain wasn’t the obvious focus for our business development efforts, but then how often does an early-stage startup like ours get the chance for this kind of publicity?

We did what any startup guys would do: we procrastinated and wrote more code. Then the call for papers came out and we realised how professionally run this event was going to be. In fact, the external review panel was so eminent we thought we’d better hedge our bets and put in two proposals. As luck would have it, I was the only person to adopt this strategy and both proposals got accepted. Talk about being the victim of your own success!

The conference itself was three days of pure delight for a machine learning nerd. All the cool kids were there (check them out!). What was nice was the camaraderie; everyone was on a similar journey of trying to make something real in the face of unending hype about big data. The contrast with Strata Hadoop the following week could not have been more stark.

The core thesis of Louis Dorard’s book, that the commoditisation of machine learning through APIs will fundamentally change how we build software, looks like it’s coming true in spades. It’s an exciting time and we can’t wait for the second instalment of the conference in Sydney in August 2015.

ICML 2014 Beijing

As a startup in a hot technical field, staying in touch with the cutting edge of research is vital. Since I found my way into machine learning via the realisation that the optimisation techniques I’d grown up with in molecular physics were the bread and butter of modern ML, the idea of crashing someone else’s academic conference seemed almost reasonable.

So earlier this year, we had our website translated into Mandarin, booked some cheap flights with horrendous connections, and went along to the International Conference on Machine Learning.

Being in Beijing was an added incentive — Australia has boomed in the last two decades digging up dirt and shipping it to China; could we do the same with tech? It turned out the subway was much easier for a newcomer than I’d expected — I only discovered the English signage after spending half an hour committing to memory the characters for my hotel’s stop!

The conference was big – about 700 foreigners and 500 local participants. The talks were intense, 20 minutes each with questions from world experts awaiting the speakers. But people were friendly and surprisingly accepting of a lapsed quantum mechanic trying to wing it in their field. I unexpectedly found myself sitting next to one of my heroes, a postdoc from Cambridge whose papers I’d spent weeks poring over the year before. John Langford, the man behind the under-appreciated Vowpal Wabbit, was almost identical to the person I’d imagined from his writings.

There was an amusing scene one evening watching the World Cup with some European blokes at an outdoor bar next to the venue. This extroverted guy was expounding on deep learning and I asked pointedly whether he could help me if I didn’t have millions of samples of well structured data (there have been several presentations on breakthroughs in voice and image analysis using deep nets). We got into a bit of a technical discussion about optimisation techniques and he kind of admitted that maybe deep learning wasn’t the answer to everything, but asserted it was the future. The next day I discovered I’d picked a fight with one of the authors of Word2Vec, one of the biggest developments of 2013. Beer and jetlag are my only defence.

Deep learning was definitely the dominant theme of ICML 2014. Andrew Ng was mobbed like a rockstar when he first arrived – you don’t realise from his Stanford online courses quite how tall he is – and every deep learning session was standing room only. At the conference dinner, he gave an impassioned pitch as to why deep learning was the future and how Baidu Research was the place to do it.

The big idea that was being discussed was combining deep networks, say one trained on images with one trained on text, to produce a system that could model or learn something resembling the semantic content of images. So it was both shocking and unsurprising to see the Neural Image Caption Generator when it was published six months later.

The other mind blowing announcement was the introduction of realtime translation by Skype, which is finally being rolled out.

Meanwhile, Gaussian processes are closer to my heart than neural nets, simply because I started out solving forecasting problems on smallish data sets and the maths is both elegant and fairly easy. A nice application was presented by Jasper Snoek and friends from Harvard, using Gaussian processes for function optimisation, such as maximising the likelihood for a learning algorithm. The story goes that they discovered Netflix was using the results of one of their papers and so they decided to sell this as a service. It’s nice to see Whetlab launched with full functionality and some impressive results.
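
For the curious, here’s a rough sketch in R of the idea behind GP-based function optimisation: fit a Gaussian process to the (hyperparameter, score) pairs you’ve evaluated so far, then pick the next point to try by expected improvement. Everything here (the kernel, the toy objective, the settings) is made up for illustration; it’s not how Whetlab actually works under the hood.

```r
# Minimal Bayesian optimisation sketch: a Gaussian process surrogate with a
# squared-exponential kernel, and expected improvement to choose the next point.
# The objective f() stands in for "score of a learning algorithm as a function
# of one hyperparameter"; all values here are illustrative.

f <- function(x) -(x - 0.3)^2 + 0.05 * sin(20 * x)   # toy objective to maximise

se_kernel <- function(a, b, lengthscale = 0.1, sigma_f = 1) {
  sigma_f^2 * exp(-0.5 * outer(a, b, "-")^2 / lengthscale^2)
}

# GP posterior mean and standard deviation at points x_star, given data (x, y)
gp_posterior <- function(x, y, x_star, noise = 1e-6) {
  K     <- se_kernel(x, x) + diag(noise, length(x))
  Ks    <- se_kernel(x, x_star)
  Kss   <- se_kernel(x_star, x_star)
  K_inv <- solve(K)
  mu    <- t(Ks) %*% K_inv %*% y
  v     <- diag(Kss - t(Ks) %*% K_inv %*% Ks)
  list(mu = as.vector(mu), sd = sqrt(pmax(v, 0)))
}

# Expected improvement over the best value observed so far (maximisation)
expected_improvement <- function(mu, sd, y_best) {
  z <- (mu - y_best) / sd
  ifelse(sd > 0, (mu - y_best) * pnorm(z) + sd * dnorm(z), 0)
}

set.seed(1)
x <- runif(3); y <- f(x)              # a few initial evaluations
grid <- seq(0, 1, length.out = 200)   # candidate hyperparameter values

for (i in 1:10) {
  post   <- gp_posterior(x, y, grid)
  ei     <- expected_improvement(post$mu, post$sd, max(y))
  x_next <- grid[which.max(ei)]       # evaluate where improvement looks most likely
  x <- c(x, x_next); y <- c(y, f(x_next))
}

cat("best x:", x[which.max(y)], " best value:", max(y), "\n")
```

The appeal for smallish problems is that each GP fit is cheap compared with retraining the learning algorithm, so you spend your expensive evaluations where they are most likely to pay off.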

The other cool technical breakthrough I discovered at ICML was “tensor factorisation”. Very roughly, methods like non-negative matrix factorisation (NMF), which have been so successful in topic modelling and other problems, are effectively working with second moments of the underlying distribution. To make use of information in the higher moments (and there are many reasons why second moments are insufficient), you end up constructing higher-dimensional analogues of matrices, i.e. tensors. The technical difficulties begin with visualising these objects (cubes of frequency counts, anyone?) and extend through to the discovery that your usual linear algebra tricks do not all apply. For example, the notion of “diagonalising” a tensor is somewhat more complicated than for a matrix. Animashree Anandkumar gave an inspirational workshop presentation on orthogonal decompositions of tensors using a system developed at Microsoft Research called REEF. Deep learning might not be the end of the game after all.
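
To make the matrices-versus-tensors point concrete, here’s a toy sketch in R. The counts are invented and this is only the object being decomposed, not Anandkumar’s algorithm: the second moment of word counts is an ordinary word-by-word matrix, while the third moment is a word-by-word-by-word cube.

```r
# Toy illustration: second vs third moments of word counts across documents.
# Rows of X are documents, columns are words (a tiny made-up count matrix).
set.seed(42)
X <- matrix(rpois(5 * 4, lambda = 2), nrow = 5, ncol = 4,
            dimnames = list(NULL, c("cat", "dog", "fish", "bird")))

# Second moment: a word-by-word matrix of average co-occurrence counts.
# This is (roughly) the object that NMF-style topic models work with.
M2 <- crossprod(X) / nrow(X)            # 4 x 4 matrix

# Third moment: a word-by-word-by-word tensor (a cube of counts).
# Slice M3[, , k] records how often pairs of words co-occur with word k.
V  <- ncol(X)
M3 <- array(0, dim = c(V, V, V))
for (d in 1:nrow(X)) {
  x  <- X[d, ]
  M3 <- M3 + outer(outer(x, x), x)      # rank-one update: outer product of x with itself three times
}
M3 <- M3 / nrow(X)

dim(M2)   # 4 4    -- a matrix we can eigendecompose as usual
dim(M3)   # 4 4 4  -- "diagonalising" this needs tensor decomposition methods
```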

Datagami at useR 2014

Earlier this month saw us at UCLA for the 2014 useR conference, which featured useful lessons on R performance and a detailed discussion on using R code in production! (Both topics are dear to our hearts.)

I especially liked the session by DataRobot’s Xavier Conort, on winning Kaggle competitions with the gbm package. Too bad the room was tiny!
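
For anyone who missed the session, the basic gbm recipe looks something like the sketch below. The dataset, formula and tuning values are placeholders for illustration, not the settings from Xavier’s talk.

```r
# Minimal gradient-boosted trees with the gbm package (illustrative settings only).
library(gbm)

data(iris)
iris$is_virginica <- as.integer(iris$Species == "virginica")

fit <- gbm(is_virginica ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
           data = iris,
           distribution = "bernoulli",   # binary classification
           n.trees = 500,
           interaction.depth = 3,        # allow interactions between features
           shrinkage = 0.01,             # small learning rate, so more trees
           cv.folds = 5)

best_iter <- gbm.perf(fit, method = "cv")               # pick the number of trees by cross-validation
pred <- predict(fit, iris, n.trees = best_iter, type = "response")
summary(fit, n.trees = best_iter)                       # relative variable importance
```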