Graphcore CEO, Nigel Toon, was speaking this week at an invitation-only conference held at Samsung Research America’s campus in Mountain View, CA. His talk, ‘How to build an efficient processor for machine learning?’, answered several questions we often get asked, and the following Q&A neatly summarizes the key points from the day.
1. What do you mean by intelligent compute?
One definition of intelligence is the capacity for judgment, informed by knowledge, adapted by experience. Extrapolating this into a description of intelligent compute, you can think of judgment as equivalent to computation, which delivers probable answers to intractable problems, and knowledge as a data model that summarizes the salient features and correlations of experienced data.
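To make that mapping concrete, here is a minimal Python sketch (purely illustrative; the data and names are invented for this example) in which “knowledge” is a model summarizing experienced data and “judgment” is a computation that returns a probable, rather than exact, answer:

```python
from collections import Counter

# Experience: observed (weather, played_outside) pairs.
experience = [("sunny", True), ("sunny", True), ("rain", False),
              ("sunny", False), ("rain", False), ("rain", True)]

# Knowledge: a data model summarizing the correlation between
# a feature (weather) and an outcome (played_outside).
knowledge = {}
for weather, played in experience:
    knowledge.setdefault(weather, Counter())[played] += 1

def judge(weather):
    """Judgment: compute a probable answer from the knowledge model."""
    counts = knowledge[weather]
    total = sum(counts.values())
    return {outcome: n / total for outcome, n in counts.items()}

print(judge("sunny"))  # e.g. {True: 0.67, False: 0.33} - a probable answer
```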
2. What is graph computing? Why are you called Graphcore?
The knowledge models that we are trying to build and manipulate in machine intelligence systems are most naturally expressed as graphs, in which vertices represent data features and edges represent correlations or causations between connected features. All the major machine learning frameworks, such as TensorFlow, MXNet and Caffe, have embraced graphs as the fundamental data structure, with tensor data passing over edges between the vertices that process it. These graphs expose huge parallelism, in both data and computation, that can be exploited by a highly parallel, graph-focused processor.
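As a minimal illustration of what such a graph looks like in code (a hand-rolled sketch, not the API of any of the frameworks above), vertices here process data and edges carry values between them; vertices with no dependency on each other can be evaluated in parallel:

```python
class Vertex:
    """A graph vertex: applies a function to the values of its input edges."""
    def __init__(self, name, fn, inputs=()):
        self.name, self.fn, self.inputs = name, fn, list(inputs)

    def evaluate(self, cache):
        if self.name not in cache:               # compute each vertex once
            args = [v.evaluate(cache) for v in self.inputs]
            cache[self.name] = self.fn(*args)
        return cache[self.name]

# A tiny graph computing y = (a + b) * b. The vertices 'a' and 'b' are
# independent, exposing parallelism a graph processor can exploit.
a   = Vertex("a",   lambda: 2.0)
b   = Vertex("b",   lambda: 3.0)
add = Vertex("add", lambda x, y: x + y, inputs=[a, b])
mul = Vertex("mul", lambda x, y: x * y, inputs=[add, b])

print(mul.evaluate({}))  # 15.0
```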
3. What’s the key difference between an IPU, a GPU and a CPU?
An easy way to think of it is:
CPU = scalar
GPU = vector
IPU = graph
CPUs were designed for office apps, GPUs for graphics, and IPUs for machine intelligence. The IPU’s structure provides efficient, massive compute parallelism hand in hand with huge memory bandwidth. These two characteristics are essential to delivering a big step-up in graph processing power, which is what we need for machine intelligence. We believe that intelligence is the future of computing, and that graph processing is the future of computers.
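A rough sketch of the scalar/vector/graph distinction (illustrative only; the code and names here are invented, not how any of these processors is actually programmed):

```python
import numpy as np

x = np.arange(8, dtype=np.float64)
w = np.full(8, 0.5)

# CPU-style scalar code: one value per step.
acc = 0.0
for i in range(len(x)):
    acc += x[i] * w[i]

# GPU-style vector code: one operation applied across a whole dense array.
acc_vec = float(np.dot(x, w))

# IPU-style graph view: the work is a graph of small tasks whose data
# dependencies, not a fixed array shape, determine what runs in parallel.
partials = {i: x[i] * w[i] for i in range(len(x))}  # all independent tasks
acc_graph = sum(partials.values())                  # one reduction vertex

assert acc == acc_vec == acc_graph  # same answer, three execution models
```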
4. Do you need different processors for training, inference and prediction?
No. They are different optimization processes but they are all suitable for execution on IPUs. This is a big topic, so I’ll come back to it in a future blog post.
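As a hint at why one processor can serve both, here is a toy sketch (an illustration only, not how an IPU is programmed): inference is a forward pass over the model graph, and training runs the same forward pass plus a backward pass that updates the parameters:

```python
import numpy as np

w = np.array([0.1, -0.2, 0.3])      # the "knowledge": model parameters

def forward(x):
    """Inference / prediction: a forward pass over the model."""
    return x @ w

def train_step(x, target, lr=0.1):
    """Training: the same forward pass, plus a backward (gradient) pass."""
    global w
    pred = forward(x)               # identical compute to inference
    grad = 2 * (pred - target) * x  # gradient of squared error w.r.t. w
    w = w - lr * grad               # adapt the knowledge by experience

x, target = np.array([1.0, 2.0, -1.0]), 3.0
for _ in range(50):
    train_step(x, target)
print(round(float(forward(x)), 3))  # ~3.0 once trained
```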
5. There has been rapid progress in deep learning in the last few years – why do we need new hardware?
There has been great progress in the last few years, with GPUs speeding up training time over CPUs. But GPUs were designed to manipulate 2D matrices (images), which is a far more restrictive model than the graph model. As a result, GPUs are efficient only for certain types of machine learning, and they rely on large data batches, which reduce the quality of learned models. But they have been the only platform available, so research in directions that could benefit from more flexible and more powerful parallel compute has been held back. New platforms like ours will not only massively speed up training on the current machine learning approaches to which GPUs are applied, but will also support the exploration of a much richer catalogue of models and algorithms for machine intelligence. Researchers will have the performance to explore new models, or to re-explore areas that showed promise but have been left behind.
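The batch-size point can be seen mechanically in a toy example (a sketch for illustration only, not a proof of the model-quality claim): a large batch averages many examples into a few smooth gradient steps, while a small batch takes many noisier steps over the same data:

```python
import numpy as np

rng = np.random.default_rng(1)
xs = rng.normal(size=512)
ys = 2.0 * xs + rng.normal(scale=0.1, size=512)   # true slope is 2.0

def sgd(batch_size, lr=0.05, epochs=20):
    """Fit y = w*x by minibatch gradient descent on squared error."""
    w = 0.0
    for _ in range(epochs):
        for i in range(0, len(xs), batch_size):
            xb, yb = xs[i:i + batch_size], ys[i:i + batch_size]
            grad = np.mean(2 * (w * xb - yb) * xb)  # averaged over the batch
            w -= lr * grad
    return w

print(sgd(batch_size=512))  # one large, smooth step per epoch
print(sgd(batch_size=8))    # 64 small, noisy steps per epoch
```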
Towards the end of last year, and at NIPS especially, we heard a consensus from many industry experts that new machine learning hardware, like the IPU, is going to start making a significant impact on the market.