I spoke briefly with Cillian on the last pod about Google's new TPU (Tensor Processing Unit), and I've since been able to find some better details about the hardware. Google has been exploring these coprocessors since about 2006 and, from what I can tell, began formally using TPUs to power its machine learning in 2015. The easiest way to think about the TPU is that it is similar to a GPU in that it performs thousands of calculations at once, but its ace in the hole is incredibly low latency in both pulling in and outputting data, all at far lower wattages than traditional GPUs.
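To make the "thousands of calculations at once" point concrete: the core of neural-network inference is repeated matrix multiply-accumulate, which is exactly the operation a TPU's matrix unit parallelizes in hardware. A minimal sketch (not Google's code; the layer sizes are made up):

```python
import numpy as np

rng = np.random.default_rng(0)

# One input example flowing through two hypothetical dense layers.
x = rng.standard_normal((1, 256))     # input activations
w1 = rng.standard_normal((256, 512))  # layer 1 weights (made-up sizes)
w2 = rng.standard_normal((512, 10))   # layer 2 weights

# Each matmul below is hundreds of thousands of multiply-accumulates;
# this is the work a TPU (or GPU) executes in parallel.
h = np.maximum(x @ w1, 0)  # matmul + ReLU
y = h @ w2                 # output logits
print(y.shape)             # (1, 10)
```

The first matmul alone is 256 × 512 multiply-accumulates per input, which is why dedicated matrix hardware pays off.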
Google claims that on its production AI workloads that use neural-network inference, the TPU is 15x to 30x faster than contemporary GPUs and CPUs. It also claims the TPU achieves much better energy efficiency than conventional chips: a 30x to 80x improvement in TOPS/Watt, i.e. tera-operations (trillion, or 10^12, operations) of computation per Watt of energy consumed.
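The TOPS/Watt metric is just throughput divided by power draw. A quick worked example with illustrative numbers (these are not Google's published figures):

```python
# TOPS/Watt = (operations per second / 10^12) / Watts consumed.
# Both inputs below are hypothetical, chosen only to show the arithmetic.
ops_per_second = 92e12  # e.g. 92 trillion ops/s (made-up peak throughput)
power_watts = 40.0      # made-up chip power draw

tops_per_watt = (ops_per_second / 1e12) / power_watts
print(f"{tops_per_watt:.2f} TOPS/Watt")  # 2.30 TOPS/Watt
```

So a chip that did the same work at, say, 10x the wattage would score 10x worse on this measure, even with identical raw speed.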
Google seems to be promising that it will start making these tools available to the public at large. This surprises me: generally, companies keep what differentiates them private and open up access only to things where they don't feel they hold a significant market advantage and instead want to commoditize the technology.