Trends in the Compute Requirements of Deep…

Apr 4, 2023

How large and inefficient are deep models getting?

1 Comment

Apr 22, 2023

New hardware architectures like the Wafer Scale Engine (WSE) CS-2 system from Cerebras are achieving remarkable improvements in scaling and efficiency over the Nvidia A100 GPU. Here's a link to an article that shows how this has affected models in big pharma and the oil industry.

https://www.nextplatform.com/2022/03/02/cerebras-shows-off-scale-up-ai-performance-for-big-pharma-and-big-oil/

The comparison of energy efficiency between the CS-2 and the A100 is even more striking. Despite its much better scaling and performance, the CS-2 uses about half the watts per FLOP.

https://khairy2011.medium.com/tpu-vs-gpu-vs-cerebras-vs-graphcore-a-fair-comparison-between-ml-hardware-3f5a19d89e38

