
I've been tracking some of the progress in hardware design that can address much of the energy loss in current GPU- and TPU-based architectures, mainly NVIDIA's A100. The most advanced seems to be the WSE (Wafer Scale Engine) architecture from Cerebras. Rather than printing 49-50 usable NVIDIA GPU dies on a wafer and deploying them into separate racks, Cerebras prints a single wafer with 850,000 cores all connected through extremely short conductors. The short distances, relative to rack-based systems, reduce the interconnect losses of conventional circuits, saving huge amounts of energy while increasing the potential clock speeds and FLOPs-per-watt performance.
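To make that energy argument concrete, here is a rough back-of-envelope sketch. All of the numbers below are illustrative assumptions (not Cerebras or NVIDIA specifications), but they show how data-movement energy dominates effective FLOPs/W once traffic has to leave the package and cross a rack.

```python
# Back-of-envelope sketch (not measured data): how interconnect energy
# affects effective energy per FLOP. Every number below is an illustrative
# assumption, not a vendor specification.

def energy_per_flop(compute_pj_per_flop, bytes_moved_per_flop, interconnect_pj_per_byte):
    """Total energy per FLOP = arithmetic energy + data-movement energy."""
    return compute_pj_per_flop + bytes_moved_per_flop * interconnect_pj_per_byte

# Assumed operational intensity: 0.1 bytes of off-core traffic per FLOP.
BYTES_PER_FLOP = 0.1

# Illustrative energy costs for moving one byte (picojoules per byte):
rack_scale_pj_per_byte = 100.0   # assumption: off-package links, switches, long traces
on_wafer_pj_per_byte   = 5.0     # assumption: short on-wafer wires between cores

COMPUTE_PJ_PER_FLOP = 1.0        # assumption: cost of the arithmetic alone

rack  = energy_per_flop(COMPUTE_PJ_PER_FLOP, BYTES_PER_FLOP, rack_scale_pj_per_byte)
wafer = energy_per_flop(COMPUTE_PJ_PER_FLOP, BYTES_PER_FLOP, on_wafer_pj_per_byte)

print(f"Rack-scale GPUs : {rack:.1f} pJ/FLOP")
print(f"Single wafer    : {wafer:.1f} pJ/FLOP")
print(f"Implied FLOPs/W advantage: {rack / wafer:.1f}x")
```

Plugging in different (real) energy-per-byte figures changes the ratio, but the structure of the calculation is the point: the shorter the wires, the smaller the second term.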

Of course, this comes with a price tag of "several million dollars" for their latest CS-2 system, but with efficient sharing and scheduling of a centralized system, that cost can be amortized across many users.

A very interesting publication comparing these architectures is available here:

https://khairy2011.medium.com/tpu-vs-gpu-vs-cerebras-vs-graphcore-a-fair-comparison-between-ml-hardware-3f5a19d89e38

Apr 18, 2023 · Liked by Dustin Wright

Thanks! It is easy, but as you say, you have to care enough to spend the 10 minutes learning how. How do we get people to care enough? Even I find it unreasonably hard to get my students to follow these simple steps, since it would mean having less time for the things they *are* rewarded for. What do you do, for example, when you work on or supervise a project that isn't itself about efficiency, carbon, etc.? How do you motivate yourself and others to follow the best practices?
