Mar 26, 2023·edited Mar 26, 2023

Thanks for the very informative survey! I agree with you that the Jevons paradox is a serious concern - we can develop more efficient ML, but if that just means we apply it indiscriminately, then we're worse off in the end. We should think more holistically, not just in terms of lifecycle but also impact - are the externalities of ML worth the gain to society (or to the environment, if we're talking about ML for positive environmental impact), or is it just for fun/money? This is one of the points we made in our EMNLP paper last year (https://aclanthology.org/2022.emnlp-main.159/). Another point was that, as you say, standardized reporting is important, but arguably widespread reporting is even more important, so that we don't need to personally contact dozens of authors like Luccioni and Hernandez-Garcia did to conduct a meta-analysis. If you overemphasize accuracy, then people will just avoid the headache of reporting unless they really have to. But if it's easy to report some details, then they might as well do it (this is also related to the topic of behavior change, which you wrote about in the previous post). That's why we proposed a relatively lightweight model card for this purpose.

Another very recent paper, which was just accepted to NoDaLiDa 2023, proposes an abstract metric to quantify the tradeoff between model development and deployment (https://openreview.net/forum?id=-O-A_6M_oi). I think they make a good point: sometimes we don't need perfectly accurate measures, but we do need to quantify the relative impact of the available alternatives in order to make an informed decision. For this purpose, too, it's better to have estimates for all alternatives even if they're not perfect.


What strikes me about the question of the CO2 emissions produced by training a machine learning model as a proportion of its energy input is that it exactly parallels the thermodynamic relationships encountered by all living systems. Karl Friston is a neuroscientist who developed the "Free Energy" theory for biological systems. Free energy is the amount of energy available to do work after subtracting the energy lost in all forms of entropy and heat from total energy. The CO2 produced when hydrocarbons burn is part of the energy lost to entropy.

In a simplified form, biological learning systems continuously compare their internal state model with a predicted state of the environment model and use sensory feedback signals to correct the internal state error distance. The "goal" of living things is to maximize the free energy available to do this. "Goal" implies some agency or intentionality, but it's really just natural selection at work. Those that are more fit to do this keep evolving; those that are less fit and incur a higher cost go extinct.

An interesting experiment would be to design a machine learning platform that incorporated feedback on its energy consumption and carbon footprint and applied an evolutionary model optimization algorithm to maximize the free energy available to accomplish its model training and inference functions.
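To make the idea a bit more concrete, here is a toy sketch of what the selection loop might look like. Everything in it is an assumption for illustration: the `accuracy` and `energy` functions are made-up proxies rather than real measurements, a single integer `width` stands in for a whole model architecture, and a simple energy-penalized fitness is swapped in for a true free-energy objective.

```python
import random


def accuracy(width):
    # Hypothetical proxy: accuracy rises with model width but saturates.
    return width / (width + 8.0)


def energy(width):
    # Hypothetical proxy: energy cost grows linearly with model width.
    return 0.01 * width


def fitness(width, energy_weight=1.0):
    # Reward task performance, penalize energy use (assumed trade-off).
    return accuracy(width) - energy_weight * energy(width)


def evolve(generations=50, population_size=20, seed=0):
    rng = random.Random(seed)
    # Start from random model widths between 1 and 128.
    population = [rng.randint(1, 128) for _ in range(population_size)]
    for _ in range(generations):
        # Keep the fitter half, then refill with mutated copies of it.
        population.sort(key=fitness, reverse=True)
        survivors = population[: population_size // 2]
        children = [max(1, w + rng.randint(-4, 4)) for w in survivors]
        population = survivors + children
    return max(population, key=fitness)


if __name__ == "__main__":
    best = evolve()
    print(best, round(fitness(best), 3))
```

With these made-up curves, selection should settle on a fairly small model, because past a certain width the extra accuracy no longer pays for the extra energy. A real version would need measured energy and carbon numbers in place of the proxies.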

If you gave such a system a novel task that required more energy to train, would it first demand more energy before it tried? Would it try to steal energy from other models in the lab? Would some of the models evolve different architectures that combined processing with, or outsourced functions to, other models in order to reduce their energy consumption? If it got greedy and started crashing energy competitors, could we ethically shut it down? Once machines develop the same energy imperatives as living systems, can we say they are alive?
