No, Machine Learning Does Not Have A Huge Carbon Debt



Published on November 30th, 2019 by Michael Barnard




As part of the CleanTechnica series on the use of machine learning in advancing our low-carbon future, it would be remiss not to point out the carbon debt of training the models. In my estimation, however, it’s not as bad as was reported earlier this year.

Let’s talk about the study itself, and the assumptions it made. The paper that made some headlines was Energy and Policy Considerations for Deep Learning in NLP by Strubell, Ganesh, and McCallum of the University of Massachusetts Amherst, and it was published in June of 2019. Strubell and McCallum are part of the team that built a state-of-the-art natural language processing model, LISA. That stands for linguistically informed self-attention, and as followers of the series will remember, attention is core to machine learning.

Some of the numbers provided for the CO2e emissions were quite large. One model, an advanced translation model referred to as the Evolved Transformer for neural architecture search, had a calculated carbon debt of 626,155 lbs of CO2e to train and optimize. Roughly 300 tons of CO2e is quite a bit, but some context is required, and then a recalculation with likely better assumptions.
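As a quick check on that rounding, here is a minimal sketch of the unit conversion. The constant and function names are illustrative only, and which kind of "ton" is meant is an assumption, so both US short tons and metric tons are shown.

```python
# Sketch: convert the paper's reported footprint from pounds to tons.
# The article does not say which "ton" it uses, so both are computed.
LBS_PER_SHORT_TON = 2000.0        # US short ton
LBS_PER_METRIC_TON = 2204.62262   # metric ton (1,000 kg)

def lbs_to_short_tons(lbs):
    """Convert pounds to US short tons."""
    return lbs / LBS_PER_SHORT_TON

def lbs_to_metric_tons(lbs):
    """Convert pounds to metric tons."""
    return lbs / LBS_PER_METRIC_TON

nas_lbs = 626_155  # figure quoted from the paper
print(round(lbs_to_short_tons(nas_lbs), 1))   # about 313.1 short tons
print(round(lbs_to_metric_tons(nas_lbs), 1))  # about 284.0 metric tons
```

Either way, "roughly 300 tons" is a fair shorthand for the reported figure.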

As a reminder, neural nets are trained occasionally and then often used many times. Take Tesla’s machine learning model: the company has over 500,000 cars on the road with its neural net chips, and its Autopilot and Autosteer features are used by vastly more people than any competitor’s. As a result, when thinking about the carbon debt of training neural nets, we have to weigh it against the number of times they are actually executed, and for what purpose. Given that each Tesla displaces an internal combustion vehicle and that the cars are actually more efficient when using autonomous features, this is a highly virtuous use of machine learning.

As a different example, an earlier article in the series looked at the CoastalDem machine learning model. That use of machine learning took North American satellite radar coastal elevation data, trained it with ground truth from Lidar, validated it against Australian Lidar, and then ran it for the entire world. The model was executed a few times, but the end result is a static dataset of adjusted coastal elevations which is being referenced around the world for policy and climate action planning. In this instance, the understanding of actual threat from climate change and the multiple reuse of the outputs outweighs the carbon debt.

Not all examples are so beneficial, of course. Recently, an article in the series assessed the Heliogen improvements of focusing concentrated solar power (CSP), and found that while the machine learning portion was interesting and potentially reusable in other domains, the end results were very unlikely to be of any value. Certainly, the purported use cases for its higher heat CSP didn’t stand up to scrutiny.

Let’s look at the assumptions made by the research next. The key one I tested was the paper’s assumption of 0.954 lbs of CO2e per kWh for model training. That’s the US average, and as I looked at that I had a hypothesis that it was likely overstated given where most deep machine learning efforts were being performed.

To that end, I first pulled together the data on current state-by-state CO2e per kWh.

Chart by author from IEA data

As can be seen, the US average conceals a wide variance of potential CO2e debts for compute power. A model which is trained in Washington State on compute resources that are powered off of straight grid electricity would have a tenth of the carbon debt of one trained in Wyoming.

My hypothesis was that many of the models in the report would be California-based. California’s grid emits about 0.47 lbs of CO2e per kWh, roughly half the US average used in the paper.
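The comparison is a simple ratio of grid intensities. A minimal sketch, using only the two figures quoted in this article (the names are illustrative):

```python
# Sketch: how a grid's carbon intensity compares with the US average.
US_AVERAGE_LBS_PER_KWH = 0.954   # the paper's assumption
CALIFORNIA_LBS_PER_KWH = 0.47    # California grid figure cited above

def relative_intensity(grid_lbs_per_kwh, baseline_lbs_per_kwh):
    """Return the grid's intensity as a fraction of the baseline."""
    return grid_lbs_per_kwh / baseline_lbs_per_kwh

ratio = relative_intensity(CALIFORNIA_LBS_PER_KWH, US_AVERAGE_LBS_PER_KWH)
print(f"{ratio:.1%}")  # about 49.3% of the US average
```

Because training energy is unchanged, the same fraction applies directly to any model's calculated carbon debt.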

However, after determining this I then went deeper. I looked at each of the major models with a calculated carbon debt in the paper to see where they were actually trained, assuming that at least one or two would be trained in Google data centers, with Google’s 100% renewable commitment and offsets. The results were substantially at odds with my expectations.

Table by author

These are the models and associated training CO2e burden per the paper. When I dug into the compute resources used, I found that in all but one case they were Google or Azure compute resources used for learning. The 3rd through 6th columns are the variance calculation between what the paper suggested and what was likely accurate. To be clear, the NAS Evolved Transformer model still sees 10 tons of CO2e, which is considerable, but also a tiny fraction of the study’s assertion.

I had performed a rough assessment based on publicly available data earlier this year, What Is The Carbon Debt Of Cloud Computing? My assessment found that of the biggest Cloud providers, Google and Microsoft Azure had the lowest carbon debt by far, having not only a commitment to 100% carbon-neutral electricity that they were working to achieve, but also purchasing high-quality carbon offsets for their operations. That puts the CO2e per kWh down in the 0.033 lb range given the full lifecycle emissions of wind, solar, and hydro. Amazon’s AWS wasn’t as good, but had still achieved 50% renewables for its data centers in 2018, meaning its operations are far below the US average.
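To show how the recalculation works in principle, here is a minimal sketch: back out the energy implied by the paper's figure, then apply the lower intensity. It assumes the paper's pounds figure is simply kWh multiplied by 0.954, and that "tons" means US short tons; the function names are illustrative, not from the paper.

```python
# Sketch: rescale a reported training footprint to a different grid intensity.
US_AVERAGE_LBS_PER_KWH = 0.954   # the paper's assumption
RENEWABLE_LBS_PER_KWH = 0.033    # full-lifecycle estimate used in this article
LBS_PER_SHORT_TON = 2000.0       # assumption: "tons" means US short tons

def implied_energy_kwh(reported_lbs, intensity_lbs_per_kwh):
    """Energy implied by an emissions figure at a given grid intensity."""
    return reported_lbs / intensity_lbs_per_kwh

def rescaled_emissions_lbs(reported_lbs, old_intensity, new_intensity):
    """Same energy use, different grid intensity."""
    return implied_energy_kwh(reported_lbs, old_intensity) * new_intensity

nas_reported_lbs = 626_155  # NAS Evolved Transformer figure from the paper
nas_kwh = implied_energy_kwh(nas_reported_lbs, US_AVERAGE_LBS_PER_KWH)
nas_rescaled_lbs = rescaled_emissions_lbs(
    nas_reported_lbs, US_AVERAGE_LBS_PER_KWH, RENEWABLE_LBS_PER_KWH
)
print(round(nas_kwh))                                   # about 656,347 kWh
print(round(nas_rescaled_lbs / LBS_PER_SHORT_TON, 1))   # about 10.8 tons
```

Under those assumptions the largest model lands at roughly 10 tons of CO2e, about 3.5% of the paper's figure.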

The authors of the paper used a different approach to assessing data center loads. They started with a 2017 Greenpeace report on the subject, which is relatively solid. However, it doesn’t cite CO2e per kWh at all. Instead, it reports the mixes of electrical generation actually purchased and provides percentages of those. Unsurprisingly, all the major Cloud providers are buying a lot more low-carbon electricity than the average for the grid, but also unsurprisingly, they still have to purchase MWh that have been generated from coal and gas. I won’t quibble with Greenpeace’s methodology, but I do find a substantial variance between the bulk purchasing of renewable electricity by Google and Microsoft and the claims that their data centers run in large part on gas- and coal-generated electricity. I suspect that Google and Microsoft are buying sufficient electricity from renewables for their operations, but Greenpeace isn’t choosing to credit them with it.

But that’s not the largest issue with the assumption made by the paper. The paper reasons that since Amazon’s AWS is the most popular Cloud compute platform and its breakdown per Greenpeace was roughly the same as the US breakdown, the US average was appropriate to use. As can be seen from the resulting assessment in the table above, not one of the models assessed used Amazon, so that’s a bit of a problem with the reliability of their results.

To be clear, I’ve used an average full-lifecycle CO2e for renewables, assuming that Google and Microsoft purchase offsets to reach that level where they are not directly purchasing renewable electricity. They may also be purchasing offsets for the low full-lifecycle CO2e of the renewables themselves.

This isn’t to say we should disregard the study.

Chart courtesy openai.com

OpenAI (back to Elon Musk again) published an assessment of the compute cycles required to train machine learning models over the years. What they found is that major advances in machine learning capabilities showed an exponential growth in CPU cycles required, shown as a straight line on this logarithmic chart.
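For readers wondering why exponential growth shows up as a straight line, here is a minimal sketch. The doubling period below is a hypothetical value chosen for illustration, not a figure from the OpenAI chart.

```python
import math

# Sketch: exponential growth has a constant slope on a logarithmic axis.
DOUBLING_MONTHS = 6.0  # hypothetical doubling period, for illustration only

def compute_required(months, doubling_months, start=1.0):
    """Compute needed after `months`, doubling every `doubling_months`."""
    return start * 2 ** (months / doubling_months)

# Sample once a year for four years, then take log10 of each value.
values = [compute_required(m, DOUBLING_MONTHS) for m in range(0, 49, 12)]
logs = [math.log10(v) for v in values]

# Equal year-to-year steps in log space mean a straight line on a log chart.
steps = [later - earlier for earlier, later in zip(logs, logs[1:])]
print([round(s, 3) for s in steps])  # every step is the same size
```

Each year multiplies the compute by the same factor, so each year adds the same amount on the logarithmic axis.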

The increase in CPU cycles to advance machine learning has been accompanied by advances in efficiencies of computer technology and lower carbon electricity, but it’s worth paying attention to. It’s only going to increase.

Note: I’ve reached out to the study lead author for comment. Should they get back to me, the article will be updated.

About the Author

Michael Barnard is Chief Strategist with TFIE Strategy Inc. He works with startups, existing businesses and investors to identify opportunities for significant bottom line growth and cost takeout in our rapidly transforming world. He is editor of The Future is Electric, a Medium publication. He regularly publishes analyses of low-carbon technology and policy in sites including Newsweek, Slate, Forbes, Huffington Post, Quartz, CleanTechnica and RenewEconomy, and his work is regularly included in textbooks. Third-party articles on his analyses and interviews have been published in dozens of news sites globally and have reached #1 on Reddit Science. Much of his work originates on Quora.com, where Mike has been a Top Writer annually since 2012. He's available for consulting engagements, speaking engagements and Board positions.



December 5th, 2019

Perception Blog: Technology Comparison - Flash and Scanning LiDAR

A blog post by
David Brodie BSc (Eng), Sr. Project Manager of Perception and AI, LeddarTech®

My name is David Brodie. With this occasional blog, I hope to create a platform to exchange ideas on technology and solutions related to perception. I hope to discuss a selection of topics, some at a high level and some in more technical detail.

To start with, let’s consider a high-level analysis of one of the classic problems which every autonomous vehicle faces: detecting debris or small objects.

Consider first a classic scanning lidar that surveys its environment using several scan lines. Consider the following figure.

Figure 1: a scanning lidar’s view of the world

Theoretically, all four small objects could be detected. Objects 1 and 3 are detected as they are directly in the path of some scan line. Although object 2 is nearer than object 3, it is not detected as it falls between scan lines. Similarly, object 4 is not detected, though in range, as it too falls between scan lines. As a scanning lidar approaches a small object, the object will be detected and lost with increasing frequency until it is close enough to be hit by a scan line in every frame.
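The effect can be sketched with a very simple geometric model. Everything here is an assumption made for illustration: flat ground, evenly spaced scan lines, and hypothetical sensor height, line spacing, and object size. It does not describe any particular sensor.

```python
import math

def hit_by_scan_line(range_m, object_height_m, sensor_height_m, line_angles_deg):
    """True if any scan line's elevation angle falls within the angular
    extent of an object standing on flat ground at the given range."""
    # Elevation angle to the object's base (on the ground) and to its top.
    low = math.degrees(math.atan2(-sensor_height_m, range_m))
    high = math.degrees(math.atan2(object_height_m - sensor_height_m, range_m))
    return any(low <= angle <= high for angle in line_angles_deg)

# Hypothetical setup: scan lines every 2 degrees from -30 to 0 degrees,
# sensor mounted 1.5 m above the road, object 0.2 m tall.
SCAN_LINES_DEG = list(range(-30, 1, 2))
SENSOR_HEIGHT_M = 1.5
OBJECT_HEIGHT_M = 0.2

for range_m in (40, 30, 20, 15, 12, 10, 5):
    detected = hit_by_scan_line(
        range_m, OBJECT_HEIGHT_M, SENSOR_HEIGHT_M, SCAN_LINES_DEG
    )
    print(range_m, detected)
```

In this toy model the object is hit at some ranges and missed at others as the vehicle approaches, because its angular extent is smaller than the gap between scan lines until it is quite close.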

The following figure shows the same scene surveyed by a flash lidar like LeddarTech’s Pixell.

Figure 2: a flash lidar’s view of the world

In this case, the entire field of view of the lidar is illuminated. Rather than detecting point reflections, the flash lidar detects reflections from a segment. There are challenges here too. Object 4 only fills a small section of the relevant area or “segment”. Object 3 is larger but split across multiple segments and so also only fills a small portion of each segment. It may not be easy to detect but with good beam steering and signal processing, it is possible. Once an object becomes detectable, it is reliably and continuously detected from then on, as there are no gaps in the illumination.
