The trouble with carbon credits and counterfactuals

Why the “additionality” of some carbon offsets is putting the entire carbon market at risk, and what we can do about it

Carbon credits sound attractive to many corporations: if you can’t reduce your own emissions (e.g., airlines have a very hard time reducing emissions), you can pay someone else to reduce theirs (i.e., to not emit CO2 that they would otherwise have emitted) or even pay someone to pull carbon dioxide out of the atmosphere. And they’ve become pervasive: the word “net” in “net zero commitments” represents all the carbon credits countries and companies are counting on to offset their hard-to-reduce emissions. But the devil is in the details. Below I’ll lay out the problems hiding in the phrase “CO2 that they would otherwise have emitted” and propose what we can do about them. The phrase may sound simple at first, but it has everything to do with whether we can meet our commitments of not putting another incremental ton of CO2 in the atmosphere within 30 years.

With the expectation that carbon credits will be a critical part of the solution, headlines like this recent ProPublica article send chills down the spines of all carbon market participants: buyers and sellers; regulators and auditors. 

The subtitle, dramatic as it is, is also compelling. The Massachusetts Audubon Society manages “wildlife sanctuaries.” They claimed they could — in theory — heavily log almost 10,000 acres of preserved forests. And then by NOT logging, they generated over 600,000 carbon credits, which were sold to fossil fuel companies who used them to emit CO2 under California’s climate laws. But would Mass Audubon have ever logged that acreage in any feasible scenario? Highly unlikely — it’s against the organization’s mission and everything it stands for. It’s hard to hear this scenario and come to any conclusion other than that this system is perverse, if not broken entirely. The article suggests as much with a quote from Danny Cullenward, policy director at Carbon Plan: “The nearly universal pattern we see in the data [is that] those [carbon] projects are not delivering real climate benefits.” 

“What if…” and the not-so-little problem of additionality

Two seemingly innocuous words are at the root of ProPublica’s broad critique of carbon markets: “What if…?” And in those two words lies one of the biggest impediments to carbon markets scaling to the point where they can be a meaningful part of the climate solution.

Why “What if...”? For a carbon offset[1] to be worth something to a potential buyer, and to the planet, we need to believe that it represents either current carbon being removed from the atmosphere or future carbon not being emitted into it. In purchasing a carbon credit, you need to know you’ve altered our atmosphere for the better from what it would have been otherwise. This is where “what if…?” comes in.

“What ifs” are counterfactuals. They hypothesize that the world could be different than it is. For example: 

  • What if the Allies had not won WWII?

  • What if the Waxman-Markey bill had made it to the Senate floor?

  • What if we raised the federal minimum wage to $15/hour? 

The counterfactual we ask for carbon offsets is: what if a market for carbon credits did not exist? Would this tonne of carbon have been removed (or avoided) anyway? For example, we ask…

  • Would that solar array have been installed (i.e., good for the environment) without carbon financing? 

  • Would that tree have been planted (i.e., good for the environment) without carbon markets to help pay for it? 

  • Would that direct air capture contraption have been built (i.e., good for the environment) without income from carbon credit sales?[2]

  • And in the Mass Audubon scenario, would those forests have been logged (i.e., bad for the environment) without the purchase of carbon credits to prevent it?

In other words, is the positive environmental impact of the carbon asset incremental or additional to what we could expect if carbon markets didn’t exist? That’s why we call this feature “additionality” in the world of carbon projects and assets.

The ease of proving additionality varies significantly across different types of carbon projects. Carbon Plan, a nonprofit set up for the purpose of “Improving the transparency and scientific integrity of carbon removal and climate solutions through open data and tools”, evaluated 161 carbon removal projects submitted to Microsoft for its first 1M tonne carbon removal RFP. Carbon Plan used five criteria in the evaluation, the last being additionality, and reported how confidently it could validate each project against each criterion. For forestry and soil projects, they could not confidently answer the additionality question: “Is the project actually benefiting the climate or would it have happened anyway?” But for things like direct air capture or mineralization, they could.

If we turn this into a “what if”, we can see why it’s so different across project types. If you construct a machine whose sole purpose is to remove carbon from the atmosphere, the reason you’re doing that is to make money, and your sole revenue source is carbon offsets (how else would you make money with a machine that captures carbon from the air?). So, when we ask “What if we didn’t finance this carbon project with carbon credits?”, in this case, the answer is clear — without carbon financing, the direct air capture (DAC) project would not have happened. 

But a project that protects sections of rainforest from being cut down is a much harder counterfactual to prove. What if we didn’t finance this project with carbon credit sales? Can we be sure this segment of the rainforest would have been cut down? If we can’t confidently answer those questions, those credits may be “anyway tonnes” (i.e., tonnes of CO2e that would have been removed or abated anyway, even if you hadn’t bought the carbon credit). 

That, unfortunately, is what some prominent studies have found, such as this one, “Overstated carbon emission reductions from voluntary REDD+ projects in the Brazilian Amazon”.[3] This study evaluated all VCS-certified REDD+ projects in the Brazilian Amazon from 2008-2017.[4] The REDD+ projects they evaluated set their baselines as a continuation of historical trends.[5] But those trends “become unrealistic counterfactuals as the regional economic and political context change.” From 2004-2012, rates of forest loss declined in general across all of Brazil. Great news for the environment! Bad news for those carbon credits, because a baseline built on the earlier, higher rates of forest loss overstates how much forest the projects actually saved.

The study’s authors used that insight and constructed synthetic controls to create a more accurate counterfactual and concluded: “Overall, we find no significant evidence that voluntary REDD+ projects in the Brazilian Amazon have mitigated forest loss.” Statements like these, which question the additionality of some carbon projects, undermine trust in carbon credits as a whole.  
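To make the baseline problem concrete, here is a minimal sketch in Python of how the choice of counterfactual changes what a project can claim. Every name and number below is hypothetical and chosen only for illustration. None of it comes from the study, and a real synthetic control is far more involved than this simple comparison.

```python
def avoided_loss(baseline_loss_ha: float, observed_loss_ha: float) -> float:
    """Forest loss avoided relative to a baseline, in hectares (never negative)."""
    return max(baseline_loss_ha - observed_loss_ha, 0.0)


# Hypothetical annual forest loss for one project area, in hectares.
historical_trend_baseline = 200.0  # assumes pre-project loss rates simply continue
comparison_area_baseline = 80.0    # loss in similar, unprotected areas over the same period
observed_loss = 70.0               # loss actually measured inside the project area

claimed = avoided_loss(historical_trend_baseline, observed_loss)
supported = avoided_loss(comparison_area_baseline, observed_loss)
overstatement = claimed - supported

print(f"Claimed against historical trend: {claimed:.0f} ha")
print(f"Supported by comparison areas:    {supported:.0f} ha")
print(f"Potential 'anyway' hectares:      {overstatement:.0f} ha")
```

In this made-up case, most of the claimed benefit disappears once the baseline reflects what happened in comparable areas, which is the pattern the study describes.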

Without more certainty that purchasing carbon offsets causes positive climate impacts, carbon projects and the carbon market writ large will not warrant investment at scale. We must develop more reliable and valid methods for proving additionality to unlock the potential of carbon offsets as a major part of the climate solution. 

Breaking through the additionality logjam[6]

Carbon registries are generally considered the arbiters of “quality” in carbon markets.[7] Because we can’t see or touch a carbon credit, there must be clear rules/standards/protocols for defining when a project developer has successfully “manufactured” a carbon offset. Carbon registries approve those rules/standards/protocols for defining what makes a credit a credit.

In doing so, they have to walk a tightrope, balancing rigor and quality on one side against implementability on the other. If the standard is too rigorous, it will never be implemented.[8] If it’s too lenient, it will generate spurious credits that had no positive impact on the climate (recall: “anyway tonnes”) and are arguably detrimental because “these projects impose an indirect cost on legitimate climate change mitigation efforts by undercutting the price of their credits.”[9]

Today, the registries walk the tightrope by designing the most rigorous additionality standards that the current cohort of carbon credit “manufacturers” can implement. Those manufacturers tend to be small (usually tens of employees) with thin margins and lean budgets because the average carbon offset transacts at ~$3/tonne. On top of that, it’s a niche market (~100M tonnes annually) and, even more problematic, an oversupplied market.[10] Roughly twice as many credits were issued (i.e., created) in 2019 as were retired (i.e., used).[11] Some market participants and critics have described this dynamic as a “race to the bottom”.[12] The result is credits that are vulnerable to criticism about their quality.

But contrast this with how big pharma or big tech companies approach proving the counterfactuals that are critical to their businesses. With their vast resources and the lure of large market potential, they can invest in proving (and improving) the quality of their products. 

When a pharma company asks the FDA for permission to sell a new drug, they provide a tremendous amount of statistical data which proves the counterfactual on a large number of patients — that’s why some people enrolled in a randomized controlled trial receive a placebo, and some receive the actual drug.  That way they can truly prove (within the bounds of statistical confidence) what would have happened if a patient did or did not take a drug. 

For a tech company, A/B testing works similarly. A portion of users coming to the site get the usual experience (the equivalent of the control group or placebo), and the rest get a new experience. The results are compared to understand how new features impact the user experience.

Both these approaches are data intensive and expensive, but they are widely accepted as rigorous, valid methods for proving a counterfactual.[13]
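As a rough illustration of what “proving a counterfactual” looks like in a trial or an A/B test, the sketch below compares a control group with a treated group using a standard two-proportion z-test. The function name and the sample counts are hypothetical, and real trials involve much more (pre-registration, power calculations, multiple endpoints), so treat this as a sketch of the idea only.

```python
import math


def two_proportion_z_test(success_control, n_control, success_treated, n_treated):
    """Return (effect, z, two-sided p-value) for treated versus control rates."""
    rate_control = success_control / n_control
    rate_treated = success_treated / n_treated
    # Pooled rate under the null hypothesis that the treatment changes nothing.
    pooled = (success_control + success_treated) / (n_control + n_treated)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_treated))
    z = (rate_treated - rate_control) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_treated - rate_control, z, p_value


# Hypothetical counts: 1,000 people in each arm of the experiment.
effect, z, p_value = two_proportion_z_test(100, 1000, 130, 1000)
print(f"Estimated effect: {effect:.3f}, z = {z:.2f}, p = {p_value:.3f}")
```

The control group plays the role of the counterfactual: it estimates what would have happened without the drug or the new feature. Carbon projects rarely have an equivalent group that was assigned at random, which is part of why additionality is so much harder to establish.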

The comparison with pharma and tech illustrates the chicken-and-egg problem in the carbon market: there isn’t enough money in the carbon market for carbon credit producers to be able to afford raising the bar on proving additionality; but if we don’t increase the quality of carbon credits by raising the bar on additionality, buyers won’t pay a higher price. 

We must break through this logjam, and we can’t afford to ignore climate land management solutions where additionality is hardest to prove. Given that, I propose three solutions that could reduce the likelihood that we see more scathing articles criticizing dubious or even counterproductive nature-based carbon projects.

(1) Innovate other investment/financing mechanisms and business models to catalyze nature-based climate solutions. Based on how things are working today, it does appear that bogus carbon projects are putting the climate up as collateral to finance conservation efforts. Julio Friedmann, Chief Scientist at Carbon Direct, a carbon removal advisory firm, said, “One of the things that we are not fans of generally is avoided deforestation because we think that's basically a protection racket.”[14] I want Mass Audubon to have more money to do its great work preserving wildlife sanctuaries. But not at the expense of the climate.

Carbon financing is 30+ years old. It’s a space ripe for innovation. Arguably one of the most transformative innovations of the 20th century was the 30-year mortgage. It made the future available today: you could save for 30 years first and then buy a house; or you could buy it today and pay for it over the next 30 years. Part of what made it work was sheer volume. Millions of mortgages opened the opportunity for mortgage insurance and securitization, which are both ways of sharing risk across a vast portfolio of individual products which contain seemingly undiversifiable risk. In order to create carbon credit markets that function as fluidly as mortgage markets, we need scale and risk sharing. 

At its core, the modern mortgage was a bet on the American people that changed the lives of millions. We need to make similar bets on climate to change the lives of billions in future generations. As with the mortgage market, this may require government intervention to help the carbon market mature to its full potential at massive scale.

(2) Incentivize innovation directly tied to a higher standard for proving additionality. What unites the counterfactual approaches of medicine and big tech is significant investment (particularly in people’s time) and massive amounts of data. They do this because the ROI warrants it. We need that sort of investment here, particularly because the problem is so unique and thorny. Synthetic controls are a great start for figuring out this problem, but there is much more room to innovate. 

Though companies like Stripe and Microsoft are starting to provide incentives, it’s not enough. When Stripe ran their RFP, they purchased credits at prices ranging from $75 to $775 each, which are crazy prices in this market. But they limited the total to $1M, so they only purchased 6,600 credits. Microsoft, on the other hand, purchased 1.3M credits, but at an average price of $20/credit. Granted, $20 is much higher than today’s average price of $3/credit, but it’s not transformative. (For the record, I applaud both of these companies. Their efforts are exactly the sort of thing we need more of.)
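For a sense of scale, the arithmetic behind those two purchases can be sketched in a few lines of Python. The inputs are the figures quoted above. The one assumption is that Stripe spent its full $1M budget, which the text implies but does not state outright.

```python
# Figures quoted in the text above.
stripe_budget_usd = 1_000_000
stripe_credits = 6_600
microsoft_credits = 1_300_000
microsoft_avg_price_usd = 20
market_avg_price_usd = 3

# Assumption: Stripe spent the whole budget on the 6,600 credits.
stripe_avg_price_usd = stripe_budget_usd / stripe_credits
microsoft_spend_usd = microsoft_credits * microsoft_avg_price_usd
# How many average-priced credits the same $1M would have bought.
credits_at_market_price = stripe_budget_usd / market_avg_price_usd

print(f"Stripe average price:    ${stripe_avg_price_usd:,.0f} per credit")
print(f"Microsoft total spend:   ${microsoft_spend_usd:,.0f}")
print(f"$1M at the market price: {credits_at_market_price:,.0f} credits")
```

Stripe paid a high price for very little volume, while Microsoft bought volume at a price still far below what rigorous proof of additionality would cost.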

To ask a counterfactual of my own: what if Google, Amazon, Facebook, Stripe, and other companies at the cutting edge of carbon mitigation got together and pooled their demand for carbon credits? On top of that, what if they committed to paying $100/credit for nature-based carbon offsets whose proof of additionality is rigorous enough to hold up in the court of public opinion? That would transform the fundamental unit economics of carbon project developers. It would redefine the tradeoff between rigor and implementability. And, though I can’t prove it, I bet it would spark a tremendous amount of activity, investment, and innovation.

(3) Until we have #1 or #2, we could introduce shades of gray to additionality evaluations. Today, additionality is binary: the credit either is or is not additional based on the evaluation from carbon regulators. Buyers of the credits are left on their own to make more nuanced assessments of additionality, which is why carbon procurement consultants exist to help corporations get under the hood of credits to construct a portfolio. With a more nuanced assessment of carbon offsets, they can be priced according to their underlying risk and quality attributes, just as other securities are priced (e.g., mortgage bonds).
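One way to picture “shades of gray” is to attach a probability of additionality to each credit and look at the implied cost per tonne that is truly additional. The sketch below is only an illustration of that idea. The function name, the probabilities, and the prices are all hypothetical, and no registry scores credits this way today as far as the text above describes.

```python
def cost_per_additional_tonne(price_per_credit: float, p_additional: float) -> float:
    """Expected cost of one genuinely additional tonne, given a probability of additionality."""
    if not 0 < p_additional <= 1:
        raise ValueError("p_additional must be in the interval (0, 1]")
    return price_per_credit / p_additional


# Hypothetical credits: (label, price in USD per credit, probability it is additional).
credits = [
    ("cheap credit, weak evidence", 3.0, 0.25),
    ("pricier credit, strong evidence", 20.0, 0.80),
]

for label, price, p_additional in credits:
    effective = cost_per_additional_tonne(price, p_additional)
    print(f"{label}: ${price:.2f} sticker price, ${effective:.2f} per additional tonne")
```

Under these made-up numbers the gap between the two credits narrows from roughly 7x on sticker price to about 2x on a risk-adjusted basis, which is the kind of comparison a graded assessment would let buyers make.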

As long as compromised standards for proving additionality are being deployed, articles will be written exposing carbon credit counterfactuals that look ridiculous. The purchase of low quality credits represents misallocated funding: those dollars could have been put towards a solution that actually helps the climate. It also risks the credibility of the entire carbon asset class. But those stories should not be cautionary tales that make corporations stop investing until we have the perfect answer. 

Just the opposite. Our climate crisis is the tragedy of the commons played out on the largest canvas we have. We need to deploy the capital and provide the liquidity that will spark rapid innovation, AND we must shine a bright light on all of these innovations to determine whether they are effective or not. Sunlight is the best disinfectant. It’s also heating up our planet; we need to move quickly and boldly. 


[1] Note: I will use “carbon credit”, “carbon offset”, and “carbon asset” interchangeably. There are nuances, but basically these are all the same “carbon product” a corporation could purchase in order to hit its Net Zero targets. Selling them can also be called “carbon financing” for a project, i.e., if a project generates revenue from the sale of carbon credits/offsets, it receives “carbon financing”.


[2] Answer to solar array: 15 years ago, the answer would have been “no.” But today, with the cost per kilowatt hour dropping so dramatically, the answer may be “yes, it would have been installed anyway”.

Answer to tree: depends! We’ll talk about this more.

Answer to direct air capture: almost certainly no, it would not have been built otherwise, but we’ll talk about this more, too.


[3] REDD stands for Reducing Emissions from Deforestation and Forest Degradation.


[4] VCS (the Verified Carbon Standard, administered by Verra) is one of the major carbon registries, which act as arbiters of quality in the world of carbon credits.


[5] “Baselines” in essence represent the “business as usual” counterfactual scenario against which the project compares itself to issue its carbon credits.


[6] Pun intended.


[7] The major carbon registries are: Climate Action Reserve, Verra, Gold Standard, and American Carbon Registry.


[8] For example, this Nitrogen Management Protocol has never been used. And this Rice Cultivation Protocol has only issued 597 credits across two projects.


According to 2019 data reported by Forest Trends. Note that I’m discussing market averages. This market is opaque and transactions are “over the counter”, so there is a wide dispersion in prices and quality of credits created.




[13] The average cost of phase 1, 2, and 3 clinical trials for the FDA is $4M, $13M, and $20M respectively. Tech companies running A/B tests have enormous teams of product managers, software engineers, and statisticians designing and running these tests.