Yemeni coffee—how genetically diverse is it?

Insights into the land that gave coffee to the world.


In the long and storied history of arabica coffee, Yemen holds a very special place. While Ethiopia is rightly hailed as the “birthplace of coffee”—in scientific terms, it is the evolutionary center of origin, where the species first arose from a spontaneous mating between two ancestor species—Yemen is the place that gave coffee to the world.

Historical records indicate that coffee seeds were taken from the coffee forests of southwestern Ethiopia across the southern tip of the Red Sea to Yemen in the mid-15th century, where coffee was first cultivated as a commercial crop. Starting in the 18th century, coffee from Yemen began to spread around the world on European trading routes, forming the basis of modern arabica coffee cultivation.

A 2020 study of arabica coffee genetic diversity confirmed the story of Yemeni coffee and established definitively that Yemen is the secondary dispersal center for arabica coffee that originated in Ethiopia. Nearly all of the arabica coffee in the world (that is, all the coffee cultivated outside of Ethiopia) descends from the early coffee farms of Yemen.

An additional study in 2021 by Montagnon, Mahyoub, Solano and Sheibani, published in Genetic Resources and Crop Evolution, found that Yemen contains unique genetic diversity not found in the rest of the world.

So what do we know about the genetic diversity of Yemeni coffee?

Until recently, very little was known about the diversity of Yemen’s coffees beyond anecdote and observation. In 2014, WCR partnered with Dr. Al Hakimi of Sana’a University to explore the diversity of Yemeni coffees as part of a larger analysis of arabica genetic diversity. The study examined 736 accessions of C. arabica: 648 from the CATIE germplasm collection (most collected from Ethiopian forests and farms in the 1960s and 70s) and 88 from Yemen provided by Sana’a University. It also included 35 accessions of C. canephora and 10 of C. eugenioides. The study was published in Nature Scientific Reports in January 2020.

The samples in the Nature Scientific Reports study were analyzed using a method called genotyping by sequencing (GBS) to identify single-nucleotide polymorphism (SNP) markers. The research was led by World Coffee Research, Istituto di Genomica Applicata (Italy), and CIRAD (France), in collaboration with the Italian universities of Trieste, Udine, Padova and Verona, with key contributions from CATIE, the University of Sana’a in Yemen, and Texas A&M University. It was funded by illycaffè and Lavazza. (Read more about the full study.)
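To make the idea of SNP-based comparison concrete, here is a minimal sketch of how genotype calls can be turned into pairwise genetic distances. It is not the study's actual pipeline. The sample names and genotype values are invented toy data, and the 0/1/2 coding (count of the alternate allele at each SNP) is a simplifying assumption.

```python
from itertools import combinations

# Toy genotype calls (NOT real data): count of the alternate allele at each
# SNP site (0, 1 or 2). None marks a site with no call for that sample.
genotypes = {
    "ethiopia_a": [0, 2, 1, 0, 2, 1],
    "ethiopia_b": [2, 0, 1, 2, 0, None],
    "yemen_a":    [0, 2, 2, 0, 2, 2],
    "cultivar_a": [0, 2, 2, 0, 2, 2],
}

def snp_distance(a, b):
    """Mean per-site allele difference, scaled to 0..1, over sites called in both samples."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        raise ValueError("no SNPs genotyped in both samples")
    return sum(abs(x - y) for x, y in shared) / (2 * len(shared))

# Distance for every pair of samples, keyed by the pair of names.
distances = {
    (s1, s2): snp_distance(genotypes[s1], genotypes[s2])
    for s1, s2 in combinations(genotypes, 2)
}

# Print pairs from most similar to least similar.
for pair, d in sorted(distances.items(), key=lambda kv: kv[1]):
    print(pair, round(d, 3))
```

In this toy set the Yemeni sample and the cultivar are identical at every site, while the two Ethiopian samples are far apart. That mirrors the kind of pattern the study reports, but the numbers here are illustrative only.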

The study enhanced global knowledge about Yemeni genetic diversity in the following ways:

  • First, the study confirmed with genetic analysis the historical understanding that Yemen is a secondary dispersal center. In other words, arabica coffee originated in Ethiopia but spread to the world via Yemen. In scientific terms, Yemeni coffees are a sub-population of Ethiopian arabicas.
  • Second, it found that Yemeni coffees as a group were less diverse than the Ethiopian coffees studied. Ethiopian germplasm had by far the most diversity overall, as well as unique diversity that is not present in the Yemeni and worldwide cultivar samples. Even so, the authors found that arabica coffee overall had some of the lowest genetic diversity reported for any major crop in the world, a consequence of its recent evolutionary origin from a single mating event somewhere around 10,000 years ago.
  • Third, the study found that the Yemeni accessions included in the study overlapped with the main cultivated varieties worldwide. See figure 1 below.

Figure 1. The genetic distance between samples studied in the Nature Scientific Reports paper, colored by geographical groupings (left). The results illustrate that Ethiopian accessions have by far the widest genetic diversity of any group, and that the studied Yemeni samples overlap with the cultivated varieties in the landrace and Typica/Bourbon groups. Source: Unpublished figure from lead author Lucile Toniutti. The findings correspond with and corroborate the historical understanding of the movement of arabica coffee out of Ethiopia to Yemen and then the world, with corresponding severe constriction of genetic diversity at each major movement (right). Source: Anthony et al. (2002). The origin of cultivated Coffea arabica L. varieties revealed by AFLP and SSR markers.

Does Yemeni coffee contain new diversity beyond what the world is already familiar with?

Yes. A new study, published in 2021, revises the findings of the 2020 study. The Yemeni samples included in the first study were taken from a wide area, but did not cover every growing area in the country. The 2021 study included analysis of a breeding population of 45 trees that includes genotypes (i.e., distinct genetic material) that have not been seen in previous studies. The study finds:

"Yemen today is still holding most of the genetic diversity that it delivered to the world 300 years ago. Moreover, Yemen also hosts a unique specific genetic diversity. Indeed, no world wide cultivated varieties in our study belongs to the New-Yemen cluster, meaning that either it spread out of Yemen in the eighteenth century but was lost or counter-selected en route or it simply never left Yemen."

The new study also confirms that Yemen holds most of the known C. arabica genetic diversity outside of Ethiopia.

Why would the novel genetic diversity found in the 2021 study have been missed by prior studies?

There are a number of reasons this could happen—researchers are nearly always looking at "samples," subsets of a whole population. The fact that the new Yemen diversity hasn't been seen in prior studies could indicate that the group has a very narrow and isolated geolocalization, or possibly it is broadly dispersed in the country, but is very infrequent and therefore escaped collection by prior researchers.
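The "infrequent and therefore missed" scenario can be illustrated with a short calculation. This is a sketch under stated assumptions, not an analysis from either study. It assumes trees are sampled independently and at random, uses the 88 Yemeni accessions from the 2020 study as the sample size, and tries a few hypothetical lineage frequencies. The function name `prob_missed` is invented for illustration.

```python
def prob_missed(frequency, sample_size):
    """Chance that none of `sample_size` randomly sampled trees carries a
    lineage that makes up `frequency` (0..1) of the whole population."""
    if not 0.0 <= frequency <= 1.0:
        raise ValueError("frequency must be between 0 and 1")
    if sample_size < 0:
        raise ValueError("sample_size must be non-negative")
    return (1.0 - frequency) ** sample_size

# 88 = Yemeni accessions in the 2020 study; the frequencies are hypothetical.
for freq in (0.01, 0.02, 0.05):
    print(f"frequency {freq:.0%}: missed with probability {prob_missed(freq, 88):.2f}")
```

Under these assumptions, a lineage found in only 1 of every 100 trees would be missed entirely by a sample of 88 about four times in ten. Real collecting is not random (it clusters by region and access), so a geographically isolated group could be missed even more easily.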

As the 2021 paper points out, it is possible that some of this material does overlap with material that exists today in Ethiopia. The most well-studied Ethiopian germplasm is that which was collected by the ORSTOM and FAO missions in the 60s/70s. Those collecting missions didn't collect extensively east of the Rift Valley or around Harrar (see discussion in the section called "Origin and history of the C. arabica coffee varieties cultivated worldwide"). It's possible that there is indeed overlapping material that just hasn't been "seen" by researchers because it's never been collected from the forest or genotyped. This lack of clarity is very common to origin/domestication work in all crops — it's not possible to collect and genetically test every single tree in a given area, so we are left with an incomplete picture. All scientific studies are imperfect snapshots of reality, but the scientific method inches us incrementally closer to reality.

Is it possible that there is variation found among Yemeni coffees that is not found in Ethiopian accessions?

Yes. The 2020 study found that Yemeni accessions were different from the Ethiopian ones (see figure 1). Why could this be? In Yemen, coffee has been cultivated for more than 500 years in very different conditions compared to the moist, densely shaded Ethiopian forests where it first evolved. Yemen is hot and dry and cultivation systems are full-sun. It is likely that very few of the original seeds brought from Ethiopia survived in the early days of Yemeni coffee cultivation. But the trees that did survive would have experienced intense selection pressure for full-sun growing systems and hot/dry conditions. Some of this advantage in the surviving Yemeni trees compared to their Ethiopian parents could have been due to random mutations that were noticed and selected by attentive farmers. The descendants of these trees are the ones that spread worldwide.


Arabica coffee evolved in the dense, moist, highland forests of Ethiopia (left), but was primarily domesticated in open-sun cultivation systems in the much hotter, drier highlands of Yemen. Photos: Jeff Kohler, iStock.

This raises interesting questions for future study. Can Yemeni trees help us learn more about heat and drought tolerance in coffee plants (traits that are in high demand with the accelerating impacts of climate change)? And can Ethiopian germplasm that did not experience intense selection pressure for full-sun cultivation provide opportunities to breed varieties that will thrive in shaded/agroforestry cultivation, another necessary path in the face of climate change?

So … should I be excited about Yemeni coffee?

Absolutely. If Yemen's genetic diversity is able to create value for Yemeni farmers—who are among the world's poorest and most oppressed, and who face the possible total collapse of coffee production under the weight of war and economic stagnation—it is incredibly meaningful, regardless of the scientific specifics.

Faris Sheibani, founder and CEO of Qima Coffee, and co-author of the 2021 study that identified unique genetic diversity in Yemeni coffee, puts it well: "Yemeni farmers have grown, protected and nurtured over generations these trees – what better resource is there than that? Shining the light of science on that, to deliver value to the farmer, is a beautiful opportunity."

This story was updated in December 2021, based on new research published in Genetic Resources and Crop Evolution.