The Coffea arabica genome has been sequenced, and it tells us more than you’d think

Coffea arabica (i.e. the species of coffee responsible for all the tasty coffee on Earth) has finally had its full genome sequenced. So what?

Some time in the middle of 2020 I came across this paper published in the scientific journal Nature. I was intrigued. Not only was the Coffea arabica genome fully sequenced, but using genetic data derived from arabica’s genome, the researchers uncovered some fascinating facts about arabica’s evolutionary history and its current genetic diversity. Nobody I knew in coffee was talking about this, and I wanted to know more, so I reached out to one of the paper’s authors to see if I could record a podcast on the subject.

To my delight, the author introduced me to his boss, Prof. Michele Morgante, who is Full Professor of Genetics at the University of Udine. Morgante is also the Scientific Director of the Institute of Applied Genomics, a non-profit whose technology was used for the sequencing of the arabica genome.

Morgante is a leader in his field and a fantastic communicator, and it was an absolute pleasure to chat to him. If you want to listen to our whole conversation, you can find it here.

If you want the TL;DR version, then read on. I’ve extracted the biggest takeaways from our conversation and listed them below.

Takeaway 1:

Coffea arabica is an allotetraploid, while its parent species (Coffea eugenioides and Coffea canephora) are diploids. This is important for understanding the origin and traits of arabica.

Many species, including arabica’s parents eugenioides and canephora (robusta), as well as us humans, are diploids. What is a diploid? Simply put, a diploid is an organism that has two copies of each chromosome in each cell. A quick high school biology reminder: chromosomes are made up of a long chain of DNA, and they contain the genetic blueprint for the creation of all the cells that make up the organism.

Importantly, arabica coffee is NOT a diploid. It is an allotetraploid. This means that each cell contains not two, but four sets of chromosomes, two derived from each parent species. This is a major difference, and it is related to the genesis of the arabica species. Arabica is the offspring of a hybridisation between the two separate species Coffea eugenioides and Coffea canephora.

Because these are two separate species, a cross between them would not normally produce fertile offspring. However, nature has a hack to get around this. In rare cases, a cross between two species results in the full set of chromosomes from each parent plant being retained in the offspring plant, resulting in double the number of chromosomes. This process creates a fertile new species, and is exactly how arabica was born. The diagram below shows how this occurs.
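For readers who like to see the bookkeeping, here is a minimal Python sketch of the same idea. It is a toy model, not anything from the paper: a plant is just a list of chromosome-set labels, and the function names and the "every set needs a partner" fertility check are my own simplifications. The one real number in it is the base chromosome count of 11 per set in Coffea, which gives 22 chromosomes for the diploid parents and 44 for arabica.

```python
# Toy model: a plant is a list of chromosome-set labels ("C" or "E").
# Labels and function names are illustrative assumptions, not from the paper.
CHROMOSOMES_PER_SET = 11  # base chromosome number in the genus Coffea

canephora = ["C", "C"]    # diploid: two canephora sets
eugenioides = ["E", "E"]  # diploid: two eugenioides sets


def reduced_gamete(plant):
    """Normal gamete: half of the plant's sets (every second one)."""
    return plant[::2]


def ordinary_cross(parent_a, parent_b):
    """Each parent contributes a normal, reduced gamete."""
    return reduced_gamete(parent_a) + reduced_gamete(parent_b)


def full_retention_cross(parent_a, parent_b):
    """Rare case: the full chromosome complement of both parents is kept."""
    return parent_a + parent_b


def chromosome_count(plant):
    """Total chromosomes: number of sets times chromosomes per set."""
    return len(plant) * CHROMOSOMES_PER_SET


def every_set_has_partner(plant):
    """Crude fertility proxy: each set type must occur an even number of times."""
    return all(plant.count(label) % 2 == 0 for label in set(plant))


ordinary_hybrid = ordinary_cross(canephora, eugenioides)     # one C set, one E set
arabica_like = full_retention_cross(canephora, eugenioides)  # two C sets, two E sets

for name, plant in [("ordinary hybrid", ordinary_hybrid), ("arabica-like", arabica_like)]:
    print(name, plant, chromosome_count(plant), every_set_has_partner(plant))
```

In this simplified picture the ordinary hybrid has one unpaired set from each parent, while the arabica-like plant has a matching partner for every set, and its own reduced gamete carries one C set and one E set.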

Takeaway 2:

All arabica plants that have ever existed originated from a single “mother tree” which came into existence between 10,000 and 50,000 years ago.

This really tripped me out. The entire arabica species is derived from a single plant? Wow. I think of this tree as the “mother tree”, a fertile cross-species hybrid born via the process described above.

Using computer modelling, the researchers were able to estimate when this “mother tree” was born by analysing present levels of genetic diversity within arabica and by calculating how long it would take for evolution to produce that level of diversity. This is how the 10,000–50,000 year figure came about.
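To give a feel for the kind of arithmetic involved, here is a back-of-the-envelope "molecular clock" sketch in Python. To be clear, this is not the researchers' model, which was far more sophisticated. It assumes that two lineages descended from one founder each pick up mutations at a constant rate, so the expected number of differences per DNA site is roughly 2 × mutation rate × generations. The function names and every input value below are placeholders I made up so the arithmetic is easy to follow; none of them are measured values from the paper.

```python
def generations_since_founder(pairwise_diversity, mutation_rate_per_generation):
    """Invert diversity ~ 2 * mu * t to estimate t (in generations)."""
    return pairwise_diversity / (2 * mutation_rate_per_generation)


def years_since_founder(pairwise_diversity, mutation_rate_per_generation,
                        generation_time_years):
    """Convert the generation estimate into years."""
    generations = generations_since_founder(
        pairwise_diversity, mutation_rate_per_generation
    )
    return generations * generation_time_years


# Placeholder inputs, NOT real data: chosen only to make the sums tidy.
example_diversity = 2e-4      # differences per site between two plants (assumed)
example_mutation_rate = 1e-8  # mutations per site per generation (assumed)
example_generation_time = 3   # years per generation (assumed)

estimate = years_since_founder(
    example_diversity, example_mutation_rate, example_generation_time
)
print(f"Estimated age of the founder: about {estimate:,.0f} years")
```

The point of the sketch is the shape of the relationship: less diversity or a faster mutation rate implies a younger species, and the answer is only as good as the assumed rate and generation time.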

Arabica is a remarkably young species. To put it into context, 10,000 years ago is roughly when humans began to develop agriculture.

Takeaway 3:

Compared to many other agricultural crops, arabica has an extremely small amount of genetic diversity.

If you think about it, arabica’s lack of intra-species genetic diversity is not particularly surprising. First up, you’ve got an extremely young species. Arabica trees also take quite a long time to mature, bear fruit, and reproduce (about three years). Not only that, but as we discussed, the entire species is derived from a single plant. Add all that together, and you end up with a species that has very little intra-species genetic diversity to play with when breeding new varieties. This is why Michele believes we need to go back to the drawing board and re-introduce genetic diversity from other sources, including the parent species of arabica.

Takeaway 4:

F1 hybrids are crosses between genetically distant cultivars within a species.

F1 hybrids have been gaining popularity in arabica coffee breeding, and with good reason. F1 hybrids, for reasons that are not completely understood, tend to have higher vigour than either parent. In the podcast, Michele discusses different theories as to why F1 hybrids display these characteristics.

Takeaway 5:

Oranges are a cross between pomelos and mandarins!

Ok, I know this is not coffee related, but this also blew my mind. Did you know that all sweet oranges, including blood oranges, navel oranges, Valencia oranges etc., are not only the same species, but also share an identical genotype? Put another way, every sweet orange out in the world is an identical twin to all other sweet oranges. The only differences between the orange varieties are what are called ‘somatic mutations’. These are genetic mutations that are not inherited through seed, but can be passed on through cell division, or by planting a cutting from a tree which has the mutation.

Takeaway 6:

If you’ve read this far, you should go ahead and listen to the podcast!

C’mon keen bean, you know you want to.




ECRE. Co-roasting, coffee academy, store, and events. 👉🏻 Sydney, Australia.