Using machine learning to predict post-storage fruit quality with microbiome, climatic and imaging data

Reference: CTP_FCR_2019_1

Lead academic supervisors: Prof Xiangming Xu, Dr Bo Li (NIAB)
University supervisor: Prof Xia Hong, Department of Computer Science, University of Reading

This student is to be registered with the University of Reading.

In the UK, apples are usually stored in a controlled atmosphere for a minimum of six months before marketing. It is not unusual for stored fruit to suffer 10-15% post-storage losses from various causes, including physiological disorders and fungal rots. This leads not only to yield losses but also to higher costs for sorting fruit after storage. Fruit storability (i.e. post-storage fruit quality) can be affected by many factors, including flowering time, fruit ripeness at picking, fruit surface microflora, and climatic factors. Furthermore, the relationships between fruit quality and these factors are usually non-linear, and the precise causal relationships have yet to be elucidated. Machine learning (ML), along with related approaches such as deep learning and data mining, is well suited to analysing such large data sets and generating rules that predict post-storage fruit quality before the fruit goes into storage.

Objectives

To use ML to develop predictive rules relating climatic conditions, microbial epiphytes on fruit surfaces and imaging information to post-storage fruit quality.

Approaches

Predicting fruit ripeness from climatic data only
Predicting the degree of fruit ripeness is critically important because it is well established that fruit ripeness at picking can significantly affect fruit storage potential. Previous research on predicting fruit ripeness has been based on batches of fruit. We will study whether climatic data can be used to predict the temporal pattern of flowering and fruit ripeness. Historic data collected at NIAB EMR over the last 40 years will be used to study the temporal flowering pattern. To understand the variability of fruit ripeness, we shall follow the development of individual fruit to investigate whether ripeness depends on post-blossom temperatures as well as on the actual flowering time.
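
As a rough illustration of how flowering time and post-blossom temperature could be combined into a single predictor, the sketch below accumulates degree-days from a fruit's flowering date up to picking. The base temperature of 4 °C, the function name and the sample temperatures are assumptions made purely for this example; they are not values or methods taken from the project.

```python
from datetime import date, timedelta

def degree_days_since_flowering(daily_mean_temps, flowering_date, picking_date, base_temp=4.0):
    """Accumulate degree-days from flowering_date (inclusive) to picking_date (exclusive).

    daily_mean_temps maps a date to that day's mean temperature in degrees C.
    base_temp is an assumed threshold below which no development is counted.
    Days with no temperature record are skipped.
    """
    total = 0.0
    day = flowering_date
    while day < picking_date:
        temp = daily_mean_temps.get(day)
        if temp is not None:
            total += max(0.0, temp - base_temp)
        day += timedelta(days=1)
    return total

# Hypothetical daily mean temperatures for five consecutive days.
temps = {
    date(2019, 5, 1) + timedelta(days=i): t
    for i, t in enumerate([10.0, 12.0, 3.0, 8.0, 15.0])
}

# Two fruit picked on the same day but flowering two days apart.
early = degree_days_since_flowering(temps, date(2019, 5, 1), date(2019, 5, 6))
late = degree_days_since_flowering(temps, date(2019, 5, 3), date(2019, 5, 6))
```

In this toy case the earlier-flowering fruit accumulates more degree-days than the later one, which is the kind of per-fruit difference that following individual fruit is intended to capture.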

Predicting fruit ripeness from imaging data
Under commercial conditions, manually tracking the development of individual fruit is not feasible. However, with recent developments in imaging technology, it may soon be possible to obtain imaging information for individual fruit at different time points during fruit development. We expect that imaging information obtained closer to the picking date will lead to better predictions of fruit ripeness. In this work package (WP), we shall obtain imaging information for individual fruit at multiple time points and use ML algorithms to assess (1) whether imaging information alone can lead to better predictions of fruit ripeness than those based on climatic data and (2) whether using both climatic and imaging data can improve prediction accuracy.

Relating microbial epiphytes to latent fungal infection
Pre-harvest latent fungal infection of fruit may lead to 5-10% post-harvest losses. Studies from other fruit crops indicate that epiphyte microbiomes on fruit surfaces could significantly affect the establishment of latent infection in fruit by fungal pathogens. Here, we aim to determine whether post-harvest rot development of individual fruit is related to epiphyte microbiomes on the fruit surface. Amplicon-based meta-barcoding technology will be used to characterize microbiomes on individual fruit surfaces. ML will then be used to investigate whether microbiome data could predict the probability of post-storage rotting of individual fruit.

Predicting post-storage fruit quality with surface microbiomes, climatic and imaging data
In this WP, all the information obtained in the previous three WPs will be integrated to develop rules for predicting post-storage fruit quality. We may need to extend current ML algorithms because the predictors differ greatly in nature.
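
One simple way to keep a large block of predictors (for example many microbiome features) from swamping a small one (for example a single climatic summary) is block scaling. The sketch below standardises each feature and then down-weights each block by the square root of its number of features. It is offered only as an illustration of the problem; the block names, the data and the choice of block scaling are assumptions, not the approach the project will necessarily take.

```python
import math
from statistics import mean, pstdev

def standardise(columns):
    """Z-score each feature column; constant columns become all zeros."""
    out = []
    for col in columns:
        mu, sigma = mean(col), pstdev(col)
        out.append([(x - mu) / sigma if sigma > 0 else 0.0 for x in col])
    return out

def block_scale(blocks):
    """Combine predictor blocks into one table with one row per fruit.

    blocks maps a block name to a list of feature columns, each column
    holding one value per fruit. Every column is standardised, then
    multiplied by 1/sqrt(number of columns in its block) so that each
    block contributes the same total variance.
    """
    combined = []
    for columns in blocks.values():
        if not columns:
            continue  # nothing to add for an empty block
        weight = 1.0 / math.sqrt(len(columns))
        for col in standardise(columns):
            combined.append([weight * x for x in col])
    return [list(row) for row in zip(*combined)]

# Hypothetical data for three fruit: one climatic feature, four microbiome features.
blocks = {
    "climate": [[100.0, 120.0, 140.0]],
    "microbiome": [
        [0.1, 0.2, 0.3],
        [0.5, 0.4, 0.3],
        [0.2, 0.3, 0.1],
        [0.2, 0.1, 0.3],
    ],
}
rows = block_scale(blocks)
```

After scaling, the single climatic column and the four microbiome columns together carry equal total variance, so distance-based or regularised methods do not favour the microbiome block merely because it has more features.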

Applying for this studentship

We are looking for a highly skilled PhD student with a background either in mathematics/statistics and a strong interest in agriculture, or in biology and excellent skills in mathematics/statistics. Although the research focuses on the application of machine learning, the student is expected to be directly involved in practical experimental studies to obtain new data for modelling.

The most important eligibility criterion for this funded studentship is residency:

  1. UK students: If you have been ordinarily resident in the UK for three years you will normally be entitled to apply for a full studentship, covering tuition fees and a maintenance stipend.
  2. EU students: If you have been ordinarily resident in another EU country (outside the UK) for three years you will normally be able to apply for a tuition fees-only award (without a maintenance stipend). If you have lived in the UK for three years you may be eligible for a full studentship.

This eligibility is unaffected by Brexit. The UK Government has guaranteed EU eligibility for Research Council funding for PhDs beginning before the end of the 2019-20 academic year.

Anyone interested should request an application form from recruitment@emr.ac.uk and return the completed form to the same address before the deadline of 28th February 2019.