Result: Morphometric trait analysis and machine learning-based yield modeling in wood apple (Feronia limonia L.).
Further information
Background: Wood apple is a hardy yet underutilized fruit tree of the Indian subcontinent, valued for its nutritional, medicinal, and ecological significance. Despite its potential as a climate-resilient fruit species, the determinants of its yield variability remain poorly characterized. This study aimed to quantify how morphometric descriptors of canopy architecture, floral traits, and fruit traits explain yield variation across 62 wood apple genotypes. By integrating multivariate statistics with explainable machine-learning models (Random Forest with SHAP), we provide the first data-driven framework for identifying trait combinations that govern productivity in this underutilized tree species. The approach offers a novel, interpretable path toward ideotype selection and precision orchard design.
Results: Extensive morphometric variability was observed across the 62 genotypes for vegetative, foliar, floral, fruit, and seed traits, indicating a broad genetic base. Yield per tree ranged widely, from 35 to 127 kg, with a mean of 75 kg tree⁻¹. Principal Component Analysis (PCA) showed that canopy architecture, branch traits, and leaf-fruit attributes collectively explained 31.1% of the total variation. Correlation analysis revealed positive associations of yield with tree shape, pulp colour, and fruit-bearing tendency, whereas ornamental fruit traits and excessive spine density were negatively related to yield. The optimized Random Forest (RF) model achieved strong predictive performance on the test dataset (R² = 0.84; RMSE = 9.45 kg; MAE = 7.12 kg), significantly outperforming Multiple Linear Regression (R² = 0.62), Support Vector Regression (R² = 0.76), and the Deep Learning (MLP) model (R² = 0.71). RF identified tree shape (16%), open flower colour (11.3%), and pulp colour (9.0%) as the most influential predictors of yield. SHAP analysis further clarified the non-linear and interactive effects among traits, highlighting the combined influence of canopy vigour, reproductive efficiency, and fruit-quality attributes on productivity. Hierarchical clustering grouped the genotypes into three clusters. Cluster 2, characterized by compact canopies, superior reproductive traits, and desirable pulp features, showed the highest and most stable yield (mean 84.6 kg tree⁻¹). Cluster 0 displayed moderate-to-high yields (79.7 kg tree⁻¹) but with greater variability, while Cluster 1 comprised the lowest-yielding genotypes (70.4 kg tree⁻¹). These findings confirm that productivity in wood apple is jointly regulated by architectural and reproductive traits through coordinated source-sink dynamics.
Conclusions: Wood apple yield is governed by an integrated suite of architectural and reproductive traits, rather than single descriptors. Genotypes with compact canopies, regular bearing habit, and consumer-preferred pulp characteristics emerge as promising ideotypes for high productivity and orchard efficiency. By combining Random Forest and SHAP, this study demonstrates the practical value of explainable machine-learning tools in identifying actionable trait combinations and providing a robust, trait-based framework to support data-driven breeding and climate-smart orchard design in underutilized perennial fruit crops.
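The test-set metrics quoted in the Results (R², RMSE, MAE) follow standard definitions, which can be sketched in plain Python as below. This is an illustrative sketch only: the function name `regression_metrics` and the observed and predicted yields are hypothetical and are not taken from the study's data or code.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE and MAE for paired observed/predicted values."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    mean_true = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")

    return {
        "r2": 1.0 - ss_res / ss_tot,                     # share of variance explained
        "rmse": math.sqrt(ss_res / n),                   # same unit as the yield (kg)
        "mae": sum(abs(r) for r in residuals) / n,       # same unit as the yield (kg)
    }


# Hypothetical yields in kg per tree, chosen only to illustrate the arithmetic.
observed = [40.0, 60.0, 80.0, 100.0, 120.0]
predicted = [45.0, 55.0, 85.0, 95.0, 125.0]
metrics = regression_metrics(observed, predicted)
```

Because RMSE squares the residuals before averaging, it penalizes large errors more heavily than MAE, so RMSE is never smaller than MAE; the reported pair (9.45 kg and 7.12 kg) is consistent with that ordering.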
(© 2025. The Author(s).)
Declarations. Ethics approval and consent to participate: Not Applicable. Consent for publication: Not Applicable. Competing interests: The authors declare no competing interests.