2025
Targeted integrating hyperspectral and metabolomic data with spectral indices and metabolite content models for efficient salt-tolerant phenotype discrimination in Medicago truncatula
Background
Plant phenomics has progressed rapidly in recent years, creating a demand to move from external characterization to internal exploration by combining data types. Hyperspectral and metabolomic data, which are linked by a cause-and-effect relationship, are prime candidates for integration. However, few efficient integration methods are available.
Results
Here, we show how to explore hyperspectral data by combining them with upper-level metabolomic data, and how to perform dimension reduction guided by the higher-level data in a target-trait-oriented manner to achieve high analysis efficiency. To verify feasibility, we designed a two-stage pipeline combining hyperspectral and metabolomic data to discriminate the salt-tolerant phenotype among Medicago truncatula mutants. With salt tolerance as the target trait, the data are combined in two ways: in primary screening, metabolite-based spectral indices are constructed to outline tolerance-related metabolic changes; in secondary screening, models convert hyperspectral data to metabolite content for detailed characterization. The target phenotype could be discriminated after five days of salt treatment, well before visible phenotypic differences appeared. Twenty mutants with the salt-tolerant phenotype were identified from about 1000 mutants, almost three times the number found by unintegrated analysis. The accuracy, verified experimentally by salt-tolerance analysis, reached 90 % and could in theory be raised to 100 % by using results from hierarchical-clustering-assisted Principal Component Analysis.
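The two screening stages can be illustrated with a minimal sketch on synthetic data. The abstract does not specify the form of the spectral indices or the type of content model, so the normalized-difference index, the ordinary least-squares model, the top-10 % cutoff, and all function and variable names below are assumptions chosen for illustration only, not the method used in the study.

```python
import numpy as np


def best_normalized_difference_index(reflectance, metabolite):
    """Search all band pairs (i, j), i < j, for the normalized-difference index
    (R_i - R_j) / (R_i + R_j) most strongly correlated with metabolite content.
    The index form is an assumption, not taken from the abstract."""
    n_bands = reflectance.shape[1]
    best = (0, 1, 0.0)
    for i in range(n_bands):
        for j in range(i + 1, n_bands):
            index = (reflectance[:, i] - reflectance[:, j]) / (
                reflectance[:, i] + reflectance[:, j]
            )
            r = np.corrcoef(index, metabolite)[0, 1]
            if abs(r) > abs(best[2]):
                best = (i, j, r)
    return best


def fit_content_model(reflectance, metabolite):
    """Fit an ordinary least-squares model (with intercept) that maps
    reflectance to metabolite content. The model type is an assumption."""
    design = np.column_stack([np.ones(len(reflectance)), reflectance])
    coef, *_ = np.linalg.lstsq(design, metabolite, rcond=None)
    return coef


def predict_content(coef, reflectance):
    """Apply a fitted model to new reflectance spectra."""
    design = np.column_stack([np.ones(len(reflectance)), reflectance])
    return design @ coef


# Synthetic stand-in data: 60 plants x 6 spectral bands (not real measurements).
rng = np.random.default_rng(0)
reflectance = rng.uniform(0.2, 0.8, size=(60, 6))

# Hypothetical tolerance-related metabolite driven by bands 1 and 4, plus noise.
nd_true = (reflectance[:, 1] - reflectance[:, 4]) / (
    reflectance[:, 1] + reflectance[:, 4]
)
marker = 5.0 * nd_true + rng.normal(0.0, 0.05, size=60)

# Primary screening: find the index, then flag plants above a hypothetical
# top-10 % cutoff on that index.
band_i, band_j, r = best_normalized_difference_index(reflectance, marker)
index_values = (reflectance[:, band_i] - reflectance[:, band_j]) / (
    reflectance[:, band_i] + reflectance[:, band_j]
)
candidates = np.flatnonzero(index_values > np.quantile(index_values, 0.9))

# Secondary screening: train a content model on 40 plants and predict the
# content of a second hypothetical metabolite for the remaining 20.
content = 2.0 + 3.0 * reflectance[:, 2] - 1.5 * reflectance[:, 5]
coef = fit_content_model(reflectance[:40], content[:40])
predicted = predict_content(coef, reflectance[40:])
```

The exhaustive pair search is only practical for a modest number of bands; with full hyperspectral resolution, restricting the search to wavelengths related to the target metabolites would be one way to keep the analysis trait-oriented.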
Conclusions
The mutant-screening pipeline presented here is a practical example of targeted data integration and data mining guided by upper-layer omic data. Targeted combination of phenomic and metabolomic data enables accurate phenotype discrimination and prediction from both external and internal aspects, providing a powerful tool for phenotype selection in new-generation crop breeding.