Volume 16, Issue 1
  • ISSN: 2772-574X
  • E-ISSN: 2772-5758

Abstract

Accurate prediction of breeding values is challenging, yet capturing the genotype-phenotype relationship is crucial for producing crops with elite genotypes. This paper investigates and predicts the phenotypic traits height and yield from genotype.

Background

Most existing studies focus on either genetic methods or machine learning models. In this work, we implemented a hybrid combination of genetic methods and machine learning models that accurately predicted the phenotypic traits yield and height, as well as subpopulation.

Methodology

Our proposed methodology for genomic prediction of yield in rice (Oryza sativa) involves a two-level classification approach. First, we classify biological sequences and cluster them into a phylogenetic tree using the UPGMA algorithm. Then, we use machine learning techniques such as Random Forest and K-Nearest Neighbours to predict genomic estimated breeding values (GEBVs) with 85-95% accuracy on rice subpopulations.
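
The two-level idea can be illustrated with a minimal sketch. It is not the authors' pipeline: the toy sequences, the accession names, the height values, the p-distance measure, and the helper names (`p_distance`, `upgma_clusters`, `predict_two_level`) are all assumptions made for illustration. Level one groups sequences by average-linkage (UPGMA) agglomeration; level two averages the phenotype of the k nearest neighbours inside the cluster closest to the query.

```python
from itertools import combinations

# Hypothetical toy data: aligned marker strings and plant heights (cm).
SEQS = {
    "acc1": "AAAAAAAA", "acc2": "AAAAAAAT", "acc3": "AAAAAATT",
    "acc4": "GGGGCCCC", "acc5": "GGGGCCCT", "acc6": "GGGGCCTT",
}
HEIGHTS = {"acc1": 100, "acc2": 102, "acc3": 104,
           "acc4": 140, "acc5": 142, "acc6": 144}


def p_distance(a, b):
    """Fraction of mismatching sites between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma_clusters(seqs, n_clusters):
    """Merge clusters by average linkage (UPGMA) until n_clusters remain."""
    dist = {frozenset(p): p_distance(seqs[p[0]], seqs[p[1]])
            for p in combinations(seqs, 2)}
    clusters = [[name] for name in seqs]

    def avg(c1, c2):
        # UPGMA distance: unweighted mean over all cross-cluster pairs.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: avg(clusters[p[0]], clusters[p[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters


def predict_two_level(seqs, phenotypes, query, n_clusters=2, k=2):
    """Pick the closest UPGMA cluster, then average the k nearest phenotypes."""
    clusters = upgma_clusters(seqs, n_clusters)
    # Level 1: cluster with the smallest mean distance to the query.
    best = min(clusters,
               key=lambda c: sum(p_distance(seqs[n], query) for n in c) / len(c))
    # Level 2: k-nearest-neighbour average within that cluster only.
    nearest = sorted(best, key=lambda n: p_distance(seqs[n], query))[:k]
    prediction = sum(phenotypes[n] for n in nearest) / len(nearest)
    return sorted(best), prediction


if __name__ == "__main__":
    print(predict_two_level(SEQS, HEIGHTS, "AAAAATTT"))
```

Restricting the neighbour search to one cluster is one plausible way to make predictions subpopulation-specific. A Random Forest could take the place of the k-nearest-neighbour step without changing the first level.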

Results

We achieved an accuracy of 93%, which we compare against the results reported in the literature cited in this paper.

Conclusion

This hybrid approach overcomes limitations of using genetic methods or machine learning models alone and effectively enhances crop breeding by capturing the genotype-phenotype relationship.

/content/journals/rafna/10.2174/012772574X281849240130120235
2024-02-15
2025-03-01

  • Article Type:
    Research Article
Keyword(s): genomic prediction; genotype; machine learning; Oryza sativa; phenotype; UPGMA