LDR 02425naa a2200229 a 4500
001 2128975
005 2021-01-05
008 2020 bl uuuu u00u1 u #d
024 7_ $a https://doi.org/10.1007/s11295-020-01430-6 $2 DOI
100 1_ $a SIMIQUELI
245 __ $a Entropy and mutual information in genome-wide selection $b the splitting of k-fold cross-validation sets and implications for tree breeding. $h [electronic resource]
260 __ $c 2020
520 __ $a Random k-fold cross-validation in genome-wide selection (GWS) can help to estimate predictive ability (r_yŷ). Predictive ability tends to be higher when training and validation sets present a high degree of kinship. However, many tree breeding populations are less genetically related to the training sets and have different levels of phenotypic diversity. Therefore, this study proposes methods of splitting k-fold cross-validation sets to optimize r_yŷ estimates that are consistent with the breeding population and verify the impact of phenotypic and genotypic distribution on GWS. Using a simulated Eucalyptus trait (h² = 0.5) and Pinus taeda L. data for diameter at breast height (h² = 0.31), six methods were developed based on mutual information (I) and entropy (H) for measuring genetic similarity and phenotypic dissimilarity, respectively. All methods were evaluated for r_yŷ, bias, minimum squared error of prediction, and genomic heritability. The Pearson correlations of these parameters with the kinship coefficient, and I and H between and within training and validation sets were also estimated. Our results show that closer genetic similarity did not significantly increase r_yŷ and that a lower H reduced r_yŷ and overestimated genomic breeding values. Consequently, phenotypic diversity (high H) should be added to tree breeding populations to increase genetic gain and reduce bias. The new methods accurately fitted models according to the entropy of tree breeding populations and their genetic relationship to the training sets. Therefore, these methods provided usable estimates of genetic gain to produce consistent success of long-term tree breeding programs.
650 __ $a Entropy
650 __ $a Eucalyptus
650 __ $a Pinus
650 __ $a Eucalipto
653 __ $a Genetic gain
653 __ $a Genome-wide selection
653 __ $a Mutual information
700 1_ $a RESENDE, M. D. V. de
773 __ $t Tree Genetics & Genomes $g v. 16, article number 37, 2020. 14 p.