• Variational mode decomposition (VMD) is applied to wind speed prediction.
• A novel double-layer echo state network (D-ESN) is developed and applied to wind speed prediction.
• A genetic algorithm (GA) is used to optimize the hyperparameters of the D-ESN model.
• The proposed model is verified on six wind speed datasets against ten comparison models.

Because wind speed is highly random, wind power generation is difficult to integrate into the grid. Reliable and accurate wind speed prediction is therefore essential for using wind energy effectively. To obtain accurate wind speed predictions, this study proposes a combined VMD-D-ESN model based on variational mode decomposition (VMD), a double-layer staged-training echo state network (D-ESN), and genetic algorithm (GA) optimization. First, the VMD-D-ESN model preprocesses the original wind speed data with VMD and then uses the D-ESN model to predict each decomposed subsequence. The final prediction is then obtained by combining the predictions of all subsequences. In the double-layer structure of the D-ESN model, the first layer selects the length of the training set, and the second layer corrects the prediction error of the first layer. In a case study using wind speed data from six collection sites, ten comparison models are established to benchmark the proposed model. The results show that, compared with traditional models, the model combining VMD with the D-ESN structure achieves high prediction accuracy and strong stability on all six datasets. The results also show that VMD strongly improves the prediction ability of the model.
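The decompose, predict, and recombine pipeline described above can be illustrated with a minimal NumPy sketch. It rests on several assumptions that are not taken from the paper: a moving-average split stands in for VMD, the second-stage network is trained on the first stage's errors over a held-out block (one plausible reading of "error correction"), the first layer's training-length selection is not modeled, the hyperparameters are fixed by hand rather than tuned by a GA, and the wind series is synthetic. All names (`decompose`, `lagged`, `ESN`, `two_stage_forecast`, `forecast`) are hypothetical.

```python
import numpy as np


def decompose(series, windows=(24, 6)):
    """Toy additive decomposition (NOT VMD): successive moving averages.

    The returned components sum back to the input series. A centred moving
    average looks at future samples, so this is for illustration only.
    """
    residual = np.asarray(series, dtype=float).copy()
    parts = []
    for w in windows:
        smooth = np.convolve(residual, np.ones(w) / w, mode="same")
        parts.append(smooth)
        residual = residual - smooth
    parts.append(residual)
    return parts


def lagged(series, n_lags):
    """Build (inputs, targets) for one-step-ahead prediction from past lags."""
    X = np.array([series[i:i + n_lags] for i in range(len(series) - n_lags)])
    return X, series[n_lags:]


class ESN:
    """Minimal echo state network with a ridge-regression readout."""

    def __init__(self, n_inputs, n_reservoir=50, spectral_radius=0.9,
                 input_scale=0.1, ridge=1e-4, seed=0):
        rng = np.random.default_rng(seed)
        self.w_in = input_scale * rng.uniform(-1, 1, (n_reservoir, n_inputs + 1))
        w = rng.uniform(-1, 1, (n_reservoir, n_reservoir))
        # Rescale the recurrent weights to the requested spectral radius.
        self.w = w * (spectral_radius / np.max(np.abs(np.linalg.eigvals(w))))
        self.ridge = ridge
        self.w_out = None

    def _states(self, X):
        """Drive the reservoir from a zero state; return [1, input, state] rows."""
        x = np.zeros(self.w.shape[0])
        rows = []
        for u in X:
            x = np.tanh(self.w_in @ np.concatenate(([1.0], u)) + self.w @ x)
            rows.append(np.concatenate(([1.0], u, x)))
        return np.array(rows)

    def fit(self, X, y):
        S = self._states(X)
        A = S.T @ S + self.ridge * np.eye(S.shape[1])
        self.w_out = np.linalg.solve(A, S.T @ y)
        return self

    def predict(self, X):
        return self._states(X) @ self.w_out


def two_stage_forecast(component, n_lags, n_first, n_second):
    """Assumed two-stage scheme: stage 2 learns the errors of stage 1."""
    X, y = lagged(component, n_lags)
    stop = n_first + n_second
    first = ESN(n_lags, seed=0).fit(X[:n_first], y[:n_first])
    base = first.predict(X)
    # Stage 2 is fitted on stage-1 errors over a block stage 1 never saw.
    second = ESN(n_lags, seed=1).fit(X[n_first:stop], (y - base)[n_first:stop])
    correction = second.predict(X[n_first:])[n_second:]
    return base[stop:] + correction, y[stop:]


def forecast(series, n_lags=8, n_first=300, n_second=100):
    """Predict every component separately, then sum the component forecasts."""
    preds, targets = [], []
    for part in decompose(series):
        p, t = two_stage_forecast(part, n_lags, n_first, n_second)
        preds.append(p)
        targets.append(t)
    return np.sum(preds, axis=0), np.sum(targets, axis=0)


# Synthetic stand-in for a wind speed record (not real measurements).
rng = np.random.default_rng(42)
t = np.arange(600)
wind = (8 + 2 * np.sin(2 * np.pi * t / 48) + np.sin(2 * np.pi * t / 7)
        + 0.3 * rng.standard_normal(600))

pred, actual = forecast(wind)
rmse = float(np.sqrt(np.mean((pred - actual) ** 2)))
print(f"test points: {len(pred)}, RMSE: {rmse:.3f}")
```

Because the components are additive, summing the per-component forecasts yields a forecast of the original series. In a fuller implementation, a real VMD routine would replace `decompose`, and a search procedure such as a GA would choose values like `n_reservoir`, `spectral_radius`, and `ridge`.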