Enhancing Forecast Accuracy: The Impact of Data Transformation in Time Series Models
DOI: https://doi.org/10.15611/eada.2025.4.03
Keywords: forecasting, stochastic process models, recurrent neural networks
Abstract
Aim: The aim of the article was to formulate recommendations on which preprocessing method is preferable for various forecasting algorithms, including machine learning approaches, with particular focus on forecasting stock values.
Methodology: An empirical study of stock value prediction for a sample of 10 typical NYSE-listed companies, comparing five data-preparation scenarios.
Results: The results confirm the theoretical assumptions and support recommendations for the proper design of benchmark studies and real-world forecasting models.
Implications and recommendations: As stated in the literature on the subject, for models based on stochastic processes, such as ARIMA and GARCH, transforming the data to rates of return (a form of differencing) is the preferred approach. For machine learning models, especially recurrent neural networks such as the Long Short-Term Memory (LSTM) network and the Gated Recurrent Unit (GRU), min-max normalisation should be applied. For exponential smoothing and Brownian-motion methods, the best results were achieved with non-transformed (raw) data. Guidance relevant to benchmark studies and real forecasting models is presented in the final section of the paper. The central thesis, illustrated through the example of stock value forecasting, is that proper benchmark studies and real-life applications should be designed so that the appropriate preprocessing is applied to each model; using the same preprocessing for different models may yield misleading results.
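The two named transformations can be sketched as follows. This is an illustrative sketch only, not the paper's implementation; the exact preprocessing scenarios compared in the study may differ in detail (e.g. in how edge values or the normalisation range are handled).

```python
def to_returns(prices):
    """Convert a price series to simple rates of return,
    r_t = (p_t - p_{t-1}) / p_{t-1} -- a form of differencing
    that removes the price level, as suggested for ARIMA/GARCH."""
    return [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]

def min_max_normalise(series, lo=0.0, hi=1.0):
    """Linearly rescale a series into [lo, hi], the transformation
    suggested for recurrent networks such as LSTM and GRU."""
    s_min, s_max = min(series), max(series)
    scale = (hi - lo) / (s_max - s_min)
    return [lo + (x - s_min) * scale for x in series]

prices = [100.0, 102.0, 101.0, 104.0]
print(to_returns(prices))         # [0.02, -0.00980..., 0.02970...]
print(min_max_normalise(prices))  # [0.0, 0.5, 0.25, 1.0]
```

Note that the returns series is one element shorter than the input, and that min-max normalisation depends on the observed minimum and maximum, so in a real forecasting pipeline the scaling parameters should be fitted on the training window only.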
Originality/value: The topic of data preparation and transformation, although commonly discussed in the literature on the subject, is rarely verified by research studies on real datasets. To the best of the author's knowledge, this type of analysis has not previously been conducted on real data.
References
Bollerslev, T. (1986). Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics, 31(3), 307-327.
Box, G. E. P., & Jenkins, G. M. (1970). Time series analysis: Forecasting and control. Holden-Day.
Box, G. E. P., Jenkins, G. M., Reinsel, G. C., & Ljung, G. M. (2016). Time series analysis: Forecasting and control (5th ed.). Wiley.
Brownlee, J. (2019). Deep learning for time series forecasting: Predict the future with MLPs, CNNs and LSTMs in Python. Machine Learning Mastery.
Cho, K., van Merriënboer, B., Bahdanau, D., & Bengio, Y. (2014). On the properties of neural machine translation: Encoder–decoder approaches. In Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation (pp. 103-111). Association for Computational Linguistics. https://doi.org/10.3115/v1/W14-4012
Chollet, F., & Allaire, J. J. (2018). Deep learning with R. Manning.
Einstein, A. (1905). On the movement of small particles suspended in stationary liquids required by the molecular-kinetic theory of heat. Annalen der Physik, 17, 549-560.
Enders, W. (2015). Applied econometric time series (4th ed.). Wiley.
Engle, R. F. (1982). Autoregressive conditional heteroscedasticity with estimates of the variance of UK inflation. Econometrica, 50(4), 987-1007.
Fischer, T., & Krauss, C. (2018). Deep learning with long short-term memory networks for financial market predictions. European Journal of Operational Research, 270(2), 654-669. https://doi.org/10.1016/j.ejor.2017.11.054
Francq, C., & Zakoïan, J.-M. (2019). GARCH models: Structure, statistical inference and financial applications (2nd ed.). Wiley.
Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. MIT Press.
Hamilton, J. D. (1994). Time series analysis. Princeton University Press.
Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735-1780. https://doi.org/10.1162/neco.1997.9.8.1735
Holt, C. C. (1957). Forecasting seasonals and trends by exponentially weighted moving averages (ONR Memorandum, Vol. 52). Carnegie Institute of Technology.
Tsay, R. S. (2010). Analysis of financial time series (3rd ed.). Wiley.
Wiener, N. (1923). Differential space. Journal of Mathematics and Physics, 2(1), 131-174.
Winters, P. R. (1960). Forecasting sales by exponentially weighted moving averages. Management Science, 6(3), 324-342.
Zhang, D., Chen, S., & He, Z. (2020). A deep learning framework for financial time series using stacked autoencoders and long–short term memory. PLOS ONE, 15(1), e0226997. https://doi.org/10.1371/journal.pone.0226997
License
Copyright (c) 2025 Andrzej Dudek

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Accepted 2025-11-21
Published 2025-12-17