A correlation-embedded attention approach to mitigate multicollinearity in foreign exchange data using LSTM
Main Author: | |
---|---|
Format: | Final Year Project / Dissertation / Thesis |
Published: | 2023 |
Subjects: | |
Online Access: | http://eprints.utar.edu.my/6119/1/MBA_2023_LMHS.pdf http://eprints.utar.edu.my/6119/ |
Summary: | Technology now drives the collection of big data in many fields, including algorithmic trading, and has markedly increased the number of variables and data points (observations) that are collected and stored. While this offers opportunities to improve the modeling of relationships between predictors and response variables, it also presents challenges in data analysis, such as multicollinearity: the situation where two or more independent variables exhibit an approximately linear relationship. Existing feature selection methods can undermine efforts to gather more data, since they exclude newly collected variables and can thereby discard important and relevant information. Recent studies indicate that neural networks handle multicollinear data better than statistical estimators. This study therefore proposes two improvements to the Long Short-Term Memory (LSTM) neural network: integrating an attention mechanism and vector embeddings of correlation, which together address multicollinearity without eliminating features. The study compares regression and classification formulations for predicting the direction of the foreign exchange market, using EUR/GBP, EUR/USD, GBP/USD, and NZD/USD data sets over the 6-year period from 1 January 2015 to 31 December 2020. Specifically, it evaluates prediction accuracy and its impact on trading returns under high multicollinearity, and it assesses the difference between LSTM models with and without the proposed module. The results indicate that classification improves on regression by 23.33% in accuracy and by 132.62% in trading return over the test set, and that the proposed module yields a further 59.53% improvement in trading returns. These findings demonstrate the superiority of classification as a problem formulation in high-multicollinearity scenarios. The experimental results also reveal that neural networks can learn the relevance and redundancy of financial data to enhance classification performance. |
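The summary defines multicollinearity as an approximately linear relationship between two or more independent variables. The following is a minimal sketch of how that condition is commonly detected, using a correlation matrix and variance inflation factors (VIF). It is not the thesis's method: the data are synthetic, and the names `correlation_matrix`, `variance_inflation_factors`, `x1`, `x2`, and `x3` are hypothetical and chosen only for illustration.

```python
import numpy as np


def correlation_matrix(X):
    """Pearson correlation between the columns (features) of X."""
    return np.corrcoef(X, rowvar=False)


def variance_inflation_factors(X):
    """VIF per feature, read off the diagonal of the inverse correlation matrix.

    VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing feature j
    on all the other features. Large values signal multicollinearity.
    """
    return np.diag(np.linalg.inv(correlation_matrix(X)))


# Synthetic data (an assumption for illustration, not foreign exchange data).
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = 2.0 * x1 + rng.normal(scale=0.05, size=n)  # almost a linear function of x1
x3 = rng.normal(size=n)                          # unrelated to x1 and x2
X = np.column_stack([x1, x2, x3])

corr = correlation_matrix(X)
vif = variance_inflation_factors(X)

print(np.round(corr, 3))
print(np.round(vif, 1))
```

In this sketch the first two features should show a correlation close to 1 and very large VIF values, while the third stays near a VIF of 1. A feature selection method would typically drop one of the collinear pair, which is the loss of information the summary argues against.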