Article Preview
TopIntroduction
Financial enterprises play a crucial role in the modern economy, with their financial health directly impacting market stability and investor confidence. These enterprises face a myriad of financial risks, encompassing market risk, credit risk, and operational risk. All of these risks are characterized by high complexity and uncertainty (Oyewo, 2022). Traditional financial risk management methods, while effective in certain contexts, often fall short when dealing with large-scale, multidimensional, and dynamically changing data (Xiaoli & Nong, 2021).
Rapid progress in data science and artificial intelligence, particularly the successful implementation of deep learning across various domains, has driven financial enterprises to adopt these emerging technologies to enhance their risk management capabilities (Wang et al., 2024). Convolutional neural networks (CNNs) and transformer models have attained notable success in image recognition and natural language processing (NLP), owing to the advantages they offer regarding handling complex data structures and capturing deep features (Bensalah et al., 2021; Hu et al., 2023).
Relying solely on one deep learning model to address the financial risk prediction problem for financial enterprises presents numerous challenges. Financial data is characterized by its time series nature, non-stationarity, and high noise levels—making model robustness and accuracy critical issues (Han et al., 2024; H. Li et al., 2020). It is imperative, therefore, to develop a hybrid model that leverages multiple deep learning techniques to fully exploit their individual strengths; this has become an important research topic in financial risk prediction.
A CNN is a deep learning architecture that demonstrates superior performance in processing image data, by extracting and classifying features through convolutional, pooling, and fully connected layers (Singh & Sabrol, 2021). While effective in image tasks, CNNs have limitations in handling time series data (Wang et al., 2021). Transformer models, leveraging self-attention mechanisms—and originally designed for NLP—efficiently handle long-sequence data and capture extended dependencies (Zhu et al., 2021). Transformer models are, however, computationally intensive and sensitive to noise and missing data. Wavelet transform (WT) is a signal processing technique that decomposes signals into different frequency components for multiscale analysis, extracting local features and details. WT is particularly effective in denoising and compressing signals (Guo et al., 2022) but limited in handling nonlinear signals. WT also requires appropriate wavelet bases and decomposition levels (Silik et al., 2021).
Motivated by the limitations in current financial risk prediction models, this study proposes and validates an HDL model that integrates CNN, transformer, and WT to enhance the accuracy and robustness of financial enterprise risk prediction. Current models, such as long short-term memory (LSTM) and GRU, while effective for sequential data, often struggle with capturing long-range dependencies; they also fail to handle multiscale features in complex financial data. CNNs are well-known for their powerful feature extraction capabilities. Transformers excel in capturing long-range dependencies through their self-attention mechanism. WT’s multiscale analysis effectively decomposes and reconstructs signals to extract critical features. By combining and integrating these three techniques, the proposed hybrid deep learning (HDL) model (CNN-Transformer-WT) aims to overcome the limitations of a single model, thereby offering a more accurate and stable solution for financial risk prediction.
Section Outline
This article begins with reviewing related literature, including conventional risk prediction methods, the use of deep learning techniques in financial risk prediction, and the latest research developments. This is followed by detailing the design and implementation of the suggested CNN-Transformer-WT hybrid model, explaining the functionality, structure, and interactions of its components. Finally, the researchers present experimental design and results analysis, including an explanation of dataset selection, model training and evaluation. Finally, the article presents a summary of research findings—discussed are the model’s potential applications, and suggestions for future research directions.