A deep learning framework for aviation risk classification and high-order coupled risk modeling

Bibliographic Details
Main Authors: Li, Xirui, Romli, Fairuz Izzuddin, Ali, Syaril Azrad Md, Md Zhahir, Amzari, Tang, Junqi
Format: Article
Language: en
Published: Elsevier 2026
Online Access: http://psasir.upm.edu.my/id/eprint/123078/1/123078.pdf
http://psasir.upm.edu.my/id/eprint/123078/
https://www.sciencedirect.com/science/article/pii/S0951832026000931
Description
Summary: Narrative incident reports collected by the Aviation Safety Reporting System (ASRS) offer a useful empirical foundation for aviation risk analysis, but their long-form format, class imbalance, and domain-specific semantics make automated modelling challenging. To address these challenges, this study develops a domain-adapted deep learning model built upon the Robustly Optimized Bidirectional Encoder Representations from Transformers pretraining approach (RoBERTa) for multi-label identification of contributing factors in aviation safety reports. The proposed model improves multi-label classification performance by integrating four modules: instruction-based large language model (LLM) data augmentation to reduce imbalance, a merging module to jointly model the narrative text and metadata, a composite loss to strengthen robustness under label imbalance, and domain-adaptive pretraining on domain corpora. The experimental results indicate that the model achieves reliable improvements, while ablation experiments further clarify the impact of each module. Based on the predicted contributing factors, an N-K model is constructed to quantify interaction strength, and a Bayesian network is used to model directed risk propagation. By accounting for both structural coupling and propagation probability, the framework identifies and ranks risk pathways that correspond to plausible accident developments. A case study demonstrates that the proposed approach can extract high-order, multi-domain propagation paths from narrative data, enabling structured interpretation of plausible accident evolution patterns. Taken together, the proposed framework provides a pipeline that converts incident narratives into actionable safety information, offering a scalable and structured basis for proactive aviation risk analysis.
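
This record does not give the paper's exact N-K formulation. As a rough illustration of how interaction strength between contributing factors can be quantified, the sketch below uses the information-theoretic coupling measure commonly seen in N-K risk-coupling analyses: the total correlation, in bits, between the presence or absence of each factor. The function names, the factor labels, and the toy records are all hypothetical and are not taken from the article.

```python
from collections import Counter
from itertools import combinations
from math import log2


def coupling_strength(records, factors):
    """Total correlation (bits) among the binary presence of `factors`.

    `records` is a list of sets, each holding the factor labels assigned to
    one report. Returns 0.0 when the factors occur independently.
    """
    n = len(records)
    # Joint distribution over presence/absence patterns, e.g. (1, 0, 1).
    joint = Counter(tuple(int(f in r) for f in factors) for r in records)
    # Marginal probability that each factor is present.
    marginals = [sum(1 for r in records if f in r) / n for f in factors]

    total = 0.0
    for state, count in joint.items():
        p_joint = count / n
        p_indep = 1.0
        for bit, p_present in zip(state, marginals):
            p_indep *= p_present if bit else (1.0 - p_present)
        total += p_joint * log2(p_joint / p_indep)
    return total


def rank_couplings(records, factors, max_order=3):
    """Score every factor combination of size 2..max_order, strongest first."""
    scored = []
    for k in range(2, max_order + 1):
        for combo in combinations(factors, k):
            scored.append((coupling_strength(records, combo), combo))
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    # Hypothetical predicted labels for six reports (illustrative only).
    toy_records = [
        {"human", "weather"},
        {"human", "weather"},
        {"human"},
        {"aircraft"},
        {"aircraft", "weather"},
        set(),
    ]
    for score, combo in rank_couplings(toy_records, ["human", "aircraft", "weather"]):
        print(f"{' + '.join(combo)}: {score:.3f} bits")
```

In a pipeline of the kind the summary describes, scores like these would be one input to path ranking, alongside propagation probabilities from the Bayesian network; how the article combines the two is not stated in this record.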