This study presents a statistical analysis of train delays in the southern Swedish railway network, aiming both to understand operational variability and to generate realistic synthetic data for AI-based rescheduling. Train delays in Sweden have traditionally been assumed to be uniformly distributed across the network, an oversimplification frequently used when generating synthetic datasets for AI-based timetable rescheduling systems. We challenge that assumption by identifying the best-fitting probability distribution for train arrival delays at ten stations, separately for each travel direction. Using the Kolmogorov-Smirnov (K-S) test, we evaluate the goodness of fit of three common distributions: gamma, log-normal, and inverse Gaussian. Our findings reveal significant variability across stations. The log-normal distribution provides the best fit in 70% of cases, but some stations exhibit direction-specific deviations, which emphasizes the need for localized analysis. Based on the best-fitting distribution identified for each station and direction, we generate synthetic data by sampling directly from that distribution with parameters set to their maximum likelihood estimates. The resulting datasets preserve the statistical characteristics of real train delays, making them better suited for training and evaluating AI-based rescheduling algorithms. These insights support data-driven approaches to predictive modeling, operational efficiency improvements, and increased reliability in railway networks.
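The fit-test-sample pipeline described above can be illustrated with a minimal Python sketch based on SciPy. It is not the code used in the study. The function names `fit_best_distribution` and `sample_synthetic`, the choice to fix the location parameter at zero (which assumes strictly positive delays), and the rule of selecting the candidate with the smallest K-S statistic are all assumptions made for illustration. The demonstration data are simulated, not real delay records.

```python
import numpy as np
from scipy import stats

# Candidate distributions named in the abstract, mapped to SciPy objects.
CANDIDATES = {
    "gamma": stats.gamma,
    "log-normal": stats.lognorm,
    "inverse Gaussian": stats.invgauss,
}


def fit_best_distribution(delays):
    """Fit each candidate by maximum likelihood and rank by K-S statistic.

    Assumption: delays are strictly positive, so the location parameter
    is fixed at zero (floc=0) for every candidate.
    """
    delays = np.asarray(delays, dtype=float)
    results = {}
    for name, dist in CANDIDATES.items():
        params = dist.fit(delays, floc=0)  # MLE with loc fixed at 0
        ks = stats.kstest(delays, dist.cdf, args=params)
        results[name] = (params, ks.statistic, ks.pvalue)
    # Assumed selection rule: smallest K-S statistic wins.
    best = min(results, key=lambda n: results[n][1])
    return best, results


def sample_synthetic(name, params, size, seed=None):
    """Draw synthetic delays directly from the fitted distribution."""
    rng = np.random.default_rng(seed)
    return CANDIDATES[name].rvs(*params, size=size, random_state=rng)


if __name__ == "__main__":
    # Simulated stand-in for one station and one direction (not real data).
    rng = np.random.default_rng(0)
    observed = rng.lognormal(mean=1.0, sigma=0.8, size=500)

    best, results = fit_best_distribution(observed)
    for name, (_, stat, pvalue) in results.items():
        print(f"{name}: D={stat:.4f}, p={pvalue:.4f}")

    synthetic = sample_synthetic(best, results[best][0], size=1000, seed=1)
    print("selected:", best, "| synthetic mean:", synthetic.mean())
```

In a station- and direction-specific setting, the same two functions would be applied to each subset of the data in turn. Because the parameters are estimated from the same sample that is tested, the K-S p-values printed here tend to be optimistic and are best read as a relative ranking of the candidates rather than as exact significance levels.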