Automated Traffic Time Series Prediction
Sun, Bin
Blekinge Institute of Technology, Faculty of Computing, Department of Creative Technologies. ORCID iD: 0000-0001-5824-425X
2018 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Intelligent transportation systems (ITS) are becoming increasingly effective. Robust and accurate short-term traffic prediction plays a key role in modern ITS and demands continuous improvement. Thanks to better data collection and storage strategies, a huge amount of traffic data is archived and can be used for this purpose, especially with machine learning.

In the data preprocessing stage, despite the amount of data available, missing data records and messy labels are two problems that prevent many prediction algorithms in ITS from working effectively and smoothly. In the prediction stage, although many prediction algorithms exist, higher accuracy and more automated procedures are needed.

Across both preprocessing and prediction studies, one widely used algorithm is k-nearest neighbours (kNN), which has shown high accuracy and efficiency. However, general kNN is designed for matrix data rather than time series, so it does not exploit time series characteristics. Choosing the right parameter values for kNN is also problematic because traffic characteristics are dynamic. This thesis analyses kNN-based algorithms and improves prediction accuracy through better parameter handling that uses time series characteristics.

Specifically, for the data preprocessing stage, this work introduces gap-sensitive windowed kNN (GSW-kNN) imputation. In addition, a Mahalanobis distance-based algorithm is improved to support correcting and complementing label information. Subsequently, several automated and dynamic procedures are proposed, and different strategies for making use of data and parameters are compared.

Two real-world datasets are used for the experiments across the papers. The results show that GSW-kNN imputation is on average 34% more accurate than benchmarking methods and remains robust even when the missing ratio increases to 90%. The Mahalanobis distance-based models efficiently correct and complement label information, which is then used to compare the performance of algorithms fairly. The proposed dynamic procedure (DP) performs better on average, in terms of accuracy, than manually adjusted kNN and other benchmarking methods. Moreover, weighted parameter tuples (WPT) give more accurate results than any human-tuned parameters, a result that cannot be achieved manually in practice. The experiments indicate that the relations among parameters are compound and that the flow-aware strategy performs better than the time-aware one. It is therefore suggested to consider all parameter strategies simultaneously as ensemble strategies, especially by including the window in flow-aware strategies.

In summary, this thesis improves the accuracy and automation level of short-term traffic prediction with the proposed high-speed algorithms.

Place, publisher, year, edition, pages
Karlskrona: Blekinge Tekniska Högskola, 2018.
Series
Blekinge Institute of Technology Doctoral Dissertation Series, ISSN 1653-2090 ; 10
Keywords [en]
Machine Learning, Time Series, Traffic Engineering
National Category
Computer Sciences Transport Systems and Logistics
Identifiers
URN: urn:nbn:se:bth-17210
ISBN: 978-91-7295-360-4 (print)
OAI: oai:DiVA.org:bth-17210
DiVA, id: diva2:1260232
Public defence
2018-11-30, J1650, Valhallav. 1, Karlskrona, 13:30 (English)
Available from: 2018-11-02 Created: 2018-11-01 Last updated: 2018-12-14 Bibliographically approved
List of papers
1. An Improved k-Nearest Neighbours Method for Traffic Time Series Imputation
2017 (English). Conference paper, Published paper (Refereed)
Abstract [en]

Intelligent transportation systems (ITS) are becoming increasingly effective, benefiting from big data. Despite this, missing data is a problem that prevents many prediction algorithms in ITS from working effectively. Much work has been done to impute missing data. Among the different imputation methods, k-nearest neighbours (kNN) has shown excellent accuracy and efficiency. However, general kNN is designed for matrix data rather than time series, so it does not use time series characteristics such as windows and weights that are gap-sensitive. This work introduces gap-sensitive windowed kNN (GSW-kNN) imputation for time series. The results show that GSW-kNN is 34% more accurate than benchmarking methods and remains robust even when the missing ratio increases to 90%.
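The abstract names gap-sensitive windows and weights but gives no formulas. As a rough illustration only, the Python sketch below assumes one possible reading: each missing point is compared with every observed point through the values surrounding them, positions closer to the gap count more, and the gap is filled with the mean of the k closest matches. The function names, the 1/|offset| weight and the parameters `k` and `half_window` are assumptions made for this example, not the paper's definitions.

```python
import math


def gap_weight(offset):
    """Assumed weight: positions closer to the gap count more (1/|offset|)."""
    return 1.0 / abs(offset)


def window_distance(series, i, j, half_window):
    """Weighted RMS distance between the windows centred on i and j.

    Offsets that are out of range or missing (None) on either side are skipped.
    Returns None when the two windows share no observed offsets.
    """
    total, weight_sum = 0.0, 0.0
    for off in range(-half_window, half_window + 1):
        if off == 0:
            continue  # the centre of window i is the missing value itself
        a, b = i + off, j + off
        if a < 0 or b < 0 or a >= len(series) or b >= len(series):
            continue
        if series[a] is None or series[b] is None:
            continue
        w = gap_weight(off)
        total += w * (series[a] - series[b]) ** 2
        weight_sum += w
    if weight_sum == 0.0:
        return None
    return math.sqrt(total / weight_sum)


def impute(series, k=2, half_window=2):
    """Return a copy of series with each None replaced by the mean centre
    value of the k observed points whose windows are most similar."""
    filled = list(series)
    for i, value in enumerate(series):
        if value is not None:
            continue
        candidates = []
        for j, centre in enumerate(series):
            if centre is None:
                continue
            dist = window_distance(series, i, j, half_window)
            if dist is not None:
                candidates.append((dist, centre))
        candidates.sort(key=lambda c: c[0])
        nearest = candidates[:k]
        if nearest:  # leave the gap untouched if nothing comparable exists
            filled[i] = sum(c for _, c in nearest) / len(nearest)
    return filled
```

On a short periodic series such as `[1, 2, 3, 1, 2, 3, 1, None, 3, 1, 2, 3]`, the windows around the gap match earlier cycles exactly, so the gap is filled from those cycles. Imputed values are not reused as evidence for other gaps, which keeps the sketch simple.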

Place, publisher, year, edition, pages
IEEE, 2017
Keywords
Traffic Time Series, Gap-Sensitive Windowed k-Nearest Neighbours (GSW-kNN), Missing Data Imputation
National Category
Transport Systems and Logistics Computer Sciences
Identifiers
urn:nbn:se:bth-15209 (URN); 000427816107081 (); 978-1-5386-3524-7 (ISBN)
Conference
Chinese Automation Congress (CAC), Jinan
Available from: 2017-09-28 Created: 2017-09-28 Last updated: 2023-12-28 Bibliographically approved
2. Correcting and complementing freeway traffic accident data using mahalanobis distance based outlier detection
2017 (English). In: Technical Gazette, ISSN 1330-3651, E-ISSN 1848-6339, Vol. 24, no 5, p. 1597-1607. Article in journal (Refereed) Published
Abstract [en]

A huge amount of traffic data is archived and can be used in data mining, especially supervised learning. However, it is not being fully used because accurate accident information (labels) is lacking. In this study, we improve a Mahalanobis distance-based algorithm so that it can handle differential data to estimate flow fluctuations and detect accidents, and we use it to support correcting and complementing accident information. The outlier detection algorithm provides accurate suggestions for accident occurrence time, duration and direction. We also develop a system with an interactive user interface to realise this procedure. There are three contributions to data handling. Firstly, we propose using multi-metric traffic data instead of a single metric for traffic outlier detection. Secondly, we present a practical method to organise traffic data and to evaluate the organisation for Mahalanobis distance. Thirdly, we describe a general method to modify Mahalanobis distance algorithms to be updatable. © 2017, Strojarski Facultet. All rights reserved.
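To make the idea of Mahalanobis distance on differential, multi-metric data concrete, here is a minimal Python sketch. It assumes two metrics per reading (flow and speed), takes first differences between consecutive readings, and flags readings whose change lies far from the typical change. The function names, the choice of two metrics and the threshold value are illustrative assumptions; the paper's updatable algorithm and its data organisation are not reproduced here.

```python
import math


def difference(rows):
    """First differences of consecutive (flow, speed) readings."""
    return [(b[0] - a[0], b[1] - a[1]) for a, b in zip(rows, rows[1:])]


def mahalanobis_2d(points):
    """Mahalanobis distance of each 2-D point from the sample mean,
    using the sample covariance matrix of the same points."""
    n = len(points)
    if n < 3:
        raise ValueError("need at least three points")
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy * sxy
    if det <= 0.0:
        raise ValueError("covariance matrix is singular")
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # Closed-form inverse of the 2x2 covariance matrix.
        q = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
        distances.append(math.sqrt(q))
    return distances


def flag_outliers(rows, threshold=2.5):
    """Indices of readings whose change from the previous reading is an
    outlier. The threshold is an arbitrary illustrative value."""
    distances = mahalanobis_2d(difference(rows))
    # Difference i describes the step from row i to row i + 1.
    return [i + 1 for i, dist in enumerate(distances) if dist > threshold]
```

Because the covariance is estimated from the same sample, the largest attainable distance is bounded by (n - 1) / sqrt(n) for n differences, so a fixed threshold only makes sense once enough readings are available. With a sudden drop followed by a recovery, both the drop and the recovery are flagged, which gives a rough hint of start time and duration.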

Place, publisher, year, edition, pages
Strojarski Facultet, 2017
Keywords
Accident data, Data labelling, Differential distance, Mahalanobis distance, Outlier detection, Traffic data, Updatable algorithm, Accidents, Data mining, Statistics, User interfaces, Mahalanobis distances, Data handling
National Category
Communication Systems Computer and Information Sciences
Identifiers
urn:nbn:se:bth-15472 (URN); 10.17559/TV-20150616163905 (DOI); 000417100300037 (); 2-s2.0-85032512786 (Scopus ID)
Note

Funded by National Natural Science Foundation of China

Funding nr. 61364019

Available from: 2017-11-10 Created: 2017-11-10 Last updated: 2023-12-28 Bibliographically approved
3. Anomaly-Aware Traffic Prediction Based on Automated Conditional Information Fusion
2018 (English). In: Proceedings of 21st International Conference on Information Fusion, IEEE conference proceedings, 2018. Conference paper, Published paper (Refereed)
Abstract [en]

Reliable and accurate short-term traffic prediction plays a key role in modern intelligent transportation systems (ITS) for achieving efficient traffic management and accident detection. Previous work has investigated this topic but lacks studies of automated anomaly detection and conditional information fusion for ensemble methods. This work aims to improve prediction accuracy by fusing information according to different traffic conditions in ensemble methods. In addition to conditional information fusion, a day-week decomposition (DWD) method is introduced for preprocessing before anomaly detection. A k-nearest neighbours (kNN) based ensemble method is used as an example. Real-world data are used to test the proposed method with stratified ten-fold cross-validation. The results show that the proposed method with incident labels improves predictions by up to 15.3% and that DWD-enhanced anomaly detection improves predictions by up to 8.96%. Conditional information fusion improves ensemble prediction methods, especially for incident traffic. The proposed method works well with enhanced detections, and the procedure is fully automated. The accurate predictions lead to more robust traffic control and routing systems.
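The abstract does not say how the fusion weights are obtained. As one hedged illustration of fusing ensemble members differently per traffic condition, the Python sketch below assumes that each member has a known past mean absolute error under each condition and weights members by the inverse of that error. The condition names, the error table and the inverse-error rule are assumptions for this example, not the paper's method.

```python
def inverse_error_weights(errors):
    """Turn per-member errors into weights that sum to 1.

    Members with a smaller past error receive a larger weight.
    """
    if any(e <= 0 for e in errors):
        raise ValueError("errors must be positive")
    inverse = [1.0 / e for e in errors]
    total = sum(inverse)
    return [v / total for v in inverse]


def fuse(predictions, condition, errors_by_condition):
    """Fuse the members' predictions with weights chosen for the detected
    traffic condition (for example "normal" or "incident")."""
    errors = errors_by_condition[condition]
    if len(errors) != len(predictions):
        raise ValueError("one error value is needed per ensemble member")
    weights = inverse_error_weights(errors)
    return sum(w * p for w, p in zip(weights, predictions))


# Hypothetical past errors of two ensemble members under each condition.
ERRORS = {"normal": [1.0, 1.0], "incident": [1.0, 3.0]}
```

With the hypothetical table above, the two members are averaged equally in normal traffic, while the first member dominates during incidents because its past error under that condition is smaller.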

Place, publisher, year, edition, pages
IEEE conference proceedings, 2018
Keywords
Information Fusion, Intelligent Transportation Systems (ITS), k-Nearest Neighbours (kNN), Short-Term Traffic Prediction, Time Series, Decomposition
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:bth-16942 (URN); 10.23919/ICIF.2018.8455244 (DOI); 000495071900314 (); 9780996452762 (ISBN)
Conference
21st International Conference on Information Fusion, Cambridge, UK
Available from: 2018-08-27 Created: 2018-08-27 Last updated: 2023-12-28 Bibliographically approved
4. Short-Term Traffic Forecasting Using Self-Adjusting k-Nearest Neighbours
2018 (English). In: IET Intelligent Transport Systems, ISSN 1751-956X, E-ISSN 1751-9578, Vol. 12, no 1, p. 41-48. Article in journal (Refereed) Published
Abstract [en]

Short-term traffic forecasting is becoming more important in intelligent transportation systems. The k-nearest neighbours (kNN) method is widely used for short-term traffic forecasting. However, self-adjustment of kNN parameters has been a problem because of dynamic traffic characteristics. This paper proposes a fully automatic dynamic procedure kNN (DP-kNN) that makes the kNN parameters self-adjustable and robust without predefined models or training. We used real-world data with more than one year of traffic records to conduct experiments. The results show that DP-kNN can perform better on average, with regard to accuracy, than manually adjusted kNN and other benchmarking methods. This study also discusses the difference between holiday and workday traffic prediction as well as the use of neighbour distance measurement.

Place, publisher, year, edition, pages
Institution of Engineering and Technology, 2018
Keywords
intelligent transportation systems; short-term traffic forecasting; road traffic; DP-kNN; dynamic procedure kNN; self-adjusting k-nearest neighbours
National Category
Computer Sciences Transport Systems and Logistics
Identifiers
urn:nbn:se:bth-15727 (URN); 10.1049/iet-its.2016.0263 (DOI); 000426045200006 ()
Available from: 2018-01-09 Created: 2018-01-09 Last updated: 2023-12-28 Bibliographically approved
5. Flow-Aware WPT k-Nearest Neighbours Regression for Short-Term Traffic Prediction
2017 (English). In: Proceedings - IEEE Symposium on Computers and Communications, Institute of Electrical and Electronics Engineers (IEEE), 2017, Vol. 07, p. 48-53, article id 8024503. Conference paper, Published paper (Refereed)
Abstract [en]

Robust and accurate traffic prediction is critical in modern intelligent transportation systems (ITS). One widely used method for short-term traffic prediction is k-nearest neighbours (kNN). However, choosing the right parameter values for kNN is problematic. Although many studies have investigated this problem, they did not consider all parameters of kNN at the same time. This paper aims to improve kNN prediction accuracy by tuning all parameters simultaneously with respect to dynamic traffic characteristics. We propose weighted parameter tuples (WPT) to calculate a weighted average dynamically according to flow rate. Comprehensive experiments are conducted on one year of real-world data. The results show that flow-aware WPT kNN performs better than manually tuned kNN as well as benchmark methods such as extreme gradient boosting (XGB) and seasonal autoregressive integrated moving average (SARIMA). Thus, it is recommended to use dynamic parameters that depend on traffic flow and to consider all parameters at the same time.
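A weighted average over parameter tuples that changes with flow rate could look roughly like the Python sketch below. It assumes that each parameter tuple has already produced its own kNN prediction, that flow is split into levels by fixed boundaries, and that a weight table holds one row of tuple weights per flow level. The boundaries, the table values and the function names are hypothetical; the abstract does not state how the paper derives its weights.

```python
from bisect import bisect_right


def flow_level(flow, boundaries):
    """Index of the flow level that `flow` falls into.

    `boundaries` must be sorted ascending; n boundaries define n + 1 levels.
    """
    return bisect_right(boundaries, flow)


def wpt_predict(predictions, flow, boundaries, weight_table):
    """Weighted average of per-tuple predictions, with the weights picked
    from the row of `weight_table` that matches the current flow level."""
    weights = weight_table[flow_level(flow, boundaries)]
    if len(weights) != len(predictions):
        raise ValueError("one weight is needed per parameter tuple")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * p for w, p in zip(weights, predictions)) / total


# Hypothetical setup: three parameter tuples and three flow levels
# (low, medium, high) separated at 500 and 1500 vehicles per hour.
BOUNDARIES = [500, 1500]
WEIGHT_TABLE = [
    [3, 1, 0],  # low flow: favour the first tuple
    [1, 2, 1],  # medium flow: favour the second tuple
    [0, 1, 3],  # high flow: favour the third tuple
]
```

Keeping the weights in a table indexed by flow level means that every tuple can contribute at once and that the mix shifts as traffic moves between levels, instead of a single hand-picked tuple being used throughout.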

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2017
Series
IEEE Symposium on Computers and Communications ISCC, ISSN 1530-1346
Keywords
Flow-Aware, Weighted Parameter Tuples, k-Nearest Neighbours Regression, Short-Term Traffic Prediction
National Category
Computer Sciences Transport Systems and Logistics
Identifiers
urn:nbn:se:bth-15208 (URN); 10.1109/ISCC.2017.8024503 (DOI); 000426895800009 (); 9781538616291 (ISBN)
Conference
2017 IEEE Symposium on Computers and Communication (ISCC), Heraklion, Crete
Available from: 2017-09-28 Created: 2017-09-28 Last updated: 2023-12-28 Bibliographically approved
6. An Overview of Parameter and Data Strategies for K-Nearest Neighbours Based Short-Term Traffic Prediction
2017 (English). In: ACM International Conference Proceeding Series Volume Part F133326, Association for Computing Machinery (ACM), 2017, p. 68-74. Conference paper, Published paper (Refereed)
Abstract [en]

Modern intelligent transportation systems (ITS) require reliable and accurate short-term traffic prediction. One widely used method to predict traffic is k-nearest neighbours (kNN). Though many studies have tried to improve kNN with parameter strategies and data strategies, there is no comprehensive analysis of those strategies. This paper aims to analyse kNN strategies and guide future work in selecting the right strategy to improve prediction accuracy. Firstly, we examine the relations among three kNN parameters: number of nearest neighbours (k), search step length (d) and window size (v). We also analyse the number of prediction steps ahead (m), which is not a parameter but a user requirement and configuration. The analyses indicate that the relations among parameters are compound, especially when traffic flow states are considered. The results show that the strategy of using v leads to outstanding accuracy improvement. Later, we compare different data strategies, such as flow-aware and time-aware ones, together with ensemble strategies. The experiments show that the flow-aware strategy performs better than the time-aware one. Thus, we suggest considering all parameter strategies simultaneously as ensemble strategies, especially by including v in flow-aware strategies.
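The roles of k, d, v and m can be illustrated with a minimal kNN forecaster in Python. The sketch assumes that the latest v values form the query window, that candidate windows are taken from the history every d positions, and that the forecast is the plain mean of the values found m steps after the k most similar windows. Reading d as the stride between candidate windows is an assumption, as are the Euclidean distance and the unweighted mean; the paper may define them differently.

```python
import math


def knn_forecast(history, k=3, d=1, v=3, m=1):
    """Predict the value m steps after the end of `history`.

    k: number of nearest neighbours to average
    d: stride between candidate window positions (assumed meaning of
       "search step length")
    v: window size, in samples
    m: number of steps ahead to predict
    """
    if min(k, d, v, m) < 1:
        raise ValueError("k, d, v and m must all be at least 1")
    if len(history) < v + m:
        raise ValueError("history is too short for this v and m")
    query = history[-v:]
    candidates = []
    # A candidate window covers history[end - v:end]; the value that
    # followed it m steps later is history[end + m - 1].
    last_end = len(history) - m
    for end in range(v, last_end + 1, d):
        window = history[end - v:end]
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(window, query)))
        candidates.append((dist, history[end + m - 1]))
    candidates.sort(key=lambda c: c[0])
    nearest = candidates[:k]
    return sum(target for _, target in nearest) / len(nearest)
```

In this reading, a larger v makes matches more specific, a larger d makes the search cheaper but coarser, and m only shifts which later value is read from each neighbour, which fits the abstract's description of m as a user requirement instead of a tuning parameter.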

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2017
Keywords
Short-Term Traffic Prediction, k-Nearest Neighbours Regression, Parameter and Data Strategies
National Category
Computer Sciences Transport Systems and Logistics
Identifiers
urn:nbn:se:bth-15210 (URN); 10.1145/3157737.3157749 (DOI); 9781450353762 (ISBN)
Conference
International Conference on Intelligent Traffic and Transportation (ICITT), Zurich
Available from: 2017-09-28 Created: 2017-09-28 Last updated: 2023-12-28 Bibliographically approved

Open Access in DiVA

fulltext (22976 kB), 14061 downloads
File information
File name: FULLTEXT03.pdf; File size: 22976 kB; Checksum SHA-512:
81ce0ab0055d0bb014818c1bb745cb7f9373d02ad8362731c3e781b087ba27353ff508e2e526145e076134efcf29bb10130c8ea329a4babb0b9b2e1eec1d9b9a
Type: fulltext; Mimetype: application/pdf

Search in DiVA

By author/editor
Sun, Bin
