Toward Automatic Data-Driven Traffic Time Series Prediction
Blekinge Institute of Technology, Faculty of Computing, Department of Creative Technologies. ORCID iD: 0000-0001-5824-425X
2017 (English). In: 5th Swedish Workshop on Data Science, 2017, Vol. 12, article id 12. Conference paper, Poster (with or without abstract) (Refereed)
Abstract [en]

Short-term traffic prediction on freeways has been an active research subject for several decades. Various algorithms have been proposed, covering a broad range of concerns such as performance, data requirements and efficiency. However, machine-learning-based algorithms are still rarely deployed in traffic management centres. There are two main reasons for this: the data are often messy or missing, and parameter tuning requires experienced engineers.

The main objective of this thesis was to develop a procedure that can improve the performance and automation level of short-term traffic prediction.

Missing data prevents many prediction algorithms in intelligent transportation systems (ITS) from working effectively, and much work has been done on imputing it. Among imputation methods, k-nearest neighbours (kNN) has shown excellent accuracy and efficiency. However, general kNN is designed for matrices rather than time series, so it does not exploit time series characteristics such as windows and gap-sensitive weights. We introduce gap-sensitive windowed kNN (GSW-kNN) imputation for time series. The results show that GSW-kNN is 34% more accurate than the benchmark methods and remains robust even when the missing ratio rises to 90%.
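The abstract does not spell out how GSW-kNN works, so the following is only a minimal sketch of the general idea of windowed kNN imputation with gap-sensitive weights, not the author's algorithm. The function name `windowed_knn_impute`, the `1 / |offset|` weighting and the inverse-distance averaging are all assumptions made for illustration.

```python
import numpy as np

def windowed_knn_impute(series, window=2, k=2):
    """Fill NaNs using the k most similar windows elsewhere in the series.

    Illustrative sketch only; the weighting scheme is an assumption.
    """
    x = np.asarray(series, dtype=float)
    n = len(x)
    # Positions around the gap that form the comparison window.
    offsets = [o for o in range(-window, window + 1) if o != 0]
    # Assumed gap-sensitive weights: points nearer the gap count more.
    weights = [1.0 / abs(o) for o in offsets]
    filled = x.copy()
    for t in np.flatnonzero(np.isnan(x)):
        candidates = []
        for s in range(n):
            if s == t or np.isnan(x[s]):
                continue
            num = den = 0.0
            for o, w in zip(offsets, weights):
                a, b = t + o, s + o
                in_range = 0 <= a < n and 0 <= b < n
                if in_range and not np.isnan(x[a]) and not np.isnan(x[b]):
                    num += w * (x[a] - x[b]) ** 2
                    den += w
            if den > 0:
                # Weighted RMS difference between the two windows.
                candidates.append((np.sqrt(num / den), x[s]))
        if not candidates:
            continue  # nothing comparable; leave the gap unfilled
        candidates.sort(key=lambda c: c[0])
        nearest = candidates[:k]
        inv = np.array([1.0 / (d + 1e-9) for d, _ in nearest])
        vals = np.array([v for _, v in nearest])
        # Inverse-distance weighted average of the neighbours' values.
        filled[t] = float(np.sum(inv * vals) / np.sum(inv))
    return filled

# Usage: a periodic series with one missing reading at index 6.
readings = [1, 2, 3, 4, 1, 2, float("nan"), 4, 1, 2, 3, 4]
print(windowed_knn_impute(readings, window=2, k=2))
```

Comparing windows rather than whole rows is what lets the method use the temporal context around a gap; the weights decide how quickly that context loses influence with distance from the gap.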

The lack of accurate accident information (labels) is another problem that prevents large amounts of traffic data from being fully used. We extend a Mahalanobis-distance-based algorithm to handle differential data, so that it can estimate flow fluctuations and detect accidents, and we use it to support correcting and complementing accident information. The outlier detection algorithm provides accurate suggestions for the time, duration and direction of an accident. We also develop a system with an interactive user interface to realise this procedure. There are three contributions to data handling. First, we propose using multi-metric traffic data instead of a single metric for traffic outlier detection. Second, we present a practical method to organise traffic data and to evaluate that organisation for the Mahalanobis distance. Third, we describe a general method to make Mahalanobis distance algorithms updatable.
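As a rough illustration of Mahalanobis-distance outlier detection on multi-metric data, the sketch below scores the step-to-step change in each metric. It is not the algorithm from the paper: reading "differential data" as first differences, the function name `mahalanobis_outliers`, the threshold of 3.0 and the synthetic flow and speed values are all assumptions.

```python
import numpy as np

def mahalanobis_outliers(metrics, threshold=3.0):
    """Score each time step by how unusual its change in all metrics is.

    metrics: array of shape (T, M), e.g. columns for flow and speed.
    Returns (distances, flagged_time_steps). Illustrative sketch only.
    """
    data = np.asarray(metrics, dtype=float)
    # Assumed reading of "differential data": first differences over time.
    diffs = np.diff(data, axis=0)                 # shape (T-1, M)
    mean = diffs.mean(axis=0)
    cov = np.atleast_2d(np.cov(diffs, rowvar=False))
    inv_cov = np.linalg.pinv(cov)                 # pinv tolerates singular cov
    centered = diffs - mean
    # Squared Mahalanobis distance of every row in one pass.
    d2 = np.einsum("ij,jk,ik->i", centered, inv_cov, centered)
    dist = np.sqrt(np.maximum(d2, 0.0))
    # diffs[i] describes the change arriving at time step i + 1.
    flagged = np.flatnonzero(dist > threshold) + 1
    return dist, flagged

# Usage with synthetic data: flow and speed both drop during steps 30 to 35.
t = np.arange(60)
traffic = np.column_stack([100 + 2 * np.sin(0.5 * t), 80 + np.cos(0.7 * t)])
traffic[30:36] -= [40.0, 30.0]
distances, flagged = mahalanobis_outliers(traffic)
print(flagged)
```

Using several metrics at once means a joint drop in flow and speed is judged against their usual co-variation, which a single-metric threshold cannot capture.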

For automatic parameter tuning, the experiments show that the flow-aware strategy performs better than the time-aware one. We therefore use all parameter strategies simultaneously as an ensemble, in particular by including the window in the flow-aware strategies.

Based on the above studies, we have developed online-oriented and offline-oriented algorithms for real-time traffic forecasting. The automatically tuned online version performs close to the best manually tuned configuration. The offline version achieves performance that manual tuning cannot reach. It is also 3.05% better than XGB and 11.7% better than traditional SARIMA.

Place, publisher, year, edition, pages
2017. Vol. 12, article id 12
National Category
Computer Sciences; Transport Systems and Logistics
Identifiers
URN: urn:nbn:se:bth-15725
OAI: oai:DiVA.org:bth-15725
DiVA, id: diva2:1172031
Conference
5th Swedish Workshop on Data Science, Gothenburg
Available from: 2018-01-09 Created: 2018-01-09 Last updated: 2018-01-13 Bibliographically approved

Author
Sun, Bin