The power grid is a mesh of thousands of sensors, embedded devices, and terminal units that communicate over different media. The heterogeneity of modern and legacy equipment calls for diverse network security measures. Critical infrastructure employs different security measures to detect and prevent adversaries, e.g., signature-based tools. These approaches lack the potential to identify unknown attacks. Machine learning has the potential to address novel attack vectors. This paper systematically evaluates the efficacy of learning algorithms from different families for intrusion detection in the IEC 60870-5-104 protocol. The One-class SVM and k-Nearest Neighbour unsupervised learning models show little potential when tested on the unseen IEC 104 dataset, with Area Under the Curve scores of 0.64 and 0.59 and Matthews Correlation Coefficient values of 0.3 and 0.2, respectively. The experimental results suggest little feasibility of the evaluated unsupervised learning approaches for anomaly detection in IEC 104 communication and recommend coupling them with other anomaly detection techniques. © 2021 IEEE.
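As a reading aid for the reported scores, the Matthews Correlation Coefficient can be computed from the four confusion-matrix counts. The sketch below is a minimal illustration and not the paper's evaluation code; the function name `mcc` and the convention of returning 0.0 when the denominator vanishes are assumptions.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts.

    Ranges from -1 (total disagreement) through 0 (no better than chance)
    to +1 (perfect prediction).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Assumed convention: an undefined MCC is reported as 0.0.
        return 0.0
    return (tp * tn - fp * fn) / denom
```

Because all four cells of the confusion matrix enter the formula, a detector that labels nearly everything as normal scores close to 0 even when its accuracy looks high on imbalanced traffic.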
Network anomaly detection for critical infrastructure supervisory control and data acquisition (SCADA) systems is the first line of defense against cyber-attacks. Often hybrid methods, such as machine learning with signature-based intrusion detection methods, are employed to improve the detection results. Here an attempt is made to enhance the support vector-based outlier detection method by leveraging behavioural attribute extension of the network nodes. The network nodes are modeled as graph vertices to construct related attributes that enhance network characterisation and potentially improve unsupervised anomaly detection ability for SCADA networks. IEC 104 SCADA protocol communication data with good domain fidelity is utilised for empirical testing. The results demonstrate that the proposed approach achieves significant improvements over the baseline approach (average F1 score increased from 0.6 to 0.9, and Matthews correlation coefficient (MCC) from 0.3 to 0.8). The achieved outcome also surpasses the unsupervised scores of related literature. For critical networks, the identification of attacks is indispensable. The result shows an insignificant missed-alert rate (0.3% on average), the lowest among related works. The gathered results show that the proposed approach can expose rogue SCADA nodes reasonably well and assist in further pruning the identified unusual instances.
Real-time video streaming suffers from bandwidth limitations that make it unable to handle the large amount of video data. To reduce the amount of data to be streamed, we propose an adaptive technique that crops the important part of the video frames, called the Region of Interest (ROI), and drops the parts outside it. The Sum of Absolute Differences (SAD) is computed over consecutive video frames on the server side to identify and extract the ROI. The ROIs are extracted from the frames that lie between reference frames, based on three scenarios. The scenarios have been designed to position the reference frames in the video frame sequence. Linear interpolation is performed from the reference frames to reconstruct the parts outside the ROI on the mobile side. We evaluate the proposed approach for the three scenarios by looking at the size of the compressed videos and measure the quality of the videos using the Mean Opinion Score (MOS). The results show that our technique significantly reduces the amount of data to be streamed over wireless networks while providing acceptable video quality to the mobile viewers.
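The SAD step can be illustrated with a small sketch. It assumes grayscale frames stored as lists of rows of integers, and it takes the ROI to be the bounding box of all blocks whose SAD against the previous frame exceeds a threshold. The block size, the threshold, and both function names are hypothetical and are not taken from the paper.

```python
def block_sad(prev, curr, top, left, size):
    """Sum of absolute differences over one size x size block."""
    return sum(
        abs(curr[r][c] - prev[r][c])
        for r in range(top, min(top + size, len(curr)))
        for c in range(left, min(left + size, len(curr[0])))
    )


def roi_bounding_box(prev, curr, size=2, threshold=10):
    """Return (top, left, bottom, right) covering the changed blocks, or None.

    bottom and right are exclusive, so the ROI is curr[top:bottom][left:right].
    """
    rows, cols = len(curr), len(curr[0])
    changed = [
        (top, left)
        for top in range(0, rows, size)
        for left in range(0, cols, size)
        if block_sad(prev, curr, top, left, size) > threshold
    ]
    if not changed:
        return None  # nothing moved between the two frames
    top = min(t for t, _ in changed)
    left = min(l for _, l in changed)
    bottom = min(max(t for t, _ in changed) + size, rows)
    right = min(max(l for _, l in changed) + size, cols)
    return (top, left, bottom, right)
```

Only the pixels inside the returned box would need to be sent for a non-reference frame; everything outside it is what the abstract describes as being reconstructed by interpolation on the mobile side.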
Streaming video over a wireless network faces several challenges such as high packet error rates, bandwidth variations, and delays, which could have negative effects on the video streaming and the viewer will perceive a frozen picture for certain durations due to loss of frames. In this study, we propose a Time Interleaving Robust Streaming (TIRS) technique to significantly reduce the frozen video problem and provide a satisfactory quality for the mobile viewer. This is done by reordering the streaming video frames as groups of even and odd frames. The objective of streaming the video in this way is to avoid the losses of a sequence of neighbouring frames in case of a long sequence interruption. We evaluate our approach by using a user panel and mean opinion score (MOS) measurements; where the users observe three levels of frame losses. The results show that our technique significantly improves the smoothness of the video on the mobile device in the presence of frame losses, while the transmitted data are only increased by almost 9% (due to reduced time locality).
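The even/odd reordering can be sketched as follows. This is an illustrative reading of the abstract and not the TIRS implementation: the group size of 8 and the helper names `interleave` and `longest_gap` are assumptions.

```python
def interleave(frames, group_size=8):
    """Reorder each group so even-positioned frames are sent before odd ones."""
    out = []
    for start in range(0, len(frames), group_size):
        group = frames[start:start + group_size]
        out.extend(group[0::2])  # even positions within the group
        out.extend(group[1::2])  # odd positions within the group
    return out


def longest_gap(received, total):
    """Longest run of consecutive missing frame indices in 0..total-1."""
    have = set(received)
    longest = run = 0
    for i in range(total):
        run = 0 if i in have else run + 1
        longest = max(longest, run)
    return longest
```

With eight frames, the transmission order becomes 0, 2, 4, 6, 1, 3, 5, 7. A burst that wipes out the second to fourth transmitted frames then removes frames 2, 4, and 6, so the viewer never misses more than one frame in a row, whereas the same burst on an in-order stream removes frames 1, 2, and 3 back to back.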
Real time video streaming suffers from lost, delayed, and corrupted frames due to transmission over error-prone channels. As a result, the user may notice a frozen picture on their screen. In this work, we propose a technique to eliminate the frozen video and provide a satisfactory quality to the mobile viewer by splitting the video frames into sub-frames. Multiple description coding (MDC) is used to generate multiple bitstreams based on frame splitting, which are transmitted over multiple channels. We evaluate our approach by using mean opinion score (MOS) measurements. MOS is used to evaluate our scenarios, where the users observe three levels of frame losses for real time video streaming. The results show that our technique significantly improves the video smoothness on the mobile device in the presence of frame losses during the transmission.
The main challenge of real time video streaming over a wireless network is to provide good quality of service (QoS) to the mobile viewer. However, wireless networks have a limited bandwidth that may not be able to handle the continuous video frame sequence, and video frames could be dropped or corrupted during the transmission. This could severely affect the video quality. In this study we present a mechanism to eliminate the frozen video and provide a satisfactory quality for the mobile viewer. This is done by splitting the video frames into sub-frames and transmitting them over multiple channels. We present a subjective test based on the Mean Opinion Score (MOS). MOS is used to evaluate our scenarios, where the users observe three levels of frame losses for real time video streaming. The results indicate that our technique significantly improves the perceived video quality.
Transmitting a real time video stream over a wireless network cannot guarantee that all the frames will be received by the mobile devices. The characteristics of a wireless network in terms of the available bandwidth, frame delay, and frame losses cannot be known in advance. In this work, we propose a new mechanism for streaming video over a wireless channel. The proposed mechanism prevents freezing frames in the mobile devices. This is done by splitting each video frame into two sub-frames and combining each of them with another sub-frame from a different sequence position in the streaming video. In case of a lost or dropped frame, there is still a possibility that the other half (sub-frame) will be received by the mobile device. The received sub-frames are reconstructed to their original shape. A rate adaptation mechanism is also highlighted in this work. We show that the server can skip up to 50% of the sub-frames and we are still able to reconstruct the received sub-frames and eliminate the freezing picture in the mobile device.
Real-time video transmission over wireless channels has become an important topic in wireless communication because of the limited bandwidth of wireless networks, which must handle a large number of video frames. Video frames must arrive at the client before the playout time, with enough time to display the contents of the frames. Real-time video transmission is particularly sensitive to delay as it has a strict bounded end-to-end delay constraint; video applications impose stringent requirements on communication parameters, and frames lost or dropped due to excessive delay are the primary factors affecting the user-perceived quality. In this study we investigate ways of obtaining a graceful and controlled degradation of the quality by introducing redundancy in the frame sequence and compensating for this by limiting colour coding and resolution. The effect is a double streaming mechanism, which yields less freezing at the expense of limited colours and resolution. Our experiments are applied to scenarios where users observe three types of dropping load for real time video streaming. The mean opinion score is used to evaluate the video quality, and we argue that the proposed technique improves the user-perceived video quality.
Automated static code analysis is an efficient technique to increase the quality of software during early development. This paper presents a case study in which mature software with known vulnerabilities is subjected to a static analysis tool. The value of the tool is estimated based on reported failures from customers. An average of 17% cost savings would have been possible if the static analysis tool had been used. The tool also had a 30% success rate in detecting known vulnerabilities and at the same time found 59 new vulnerabilities in the three examined products.
Software security can be improved by identifying and correcting vulnerabilities. In order to reduce the cost of rework, vulnerabilities should be detected as early and efficiently as possible. Static automated code analysis is an approach for early detection. So far, only a few empirical studies have been conducted in an industrial context to evaluate static automated code analysis. A case study was conducted to evaluate static code analysis in industry, focusing on defect detection capability, deployment, and usage of static automated code analysis with a focus on software security. We identified that the tool was capable of detecting memory related vulnerabilities, but few vulnerabilities of other types. The deployment of the tool played an important role in its success as an early vulnerability detector, as did the developers' perception of the tool's merit. Classifying the warnings from the tool was harder for the developers than correcting them. The correction of false positives in some cases created new vulnerabilities in previously safe code. With regard to defect detection ability, we conclude that static code analysis is able to identify vulnerabilities in different categories. In terms of deployment, we conclude that the tool should be integrated with bug reporting systems, and developers need to share the responsibility for classifying and reporting warnings. With regard to tool usage by developers, we propose to use multiple persons (at least two) in classifying a warning. The same goes for making the decision of how to act based on the warning.
Code reviews with static analysis tools are today recommended by several security development processes. Developers are expected to use the tools' output to detect the security threats they themselves have introduced in the source code. This approach assumes that all developers can correctly identify a warning from a static analysis tool (SAT) as a security threat that needs to be corrected. We have conducted an industry experiment with a state of the art static analysis tool and real vulnerabilities. We have found that average developers do not correctly identify the security warnings and only developers with specific experiences are better than chance in detecting the security vulnerabilities. Specific SAT experience more than doubled the number of correct answers and a combination of security experience and SAT experience almost tripled the number of correct security answers.
Automatic classification of environmental sounds, such as dog barking and glass breaking, is becoming increasingly interesting, especially for mobile devices. Most mobile devices contain both cameras and microphones, and companies that develop mobile devices would like to provide functionality for classifying both videos/images and sounds. In order to reduce the development costs one would like to use the same technology for both of these classification tasks. One way of achieving this is to represent environmental sounds as images, and use an image classification neural network when classifying images as well as sounds. In this paper we consider the classification accuracy for different image representations (Spectrogram, MFCC, and CRP) of environmental sounds. We evaluate the accuracy for environmental sounds in three publicly available datasets, using two well-known convolutional deep neural networks for image recognition (AlexNet and GoogLeNet). Our experiments show that we obtain good classification accuracy for the three datasets. © 2017 The Author(s).
Cluster validation measures are designed to find the partitioning that best fits the underlying data. In this paper, we show that these well-known and scientifically proven validation measures can also be used in a different context, i.e., for filtering mislabeled instances or class outliers prior to training in supervised learning problems. A technique, entitled CVI-based Outlier Filtering, is proposed in which mislabeled instances are identified and eliminated from the training set, and a classification hypothesis is then built from the set of remaining instances. The proposed approach assigns each instance several cluster validation scores representing its potential of being an outlier with respect to the clustering properties the used validation measures assess. We examine CVI-based Outlier Filtering and compare it against the LOF detection method on ten data sets from the UCI data repository using five well-known learning algorithms and three different cluster validation indices. In addition, we study two approaches for filtering mislabeled instances: local and global. Our results show that for most learning algorithms and data sets, the proposed CVI-based outlier filtering algorithm outperforms the baseline method (LOF). The greatest increase in classification accuracy has been achieved by combining at least two of the used cluster validation indices and global filtering of mislabeled instances. © 2018 IEEE.
In this work, we report an ongoing study that aims to apply cluster validation measures for analyzing email communications at an organizational level of a company. This analysis can be used to evaluate the company structure and to produce further recommendations for structural improvements. Our initial evaluations, based on data in the forms of email logs and organizational structure for a large European telecommunication company, show that cluster validation techniques can be useful tools for assessing the organizational structure using objective analysis of internal email communications, and for simulating and studying different reorganization scenarios.
In this work, we apply cluster validation measures for analyzing email communications at an organizational level of a company. This analysis can be used to evaluate the company structure and to produce further recommendations for structural improvements. Our evaluations, based on data in the forms of email logs and organizational structure for a large European telecommunication company, show that cluster validation techniques can be useful tools for assessing the organizational structure using objective analysis of internal email communications, and for simulating and studying different reorganization scenarios.
Many multiprocessor systems are based on distributed shared memory. For performance reasons, it is often important to statically bind threads to processors in order to avoid remote memory access. Finding a good allocation takes a long time and it is hard to know when to stop searching for a better one. It is sometimes impossible to run the application on the target machine. The developer needs a tool that finds good allocations without the target multiprocessor. We present a tool that uses a greedy algorithm and produces allocations that are more than 40% faster (on average) than when using a bin packing algorithm. The number of allocations to be evaluated can be reduced by 38% with a 2% performance loss. Finally, an algorithm is proposed that is promising in avoiding local maxima.
Efficient performance tuning of parallel programs for multiprocessors is often hard. When it comes to assigning threads to processors there is not much support from commercial operating systems, like the Solaris operating system. The only known value is, in best case, the total execution time of each thread. The developer is left to the bin packing algorithm with no knowledge about the interactions and dependencies between the threads. The bin packing algorithm assigns, in the worst case, the threads to the processors such that the program will have the longest possible execution time. A simple example of such a program is shown. We present here a way of retrieving more information and a test mechanism that makes it possible to compare two different assignments of threads on processors also with regard to the interactions and dependencies between the threads. Also an algorithm is proposed that gives the best assignment of threads to processors in the case above where the bin packing algorithm gave the worst possible assignment. The algorithm uses shadow-processors and requires more processors than on the target machine during some allocation steps. Thus, a simulation tool like the one presented here must be used.
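To make the baseline concrete, the sketch below shows one common bin-packing-style heuristic that uses only the total execution time of each thread: longest-processing-time-first assignment to the currently least loaded processor. The abstract does not say which bin packing variant was used, so this choice, the function name `lpt_assign`, and the example thread times are assumptions.

```python
import heapq


def lpt_assign(thread_times, n_procs):
    """Assign threads to processors using only their total execution times.

    Threads are taken longest first and each goes to the least loaded
    processor so far. Dependencies between threads are ignored entirely.
    """
    heap = [(0.0, p) for p in range(n_procs)]  # (current load, processor id)
    heapq.heapify(heap)
    assignment = {p: [] for p in range(n_procs)}
    ordered = sorted(thread_times.items(), key=lambda kv: (-kv[1], kv[0]))
    for tid, time in ordered:
        load, p = heapq.heappop(heap)
        assignment[p].append(tid)
        heapq.heappush(heap, (load + time, p))
    loads = {p: sum(thread_times[t] for t in tids)
             for p, tids in assignment.items()}
    return assignment, loads
```

The resulting loads are balanced, but nothing in the input tells the heuristic whether two threads on different processors must wait for each other, which is the information gap the abstract points out.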
Efficient performance tuning of parallel programs is often hard. Optimization is often done when the program is written as a last effort to increase the performance. With sequential programs each (executed) code segment will affect the total execution time of the program. Thus, any code segment that is optimized in a sequential program will decrease the execution time. In the case of a parallel program executed on a multiprocessor this is not always true. This is due to dependencies between the different threads. As a result, certain code segments of the execution may not affect the total execution time of the program. Thus, optimization of such code segments will not increase the performance. In this paper we present a new approach to perform the optimization phase. Our approach finds the critical path of the multithreaded program and the optimization is only done on those specific code segments of the program. We have implemented the critical path analysis in a performance optimization tool.
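The idea that only some code segments determine the total execution time can be sketched by treating the execution as a dependency graph of segments and taking its longest path. The graph representation, the function name `critical_path`, and the example durations are hypothetical; the tool described in the abstract derives this information from a real multithreaded execution.

```python
from functools import lru_cache


def critical_path(durations, deps):
    """Return (total_time, path) for the longest dependency chain.

    durations: segment name -> execution time
    deps:      segment name -> list of segments that must finish first
    The dependency graph is assumed to be acyclic.
    """
    @lru_cache(maxsize=None)
    def finish(seg):
        preds = deps.get(seg, [])
        if not preds:
            return durations[seg], (seg,)
        best_time, best_path = max(finish(p) for p in preds)
        return best_time + durations[seg], best_path + (seg,)

    return max(finish(s) for s in durations)
```

For segments A (3), B (1), C (2), and D (3), where C waits for A and B and D waits for B, the critical path is A then C with a total time of 5. Reducing B to zero leaves the total at 5, which is the situation the abstract describes: optimizing a segment off the critical path does not improve performance.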
The use of multiprocessors is an important way to increase the performance of a parallel program. This means that the program has to be parallelized to make use of the multiple processors. The parallelization is unfortunately not an easy task. Development tools supporting parallel programs are important. Further, it is the customer that decides the number of processors in the target machine, and as a result the developer has to make sure that the program runs efficiently on any number of processors. Many simulation tools support the developer by simulating any number of processors and predicting the performance based on a uniprocessor execution trace. This popular technique gives reliable results in many cases. Based on our experience from developing such a tool, and studying other (commercial) tools, we have identified three basic simulation models. Due to the flexibility of general purpose programming languages and operating systems, like C/C++ and Sun Solaris, two of the models may cause deadlock in a deadlock-free program. Selecting the appropriate model is difficult, since we show that the three models have significantly different accuracy when using real world programs. Based on the findings we present a practical scheme for when to use the models.
The use of multiprocessors is an important way to increase the performance of a supercomputing program. This means that the program has to be parallelized to make use of the multiple processors. The parallelization is unfortunately not an easy task. Development tools supporting parallel programs are important. Further, it is the customer that decides the number of processors in the target machine, and as a result the developer has to make sure that the program runs efficiently on any number of processors. Many simulation tools support the developer by simulating any number of processors and predicting the performance based on a uniprocessor execution trace. This popular technique gives reliable results in many cases. Based on our experience from developing such a tool, and studying other (commercial) tools, we have identified three basic simulation models. Due to the flexibility of general purpose programming languages and operating systems, like C/C++ and Sun Solaris, two of the models may cause deadlock in a deadlock-free program. Selecting the appropriate model is difficult, since we in this paper also show that the three models have significantly different accuracy when using real world programs. Based on the findings we present a practical scheme for when to use the three models.
Efficient performance tuning of parallel programs is often hard. We present a performance prediction and visualization tool called VPPB. Based on a monitored uni-processor execution, VPPB shows the (predicted) behaviour of a multithreaded program using any number of processors, and the program behaviour is visualized as a graph. The first version of VPPB was unable to handle I/O operations. This version has, by an improved tracing technique, added the possibility to trace activities at the kernel level as well. Thus, VPPB is now able to trace various I/O activities, e.g., manipulation of OS internal buffers, physical disk I/O, socket I/O, and RPC. VPPB allows flexible performance tuning of parallel programs developed for shared memory multiprocessors using a standardized environment: C/C++ programs that use the thread package in Solaris 2.X.
Efficient performance tuning of parallel programs is often hard. In this paper we describe an approach that uses a uni-processor execution of a multithreaded program as reference to simulate a multiprocessor execution. The speed-up is predicted, and the program behaviour is visualized as a graph, which can be used in the performance tuning process. The simulator considers scheduling as well as hardware parameters, e.g., the thread priority, no. of LWPs, and no. of CPUs. The visualization part shows the simulated execution in two graphs: one showing the threads’ behaviour over time and the other the amount of parallelism over time. In the first graph it is possible to relate an event in the graph to the code line causing the event. Validation using a Sun multiprocessor with eight processors and five scientific parallel applications shows that the speed-up predictions are within +/-6% of a real execution.
The cluster systems used today usually prohibit that a running process on one node is reallocated to another node. A parallel program developer thus has to decide how processes should be allocated to the nodes in the cluster. Finding an allocation that results in minimal completion time is NP-hard and (non-optimal) heuristic algorithms have to be used. One major drawback with heuristics is that we do not know if the result is close to optimal or not. In this paper we present a method for finding a guaranteed minimal completion time for a given program. The method can be used as a bound that helps the user to determine when it is worthwhile to continue the heuristic search. Based on some parameters derived from the program, as well as some parameters describing the hardware platform, the method produces the minimal completion time bound. The method includes an aggressive branch-and-bound algorithm that has been shown to reduce the search space to 0.0004%. A practical demonstration of the method is presented using a tool that automatically derives the necessary program parameters and produces the bound without the need for a multiprocessor. This makes the method accessible for practitioners.
Today, Apache Cassandra, a highly scalable and available NoSQL datastore, is widely used by enterprises of every size and for application areas that range from entertainment to big data analytics. Managed Cassandra service providers are emerging to hide the complexity of the installation, fine tuning and operation of Cassandra datacenters. As for all complex services, human assisted management of a multi-tenant Cassandra datacenter is unrealistic. Rather, there is a growing demand for autonomic management solutions. In this paper, we present an optimal energy-aware adaptation model for managed Cassandra datacenters that modifies the system configuration by orchestrating three different actions: horizontal scaling, vertical scaling and energy aware placement. The model is built from a real case based on real application data from Ericsson AB. We compare the performance of the optimal adaptation with two heuristics that avoid system perturbations due to re-configuration actions triggered by subscription of new tenants and/or changes in the SLA. One of the heuristics is local optimisation and the second is a best fit decreasing algorithm, selected as a reference point because it is representative of a wide range of research and practical solutions. The main finding is that the performance of the heuristics depends on the scenario and workload, and no single heuristic dominates in all cases. Besides, in high load scenarios, the suboptimal system configuration obtained with a heuristic adaptation policy introduces a penalty in electric energy consumption in the range [+25%, +50%] compared with the energy consumed by an optimal system configuration.
Platforms for big data include mechanisms and tools to model, organize, store and access big data (e.g. Apache Cassandra, HBase, Amazon SimpleDB, Dynamo, Google BigTable). The resource management for those platforms is a complex task and must also account for multi-tenancy and infrastructure scalability. Human assisted control of big data platforms is unrealistic and there is a growing demand for autonomic solutions. In this paper we propose a QoS and energy-aware adaptation model designed to cope with the real case of a Cassandra-as-a-Service provider.
Analysis of anomalies reported during testing of a project can tell a lot about how well the processes and products work. Still, organizations rarely use anomaly reports for more than progress tracking although projects commonly spend a significant part of the development time on finding and correcting faults. This paper presents an anomaly metrics model that organizations can use for identifying improvements in the development process, i.e. to reduce the cost and lead-time spent on rework-related activities and to improve the quality of the delivered product. The model is the result of a four year research project performed at Ericsson. © 2008 Elsevier Inc. All rights reserved.
Successful software process improvement depends on the ability to analyze past projects and determine which parts of the process could become more efficient. One typical data source is the faults that are reported during product development. From an industrial need, this paper provides a solution based on a measure called faults-slip-through, i.e. the measure tells which faults should have been found in earlier phases. From the measure, the improvement potential of different parts of the development process is estimated by calculating the cost of the faults that slipped through the phase where they should have been found. The usefulness of the method was demonstrated by applying it on two completed development projects at Ericsson AB. The results show that the implementation phase had the largest improvement potential since it caused the largest faults-slip-through cost to later phases, i.e. 81 and 84 percent of the total improvement potential in the two studied projects.
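One way to read the faults-slip-through calculation is sketched below. It assumes that each fault is recorded with the phase where it should have been found and the phase where it was actually found, and that the cost of a slipped fault is the difference between the average fault cost of those two phases. The phase names, the cost figures, and the exact cost formula are assumptions made for illustration and are not taken from the paper.

```python
# Hypothetical phase order, earliest first.
PHASES = ["implementation", "function_test", "system_test"]


def improvement_potential(faults, avg_cost):
    """Sum the extra cost of slipped faults per phase they belong to.

    faults:   iterable of (belongs_to_phase, found_in_phase) pairs
    avg_cost: phase -> average cost of finding and fixing one fault there
    """
    order = {phase: i for i, phase in enumerate(PHASES)}
    potential = {phase: 0.0 for phase in PHASES}
    for belongs_to, found_in in faults:
        if order[found_in] > order[belongs_to]:  # the fault slipped through
            potential[belongs_to] += avg_cost[found_in] - avg_cost[belongs_to]
    return potential
```

Dividing one phase's value by the sum over all phases gives that phase's share of the total improvement potential, which is the kind of percentage the abstract reports for the implementation phase.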
This paper aims to address data labelling issues in process data to support in-situ process monitoring of additive manufactured components. For this, we adopted an active learning (AL) approach to minimise the manual effort for data labelling for classification models. In this study, we present an approach that utilises pre-trained models to extract deep features from images, and clustering and query by committee sampling to select the representative samples to build defect classification models. We conduct quantitative experiments to evaluate the proposed method's performance and compare it with other selected state-of-the-art AL approaches using a dataset of additive manufacturing (AM) and a publicly available dataset. The experimental results show that the proposed approach outperforms AL with committee based sampling, and AL with clustering and random sampling. The results of the statistical significance test show that there is a significant difference between the studied AL approaches. Hence, the proposed AL approach can be considered an alternative method to reduce labelling costs when building defects classification models, whose generalizability is most likely plausible.
Large amounts of data are generated from in-situ monitoring of additive manufacturing (AM) processes, and these data are later used in prediction modelling for defect classification to speed up quality inspection of products. A high volume of this process data is defect-free (majority class) and a lower volume has defects (minority class), which results in the class-imbalance issue. Using imbalanced datasets, classifiers often provide sub-optimal classification results, i.e. better performance on the majority class than the minority class. However, it is important for process engineers that models classify defects more accurately than the class with no defects since this is crucial for quality inspection. Hence, we address the class-imbalance issue in manufacturing process data to support in-situ quality control of additive manufactured components. For this, we propose cluster-based adaptive data augmentation (CADA) for oversampling to address the class-imbalance problem. Quantitative experiments are conducted to evaluate the performance of the proposed method and to compare it with other selected oversampling methods using AM datasets from an aerospace industry and a publicly available casting manufacturing dataset. The results show that CADA outperformed random oversampling and the SMOTE method and is similar to random data augmentation and cluster-based oversampling. Furthermore, the results of the statistical significance test show that there is a significant difference between the studied methods. As such, the CADA method can be considered as an alternative method for oversampling to improve the performance of models on the minority class.
In this paper we briefly present an existing database system with its advantages and disadvantages. The system uses an approach that significantly improves flexibility and maintainability. A performance model has been developed for the system, thus making it possible to quantitatively compare the performance reduction caused by the increased maintainability. Validations using real-world scenarios and data show that the performance model is very accurate. The model and validation show that the performance loss of using the flexible approach is substantial. Based on insights gained from the performance model, we improve performance by using different data-striping techniques. We show the effect of standard RAID 0 striping, and then further improve our results by using range partitioning.
The report summarises the first year of the research project 'Design for Design in Use of Database Applications'. It focuses on end user tailoring and adaptable systems.
Web Services is a new concept that promises flexibility and interconnection between different systems. The communication in Web Services uses SOAP (Simple Object Access Protocol), which is based on XML. Together with an industrial partner, we have made experiments with SOAP in a Web Service environment to find out the response time using SOAP compared to CORBA. It turns out that a direct and naive use of SOAP would result in a response time degradation of a factor 400 compared to CORBA. We identified the major reasons for the poor performance of SOAP and evaluated some performance improvement techniques. After applying these techniques, the performance of CORBA is 7 times better compared to SOAP.
This conceptual paper focuses on revealing challenges and offering concepts associated with the incorporation of the Quality of Experience (QoE) paradigm into the design of mobile video systems. The corresponding design framework combines application, middleware and networking layer in a unique cross-layer approach, in which all layers shall jointly analyse the quality of the video and its delivery in face of volatile conditions. Particular ingredients of the framework are efficient video processing, advanced realtime scheduling, and reduced-reference metrics on application and network layer.
We present two strategies to balance the load in a system with multiple virtual machines (VMs) through automated live migration. When the push strategy is used, overloaded hosts try to migrate workload to less loaded nodes. On the other hand, when the pull strategy is employed, the light-loaded hosts take the initiative to offload overloaded nodes. The performance of the proposed strategies was evaluated through simulations. We have discovered that the strategies complement each other, in the sense that each strategy comes out as “best” under different types of workload. For example, the pull strategy is able to quickly re-distribute the load of the system when the load is in the range low-to-medium, while the push strategy is faster when the load is medium-to-high. Our evaluation shows that when adding or removing a large number of virtual machines in the system, the “best” strategy can re-balance the system in 4–15 minutes.
In C++, the memory allocator is often a bottleneck that severely limits performance and scalability on multiprocessor systems. The traditional solution is to optimize the C library memory allocation routines. An alternative is to attack the problem on the source code level, i.e. modify the application's source code. Such an approach makes it possible to achieve more efficient and customized memory management. To implement and maintain such source code optimizations is however both laborious and costly, since it is a manual procedure. Applications developed using object-oriented techniques, such as frameworks and design patterns, tend to use a great deal of dynamic memory to offer dynamic features. These features are mainly used for maintainability reasons, and temporal locality often characterizes the run-time behavior of the dynamic memory operations. We have implemented a pre-processor based method, named Amplify, which in a completely automated procedure optimizes (object-oriented) C++ applications to exploit the temporal locality in dynamic memory usage. Test results show that Amplify can obtain significant speed-up for synthetic applications and that it was useful for a commercial product.