Images usually contain regions that particularly attract the viewer's attention. These regions are typically referred to as regions of interest (ROI), and the underlying phenomenon in the human visual system is known as visual attention (VA). In the context of image quality, one may expect distortions occurring in the ROI to be perceived as more annoying than distortions in the background. However, VA is seldom taken into account in existing image quality metrics. In this paper, we therefore provide a VA framework that extends existing image quality metrics with a simple VA model. The performance of the framework is evaluated on three contemporary image quality metrics. We further consider the context of wireless imaging, where a broad range of artifacts can be observed. To facilitate the VA-based metric design, we conducted subjective experiments both to obtain a ground truth for the subjective quality of a set of test images and to identify ROI in the corresponding reference images. We further discuss a methodology to optimize the VA metrics with respect to quality prediction accuracy and generalization ability. It is shown that the quality prediction performance of the three considered metrics can be significantly improved by deploying the proposed framework.