dc.description.abstract | Path-breaking advances in technology, including high-resolution imaging devices and inexpensive storage, have made it easy to acquire, store, and share high-quality digital images. But the constant growth of image manipulation undermines the success of digital imaging. Today, a wide range of software allows even relatively inexperienced users to edit digital images or create fake images so convincingly, without leaving any trace of tampering, that it is hard for the human eye to distinguish these fake images from the original ones. Hence, we are no longer in a world where seeing is believing. This underscores the significance of assessing the trustworthiness of an image through digital image forensic techniques, which apply scientific methods to investigate the identity of a digital image. This Thesis is a computational study in digital image forensics that aims to distinguish natural images taken by a camera from computer generated images, such as computer graphics and Generative Adversarial Network (GAN) images, and to understand the algorithmic fairness of forensic systems that classify natural and computer generated images. The initial contribution of the Thesis is a digital image forensics algorithm, MC-EffNet, that distinguishes natural images from computer generated images, including both computer graphics and GAN images. The algorithm employs a parallel fusion of three fine-tuned EfficientNet networks that operate in different colorspaces, chosen after studying the efficacy of a variety of colorspace transformations for this specific image forensics problem. The experimental results of this study show that the proposed model obtains high accuracy and outperforms the state-of-the-art baselines.
The study compares the performance of the proposed algorithms with manual classification performance and points out the necessity of computational algorithms for the task of distinguishing natural images from computer generated images. The study also analyzes the behavior of the proposed model by visualizing the image regions responsible for the model's decisions and compares these model explanations with manual explanations provided by human participants. Although the forensic task of distinguishing natural from computer generated images achieves high accuracy thanks to advances in deep neural networks and transformer-based architectures, these forensic models are seen to fail on post-processed images. Post-processing operations such as JPEG compression and Gaussian noise addition are often applied to images to trick the forensic algorithm. Hence, the second contribution of the Thesis proposes an approach for distinguishing natural from computer generated images, including both computer graphics and GAN generated images, that achieves high classification accuracy and is highly robust against post-processing operations. The proposed model uses a fusion of two vision transformers, each operating in a different color space. The experimental results of this study show that the proposed approach achieves higher performance, robustness, and generalizability than the state-of-the-art baselines. The features of the proposed model are also observed to attain higher inter-class separability than the baseline features. Visualizing the attention maps of the networks in the fused model shows that the proposed methodology can capture more image information relevant to the forensic task of classifying natural and generated images.
As with any other machine learning based algorithm, biased forensic algorithms can cause serious societal harm and security concerns. Hence, besides developing forensic algorithms, it is also essential to identify any bias in such forensic systems. Accordingly, the third contribution of this Thesis explores algorithmic fairness in forensic systems, particularly those built using large vision transformers, which are the most commonly employed recent deep learning architectures owing to their high classification performance. The study tries to identify gender, racial, affective, and intersectional biases in forensic systems that classify natural and computer generated images, using a bias evaluation corpus and a vast set of bias evaluation measures. The study uses a two-phase evaluation setting to examine whether image compression, the most common post-processing operation, influences the model biases, and observes that it does. | en_US |