Mapping Air Pollution Sources with Sequential Transformer Chaining: A Case Study in South Asia (Papers Track)

Hafiz Muhammad Abubakar (Beaconhouse National University); Raahim Arbaz (Beaconhouse National University); Hasnain Ahmad (Beaconhouse National University); Mubasher Nazir (Solve Agri Pak Private Limited); Usman Nazir (Beaconhouse National University)

NeurIPS 2024
Climate Justice Health

Abstract

This study presents a framework for detecting pollution sources, specifically factory and brick kiln chimneys, in major South Asian cities using remote sensing data and deep learning. We first identify hotspots of Acute Respiratory Infections (ARI) by correlating health data with air pollutant concentrations, including sulfur dioxide (SO₂), nitrogen dioxide (NO₂), and carbon monoxide (CO). For these hotspots, we acquire both low-resolution and high-resolution satellite imagery. Our approach is sequential: a Vision Transformer model first filters the high-resolution imagery using a broad range of text inputs and a lower confidence threshold. The Remote CLIP model is then run twice in succession, each time pairing the satellite imagery with different text inputs, to refine the detections further. This sequential transformer chaining filters out 99% of irrelevant data from the high-resolution imagery. The remaining 1% of the data is then manually annotated to ensure high accuracy and minimize errors. Additionally, we develop a novel multispectral chimney index for detecting chimneys in low-resolution imagery. The study also introduces an annotated chimney detection dataset capturing diverse chimney types, which improves detection accuracy. The results provide actionable insights for public health interventions and support regulatory measures aimed at achieving the United Nations' Sustainable Development Goal 3 on health and well-being. We plan to make the dataset and code publicly available following the acceptance of this paper.
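
The sequential chaining described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the stage names, thresholds, and precomputed scores below are hypothetical placeholders standing in for the Vision Transformer pass and the two Remote CLIP passes, and real scorers would run the models on image tiles with their respective text inputs.

```python
from typing import Callable, List, Sequence, Tuple

# A stage is (name, scorer, threshold). The scorer maps a tile to a
# confidence value; tiles scoring below the threshold are dropped.
Stage = Tuple[str, Callable[[dict], float], float]


def chain_filter(
    tiles: Sequence[dict], stages: Sequence[Stage]
) -> Tuple[List[dict], List[Tuple[str, int]]]:
    """Pass tiles through each stage in order, keeping those that score
    at or above the stage threshold. Returns the survivors and a log of
    how many tiles remained after each stage."""
    survivors = list(tiles)
    log = []
    for name, scorer, threshold in stages:
        survivors = [t for t in survivors if scorer(t) >= threshold]
        log.append((name, len(survivors)))
    return survivors, log


# Toy tiles with made-up, precomputed scores (assumption: in practice
# each score would come from running a model on the tile).
tiles = [
    {"id": 0, "vit": 0.10, "clip_a": 0.90, "clip_b": 0.90},
    {"id": 1, "vit": 0.35, "clip_a": 0.20, "clip_b": 0.80},
    {"id": 2, "vit": 0.40, "clip_a": 0.70, "clip_b": 0.30},
    {"id": 3, "vit": 0.80, "clip_a": 0.85, "clip_b": 0.75},
]

# Hypothetical stages: a permissive first pass with a lower threshold,
# then two stricter passes that use different text inputs.
stages: List[Stage] = [
    ("vit_broad_prompts", lambda t: t["vit"], 0.3),
    ("clip_prompt_set_a", lambda t: t["clip_a"], 0.5),
    ("clip_prompt_set_b", lambda t: t["clip_b"], 0.5),
]

survivors, log = chain_filter(tiles, stages)
for name, remaining in log:
    print(f"{name}: {remaining} tiles remain")
print("sent to manual annotation:", [t["id"] for t in survivors])
```

Each stage only scores tiles that survived the previous one, so the cheaper, more permissive filter removes most of the data before the later passes run, and only the final survivors go to manual annotation.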