Marine scientists use remote underwater video recording to survey fish
species in their natural habitats. This helps them understand and predict how
fish respond to climate change, habitat degradation, and fishing pressure. This
information is essential for developing sustainable fisheries for human
consumption and for preserving the environment. However, the enormous volume
of collected video makes extracting useful information a daunting and
time-consuming task for a human. A promising way to address this problem is
Deep Learning (DL). DL can help marine scientists
parse large volumes of video promptly and efficiently, unlocking niche
information that cannot be obtained using conventional manual monitoring
methods. In this paper, we provide an overview of the key concepts of DL and
present a survey of the literature on fish habitat monitoring, with a focus on
underwater fish classification. We also discuss the main challenges faced when
developing DL for underwater image processing and propose approaches to address
them. Finally, we provide insights into the marine habitat monitoring research
domain and shed light on what the future of DL for underwater image processing
may hold. This paper aims to inform a wide range of readers, from marine
scientists who would like to apply DL in their research to computer scientists
who would like to survey the state-of-the-art literature on DL-based underwater
fish habitat monitoring.