DeepAstroUDA


Universal Domain Adaptation for cross-survey classification, regression and anomaly detection

Overview

DeepAstroUDA is a Universal Domain Adaptation (UDA) method for astronomy that performs semi-supervised domain alignment. It adapts across astronomical survey datasets whose class sets overlap only partially, handling both classes that are known in advance and unknown classes that appear as anomalies. It supports classification, regression, and anomaly detection across datasets from different observational sources.

The method was demonstrated on morphological classification of galaxies, where it performed robustly on known classes and also identified anomalies. This makes it useful for discovery-oriented work that combines data from different domains.

Features

  • Versatile Domain Adaptation: Effectively aligns datasets with various class overlaps, accommodating extra classes in either source or target domains.
  • Robust Anomaly Detection: Identifies unknown classes, so the model stays usable on real-world data that contains objects absent from the training labels.
  • Multi-functional Applications: Suitable for classification, regression, and clustering tasks, expanding its usability beyond standard applications.
  • Efficient Architecture: Built on a ResNet50 architecture, with early stopping criteria to save computational resources.
  • Domain-Specific Batch Normalization: Prevents style information from leaking between domains by normalizing each domain separately.
  • Ease of Installation: Available as a Python package on PyPI, so it can be integrated into existing workflows.
  • Rapid Training Time: On average, the model converges in around 5 hours when trained on high-end GPUs.
  • Comprehensive Sample Access: Includes template files and sample datasets for initial setup and testing.
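
To illustrate the domain-specific batch normalization feature listed above, here is a minimal sketch in plain Python. It is not taken from the DeepAstroUDA package: the class name `DomainSpecificBatchNorm`, the domain labels "source" and "target", and the one-dimensional toy inputs are all assumptions made for illustration. The idea it shows is only that each domain keeps its own normalization statistics, so an offset that is particular to one survey does not affect how the other is normalized.

```python
import math


class DomainSpecificBatchNorm:
    """Toy 1-D batch norm with separate running statistics per domain.

    Hypothetical illustration only; not the DeepAstroUDA implementation.
    """

    def __init__(self, domains=("source", "target"), momentum=0.1, eps=1e-5):
        self.momentum = momentum
        self.eps = eps
        # One independent set of running statistics per domain.
        self.running = {d: {"mean": 0.0, "var": 1.0} for d in domains}

    def forward(self, batch, domain, training=True):
        stats = self.running[domain]
        if training:
            # Normalize with the statistics of the current batch.
            mean = sum(batch) / len(batch)
            var = sum((x - mean) ** 2 for x in batch) / len(batch)
            # Update only the running statistics that belong to this domain.
            m = self.momentum
            stats["mean"] = (1 - m) * stats["mean"] + m * mean
            stats["var"] = (1 - m) * stats["var"] + m * var
        else:
            # At inference time, reuse the stored statistics for this domain.
            mean, var = stats["mean"], stats["var"]
        return [(x - mean) / math.sqrt(var + self.eps) for x in batch]


bn = DomainSpecificBatchNorm()
# Two batches with the same shape but a large domain-specific offset.
source_out = bn.forward([1.0, 2.0, 3.0], "source")
target_out = bn.forward([101.0, 102.0, 103.0], "target")
```

In this toy case the two normalized batches come out the same even though the raw values differ by a constant offset, while the stored running statistics for the two domains remain distinct. A real network would apply the same idea per channel inside each normalization layer and would typically also keep separate learnable scale and shift parameters per domain.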