BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//pretalx//pretalx.com//adass2023//talk//EJPSMN
BEGIN:VTIMEZONE
TZID:MST
BEGIN:STANDARD
DTSTART:20000101T000000
RRULE:FREQ=YEARLY;BYMONTH=1
TZNAME:MST
TZOFFSETFROM:-0700
TZOFFSETTO:-0700
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
UID:pretalx-adass2023-EJPSMN@pretalx.com
DTSTART;TZID=MST:20231107T110000
DTEND;TZID=MST:20231107T111500
DESCRIPTION:With the exponential growth of the amount of astronomical data 
 with time\, finding the needles in the haystack is getting increasingly di
 fficult. Traditionally\, archives have described their observations w
 ith metadata and made those searchable through web interfaces as well as p
 rogrammatically. The next frontier for science archives is to also allow s
 earches on the content of the observations themselves. As a step in this d
 irection\, we have implemented a prototype of a recommender system for t
 he ALMA Science Archive. We use self-supervised affine-transformation-inde
 pendent representation learning of source morphologies for the similarity 
 estimation through contrastive learning with a deep neural network. Once t
 he neural network is trained\, the feature vectors for all images - bo
 th for continuum images and for peak-flux images of datacubes - are evalua
 ted. In a next step\, we compute the similarity matrix holding for each im
 age the corresponding 1000 most similar images\, ordered by their pairwise
  similarity. A kd-tree is used to speed up that computation from O(N^2) to
  O(N log(N)). Our prototype interface then shows the most similar images\,
  from which the archival researcher can select the most interesting ones.
  When they select an image on the interface\, we use a scoring algorithm t
 o instantaneously compute the combined similarity of all already selected
  images and reorder the displayed remaining images accordingly. Each sele
 ction thus further refines the similarity display. Finally\, we use k-mean
 s clustering on the feature vectors of the displayed images to provide sel
 ectable 'source morphology categories' for a quick-select option. We concl
 ude from the prototype that an image similarity interface can be a valuabl
 e asset to science archives and we are looking forward to discussing this 
 work and related ideas with the ADASS community.
DTSTAMP:20260611T032507Z
LOCATION:Talks
SUMMARY:"You might also like these images": unsupervised affine-transformat
 ion-independent representation learning for the ALMA Science Archive - Fel
 ix Stoehr
URL:https://pretalx.com/adass2023/talk/EJPSMN/
END:VEVENT
END:VCALENDAR
