Markus Demleitner
MOCs, the HEALPix-based Multi-Order Coverage maps, are a powerful tool
for representing arbitrary shapes on the sphere with user-selectable
fidelity and remarkable compactness. Their most salient feature is that
operations such as union and intersection, nightmarish with conventional
geometries, are just a few lines of readable code with MOCs. This makes
MOC support a natural complement for the Virtual Observatory's Astronomical
Data Query Language (ADQL). Indeed, the latest version of the ADQL-based
Registry discovery protocol RegTAP already makes use of MOCs. This
poster describes some proposed ADQL extensions to make MOCs even more
useful in TAP and ADQL and discusses these extensions' implementation
status.
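To illustrate why set operations on MOCs are so short, here is a minimal sketch in plain Python. It is a toy model rather than any library's API: a MOC is taken to be a list of (order, ipix) cells in HEALPix NESTED numbering, and the names `flatten`, `union`, `intersection` and the resolution `MAX_ORDER` are assumptions made for this example.

```python
# Toy model of a MOC: a list of (order, ipix) cells in HEALPix NESTED numbering.
# In that scheme, cell ipix at order o has the children 4*ipix .. 4*ipix+3
# at order o+1, so every cell maps to a contiguous pixel range at a finer order.

MAX_ORDER = 3  # assumed working resolution for this sketch


def flatten(cells, max_order=MAX_ORDER):
    """Expand (order, ipix) cells into a set of pixel indices at max_order."""
    pixels = set()
    for order, ipix in cells:
        span = 4 ** (max_order - order)  # number of descendants at max_order
        pixels.update(range(ipix * span, (ipix + 1) * span))
    return pixels


def union(moc_a, moc_b):
    """Pixels (at MAX_ORDER) covered by either MOC."""
    return flatten(moc_a) | flatten(moc_b)


def intersection(moc_a, moc_b):
    """Pixels (at MAX_ORDER) covered by both MOCs."""
    return flatten(moc_a) & flatten(moc_b)


moc_a = [(1, 0), (2, 5)]              # one coarse cell plus one finer cell
moc_b = [(2, 1), (3, 21), (3, 100)]   # cells at mixed orders

common = intersection(moc_a, moc_b)
combined = union(moc_a, moc_b)
```

A real implementation would store ranges instead of individual pixels and would re-normalise the result into the coarsest possible cells; the sketch only shows that, once both shapes are expressed on the same pixel hierarchy, union and intersection reduce to ordinary set algebra.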
The VO Registry is a set of about 30,000 metadata records of astronomical resources. It can be queried through the powerful RegTAP protocol, which requires users to write ADQL. A friendlier interface to that protocol has recently been written as part of the pyVO astropy-affiliated package. In this poster, we briefly introduce the standards it is implemented against. Astronomers can use the new API to discover data based on various constraints, ranging from free text to physical concepts and areas in space, time, and spectrum.
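To give an idea of what a friendlier interface spares the user, the following sketch assembles by hand the kind of RegTAP query that discovers services of one type mentioning a keyword. It is not the pyVO API; the helper names `adql_literal` and `regtap_query` are hypothetical, and the table and column names (`rr.resource`, `rr.capability`, `rr.interface`, `standard_id`, `res_description`) as well as the `ivo_hasword` function are quoted from the RegTAP relational schema as we understand it.

```python
def adql_literal(text):
    """Quote a Python string as an ADQL string literal (doubling single quotes)."""
    return "'" + text.replace("'", "''") + "'"


def regtap_query(keyword, standard_id_pattern):
    """Build a RegTAP query for services matching a standard and a keyword.

    Hypothetical helper for illustration; it only builds the query text.
    """
    return (
        "SELECT ivoid, res_title, access_url\n"
        "FROM rr.resource\n"
        "NATURAL JOIN rr.capability\n"
        "NATURAL JOIN rr.interface\n"
        f"WHERE standard_id LIKE {adql_literal(standard_id_pattern)}\n"
        f"AND 1 = ivo_hasword(res_description, {adql_literal(keyword)})"
    )


# Example: image services whose description mentions quasars.
query = regtap_query("quasar", "ivo://ivoa.net/std/sia%")
print(query)
```

Even this simple case needs three joined tables and a user-defined function, which is the kind of detail a constraint-based API can hide.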
ADQL (Astronomical Data Query Language) is a language defined by the IVOA for querying astronomical data. It is a fork of SQL-92 that retains only the query features and adds astronomical functions and operators, in particular for querying data by position. It is mainly used in TAP, the IVOA protocol for querying relational astronomical data.
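As an illustration of a positional query and of how it travels over TAP, the sketch below builds the URL of a synchronous TAP request around an ADQL cone search. The service URL, the table `example.sources`, and its columns `ra` and `dec` are hypothetical; the geometry functions are written in their ADQL 2.0 form with an explicit coordinate system argument. The code only constructs the URL and does not contact any service.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

TAP_URL = "http://tap.example.org/tap"  # hypothetical service base URL

# Sources within 0.5 degrees of (180, -30); table and columns are made up.
ADQL = """
SELECT TOP 10 ra, dec
FROM example.sources
WHERE 1 = CONTAINS(POINT('ICRS', ra, dec),
                   CIRCLE('ICRS', 180.0, -30.0, 0.5))
"""


def sync_url(base_url, adql):
    """Return the URL of a synchronous TAP query (parameters per the TAP standard)."""
    params = {
        "REQUEST": "doQuery",
        "LANG": "ADQL",
        "QUERY": adql.strip(),
    }
    return base_url.rstrip("/") + "/sync?" + urlencode(params)


url = sync_url(TAP_URL, ADQL)
sent = parse_qs(urlsplit(url).query)  # decode again to inspect what is sent
```

Apart from the geometry functions in the WHERE clause, the query is plain SQL, which is what makes ADQL easy to pick up for anyone who knows SQL.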
Until now, the ADQL grammar has been described in a BNF (Backus-Naur Form)-inspired formalism, largely following SQL-92 itself. However, no implementation actually uses it in its current form, for several reasons, including some minor mistakes and lacunae that in practice were filled by borrowing from other SQL implementations. Also, the lack of stringent tokenisation rules resulted in differing interpretations of corner cases (e.g. the string literal '49a').
In the next version of ADQL, it is therefore proposed to change the ADQL language notation from BNF to PEG (Parsing Expression Grammar). In PEG, parsing and tokenisation are specified in a uniform way, and PEG's own grammar is standardised sufficiently well that multiple interoperating implementations can be used off the shelf to obtain parse trees of ADQL clauses (at least conceptually).
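The following sketch shows, under stated assumptions, how a PEG settles a tokenisation corner case within the grammar itself. It implements three PEG operators (sequence, ordered choice, and the not-predicate `!`) as small Python functions and uses them for two illustrative rules. The rules are not taken from the proposed ADQL grammar, and rejecting the input `49a` is a choice made for this example only.

```python
import re

# Each parser takes (text, pos) and returns the new position, or None on failure.


def regex(pattern):
    """Terminal: match a regular expression at the current position."""
    compiled = re.compile(pattern)

    def parse(text, pos):
        match = compiled.match(text, pos)
        return match.end() if match else None
    return parse


def sequence(*parsers):
    """PEG sequence: all parsers must match, one after the other."""
    def parse(text, pos):
        for parser in parsers:
            pos = parser(text, pos)
            if pos is None:
                return None
        return pos
    return parse


def choice(*parsers):
    """PEG ordered choice: the first alternative that matches wins."""
    def parse(text, pos):
        for parser in parsers:
            end = parser(text, pos)
            if end is not None:
                return end
        return None
    return parse


def not_ahead(parser):
    """PEG not-predicate: succeed without consuming input iff parser fails."""
    def parse(text, pos):
        return pos if parser(text, pos) is None else None
    return parse


identifier_char = regex(r"[A-Za-z0-9_]")
# unsigned_integer <- [0-9]+ !identifier_char
unsigned_integer = sequence(regex(r"[0-9]+"), not_ahead(identifier_char))
# identifier <- [A-Za-z] [A-Za-z0-9_]*
identifier = regex(r"[A-Za-z][A-Za-z0-9_]*")
# token <- unsigned_integer / identifier
token = choice(unsigned_integer, identifier)


def matches(parser, text):
    """True if parser consumes the whole of text."""
    return parser(text, 0) == len(text)
```

With these rules, `matches(token, "49")` and `matches(token, "a49")` hold while `matches(token, "49a")` does not: the not-predicate states in the grammar that a number must not run directly into an identifier character, a decision a BNF description would leave to a separately specified tokeniser.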
This poster aims to show the differences between the two notations and the improvements the change implies for ADQL.