Next: Gridding with XY_MAP Up: IRAM Memo 2016-2 Gridding Previous: Efficient transpositions   Contents


CLASS tables in detail

CLASS can save a set of spectra with the command TABLE. It creates a 2D table using the Gildas Data Format (a header and rows of data). By default, the per-column format used is:

  1. the X offset position,
  2. the Y offset position,
  3. the spectrum weight W (e.g. for different integration times),
  4. the first channel intensity,
and so on for all channels.

Such a table is ordered Velocity-Position (hereafter VP): for each spectrum, X, Y, W and then the channels are contiguous in the file, and the spectra are concatenated one after the other. This ordering has advantages and disadvantages, which are listed in the table below and compared with the opposite Position-Velocity ordering (hereafter PV).
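The two orderings can be illustrated with a small sketch. It is not CLASS code and does not read the Gildas Data Format: the toy spectra, the flat Python lists standing in for the file contents, and the helper names `to_vp` and `to_pv` are all assumptions made for illustration.

```python
# Hypothetical toy table: 3 spectra of 4 channels each (values are made up).
NCHAN = 4
NCOL = 3 + NCHAN  # X, Y, W, then one column per channel
spectra = [
    # (x offset, y offset, weight, channel intensities)
    (0.0, 0.0, 1.0, [10.0, 11.0, 12.0, 13.0]),
    (5.0, 0.0, 0.5, [20.0, 21.0, 22.0, 23.0]),
    (0.0, 5.0, 2.0, [30.0, 31.0, 32.0, 33.0]),
]
nspec = len(spectra)


def to_vp(spectra):
    """Flatten in VP order: X, Y, W and channels of one spectrum are contiguous."""
    flat = []
    for x, y, w, chans in spectra:
        flat.extend([x, y, w])
        flat.extend(chans)
    return flat


def to_pv(spectra):
    """Flatten in PV order: each column (X, Y, W or one channel) is contiguous."""
    rows = [[x, y, w, *chans] for x, y, w, chans in spectra]
    return [rows[s][c] for c in range(NCOL) for s in range(len(rows))]


vp = to_vp(spectra)
pv = to_pv(spectra)

# X offsets: one value every NCOL values in VP, a single contiguous run in PV.
x_from_vp = vp[0::NCOL]
x_from_pv = pv[0:nspec]

# First channel of every spectrum (one velocity plane): same contrast.
chan0_from_vp = vp[3::NCOL]
chan0_from_pv = pv[3 * nspec:4 * nspec]
```

In this sketch, reading a column from the VP list requires a strided walk over the whole list, whereas the same column is a single slice of the PV list.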


Table: Advantages (+ or ++) and disadvantages (-) of VP- and PV-ordered Class tables
#  Order  Rating  Comment
1  VP     ++      natural ordering for Class Data Format to table conversion, i.e. the Class file (input) and table (output) are traversed once in parallel; the table may not fit in memory
   PV     -       would need non-contiguous access either to the Class file or to the output table in memory; the table must fit in memory
2  VP     ++      appending new spectra (TABLE APPEND mode) is straightforward and low-cost
   PV     -       TABLE APPEND is not easy; it costs reading and duplicating old data
3  VP     -       one needs to traverse the whole table to read the X Y W columns
   PV     +       X Y W arrays are contiguous; reading them has a marginal cost
4  VP     +       the VP order is already the correct one for gridding (see section [*])
   PV     -       the PV order needs to be transposed in an intermediate buffer before use in gridding
5  VP     -       reading by blocks of velocities is inefficient, i.e. one needs to traverse the file as many times as the number of blocks
   PV     ++      reading by velocity blocks is efficient, as each block is a contiguous piece of the file


From items 1 and 2 of the table above, it is clear that the result of the TABLE command must remain a Velocity-Position table. However, item 5 suggests that Position-Velocity tables may be useful for efficient gridding, especially for tables which do not fit in memory and must be read by blocks.

Transposing a Velocity-Position table can be done with the command SIC\TRANSPOSE; its advantages and disadvantages are detailed in the transposition table below. Item 6 can be explained as follows: both VP items 3 and 5 (VP/PV comparison table above) and the stand-alone transposition need to traverse the input table several times. However, because the former are part of a larger problem (reading the table, but also convolving, writing the cube, and memory transpositions if any), they can use only a smaller amount of memory, so they traverse the table more times (smaller and more numerous blocks are needed) than the stand-alone transposition. In other words, assuming that the table does not fit in memory and that I/O dominates the problem, VP gridding is slower than transposition followed by PV gridding. This is even more true if gridding is repeated several times on the same table. These conclusions are illustrated by the timing table further below.
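The link between available memory and the number of traversals can be sketched as follows. This is a minimal illustration under stated assumptions, not the SIC\TRANSPOSE implementation: the function name `transpose_vp_to_pv`, its `max_values_in_memory` parameter, and the flat lists standing in for the input and output files are hypothetical.

```python
def transpose_vp_to_pv(vp, nspec, ncol, max_values_in_memory):
    """Transpose a flat VP table (nspec spectra of ncol values) into PV order.

    At most `max_values_in_memory` values are buffered at once (at least one
    full column), so the input is traversed once per block of columns.
    Returns the PV list and the number of input traversals.
    """
    cols_per_block = max(1, max_values_in_memory // nspec)
    pv = []
    passes = 0
    for first in range(0, ncol, cols_per_block):
        last = min(first + cols_per_block, ncol)
        block = [[0.0] * nspec for _ in range(last - first)]
        # One full traversal of the input: keep only this block's columns.
        for s in range(nspec):
            row = vp[s * ncol:(s + 1) * ncol]
            for c in range(first, last):
                block[c - first][s] = row[c]
        passes += 1
        # Blocks are appended in order, so the output is written only once.
        for column in block:
            pv.extend(column)
    return pv, passes


# Toy table: 4 spectra of 6 values each (made-up numbers).
vp = [float(i) for i in range(24)]
pv_small, passes_small = transpose_vp_to_pv(vp, 4, 6, max_values_in_memory=8)
pv_large, passes_large = transpose_vp_to_pv(vp, 4, 6, max_values_in_memory=24)
```

With the toy numbers, a buffer of 8 values holds 2 columns and needs 3 traversals of the input, whereas a buffer of 24 values holds the whole table and needs a single one. A task that can afford a larger buffer therefore traverses the input fewer times, which is the argument behind item 6.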


Table: Advantages (+) and disadvantages (-) of VP to PV stand-alone transposition
#  Rating  VP to PV stand-alone transposition
6  +       file-to-file transposition is a stand-alone task which can use as much of the machine resources as possible, so it is faster than the VP overheads (items 3 and 5) incurred within the XY_MAP command
7  -       the transposed table is a duplicate of the same data; it doubles the disk usage
8  -       the transposition must be repeated each time the original table changes or new data is appended



Table: Gridding of a table of 61263 spectra x 20737 channels (4.8 GB). Each command was allowed to use only 2048 MB. The output cube (1.7 GB with the command defaults) is ordered LMV so that it is traversed only once for writing. The system cache was emptied before each run.
Action                 Table traversals      Elapsed time
XY_MAP (VP)            1 (XYW) + 7 blocks    8.2 min
TRANSPOSE (VP to PV)   4 (in) + 1 (out)      5.0 min
XY_MAP (PV)            1                     1.9 min
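The effect of repeating the gridding can be worked out from the elapsed times in the table above. The sketch below assumes that each run costs the same as the measured one (cold cache, same memory limit), which is an extrapolation and not a measurement; the function name `total_minutes` is hypothetical.

```python
# Elapsed times in minutes, copied from the table above.
T_XYMAP_VP = 8.2
T_TRANSPOSE = 5.0
T_XYMAP_PV = 1.9


def total_minutes(n_griddings):
    """Return (VP-only cost, transpose-once-then-PV cost) for n gridding runs."""
    vp_only = n_griddings * T_XYMAP_VP
    transpose_then_pv = T_TRANSPOSE + n_griddings * T_XYMAP_PV
    return vp_only, transpose_then_pv


for n in (1, 2, 5):
    vp_only, pv_route = total_minutes(n)
    print(f"{n} gridding(s): VP {vp_only:.1f} min, transpose + PV {pv_route:.1f} min")
```

For a single gridding, the transposition followed by PV gridding takes 5.0 + 1.9 = 6.9 min against 8.2 min for VP gridding; under the stated assumption, the gap grows with each additional gridding since the transposition is paid only once.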

From these results, there is no unique answer to the question of which kind of table should be fed into the gridding engine. The answer is a trade-off between efficiency when creating or appending to the table and efficiency when reading it. If the whole table fits in memory, i.e. it can be traversed once, then it is probably better to use a VP table. If it must be read by parts, the PV order should be preferred.


Gildas manager 2023-06-01