4.5  :  Interfacing Python with the Millennium Database

Here we provide a Python module that can be used to query the database. It uses the built-in urllib2 module (part of the Python 2 standard library) to submit queries via the web interface and returns the result as a NumPy record array.

virgodb.py

To use the module to execute a query and return the result:

>>> from virgodb import VirgoDB
>>> vdb = VirgoDB("my_username")
>>> result = vdb.execute_query("select * from snapshots..mr7")

By default this queries the Millennium database at the ICC in Durham. You can also set the database URL explicitly, for example:

>>> vdb = VirgoDB("my_username", url="http://virgodb.dur.ac.uk:8080/MyMillennium")
Or, to query the database at MPA:
>>> vdb = VirgoDB("my_username", url="http://gavo.mpa-garching.mpg.de/MyMillennium/")

The output is a record array where the fields correspond to columns in the result of the SQL query. In the example above we can print the snapshot number and redshift columns as follows:

>>> print result["snapnum"]
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49
 50 51 52 53 54 55 56 57 58 59 60 61]
>>> print result["z"]
[  4.99995900e+01   3.00000610e+01   1.99156880e+01   1.82437230e+01
   1.67245250e+01   1.53430730e+01   1.40859140e+01   1.29407800e+01
   1.18965690e+01   1.09438640e+01   1.00734620e+01   9.27791500e+00
   8.54991200e+00   7.88320400e+00   7.27218800e+00   6.71158600e+00
   6.19683300e+00   5.72386400e+00   5.28883300e+00   4.88844900e+00
   4.51955600e+00   4.17946900e+00   3.86568300e+00   3.57590500e+00
   3.30809800e+00   3.06041900e+00   2.83118200e+00   2.61886200e+00
   2.42204400e+00   2.23948600e+00   2.07002700e+00   1.91263300e+00
   1.76633600e+00   1.63027100e+00   1.50363600e+00   1.38571800e+00
   1.27584600e+00   1.17341700e+00   1.07787500e+00   9.88708000e-01
   9.05463000e-01   8.27699000e-01   7.55036000e-01   6.87109000e-01
   6.23590000e-01   5.64177000e-01   5.08591000e-01   4.56577000e-01
   4.07899000e-01   3.62340000e-01   3.19703000e-01   2.79802000e-01
   2.42469000e-01   2.07549000e-01   1.74898000e-01   1.44383000e-01
   1.15883000e-01   8.92880000e-02   6.44930000e-02   4.14030000e-02
   1.99330000e-02   0.00000000e+00]

To see a list of columns in the output:

>>> print result.dtype.fields.keys()
['a', 'Hz', 'lookBackTime', 'snapnum', 'z']
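
Because the result is an ordinary NumPy record array, rows can be selected with boolean masks in the usual way. The following is a minimal sketch that runs without database access: it uses a small hand-built stand-in array holding a few (snapnum, z) pairs copied from the output above. The column dtypes used here are assumptions, not necessarily those returned by the module.

```python
import numpy as np

# Stand-in for the array returned by execute_query(). Only two columns and
# five rows are reproduced; the dtypes ("i4", "f8") are assumed for illustration.
result = np.array(
    [(0, 49.99959), (1, 30.000061), (59, 0.041403), (60, 0.019933), (61, 0.0)],
    dtype=[("snapnum", "i4"), ("z", "f8")],
)

# Boolean mask: keep only the snapshots at redshift below 0.05
low_z = result[result["z"] < 0.05]

print(low_z["snapnum"])
```

The same indexing applies to a real query result, since each column is accessed by its field name.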

If the query fails for some reason (e.g. invalid SQL), the module raises an exception containing the text of the response from the database.
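
A script that runs several queries may want to catch such failures rather than stop. The sketch below assumes nothing about the exception class, which is not documented here, and therefore catches Exception broadly. The helper run_query and the FakeDB class are hypothetical names introduced only so the example runs without database access.

```python
def run_query(vdb, sql):
    """Return (result, None) on success or (None, error_text) on failure."""
    try:
        return vdb.execute_query(sql), None
    except Exception as err:  # exact exception class is an unknown; catch broadly
        return None, str(err)


# Hypothetical stand-in for a VirgoDB object whose query always fails.
class FakeDB(object):
    def execute_query(self, sql):
        raise RuntimeError("SQL error: invalid object name")


result, error = run_query(FakeDB(), "select * from no_such_table")
if error is not None:
    print("Query failed: " + error)
```

With a real VirgoDB instance in place of FakeDB, the error text would be whatever response the database returned.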

This module can also be used to download query results to a file:

>>> from virgodb import VirgoDB
>>> vdb = VirgoDB("my_username")
>>> result = vdb.query_to_file("output_file_name.txt", "select * from snapshots..mr7", format="text")
The format parameter can be set to 'text' for a plain text file, or to 'hdf5', in which case the output file contains one 1D HDF5 dataset for each column in the query result. HDF5 output requires the h5py module to be installed.
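
An HDF5 file written this way can be read back with h5py. The sketch below assumes that the datasets sit at the top level of the file and are named after the query columns; neither detail is stated above, so check the actual file layout first. To stay runnable without database access, it first writes a small stand-in file. The helper read_columns is a hypothetical name.

```python
import os
import tempfile

import h5py
import numpy as np


def read_columns(path):
    """Load every top-level dataset of an HDF5 file into a dict of arrays."""
    with h5py.File(path, "r") as f:
        # Assumes each top-level entry is a dataset, one per column
        return {name: f[name][...] for name in f.keys()}


# Stand-in for a file produced by query_to_file(..., format="hdf5")
path = os.path.join(tempfile.mkdtemp(), "output_file_name.hdf5")
with h5py.File(path, "w") as f:
    f.create_dataset("snapnum", data=np.array([59, 60, 61]))
    f.create_dataset("z", data=np.array([0.041403, 0.019933, 0.0]))

columns = read_columns(path)
print(sorted(columns))
```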