SQL Plugin
The SQL plugin is able to serve data from any relational database with a DB-API 2 compliant module; out–of–the–box it works with SQLite (via pysqlite), PostgreSQL (via psycopg2), MySQL (via MySQLdb) and Oracle (using cx_Oracle), but support for other compliant modules can be added in minutes.
Installation is straightforward, provided you to have already installed the proper libraries and header files depending on the database that will be used. After that, the plugin and the required Python modules can be installed using EasyInstall:
$ easy_install dap.plugins.sql[sqlite]
$ easy_install dap.plugins.sql[postgresql]
$ easy_install dap.plugins.sql[mysql]
$ easy_install dap.plugins.sql[oracle]
To serve data, you need to create a simple INI-like file pointing to your database. This file will be accessed by the clients, making a link between them and the stored data, and defines the connection to the database, the columns that will be published and the dataset metadata. Here's a simple example, in a file called coastline.sql:
; coastline.sql
; defines a SQLite connection to coastline
; data stored in file coastline.db
[database]
dsn: sqlite://%(here)s/coastline.db
[metadata]
name: coastline
description: World Coastline from the World Data Bank II
author: Roberto De Almeida
[latitude]
col: coastline.latitude
type: Float32
units: degrees_north
missing_value: -9999
[longitude]
col: coastline.longitude
type: Float32
units: degrees_east
missing_value: -9999
Let's go through that file in detail.
Defining the connection
The first block defines the DSN for out database:
[database]
dsn: sqlite://%(here)s/coastline.db
In this example, we're setting up a connection to a SQLite database stored in file coastline.db (SQLite is not client—server based; the database is simply a file). This file can be located anywhere in the filesystem. The %(here)s variable points to the directory where coastline.sql is located, so both files are in the same directory in this example.
For other databases, say PostgreSQL, the syntax is a little different:
[database]
dsn: postgresql://user:pass@host:port/db
The example above defines a connection to a PostgreSQL database db located at a given host, specifying a username and password. Quite simple.
Apart from the DSN, we can also add arbitrary metadata to the dataset:
[metadata]
name: coastline
description: World Coastline from the World Data Bank II
author: Roberto De Almeida
Here, name is the name of the main Sequence in the generated dataset, and description is, well, a description of the dataset. These two values are recommended, and arbitraty additional values (like the author key in this example) can be added as necessary.
Defining the columns
The last step consists of defining the rows from our database that we want to add to the dataset. To do this, we create a section for each column with the name of the variable, and use the col key to specify the table and column name:
[longitude]
col: coastline.longitude
type: Float32
units: degrees_east
missing_value: -9999
Here, we're creating a variable named longitude pointing to the column of same name in the table coastline. We also specify the variable type as Float32, and some additional metadata. Specifying the variable type is optional except for SQLite, but recommended.
A small detail is that if the columns are not all in the same table you need to specify an id to join the values together. For example:
[salinity]
col: salt.salt
id: timestamp
[temperature]
col: temp.temp
id: timestamp
In the example above, the plugin will join the values of temperature and salinity where temp.timestamp is equal to salt.timestamp. By default the plugin does an INNER JOIN between all columns, but you can specify the join form with the join key in the database section:
[database]
dsn: mysql://localhost/db
join: FULL OUTER JOIN
Converting data
A nice feature of the SQL plugin is that it will convert data automatically for you, according to the metadata. In the coastline example, when we specify:
[longitude]
col: coastline.longitude
missing_value: -9999
In this case, missing values in the database will be automatically converted to -9999 (or any other value you specify). The same is true for values stored in the database as DATE or DATETIME; simply specify the desired COARDS time unit and the plugin will do the conversion:
[time]
col: casts.timestamp
units: hours since 1990-01-01 12:00:00 -01:00
type: Float32
This way, the plugin will read the values as DATETIME, for example, and return them as floats. This is necessary since the DAP does not define data types for dates. If you don't specify a unit, DATETIME values will be converted to years since 0001-01-01 by default.