WDK Version: 1.6 (08/09/05)

Go to the GUS WDK Home for general project information.

Go to the download site for Release Notes and downloads.

Please also see the WDK Model Javadoc. For the JSTL API to use in JSP pages, focus on the jspwrap package.

Table of Contents

#Introduction

#Background

#Overview of the WDK Design

#Installing the WDK

#Creating your project

#Creating a model

#Configuring and customizing the view

#Upcoming features


Introduction

The GUS WDK is designed to accelerate the creation of "data mining" websites. It can work on any relational database system and on any schema.

You use the WDK to

For the more technically minded, the data model that you define in the WDK can be thought of as "a configurable Data Transfer Object (DTO) layer". A DTO is an object that brings together data that may come from many detailed entities in the database. The database is typically structured in a normalized fashion, so that there are many tables with deep relationships that form the structure of the data. It is considered good practice to provide a web site or other high-level data consumer with objects at a coarser granularity. For example, a Gene DTO may bring together data from many tables that contain information relating to a Gene.

From this perspective, the WDK model offers an abstraction on how to request DTOs (questions) and the DTOs themselves (records).
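
The DTO idea can be illustrated with a short Python sketch. This is not WDK code: the three-table schema, the table and column names, the sample data and the GeneDTO class are all assumptions made for the illustration. It shows one coarse-grained object being filled from several normalized tables.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class GeneDTO:
    """Coarse-grained view of one gene, gathered from several tables."""
    gene_id: int
    name: str
    organism: str
    go_terms: list = field(default_factory=list)


def load_gene(conn, gene_id):
    # One row of single-valued data, joined from two normalized tables.
    name, organism = conn.execute(
        "SELECT g.name, t.name FROM gene g "
        "JOIN taxon t ON g.taxon_id = t.taxon_id "
        "WHERE g.gene_id = ?", (gene_id,)).fetchone()
    # Zero or more rows of multi-valued data from a third table.
    terms = [row[0] for row in conn.execute(
        "SELECT term FROM gene_go WHERE gene_id = ? ORDER BY term",
        (gene_id,))]
    return GeneDTO(gene_id, name, organism, terms)


# Hypothetical schema and data, built in memory for the demonstration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE taxon (taxon_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE gene (gene_id INTEGER PRIMARY KEY, name TEXT, taxon_id INTEGER);
    CREATE TABLE gene_go (gene_id INTEGER, term TEXT);
    INSERT INTO taxon VALUES (1, 'Plasmodium falciparum');
    INSERT INTO gene VALUES (10, 'exampleGene', 1);
    INSERT INTO gene_go VALUES (10, 'membrane'), (10, 'transport');
""")
gene = load_gene(conn, 10)
print(gene)
```

The web page that displays the gene only needs the GeneDTO object; it does not need to know how many tables were read to build it.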

Background

The WDK evolved out of long-standing development efforts at the Computational Biology and Informatics Lab (CBIL) at the Penn Center for Bioinformatics (PCBI), University of Pennsylvania. The predecessor to the GUS WDK, the WDK-Classic, has been in use since 2001 and is the web framework used by three well-recognized sites: http://plasmodb.org, http://www.allgenes.org and http://www.cbil.upenn.edu/epcondb.

We are rewriting the WDK, in collaboration with the Pathogen Sequencing Unit (PSU) at the Wellcome Trust Sanger Institute (and YOU if you would like to join the effort).

The new WDK will:

The GUS WDK is a free and open source project.

Note: In this document we use PlasmoDB.org to demonstrate our points. PlasmoDB.org is still using the WDK-Classic. There is plenty of functionality on the site which will only be available in future releases of the GUS WDK. (See #Upcoming features.)

Soon: we will provide a link to a real site using the new WDK (eg ApiDoTS), and to the WDK code that drives it.

Overview of the WDK Design

The objectives of the WDK are to

The end user's perspective: Questions, Summaries, Records

The WDK uses a Question-Summary-Record paradigm to organize a web site. End users are provided with a set of questions to choose from. They run a particular question by specifying values for the question's parameters. The result is a summary of the entities found by the query. Each element in the summary offers a link to the full record for the entity.

A tour of questions, summaries and records will serve to orient you for the following discussions.

Questions

In the WDK we call the inquiries that users pose questions, as opposed to queries. This is because WDK questions return a set of records, while queries are commonly understood in database systems to return tables (columns and rows). As we describe below, the WDK does use queries, but they are hidden from users.

The home page of PlasmoDB.org presents about 30 canned questions to the user. Some can be run directly from the home page, but most are run from a dialogue page that the user reaches by selecting a question from a pulldown list. Our discussion here starts from those pulldown lists. The WDK lets you define questions and put them into different lists. This way, you can offer your user one or more pulldowns, each containing a list of questions to choose from.

When the user chooses a question, he or she is brought to a question page. On PlasmoDB.org, if the user selects the "Transmembrane domain" question from the "gene sequence features" list, he or she will arrive at the Transmembrane domain question page. On this page the user fills in parameter values and submits the question.

Summaries

Once the question is run, the user is brought to a page showing a summary of the result. Here is the summary for the query we ran on PlasmoDB. The summary displays a row for each entity that matched the question (the GUS WDK will allow for more flexible displays of summaries in future releases). The first column is a link to a detailed record for the entity.

Records

When the user clicks on the link to a record, the record page appears. In our example, the user will go to the gene record. The record presents detailed information about the entity, including:

MVC: Model-View-Controller Architecture

The WDK uses the Model-View-Controller architectural pattern. This design allows us to cleanly separate what we are describing (the Model) from how we display it (the View).

The Model

The WDK model describes the questions, summaries and records that will appear on the website. The model is defined in an XML file that you will create (you may borrow heavily from model files supplied with the release). Creating a model is described in the section #Creating a model. Because you define the model separately from the view, you can use command line tools, described in #Testing on the command line, to test the model. This liberates your testing from the difficult environment of a web application server (Tomcat).

The View

The WDK view creates a website that displays the model. It does so by using JavaServer Pages (JSP) to present the model. The JSP pages have access to the questions and records in the model and display them. The view also defines custom JSP tags that help create JSP pages.

In the current release of the WDK, only a default view of the model is supported. After you set up Tomcat, install the WDK and define a model, you will immediately have a working web site. As described below, you configure it to give it a "branded" look. In upcoming releases the view will offer much more powerful means for you to customize your site.

The Controller

The WDK controller functions internally to the WDK. It conforms to standard Model 2 architecture.

Installing the WDK

Installing the WDK is a multistep process. Expect it to take on the order of an hour to complete.

Understanding the install targets

There are two targets of the installation

To get started, you will install and test the WDK model facilities. For this, you can set your GUS_HOME to be anywhere, such as a gushome directory in your home directory.

Once these facilities are in place, you will be able to design your own model against the data in your database. You will be able to do significant development and testing of your model without running in the trickier context of a web application.

After you are satisfied with your model, it will be time to move to the web application server. To do so, set up a GUS_HOME that is in the same file system as your web application directories. (You will be creating symbolic links from the web application directory into GUS_HOME.) Then follow the instructions below for installing the webapp.

System requirements

Using the GUS install system

The WDK uses the GUS install system. Follow the GUS Installer instructions to set it up.

Downloading

Get the latest release of the WDK from here

Unpack it into the $PROJECT_HOME you set up for the GUS install system.

Then install the WDK into $GUS_HOME:

build WDK install -append

Installing the database driver

The current release supports Oracle and PostgreSQL.

The PostgreSQL JDBC driver is included in the distribution.

For licensing reasons, however, the Oracle JDBC driver is not included in the distribution. To include it:

Installing the Toy model

The WDK release includes a Toy Site. You will need to install and play with it before you are ready to build your own site. The Toy Site has a model and a view (just like yours will). We discuss the model first.

After you have installed the WDK into $GUS_HOME, there are three additional steps to installing the Toy model. You must configure the model, create a query cache and create the toy database.

Understanding the cache

Before going on to configure the model and then create the query cache, it will help if you understand the purpose of the cache.

The WDK model stores query results in a "cache." The cache is in your database. (In the section #Configuring the model you will configure the model to tell it where to create the cache tables.) The main table is called "QueryInstance" by default (though you can name it whatever you want). Each row in that table represents the running of a query. The row stores the name of the query and the parameter values it was run with. It also stores the name of a result table that holds the actual result. If the same query is requested again with the exact same parameter values (regardless of which user requests it), the result is fetched from the cache, avoiding the expense of running the query all over again.
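
The caching behaviour described above can be sketched in Python. This is an illustration only, not the WDK implementation: the column names, the result table naming scheme, the single "params" column (the WDK uses one column per parameter) and the query name GeneIds.ByCount are assumptions. The sketch keeps one index row per run and reuses the stored result when the same query is requested with the same parameter values.

```python
import sqlite3


class QueryCache:
    """Toy cache: one index row per (query name, parameter values) run."""

    def __init__(self, conn):
        self.conn = conn
        self.runs = 0  # counts how often the underlying query really ran
        conn.execute("CREATE TABLE QueryInstance ("
                     "query_name TEXT, params TEXT, result_table TEXT)")

    def fetch(self, query_name, params, run_query):
        # Canonical form of the parameter values, independent of dict order.
        key = repr(sorted(params.items()))
        row = self.conn.execute(
            "SELECT result_table FROM QueryInstance "
            "WHERE query_name = ? AND params = ?",
            (query_name, key)).fetchone()
        if row is None:
            # Cache miss: run the query and store its rows in a result table.
            self.runs += 1
            count = self.conn.execute(
                "SELECT COUNT(*) FROM QueryInstance").fetchone()[0]
            table = f"query_result_{count + 1}"
            self.conn.execute(f"CREATE TABLE {table} (id TEXT)")
            self.conn.executemany(f"INSERT INTO {table} VALUES (?)",
                                  [(v,) for v in run_query(params)])
            self.conn.execute("INSERT INTO QueryInstance VALUES (?, ?, ?)",
                              (query_name, key, table))
        else:
            # Cache hit: the index row names the table holding the result.
            table = row[0]
        return [r[0] for r in self.conn.execute(
            f"SELECT id FROM {table} ORDER BY rowid")]


def find_ids(params):
    # Stand-in for an expensive database query.
    return [f"gene{i}" for i in range(int(params["count"]))]


cache = QueryCache(sqlite3.connect(":memory:"))
first = cache.fetch("GeneIds.ByCount", {"count": "3"}, find_ids)
second = cache.fetch("GeneIds.ByCount", {"count": "3"}, find_ids)
print(first, second, cache.runs)
```

The second call returns the stored rows without running find_ids again, which is the saving the cache is meant to provide.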

When you define your model (see #The Model XML File), you will designate which queries to cache and not to cache, based on your expectations of your system's use. By default, queries that are used in Questions are cached, and you can turn that off by setting the isCacheable attribute to false. Queries that are used in Records are not cached.

NOTE: Whenever you change the definition in your model of a cached query then you must reset the cache (see #Creating and managing the cache).

Configuring the Toy model

To configure the Toy model, you must edit its configuration file:

$GUS_HOME/config/toyModel-config.xml

NOTE: some of the properties you set control the database connection pooling in the WDK. The WDK is delivered with the Jakarta Commons DBCP connection pooler.

The properties you set in the file are:

login your login for the database
password your password for the database
connectionUrl the information describing how to connect to your database
queryInstanceTable the name of the table that the cache system should use to store information about queries that have been run
maxQueryParams the maximum number of parameters that a query is allowed to have. This controls the number of columns created in the cache's queryInstanceTable. The table has columns to hold parameter values with names param1, param2, etc. If you choose 20 (suggested), the table will be able to hold 20 parameter values per query (which is probably more than enough). This means that you may not define any queries in your model that have more than 20 parameters (see #The Model XML File).
platformClass a Java class which provides RDBMS-specific functionality. The WDK comes with these classes:

Oracle org.gusdb.wdk.model.implementation.Oracle
PostgreSQL org.gusdb.wdk.model.implementation.PostgreSQL
MySQL not yet
initialSize connection pool: the number of connections that are created on startup.
maxActive connection pool: the maximum number of active connections that can be allocated from this pool at the same time, or zero for no limit.
maxIdle connection pool: the maximum number of active connections that can remain idle in the pool, without extra ones being released, or zero for no limit.
minIdle connection pool: the minimum number of active connections that can remain idle in the pool, without extra ones being created, or zero to create none.
maxWait connection pool: the maximum number of milliseconds that the pool will wait (when there are no available connections) for a connection to be returned before throwing an exception, or -1 to wait indefinitely.
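
To make the maxQueryParams setting concrete, here is a small Python sketch (not WDK code). It assumes only what is stated above: the parameter columns are named param1, param2 and so on, and a query may not declare more parameters than there are columns. The function names and the error raised are hypothetical; how the WDK itself reports the problem is not shown here.

```python
def query_instance_columns(max_query_params):
    """Column names for parameter values: param1, param2, ... paramN."""
    return [f"param{i}" for i in range(1, max_query_params + 1)]


def check_query(query_name, param_names, max_query_params):
    """Reject a query that declares more parameters than the table can hold."""
    if len(param_names) > max_query_params:
        raise ValueError(
            f"{query_name} has {len(param_names)} parameters, "
            f"but maxQueryParams is {max_query_params}")


columns = query_instance_columns(20)
# A query with two parameters fits comfortably when the limit is 20.
check_query("RnaIds.ByDbESTLib", ["NumEstLibs", "AssemblyConsistency"], 20)
print(columns[0], columns[-1], len(columns))
```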

Creating and managing the cache

After you have edited the model config file, create an empty cache by using the wdkCache command.

Here is its usage:

% wdkCache
 usage: wdkCache -model model_name -new|-reset|-drop

Create, reset or drop a query cache. The name of the cache table is found
in the Model config file (the table is placed in the schema owned by login).
Resetting the cache drops all results tables and deletes all rows from the
cache table.  Dropping the cache first resets it then drops the cache table
and sequence.
Options:
 -model <model>             the name of the model.  This is used to find the Model config file ($GUS_HOME/config/model-name-config.xml)
 -drop                      drop the query cache
 -new                       create a new query cache
 -reset                     reset the query cache

To set up the Toy model's cache use this command:

% wdkCache -model toyModel -new

Setting up the toy database

The WDK release includes a toy database. It is packaged in a set of files in $GUS_HOME/data/WDK/ToyModel/testTables. The model regression test makes use of the toy database (as does the toy website). When you run the regression test it optionally creates the toy database in your RDBMS, moving the data from the files into tables. If you want to manage the toy database yourself, use the wdkTestDb command:

% wdkTestDb

Create a toy database to use in testing the WDK.  (The database is created from files included in the WDK distribution.)

usage: wdkTestDb -model model_name [-create | -drop]

Options:
-model <model>         The name of the model.  This is used to find the Model config file ($GUS_HOME/config/model_name-config.xml)

Use this command to create the toy database:

% wdkTestDb -model toyModel -create



Running the Model regression test

The regression test runs a set of WDK commands and compares the results to a previously run, manually validated expected result. If the output matches, the test passes; otherwise it fails. The test ensures that the expected behavior of the codebase remains consistent even when behind-the-scenes changes are made. The WDK release provides a standard regression test to use with the toy model; if this test fails when run after installing the toy model, something is seriously wrong. The test can be found at

$GUS_HOME/data/WDKToyModel/regressionTest/sampleRegressionTestCommands


The test uses data in the toy database. Set the --commandListFile flag to point to this file when running the wdkRegressionTest command:

usage:  Runs wdk executables from provided file and compares output to an expected result for testing purposes

wdkRegressionTest
        --configFile (database configuration file to use with wdk model)
        --loadNewDatabase! (flag to create database from provided test flat files; set this flag
          when running the regression test for the first time and when data in the files have changed)
        --createNewExpectedResult! (overwrite existing regression test expected result)
        --outputDirectory (all results of test including error files will be placed in this directory)
        --commandListFile (file that contains lists of commands to run in test)
        --verbose (prints out names of shell commands being executed)



In typical usage, you should run the regression test provided with the WDK release. If you find a reason to create your own regression test, then create a separate command list file. The command list file includes one or more tests. Each test is composed of three lines, and the next test, if any, follows beginning on the fourth line. The expected input for a test is as follows:

name="[name of test]"
command="[command to run with parameters conforming to the normal usage of the command, EXCEPT any global parameters (configFile) passed to the wdkRegressionTest command]"
critical="[true or false; if true, the regression test will immediately exit on failure, without running any more of the provided tests]"
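
The three-line layout of a command list file can be checked with a short Python sketch. This is not part of the WDK: the parser, its error messages and the two sample entries are assumptions made for illustration, and the sketch additionally assumes that blank lines may be ignored.

```python
import re

LINE = re.compile(r'^(name|command|critical)="(.*)"$')


def parse_command_list(text):
    """Parse tests made of three consecutive lines: name, command, critical."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) % 3 != 0:
        raise ValueError("each test needs exactly three lines")
    tests = []
    for i in range(0, len(lines), 3):
        fields = {}
        expected_keys = ("name", "command", "critical")
        for expected, line in zip(expected_keys, lines[i:i + 3]):
            match = LINE.match(line)
            if not match or match.group(1) != expected:
                raise ValueError(f"expected a {expected}= line, got: {line}")
            fields[expected] = match.group(2)
        # Convert the critical flag from text to a boolean.
        fields["critical"] = fields["critical"] == "true"
        tests.append(fields)
    return tests


# Illustrative entries only; they are not copied from the shipped file.
sample = '''name="parse toy model"
command="wdkXml -model toyModel"
critical="true"
name="run a toy query"
command="wdkQuery -model toyModel -query RnaIds.ByDbESTLib -params NumEstLibs 6 AssemblyConsistency 80"
critical="false"
'''
tests = parse_command_list(sample)
print(len(tests), tests[0]["name"], tests[0]["critical"])
```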


Installing the Toy View

Configuring Tomcat

The WDK uses a slightly unconventional installation strategy. It installs a complete and working GUS_HOME in the same file system as the web application directory, and uses symbolic links to link into it from the web application directory. The reason is that this provides a working GUS_HOME so that you can test your model using the extensive command line tools available for the model. Because it is linked in, you know for sure that the model you are testing is the same model that is running on your site.

The complication with this strategy is that it requires you to configure Tomcat to allow symlinks. This is not a default configuration because Tomcat feels that it may open a security issue. We believe that as long as nobody on your team makes symlinks to bad places, this is not a security problem. However, if you do not want to use links, then replace them with direct copies.

To configure Tomcat to allow symlinks

<host ...>
<Context path="/wdktoysite"
           docBase="/www/wdktoysite/webapp"
           debug="1"
           privileged="false" allowLinking="true" >
     <Logger className="org.apache.catalina.logger.FileLogger"
           prefix="wdktoysite-log."
           suffix=".txt"
           timestamp="true"
           directory="/www/wdktoysite/logs/" />
</Context>

Building the WDK on your site

Here are the steps required

webappTargetDir=your_webapp_directory
build WDKToySite webinstall -append -webPropFile your_web_prop_file
cd your_webapp_directory
$ ls
error.jsp       images/    misc/         questionSets.jsp  summary.jsp
error.user.jsp  index.jsp  question.jsp  record.jsp        WEB-INF/

Testing the toy website

The WDK is now installed on your site. Bring up the toy WDK site to test it.

Here are the steps

cd your_webapp_directory
cd WEB-INF
cp web.xml.toy web.xml
cd tags/site
cp footer.tag.toy footer.tag
cp header.tag.toy header.tag

Creating your project

Now that you have installed and tested the Toy Site you are ready to create your own WDK-based project. The first step is to set up a directory structure that is compatible with CVS and the GUS installer. This way you will be able to keep your project safe and also use the GUS installer to install your project along with the WDK.

Using WDKToySite as a template

To get started, use the WDKToySite as a template. You will need to choose the name of your site (referred to below as TheNameOfMySite). You will also need to choose the name of your model (referred to below as myModelName).

Start out by doing this:

 % cd $PROJECT_HOME
 # init your project with WDKToySite
 % cp -r WDKToySite TheNameOfMySite
 % cd TheNameOfMySite/Model
 # if your copy of the WDK came from CVS, remove the CVS dirs
 % rm -r `find . -name CVS`
 # remove unneeded data/ dir
 % rm -r data
 % cd config
 # rename config files so they use your model name
 % mv toyModel.xml myModelName.xml
 % mv toyModel.prop myModelName.prop
 % mv toyModel-config.xml myModelName-config.xml
 % mv toyModel-sanity.xml myModelName-sanity.xml
 # remove unneeded regression test file
 % rm regressionTestModelConfig.xml
 % cd $PROJECT_HOME/TheNameOfMySite/Site/webapp/WEB-INF
 # move the toy's web.xml file to a template file for your project
 % mv web.xml.toy web.xml.template
 % cd tags/site
 # move the toy's header and footer into files for your project
 % mv header.tag.toy header.tag
 % mv footer.tag.toy footer.tag

Editing the template

Now your project's directory structure is pretty close to what you need. To finish it off, you need to:

Building your project

To build your project, follow the directions in either #Downloading (to install the Model alone without the view) or in #Building the WDK on your site to install the project into a web site. In both cases, substitute TheNameOfMySite for WDKToySite in the build command to build your project instead of the Toy Site.

Storing your project in CVS

It is recommended that you store your project in CVS. To do so, do this:

% cd $PROJECT_HOME
% mv TheNameOfMySite TheNameOfMySite.bak
% cd TheNameOfMySite.bak
% cvs -d TheNameOfMyRepository import -m "Start of project" TheNameOfMySite dontcare start
% cd ..
% cvs -d TheNameOfMyRepository co TheNameOfMySite

Creating a model

The first step in using the WDK (after installing it) is to create a model. Because the WDK is schema independent, we cannot provide you with a model that is ready to go. We do supply a model for a sample site against a sample database to give you ideas. But, the basic task of deciding what questions, summaries and records you want on your site must be done by you.

NOTE: Whenever you change the definition in your model of a cached query then you must reset the cache (see #Creating and managing the cache).

The Model XML File

The model XML file defines the model. In it you specify all the details of the questions, summaries and records the end users will see. The WDK reads the file and creates your site from it.

The model XML file must be named a certain way:

$PROJECT_HOME/TheNameOfMySite/Model/config/myModelName.xml

Your installation contains a sample model XML file which will prove very useful in orienting you.

$PROJECT_HOME/WDKToySite/Model/config/toyModel.xml

In the model you define these kinds of objects:

<query> A query is a request for tabular data from a data source. The result is a list of rows, each with a set of columns. You define queries in the model for many purposes. As described below, some are used in questions, some in records and some in parameters.
<parameter> A value that must be specified for a query to run. Queries may have zero, one or many parameters.
<recordClass> A template for creating records of a particular type of entity, such as a Gene. The type of entity a record describes is determined by the type of ID the record is parameterized by. A Gene record expects a Gene ID, an RNA record expects an RNA ID. (More details on this below).
<question> A pairing of a query (one that returns a set of entity IDs) and a record. The record is used to describe each of the entities returned.

Sets and names

The top level of the model contains a number of sets. You create these first, and then add objects, as described in the previous section, to the sets. Here are the types of sets:

<querySet> A set of queries.
<parameterSet> A set of parameters.
<recordClassSet> A set of record classes.
<questionSet> A set of questions. A question set can be marked "internal" for the use in nested records and nested record lists (see below).

You may have more than one set of each type. An object may be in only one set. Each set has a name. All sets must have unique names, regardless of the type of the set. The elements of a set, eg, queries or records, must have names that are unique within the set (but not across sets).

In this way, any element of any set can be uniquely identified by its two part name: setname.elementname.
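
The naming rules above can be sketched in Python. This is an illustration, not the WDK's own data structure: the Model class and its method names are hypothetical. Set names are unique across the whole model, element names are unique within their set, and a two-part name resolves to exactly one element.

```python
class Model:
    """Holds named sets of named elements and resolves two-part names."""

    def __init__(self):
        self.sets = {}

    def add_set(self, set_name):
        # Set names must be unique, regardless of the type of the set.
        if set_name in self.sets:
            raise ValueError(f"duplicate set name: {set_name}")
        self.sets[set_name] = {}

    def add_element(self, set_name, element_name, element):
        # Element names only need to be unique within their own set.
        members = self.sets[set_name]
        if element_name in members:
            raise ValueError(f"duplicate name in {set_name}: {element_name}")
        members[element_name] = element

    def resolve(self, two_part_name):
        # Split setname.elementname and look up the element.
        set_name, element_name = two_part_name.split(".", 1)
        return self.sets[set_name][element_name]


model = Model()
model.add_set("RnaIds")
model.add_set("RnaRecordClasses")
model.add_element("RnaIds", "ByDbESTLib", {"kind": "query"})
model.add_element("RnaRecordClasses", "RnaRecordClass", {"kind": "recordClass"})
print(model.resolve("RnaIds.ByDbESTLib"))
```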

It is recommended that you divide your model into convenient sets. For example, queries that perform a similar function go well together in a set. Or, all Gene records might go together.

References

Some of the objects you define in the model refer to other objects you have defined in the model. For example, a question refers to a query and to a record. The reference uses the unique two part name of the referent.

Queries in the model

Queries are defined inside a query set. There are different types of queries to accommodate different types of data sources:

<sqlQuery> issued to a relational database
<flatFileQuery> issued against flat files (see #Upcoming features)
<processQuery> runs a process and returns a result (see #Upcoming features)

A query has these constituents:

<paramRef> References to parameters used by the query. Not only must each query have a unique two-part name, but each query must also have a unique element name (the second part of the two-part name).
<column> Each <column> describes a column that is part of the result. Additionally, a column has the attribute truncateToRef, which, when set to an integer value, restricts the displayed value of the column to that many characters. This is useful for long sequences and descriptions that interfere with neat table layout. The default truncate value is 100.
<sql> If the query is an <sqlQuery>, the sql statement to run. Embed parameter values in the sql by using the parameter's name (the second part of its two-part name) surrounded by '$$', eg, '$$taxon$$'. NOTES: (1) the columns in the select clause must match the columns declared in the <column>s. (2) all columns in the where clause must have a prefix to indicate what table they come from, even if the column name is unique across the tables used.

The model maintains the order of the query result for later use.
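
The '$$' embedding used in the <sql> text can be illustrated with a few lines of Python. This is a sketch of the substitution idea only, not the WDK's code: the function name, the table and column names and the sample value are assumptions. Note that plain text substitution like this performs no quoting or escaping; how the WDK handles that is not covered here.

```python
import re


def instantiate_sql(sql, values):
    """Replace each $$name$$ in the SQL with the value supplied for name."""
    def replace(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for parameter {name}")
        return str(values[name])
    return re.sub(r"\$\$(\w+)\$\$", replace, sql)


# Hypothetical query text; the where-clause column carries a table prefix.
sql = "SELECT g.gene_id FROM gene g WHERE g.taxon_id = $$taxon$$"
print(instantiate_sql(sql, {"taxon": 42}))
```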

As discussed in the following sections, queries are used for different purposes. Some are used by records, some are used by questions and some are used by parameters.

Parameters in the model

Parameters are defined inside a parameter set. There are different types of parameters (more on the way):

<stringParameter> A parameter whose value is a string.
<flatVocabParameter> A parameter that provides a list of choices. It does so by issuing a query to find the list of choices. The query must return two columns, "term" and "internal". The terms are shown to the user. The internal value is the value that is embedded into the query that uses this parameter. Set the multiChoice attribute to "true" to allow the user to choose more than one element from the list.
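
The term/internal pairing can be sketched as follows. This is illustrative Python, not WDK code: the function names and the sample rows are made up, and the sketch does not show how several selected values are embedded into a query.

```python
def vocab_choices(rows):
    """Map each displayed term to its internal value, keeping query order."""
    return {term: internal for term, internal in rows}


def internal_values(choices, selected_terms, multi_choice=False):
    """Translate the terms the user picked into internal values."""
    if len(selected_terms) > 1 and not multi_choice:
        raise ValueError("this parameter accepts only one choice")
    return [choices[term] for term in selected_terms]


# Rows as a vocabulary query might return them: (term, internal).
rows = [("Organism A", "101"), ("Organism B", "102")]
choices = vocab_choices(rows)
print(list(choices))                              # terms shown to the user
print(internal_values(choices, ["Organism B"]))   # value used in the query
```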

Records in the model

A record provides information about an entity in the database. The entity must have a primary key. The primary key is used in queries to retrieve information about the entity.

In the model, you define a <recordClass>. It is a template for creating records of a certain type. A record class has the following attributes:

idPrefix The value that will be prepended to the primary key of each record
type The type of this recordClass for display purposes
attributeOrdering comma-separated list of attributes that will be used to determine the order in which the columns of the contained records will be displayed. Attributes not specified in this ordering will be appended to ordered columns in no guaranteed order. Note that the primary key of the record will always be the first column displayed.
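
The effect of attributeOrdering can be sketched in Python. This is an illustration under stated assumptions, not the WDK's algorithm: the function name is hypothetical, and because the order of unlisted attributes is not guaranteed, the sketch simply keeps them in the order supplied.

```python
def ordered_columns(primary_key, attribute_names, attribute_ordering=""):
    """Primary key first, then the listed attributes, then the remainder."""
    listed = [n.strip() for n in attribute_ordering.split(",") if n.strip()]
    columns = [primary_key]
    for name in listed:
        if name in attribute_names and name not in columns:
            columns.append(name)
    # No order is guaranteed for unlisted attributes; this sketch
    # keeps the order in which they were supplied.
    columns += [n for n in attribute_names if n not in columns]
    return columns


print(ordered_columns("gene_id",
                      ["name", "organism", "length", "gene_id"],
                      "organism, name"))
```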

And a recordClass contains one or more of each of these kinds of elements:

<attributesQuery> a reference to a query that returns a single row of information about the record's entity. This query must return exactly one row or an error will be thrown. Use an attributes query to return information such as the entity's name, location, organism or any other attribute that the entity has only one of. Each value returned by this query is considered an "attribute" of the record. The "name" of the attribute is taken from the name of its column. An attributes query must take exactly one parameter, and its name must be "primaryKey". It will be passed the primary key of the record which contains the attributes query.
<tableQuery> a reference to a query that returns zero, one or more rows of information about the record's entity. Use a table query to return tabular information about the entity, such as the entity's Gene Ontology functions. A table query must take exactly one parameter, and its name must be "primaryKey". It will be passed the primary key of the record which contains the table query.
<textAttribute> a block of text that all records of this type will have. This may be a description of the kind of entity the record represents. The text can have the values of other attributes (including other text attributes) embedded in it. To do that, embed an attribute's name surrounded by '$$', for example: "This Gene is from organism $$organism$$."
<linkAttribute> an attribute that is a hyperlink. The <linkAttribute> has a visible="" attribute that describes how the link should be displayed. It also has a <url> element that describes what URL should be used. If your URL contains special characters, then enclose it in <![CDATA[ your url here... ]]>. Embed values from other attributes into the visible and url text by using $$attribute_name$$ as described in <textAttribute>. It is recommended that you place all URLs in the model properties file described in #The model properties file. Doing so places all URLs in one file so that they are easy to maintain (URLs are notorious for changing!). If you use the $$ mechanism to embed values in a URL that is stored in the properties file, you must escape each $ with \\. Also, you may want to hide the values that you embed in the URL from the user because they are internal identifiers. In that case, set the isInternal="true" attribute of the attribute or column that is supplying the internal value.
<nestedRecord> a record inside of this record. It is returned by an internal question that uses values from the containing record as parameters and returns one record to be embedded here.
<nestedRecordList> a list of records. Same as above except that the internal question may return more than one record.
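
The way a record gathers its attributes can be sketched in Python. This is not the WDK implementation: the function, the "primaryKey" dictionary key and the sample data are assumptions. The sketch enforces the exactly-one-row rule for an attributes query, names each attribute after its column, and fills a text attribute from the other attributes.

```python
import re


def build_record(primary_key, columns, rows, text_attributes):
    """Build one record from an attributes-query result and text templates."""
    if len(rows) != 1:
        raise ValueError(
            f"attributes query returned {len(rows)} rows for {primary_key}")
    # Each column name becomes the name of an attribute.
    record = dict(zip(columns, rows[0]))
    record["primaryKey"] = primary_key  # hypothetical key name
    for name, template in text_attributes.items():
        # Fill $$attribute$$ placeholders from the attributes gathered so far.
        record[name] = re.sub(r"\$\$(\w+)\$\$",
                              lambda m: str(record[m.group(1)]), template)
    return record


record = build_record(
    "G1",
    ["name", "organism"],
    [("geneA", "Organism A")],
    {"overview": "This Gene is from organism $$organism$$."})
print(record["overview"])
```

In this sketch a text attribute can refer to another text attribute only if that one was defined earlier in the dictionary.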

Questions in the model

A question is "a query that returns records." You define a question by pairing an "id query" and a record class. The id query defines the question asked, and returns a list of IDs of entities that match the query. The record class defines the type of record returned for each entity.

A question contains these attributes:

queryRef a reference to a query that will return a single column containing the primary keys of a set of entities.
recordClassRef a reference to a record that will describe each element of the query's result. The primary key that the record expects must match the type of primary keys returned by the query.
displayName the name of the question for display purposes
summaryAttributesRef comma-separated list of attributes that will be displayed in the question's summary. If this property is not set, the summary will display all attributes belonging to the question's query.

And, a question contains these elements:

description a helpful one sentence or so description of the question.
help a lengthy description of the question that provides help about it.
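
The pairing of an id query with a record class can be sketched in Python. This is an illustration, not WDK code: the function, the in-memory "database" and the parameter name min_length are hypothetical, and the default used when no summary attributes are listed is a simplification.

```python
def run_question(id_query, make_record, params, summary_attributes=None):
    """Run the id query, build a record per ID, and return summary rows."""
    summary = []
    for primary_key in id_query(params):
        record = make_record(primary_key)
        # With no summary list, show every attribute the record carries
        # (a simplification of the default described above).
        names = summary_attributes or list(record)
        summary.append([primary_key] + [record[n] for n in names])
    return summary


# Hypothetical data standing in for the database.
genes = {"G1": {"name": "geneA", "length": 1200},
         "G2": {"name": "geneB", "length": 300}}


def genes_by_min_length(params):
    # The id query: returns only primary keys of matching entities.
    return [k for k, g in genes.items() if g["length"] >= params["min_length"]]


summary = run_question(genes_by_min_length, genes.get,
                       {"min_length": 500}, ["name"])
print(summary)
```

Each row of the summary starts with the primary key, which a web page would render as the link to the full record.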

The detailed specification for a WDK model file

The WDK model file is validated against a "schema file". You can find the file in your installation at

$GUS_HOME/lib/rng/wdkModel.rng

The schema file provides the exact syntactic specification for what is allowed in a WDK model file. It is written in the RELAX NG (RNG) format.

The model properties file

The WDK model uses a properties file so that you can configure your model with values that may change over time. You should always use the properties file for changeable values, rather than "hard-coding" them in the model XML file. The properties file is in standard "properties file" format, for example:

ourDataVersion=1.20
ourProjectId=22
TAXON_URL=http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?name=\\$\\$taxon_name\\$\\$

You may embed any property you define in the properties file in your model XML file by surrounding the name of the property with '@', for example "@ourDataVersion@".
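
The '@' substitution can be illustrated with a short Python sketch. It is not the WDK's code: the function names are hypothetical, the sample text is made up, and the sketch reads simple name=value lines without handling the backslash escapes of the full properties file format.

```python
import re


def load_properties(text):
    """Read name=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            # Split on the first '=' only, so values may contain '='.
            name, _, value = line.partition("=")
            props[name.strip()] = value.strip()
    return props


def apply_properties(model_text, props):
    """Replace each @name@ in the model text with the property's value."""
    return re.sub(r"@(\w+)@", lambda m: props[m.group(1)], model_text)


props = load_properties("ourDataVersion=1.20\nourProjectId=22\n")
text = "Data release @ourDataVersion@ of project @ourProjectId@"
print(apply_properties(text, props))
```

A reference to a property that is not defined raises a KeyError in this sketch, which makes a misspelled property name easy to spot.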

Testing on the command line

Test your WDK model on the command line before you deploy it to a website.

wdkXml

Use the wdkXml command to parse your xml file. It will detect any syntactic problems, such as malformed xml, xml that does not conform to the schema file or invalid references.

% wdkXml

usage: wdkXml -model model_name

Parse and print out a WDK Model xml file.
Options:
 -model <model>   the name of the model.  This is used to find the Model XML file ($GUS_HOME/config/model_name.xml) the Model property file
                  ($GUS_HOME/config/model_name.prop) and the Model config file ($GUS_HOME/config/model_name-config.xml)

Here is an example using the toy model:

% wdkXml -model toyModel

wdkQuery

Use the wdkQuery command to run any query in your model XML file.

% wdkQuery

usage: wdkQuery -model model_name -query full_query_name [-dontCache]
                [-returnTable -rows start end | -returnSize | -showQuery] [-params
                param_1_name param_1_value ...]

Run a query found in a WDK Model xml file.  If run without -params,
displays the parameters for the specified query
Options:
 -model <model>     the name of the model.  This is used to find the Model XML file ($GUS_HOME/config/model_name.xml) the Model property file
                    ($GUS_HOME/config/model_name.prop) and the Model config file ($GUS_HOME/config/model_name-config.xml)
 -dontCache         Do not use the cache for this query (even if it is cache enabled).
 -params <params>   space delimited list of param_name param_value ....
 -query <query>     The full name (set.element) of the query to run.
 -returnSize        For pageable queries only: return the total size of the result.
 -returnTable       Place the result in a table and return the name of the table.
 -rows              For pageable queries only: provide the start and end rows to return.
 -showQuery         Show the query as it will be run (with parameter values in place).

Here is an example using the toy model:

% wdkQuery -model toyModel -query RnaIds.ByDbESTLib \
  -params NumEstLibs 6 AssemblyConsistency 80

wdkRecord

Use the wdkRecord command to test a record.

% wdkRecord

usage: wdkRecord -model model_name -record full_record_name -primaryKey primary_key [-project project_id]

Print a record found in a WDK Model xml file.
Options:
 -model <model>             the name of the model.  This is used to find the Model XML file ($GUS_HOME/config/model_name.xml) the Model property file ($GUS_HOME/config/model_name.prop) and the Model config file ($GUS_HOME/config/model_name-config.xml)
 -primaryKey <primaryKey>   The primary key of the record to find.
 -project <projectId>       The project the record belongs to.  Optional.
 -record <record>           The full name (set.element) of the record to print.

Here is an example using the toy model:

% wdkRecord -model toyModel -record RnaRecordClasses.RnaRecordClass -primaryKey 92484673

wdkSummary

Use the wdkSummary command to test a summary.

% wdkSummary

usage: wdkSummary -model model_name -question full_question_name -rows start end [-showQuery] -params param_1_name param_1_value ...

Print a summary found in a WDK Model xml file.
Options:
 -model <model>         the name of the model.  This is used to find the Model XML file ($GUS_HOME/config/model_name.xml) the Model property file    ($GUS_HOME/config/model_name.prop) and the Model config file ($GUS_HOME/config/model_name-config.xml)
 -params <params>       Space delimited list of param_name param_value ....
 -question <question>   The full name (set.element) of the question to run.
 -rows                  The start and end pairs of the summary rows to return
 -showQuery             Show the query as it will be run (with parameter values in place).

Here is an example using the toy model:

% wdkSummary -model toyModel -question RnaQuestions.ByNumSeqs -rows 1 20 \
  -params NumSeqs 10 ApiTaxon "Neospora caninum"

The model sanity test

The model sanity test exercises all the queries, records, and summaries (questions) in a model to make sure they run correctly and produce output within an expected range. You should always run a sanity test before you deploy your model, and typically run it frequently as you develop the model.

Creating a sanity test

A sanity test is an XML file consisting of a number of individual tests, one or more for each record and query in the model. You must include at least one test for each query and record in the model, or the sanity test will fail. You can provide more if you think they will be useful.

A sanity test contains three kinds of elements:

<sanityQuery>     A test of a query in the model
<sanityRecord>    A test of a record in the model
<sanityQuestion>  A test of a question (summary) in the model

A <sanityQuery> contains the following attributes:

ref              The name of the query, formatted as "querySetName.queryName".
minOutputLength  The minimum number of rows the query is expected to return when run with the given parameters. This value must be at least one (see "Sanity query restriction" below).
maxOutputLength  The maximum number of rows the query is expected to return when run with the given parameters.

A <sanityQuery> contains one or more <sanityParam> elements.

A <sanityParam> contains these attributes:

name   The name of the parameter required by the query
value  The value to be used for the parameter in the sanity test

A <sanityRecord> contains the following attributes:

twoPartName  The name of the record, formatted as "recordSetName.recordName"
primaryKey   The primary key of the entity that the record represents

A <sanityQuestion> contains the following attributes:

ref              The name of the question, formatted as "questionSetName.questionName".
pageStart        The start row of the summary to include in this page.
pageEnd          The end row of the summary to include in this page.
minOutputLength  The minimum number of rows the question is expected to return when run with the given parameters. This value must be at least one (see "Sanity query restriction" below).
maxOutputLength  The maximum number of rows the question is expected to return when run with the given parameters.

A <sanityQuestion> contains one or more <sanityParam> elements (see above).
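To make the element and attribute descriptions above concrete, here is a sketch of what a small sanity test file might look like, together with a few lines of Python that read it back. The element and attribute names come from the descriptions above, and the query, record, and question names are the toy model ones used elsewhere in this guide. The root element name <sanityModel> and the minOutputLength/maxOutputLength values are assumptions chosen for illustration; check the sanity test file shipped with the toy model for the exact layout.

```python
import xml.etree.ElementTree as ET

# Illustrative sanity test file. The root element name and the
# min/max output lengths are assumptions, not taken from the WDK.
SANITY_XML = """
<sanityModel>
  <sanityQuery ref="RnaIds.ByDbESTLib" minOutputLength="1" maxOutputLength="500">
    <sanityParam name="NumEstLibs" value="6"/>
    <sanityParam name="AssemblyConsistency" value="80"/>
  </sanityQuery>
  <sanityRecord twoPartName="RnaRecordClasses.RnaRecordClass" primaryKey="92484673"/>
  <sanityQuestion ref="RnaQuestions.ByNumSeqs" pageStart="1" pageEnd="20"
                  minOutputLength="1" maxOutputLength="500">
    <sanityParam name="NumSeqs" value="10"/>
    <sanityParam name="ApiTaxon" value="Neospora caninum"/>
  </sanityQuestion>
</sanityModel>
"""

def summarize(xml_text):
    """Return one (kind, name, params) tuple per test in the file."""
    root = ET.fromstring(xml_text.strip())
    tests = []
    for elem in root:
        # Queries and questions are named by 'ref', records by 'twoPartName'.
        name = elem.get("ref") or elem.get("twoPartName")
        params = {p.get("name"): p.get("value") for p in elem.findall("sanityParam")}
        tests.append((elem.tag, name, params))
    return tests

for kind, name, params in summarize(SANITY_XML):
    print(kind, name, params)
```

A listing like this is a quick way to compare the tests in a sanity file against the queries and records in the model, since every query and record needs at least one test.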

Running a sanity test

The sanity test will print the results of each provided test. When a test fails, you will be notified and be given a command to run to debug the failed query or record individually.

usage: wdkSanityTest -model model_name -verbose

Run a test on all queries and records in a wdk model, using a provided sanity model, to ensure that the course of development hasn't dramatically affected wdk functionality.
Options:
 -model <model>   the name of the model.  This is used to find the Model XML file ($GUS_HOME/config/model_name.xml), the Model property file   ($GUS_HOME/config/model_name.prop), the Sanity Test file ($GUS_HOME/config/model_name-sanity.xml) and the Model config file ($GUS_HOME/config/model_name-config.xml)
 -verbose         Print out more information while running test.

Here is an example using the toy model:

% wdkSanityTest -model toyModel

Sanity query restriction

A sanity test that runs a query must return at least one row. You must provide parameters that fulfill this requirement. This ensures that the query is capable of producing output. When a query is run outside of the sanity test, it may return zero rows.
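The rule can be pictured as a simple range check on the number of rows returned. The sketch below is a Python illustration under stated assumptions, not the code wdkSanityTest actually runs: the function name check_output_length is hypothetical, and rejecting a maximum smaller than the minimum is an extra check added for the example.

```python
def check_output_length(row_count, min_output_length, max_output_length):
    """Return True if row_count lies within the expected range.

    Mirrors the restriction above: the minimum must be at least one,
    so a test can never pass on an empty result.
    """
    if min_output_length < 1:
        raise ValueError("minOutputLength must be at least 1")
    if max_output_length < min_output_length:
        # Assumption: an inverted range is treated as a mistake in the test.
        raise ValueError("maxOutputLength must not be less than minOutputLength")
    return min_output_length <= row_count <= max_output_length

print(check_output_length(42, 1, 100))   # True: within the expected range
print(check_output_length(0, 1, 100))    # False: the query returned no rows
print(check_output_length(250, 1, 100))  # False: more rows than expected
```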

Configuring and customizing the view

The WDK provides a "default" view that will display any model you create. It also allows you to brand the default view, as described here.

Logo               Provide a logo in your_webapp_directory/WEB-INF/images
Stylesheet         Provide the file your_webapp_directory/WEB-INF/misc/style.css
Header and footer  Provide a header.jsp and footer.jsp in your_webapp_directory/WEB-INF/tags/site. Make sure they point to your logo.
Model              Edit your_webapp_directory/WEB-INF/web.xml to use the model file you created

If you are not satisfied with the default view, you can customize your view as follows:

* Display questionSets either as a flattened list of questions
  (cp your_webapp_directory/index.jsp.flat to your_webapp_directory/index.jsp) or as drop-down lists, each
  representing a questionSet (cp your_webapp_directory/index.jsp.sets to your_webapp_directory/index.jsp).

* Place custom pages under your_webapp_directory/customPages/ so that they are used instead of the default:
     I) site wide customizations:
        questionSets.jsp      --  customQuestionSets.jsp
        question.jsp          --  customQuestion.jsp
        questionSetsFlat.jsp  --  customQuestionSetsFlat.jsp
        summary.jsp           --  customSummary.jsp
        record.jsp            --  customRecord.jsp
     II) record-type specific customizations:
        question.jsp          --  <RecordClassSet>.<RecordClass>.question.jsp
        summary.jsp           --  <RecordClassSet>.<RecordClass>.summary.jsp
        record.jsp            --  <RecordClassSet>.<RecordClass>.jsp
     III) question-name specific customizations:
        question.jsp          --  <QuestionSet>.<Question>.jsp
        summary.jsp           --  <QuestionSet>.<Question>.summary.jsp
     Note: 1) there is no .question in <QuestionSet>.<Question>.jsp
           2) there is no .record in <RecordClassSet>.<RecordClass>.jsp
           3) more specific customizations take precedence when more than one
              level of customization exists for the same default view.

For example, you will find the following file in the WDKToySite:

$PROJECT_HOME/WDKToySite/Site/webapp/custom_view/EstRecordClasses.EstRecordClass.jsp
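The naming scheme and precedence rule for custom pages can be sketched as a lookup that tries the most specific name first. This is a Python illustration only, not the WDK's own resolution code. The function names are hypothetical, and the ordering used here (question-name pages before record-type pages before site-wide pages) is an assumption about what "more specific" means.

```python
# Site-wide custom page names, keyed by the default page they replace.
SITE_WIDE = {
    "questionSets.jsp": "customQuestionSets.jsp",
    "question.jsp": "customQuestion.jsp",
    "questionSetsFlat.jsp": "customQuestionSetsFlat.jsp",
    "summary.jsp": "customSummary.jsp",
    "record.jsp": "customRecord.jsp",
}
# Suffixes appended to "<RecordClassSet>.<RecordClass>".
RECORD_SUFFIX = {"question.jsp": ".question.jsp", "summary.jsp": ".summary.jsp", "record.jsp": ".jsp"}
# Suffixes appended to "<QuestionSet>.<Question>".
QUESTION_SUFFIX = {"question.jsp": ".jsp", "summary.jsp": ".summary.jsp"}

def candidate_pages(default_page, record_class=None, question=None):
    """List page names to try, most specific first, ending with the default."""
    candidates = []
    if question and default_page in QUESTION_SUFFIX:
        candidates.append(question + QUESTION_SUFFIX[default_page])
    if record_class and default_page in RECORD_SUFFIX:
        candidates.append(record_class + RECORD_SUFFIX[default_page])
    if default_page in SITE_WIDE:
        candidates.append(SITE_WIDE[default_page])
    candidates.append(default_page)
    return candidates

def resolve(default_page, existing, record_class=None, question=None):
    """Pick the first candidate present in 'existing'; otherwise the default."""
    for page in candidate_pages(default_page, record_class, question):
        if page in existing:
            return page
    return default_page

# Hypothetical contents of the custom pages directory.
existing = {"EstRecordClasses.EstRecordClass.jsp", "customRecord.jsp"}
print(resolve("record.jsp", existing, record_class="EstRecordClasses.EstRecordClass"))
# prints: EstRecordClasses.EstRecordClass.jsp
print(resolve("record.jsp", existing, record_class="RnaRecordClasses.RnaRecordClass"))
# prints: customRecord.jsp
print(resolve("summary.jsp", existing, question="RnaQuestions.ByNumSeqs"))
# prints: summary.jsp
```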

Upcoming features

Model

Query history

The ability to view queries run in the past and to use boolean operations to combine their answers.

Query context

The query context stores parameter values to apply universally to all queries. The canonical example of this is "taxon." The context is applied to parameters that belong to the user's queries and also to lower-level queries, such as those used by controlled vocabulary parameters. This way, controlled vocabularies reflect the context.

The model will make a query context available for reading and writing.

Hints in records

A design goal of the WDK is that it supply a generic view that can display a model with minimal configuration. More advanced WDK users will write particular views (in .jsp) for particular record types.

In order to support the generic view, the model needs to provide some display hints in Records.

Sample values for parameters

We would like query dialogues to show sample values for parameters. This means that Parameter will gain a getSampleValue() property. The tricky part is how to set that value. We have been thinking that a model XML file could be reusable across projects and/or sites. This suggests that sample values should probably be set in the model properties file. (This solution ignores the complication that a given parameter may be reused by many queries, and so might want to have context-specific sample values.)

View

Customization such as configurable/declarative page navigation control flow.