Installing GUS from ScratchSeptember 15, 2003This GUS documentation section details the steps in installing and initializing GUS on a newly created Oracle database. The database may be empty. It is important to keep in mind that the GUS distribution and its contents are under development -- occasionally a bug will slip through testing but not slip through your installation undetected.The instructions outlined here are general. Experiences to date with these instructions have found no system related issues. A number of the steps outlined here and additional details appear at the GUS Platform Web Site for setup notes and schema setup. 1.0 Creating the Oracle DatabaseThe GUS database may be created in Oracle using the dbca utitlity. Select the data warehouse option. For other parameters the defaults will work, but some parameters can clearly be changed for the better, such as percent memory utilization. A report by Jessica Kissinger discusses database configuration.1.1 Creating Required GUS Users in OracleAssume now that a database has been created in Oracle, it is empty without tables and users, and has the name gus. Two GUS accounts must be created before going further in the installation. One account requires read/write access to the database, and the other has only read access. SQL to establish the accounts gusrw and gusdevreadonly is
The accounts require privileges as provided by
the following SQL commands in the user initialization script.
These commands grant privileges to the GUS accounts. The commands may
be placed in a script (say, userinit.sql) and executed
at the SQLPLUS prompt as @userinit.sql.
This script is not part of the GUS distribution.
In the next step tablespaces are created for the
GUS namespaces CORE, SRES, DoTS, TESS and RAD3,
along with index tables for each. This can be accomplished with
the SQL commands in the table initialization script.
These commands create the relational tables for the GUS namespaces.
This script is not part of the GUS distribution.
This section gives instructions to install the GUS distribution from
the CVS repository distribution into gus, the Oracle database created above.
GUS installation includes the GUS schema, users, and
the CBIL distribution, also distributed with GUS. The GUS installation
consists of a source part and an executable part.
The source part is rooted in the directory called $PROJECT_HOME.
Installation is performed from $PROJECT_HOME, with
"executable GUS" going to $GUS_HOME.
The installation is interrupted at a couple of places
to update parameter files in $GUS_HOME,
as discussed below in building GUS.
First, there are some additional details to attend to.
The GUS distribution can be installed in any account, but it can be handy to dedicate an account for GUS installation and management. For example, a GUS development account and a GUS production account could be useful. Each of account has its own Oracle database and Linux user account. Key locations are pointed to by shell environment variables. This snippet from from a .tcshrc script initializes the environment variables used by GUS. Note the requirements for Java and Ant. For further details about the use of these systems see the GUS Platform Web Site. 2.2 Obtaining and installing the GUS distributionThe GUS distribution and required CBIL project distribution can be downloaded from the Sanger Institute. The complete distributions can be obtained as three tarballs:
After setting up one's shell environment, one can readily unpack the tar files by executing:
2.3 GUS properties fileThe .gus.properties file defines database information in your environment. Pointed to by the environment variable $GUS_CONFIG_FILE, .gus.properties stores information pertaining to the two Oracle databases: GUS read/write and GUS readonly. The contents of an operational .gus.properties file (minus the actual passwords) is given below.
The GUS properties file can be configured by copying the sample file and editing the contents to indicate the location of Perl on your system:
2.4 Building GUSNow one can instruct the build system to install both itself (the "install" project) and also to install the principal GUS project. There is a small workaround required though. The script $PROJECT_HOME/GUS/build.xml references database tables prior to installation of the Oracle tables, but the script that builds the tables scripts is {\tt build.xml}. So, we make a two pass install of GUS by first removing the database accesses in build.xml. These accesses occur in two places as of this writing in the following build.xml code fragments (line numbers are given from the release as of this writing) for GUS object generation:
116 〈exec executable="generateGusObjects"
117 failonerror="true"〉
118 〈arg value="--javaOrPerl=java"/〉
119 〈/exec〉
192 〈exec executable="generateGusObjects"
193 failonerror="true"〉
194 〈arg value="--javaOrPerl=perl"/〉
195 〈exec〉
Here is the build process with the work around. Note that three calls to build.xml are currently required.
2.5 Initializing GUS Database ContentsThe process of moving data into and out of GUS uses plugins. Plugins are Perl modules registered with the GUS database by storing the plugin's version number and checksum, and other information. Plugins are always executed by the program ga: ga must be registered with the database before you can use GUS. However, before ga can be registered and GUS populated, there are several tables that require values.2.6 Registering GUS namespace usersThe .gus.properties file defines for GUS username, group and project entries. Whatever the definition of these entries in the properties file, they must be in the Core.UserInfo, Core.GroupInfo, and Core.ProjectInfo tables. During installation, GUS creates default values for these properties, so that one can simply define the user, group and project in .gus.properties to be the same as the values established by the GUS installation, namely:
2.7 Populating the Machine TableThe Core.Machine table requires an entry that describes the machine where GUS is running. The following line inserted into the Core.Machine table will do the trick:insert into core.machine values(0, 'unknown', NULL, 1, 0, SYSDATE, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1);The attributes corresponding to the values for Core.Machine can be gleaned from the GUS schema browser. 2.8 Registering gaThe first step is to establish the plugin manager's property file. This file just contains the location of md5sum (for calculating checksums of plugins) on your system. The following will establish this:
2.9 Registering the SubmitRow pluginNow that GUS is initialized, we will perform a partial check of the database by registering the broadly useful plugin SubmitRow. We use ga for this as follows
ga +create GUS::Common::Plugin::SubmitRow
... check output and if looks okay, commit.
ga +create GUS::Common::Plugin::SubmitRow --commit
The two stage approach for running ga allows one to check the output
for problems without committing the changes to the database.
GUS basics are initialized at this point.
This is a good point to look at some of the tables created in
the installation and initialization process. The next section
tells how to populate other tables with information fundamental
to GUS, for example the GO ontology, sequence types, and NCBIs taxonomy.
|