ToxOtis


ToxOtis is a Java interface to the predictive toxicology services of OpenTox. ToxOtis is being developed to help both those who need a painless way to consume OpenTox web services and ambitious service providers who do not want to spend half of their time on RDF parsing and creation, database management and security measures.

You can use ToxOtis to search databases of chemical compounds, download a compound in any supported MIME type (e.g. SDF, SMILES, MOL, etc.), find a property of a compound (e.g. its LD50 lethal dose concentration), publish your chemicals in an online database, train QSAR models (regression, classification, clustering, etc.) and much more. Incorporating ToxOtis into your services will relieve you of the labour of creating RDF documents.

About

ToxOtis is a Java API for accessing the OpenTox network of web services. Check out our blog for news on ToxOtis. Javadoc for version 0.7.1 is now available from here.

Documentation, including many examples, can also be found on-line.


Introduction

OpenTox components are the core elements in ToxOtis. These are all the entities manipulated in OpenTox, each of which has a corresponding representation in RDF, i.e. a standard representation of the data model that describes it. Examples of such components are Algorithms, Models, Tasks and Datasets. As far as their ontological nature and the corresponding RESTful API are concerned, you can find detailed documentation at the OpenTox site. From a programmatic point of view, all classes in org.opentox.toxotis.core and in org.opentox.toxotis.core.component subclass OTComponent and implement IOTComponent. OTComponent is an abstract class holding a URI and a MetaInfo field for all its subclasses; a very useful abstract method is included in this class, public abstract Individual asIndividual(OntModel model), which is implemented by all subclasses of OTComponent and allows users to get an RDF representation straight from the component (we provide some explanatory code snippets in the sequel). Other intermediate levels of abstraction are available, such as OTOnlineResource and OTPublishable. All components are characterized by their meta information, which consists of a subset of the Dublin Core properties, some RDFS and OWL properties, and a couple of OpenTox-specific properties (like ot:hasSource).
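
For instance, here is a minimal sketch of how asIndividual can be used to obtain an RDF representation of a component (this assumes Jena's ModelFactory is used to create the OntModel; the metadata below are made up for illustration):

Feature f = new Feature();
f.getMeta().setTitle("My toxicity feature");          // illustrative meta information
OntModel model = ModelFactory.createOntologyModel();  // an empty Jena ontological model
f.asIndividual(model);                                // write the component into the model
model.write(System.out, "RDF/XML");                   // serialize the RDF document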

Unless you are already acquainted with the OpenTox components and API, take some time to read some basic things here.

Authentication and Authorization with ToxOtis

About Single Sign-On

In OpenTox, access is controlled by an SSO (Single Sign-On) server (based on Sun’s openSSO). Single sign-on can control access to systems based on any distributed architecture. Different services contact an SSO server to authenticate a client and ask for permission for a given request. More on single sign-on can be found on Wikipedia. Read also our blog posts on Access Control in OpenTox and A&A for opentox.ntua.gr.


Authentication and Authorization scheme in OpenTox

The OpenTox A&A API is documented at http://opentox.org/dev/apis/api-1.1/AA. If you don’t have an account on OpenTox, you should head over to the registration form.

ToxOtis Authentication and Authorization

Acquire an Authentication Token

Once you provide your credentials to the SSO server, you acquire an authentication token. This will be used to authenticate yourself against any web service in OpenTox (if needed) and get permission to perform an operation. This is easily accomplished in ToxOtis. You simply provide your credentials to the AuthenticationToken constructor (javadoc):

AuthenticationToken at = new AuthenticationToken("JohnSmith","mysecretPass111");

If you have a password file (read next section), you can use it to acquire an authentication token:

File passwordFile = new File("/path/to/my_secret.key");
AuthenticationToken at = new AuthenticationToken(passwordFile);

For security reasons, every token has a certain lifetime after which it becomes invalid, so even if someone malevolently obtains your token, they will not have access to any OpenTox web service after a certain time. For the same reason, it is considered good practice to invalidate your tokens (i.e. log out) when you do not intend to use them any more. It is advisable to add a shutdown hook (javadoc) in your application which invalidates all tokens before the application exits. According to the OpenTox specifications, it is up to the client to monitor and manage tokens according to their lifetime. A collection of methods is available in AuthenticationToken that return the creation timestamp of the token as well as its status. A token is characterized as ACTIVE, INACTIVE or DEAD. A token should be used only if it is ACTIVE. A token is INACTIVE if it has either expired or been invalidated, and DEAD if it has not yet been initialized (the user was not authenticated). So the output of the following code:

AuthenticationToken token = new AuthenticationToken("JohnSmith","mysecretPass111");
TokenStatus statusBefore = token.getStatus();
token.invalidate();
TokenStatus statusAfter = token.getStatus();
System.out.println(statusBefore + ", " +  statusAfter);

will be:

ACTIVE, INACTIVE
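
As suggested above, a shutdown hook can take care of invalidating your tokens before the application exits; here is a minimal sketch (assuming the token is reachable as a final variable or a field):

final AuthenticationToken token = new AuthenticationToken("JohnSmith", "mysecretPass111");
Runtime.getRuntime().addShutdownHook(new Thread() {
	@Override
	public void run() {
		try {
			token.invalidate(); // log out when the JVM shuts down
		} catch (Exception ex) {
			// Not much can be done at shutdown time
		}
	}
});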

Advanced Security Issues

In general, it is not good practice to store unencrypted passwords in your program or in your database. However, if you need to store your username and password so that you can easily use them, ToxOtis offers an encryption utility: PasswordFileManager. First of all, you have to create a (private) master key and store it in a file. You should make this file hidden and modify its permissions so that only your application has access to it. The file should look like this:

--- START MASTER KEY ---
fEFWQ1FRUVdSXXxhOVBnazQyKy8vUzRcPWFfM2tmKjE
ajoTmn7ieV1qfb3645fFqa2MowkmMmP3Xg0A1gCRjTp
D96r3MEhKC89EAfpNG3hIKVxi4JBtyBxWySJIiidJX3
De6mx2tYqTJgyC8g83141qf27p59z5P51lw7VQ8E55n
wFr3T53y4WMW1nW5CN77C6oP832C2EtjUwR381ms6T3
P96y1NGm7I78k3sb4efDT462xVVUA8OU461u22T2v78
x3Mt6591855xKP65vQWn730jY889w47h9Fm0h6zYS04
--- END MASTER KEY ---


You should use the password generator of ToxOtis to create a good and valid master key. Here is an example of using PasswordFileManager for this purpose:

Thread createPasswordFile = new Thread() {
	@Override
	public void run() {
		try {
			PasswordFileManager.CRYPTO.
				createMasterPasswordFile("/dev/random", "/home/user/toxotisKeys/master.key", 500);
		} catch (IOException ex) {
			// Handle the exception properly!
		}
	}
};
Executors.newFixedThreadPool(1).submit(createPasswordFile);

Here is the output of this method:

----- ToxOtis Pasword Generator -----
Random number generator : /dev/random
Password file           : /home/chung/toxotisKeys/master.key
Password Stength        : EXCELENT

The class PasswordFileManager is Observable (source code) (javadoc), so you can monitor the progress of the password generation (that is why we wrap the creation of the master password file in a Thread). This process might take a long time, especially if you choose a good random number generator such as /dev/random on Linux. In case you do not provide a random number source (null), java.security.SecureRandom (source code) (javadoc) will be used instead (this is the RNG implementation provided by Java). In that case the method will print:

----- ToxOtis Pasword Generator -----
Random number generator : Secure RNG (java.security.SecureRandom)
Password file           : /home/chung/Desktop/alt.key
Password Stength        : EXCELECT (100)

While it is running, you can monitor the progress of the key creation:

while (true) {
	if (PasswordFileManager.CRYPTO.hasChanged()) {
		System.out.println(PasswordFileManager.CRYPTO.getPasswordGenerationProgress());
	}
	if (PasswordFileManager.CRYPTO.getPasswordGenerationProgress() == 100) {
		break;
	}
}

Suppose now that your username is JohnSmith and your password is s3cret. Then you can use the above master password file to create an encrypted file for your credentials:

PasswordFileManager.CRYPTO.setMasterPasswordFile("/home/user/toxotisKeys/master.key");
PasswordFileManager.CRYPTO.createPasswordFile("JohnSmith", "s3cret", "/home/john/.hidpass/.my.key");

This will create a file with your credentials at the specified destination, that is, /home/john/.hidpass/.my.key (we suggest that this file be hidden). Your private key will look like the following:

--- START PRIVATE KEY ---
/EXEudbuXSmvp2SrNI6iewwq==
2SSiPLZuCMLlz81=
--- END PRIVATE KEY ---

Now you can remove any line of source code that contains your plaintext credentials and use the generated encrypted file to authenticate yourself. Here is an example:

File passwordFile = new File("/home/john/.hidpass/.my.key");
AuthenticationToken at = new AuthenticationToken(passwordFile);

or alternatively:

AuthenticationToken at = PasswordFileManager.CRYPTO.authFromFile("/home/john/.hidpass/.my.key");

Authentication tokens are of high importance in ToxOtis, as they are necessary for most server-client data transactions.

Validate and Log out

The local status of a token can be retrieved using the method: AuthenticationToken#getStatus(). This checks whether the token has timed out. It is however more reliable to validate your token against an SSO server. For this purpose you should use the method AuthenticationToken#validate() : boolean. The method will return true if the token is valid and false otherwise.

AuthenticationToken at = PasswordFileManager.CRYPTO.authFromFile("/home/john/.hidpass/.my.key");
boolean isValid = at.validate();

If you need to discard your token so that it will not be active any more, you can invalidate it using the method AuthenticationToken#invalidate().

AuthenticationToken at = PasswordFileManager.CRYPTO.authFromFile("/home/john/.hidpass/.my.key");
// ... use your token ...
at.invalidate(); // Log out
boolean isValid = at.validate(); // isValid is false

Retrieve user information from a token

You can use a token to obtain information about the user who created it, i.e. the user whose username and password were provided to acquire it. This information is returned as an instance of User (javadoc). Here is an example:

AuthenticationToken at = PasswordFileManager.CRYPTO.authFromFile("/home/john/.hidpass/.my.key");
User user = at.getUser();
System.out.println(user); 

This will print:

UID   : john
Name  : John Smith
Mail  : john@smith.org
Pass  : {SSHA}FZLdpBMyrOO8SCYU7TeQY1JWAleotAVi7482

Users are advised to invalidate their tokens when they no longer need them in other A&A sessions. Moreover, if you need to create a new token, make sure you have invalidated your old one first.

Managing your Tokens: Token Pool

Obtaining a new token each time authentication/authorization is required is not good practice, in terms of both performance and security. ToxOtis comes with a token management utility (javadoc) that allows multiple users to be logged in but prevents a single user from obtaining multiple tokens. Once a user logs in, their token is stored in the pool. If they attempt to log in again while the stored token has not expired, no new token is obtained; the existing one is returned by the method. Here is an example:

TokenPool tokenPool = TokenPool.getInstance();
for (int i =0; i < 10; i++){
   tokenPool.login("/path/to/my.key");
}
System.out.println(tokenPool.size());

The above code will print 1, not 10!

Benchmarking

The following results are based on 30 successive measurements. The measurements were carried out on a Linux machine (2.6.31-22-generic kernel, x86_64 GNU/Linux) with 3.8GB of RAM and an Intel Core 2 Duo CPU P8700 @2.53GHz. The ToxOtis SDK (version 0.4.2.23), which includes Weka version 3.6.2 (the latest stable version at the time) and Jena version 2.6.2, was used to perform the measurements. These libraries were run on a Sun™ JVM, version 1.6.0.20, with the Java™ SE Runtime Environment (build 1.6.0.20-b02). All measurements are in milliseconds (ms).

Average ping time for opensso.in-silico.ch: 57.9 ms (0% packet dropout)
Authentication using a password file: 131.96 ms
Average invocation time of AuthenticationToken#validate(): 79.0 ms
Average invocation time of AuthenticationToken#invalidate(): 72.1 ms
Average invocation time of AuthenticationToken#getUser(): 154.4 ms
Authorization: 184.1 ms

Parse OpenTox Entities with ToxOtis

Introduction

Using ToxOtis one can parse remote OpenTox entities by providing their URI, or even OpenTox resources that are stored in some local file. Behind the scenes, ToxOtis downloads the RDF representation of the resource and parses it into some instance of OTOnlineResource. For this purpose, the user has two tools: the abstract method loadFromRemote defined in OTOnlineResource (doc), and a set of spiders (doc), which are more powerful but also require a higher level of acquaintance with RDF and Jena (a library for parsing and editing RDF documents in Java). The ToxOtis API for downloading and parsing OpenTox resources is intertwined with the OpenTox A&A API, so in many cases users will need to provide their authentication token.


Download a Component

All subclasses of OTOnlineResource in ToxOtis, such as Compounds, Features, Algorithms and Models, can be downloaded from a remote location into some local resource such as a file or a variable (e.g. a String), or in general be directed to some output stream or written to some generic destination using a Writer. The prototype methods are:

void download(String destination, Media media, AuthenticationToken token) throws ToxOtisException;
void download(OutputStream destination, Media media, AuthenticationToken token) throws ToxOtisException;
void download(File destination, Media media, AuthenticationToken token) throws ToxOtisException;
void download(Writer destination, Media media, AuthenticationToken token) throws ToxOtisException;

This way, one can download the MOL representation of a compound and write it into a file. Here is an example of use:

Compound comp = new Compound(new VRI(Services.IDEACONSULT).augment("compound","10"));
File destination = new File("/path/to/file.mol");
comp.download(destination, Media.CHEMICAL_MDLMOL, (AuthenticationToken)null);

Parsing Components

Before proceeding to the next sections, users are advised to take a look at the documentation about the implementation of OpenTox components in ToxOtis.

Algorithms

A predefined collection of OpenTox algorithms is available within the class OpenToxAlgorithms (doc). You can load the algorithm data from the remote location using the method loadFromRemote defined in Algorithm (doc). Here is an example:

Algorithm myAlg = new Algorithm(OpenToxAlgorithms.TUM_KNN_CLASSIFICATION.getServiceVri());
// This will load into your object all information found at the remote location:
myAlg.loadFromRemote();
System.out.println(myAlg.getMeta());

The above source code will print the following to the System standard output:

identifier  : http://opentox.informatik.tu-muenchen.de:8080/OpenTox-dev/algorithm/kNNclassification^^string
title       : kNNclassification^^string
description : OpenTox REST interface to the WEKA k-Nearest Neighbor learning algorithm. 
              Can select appropriate value of K based on cross-validation. Can also do distance weighting.^^string
date        : Mon Sep 13 20:19:24 EEST 2010^^dateTime
creator     : tobias.girschick@in.tum.de^^string

If the algorithm is a protected resource you will have to authenticate yourself against that algorithm service providing an authentication token (doc). Here is an example:

Algorithm myAlg = new Algorithm(OpenToxAlgorithms.NTUA_MLR.getServiceVri());
AuthenticationToken at = PasswordFileManager.CRYPTO.authFromFile("./.secret/.my_secret.key");
// This will load into your object all information found at the remote location:
myAlg.loadFromRemote(at);

Datasets

The following example illustrates how to use a DatasetSpider (doc) to download and parse a dataset from a remote server:

VRI vri = new VRI(Services.IDEACONSULT.augment("dataset","5"));
// Require that the dataset will contain no more than 10 compounds
final int size = 10;
vri.addUrlParameter("max", size);
DatasetSpider spider = new DatasetSpider(vri);
Dataset ds = spider.parse();

Now we can use this Dataset object to inspect its data entries and values:

DataEntry de = ds.getDataEntries().get(2);
FeatureValue fv = de.getFeatureValue(0);
System.out.println(de.getConformer().getUri());
System.out.println(fv.getFeature().getUri() + " = " + fv.getValue());

The above code will print the following message to the System’s standard output:

http://apps.ideaconsult.net:8080/ambit2/compound/2554/conformer/327497
http://apps.ideaconsult.net:8080/ambit2/feature/20083 = 100-01-6^^string

Alternatively you can of course use the implementation of the method loadFromRemote() in Dataset (doc). Here is an example:

VRI vri = new VRI(Services.AMBIT_UNI_PLOVDIV.augment("dataset","9"));
Dataset ds = new Dataset(vri);
ds.loadFromRemote();

This will parse into the object ds the data downloaded from the URI: ambit.uni-plovdiv.bg:8080/ambit2/dataset/9.

Error Reports

Error Reports (code) (doc) have been part of the OpenTox API since version 1.1. Error Reports define a formal way to handle exceptional situations while invoking a service or during inter-service communication, thus facilitating debugging. They are sufficiently documented online at opentox.org/dev/apis/api-1.1/Error Reports. Error Reports are parsed in much the same way as the entities mentioned above. The only difference is that the URL that hosts the error report differs from the IRI that describes the report in the returned RDF graph. So, if you choose to use a spider for parsing an Error Report, you have to be careful with the initialization: the standard constructor for a spider, ErrorReportSpider(Resource resource, Model model), will probably throw an error if you provide the wrong resource. This is why you should prefer the constructor ErrorReportSpider(URI actorUri, Model ontModel) (doc), where you provide the URI of the actor of the exception and not the RDF node directly. Here is an example to clear up any misunderstanding:

VRI uri = new VRI(Services.NTUA.augment("algorithm", "mlr"));
GetClient client = new GetClient();
client.setUri(uri);
OntModel model = client.getResponseOntModel();
ErrorReportSpider spider = new ErrorReportSpider(uri, model);
ErrorReport er = spider.parse();

Error reports also appear in ToxOtisException (doc). When a ToxOtis Exception is thrown due to some exception thrown by a remote service, the Error Report from that service is incorporated into the exception. Here is an example:

VRI uri = new VRI(Services.NTUA.augment("algorithm", "mlr"));
try {
     new AlgorithmSpider(uri);
} catch (ToxOtisException tox) {
     System.out.println(tox.getRemoteErrorReport());
}

This will print to the System’s output the following text:

URI    : http://opentox.ntua.gr:3000/errorReport/#2390078396
Actor  : http://opentox.ntua.gr:3000/algorithm/mlr
Code   : AuthenticationFailed
Status : 403

Models

This is an example of how a user can download and parse an OpenTox Model (code) (doc) from a remote location:

VRI vri = new VRI(Services.NTUA.augment("model","f9a97443-6baf-4361-a55c-b08cf12c3e39"));
ModelSpider mSpider = new ModelSpider(vri);
Model m = mSpider.parse();

The above code downloads the model from opentox.ntua.gr and creates the object m: Model. The same can be accomplished using a Model object exclusively. Here is an alternative way:

VRI vri = new VRI(Services.tumDev().augment("model","TUMOpenToxModel_j48_7"));
Model m = new Model(vri);
m.loadFromRemote();

Tasks

A Task is parsed as simply as any other OpenTox component. You simply have to provide its URI and invoke the method loadFromRemote() or, in case authentication is needed, loadFromRemote(AuthenticationToken). Here is an example of use:

VRI vri = new VRI("http://opentox.ntua.gr:3000/task/0fc060a0-f69b-4a81-bb2e-b9b32c8a04b3");
Task t = new Task(vri).loadFromRemote();
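
If the task is still running on the remote server, it can be polled until it completes, much like in the dataset publishing example later in this document (a sketch):

while (Task.Status.RUNNING.equals(t.getHasStatus())) {
	Thread.sleep(1000);  // wait a little before polling again
	t.loadFromRemote();  // refresh the task from the remote service
}
System.out.println(t.getResultUri());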

Compounds

Compounds, as they are represented in OpenTox, do not provide much information that can be parsed from their RDF representation, so the API for Compounds is formulated so that meaningful information is returned to the user. First of all, users can obtain the set of conformers (if any) that a compound groups and delegates to. In cases where the 3D characteristics of the compound are not taken into account, conformers do not play a particular role; otherwise the exact conformer has to be determined. The set of these conformers is available through the method Set<Conformer> listConformers(AuthenticationToken token) throws ToxOtisException. What is more, one can download the compound and store it in some supported chemical media type like SDF or MOL, as explained in the previous section. Here is an example of downloading the SD file of a given compound:

Compound c = new Compound(Services.IDEACONSULT.augment("compound","100"));
c.download(new File("/path/to/file.sdf"), Media.CHEMICAL_MDLSDF, null);
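
The conformers grouped by this compound can then be listed using the method mentioned above; a sketch (the null token assumes a publicly accessible compound service):

Set<Conformer> conformers = c.listConformers(null); // null: no authentication token provided
for (Conformer conformer : conformers) {
	System.out.println(conformer.getUri());
}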

The compound and conformer APIs in ToxOtis are taking just their first steps so there is not much functionality in there yet.

ToxOtis and Weka

Converting a Dataset into an Instances object

Source Code Example

ToxOtis can be used in combination with Weka (stable version 3.6.2), a well-known open-source machine learning package for Java. Using ToxOtis you can convert your datasets into instances of weka.core.Instances, which in turn can be used in some filtering, training or other data processing procedure. Here is an example of downloading a dataset and creating a corresponding weka.core.Instances object.

VRI vri = new VRI(Services.IDEACONSULT.augment("dataset","9"));
Dataset ds = new Dataset(vri);
ds.loadFromRemote();
weka.core.Instances data = ds.getInstances();

Printing the resulting Instances object (e.g. with System.out.println(data)) will produce output like the following:

@relation http://apps.ideaconsult.net:8080/ambit2/dataset/54

@attribute compound_uri string
@attribute http://apps.ideaconsult.net:8080/ambit2/feature/22202 numeric
@attribute http://apps.ideaconsult.net:8080/ambit2/feature/22197 string
@attribute http://apps.ideaconsult.net:8080/ambit2/feature/22201 numeric
@attribute http://apps.ideaconsult.net:8080/ambit2/feature/22196 string
@attribute http://apps.ideaconsult.net:8080/ambit2/feature/22200 numeric
@attribute http://apps.ideaconsult.net:8080/ambit2/feature/22198 numeric
@attribute http://apps.ideaconsult.net:8080/ambit2/feature/22199 numeric

@data
http://apps.ideaconsult.net:8080/ambit2/compound/261/conformer/419588,...
      113.730003,chloramphenicol,3.7508,Molecule-1,-4.69,0.2812,1.14
http://apps.ideaconsult.net:8080/ambit2/compound/116508/conformer/419581,...
      54.27,artemisinin,2.746,Molecule-1,-4.52,0.0667,2.22
...

As you can see, the Instances object retains the links (URIs) to the dataset from which it was created as well as the feature URIs. Unfortunately, Instances objects are just data wrappers and were not designed to serve as data models, so the meta information about the dataset and its contained features and compounds is not found in this object.

Benchmarking

In the following tables, the computational times needed to convert a Dataset object into an instance of weka.core.Instances are summarized. These results are based on 10 successive measurements. The measurements were carried out on a Linux machine (2.6.31-22-generic kernel, x86_64 GNU/Linux) with 3.8GB of RAM and an Intel Core 2 Duo CPU P8700 @2.53GHz. The ToxOtis SDK (version 0.1.1.13), which includes Weka version 3.6.2 (the latest stable version at the time) and Jena version 2.6.2, was used to perform the measurements. These libraries were run on a Sun™ JVM, version 1.6.0.20, with the Java™ SE Runtime Environment (build 1.6.0.20-b02).

Table 1. Measurements on fragments of the dataset created from http://apps.ideaconsult.net:8080/ambit2/dataset/9 with 21 features and up to 1000 chemical compounds.

No. Compounds    Average time (ms)
100 2670
200 4896
500 10959
800 18661
1000 21132

Table 2. Measurements on fragments of the dataset created from http://apps.ideaconsult.net:8080/ambit2/dataset/10 with 60 features and up to 1000 chemical compounds.

No. Compounds    Average time (ms)
100 5607
200 8622
500 19714
800 31511
1000 41513

Creating a Dataset from a Weka object or File

The reverse process of converting a Weka entity (either an ARFF file or an Instances object) into a ToxOtis Dataset component is accomplished using the static methods of the class DatasetFactory. It takes just one line of source code; here is an example:

Instances myInstances = ...; // This is your Instances object
Dataset myDataset = DatasetFactory.createFromArff(myInstances);

You can also use "DatasetFactory":http://github.com/alphaville/ToxOtis/blob/master/src/org/opentox/toxotis/factory/DatasetFactory.java to construct a Dataset object from an ARFF file or some ARFF InputStream (might be also an Input Stream from a remote location). Here is an example:

File myFile = new File("/path/to/my.arff");
Dataset ds = DatasetFactory.createFromArff(myFile);

References related to ToxOtis and Weka

  1. Use weka in your Java code
  2. Create an Instances object using Weka
  3. Weka Home Page
  4. Weka API (for the latest stable version)
  5. OpenTox API


Publish your Component

POSTing a Component

A client can publish some OpenTox component, that is, obtain a URI for its resource at some publicly available location, using the POST HTTP method. According to the OpenTox REST API, clients create new resources and acquire a URI for them by POSTing an RDF representation of these resources to an appropriate service. There are a couple of notes we need to make here. First, the entity is always POSTed as an RDF document containing all the necessary information that describes the OpenTox component in a formal way according to the specifications of the OpenTox ontology. Second, as far as the service response is concerned, the following status codes are possible:

  1. 200 A new resource is created; the URI of the created resource(s) is(are) contained in the response body.
  2. 202 An asynchronous job was initiated and a Task is returned to the client to monitor the progress of the procedure.
  3. 400 The representation that was POSTed by the client is malformed (syntactic or semantic errors) or some other parameters in the request are inappropriate.
  4. 404 The location at which the client attempted to POST the entity was not found on the server.
  5. 500 Internal Server Error
  6. 503 The service is temporarily unavailable: The resource is created but not yet available

Publishable components in ToxOtis subclass OTPublishable (doc) which defines two abstract methods: Task publishOnline(VRI,AuthenticationToken) and Task publishOnline(AuthenticationToken). We copy here the documentation for the first of these two that allows users to POST their components to a specified server:

/**
 * Publish the component to a proper server identified by the uri of the
 * publishing service provided in this method. The resource will be posted to the
 * server in RDF format (Mediatype: application/rdf+xml).
 * @param token
 *      Provide an authentication token. If you think that the service does not
 *      require authentication/authorization, you can leave this field <code>null</code> or
 *      you can provide an empty authentication token. If the provided URI
 *      already contains an authentication token (as the URL parameter <code>
 *      tokenid</code>) it will be replaced by the new token provided to
 *      this method.
 * @return
 *      A Task for monitoring the progress of your request. If the service
 *      returns the URI of the resource right away and does not return a task,
 *      then the object you will receive from this method will now have an identifier,
 *      its status will be set to {@link Task.Status#COMPLETED }, its progress
 *      will be set to <code>100%</code> and the URI of the created resource will
 *      be available applying the method {@link Task#getResultUri() } on the returned
 *      task. In any case, the service's response will be wrapped in a {@link Task }
 *      object.
 * @throws ToxOtisException
 *      In case of invalid credentials, if the POSTed resource is not acceptable
 *      by the remote service (returns a status code 400), communication error
 *      occur with the remote server or other connection problems or the access
 *      to the service was denied (401 or 403).
 */
public abstract Task publishOnline(VRI vri, AuthenticationToken token) throws ToxOtisException;

Examples

Publish a Feature

Using the ToxOtis API, one can create a new feature and publish it to some feature server. Here is an example:

Feature f = new Feature();
f.setUnits("m^4*mA*s^2*kg^-2");
f.getMeta().setTitle("Toxicity of my city");
f.getMeta().setHasSource("http://otherserver.net:8283/opentox/model/15451");
f.getMeta().setSameAs("http://www.youtube.com/watch?v=WMKmQmkJ9gg");
Task t = f.publishOnline(Services.AMBIT_UNI_PLOVDIV.augment("feature"), null);
System.out.println(t.getResultUri());

This will print a feature URI to the standard output of your System.

Publish a bibliographic reference

We can publish a bibliographic reference as we did with features. Here is an example where a BibTeX object (doc) is created and published online:

BibTeX bib = new BibTeX(); // ...Create anonymous bibtex
bib.setAuthor("Chung W.");
bib.setTitle("The truth about UFOs");
bib.setVolume(100);
bib.setJournal("International Journal of Conspiracy Theory");
bib.setCrossref("http://localhost:3000/bibtex/549a9f40-9758-44b3-90fe-db31fe1a1a01");
bib.setBibType(BibTeX.BIB_TYPE.Article);
Task t = bib.publishOnline(Services.NTUA.augment("bibtex"), null);

Publish a dataset

POSTing a dataset always creates a new resource. A task URI is usually returned to the client (with HTTP status 202) for monitoring the progress of the upload. In the following example a dataset is downloaded from a remote server and POSTed to some other dataset server. In particular, only the first 5 compounds of the dataset are requested using the URL query ?max=N.

VRI vri = new VRI(Services.IDEACONSULT.augment("dataset", "54").addUrlParameter("max", "5"));
Dataset ds = new Dataset(vri).loadFromRemote();
Task t = ds.publishOnline(Services.AMBIT_UNI_PLOVDIV.augment("dataset"), null);
System.out.println(t.getHasStatus());
while (t.getHasStatus().equals(Task.Status.RUNNING)) {
     t.loadFromRemote();
     Thread.sleep(100);
}
System.out.println(t.getResultUri());

The above example will POST the dataset as application/rdf+xml to the dataset server at ambit.uni-plovdiv.bg and monitor the returned task. The dataset http://ambit.uni-plovdiv.bg:8080/ambit2/dataset/64 has been created by running the above example code.

In some cases it might be more convenient to get the background job as a Future<VRI> instead of a Task, or even to assign the background job to a certain Java ExecutorService. For this purpose, ToxOtis provides the method publish(VRI, AuthenticationToken): Future<VRI>, which returns a Future<VRI> for a Callable that runs on a single-thread executor. Here is an example:

VRI vri = new VRI(Services.ideaconsult().augment("dataset", "54").addUrlParameter("max", "5"));
Dataset ds = new Dataset(vri).loadFromRemote();        
Future<VRI> t = ds.publish(Services.ambitUniPlovdiv().augment("dataset"),(AuthenticationToken)null);
System.out.println(t.get());

As already mentioned, users can assign the task to a certain ExecutorService (doc):

ExecutorService myExecutor = Executors.newFixedThreadPool(10);
VRI vri = new VRI(Services.ideaconsult().augment("dataset", "54").addUrlParameter("max", "5"));
Dataset ds = new Dataset(vri).loadFromRemote();
Future<VRI> t = ds.publish(Services.ambitUniPlovdiv().augment("dataset"),
                     (AuthenticationToken)null, myExecutor);
while (!t.isDone()){
  // Do something while waiting for the result
}
VRI result = t.get();

Publish a compound

Publishing compounds works exactly the same way as it does for datasets and BibTeX entries, as already mentioned. In the case of compounds, however, users are additionally able to create new compounds from SD files, MOL files, SMILES strings and other chemical formats. Such files are well known in chemo- and bioinformatics, and many scientists maintain databases of them. The related method lies in CompoundFactory and is static, since it is not an operation applied on some specific instance of Compound but in general creates a new compound on an online service. The output of this method is a Task that allows us to monitor the progress of the upload. Here is an example of use where we create a new compound from an SD file:

File myFile = new File("/path/to/opentoxin.sdf");
CompoundFactory factory = CompoundFactory.getInstance();
Task task = factory.publishFromFile(myFile, Media.CHEMICAL_MDLSDF.getMime(), (AuthenticationToken)null);


Factories

In this section we present the functionality provided by the methods of the factory classes of ToxOtis that reside in the package org.opentox.toxotis.factory. They contain static methods that either create OTComponent objects or (for the sake of simplicity and performance) just return a URI or a collection of URIs. Note that, for the same reason, these classes do not follow the Factory design pattern to the letter but in some cases return just pointers to the objects they create (i.e., in our case, their URIs). In the following sections we go through each factory, providing examples of use:

Overview

The package org.opentox.toxotis.factory includes 3 factory classes:

  1. CompoundFactory (javadoc) : Provides methods for publishing compounds using local representations from a file (e.g. SDF).
  2. DatasetFactory (javadoc) : Provides methods for creating a Dataset object using weka objects (Instance, Instances) or ARFF files.
  3. FeatureFactory (javadoc) : Provides remote database lookup methods

Examples

FeatureFactory

List all features stored in a remote feature service

The method #listAllFeatures in FeatureFactory returns the URIs of all features stored in a specified remote feature service. Users can subsequently use these URIs to download and parse some of the features if necessary. The maximum number of returned URIs can be specified in advance to avoid huge lists of URIs (see also). Here is a simple example:

Set<VRI> featureUris = FeatureFactory.listAllFeatures(Services.ambitUniPlovdiv().augment("feature"), 10, null);

The above call will return a set of at most 10 feature URIs. If you need all features, it suffices to set the max parameter to -1, that is:

Set<VRI> allFeatureUris = FeatureFactory.listAllFeatures(Services.ambitUniPlovdiv().augment("feature"), -1, null);

If paging is supported by the remote service, then you can specify the page length and page index while getting the list of features:

Set<VRI> featureUris = FeatureFactory.listAllFeatures(Services.ambitUniPlovdiv().augment("feature"), 3, 10, null);

The above code requests the 3rd page, with a page length of 10. Here is a possible list of features:

http://apps.ideaconsult.net:8080/ambit2/feature/20089
http://apps.ideaconsult.net:8080/ambit2/feature/20088
http://apps.ideaconsult.net:8080/ambit2/feature/20087
http://apps.ideaconsult.net:8080/ambit2/feature/20086
http://apps.ideaconsult.net:8080/ambit2/feature/20085
http://apps.ideaconsult.net:8080/ambit2/feature/20084
http://apps.ideaconsult.net:8080/ambit2/feature/20091
http://apps.ideaconsult.net:8080/ambit2/feature/20090
http://apps.ideaconsult.net:8080/ambit2/feature/20093
http://apps.ideaconsult.net:8080/ambit2/feature/20092

Search for a feature using the OT feature ontology based on the ECHA specifications

You can perform a database lookup on a remote feature service by providing your search criteria in a very convenient way. For example, say you need to get a list of all features that are owl:sameAs the dissociation constant pKa (otee:Dissociation_constant_pKa). Then, using the method lookupSameAs, one has:

Set<VRI> features = FeatureFactory.lookupSameAs(OTEchaEndpoints.DissociationConstantPKa(), null);
for (VRI f : features) {
    System.out.println(f.toString());
}

A list of all ECHA endpoints is provided by OTEchaEndpoints (javadoc), and a collection of some common features is available through OTFeatures (javadoc).

This factory also allows new features to be created and POSTed to a feature service for publication in a single line of code. Proper authentication/authorization is required most of the time. The corresponding method is especially useful when developing model training web services, where a prediction feature needs to be created for the model. Here is an example:

Model m = ...;
Feature predictedFeature = FeatureFactory.createAndPublishFeature(
                    "Feature created as prediction feature for the RBF NN model "+m.getUri(),
                    new ResourceValue(m.getUri(), OTClasses.Model()), featureService, token);

DatasetFactory

DatasetFactory (doc) is a class with static methods that facilitates dataset creation and the conversion of ARFF files and weka.core.Instances objects into Datasets. Using a File, an InputStream, a Reader or an Instances object as a source, a new Dataset can be created. Moreover, there is a method to create a single DataEntry instance out of a weka.core.Instance object. Here we provide two examples that users might stumble upon during their development. First, the ordinary case of reading from a file:

String filePath = "/path/to/your_file.arff"; // << Your path here!
java.io.File file = new java.io.File(filePath);
Dataset myDataset = DatasetFactory.createFromArff(file);

The above source code will generate a Dataset out of the given ARFF file, or it will throw a ToxOtisException in case the ARFF file you provided is not compliant with the ToxOtis requirements. For more details, please read the ToxOtis documentation about Weka.
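
Here is a minimal sketch of guarding against a non-compliant ARFF file (the path is hypothetical):

try {
	Dataset myDataset = DatasetFactory.createFromArff(new java.io.File("/path/to/your_file.arff"));
	// ... work with the dataset ...
} catch (ToxOtisException ex) {
	// The ARFF file does not meet the ToxOtis requirements
}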

The second use case concerns the creation of a Dataset object out of an online resource where the ARFF representation is available via content negotiation, i.e. when the client specifies the header ‘Accept: text/x-arff’. Here is an example:

Dataset myDataset = null;
IGetClient client = ClientFactory.createGetClient(null); // the VRI of the remote ARFF resource should be provided here instead of null
client.setMediaType(Media.WEKA_ARFF);
try {
    int code = client.getResponseCode();
    if (code == 200) {
        InputStream stream = client.getRemoteStream();
        myDataset = DatasetFactory.createFromArff(stream);
    } else {
        // Handle Exceptional Event
    }
} catch (IOException ex) {
    // Handle Exceptional Event
} finally {
    try {
        client.close();
    } catch (IOException ex) {
        // Cannot close client...
    }
}


References

ToxOtis Source Code and Downloadables

  1. Github - ToxOtis hosted by github
  2. Sourceforge - ToxOtis on Sourceforge


Documentation

  1. OpenTox - Central project site
  2. ToxOtis Javadoc
  3. ToxOtis-DB The ToxOtis persistence module
  4. http://opentox.ntua.gr - The OpenTox-NTUA web site

Implementations by NTUA

  1. Maven Repository hosting the whole ToxOtis suite.
  2. http://opentox.ntua.gr:8080 - JAQPOT3: OpenTox web services (asynchronous model training and prediction) developed at NTUA.
  3. AMBIT2 - OpenTox web service (datasets, compounds, algorithms, models)

Publications

  1. Hardy B., et al., Collaborative development of predictive toxicology applications, Journal of Cheminformatics 2010, 2:7, doi:10.1186/1758-2946-2-7


Cite ToxOtis

Please, cite ToxOtis as follows:

P. Sopasakis and H. Sarimveis (2013), ToxOtis: A Java Interface to the OpenTox Predictive Toxicology Network, 4th International CEMEPE and SECOTOX Conference, Mykonos, Greece.

BibTeX:

@incollection{SopSar13a,
  title = {ToxOtis: A Java Interface to the OpenTox Predictive Toxicology Network},
  author = {Sopasakis, P. and Sarimveis, H.},
  year = {2013},
  booktitle = {4th International CEMEPE and SECOTOX Conference},
  address = {Mykonos, Greece},
  url = {http://www.cemepe4.prd.uth.gr/}
}

The poster of this presentation can be found here.
