Usage - TCP

TCP (Transductive Conformal Prediction) is so far only available for classification, with liblinear as the underlying model implementation.
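
For readers new to the transductive setting, the following is a minimal sketch of how a label-conditional (Mondrian) p-value can be computed: the test example is added to the examples of a candidate class, every example in that augmented bag gets a nonconformity score, and the p-value is the fraction of scores at least as large as the test example's. The class name TcpSketch, the one-dimensional feature and the distance-to-class-mean nonconformity measure are all assumptions made for illustration. This is not CPSign's implementation, which derives its scores from the liblinear model.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the CPSign API.
final class TcpSketch {

    /** Mondrian p-value for assigning the candidate label to testX. */
    static double pValue(double[] xs, String[] ys, double testX, String candidate) {
        // Augmented bag: training examples of the candidate class plus the test example
        List<Double> bag = new ArrayList<>();
        for (int i = 0; i < xs.length; i++) {
            if (ys[i].equals(candidate)) {
                bag.add(xs[i]);
            }
        }
        bag.add(testX);

        // Toy nonconformity measure (an assumption): distance to the bag mean
        double mean = bag.stream().mapToDouble(Double::doubleValue).average().orElse(testX);
        double testScore = Math.abs(testX - mean);

        // Count the examples that are at least as nonconforming as the test example
        int atLeastAsStrange = 0;
        for (double x : bag) {
            if (Math.abs(x - mean) >= testScore) {
                atLeastAsStrange++;
            }
        }
        return (double) atLeastAsStrange / bag.size();
    }

    public static void main(String[] args) {
        double[] xs = {1.0, 1.2, 0.8, 5.0, 5.2, 4.8};   // made-up training features
        String[] ys = {"0", "0", "0", "1", "1", "1"};   // made-up class labels
        System.out.println("p(0) = " + pValue(xs, ys, 1.1, "0"));
        System.out.println("p(1) = " + pValue(xs, ys, 1.1, "1"));
    }
}
```

Because the scores are recomputed with the test example included, no separate calibration set is needed, which is the main difference from the inductive (ACP/CCP) setting.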

Instantiation

As only classification mode and liblinear are available so far, instantiation is straightforward:

CPSignFactory factory = new CPSignFactory(license);
TCPClassification tcpImpl = factory.createTCPLibLinear();
// 1 and 3 are the signatures start and end heights
SignaturesCPClassification tcp = factory.createSignaturesCPClassification(tcpImpl, 1, 3);

The call to createSignaturesCPClassification is the same as when using ACP classification: it takes a CPClassificationModel as input, together with the signatures start and end heights.

Loading data & Predict

Once instantiated, the SignaturesCPClassification object offers a set of utility functions for loading data from different file types (addModel, fromChemFile and fromMolsIterator). CPSign allows you to load data from multiple files and formats by calling fromMolsIterator and fromChemFile multiple times. Once data is loaded, you can predict new molecules to retrieve p-values for the classes, or to find which signature was the most important for the classification of the new molecule:

// From an SDF or JSON file, the name of the property to model must be supplied
List<String> labels = Arrays.asList("0", "1");
String property = "class";
tcp.fromChemFile(dataFile.toURI(), property, labels);
// From SMILES-file, no property-name is needed (if response value is in second column)
tcp.fromChemFile(dataFile.toURI(), null, labels);

IAtomContainer testMol = ... // Use the Utility Methods in section "Utility Methods"
Map<String,Double> pVals = tcp.predictMondrian(testMol);
SignificantSignature ss = tcp.predictSignificantSignature(testMol);
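
The p-values returned by predictMondrian are typically turned into a prediction set by keeping every label whose p-value exceeds the significance level (1 - confidence). The sketch below shows that step on a plain Map<String,Double>. The class PredictionSets, the method predictionSet and the p-values themselves are made up for illustration and are not part of the CPSign API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper; not part of the CPSign API.
final class PredictionSets {

    /** Returns the labels whose p-value is greater than 1 - confidence. */
    static Set<String> predictionSet(Map<String, Double> pVals, double confidence) {
        double significance = 1.0 - confidence;
        Set<String> labels = new TreeSet<>();
        for (Map.Entry<String, Double> entry : pVals.entrySet()) {
            if (entry.getValue() > significance) {
                labels.add(entry.getKey());
            }
        }
        return labels;
    }

    public static void main(String[] args) {
        // Made-up p-values standing in for the output of a prediction
        Map<String, Double> pVals = new LinkedHashMap<>();
        pVals.put("0", 0.03);
        pVals.put("1", 0.62);

        System.out.println("80% confidence: " + predictionSet(pVals, 0.80));
        System.out.println("99% confidence: " + predictionSet(pVals, 0.99));
    }
}
```

A higher confidence level can only grow the set: at 80% confidence only the label "1" is kept in this example, while at 99% both labels are kept.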

fromChemFile

If you pass an SDF or JSON file to fromChemFile, you must also give the name of the property in which the activity of the molecules is recorded. For a SMILES file, you can pass null as the property if the activity is in the second column; if it is in a different column, pass the header of that column as the property. See the SMILES file format page for the requirements placed on SMILES files.
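
The column rule for SMILES files can be sketched as follows: a null property selects the second column, and any other value is looked up among the column headers. The class SmilesColumns and the method activityColumn are hypothetical. The sketch assumes a tab-delimited file with a header line, which may differ from what CPSign's own parser requires.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not CPSign's actual SMILES parser.
final class SmilesColumns {

    /**
     * Returns the zero-based index of the activity column.
     * Assumes a tab-delimited header line (an assumption of this sketch).
     */
    static int activityColumn(String headerLine, String property) {
        List<String> headers = Arrays.asList(headerLine.split("\t"));
        if (property == null) {
            // No property given: fall back to the second column
            if (headers.size() < 2) {
                throw new IllegalArgumentException("File has no second column");
            }
            return 1;
        }
        int index = headers.indexOf(property);
        if (index < 0) {
            throw new IllegalArgumentException("No column named '" + property + "'");
        }
        return index;
    }

    public static void main(String[] args) {
        String header = "SMILES\tclass\tsolubility";   // made-up header line
        System.out.println(activityColumn(header, null));
        System.out.println(activityColumn(header, "solubility"));
    }
}
```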

Saving and loading models

A TCP model can be stored as precomputed data (in contrast to ACP/CCP models, where we usually want to store the trained ICP models). Saving the precomputed data still saves time compared to regenerating the signatures every time you use the same dataset.

tcp.fromChemFile( dataFile.toURI(), property, labels );

tcp.saveModel( precomputedModel, compress );
// or
tcp.saveModelEncrypted( encryptedPrecomputedModel, encryptionSpec );

..
// Load the previously saved model
tcpNew.addModel( precomputedModel, null );
// or
tcpNew.addModel( encryptedPrecomputedModel, encryptionSpec );

Image generation

To get visual results from the predictions (i.e. of the significant signature), please refer to the Image rendering page.