Usage - TCP
TCP (Transductive Conformal Prediction) is so far only available for classification, using liblinear as the underlying model implementation.
Because classification with liblinear is the only available combination, the instantiation process is straightforward:
    CPSignFactory factory = new CPSignFactory(license);
    TCPClassification tcpImpl = factory.createTCPLibLinear();
    SignaturesCPClassification signTCP = factory.createSignaturesCPClassification(tcpImpl, 1, 3);
The call to createSignaturesCPClassification is the same as when using ACP classification: it takes a CPClassificationModel as input (here the TCP implementation), followed by the signatures start and end heights (1 and 3 in the example above).
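As background on what a transductive conformal predictor computes, the sketch below walks through the general idea on toy data: the test example is added to the training set once per candidate label, a nonconformity score is computed for every example, and the p-value for that label is the fraction of same-label examples (Mondrian, i.e. class-conditional) that are at least as nonconforming as the test example. This is an illustration only and not CPSign's implementation: the class name `TcpSketch`, its methods, and the distance-to-class-mean nonconformity measure are assumptions chosen for brevity, whereas CPSign uses liblinear on signature descriptors.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TcpSketch {

    /**
     * Hypothetical nonconformity measure: Euclidean distance from example i
     * to the mean of all OTHER examples that carry the same label.
     */
    static double nonconformity(List<double[]> xs, List<String> ys, int i) {
        int dim = xs.get(i).length;
        double[] sum = new double[dim];
        int others = 0;
        for (int j = 0; j < xs.size(); j++) {
            if (j == i || !ys.get(j).equals(ys.get(i))) continue;
            for (int d = 0; d < dim; d++) sum[d] += xs.get(j)[d];
            others++;
        }
        // No other example with this label: treat as maximally strange
        if (others == 0) return Double.POSITIVE_INFINITY;
        double squared = 0.0;
        for (int d = 0; d < dim; d++) {
            double diff = xs.get(i)[d] - sum[d] / others;
            squared += diff * diff;
        }
        return Math.sqrt(squared);
    }

    /** Returns one class-conditional p-value per candidate label. */
    static Map<String, Double> predictMondrian(List<double[]> trainX, List<String> trainY,
                                               List<String> labels, double[] testX) {
        Map<String, Double> pValues = new LinkedHashMap<>();
        for (String label : labels) {
            // Transductive step: tentatively add the test example with this label
            List<double[]> xs = new ArrayList<>(trainX);
            List<String> ys = new ArrayList<>(trainY);
            xs.add(testX);
            ys.add(label);
            double testScore = nonconformity(xs, ys, xs.size() - 1);

            int sameLabel = 0;
            int atLeastAsStrange = 0;
            for (int i = 0; i < xs.size(); i++) {
                if (!ys.get(i).equals(label)) continue; // Mondrian: same class only
                sameLabel++;
                if (nonconformity(xs, ys, i) >= testScore) atLeastAsStrange++;
            }
            pValues.put(label, (double) atLeastAsStrange / sameLabel);
        }
        return pValues;
    }

    /** Toy data: class "0" lies around 1.0, class "1" around 11.0. */
    static Map<String, Double> demo() {
        List<double[]> trainX = Arrays.asList(
                new double[]{0.0}, new double[]{1.0}, new double[]{2.0},
                new double[]{10.0}, new double[]{11.0}, new double[]{12.0});
        List<String> trainY = Arrays.asList("0", "0", "0", "1", "1", "1");
        return predictMondrian(trainX, trainY, Arrays.asList("0", "1"), new double[]{1.5});
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints {0=0.75, 1=0.25}
    }
}
```

Because every prediction recomputes the scores with the test example included, no model is trained ahead of time in this setting; the cost is paid at prediction time instead.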
Loading data & Predict
Once instantiated, the SignaturesCPClassification object offers a set of utility functions for loading data from different file formats.
CPSign allows you to load data from multiple files and formats by calling fromChemFile multiple times.
Once data is loaded, new predictions can be performed to retrieve p-values for classes or
find which signature was the most important for the classification of the new molecule:
    // tcp is the SignaturesCPClassification instance
    // From an SDF or JSON file, the property name to model must be supplied
    List<String> labels = Arrays.asList("0", "1");
    String property = "class";
    tcp.fromChemFile(dataFile.toURI(), property, labels);

    // From a SMILES file, no property name is needed (if the response value is in the second column)
    tcp.fromChemFile(dataFile.toURI(), null, labels);

    IAtomContainer testMol = ... // Use the Utility Methods in section "Utility Methods"
    Map<String,Double> pVals = tcp.predictMondrian(testMol);
    SignificantSignature ss = tcp.predictSignificantSignature(testMol);
If you pass an SDF file or JSON file to fromChemFile, you also need to give the property name under which the activity of the molecules is recorded. For a SMILES file, you can pass null as the property if the desired activity is in the second column of the file. If the desired activity is in a different column, pass the header of that column as the property.
Read SMILES file format to see what requirements we put on SMILES files.
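The column-selection rule described above can be illustrated with a small stand-alone sketch. It is not CPSign's parser: the class `SmilesColumnSketch` and its methods are hypothetical, and it assumes a tab-delimited file with a header row and the SMILES string in the first column, which may differ from the actual format requirements.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SmilesColumnSketch {

    /**
     * Picks the activity column: index 1 (the second column) when property
     * is null, otherwise the column whose header equals property.
     */
    static int activityColumn(String headerLine, String property) {
        if (property == null) return 1;
        String[] headers = headerLine.split("\t");
        for (int i = 0; i < headers.length; i++) {
            if (headers[i].trim().equals(property)) return i;
        }
        throw new IllegalArgumentException("No column named " + property);
    }

    /** Returns the activity value of every record (assumes the first line is a header). */
    static List<String> readActivities(List<String> lines, String property) {
        int column = activityColumn(lines.get(0), property);
        List<String> activities = new ArrayList<>();
        for (String line : lines.subList(1, lines.size())) {
            if (line.trim().isEmpty()) continue; // skip blank lines
            activities.add(line.split("\t")[column].trim());
        }
        return activities;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "SMILES\tclass\tsource",
                "CCO\t0\ta",
                "c1ccccc1\t1\tb");
        System.out.println(readActivities(lines, null));     // second column by default
        System.out.println(readActivities(lines, "source")); // column selected by header
    }
}
```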
Saving and loading models
A TCP model can be stored as precomputed data (in contrast to ACP/CCP models, where we usually want to store the trained ICP models). Although this does not store any trained models, saving the precomputed data still saves some time compared to regenerating the signatures every time you use the same dataset.
    tcp.fromChemFile( dataFile.toURI(), property, labels );
    tcp.saveModel( precomputedModel, compress );
    // or
    tcp.saveModelEncrypted( encryptedPrecomputedModel, encryptionSpec );

    // ...

    // Load the previously saved model
    tcpNew.addModel( precomputedModel, null );
    // or
    tcpNew.addModel( encryptedPrecomputedModel, encryptionSpec );