All Classes and Interfaces

Class
Description
 
 
 
 
 
 
 
 
This class represents an object containing useful statistics for an attribute of certain data.
 
Averages per-target reliability scores.
Averages per-target scores, but ignores zeros.
 
 
 
 
 
 
This class represents the distance between values of a certain attribute type.
 
 
 
This class could be used in any kind of classification setting (e.g., hierarchical multi-label) and basically stores the statistics that enable us to compute the number of TP, TN, FP and FN (T = true, F = false, P = positives, N = negatives) for any given threshold.
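A minimal sketch (not the CLUS API; the class and array names are hypothetical) of how TP, TN, FP and FN can be counted for a single threshold, given per-example scores and true labels:

    // Counts TP, TN, FP, FN for one threshold; scores[i] is the predicted score of
    // example i and positive[i] tells whether that example is truly positive.
    public final class ThresholdCounts {
        public static int[] count(double[] scores, boolean[] positive, double threshold) {
            int tp = 0, tn = 0, fp = 0, fn = 0;
            for (int i = 0; i < scores.length; i++) {
                boolean predictedPositive = scores[i] >= threshold;
                if (predictedPositive && positive[i]) tp++;
                else if (predictedPositive) fp++;
                else if (positive[i]) fn++;
                else tn++;
            }
            return new int[] { tp, tn, fp, fn };
        }
    }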
 
 
 
 
 
 
 
 
 
Threshold calibration method that chooses the threshold minimizing the difference in label cardinality between the training data and the predictions for the test data.
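A minimal sketch of this calibration idea (hypothetical names, not the CLUS implementation), assuming per-example, per-label scores for the test data and a candidate list of thresholds:

    // Picks the threshold whose predicted label cardinality on the test data is
    // closest to the label cardinality observed on the training data.
    public final class CardinalityCalibration {
        public static double calibrate(double[][] testScores, double trainCardinality, double[] candidateThresholds) {
            double best = candidateThresholds[0];
            double bestDiff = Double.POSITIVE_INFINITY;
            for (double t : candidateThresholds) {
                double predictedLabels = 0.0;
                for (double[] example : testScores) {
                    for (double score : example) {
                        if (score >= t) predictedLabels++;
                    }
                }
                double predictedCardinality = predictedLabels / testScores.length;
                double diff = Math.abs(predictedCardinality - trainCardinality);
                if (diff < bestDiff) {
                    bestDiff = diff;
                    best = t;
                }
            }
            return best;
        }
    }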
 
 
 
 
 
 
 
 
 
Calculates the confidence of a prediction for multi-target classification as follows: for each target, take the maximum classification probability (i.e., the majority class) among the possible class values.
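A minimal sketch of this confidence score (the probs[target][classValue] layout is an assumption, not the CLUS representation):

    // Per-target confidence is the probability of the majority (most probable) class value.
    public final class MajorityConfidence {
        public static double[] perTargetConfidence(double[][] probs) {
            double[] confidence = new double[probs.length];
            for (int t = 0; t < probs.length; t++) {
                double max = 0.0;
                for (double p : probs[t]) {
                    max = Math.max(max, p);
                }
                confidence[t] = max;
            }
            return confidence;
        }
    }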
 
 
 
 
 
 
Classification statistics about the data.
 
Cloner: deep clone objects.
Thrown if cloning fails.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Random tree depths for different iterations, used for tree-to-rules optimization procedures.
 
 
 
Writes the predictions from the ensemble to a separate file, together with their standard deviations from the voting procedure and the respective votes from each base classifier.
 
 
 
 
 
 
 
 
 
 
 
Ensemble of decision trees.
 
 
 
Created by Vanja Mileski on 12/15/2016.
Created by Vanja Mileski on 12/16/2016.
Subclasses should implement public ClusModel induceSingleUnpruned(ClusRun cr); in addition, subclasses may also want to implement public void induceAll(ClusRun cr) to return more than one model.
For each type of algorithm there should be a ClusClassifier object.
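A schematic skeleton of the induction-algorithm contract described above; the stand-in ClusModel and ClusRun interfaces are placeholders so the sketch is self-contained, and the default induceAll behaviour is an assumption:

    interface ClusModel { }
    interface ClusRun { }

    abstract class InductionAlgorithmSketch {
        // Required by subclasses: induce a single unpruned model.
        public abstract ClusModel induceSingleUnpruned(ClusRun cr) throws Exception;

        // Optional: induce (and store) more than one model; here it simply
        // falls back to inducing a single unpruned model.
        public void induceAll(ClusRun cr) throws Exception {
            induceSingleUnpruned(cr);
        }
    }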
 
 
This is the logging class.
 
 
 
 
 
 
 
 
 
 
Clus number formatter that formats numbers whose absolute value is at least 1 differently from the rest.
 
 
Class that holds OOB weights.
 
 
Class for outputting the training and testing results to the .out file.
 
 
 
 
 
 
 
 
Class that holds OOB weights for ROS ensembles.
 
 
 
 
Create rules by decision tree ensemble algorithms (forests).
 
 
 
 
 
 
 
 
 
 
 
A linear term that has been included in the rule set.
 
 
Helper class for SLS algorithm weights.
Class representing a set of predictive clustering rules.
Creates one rule for each value of each nominal attribute.
Rule set created from a tree.
 
 
 
 
 
Self-training that operates without a confidence score. Implemented on the basis of: Culp and Michailidis, "An iterative algorithm for extending learners to a semi-supervised setting", Journal of Computational and Graphical Statistics, 2008.
 
 
 
 
 
 
 
 
 
 
Statistics about the data set.
Statistics manager. Includes information about target attributes, weights, etc.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Differential evolution algorithm.
 
Class representing a DE individual.
Class representing the population.
Class representing a differential evolution optimization problem.
 
 
 
 
 
 
 
 
 
 
Parent class of the DoubleBooleanCount that stores statistics that are used when building ROC- and PR-curves.
Class that stores prediction statistics that are used when building ROC- and PR-curves.
 
 
 
Structure that contains two doubles.
 
 
 
 
 
A class that wraps a given Enumeration and returns only the subset of its elements that passes a certain filter.
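A minimal sketch of such a filtered Enumeration (not the CLUS class; java.util.function.Predicate is used here as the filter type):

    import java.util.Enumeration;
    import java.util.NoSuchElementException;
    import java.util.function.Predicate;

    // Wraps an Enumeration and only hands out the elements that satisfy the filter.
    public final class FilteredEnumeration<E> implements Enumeration<E> {
        private final Enumeration<E> source;
        private final Predicate<E> filter;
        private E next;
        private boolean hasNext;

        public FilteredEnumeration(Enumeration<E> source, Predicate<E> filter) {
            this.source = source;
            this.filter = filter;
            advance();
        }

        // Moves to the next element that passes the filter, if any.
        private void advance() {
            hasNext = false;
            while (source.hasMoreElements()) {
                E candidate = source.nextElement();
                if (filter.test(candidate)) {
                    next = candidate;
                    hasNext = true;
                    return;
                }
            }
        }

        @Override public boolean hasMoreElements() { return hasNext; }

        @Override public E nextElement() {
            if (!hasNext) throw new NoSuchElementException();
            E result = next;
            advance();
            return result;
        }
    }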
 
 
 
 
EuclideanDistance works on all types of attributes.
Functions to evaluate predictions.
 
A filter class to filter files by extension.
 
 
 
 
 
 
 
 
 
 
 
 
 
A helper class for working with files on persistent storage.
 
 
 
 
 
 
 
 
 
Class for gradient descent optimization.
Class representing a gradient descent optimization problem.
 
Deprecated.
 
 
 
 
 
Deprecated.
 
 
 
 
Hamming loss is used in the multi-label classification scenario.
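A minimal sketch of Hamming loss (the standard definition, not the CLUS measure class itself): the fraction of (example, label) pairs on which the prediction and the truth disagree:

    public final class HammingLossSketch {
        // trueLabels[i][j] and predictedLabels[i][j]: label j of example i.
        public static double hammingLoss(boolean[][] trueLabels, boolean[][] predictedLabels) {
            int mismatches = 0;
            int total = 0;
            for (int i = 0; i < trueLabels.length; i++) {
                for (int j = 0; j < trueLabels[i].length; j++) {
                    if (trueLabels[i][j] != predictedLabels[i][j]) mismatches++;
                    total++;
                }
            }
            return total == 0 ? 0.0 : (double) mismatches / total;
        }
    }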
 
 
 
Some handy functions.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Used by fast cloners to deep clone objects.
 
Allows a custom cloner to be created for a specific class.
 
 
 
 
 
 
 
 
 
Marks the specific class as immutable so that the cloner avoids cloning it.
Class for including all the linear terms implicitly in the weight optimization procedure.
 
 
Merge sort that returns the indices into the target array rather than the sorted array itself.
Structure that contains one int and one double.
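A minimal sketch of the index-returning sort described above; it relies on the JDK's (merge-sort based) object sort rather than a hand-written merge sort, so it only illustrates the idea:

    import java.util.Arrays;
    import java.util.Comparator;

    public final class IndexSortSketch {
        // Returns the permutation of indices that sorts the target array in
        // ascending order; the target array itself is left untouched.
        public static Integer[] sortedIndices(double[] values) {
            Integer[] indices = new Integer[values.length];
            for (int i = 0; i < indices.length; i++) indices[i] = i;
            Arrays.sort(indices, Comparator.comparingDouble((Integer i) -> values[i]));
            return indices;
        }
    }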
 
 
 
 
 
 
INIFileEnum<T extends Enum<T>>
 
 
 
 
Corresponds to a nominal settings file field.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Deprecated.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Returns the maximum of the per-target reliability scores, i.e., an example is considered as reliable as its most reliable component.
 
 
 
 
 
 
 
 
 
 
 
Returns the minimum of the per-target reliability scores, i.e., an example is considered as reliable as its least reliable component.
 
Normalises per-target confidence scores to [0,1].
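A minimal sketch of one way to do this (plain min-max scaling; whether the CLUS class uses exactly this rule is an assumption):

    public final class MinMaxNormalizationSketch {
        // Maps one target's confidence scores linearly onto [0,1].
        public static double[] normalize(double[] scores) {
            double min = Double.POSITIVE_INFINITY;
            double max = Double.NEGATIVE_INFINITY;
            for (double s : scores) {
                min = Math.min(min, s);
                max = Math.max(max, s);
            }
            double range = max - min;
            double[] out = new double[scores.length];
            for (int i = 0; i < scores.length; i++) {
                out[i] = range == 0.0 ? 0.0 : (scores[i] - min) / range;
            }
            return out;
        }
    }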
 
 
 
 
 
 
 
 
 
 
Use if you want to compute MLC measures in the HMLC case.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Deprecated.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Implements a simple insertion algorithm for maintaining the k nearest neighbors.
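A minimal sketch of such an insertion scheme (hypothetical class, not the CLUS data structure):

    import java.util.ArrayList;
    import java.util.List;

    // Keeps the k smallest distances seen so far: each new candidate is inserted at
    // its sorted position and the largest element is dropped once the list exceeds k.
    public final class KnnInsertionSketch {
        private final int k;
        private final List<double[]> neighbors = new ArrayList<>(); // {distance, exampleIndex}

        public KnnInsertionSketch(int k) { this.k = k; }

        public void offer(int exampleIndex, double distance) {
            int pos = 0;
            while (pos < neighbors.size() && neighbors.get(pos)[0] <= distance) pos++;
            if (pos >= k) return; // worse than the current k-th neighbor
            neighbors.add(pos, new double[] { distance, exampleIndex });
            if (neighbors.size() > k) neighbors.remove(neighbors.size() - 1);
        }

        public List<double[]> neighbors() { return neighbors; }
    }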
 
 
 
 
 
 
 
 
Attribute of nominal value.
This class represents the distance between 2 values of a certain Nominal Attribute type.
 
 
 
This class stores some useful statistics for a Nominal Attribute of certain data.
 
 
 
Does nothing; no normalization is performed.
Class for normalization of per-target confidence scores.
 
Does not weight any attributes.
 
 
Attribute of numeric (continuous) value.
This class represents the distance between 2 values of a certain Numerical Attribute type.
This class stores some useful statistics for a Numeric Attribute of certain data.
 
 
A class implementing an interface for loading objects from a file.
A class implementing an interface for saving objects to a file.
 
 
 
 
Fake learner that returns the main target.
 
Abstract super class for optimization of weights of base learners.
Class representing an optimization problem.
Parameters for the optimization algorithm.
Predictions of rule-type base functions.
True values of the instances.
 
Deprecated.
 
 
 
Deprecated.
 
 
 
 
Provides reliability scores on the basis of the 'actual error', which is not attainable in practice, i.e., when truly unlabeled data are used.
 
Implements 2-Tuple.
 
 
 
 
 
 
 
 
 
Implements 4-Tuple.
 
Returns random numbers as reliability scores; two modes are possible: RANDOM_UNIFORM, where random numbers are generated uniformly in [0,1], and RANDOM_GAUSSIAN, where random numbers are normally distributed in [0,1] with mean 0.5 and a given standard deviation.
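A minimal sketch of the two modes (not the CLUS class; clipping the Gaussian draw to [0,1] is an assumption made so the score stays in range):

    import java.util.Random;

    public final class RandomReliabilitySketch {
        public enum Mode { RANDOM_UNIFORM, RANDOM_GAUSSIAN }

        // Returns one random reliability score in [0,1].
        public static double score(Random rnd, Mode mode, double std) {
            if (mode == Mode.RANDOM_UNIFORM) {
                return rnd.nextDouble();
            }
            double g = 0.5 + std * rnd.nextGaussian(); // mean 0.5, given standard deviation
            return Math.min(1.0, Math.max(0.0, g));
        }
    }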
 
On the basis of the given per-target confidence scores, provides ranking-based confidence scores: the per-target scores are ranked independently for each target.
 
 
 
 
 
 
 
 
 
 
 
 
This class stores a cache of TargetSets, test data, and the predictions for that test data. TODO: better search in stored results.
Class which determines the reliability score of an unlabeled example e_u as follows: r(e_u) = sum_{e_l} w_l * oobError(e_l), where w_l is the random forest proximity of e_u to the labeled example e_l, and oobError returns the out-of-bag error of the labeled example e_l.
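A minimal sketch of the score defined above (hypothetical arrays, not the CLUS API): proximities[l] is the random forest proximity of the unlabeled example to labeled example l, and oobErrors[l] is that labeled example's out-of-bag error:

    public final class ProximityReliabilitySketch {
        // r(e_u) = sum over labeled examples l of proximities[l] * oobErrors[l]
        public static double score(double[] proximities, double[] oobErrors) {
            double r = 0.0;
            for (int l = 0; l < proximities.length; l++) {
                r += proximities[l] * oobErrors[l];
            }
            return r;
        }
    }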
 
 
Multiple rows (tuples) of data.
 
Relative root mean squared error.
Information about rule normalization.
 
 
 
Abstract implementation of the SearchAlgo interface.
 
 
 
 
 
 
 
 
All the settings.
 
 
 
 
 
 
 
 
Section: Ensemble methods
 
 
 
How ROS ensembles make predictions.
 
 
 
 
 
Section: General - ResourceInfo loaded
 
 
 
 
Section: Hierarchical multi-label classification
 
 
 
 
Section: Hierarchical multi-target regression
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Section: Output - Show info in .out file
Section: Output - Write predictions to file
 
 
Section: Phylogeny
 
 
 
 
 
 
 
For the external GD binary, whether to use the GD or the brute force method.
How the initial rules are generated when using the SampledRuleSet covering method.
 
GD optimization.
 
Weight optimization: differential evolution algorithm.
 
 
 
 
 
Aggregation of per-target reliability scores.
Confidence (i.e., reliability) score for self-training.
 
Normalization of per-target reliability scores.
Specifies which data will be used for the calculation of the OOB error: only the originally labeled data, or all examples (including the ones with labels predicted by self-training).
Stopping criteria for self-training.
The criterion by which unlabeled data will be added to the training set (used by the self-training algorithm).
 
Section: Time series
 
 
 
Section: Tree - Heuristic
 
Determines how we handle the case where, when evaluating a candidate split, all examples in one of the branches have only missing values for a clustering attribute.
 
Section: Tree - Pruning method
Section: Tree - SetDistance
 
 
 
Section: Tree - TimeSeriesDistance
Section: Tree
Section: Tree - TupleDistance
 
 
 
 
 
Translates a chromosome to a TargetSet and evaluates it against the MTLearner.
 
 
 
 
 
 
 
This class computes the average Spearman rank correlation over all target attributes.
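A minimal sketch of this measure (not the CLUS error class): the Spearman correlation per target is computed as the Pearson correlation of the ranks and then averaged over targets; ties simply get the rank of their sorted position here, without tie averaging:

    import java.util.Arrays;
    import java.util.Comparator;

    public final class AvgSpearmanSketch {
        // Rank of each value in ascending order (1-based, no tie handling).
        static double[] ranks(double[] values) {
            Integer[] order = new Integer[values.length];
            for (int i = 0; i < order.length; i++) order[i] = i;
            Arrays.sort(order, Comparator.comparingDouble((Integer i) -> values[i]));
            double[] r = new double[values.length];
            for (int pos = 0; pos < order.length; pos++) r[order[pos]] = pos + 1;
            return r;
        }

        // Pearson correlation of two equally long vectors.
        static double pearson(double[] a, double[] b) {
            double meanA = Arrays.stream(a).average().orElse(0.0);
            double meanB = Arrays.stream(b).average().orElse(0.0);
            double cov = 0.0, varA = 0.0, varB = 0.0;
            for (int i = 0; i < a.length; i++) {
                cov += (a[i] - meanA) * (b[i] - meanB);
                varA += (a[i] - meanA) * (a[i] - meanA);
                varB += (b[i] - meanB) * (b[i] - meanB);
            }
            return cov / Math.sqrt(varA * varB);
        }

        // trueVals[target][example] and predVals[target][example]
        public static double averageSpearman(double[][] trueVals, double[][] predVals) {
            double sum = 0.0;
            for (int t = 0; t < trueVals.length; t++) {
                sum += pearson(ranks(trueVals[t]), ranks(predVals[t]));
            }
            return sum / trueVals.length;
        }
    }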
 
Standardizes per-target scores to 0.5 mean and 0.125 standard deviation.
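A minimal sketch of this standardization (not the CLUS class; falling back to 0.5 when the standard deviation is zero is an assumption):

    public final class StandardizationSketch {
        // Shifts and scales one target's scores to mean 0.5 and standard deviation 0.125.
        public static double[] standardize(double[] scores) {
            double mean = 0.0;
            for (double s : scores) mean += s;
            mean /= scores.length;
            double var = 0.0;
            for (double s : scores) var += (s - mean) * (s - mean);
            double std = Math.sqrt(var / scores.length);
            double[] out = new double[scores.length];
            for (int i = 0; i < scores.length; i++) {
                out[i] = std == 0.0 ? 0.5 : 0.5 + 0.125 * (scores[i] - mean) / std;
            }
            return out;
        }
    }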
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Implements 3-Tuple.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Calculates the confidence of predictions as the standard deviation of the votes of the trees in a random forest ensemble.
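A minimal sketch of this confidence measure (not the CLUS class), where votes holds the individual tree predictions for one example and target; a larger spread means less agreement among the trees:

    public final class VoteSpreadSketch {
        // Confidence proxy: the (population) standard deviation of the tree votes.
        public static double standardDeviationOfVotes(double[] votes) {
            double mean = 0.0;
            for (double v : votes) mean += v;
            mean /= votes.length;
            double var = 0.0;
            for (double v : votes) var += (v - mean) * (v - mean);
            return Math.sqrt(var / votes.length);
        }
    }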
This class represents distances between DataTuples.