All Classes and Interfaces
Class
Description
This class represents an object containing useful statistics for
an attribute of certain data.
Averages per-target reliability scores
Averages per-target scores, but ignores zeros.
This class represents the distance between values
of a certain attribute type.
This class can be used in any kind of classification setting (e.g., hierarchical multi-label). It stores
the statistics
that enable us to compute the number of TP, TN, FP and FN (T - true, F - false, P - positives, N - negatives)
for any given threshold.
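To illustrate the entry above, here is a minimal sketch of counting TP, FP, TN and FN at one threshold. The class name `ThresholdCountsSketch`, the method `confusionCounts`, and the convention "predict positive when the score is at least the threshold" are assumptions made for illustration; they are not taken from the documented class.

```java
/** Hypothetical helper, not part of the documented API. */
final class ThresholdCountsSketch {

    /**
     * Counts outcomes when an example is predicted positive iff its score
     * is at least the threshold (an assumed convention).
     * Returns the counts in the order {TP, FP, TN, FN}.
     */
    static int[] confusionCounts(double[] scores, boolean[] isPositive, double threshold) {
        if (scores.length != isPositive.length) {
            throw new IllegalArgumentException("scores and labels must have equal length");
        }
        int tp = 0, fp = 0, tn = 0, fn = 0;
        for (int i = 0; i < scores.length; i++) {
            boolean predictedPositive = scores[i] >= threshold;
            if (predictedPositive && isPositive[i]) {
                tp++;
            } else if (predictedPositive) {
                fp++;
            } else if (isPositive[i]) {
                fn++;
            } else {
                tn++;
            }
        }
        return new int[] {tp, fp, tn, fn};
    }
}
```

Calling the method repeatedly with different thresholds on the same stored scores is what makes ROC- and PR-style evaluation possible without re-running the classifier.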
Threshold calibration method by choosing the threshold that minimizes the
difference in label cardinality between the training data and the predictions
for the test data.
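The calibration idea described above can be sketched as follows. Label cardinality is taken here to be the average number of predicted labels per example; the candidate-threshold grid, the `>=` convention, and all names (`CardinalityThresholdSketch`, `calibrate`, `labelCardinality`) are assumptions for illustration only.

```java
/** Hypothetical sketch of label-cardinality threshold calibration. */
final class CardinalityThresholdSketch {

    /** Average number of labels per example whose score is at least the threshold. */
    static double labelCardinality(double[][] scores, double threshold) {
        if (scores.length == 0) {
            throw new IllegalArgumentException("at least one example is required");
        }
        long count = 0;
        for (double[] row : scores) {
            for (double s : row) {
                if (s >= threshold) {
                    count++;
                }
            }
        }
        return (double) count / scores.length;
    }

    /** Picks the candidate whose predicted cardinality is closest to the training cardinality. */
    static double calibrate(double trainCardinality, double[][] testScores, double[] candidates) {
        if (candidates.length == 0) {
            throw new IllegalArgumentException("at least one candidate threshold is required");
        }
        double best = candidates[0];
        double bestDiff = Double.POSITIVE_INFINITY;
        for (double t : candidates) {
            double diff = Math.abs(trainCardinality - labelCardinality(testScores, t));
            if (diff < bestDiff) { // strict comparison: the first of equally good candidates wins
                bestDiff = diff;
                best = t;
            }
        }
        return best;
    }
}
```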
Calculate the confidence of prediction for the multi-target
classification as follows: For each target, get the max classification
probability (i.e., majority) among possible class values.
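The per-target step described above can be sketched as below. The index entry is cut off after its first sentence, so any later aggregation across targets is not shown; the class and method names are hypothetical.

```java
/** Hypothetical sketch: per-target confidence as the maximum class probability. */
final class MaxProbabilityConfidenceSketch {

    /** probabilities[t][c] is the predicted probability of class c for target t. */
    static double[] perTargetConfidence(double[][] probabilities) {
        double[] confidence = new double[probabilities.length];
        for (int t = 0; t < probabilities.length; t++) {
            double max = 0.0;
            for (double p : probabilities[t]) {
                max = Math.max(max, p);
            }
            confidence[t] = max; // probability of the majority class for target t
        }
        return confidence;
    }
}
```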
Classification statistics about the data.
Cloner: deep clone objects.
thrown if cloning fails
Random tree depths for different iterations, used for tree to rules optimization procedures.
Writes the ensemble predictions to a separate file, together with
their standard deviations from the voting procedure and
the respective votes from each base classifier.
Ensemble of decision trees.
Created by Vanja Mileski on 12/15/2016.
Created by Vanja Mileski on 12/16/2016.
Subclasses should implement:
public ClusModel induceSingleUnpruned(ClusRun cr);
In addition, subclasses may also want to implement (to return more than one model):
public void induceAll(ClusRun cr);
For each type of algorithm there should be a ClusClassifier object.
This is the logging class.
Clus number formatter that formats numbers whose absolute value is at least 1
differently from the others.
Class that holds OOB weights.
Class for outputting the training and testing results to .out file.
Class that holds OOB weights for ROS ensembles.
Create rules by decision tree ensemble algorithms (forests).
A linear term that has been included in the rule set.
Helper class for SLS algorithm weights.
Class representing a set of predictive clustering rules.
Create one rule for each value of each nominal attribute
Rule set created from a tree.
Self-training that operates without a confidence score.
Implemented on the basis of: Culp and Michailidis, An iterative algorithm for extending learners to a semi-supervised
setting, Journal of Computational and Graphical Statistics, 2008.
Statistics about the data set.
Statistics manager Includes information about target attributes and weights
etc.
Differential evolution algorithm.
Class representing a DE individual
Class representing the population.
Class representing a Differential evolution optimization problem.
Parent class of the DoubleBooleanCount that stores statistics that are used when building ROC- and PR-curves.
Class that stores prediction statistics that are used when building ROC- and PR-curves.
Structure that contains two doubles
A class that returns an Enumeration that returns only a subset of a
given Enumeration using a certain filter.
EuclideanDistance works on all types of attributes.
Functions to evaluate predictions
A Filter Class to filter Files by Extension.
A Helper Class for working with Files on the Persistent Storage.
Class for gradient descent optimization.
Class representing a gradient descent optimization problem.
Deprecated.
Deprecated.
Hamming loss is used in multi-label classification scenario.
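For reference, Hamming loss is the fraction of (example, label) positions where the predicted label differs from the true one. The sketch below assumes boolean label matrices of equal shape; the names `HammingLossSketch` and `hammingLoss` are illustrative and not taken from the documented class.

```java
/** Hypothetical sketch of Hamming loss for multi-label classification. */
final class HammingLossSketch {

    /** truth[i][j] and predicted[i][j] say whether example i has label j. */
    static double hammingLoss(boolean[][] truth, boolean[][] predicted) {
        if (truth.length != predicted.length) {
            throw new IllegalArgumentException("different number of examples");
        }
        long mismatches = 0;
        long total = 0;
        for (int i = 0; i < truth.length; i++) {
            if (truth[i].length != predicted[i].length) {
                throw new IllegalArgumentException("different number of labels in example " + i);
            }
            for (int j = 0; j < truth[i].length; j++) {
                if (truth[i][j] != predicted[i][j]) {
                    mismatches++;
                }
                total++;
            }
        }
        if (total == 0) {
            throw new IllegalArgumentException("no labels to compare");
        }
        return (double) mismatches / total;
    }
}
```

A value of 0 means every label was predicted correctly; a value of 1 means every label was wrong.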
Some handy functions
used by fast cloners to deep clone objects
allows a custom cloner to be created for a specific class.
marks the specific class as immutable and the cloner avoids cloning it
Class for including all the linear terms implicitly in the weight optimization procedure.
Merge sort which returns the indexes of the target array, not the target array.
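A minimal sketch of such an index-returning merge sort is shown below. Ascending order, stability on ties, and the names `IndexMergeSortSketch` and `sortedIndices` are assumptions; the input array itself is left unmodified.

```java
/** Hypothetical sketch: merge sort that returns a sorted index permutation. */
final class IndexMergeSortSketch {

    /** Returns indices i0, i1, ... such that values[i0] <= values[i1] <= ... */
    static int[] sortedIndices(double[] values) {
        int n = values.length;
        int[] idx = new int[n];
        for (int i = 0; i < n; i++) {
            idx[i] = i;
        }
        sort(values, idx, new int[n], 0, n);
        return idx;
    }

    /** Sorts idx[lo, hi) by the values the indices point to. */
    private static void sort(double[] v, int[] idx, int[] buf, int lo, int hi) {
        if (hi - lo < 2) {
            return;
        }
        int mid = (lo + hi) >>> 1;
        sort(v, idx, buf, lo, mid);
        sort(v, idx, buf, mid, hi);
        int i = lo, j = mid, k = lo;
        while (i < mid && j < hi) {
            // "<=" takes from the left half on ties, which keeps the sort stable
            buf[k++] = (v[idx[i]] <= v[idx[j]]) ? idx[i++] : idx[j++];
        }
        while (i < mid) {
            buf[k++] = idx[i++];
        }
        while (j < hi) {
            buf[k++] = idx[j++];
        }
        System.arraycopy(buf, lo, idx, lo, hi - lo);
    }
}
```

For the input `{3.0, 1.0, 2.0, 1.0}` this returns the indices `{1, 3, 2, 0}`.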
Structure that contains one int and one double
Corresponds to a nominal settings file field.
Deprecated.
Returns maximum of per-target reliability scores, i.e., an example is
considered as reliable as its most reliable component
Returns minimum of per-target reliability scores, i.e., an example is
considered as reliable as its least reliable component
Normalises per-target confidence scores to [0,1].
Use if you want to compute MLC-measures in HMLC case.
Deprecated.
Implements simple insertion algorithm for maintaining k nearest neighbors.
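One way to maintain the k nearest neighbors by insertion is to keep a small array sorted by distance and shift entries when a closer candidate arrives. The sketch below assumes integer example ids and the hypothetical names `KNearestSketch`, `offer` and `neighborIds`; it is not the documented implementation.

```java
import java.util.Arrays;

/** Hypothetical sketch: keeps the k closest candidates in ascending distance order. */
final class KNearestSketch {
    private final double[] distances;
    private final int[] ids;
    private int size;

    KNearestSketch(int k) {
        if (k <= 0) {
            throw new IllegalArgumentException("k must be positive");
        }
        distances = new double[k];
        ids = new int[k];
    }

    /** Offers a candidate; it is kept only if it is among the k closest seen so far. */
    void offer(int id, double distance) {
        if (size == distances.length && distance >= distances[size - 1]) {
            return; // not closer than the current k-th neighbor
        }
        // Start at the first free slot, or overwrite the current worst entry.
        int pos = (size < distances.length) ? size++ : size - 1;
        while (pos > 0 && distances[pos - 1] > distance) {
            distances[pos] = distances[pos - 1];
            ids[pos] = ids[pos - 1];
            pos--;
        }
        distances[pos] = distance;
        ids[pos] = id;
    }

    /** Ids of the neighbors kept so far, closest first. */
    int[] neighborIds() {
        return Arrays.copyOf(ids, size);
    }
}
```

Each `offer` costs at most O(k), which is usually acceptable because k is small compared to the number of examples.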
Attribute of nominal value.
This class represents the distance between 2 values
of a certain Nominal Attribute type.
This class stores some useful statistics for a Nominal Attribute
of certain data.
Does nothing, no normalization is performed
Class for normalization of per-target confidence scores
Does not weight any attributes.
Attribute of numeric (continuous) value.
This class represents the distance between 2 values
of a certain Numerical Attribute type.
This class stores some useful statistics for a Numeric Attribute
of certain data.
A class implementing an interface to the loading of objects from a file
A class implementing an interface to the saving of objects to a file
Fake learner which returns the maintarget.
Abstract super class for optimization of weights of base learners.
Class representing an optimization problem.
Parameters for optimization algorithm.
Predictions of rule type base functions.
True values of the instances.
Deprecated.
Deprecated.
Provides reliability scores on the basis of 'actual error', which is not
attainable in practice, i.e., if true unlabeled data are used.
Implements 2-Tuple.
Implements 4-Tuple.
Returns random numbers as reliability scores, two modes are possible:
RANDOM_UNIFORM: random numbers are generated uniformly in [0,1]
RANDOM_GAUSSIAN: random numbers are normally distributed in [0,1] with mean
0.5 and std.
On the basis of the given per-target confidence scores, provides ranking
based confidence scores: per-target scores are ranked, independently for
each target
This class stores a cache of TargetSets, testdata and the predictions for that testdata
TODO: better search in stored results.
Class which determines the reliability score of an unlabeled example e_u as
follows: r(e_u) = sum_{e_l} w_l * oobError(e_l), where w_l is the random forest
proximity of e_u to labeled example e_l, and oobError returns the out-of-bag error
of labeled example e_l.
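The formula above is a proximity-weighted sum of out-of-bag errors. The sketch below assumes the proximities and OOB errors of the labeled examples have already been computed and are passed in as parallel arrays; the names are hypothetical.

```java
/** Hypothetical sketch of r(e_u) = sum over labeled examples of w_l * oobError(e_l). */
final class ProximityReliabilitySketch {

    /**
     * proximities[l] is the random forest proximity of the unlabeled example
     * to labeled example l; oobErrors[l] is that labeled example's OOB error.
     */
    static double reliability(double[] proximities, double[] oobErrors) {
        if (proximities.length != oobErrors.length) {
            throw new IllegalArgumentException("arrays must have equal length");
        }
        double score = 0.0;
        for (int l = 0; l < proximities.length; l++) {
            score += proximities[l] * oobErrors[l];
        }
        return score;
    }
}
```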
Multiple rows (tuples) of data.
Relative root mean squared error.
Information about rule normalization.
Abstract implementation of the SearchAlgo interface.
All the settings.
Section: Ensemble methods *
How ROS ensembles make predictions
Section: General - ResourceInfo loaded *
Section: Hierarchical multi-label classification *
Section: Hierarchical multi-target regression *
Section: Output - Show info in .out file *
Section: Output - Write predictions to file *
Section: Phylogeny *
Whether the external GD binary uses GD or the brute-force method
How the initial rules are generated when using SampledRuleSet covering method
GD optimization.
WEIGHT OPTIMIZATION
Differential evolution algorithm
Aggregation of per target reliability scores
Confidence (i.e., reliability) score for Self-Training
Normalization of per target reliability scores
Specifies which data will be used for calculation of the OOB error: only originally labeled data, or all examples
(including the ones with labels predicted by self-training)
Stopping criteria for self training
The unlabeled criterion is the criterion by which unlabeled data are added to the training set (used by the
self-training algorithm)
Section: Time series *
Section: Tree - Heuristic *
Determines how to handle the case where, when evaluating a candidate
split, all examples in one of the branches have only missing values for a
clustering attribute.
Section: Tree - Pruning method *
Section: Tree - SetDistance *
Section: Tree - TimeSeriesDistance *
Section: Tree *
Section: Tree - TupleDistance *
Translates a chromosome to a targetset and evaluates it against the MTLearner
This class computes the average spearman rank correlation over all target attributes.
Standardizes per-target scores to 0.5 mean and 0.125 standard deviation.
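Standardizing to a target mean of 0.5 and standard deviation of 0.125 can be sketched as a z-score followed by rescaling. The use of the population standard deviation, the handling of constant input (all outputs 0.5), and the names below are assumptions for illustration.

```java
/** Hypothetical sketch: rescales scores to mean 0.5 and standard deviation 0.125. */
final class StandardizeScoresSketch {
    static final double TARGET_MEAN = 0.5;
    static final double TARGET_STD = 0.125;

    static double[] standardize(double[] scores) {
        int n = scores.length;
        double[] out = new double[n];
        if (n == 0) {
            return out;
        }
        double mean = 0.0;
        for (double s : scores) {
            mean += s;
        }
        mean /= n;
        double variance = 0.0;
        for (double s : scores) {
            variance += (s - mean) * (s - mean);
        }
        double std = Math.sqrt(variance / n); // population standard deviation (assumed)
        for (int i = 0; i < n; i++) {
            // Constant input has no spread, so every score maps to the target mean.
            out[i] = (std == 0.0) ? TARGET_MEAN : TARGET_MEAN + TARGET_STD * (scores[i] - mean) / std;
        }
        return out;
    }
}
```

With these targets, values four standard deviations from the mean land exactly on 0 or 1, so most standardized scores fall inside [0,1] without clipping.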
Implements 3-Tuple.
Calculates the confidence of predictions as standard deviation of votes of
the trees in random forest tree ensemble.
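As a rough illustration of the entry above, the standard deviation of the individual tree votes for one example and one target could be computed as below. Whether the documented class uses the population or sample formula is not stated; the population formula and the names here are assumptions.

```java
/** Hypothetical sketch: spread of per-tree votes for a single prediction. */
final class VoteDeviationSketch {

    /** votes[i] is the prediction of tree i; returns the population standard deviation. */
    static double voteStandardDeviation(double[] votes) {
        if (votes.length == 0) {
            throw new IllegalArgumentException("at least one vote is required");
        }
        double mean = 0.0;
        for (double v : votes) {
            mean += v;
        }
        mean /= votes.length;
        double variance = 0.0;
        for (double v : votes) {
            variance += (v - mean) * (v - mean);
        }
        return Math.sqrt(variance / votes.length);
    }
}
```

A small deviation means the trees agree, which is typically read as a more reliable prediction.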
This class represents distances between DataTuples