Title: | Fast, Reliable and Elegant Reproducible Research |
---|---|
Description: | Analysis of experimental results and automatic report generation in both interactive HTML and LaTeX. This package ships with a rich interface for data modeling and built-in functions for the rapid application of statistical tests and the generation of common plots and tables with publication-ready quality. |
Authors: | Jacinto Arias [aut, cre], Javier Cozar [aut] |
Maintainer: | Jacinto Arias <[email protected]> |
License: | GPL-2 |
Version: | 0.4.1 |
Built: | 2024-10-09 03:43:49 UTC |
Source: | https://github.com/jacintoarias/exreport |
This function joins two experiments that share the same configuration of methods, problems and parameters but have different outputs. The resulting experiment includes the rows common to both experiments, with all the output columns.
expCombine(e1, e2, name = NULL)
e1 |
First experiment to combine. |
e2 |
A second experiment to combine; it must share the same configuration as e1. |
name |
Optional name for the resulting experiment. If not specified, the new experiment will be called "e1_name U e2_name". |
A new experiment with the common rows and all the output columns.
# In this example we turn the wekaExperiment into two different experiments,
# with different outputs to combine them:
df_acc  <- wekaExperiment[, c("method", "problem", "fold", "featureSelection", "accuracy")]
df_time <- wekaExperiment[, c("method", "problem", "fold", "featureSelection", "trainingTime")]
exp_acc  <- expCreate(df_acc, name="acc", parameter="fold")
exp_time <- expCreate(df_time, name="time", parameter="fold")
# With expCombine we can mix the two experiments:
expCombine(exp_acc, exp_time)
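Conceptually, combining two experiments resembles an inner join on the shared configuration columns. The base R sketch below illustrates that idea on toy data. It is an analogy under that assumption, not the package's implementation, and the data frames `acc` and `time` are made up for illustration.

```r
# Toy stand-ins for two experiments with the same configuration columns
# (method, problem, fold) but different output columns. Hypothetical data.
acc <- data.frame(
  method   = c("A", "A", "B"),
  problem  = c("p1", "p2", "p1"),
  fold     = c(1, 1, 1),
  accuracy = c(0.90, 0.80, 0.70)
)
time <- data.frame(
  method       = c("A", "B", "B"),
  problem      = c("p1", "p1", "p2"),
  fold         = c(1, 1, 1),
  trainingTime = c(1.5, 2.0, 2.5)
)

# Inner join on the configuration columns: only the rows present in both
# inputs survive, and both output columns are kept.
combined <- merge(acc, time, by = c("method", "problem", "fold"))
print(combined)
```

Rows that appear in only one of the inputs (here A/p2 and B/p2) are dropped, which matches the "common rows" behaviour described above.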
This function concatenates two experiments with the same configuration of parameters and outputs. At least one common output must be present; outputs that are not shared by both experiments are removed from the result. Different methods and problems can be present.
expConcat(e1, e2, name = NULL, tol = 1e-09)
e1 |
First experiment object to concatenate. |
e2 |
Second experiment object to concatenate. It must have the same configuration as e1. |
name |
Optional name. If not provided, the new experiment will be called "e1_name + e2_name". |
tol |
Tolerance value for duplicate checking. |
An experiment object containing all the rows of e1 and e2.
# In this example we turn the wekaExperiment into two different experiments,
# with different parameter values to combine them:
df_no  <- wekaExperiment[wekaExperiment$featureSelection=="no",]
df_yes <- wekaExperiment[wekaExperiment$featureSelection=="yes",]
exp_yes <- expCreate(df_yes, name="fss-yes", parameter="fold")
exp_no  <- expCreate(df_no, name="fss-no", parameter="fold")
expConcat(exp_yes, exp_no)
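As a rough mental model, concatenation can be pictured as stacking the rows of both inputs while keeping only the columns they share. The following base R sketch shows that idea with invented data frames `e1` and `e2`; it is an illustration under that assumption and does not reproduce the package's duplicate checking.

```r
# Two hypothetical result tables. Both have "accuracy", but each also has
# an output the other one lacks.
e1 <- data.frame(method = "A", problem = c("p1", "p2"),
                 accuracy = c(0.9, 0.8), trainingTime = c(1, 2))
e2 <- data.frame(method = "B", problem = c("p1", "p2"),
                 accuracy = c(0.7, 0.6), memory = c(10, 20))

# Keep only the columns present in both inputs, then stack the rows.
common  <- intersect(names(e1), names(e2))
stacked <- rbind(e1[, common], e2[, common])
print(stacked)
```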
This function loads a data.frame, checks its properties and formats an exreport experiment object. The data must contain at least two categorical columns, to be identified as the method and problem variables, and a third numerical column, to be identified as an output variable. Additional columns can be added as parameters or additional outputs.
expCreate(data, methods = "method", problems = "problem", parameters = c(), respectOrder = FALSE, name, tol = 1e-09)
data |
A data.frame object satisfying the experiment format |
methods |
The name of the variable which contains the methods; by default it searches for a column named "method". |
problems |
The name of the variable which contains the problems; by default it searches for a column named "problem". |
parameters |
A list of the column names to be identified as parameters. By default the remaining categorical columns are identified as parameters, so this list is only needed to identify numeric columns. |
respectOrder |
A logical parameter which indicates whether the elements of the method and problem columns keep their order of appearance or are ordered alphabetically. It affects the look of data representations. |
name |
A string which will identify the experiment in the report. |
tol |
Tolerance factor to identify repeated experiments for duplicated rows. |
A new exreport experiment object.
See also: expCreateFromTable
# Creates experiment specifying column names and the numerical variables that
# are parameters
expCreate(wekaExperiment, methods="method", problems="problem", parameters="fold", name="Test Experiment")
Create an exreport experiment object from a tabular representation. The input data must be a table having methods as rows and problems as columns. The values in such table correspond to a particular output. The resulting experiment can be characterized with static parameters.
expCreateFromTable(data, output, name, parameters = list(), respectOrder = FALSE)
data |
Input tabular data satisfying the previous constraints. |
output |
String indicating the name of the output that the table values represent. |
name |
A string which will identify the experiment in the report. |
parameters |
A list of strings containing the names and values for the static configuration of the algorithm. The name of each element of the list corresponds to the name of a parameter, and the element itself to the value assigned. |
respectOrder |
A logical parameter which indicates whether the elements of the method and problem columns keep their order of appearance or are ordered alphabetically. It affects the look of data representations. |
A new exreport experiment object.
See also: expCreate
# We generate a data frame where the methods are rows and the problems columns
# from the wekaExperiment problem. (This is only an example, normally you
# would prefer to load a proper experiment and process it.)
library(reshape2)
df <- dcast(wekaExperiment[wekaExperiment$featureSelection=="no",],
            method ~ problem,
            value.var="accuracy",
            fun.aggregate = mean)
# We can create it and parametrize accordingly:
expCreateFromTable(df, output="accuracy", name="weka")
# Optionally we can set a fixed value for parameters, and order by appearance:
expCreateFromTable(df, output="accuracy", name="weka",
                   parameters=list(featureSelection = "no"),
                   respectOrder=TRUE)
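The relationship between the tabular input and the row-per-result experiment format can be pictured as a wide-to-long reshape. Below is a minimal base R sketch of that reshape on an invented table `wide`; the column names and the static `featureSelection` value are assumptions for illustration, and the package may build its object differently.

```r
# Hypothetical table: one row per method, one column per problem,
# cell values are the "accuracy" output.
wide <- data.frame(method = c("A", "B"),
                   p1 = c(0.9, 0.7),
                   p2 = c(0.8, 0.6))

problems <- setdiff(names(wide), "method")

# One row per (method, problem) pair, with the cell value as the output.
long <- data.frame(
  method   = rep(wide$method, times = length(problems)),
  problem  = rep(problems, each = nrow(wide)),
  accuracy = unlist(wide[problems], use.names = FALSE)
)

# A static parameter is simply a constant column.
long$featureSelection <- "no"
print(long)
```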
This function extends an existing exreport experiment object by adding new parameters with fixed values.
expExtend(e, parameters)
e |
Input experiment |
parameters |
A list of strings containing the values of the new parameters. The name of each parameter is given by the name of the corresponding element in the list. |
A modified exreport experiment object with additional parameters.
# We load the wekaExperiment problem as an experiment and then add a new param
# with a default value.
experiment <- expCreate(wekaExperiment, name="test", parameter="fold")
expExtend(experiment, list(discretization = "no"))
This function generates a new experiment including the methods that obtained a statistically equivalent performance in the multiple comparison test, i.e. those whose hypotheses were not rejected.
expExtract(ph)
ph |
A testMultipleControl test object |
an experiment object
# First we create an experiment from the wekaExperiment problem and prepare
# it to apply the test:
experiment <- expCreate(wekaExperiment, name="test", parameter="fold")
experiment <- expReduce(experiment, "fold", mean)
experiment <- expInstantiate(experiment, removeUnary=TRUE)
# Then we perform a testMultipleControl test procedure
test <- testMultipleControl(experiment, "trainingTime", "min")
expExtract(test)
This function finds the rows that are duplicated with respect to the method, problem and input parameters (but not the outputs). The resulting experiment contains these duplicated rows.
expGetDuplicated(e, tol = 1e-09)
e |
The experiment to check for duplicated rows |
tol |
The tolerance for numeric values to check if two outputs are numerically equal or not. |
If duplicated rows show different outputs, the function raises a warning indicating how many of them differ in their outputs from the original row. The extent to which two rows may diverge in their outputs is controlled by the tol parameter.
This function is useful to determine the consistency of the experiment, as a measure to sanitize the original data source if needed.
A new experiment containing the duplicated rows
# We duplicate some of the rows of a given experiment:
e <- expCreate(wekaExperiment, parameters="fold", name="Test Experiment")
redundant <- expCreate(wekaExperiment[wekaExperiment$method=="NaiveBayes",],
                       parameters="fold", name="Test Experiment")
e2 <- expConcat(e, redundant)
# Now we check for duplicates:
expGetDuplicated(e2)
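The idea of key-based duplicate detection with a numeric tolerance on the outputs can be sketched in base R as follows. This is only an illustration under stated assumptions: the data frame `e`, the `keys` vector and the way divergence is measured (range of the output within each duplicated configuration) are choices made for this sketch, not the package's internals.

```r
# Hypothetical results: configuration A/p1/1 is repeated with an almost
# identical output, configuration B/p1/1 with a clearly different one.
e <- data.frame(
  method   = c("A", "A", "B", "B"),
  problem  = c("p1", "p1", "p1", "p1"),
  fold     = c(1, 1, 1, 1),
  accuracy = c(0.9, 0.9 + 1e-12, 0.7, 0.75)
)
keys <- c("method", "problem", "fold")
tol  <- 1e-09

# Flag every row whose configuration occurs more than once
# (outputs are deliberately ignored here).
is_dup <- duplicated(e[, keys]) | duplicated(e[, keys], fromLast = TRUE)
dups   <- e[is_dup, ]

# For each duplicated configuration, measure how far the outputs spread
# and keep the ones that diverge by more than the tolerance.
spread <- aggregate(accuracy ~ method + problem + fold, data = dups,
                    FUN = function(x) diff(range(x)))
divergent <- spread[spread$accuracy > tol, ]
print(divergent)
```

Only the B configuration is reported as divergent, since the two A rows agree within the tolerance.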
When performing statistical tests or summarizing an experiment for a given output variable, there can be different parameter configurations for each interaction of method and problem. Once the desired transformations have been applied, this function can be used to remove unary parameters from the experiment or to instantiate the methods for each configuration.
expInstantiate(e, parameters = NULL, removeUnary = TRUE)
e |
The experiment object to be instantiated |
parameters |
A vector indicating the parameters to be instantiated. If NULL (the default), all parameters are considered. |
removeUnary |
Boolean value indicating if the unary parameters will be used in an instantiation or if the column can be erased. |
If any method is instantiated the cartesian product of the method and the selected parameters is performed and included in the resulting experiment as the methods variable. The name of the corresponding value will indicate the name of the former method and the value of each parameter instantiated.
an experiment object
# Create an experiment from the wekaExperiment
experiment <- expCreate(wekaExperiment, name="test-exp", parameter="fold")
# We would like to reduce the fold parameter by its mean value. It becomes a
# unary parameter.
experiment <- expReduce(experiment, "fold", mean)
# Now we instantiate the experiment by the featureSelection parameter and
# remove the unary fold parameter
expInstantiate(experiment, removeUnary=TRUE)
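To make the notion of instantiation more concrete, the sketch below folds the non-unary parameters into the method name and drops the unary ones, using base R only. The data frame `e`, the `"."` separator and the resulting names are assumptions made for this illustration; the names the package actually generates may look different.

```r
# Hypothetical reduced experiment: "fold" has a single value (unary),
# "featureSelection" has two.
e <- data.frame(
  method           = c("J48", "J48", "OneR", "OneR"),
  problem          = "p1",
  featureSelection = c("no", "yes", "no", "yes"),
  fold             = "all",
  accuracy         = c(0.91, 0.93, 0.80, 0.82),
  stringsAsFactors = FALSE
)
params <- c("featureSelection", "fold")

# A parameter is unary when it takes a single value across all rows.
is_unary  <- vapply(e[params], function(col) length(unique(col)) == 1, logical(1))
unary     <- params[is_unary]
remaining <- setdiff(params, unary)

# Drop the unary parameters ...
e[unary] <- NULL
# ... and fold the remaining ones into the method name, so each
# method/parameter combination becomes a method of its own.
e$method <- do.call(paste, c(unname(as.list(e[c("method", remaining)])), sep = "."))
e[remaining] <- NULL
print(e)
```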
This function reduces a parameter by aggregating the output variables for each value and for each configuration of method, problem and remaining parameters. By default it computes the mean of the variables.
expReduce(e, parameters = NULL, FUN = mean)
e |
An input experiment object. |
parameters |
The parameter or parameters to be reduced. If NULL (the default), all parameters are considered. |
FUN |
The function used to aggregate the output values. |
An experiment object.
# Create an experiment from the wekaExperiment
experiment <- expCreate(wekaExperiment, name="test-exp", parameter="fold")
# We would like to reduce the fold parameter by its mean value.
expReduce(experiment, "fold", mean)
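Reducing a parameter amounts to a group-wise aggregation in which the reduced parameter is left out of the grouping. A minimal base R sketch of that idea, using an invented data frame `e` with three folds per method, could look like this (an analogy only, not the package's code):

```r
# Hypothetical per-fold results for two methods on one problem.
e <- data.frame(
  method   = rep(c("A", "B"), each = 3),
  problem  = "p1",
  fold     = rep(1:3, times = 2),
  accuracy = c(0.8, 0.9, 1.0, 0.5, 0.6, 0.7)
)

# Group by everything except "fold" and aggregate the output with mean,
# which collapses the three folds of each method into a single row.
reduced <- aggregate(accuracy ~ method + problem, data = e, FUN = mean)
print(reduced)
```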
This function removes the duplicated rows of a given experiment, where duplication is determined by the combination of methods, problems and parameters (but not the outputs).
expRemoveDuplicated(e, tol = 1e-09)
e |
The experiment to be analyzed |
tol |
The tolerance for numeric values to check if two outputs are numerically equal or not. |
The duplicated rows found are compared among themselves to determine whether their outputs diverge. If the rows are not consistent, a warning is raised to note this difference.
an experiment object
# We duplicate some of the rows of a given experiment:
e <- expCreate(wekaExperiment, parameters="fold", name="Test Experiment")
redundant <- expCreate(wekaExperiment[wekaExperiment$method=="NaiveBayes",],
                       parameters="fold", name="Test Experiment")
e2 <- expConcat(e, redundant)
# Now we remove those duplicates:
expRemoveDuplicated(e2)
This function changes the names of the problems, methods or parameter values that an existing experiment object contains.
expRename(e, elements = list(), name = NULL)
e |
Input experiment |
elements |
A list of arrays of strings containing the new names. The old name will be specified as the name of the element in such array, and the name for the parameter, method or problem will be given by the name of the corresponding object in the list. If a name is not present in the set of parameter names or parameter values, it will be ignored. |
name |
The name of the new experiment. If NULL, the previous name will be used. |
A modified exreport experiment object with some changes on the name of the elements.
# We load the wekaExperiment problem as an experiment and then change the name
# of one value for the parameter featureSelection and for one method.
experiment <- expCreate(wekaExperiment, name="test", parameter="fold")
expRename(experiment, list(featureSelection = c("no"="false"),
                           method=c("RandomForest"="RndForest")))
This function changes the order of the problems, methods or parameter values that an existing experiment object contains. The order affects the look of the data representation (such as tables and plots).
expReorder(e, elements, placeRestAtEnd = TRUE)
e |
Input experiment |
elements |
A list of arrays of strings containing the ordered names. The name for the parameter, method or problem will be given by the name of the corresponding object in the list. The names which have not been specified will be placed at the beginning or at the end (depending on the parameter placeRestAtEnd). If a name is not present in the set of parameter values, it will be ignored. |
placeRestAtEnd |
Logical value which indicates if the non specified value names have to be placed after the specified ones (TRUE) or before (FALSE). |
A modified exreport experiment object with some changes in the order of the elements.
# We load the wekaExperiment problem as an experiment and then change the order
# of the values for the parameter featureSelection and for one value for the method.
experiment <- expCreate(wekaExperiment, name="test", parameter="fold")
expReorder(experiment, list(featureSelection = c("yes","no"), method=c("OneR")))
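The effect of placeRestAtEnd can be pictured with factor levels in base R. The helper `reorder_levels` below is hypothetical and written only for this illustration: it moves the requested names to the front (or to the back) and silently ignores names that do not occur, mirroring the behaviour described above.

```r
# Hypothetical helper, not part of exreport: reorder the levels of x so
# that the names in `first` come before (or after) all the other values.
reorder_levels <- function(x, first, placeRestAtEnd = TRUE) {
  present <- unique(as.character(x))       # values in order of appearance
  first   <- first[first %in% present]     # unknown names are ignored
  rest    <- setdiff(present, first)
  new_levels <- if (placeRestAtEnd) c(first, rest) else c(rest, first)
  factor(x, levels = new_levels)
}

methods <- c("J48", "OneR", "NaiveBayes")

# "OneR" first, the unspecified methods afterwards.
at_end   <- levels(reorder_levels(methods, c("OneR", "NotAMethod")))
# Unspecified methods first, "OneR" last.
at_start <- levels(reorder_levels(methods, "OneR", placeRestAtEnd = FALSE))
print(at_end)
print(at_start)
```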
This function receives a named list indicating variables and values to filter the input experiment.
expSubset(e, columns, invertSelection = FALSE)
e |
The experiment to be subsetted |
columns |
A named list containing the variables to be filtered and the valid values. |
invertSelection |
Whether the filter should keep the rows that do not match the specified conditions instead of those that do. |
The names of the elements in the list correspond to the variables to be filtered, which can be the method or problem variables as well as parameters. The values of the list correspond to the valid states for the filtering.
a filtered experiment object
# We create a new experiment from the wekaExperiment problem
e <- expCreate(wekaExperiment, parameters="fold", name="Test Experiment")
# We can filter the experiment to reduce the number of methods.
e <- expSubset(e, list(method = c("J48", "NaiveBayes")))
e
# We can filter the experiment to remove a given problem
e <- expSubset(e, list(problem = "iris"), invertSelection=TRUE)
e
# We can subset the experiment to obtain a specific parameter configuration
e <- expSubset(e, list("featureSelection" = "no"))
e
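Filtering by a named list of valid values can be sketched in base R as shown below. The helper `subset_rows` is hypothetical and exists only for this illustration. In particular, how several conditions interact with the inversion is an assumption of the sketch (the whole conjunction is negated), not something the text above specifies.

```r
# Hypothetical helper, not part of exreport: keep the rows whose value
# for every listed variable is among the valid values.
subset_rows <- function(df, columns, invert = FALSE) {
  keep <- rep(TRUE, nrow(df))
  for (var in names(columns)) {
    keep <- keep & (df[[var]] %in% columns[[var]])
  }
  if (invert) keep <- !keep   # assumption: negate the combined condition
  df[keep, , drop = FALSE]
}

# Invented results table.
df <- data.frame(
  method   = c("J48", "NaiveBayes", "OneR", "J48"),
  problem  = c("iris", "iris", "iris", "wine"),
  accuracy = c(0.95, 0.93, 0.90, 0.88)
)

# Keep two methods, then drop one problem by inverting the selection.
kept    <- subset_rows(df, list(method = c("J48", "NaiveBayes")))
no_iris <- subset_rows(kept, list(problem = "iris"), invert = TRUE)
print(no_iris)
```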
This function initializes a new exreport document, to which elements can then be added for later rendering.
exreport(title)
title |
A string representing a short title for this document |
an empty exreport document
See also: exreportRender, exreportAdd
This function adds one or more reportable objects to an existing exreport document.
exreportAdd(rep, elem)
rep |
An exreport object to which elem will be added |
elem |
a reportable object or a list of them |
an extended exreport document
# Create an empty document:
report <- exreport("Test document")
# Create a reportable object (an experiment)
experiment <- expCreate(wekaExperiment, name="test-exp", parameter="fold")
# Add this object to the document
exreportAdd(report, experiment)
This function renders an existing exreport object to a given file and format.
exreportRender(rep, destination = NULL, target = "html", safeMode = TRUE, visualize = TRUE)
rep |
The exreport object to be rendered |
destination |
Path to the rendered file. If NULL, it uses a temporary directory |
target |
The format of the target rendering. HTML and PDF are allowed. |
safeMode |
If TRUE, overwriting existing output files is denied; if FALSE, it is allowed. |
visualize |
Visualize the generated output or not |
an experiment object
A problem containing the percentage of CO2 reduction in the emissions of 20 industrial fuel combustion processes. Three different Ionic Liquid (IL) pills with different properties were used. The pills were reused up to three times, and each experiment was repeated three times under the same conditions. The variables of the problem are as follows:
data(ILsMultiple)
A data frame with the data detailed in the Description.
IL The name of the IL pills (IL1, IL2 and IL3).
Scenario The name of the industrial fuel combustion process (from Scenario 1 to Scenario20).
Execution The number of the execution for each experiment under the same conditions.
Reused The number of experiments in which the IL pill has previously been used (from 0 to 2).
CO2 The percentage of CO2 which has been reduced from the emission.
A problem containing the percentage of CO2 reduction in the emissions of 20 industrial fuel combustion processes. Two different Ionic Liquid (IL) pills with different properties were used. The pills were reused up to three times, and each experiment was repeated three times under the same conditions. The variables of the problem are as follows:
data(ILsPaired)
A data frame with the data detailed in the Description.
IL The name of the IL pills (IL1 and IL2).
Scenario The name of the industrial fuel combustion process (from Scenario 1 to Scenario20).
Execution The number of the execution for each experiment under the same conditions.
Reused The number of experiments in which the IL pill has previously been used (from 0 to 2).
CO2 The percentage of CO2 which has been reduced from the emission.
This function builds an area plot from a testMultiple object displaying the cumulative value for each method for all the evaluated problems. The value for the rankings is obtained from the Friedman test independently of the scope of the test (control or pairwise).
plotCumulativeRank(testMultiple, grayscale = FALSE)
testMultiple |
Statistical test from which the plot is generated. The rankings are obtained from the Friedman test. |
grayscale |
Configure the plot using a grayscale palette. |
an exPlot object
# First we create an experiment from the wekaExperiment problem and prepare
# it to apply the test:
experiment <- expCreate(wekaExperiment, name="test", parameter="fold")
experiment <- expReduce(experiment, "fold", mean)
experiment <- expSubset(experiment, list(featureSelection = "no"))
experiment <- expInstantiate(experiment, removeUnary=TRUE)
# Then we perform a Friedman test included in a testMultipleControl
# test procedure
test <- testMultipleControl(experiment, "accuracy")
# Finally we obtain the plot
plotCumulativeRank(test)
cat()
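The quantity shown in such a plot can be approximated by hand: rank the methods within each problem, as the Friedman test does, and accumulate those ranks across problems. The base R sketch below does this for an invented accuracy matrix `acc`. Giving rank 1 to the highest value is an assumption of the sketch, and the package's own computation may differ in detail.

```r
# Hypothetical accuracies: rows are problems, columns are methods.
acc <- matrix(c(0.90, 0.80, 0.70,
                0.60, 0.70, 0.50,
                0.95, 0.85, 0.90),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("p1", "p2", "p3"), c("A", "B", "C")))

# Rank the methods within each problem (rank 1 = highest accuracy).
ranks <- t(apply(acc, 1, function(row) rank(-row)))

# Accumulate the ranks over the problems, one running total per method.
cum <- apply(ranks, 2, cumsum)
print(cum)

# The Friedman test itself is available in base R (stats package).
print(friedman.test(acc)$p.value)
```

The last row of `cum` holds the total rank of each method over all problems; a lower total means the method was ranked better more often under the assumption above.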
This function builds a barplot for a given experiment output variable, summarizing its distribution according to the different methods and problems. The aspect of the plot can be parametrized in several ways.
plotExpSummary(exp, output, columns = 0, freeScale = FALSE, fun = identity, grayscale = FALSE)
exp |
- The experiment object to take the data from |
output |
- A string identifying the name of the output variable to be plotted |
columns |
- Integer number, 0 for a wide aspect plot and any other value to include n columns of facets separating the problems. |
freeScale |
- Boolean; when using facets, whether the scale of each facet is independent or not. |
fun |
- A function to be applied to the selected output variables before being plotted. |
grayscale |
- Defaults to FALSE. Set to TRUE for a plot in grayscale. |
Please note that the plot function requires a unique configuration of parameters to be present in the experiment, so the user must have processed and instantiated the experiment beforehand.
an exPlot object
# This example plots the distribution of the trainingTime variable in the
# wekaExperiment problem.
# First we create the experiment from the problem.
experiment <- expCreate(wekaExperiment, name="test", parameter="fold")
# Next we must process it to have a unique parameter configuration:
# We select a value for the parameter featureSelection:
experiment <- expSubset(experiment, list(featureSelection = "yes"))
# Then we reduce the fold parameter:
experiment <- expReduce(experiment, "fold", mean)
# Finally we remove unary parameters by instantiation:
experiment <- expInstantiate(experiment, removeUnary=TRUE)
# Now we can generate several plots:
# Default plot:
plotExpSummary(experiment, "accuracy")
# We can include faceting in the plot by dividing it into columns:
plotExpSummary(experiment, "accuracy", columns=3)
# If we want to show the independent interaction for the output variable
# in each experiment we can free the scales; compare, for example, the
# difference between:
plotExpSummary(experiment, "trainingTime", columns=3, freeScale=FALSE)
plotExpSummary(experiment, "trainingTime", columns=3, freeScale=TRUE)
This function generates a boxplot from a testMultiple statistical test showing the ordered distribution of rankings for each method, as computed for the Friedman test. If the input test features a control multiple comparison, then the hypotheses rejected by the Holm method are also indicated in the plot.
plotRankDistribution(testMultiple)
testMultiple |
The statistical test from which the plot is generated. The function accepts both control and pairwise multiple tests. |
an experiment object
# First we create an experiment from the wekaExperiment problem and prepare
# it to apply the test:
experiment <- expCreate(wekaExperiment, name="test", parameter="fold")
experiment <- expReduce(experiment, "fold", mean)
experiment <- expSubset(experiment, list(featureSelection = "yes"))
experiment <- expInstantiate(experiment, removeUnary=TRUE)
# Then we perform a Friedman test included in a testMultipleControl
# test procedure
test <- testMultipleControl(experiment, "accuracy")
# Finally we obtain the plot
plotRankDistribution(test)
cat()
This function generates a table for the given outputs of the experiment, comparing all methods for each one of the problems. In addition, the function can highlight the best results for each problem and accepts a range of parameters that control the later rendering.
tabularExpSummary(exp, outputs, boldfaceColumns = "none", format = "f", digits = 4, tableSplit = 1, rowsAsMethod = TRUE)
exp |
The source experiment to generate the table from |
outputs |
A given variable or list of them to be the target of the table |
boldfaceColumns |
Indicate ("none","max" or "min") to highlight the method optimizing the variables for each problem. |
format |
Indicates the format of the numeric output using C formatting styles. Defaults to 'f' |
digits |
The number of decimal digits to include for the numeric output. |
tableSplit |
Indicates the number of partitions of the table that will be rendered. Useful when the table is excessively wide. |
rowsAsMethod |
Display the methods as the rows of the table; indicate FALSE for a transposed table. |
An extabular object
# This example generates a summary table for the accuracy variable in the
# wekaExperiment problem.
# First we create the experiment from the problem.
experiment <- expCreate(wekaExperiment, name="test", parameter="fold")
# Next we must process it to have a unique parameter configuration:
# We select a value for the parameter featureSelection:
experiment <- expSubset(experiment, list(featureSelection = "yes"))
# Then we reduce the fold parameter:
experiment <- expReduce(experiment, "fold", mean)
# Finally we remove unary parameters by instantiation:
experiment <- expInstantiate(experiment, removeUnary=TRUE)
# Generate the default table:
tabularExpSummary(experiment, "accuracy")
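To make the layout of the summary table concrete, the following is a minimal base R sketch of a method-by-problem table with the best method per problem identified, which is roughly what rowsAsMethod = TRUE and boldfaceColumns = "max" describe. It does not use exreport; the data frame `results` and its values are made up for illustration, and the package's internal implementation may differ.

```r
# Toy results (hypothetical numbers, not taken from wekaExperiment)
results <- data.frame(
  method   = rep(c("A", "B", "C"), times = 2),
  problem  = rep(c("p1", "p2"), each = 3),
  accuracy = c(0.81, 0.90, 0.85, 0.70, 0.65, 0.72)
)

# One row per method and one column per problem
summary_table <- tapply(results$accuracy,
                        list(results$method, results$problem),
                        mean)

# Method with the highest value in each problem (the "max" criterion)
best_method <- rownames(summary_table)[apply(summary_table, 2, which.max)]
names(best_method) <- colnames(summary_table)

print(round(summary_table, 4))
print(best_method)
```

Transposing `summary_table` with `t()` gives the layout with problems as rows, analogous to rowsAsMethod = FALSE.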
This function builds a pairwise table comparing the methods among themselves for the specified metric. It takes a testMultiplePairwise object as input.
tabularTestPairwise(ph, value = "pvalue", charForNAs = "-")
ph |
The input testMultiplePairwise object |
value |
Indicates the metric to be displayed ("pvalue", "wtl") |
charForNAs |
Indicates the character shown when no comparison is available. |
An extabular object
# First we create an experiment from the wekaExperiment problem and prepare
# it to apply the test:
experiment <- expCreate(wekaExperiment, name="test", parameter="fold")
experiment <- expReduce(experiment, "fold", mean)
experiment <- expInstantiate(experiment, removeUnary=TRUE)
# Then we perform a testMultiplePairwise test procedure
test <- testMultiplePairwise(experiment, "accuracy", "max")
# Different tables can be obtained by using a range of metrics
tabularTestPairwise(test, "pvalue")
tabularTestPairwise(test, "wtl")
This function builds a table from a testMultiple object, either control or pairwise. The hypotheses are listed and compared in the table, which shows the methods together with a range of metrics that can be added as columns. The table also shows which hypotheses were rejected.
tabularTestSummary(ph, columns = c("pvalue"))
ph |
The input testMultiple object from which the table is generated. |
columns |
A vector indicating the metrics that will be shown in the table |
An extabular object
# First we create an experiment from the wekaExperiment problem and prepare
# it to apply the test:
experiment <- expCreate(wekaExperiment, name="test", parameter="fold")
experiment <- expReduce(experiment, "fold", mean)
experiment <- expInstantiate(experiment, removeUnary=TRUE)
# Then we perform a testMultipleControl test procedure
test <- testMultipleControl(experiment, "accuracy", "min")
# Different tables can be obtained by using a range of metrics
tabularTestSummary(test, c("pvalue"))
tabularTestSummary(test, c("rank", "pvalue", "wtl"))
This function performs a multiple comparison statistical test for the given experiment. It first performs a Friedman test over all methods. If this test is rejected, meaning that significant differences are present among the methods, a post-hoc test is then executed: each method is compared against the best method, which is used as a control, and a Holm familywise error correction is finally applied to the resulting p-values.
testMultipleControl(e, output, rankOrder = "max", alpha = 0.05)
e |
Input experiment |
output |
The output for which the test will be performed. |
rankOrder |
The optimization strategy, can be either maximizing "max" or minimizing "min" the target output variable. |
alpha |
The significance level used for the whole testing procedure. |
A testMultipleControl object
# First we create an experiment from the wekaExperiment problem and prepare
# it to apply the test:
experiment <- expCreate(wekaExperiment, name="test", parameter="fold")
experiment <- expReduce(experiment, "fold", mean)
experiment <- expSubset(experiment, list(featureSelection = "yes"))
experiment <- expInstantiate(experiment, removeUnary=TRUE)
# Then we perform a testMultipleControl test procedure
test <- testMultipleControl(experiment, "accuracy", "max")
summary(test)
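The procedure described for testMultipleControl (omnibus Friedman test, comparison of each method against the best one, Holm correction) can be sketched with base R functions only. This is an illustration under stated assumptions, not the package's implementation: the matrix `acc` holds made-up accuracies, and the paired Wilcoxon test is used here as a stand-in for whatever post-hoc statistic exreport computes internally.

```r
# Toy accuracy matrix (hypothetical values): rows are problems, columns are methods
acc <- matrix(
  c(0.91, 0.90, 0.80,
    0.84, 0.82, 0.72,
    0.88, 0.85, 0.75,
    0.95, 0.91, 0.81,
    0.79, 0.74, 0.64,
    0.90, 0.84, 0.74),
  ncol = 3, byrow = TRUE,
  dimnames = list(paste0("p", 1:6), c("A", "B", "C"))
)

# Step 1: omnibus Friedman test over all methods
omnibus <- friedman.test(acc)

# Step 2: average rank per method; rank 1 is the highest accuracy,
# which corresponds to rankOrder = "max"
avg_rank <- colMeans(t(apply(-acc, 1, rank)))
control <- names(which.min(avg_rank))

# Step 3: compare every other method against the control
# (assumption: a paired Wilcoxon test is used for each comparison)
others <- setdiff(colnames(acc), control)
raw_p <- sapply(others, function(m) {
  wilcox.test(acc[, control], acc[, m], paired = TRUE)$p.value
})

# Step 4: Holm familywise error correction
holm_p <- p.adjust(raw_p, method = "holm")

print(omnibus$p.value)
print(control)
print(holm_p)
```

In practice the post-hoc step is only meaningful when the omnibus p-value falls below the chosen alpha.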
This function performs a multiple comparison statistical test for the given experiment. It first performs a Friedman test over all methods. If this test is rejected, meaning that significant differences are present among the methods, a post-hoc test is then executed: every pair of methods is compared, and a Shaffer familywise error correction is finally applied to the resulting p-values.
testMultiplePairwise(e, output, rankOrder = "max", alpha = 0.05)
e |
Input experiment |
output |
The output for which the test will be performed. |
rankOrder |
The optimization strategy, can be either maximizing "max" or minimizing "min" the target output variable. |
alpha |
The significance level used for the whole testing procedure. |
A testMultiplePairwise object
# First we create an experiment from the wekaExperiment problem and prepare
# it to apply the test:
experiment <- expCreate(wekaExperiment, name="test", parameter="fold")
experiment <- expReduce(experiment, "fold", mean)
experiment <- expSubset(experiment, list(featureSelection = "yes"))
experiment <- expInstantiate(experiment, removeUnary=TRUE)
# Then we perform a testMultiplePairwise test procedure
test <- testMultiplePairwise(experiment, "accuracy", "max")
summary(test)
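The all-pairs structure of the post-hoc step can be illustrated in base R as follows. Two substitutions are made and should be kept in mind: base R has no Shaffer correction, so Holm is used here instead, and the paired Wilcoxon test stands in for the per-pair statistic. The matrix `acc` contains made-up values.

```r
# Toy accuracy matrix (hypothetical values): rows are problems, columns are methods
acc <- cbind(
  A = c(0.91, 0.84, 0.88, 0.95, 0.79, 0.90),
  B = c(0.90, 0.82, 0.85, 0.91, 0.74, 0.84),
  C = c(0.83, 0.74, 0.76, 0.81, 0.63, 0.72)
)

# Enumerate every unordered pair of methods
pairs <- combn(colnames(acc), 2)

# One paired test per pair (assumption: Wilcoxon as the per-pair test)
raw_p <- apply(pairs, 2, function(p) {
  wilcox.test(acc[, p[1]], acc[, p[2]], paired = TRUE)$p.value
})
names(raw_p) <- apply(pairs, 2, paste, collapse = " vs ")

# Familywise correction; Holm is swapped in for Shaffer here
adjusted_p <- p.adjust(raw_p, method = "holm")

print(adjusted_p)
```

With k methods there are k(k-1)/2 hypotheses, so the correction becomes more demanding as methods are added.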
This function performs a Wilcoxon paired test to compare the methods of an experiment consisting of exactly two of them. If more methods are present, a multiple comparison test must be applied instead.
testPaired(e, output, rankOrder = "max", alpha = 0.05)
e |
Input experiment |
output |
The output for which the test will be performed. |
rankOrder |
The optimization strategy, can be either maximizing "max" or minimizing "min" the target output variable. |
alpha |
The significance level used for the whole testing procedure. |
A testPaired object
# First we create an experiment from the wekaExperiment problem and prepare
# it to apply the test; we must subset it to only two methods:
experiment <- expCreate(wekaExperiment, name="test", parameter="fold")
experiment <- expSubset(experiment, list(method = c("J48", "NaiveBayes")))
experiment <- expSubset(experiment, list(featureSelection = c("no")))
experiment <- expReduce(experiment, "fold", mean)
experiment <- expInstantiate(experiment, removeUnary=TRUE)
# Then we perform a Wilcoxon test procedure
test <- testPaired(experiment, "accuracy", "max")
summary(test)
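For reference, the underlying Wilcoxon paired test is available in base R as `wilcox.test(..., paired = TRUE)`. The sketch below applies it to two made-up accuracy vectors, one value per problem; the names `m1` and `m2` and all numbers are hypothetical, and how testPaired wraps or reports the result is not shown here.

```r
# Accuracy of two methods on the same eight problems (hypothetical values)
m1 <- c(0.85, 0.78, 0.92, 0.66, 0.74, 0.81, 0.90, 0.70)
m2 <- c(0.83, 0.79, 0.88, 0.60, 0.71, 0.76, 0.83, 0.62)

# Paired two-sided Wilcoxon signed-rank test
result <- wilcox.test(m1, m2, paired = TRUE)

# Decision at a given significance level
alpha <- 0.05
rejected <- result$p.value < alpha

print(result$p.value)
print(rejected)
```

The observations must be paired by problem, so both vectors need the same length and the same ordering of problems.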
A problem containing experimental data obtained by comparing several Machine Learning algorithms from the Weka library. The variables are as follows:
data(wekaExperiment)
A data frame with the data detailed in the Description.
method. Classification algorithms used in the experiment (NaiveBayes, J48, IBk)
problem. Problems used as benchmark in the comparison, up to 12.
featureSelection. Boolean parameter indicating if the data was preprocessed
fold. For each configuration a 10-fold cross validation was performed. This variable is a numeric value ranging from 1 to 10.
accuracy. A measure of the performance of each algorithm, representing the percentage of correctly classified instances.
trainingTime. A second measure of performance. This one indicates the time in seconds that the algorithm took to build the model.