Composable Preprocessing Operators, or CPOs, are the central entity provided by the mlrCPO package.
CPOs can perform operations on a data.frame or a Task; for a Task, they can even modify target values and convert between different Task types.
CPOs can be “composed” using the %>>% operator, the composeCPO function, or the pipeCPO function, to create new (“compound”) operators that perform multiple operations in a pipeline. While all CPOs have the class “CPO”, primitive (i.e. not compound) CPOs have the additional class “CPOPrimitive”, and compound CPOs have the class “CPOPipeline”. It is possible to split a compound CPO into its primitive constituents using as.list.CPO.
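The three ways of composing described above can be sketched as follows. This is a minimal illustration, assuming the mlrCPO package is installed; the choice of cpoPca and cpoScale as building blocks and the variable names are arbitrary.

```r
library(mlrCPO)  # also attaches mlr

# Two primitive CPOs, created by calling their constructors
pca <- cpoPca()
scl <- cpoScale()

# Three ways of building the same two-step pipeline
p1 <- pca %>>% scl
p2 <- composeCPO(pca, scl)
p3 <- pipeCPO(list(pca, scl))

class(pca)  # "CPOPrimitive" "CPO"
class(p1)   # "CPOPipeline" "CPO"

# Split the pipeline back into its primitive constituents
parts <- as.list(p1)
length(parts)
```

The operator form is usually the most readable for short pipelines, while pipeCPO is convenient when the steps are already collected in a list.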
CPOs can be “attached” to mlr Learner objects to create CPOLearners, using the %>>% operator or the attachCPO function. These CPOLearners fit the model specified by the Learner to the data after applying the attached CPO. Multiple CPOs can be attached to a Learner sequentially, or in the form of a compound CPO.
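As a rough sketch of attaching CPOs to a Learner: the example below assumes mlr's "classif.rpart" learner (which needs the rpart package) and the iris.task example Task that ships with mlr. Both are arbitrary choices made for illustration.

```r
library(mlrCPO)

# Any mlr Learner would do; rpart is an arbitrary choice
lrn <- makeLearner("classif.rpart")

# Attach a single CPO with the operator or with the function
cl1 <- cpoScale() %>>% lrn
cl2 <- attachCPO(cpoScale(), lrn)

# Attach several CPOs: one after the other, or as a compound CPO
cl3 <- cpoPca() %>>% (cpoScale() %>>% lrn)
cl4 <- (cpoPca() %>>% cpoScale()) %>>% lrn

# A CPOLearner is trained and used like any other Learner
model <- train(cl4, iris.task)
pred <- predict(model, iris.task)
```

Because the preprocessing is part of the Learner, it is re-fitted on each training set during resampling, which avoids leaking information from the validation data.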
CPOs can be “applied” to a data.frame or a Task using the %>>% operator or the applyCPO function. Applying a CPO performs the operations specified by the (possibly compound) CPO and returns the modified data. This data also carries a “retrafo” and an “inverter” tag, which can be accessed using the retrafo and inverter functions to get CPORetrafo and CPOInverter objects, respectively. These objects represent the “trained” CPOs that can be used when performing validation or predictions with new data.
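The following sketch shows one way the apply-then-retrafo workflow might look. The split of iris into a "training" and a "new" part is purely hypothetical, and only the numeric columns are used to keep the example simple.

```r
library(mlrCPO)

# Hypothetical split of the numeric iris columns into training and new data
train.df <- iris[seq(1, 150, by = 2), 1:4]
new.df <- iris[seq(2, 150, by = 2), 1:4]

# Apply the CPO with the operator or with the function
trafd <- train.df %>>% cpoScale()
trafd2 <- applyCPO(cpoScale(), train.df)

# Extract the trained object attached to the result
rt <- retrafo(trafd)

# Re-apply the scaling learned on the training data to the new data
new.trafd <- new.df %>>% rt
```

The new data is scaled with the centers and scales computed on train.df, not with its own, which is the behaviour wanted for validation and prediction.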
CPOs can have hyperparameters that determine how they operate on data. These hyperparameters can be set during construction, as function parameters of the CPOConstructor, or they can potentially be modified later as exported hyperparameters. Which hyperparameters are exported is controlled using the export parameter of the CPOConstructor when the CPO is created. Hyperparameters can be listed using getParamSet, queried using getHyperPars, and set using setHyperPars.
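A short sketch of the hyperparameter functions named above, using cpoScale as an arbitrary example. Exported hyperparameter names carry the CPO's id as a prefix (here "scale."). Passing a character vector of parameter names to export is assumed to work as described in the CPOConstructor documentation.

```r
library(mlrCPO)

# Set a hyperparameter at construction time
cpo <- cpoScale(center = FALSE)

getParamSet(cpo)   # lists the exported hyperparameters
getHyperPars(cpo)  # scale.center is FALSE, scale.scale is TRUE

# Change an exported hyperparameter afterwards; note the "scale." prefix
cpo2 <- setHyperPars(cpo, scale.center = TRUE)

# Export only the 'center' parameter (assumed usage of 'export')
cpo3 <- cpoScale(export = "center")
names(getHyperPars(cpo3))
```

setHyperPars returns a modified copy, so the original cpo object keeps center = FALSE.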
A CPO object should be treated as an opaque object and should only be queried / modified using the given set* and get* functions. A list of them is given below in the section “See Also”, under “cpo-operations”.
A special CPO is NULLCPO, which functions as the neutral element of the %>>% operator and represents the identity operation on data.
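The role of NULLCPO as a neutral element can be illustrated with a small sketch. Using it as the starting value of Reduce is one possible pattern rather than something the package prescribes, and the list of steps is hypothetical.

```r
library(mlrCPO)

# Composing with NULLCPO leaves the other CPO as it is
cpo <- NULLCPO %>>% cpoScale()
class(cpo)

# Applying NULLCPO is the identity operation on data
same <- iris %>>% NULLCPO

# NULLCPO as the starting value when folding a list of steps into a pipeline
steps <- list(cpoPca(), cpoScale())
pipeline <- Reduce(`%>>%`, steps, NULLCPO)
class(pipeline)
```

Starting from NULLCPO means the same code also works when the list of steps is empty, in which case the result is NULLCPO itself.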
print.CPO for possibly verbose printing.
Other CPO lifecycle related: CPOConstructor, CPOLearner, CPOTrained, NULLCPO, %>>%(), attachCPO(), composeCPO(), getCPOClass(), getCPOConstructor(), getCPOTrainedCPO(), identicalCPO(), makeCPO()
Other operators: %>>%(), applyCPO(), as.list.CPO, attachCPO(), composeCPO(), pipeCPO()
Other getters and setters: getCPOAffect(), getCPOClass(), getCPOConstructor(), getCPOId(), getCPOName(), getCPOOperatingType(), getCPOPredictType(), getCPOProperties(), getCPOTrainedCPO(), getCPOTrainedCapability(), setCPOId()
Other CPO classifications: getCPOClass(), getCPOOperatingType(), getCPOTrainedCapability()
#> [1] "CPOPrimitive" "CPO"
#> [1] "CPOPipeline" "CPO"
#> Trafo chain of 2 cpos:
#> pca(center = TRUE, scale = FALSE)[not exp'd: tol = <NULL>, rank = <NULL>]
#> Operating: feature
#> ParamSet:
#>               Type len   Def Constr Req Tunable Trafo
#> pca.center logical   -  TRUE      -   -    TRUE     -
#> pca.scale  logical   - FALSE      -   -    TRUE     -
#>   ====>
#> scale(center = TRUE, scale = TRUE)
#> Operating: feature
#> ParamSet:
#>                 Type len  Def Constr Req Tunable Trafo
#> scale.center logical   - TRUE      -   -    TRUE     -
#> scale.scale  logical   - TRUE      -   -    TRUE     -
#> $scale.center
#> [1] FALSE
#>
#> $scale.scale
#> [1] TRUE
#>
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1   -0.8976739  1.01560199    -1.335752   -1.311052  setosa
#> 2   -1.1392005 -0.13153881    -1.335752   -1.311052  setosa
#> 3   -1.3807271  0.32731751    -1.392399   -1.311052  setosa
#> 4   -1.5014904  0.09788935    -1.279104   -1.311052  setosa
#> 5   -1.0184372  1.24503015    -1.335752   -1.311052  setosa
#> 6   -0.5353840  1.93331463    -1.165809   -1.048667  setosa