The given transformation will be applied to the data in the given Task or data.frame.

If the input data is a data.frame, the returned object is usually also a data.frame; the exception is an applied CPO that converts its input to a Task. If the input data is a Task, the result is a Task of the same type, unless the applied CPO converts it to a different type of Task.
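As a minimal sketch of the type behaviour described above, the following applies the same CPO to a data.frame and to a Task. It assumes the mlr and mlrCPO packages are installed, and uses cpoScale and mlr's built-in iris.task purely as illustrative choices; neither performs a Task conversion.

```r
library(mlr)
library(mlrCPO)

# data.frame in, data.frame out (cpoScale does not convert to a Task)
out_df <- applyCPO(cpoScale(), iris)

# Task in, Task out
out_task <- applyCPO(cpoScale(), iris.task)

print(class(out_df))
print(class(out_task))
```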

The %>>% operator is a synonym for applyCPO: data %>>% cpo applies the CPO object to the data. For a CPORetrafo, predict can be used in the same way.
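The equivalences above can be sketched as follows. This is an illustration only, assuming mlr and mlrCPO are installed; cpoScale is an arbitrary example CPO, and a single column is compared to keep the check simple.

```r
library(mlr)
library(mlrCPO)

# Applying a CPO: function form and operator form
scaled_fun <- applyCPO(cpoScale(), iris)
scaled_op  <- iris %>>% cpoScale()

# Applying a CPORetrafo: applyCPO and predict
rt <- retrafo(scaled_fun)
re_fun  <- applyCPO(rt, iris)
re_pred <- predict(rt, iris)

# Both pairs should agree on the transformed values
print(all.equal(scaled_fun$Sepal.Length, scaled_op$Sepal.Length))
print(all.equal(re_fun$Sepal.Length, re_pred$Sepal.Length))
```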

applyCPO(cpo, task)

Arguments

cpo

[CPO | CPORetrafo]
The CPO or CPORetrafo representing the operation to perform.

task

[Task | data.frame]
The data to operate on.

Value

[Task | data.frame]. The transformed data, augmented with an inverter and possibly a retrafo tag.

Application of CPO

Applying a CPO performs preprocessing on a given data set, to prepare it e.g. for model fitting with a Learner, or for other data handling tasks. The preprocessing is done in a way that makes the transformation repeatable on later prediction or validation data: the returned data set has a CPORetrafo and a CPOInverter object attached to it, which can be retrieved using retrafo and inverter. These can be used to perform the same transformation on new data, or to invert a prediction made with the transformed data.

An applied CPO can change the content of feature columns, target columns of Tasks, and may even change the number of rows of a given data set.
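A minimal sketch of this workflow, assuming mlr and mlrCPO are installed: a CPO is applied to a training subset, and the attached objects are then retrieved. The train_idx split and the choice of cpoScale are hypothetical and only serve the illustration.

```r
library(mlr)
library(mlrCPO)

# Hypothetical training subset: 40 rows from each of the three species
train_idx <- c(1:40, 51:90, 101:140)
train <- iris[train_idx, ]

# Apply the CPO; the result carries retrafo and inverter information
train_trafd <- applyCPO(cpoScale(), train)

# Retrieve the attached objects
rt  <- retrafo(train_trafd)
inv <- inverter(train_trafd)

# With default settings, cpoScale centers numeric columns on the training data
print(mean(train_trafd$Sepal.Length))
```

Since cpoScale only touches feature columns, there is no prediction to invert in this example; the inverter becomes relevant for CPOs that change target columns.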

Application of CPORetrafo

Applying a CPORetrafo performs a transformation that mirrors the one previously applied to a training data set. It should be used when making predictions from new data with a model that was trained on data preprocessed using a CPO. The predictions made may then need to be inverted. For this, the returned data set will have a CPOInverter object attached to it, which can be retrieved using inverter.

An applied CPORetrafo may change the content of feature columns and target columns of Tasks, but will never change the number or order of rows of a given data set.
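The following sketch re-applies a training-time transformation to held-out rows and checks that the number of rows is preserved. It assumes mlr and mlrCPO are installed; the split and the use of cpoScale are hypothetical choices for illustration.

```r
library(mlr)
library(mlrCPO)

# Hypothetical train / test split of iris
train_idx <- c(1:40, 51:90, 101:140)
train <- iris[train_idx, ]
test  <- iris[-train_idx, ]

# Fit the transformation on training data and keep its retrafo
rt <- retrafo(applyCPO(cpoScale(), train))

# Mirror the transformation on new data, using the training-time parameters
test_trafd <- applyCPO(rt, test)

# The retrafo does not add or drop rows
print(nrow(test_trafd) == nrow(test))

# An inverter is attached to the result and can be retrieved
inv <- inverter(test_trafd)
```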

See also