This is a CPOConstructor to be used to create a CPO. It is called like any R function and returns the created CPO.

Create columns from expressions and the incoming data.

When cpoMakeCols or cpoAddCols is called as cpoMakeCols(<newcolname> = <expression>, ...), a new column named <newcolname> containing the result of <expression> is created. The expressions must be vectorised R expressions and may refer to any feature column in the data (excluding the target) and to any other values. The names should be valid data.frame column names and must not clash with the target column name.

cpoMakeCols replaces the existing columns with the newly created ones; cpoAddCols adds them to the data already present.
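
The difference between the two constructors can be illustrated with a minimal base R sketch. The helpers make_cols_sketch, add_cols_sketch and eval_cols_sketch are hypothetical names introduced only for this illustration; they are not part of mlrCPO and do not reproduce its implementation or its handling of the target column. They only mimic the documented behaviour on a plain data.frame: each expression is evaluated with the data columns in scope, and other values are looked up in the caller's environment.

```r
# Hypothetical helpers (NOT mlrCPO functions): a base R imitation of the
# documented "make" vs. "add" semantics on a plain data.frame.

eval_cols_sketch <- function(data, exprs, env) {
  # Evaluate each captured expression with the columns of `data` in scope;
  # names not found in `data` are looked up in `env`.
  new_cols <- lapply(exprs, function(e) eval(e, data, env))
  as.data.frame(new_cols, stringsAsFactors = FALSE)
}

make_cols_sketch <- function(data, ...) {
  env <- parent.frame()                        # caller's environment
  exprs <- as.list(substitute(list(...)))[-1]  # unevaluated expressions
  eval_cols_sketch(data, exprs, env)           # only the new columns remain
}

add_cols_sketch <- function(data, ...) {
  env <- parent.frame()
  exprs <- as.list(substitute(list(...)))[-1]
  cbind(data, eval_cols_sketch(data, exprs, env))  # old columns are kept
}

df <- data.frame(width = c(1, 2, 3), height = c(4, 5, 6))
unit <- 10  # an "other value" that is not a column of df

made <- make_cols_sketch(df, area = width * height, scaled = width * unit)
added <- add_cols_sketch(df, area = width * height)

names(made)   # "area" "scaled"
names(added)  # "width" "height" "area"
```

Both expressions are vectorised: width * height is computed element-wise over the whole column, which is what the constructors require of their arguments.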

cpoMakeCols(..., .make.factors = TRUE)

cpoAddCols(..., .make.factors = TRUE)

Arguments

...

[any]
Expressions of the form colname = expr. See Examples.

.make.factors

[logical(1)]
Whether to turn resulting logical and character columns into factor columns (which are preferred by mlr). Default is TRUE.
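
As an illustration of what this conversion means, the following base R sketch turns logical and character columns into factors and leaves other columns untouched. The helper to_factors_sketch is a hypothetical name used only here; it is an assumption about the effect described above, not the code mlrCPO runs.

```r
# Hypothetical helper (NOT an mlrCPO function): convert logical and character
# columns of a data.frame to factors, leaving all other columns as they are.
to_factors_sketch <- function(data) {
  convert <- vapply(
    data,
    function(col) is.logical(col) || is.character(col),
    logical(1)
  )
  data[convert] <- lapply(data[convert], factor)
  data
}

cols <- data.frame(
  big = c(1, 5, 9) > 4,         # logical column
  label = c("a", "b", "a"),     # character column
  value = c(1, 5, 9),           # numeric column, not converted
  stringsAsFactors = FALSE
)

converted <- to_factors_sketch(cols)
sapply(converted, class)
```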

Value

[CPO].

CPOTrained State

The created state is empty.

General CPO info

This function creates a CPO object, which can be applied to Tasks, data.frames, Learners and other CPO objects using the %>>% operator.

The parameters of this object can be changed after creation using the function setHyperPars. The other hyperparameter-manipulating functions, getHyperPars and getParamSet, likewise work as expected.

If the "id" parameter is given, the hyperparameters will have this id as a prefix; this will, however, not change the parameters of the creator function.

Calling a CPOConstructor

CPO constructor functions are called with optional values of parameters, and additional “special” optional values. The special optional values are the id parameter, and the affect.* parameters. The affect.* parameters enable the user to control which subset of a given dataset is affected. If no affect.* parameters are given, all data features are affected by default.

Examples

res = pid.task %>>% cpoAddCols(gpi = glucose * pressure * insulin, pm = pregnant * mass)
head(getTaskData(res))
#>   pregnant glucose pressure triceps insulin mass pedigree age    gpi    pm
#> 1        6     148       72      35       0 33.6    0.627  50      0 201.6
#> 2        1      85       66      29       0 26.6    0.351  31      0  26.6
#> 3        8     183       64       0       0 23.3    0.672  32      0 186.4
#> 4        1      89       66      23      94 28.1    0.167  21 552156  28.1
#> 5        0     137       40      35     168 43.1    2.288  33 920640   0.0
#> 6        5     116       74       0       0 25.6    0.201  30      0 128.0
#>   diabetes
#> 1      pos
#> 2      neg
#> 3      pos
#> 4      neg
#> 5      pos
#> 6      neg