Prepare data for the testing functions

Usage

prep_data(gene_data, feature_num = Inf, label_vector = "ensg00000183117_19")

Arguments

gene_data

a data.frame where the label_vector column is the response and the rest of the columns are predictors

feature_num

the number of features to select from the data. Must be >= 1. If it exceeds ncol, ncol-1 is used (1 column is removed as the label). Default is to use all columns.

label_vector

name of the column to use as the response
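
The feature_num rule described above can be sketched in base R. This is only an illustration of the documented behavior, not the package's implementation: the helper effective_feature_num and the toy data frame are hypothetical names made up for this example.

```r
# Hypothetical helper (not part of brentlabModelPerfTesting): mimics the
# documented rule that feature_num is capped at ncol - 1.
effective_feature_num <- function(gene_data, feature_num = Inf) {
  stopifnot(is.data.frame(gene_data), feature_num >= 1)
  # one column is reserved for the label, so it cannot be a predictor
  max_predictors <- ncol(gene_data) - 1
  min(feature_num, max_predictors)
}

# Toy data: one label column plus three predictor columns
toy <- data.frame(
  label = c(1, 2, 3),
  g1 = c(0.1, 0.2, 0.3),
  g2 = c(4, 5, 6),
  g3 = c(7, 8, 9)
)

n_default <- effective_feature_num(toy)      # default Inf: all 3 predictors
n_subset  <- effective_feature_num(toy, 2)   # within range: 2 predictors
n_capped  <- effective_feature_num(toy, 10)  # exceeds ncol: capped at 3
```

Under this reading, any feature_num larger than the number of available predictors behaves the same as the default.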

Value

a list of xgboost data matrix objects with slots dtrain, dtest and wl (see the xgboost docs on wl). dtrain is the training data to

Examples

  test_data = readRDS(
      system.file('testing_gene_data.rds',
      package = 'brentlabModelPerfTesting'))
  # note: suppressWarnings is used here because the test data is too
  # small to stratify. Typically, do not use suppressWarnings.
  suppressWarnings({prepped_data_subset = brentlabModelPerfTesting::prep_data(test_data, 10)})
#> INFO [2023-02-16 20:10:27] creating train test data with 10 predictor variables

  names(prepped_data_subset)
#> [1] "train" "test"