Class DataSplitter

java.lang.Object
com.jml.preprocessing.DataSplitter

public class DataSplitter extends Object
A class that provides a method for splitting a dataset into a training and testing dataset.
  • Method Details

    • trainTestSplit

      public static Map<String, double[][]> trainTestSplit(double[][] X, double[][] y, double testSize)
      Randomly splits a dataset of features and targets into disjoint training and testing datasets. X and y are first shuffled with the Fisher–Yates algorithm using the same permutation, so each sample's features stay paired with its targets. The shuffled arrays are then split into a training dataset and a testing dataset.
      Parameters:
      X - Features of the dataset.
      y - Targets of the dataset.
      testSize - Fraction of the data to include in the testing dataset. Should be a value between 0 and 1 (inclusive).
      Returns:
      A hashmap containing the split dataset. Use the following keys to obtain the split data:
           key: "xTrain" -> X training data.
           key: "yTrain" -> y training data.
           key: "xTest" -> X testing data.
           key: "yTest" -> y testing data.
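
The behavior described above (a paired Fisher–Yates shuffle followed by a split) can be sketched as follows. This is a minimal illustration, not the library's source: the class name SplitSketch, the extra seed parameter, the input validation, and the rounding rule for the test-set size are all assumptions made for the example. Only the parameter meanings and the four map keys come from the documentation above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

// Hypothetical illustration class; it is not part of com.jml.
class SplitSketch {

    // Mirrors the documented signature, plus a seed (an assumption) for reproducibility.
    public static Map<String, double[][]> trainTestSplit(double[][] X, double[][] y,
                                                         double testSize, long seed) {
        if (X.length != y.length) {
            throw new IllegalArgumentException("X and y must have the same number of rows.");
        }
        if (!(testSize >= 0 && testSize <= 1)) {
            throw new IllegalArgumentException("testSize must be between 0 and 1 (inclusive).");
        }

        int n = X.length;
        // Shallow copies so the caller's arrays keep their original row order.
        double[][] xs = X.clone();
        double[][] ys = y.clone();

        // Fisher-Yates shuffle: the same swap is applied to both arrays,
        // so row i of xs always corresponds to row i of ys.
        Random rng = new Random(seed);
        for (int i = n - 1; i > 0; i--) {
            int j = rng.nextInt(i + 1); // uniform index in [0, i]
            double[] tmpX = xs[i]; xs[i] = xs[j]; xs[j] = tmpX;
            double[] tmpY = ys[i]; ys[i] = ys[j]; ys[j] = tmpY;
        }

        // Assumption: the test-set size is rounded to the nearest whole row.
        int nTest = (int) Math.round(n * testSize);
        int nTrain = n - nTest;

        Map<String, double[][]> split = new HashMap<>();
        split.put("xTrain", Arrays.copyOfRange(xs, 0, nTrain));
        split.put("yTrain", Arrays.copyOfRange(ys, 0, nTrain));
        split.put("xTest", Arrays.copyOfRange(xs, nTrain, n));
        split.put("yTest", Arrays.copyOfRange(ys, nTrain, n));
        return split;
    }

    public static void main(String[] args) {
        double[][] X = {{0}, {1}, {2}, {3}, {4}, {5}, {6}, {7}, {8}, {9}};
        double[][] y = {{0}, {10}, {20}, {30}, {40}, {50}, {60}, {70}, {80}, {90}};

        Map<String, double[][]> split = trainTestSplit(X, y, 0.3, 42L);

        // With 10 rows and testSize = 0.3 this sketch yields 7 training and 3 testing rows.
        System.out.println("train rows: " + split.get("xTrain").length);
        System.out.println("test rows: " + split.get("xTest").length);
    }
}
```

The split data is read back with the same keys listed above, for example split.get("xTrain") and split.get("yTest"). Applying each swap to both arrays is what keeps the two shuffles identical; shuffling X and y independently would break the pairing between features and targets.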