Package com.jml.preprocessing
Class DataSplitter
java.lang.Object
com.jml.preprocessing.DataSplitter
A class that provides a method for splitting a dataset into a training dataset and a testing dataset.
Method Summary
Method: trainTestSplit(double[][] X, double[][] y, double testSize)
Description: Splits a dataset with features and targets randomly into disjoint training and testing datasets.
Method Details
trainTestSplit
Splits a dataset with features and targets randomly into disjoint training and testing datasets. The arrays are first shuffled using the Fisher–Yates algorithm, with the same permutation applied to both X and y so that each feature row stays aligned with its target row. The shuffled arrays are then split into a training dataset and a testing dataset.
Parameters:
X - Features of the dataset.
y - Targets of the dataset.
testSize - Fraction of the data to include in the testing dataset. Should be a value between 0 and 1 (inclusive).
Returns:
A hashmap containing the split dataset. Use the following keys to obtain the split data:
"xTrain" -> X training data.
"yTrain" -> y training data.
"xTest" -> X testing data.
"yTest" -> y testing data.
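The shuffle-then-split behaviour described above can be illustrated with a minimal, self-contained sketch. This is not the library's source. The class name TrainTestSplitSketch, the extra Random parameter (added so the shuffle can be seeded), the Map<String, double[][]> return type, and the rounding of the test row count are all assumptions made for illustration. Only the map keys and the use of a shared Fisher–Yates shuffle come from the documentation.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

// Illustrative sketch only; NOT the source of com.jml.preprocessing.DataSplitter.
final class TrainTestSplitSketch {

    // Shuffles X and y with one shared Fisher-Yates pass, then cuts both at the same row index.
    // The rng parameter is a hypothetical addition that makes the shuffle reproducible.
    static Map<String, double[][]> trainTestSplit(double[][] X, double[][] y,
                                                  double testSize, Random rng) {
        if (X.length != y.length) {
            throw new IllegalArgumentException("X and y must have the same number of rows");
        }
        if (testSize < 0.0 || testSize > 1.0) {
            throw new IllegalArgumentException("testSize must be between 0 and 1 (inclusive)");
        }

        // Shallow copies: only row references are reordered, so the caller's arrays keep their order.
        double[][] xs = X.clone();
        double[][] ys = y.clone();

        // Fisher-Yates: walk from the last index down, swapping each position with a
        // uniformly chosen position at or before it. Reusing the same j for both arrays
        // keeps every feature row paired with its target row.
        for (int i = xs.length - 1; i > 0; i--) {
            int j = rng.nextInt(i + 1);
            double[] tx = xs[i]; xs[i] = xs[j]; xs[j] = tx;
            double[] ty = ys[i]; ys[i] = ys[j]; ys[j] = ty;
        }

        // Assumption: the number of test rows is rounded to the nearest integer.
        int nTest = (int) Math.round(testSize * xs.length);
        int nTrain = xs.length - nTest;

        // Keys follow the ones listed in the documentation.
        Map<String, double[][]> split = new HashMap<>();
        split.put("xTrain", Arrays.copyOfRange(xs, 0, nTrain));
        split.put("yTrain", Arrays.copyOfRange(ys, 0, nTrain));
        split.put("xTest", Arrays.copyOfRange(xs, nTrain, xs.length));
        split.put("yTest", Arrays.copyOfRange(ys, nTrain, ys.length));
        return split;
    }

    public static void main(String[] args) {
        double[][] X = {{1, 10}, {2, 20}, {3, 30}, {4, 40}};
        double[][] y = {{1}, {2}, {3}, {4}};
        Map<String, double[][]> split = trainTestSplit(X, y, 0.25, new Random(42));
        System.out.println("train rows: " + split.get("xTrain").length);
        System.out.println("test rows:  " + split.get("xTest").length);
    }
}
```

With four rows and a testSize of 0.25, this sketch places three rows in the training arrays and one row in the testing arrays. Which row ends up in the test set depends on the seed. How the actual library rounds the test row count is not stated in the documentation.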