Package com.jml.core
Class Stats
java.lang.Object
com.jml.core.Stats
The Stats class is a utility class for computing statistical information about datasets.
-
Method Summary
Modifier and Type / Method / Description
static double correlation(double[] y, double[] y_pred)
Computes the r value, or correlation coefficient, between two sets of data.
static double determination(double[] y, double[] y_pred)
Computes the r^2 value, or coefficient of determination, between two sets of data.
static boolean genRandBoolean(double p)
Generates a random boolean with a specified probability of being true.
static double max(double... data)
Finds the maximum value in a dataset.
static double mean(double... data)
Computes the arithmetic mean.
static double median(double... data)
Computes the median.
static double min(double... data)
Finds the minimum value in a dataset.
static int minIndex(double[] data)
Finds the index of the minimum value in an array.
static int[] minIndices(double[] data, int k)
Finds the indices of the k smallest values in an array.
static double mode(double... data)
Computes the mode of a dataset.
static double round(double value, int decimals)
static double sse(double[] y, double[] y_pred)
Computes the sum of squared differences between two datasets.
static double sst(double... y)
Computes the sum of squares total of a dataset.
static double std(double... data)
Computes the standard deviation of the dataset.
static double variance(double... data)
Computes the variance of the dataset.
-
Method Details
-
round
public static double round(double value, int decimals)
-
mean
public static double mean(double... data)
Computes the arithmetic mean.
Parameters:
data - Dataset to compute the mean of.
Returns:
The arithmetic mean of the dataset.
-
median
public static double median(double... data)
Computes the median.
Parameters:
data - Dataset to compute the median of.
Returns:
The median of the dataset.
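As an illustration of what computing a median involves, here is a minimal sketch. It is a hypothetical re-implementation, not the source of Stats.median; the class name MedianSketch, the empty-input check, and the averaging of the two middle values for even-length input are assumptions.

```java
import java.util.Arrays;

final class MedianSketch {
    // Hypothetical stand-in for Stats.median; the library's behavior may differ.
    static double median(double... data) {
        if (data.length == 0) {
            throw new IllegalArgumentException("data must not be empty");
        }
        double[] sorted = data.clone();   // leave the caller's array untouched
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        // Odd length: the middle element. Even length: mean of the two middle elements.
        return (sorted.length % 2 == 1)
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    public static void main(String[] args) {
        System.out.println(median(5, 1, 3));      // prints 3.0
        System.out.println(median(4, 1, 3, 2));   // prints 2.5
    }
}
```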
-
mode
public static double mode(double... data)
Computes the mode of a dataset.
Parameters:
data - Dataset to compute the mode of.
Returns:
The mode of the dataset.
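The documentation above does not say how ties are resolved when several values occur equally often. The sketch below shows one possible approach under stated assumptions: values are compared by exact equality, and on a tie the value that appears first in the input wins. ModeSketch is a hypothetical name, and Stats.mode may behave differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ModeSketch {
    // Hypothetical stand-in for Stats.mode; the tie-breaking rule is an assumption.
    static double mode(double... data) {
        if (data.length == 0) {
            throw new IllegalArgumentException("data must not be empty");
        }
        // LinkedHashMap keeps first-seen order, which makes tie-breaking deterministic.
        Map<Double, Integer> counts = new LinkedHashMap<>();
        for (double v : data) {
            counts.merge(v, 1, Integer::sum);
        }
        double best = data[0];
        int bestCount = 0;
        for (Map.Entry<Double, Integer> e : counts.entrySet()) {
            if (e.getValue() > bestCount) {   // strict '>' keeps the earliest value on ties
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(mode(1, 2, 2, 3, 3));   // prints 2.0 (2 and 3 tie; 2 appears first)
    }
}
```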
-
variance
public static double variance(double... data)
Computes the sample variance of the dataset. This is similar to the mean squared error, except that the sum of squares total (sst) is divided by (n-1) rather than n, where n is the number of observations in the dataset.
Parameters:
data - Dataset of interest.
Returns:
The variance of the data.
-
std
public static double std(double... data)
Computes the standard deviation of the dataset. It is assumed that data contains only a sample of observations from the true population.
Parameters:
data - Dataset of interest.
Returns:
The standard deviation of the data.
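The variance and std entries above describe sample statistics: the sum of squares total divided by (n-1), and its square root. A minimal sketch of that arithmetic follows. It assumes the conventional definition of sst as the sum of squared deviations from the mean; SampleVarianceSketch and its n >= 2 check are hypothetical and not taken from the library.

```java
final class SampleVarianceSketch {
    static double mean(double... data) {
        double sum = 0.0;
        for (double v : data) {
            sum += v;
        }
        return sum / data.length;
    }

    // Assumed definition: sum of squared deviations from the mean.
    static double sst(double... data) {
        double m = mean(data);
        double total = 0.0;
        for (double v : data) {
            total += (v - m) * (v - m);
        }
        return total;
    }

    // Sample variance: sst divided by (n - 1).
    static double variance(double... data) {
        if (data.length < 2) {
            throw new IllegalArgumentException("need at least two observations");
        }
        return sst(data) / (data.length - 1);
    }

    // Sample standard deviation: square root of the sample variance.
    static double std(double... data) {
        return Math.sqrt(variance(data));
    }

    public static void main(String[] args) {
        double[] data = {2, 4, 4, 4, 5, 5, 7, 9};   // mean 5, sst 32
        System.out.println(variance(data));          // 32/7, roughly 4.571
        System.out.println(std(data));               // roughly 2.138
    }
}
```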
-
determination
public static double determination(double[] y, double[] y_pred)
Computes the r^2 value, or coefficient of determination, between two sets of data.
Parameters:
y - Dataset one.
y_pred - Dataset two.
Returns:
The coefficient of determination for the given datasets.
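For reference, the coefficient of determination is commonly defined as 1 - SSE/SST, where SSE is the sum of squared differences between y and y_pred and SST is the sum of squared deviations of y from its mean. The sketch below applies that textbook formula. Whether Stats.determination uses exactly this formula is an assumption, and DeterminationSketch is a hypothetical name.

```java
final class DeterminationSketch {
    // Sum of squared differences between observed and predicted values.
    static double sse(double[] y, double[] yPred) {
        if (y.length != yPred.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        double total = 0.0;
        for (int i = 0; i < y.length; i++) {
            double diff = y[i] - yPred[i];
            total += diff * diff;
        }
        return total;
    }

    // Sum of squared deviations of y from its mean.
    static double sst(double[] y) {
        double sum = 0.0;
        for (double v : y) {
            sum += v;
        }
        double mean = sum / y.length;
        double total = 0.0;
        for (double v : y) {
            total += (v - mean) * (v - mean);
        }
        return total;
    }

    // Textbook definition: 1 - SSE/SST. Yields NaN or infinity when y is constant (SST = 0).
    static double determination(double[] y, double[] yPred) {
        return 1.0 - sse(y, yPred) / sst(y);
    }

    public static void main(String[] args) {
        double[] y = {1, 2, 3, 4};
        double[] yPred = {1.1, 1.9, 3.2, 3.8};
        System.out.println(determination(y, yPred));   // roughly 0.98
    }
}
```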
-
correlation
public static double correlation(double[] y, double[] y_pred)
Computes the r value, or correlation coefficient, between two sets of data.
Parameters:
y - Dataset one.
y_pred - Dataset two.
Returns:
The correlation coefficient for the given datasets.
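The r value is usually understood as the Pearson correlation coefficient: the sum of the products of paired deviations from the means, divided by the square root of the product of the two sums of squared deviations. The sketch below computes that quantity. It is an assumption that Stats.correlation follows this definition, and CorrelationSketch is a hypothetical name.

```java
final class CorrelationSketch {
    static double mean(double[] data) {
        double sum = 0.0;
        for (double v : data) {
            sum += v;
        }
        return sum / data.length;
    }

    // Pearson correlation coefficient; NaN if either dataset is constant.
    static double correlation(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        double mx = mean(x);
        double my = mean(y);
        double sxy = 0.0;   // sum of products of paired deviations
        double sxx = 0.0;   // sum of squared deviations of x
        double syy = 0.0;   // sum of squared deviations of y
        for (int i = 0; i < x.length; i++) {
            double dx = x[i] - mx;
            double dy = y[i] - my;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        return sxy / Math.sqrt(sxx * syy);
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4, 5};
        double[] up = {2, 4, 6, 8, 10};
        double[] down = {10, 8, 6, 4, 2};
        System.out.println(correlation(x, up));     // perfectly linear, increasing
        System.out.println(correlation(x, down));   // perfectly linear, decreasing
    }
}
```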
-
sse
public static double sse(double[] y, double[] y_pred)
Computes the sum of squared differences between two datasets.
Parameters:
y - Dataset one.
y_pred - Dataset two.
Returns:
The sum of squared differences between the two datasets.
-
sst
public static double sst(double... y)
Computes the sum of squares total of a dataset.
Parameters:
y - Dataset in question.
Returns:
The sum of squares total.
-
min
public static double min(double... data)
Finds the minimum value in a dataset.
Parameters:
data - Dataset of interest.
Returns:
The minimum value in data.
-
minIndex
public static int minIndex(double[] data)
Finds the index of the minimum value in an array.
Parameters:
data - The array in which to find the index of the minimum value.
Returns:
The index of the entry with the smallest value.
-
minIndices
public static int[] minIndices(double[] data, int k)
Finds the indices of the k smallest values in an array.
Parameters:
data - The array in which to find the indices of the smallest values.
k - The number of smallest values to find.
Returns:
An array of length k containing the indices of the k smallest values.
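One straightforward way to obtain the indices of the k smallest values is to sort the index range by the corresponding data values and keep the first k. The sketch below does this. The ascending order of the returned indices, the tie-breaking by lower index, and the bounds check on k are assumptions; MinIndicesSketch is a hypothetical name and Stats.minIndices may order its result differently.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.stream.IntStream;

final class MinIndicesSketch {
    // Hypothetical stand-in for Stats.minIndices.
    static int[] minIndices(double[] data, int k) {
        if (k < 0 || k > data.length) {
            throw new IllegalArgumentException("k must be in [0, data.length]");
        }
        return IntStream.range(0, data.length)
                .boxed()
                // Stable sort by value, so equal values keep their original index order.
                .sorted(Comparator.comparingDouble(i -> data[i]))
                .limit(k)
                .mapToInt(Integer::intValue)
                .toArray();
    }

    public static void main(String[] args) {
        double[] data = {3.0, 1.0, 4.0, 1.5, 0.5};
        System.out.println(Arrays.toString(minIndices(data, 2)));   // prints [4, 1]
    }
}
```

Sorting all indices costs O(n log n); for small k relative to n, a bounded heap would do less work, at the price of more code.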
-
max
public static double max(double... data)
Finds the maximum value in a dataset.
Parameters:
data - Dataset of interest.
Returns:
The maximum value in data.
-
genRandBoolean
public static boolean genRandBoolean(double p)
Generates a random boolean with a specified probability of being true.
Parameters:
p - Probability of being true. Must be in range [0, 1].
Returns:
A random boolean with probability p of being true.
Throws:
IllegalArgumentException - if p is not in range [0, 1].
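The contract above (true with probability p, IllegalArgumentException outside [0, 1]) can be met by comparing a uniform draw from [0, 1) against p. The sketch below is one such approach. RandBooleanSketch is a hypothetical name, and the use of ThreadLocalRandom and the rejection of NaN are assumptions about how one might implement it, not a description of the library's code.

```java
import java.util.concurrent.ThreadLocalRandom;

final class RandBooleanSketch {
    // Hypothetical stand-in for Stats.genRandBoolean.
    static boolean genRandBoolean(double p) {
        // Written as a negated range check so that NaN is rejected as well.
        if (!(p >= 0.0 && p <= 1.0)) {
            throw new IllegalArgumentException("p must be in range [0, 1], got " + p);
        }
        // nextDouble() is uniform on [0, 1), so p = 0 is never true and p = 1 is always true.
        return ThreadLocalRandom.current().nextDouble() < p;
    }

    public static void main(String[] args) {
        int hits = 0;
        int trials = 100_000;
        for (int i = 0; i < trials; i++) {
            if (genRandBoolean(0.25)) {
                hits++;
            }
        }
        // The observed fraction should be close to 0.25, but it varies from run to run.
        System.out.println((double) hits / trials);
    }
}
```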
-