Package com.jml.core

Class Stats

java.lang.Object
com.jml.core.Stats

public class Stats extends Object
The Stats class is a utility class for computing statistical information about datasets.
  • Method Summary

    Modifier and Type
    Method
    Description
    static double
    correlation(double[] y, double[] y_pred)
    Computes the r value, or correlation coefficient, between two sets of data.
    static double
    determination(double[] y, double[] y_pred)
    Computes the r² value, or coefficient of determination, between two sets of data.
    static boolean
    genRandBoolean(double p)
    Generates a random boolean with a specified probability of being true.
    static double
    max(double... data)
    Finds the maximum value in a dataset.
    static double
    mean(double... data)
    Computes the arithmetic mean.
    static double
    median(double... data)
    Computes the median.
    static double
    min(double... data)
    Finds the minimum value in a dataset.
    static int
    minIndex(double[] data)
    Finds the index of the minimum value in an array.
    static int[]
    minIndices(double[] data, int k)
    Finds the indices of the k smallest values in an array.
    static double
    mode(double... data)
    Computes the mode of a dataset.
    static double
    round(double value, int decimals)
    Rounds a value to the specified number of decimal places.
    static double
    sse(double[] y, double[] y_pred)
    Computes the sum of squared differences between two datasets.
    static double
    sst(double... y)
    Computes the total sum of squares of a dataset.
    static double
    std(double... data)
    Computes the standard deviation of the dataset.
    static double
    variance(double... data)
    Computes the variance of the dataset.

    Methods inherited from class java.lang.Object

    equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Method Details

    • round

      public static double round(double value, int decimals)
      Rounds a value to the specified number of decimal places.
    • mean

      public static double mean(double... data)
      Computes the arithmetic mean.
      Parameters:
      data - Dataset to compute mean of.
      Returns:
      The arithmetic mean of the dataset.
    • median

      public static double median(double... data)
      Computes the median.
      Parameters:
      data - Dataset to compute median of.
      Returns:
      The median of the dataset.
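      The documentation does not say how an even-length dataset is handled. A minimal sketch of the common convention (average of the two middle values) is shown below; the class name MedianSketch and its behavior are assumptions for illustration, not the library's implementation.

```java
import java.util.Arrays;

// Hypothetical stand-alone sketch; not the implementation used by com.jml.core.Stats.
class MedianSketch {
    static double median(double... data) {
        if (data.length == 0) {
            throw new IllegalArgumentException("data must not be empty");
        }
        double[] sorted = data.clone();   // leave the caller's array untouched
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        // Odd length: the middle element. Even length: mean of the two middle elements.
        return (sorted.length % 2 == 1)
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    public static void main(String[] args) {
        System.out.println(median(5, 1, 3));
        System.out.println(median(4, 1, 3, 2));
    }
}
```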
    • mode

      public static double mode(double... data)
      Computes the mode of a dataset.
      Parameters:
      data - Dataset to compute the mode of.
      Returns:
      The mode of the dataset.
    • variance

      public static double variance(double... data)
      Computes the variance of the dataset. This is similar to the mean squared error, except that the sum of squares total (sst) is divided by (n-1), where n is the number of observations in the dataset.
      Parameters:
      data - Dataset of interest.
      Returns:
      The variance of the data.
    • std

      public static double std(double... data)
      Computes the standard deviation of the dataset. It is assumed that data contains only a sample of observations from the true population.
      Parameters:
      data - Dataset of interest.
      Returns:
      The standard deviation of the data.
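      The (n-1) divisor described for variance, together with the sample assumption stated for std, can be sketched as follows. This is an illustration only: the class name VarianceSketch is hypothetical, and taking std as the square root of this variance is an assumption rather than something the documentation confirms.

```java
// Hypothetical stand-alone sketch of a sample variance and standard deviation.
class VarianceSketch {
    // Sum of squared deviations from the mean, divided by (n - 1).
    static double variance(double... data) {
        int n = data.length;
        if (n < 2) {
            throw new IllegalArgumentException("need at least two observations");
        }
        double mean = 0.0;
        for (double x : data) {
            mean += x;
        }
        mean /= n;
        double sumSq = 0.0;
        for (double x : data) {
            sumSq += (x - mean) * (x - mean);
        }
        return sumSq / (n - 1);
    }

    // Assumption: standard deviation taken as the square root of the sample variance.
    static double std(double... data) {
        return Math.sqrt(variance(data));
    }

    public static void main(String[] args) {
        System.out.println(variance(2, 4, 4, 4, 5, 5, 7, 9));
        System.out.println(std(2, 4, 4, 4, 5, 5, 7, 9));
    }
}
```

      For the eight values in main, the mean is 5 and the squared deviations sum to 32, so the sketch returns 32/7 for the variance.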
    • determination

      public static double determination(double[] y, double[] y_pred)
      Computes the r² value, or coefficient of determination, between two sets of data.
      Parameters:
      y - Dataset one.
      y_pred - Dataset two.
      Returns:
      The coefficient of determination for the given datasets.
    • correlation

      public static double correlation(double[] y, double[] y_pred)
      Computes the r value, or correlation coefficient, between two sets of data.
      Parameters:
      y - Dataset one.
      y_pred - Dataset two.
      Returns:
      The correlation coefficient for the given datasets.
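      The documentation does not state which formula is used for the r value. Assuming it is the Pearson correlation coefficient, a minimal sketch looks like this; the class name CorrelationSketch and the input checks are illustrative assumptions.

```java
// Hypothetical stand-alone sketch of the Pearson correlation coefficient.
class CorrelationSketch {
    static double correlation(double[] y, double[] yPred) {
        if (y.length != yPred.length || y.length == 0) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        int n = y.length;
        double meanY = 0.0;
        double meanP = 0.0;
        for (int i = 0; i < n; i++) {
            meanY += y[i];
            meanP += yPred[i];
        }
        meanY /= n;
        meanP /= n;

        double sxy = 0.0;   // sum of products of deviations
        double sxx = 0.0;   // sum of squared deviations of y
        double syy = 0.0;   // sum of squared deviations of yPred
        for (int i = 0; i < n; i++) {
            double dy = y[i] - meanY;
            double dp = yPred[i] - meanP;
            sxy += dy * dp;
            sxx += dy * dy;
            syy += dp * dp;
        }
        // Yields NaN when either dataset is constant (zero variance).
        return sxy / Math.sqrt(sxx * syy);
    }

    public static void main(String[] args) {
        System.out.println(correlation(new double[]{1, 2, 3}, new double[]{2, 4, 6}));
    }
}
```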
    • sse

      public static double sse(double[] y, double[] y_pred)
      Computes the sum of squared differences between two datasets.
      Parameters:
      y - Dataset one.
      y_pred - Dataset two.
      Returns:
      The sum of squared differences between the two datasets.
    • sst

      public static double sst(double... y)
      Computes the total sum of squares of a dataset.
      Parameters:
      y - Dataset in question.
      Returns:
      The total sum of squares.
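      To show how sse, sst and the coefficient of determination usually fit together, the sketch below assumes that sst is taken about the mean of y and that r² is computed as 1 - sse/sst. Neither assumption is confirmed by this documentation, and the class name SumOfSquaresSketch is hypothetical.

```java
// Hypothetical stand-alone sketch relating sse, sst and r^2.
class SumOfSquaresSketch {
    // Sum of squared differences between observed and predicted values.
    static double sse(double[] y, double[] yPred) {
        if (y.length != yPred.length) {
            throw new IllegalArgumentException("arrays must have equal length");
        }
        double sum = 0.0;
        for (int i = 0; i < y.length; i++) {
            double diff = y[i] - yPred[i];
            sum += diff * diff;
        }
        return sum;
    }

    // Assumption: sum of squared deviations of y from its own mean.
    static double sst(double... y) {
        double mean = 0.0;
        for (double v : y) {
            mean += v;
        }
        mean /= y.length;
        double sum = 0.0;
        for (double v : y) {
            sum += (v - mean) * (v - mean);
        }
        return sum;
    }

    // Assumption: r^2 = 1 - sse / sst. Not finite when y is constant (sst is 0).
    static double determination(double[] y, double[] yPred) {
        return 1.0 - sse(y, yPred) / sst(y);
    }

    public static void main(String[] args) {
        double[] y = {1, 2, 3, 4};
        double[] yPred = {1.1, 1.9, 3.2, 3.8};
        System.out.println(sse(y, yPred));
        System.out.println(sst(y));
        System.out.println(determination(y, yPred));
    }
}
```

      With the values in main, sse is approximately 0.1 and sst is 5, giving an r² of about 0.98 (up to floating-point rounding).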
    • min

      public static double min(double... data)
      Finds the minimum value in a dataset.
      Parameters:
      data - Dataset of interest.
      Returns:
      The minimum value in data.
    • minIndex

      public static int minIndex(double[] data)
      Finds the index of the minimum value in an array.
      Parameters:
      data - The array to search for the minimum value.
      Returns:
      The index of the entry with the smallest value.
    • minIndices

      public static int[] minIndices(double[] data, int k)
      Finds the indices of the k smallest values in an array.
      Parameters:
      data - The array to search for the smallest values.
      k - The number of smallest values to find.
      Returns:
      An array of length k containing the indices of the k smallest values.
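      One straightforward way to obtain such indices is to sort the index positions by their values and keep the first k. The sketch below takes that approach; the class name MinIndicesSketch, the ascending order of the result, the tie-breaking and the bounds check on k are all assumptions, since the documentation does not specify them.

```java
import java.util.Arrays;

// Hypothetical stand-alone sketch; ordering and tie-breaking are assumptions.
class MinIndicesSketch {
    static int[] minIndices(double[] data, int k) {
        if (k < 0 || k > data.length) {
            throw new IllegalArgumentException("k must be between 0 and data.length");
        }
        // Boxed indices so they can be sorted with a custom comparator.
        Integer[] order = new Integer[data.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Sort indices by the value they point to. The object sort is stable,
        // so equal values keep their original (lower index first) order.
        Arrays.sort(order, (a, b) -> Double.compare(data[a], data[b]));

        int[] result = new int[k];
        for (int i = 0; i < k; i++) {
            result[i] = order[i];
        }
        return result;
    }

    public static void main(String[] args) {
        double[] data = {3.0, 1.0, 4.0, 1.5, 0.5};
        System.out.println(Arrays.toString(minIndices(data, 2)));
    }
}
```

      Sorting all indices costs O(n log n); for small k on large arrays a bounded heap or partial selection would do less work, at the cost of more code.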
    • max

      public static double max(double... data)
      Finds the maximum value in a dataset.
      Parameters:
      data - Dataset of interest.
      Returns:
      The maximum value in data.
    • genRandBoolean

      public static boolean genRandBoolean(double p)
      Generates a random boolean with a specified probability of being true.
      Parameters:
      p - Probability of being true. Must be in range [0, 1].
      Returns:
      A random boolean that is true with probability p.
      Throws:
      IllegalArgumentException - if p is not in range [0, 1].
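      The contract above (true with probability p, IllegalArgumentException outside [0, 1]) can be sketched by comparing a uniform draw against p. The class name RandBooleanSketch and the use of java.util.Random are assumptions; the library's actual random source is not documented here.

```java
import java.util.Random;

// Hypothetical stand-alone sketch of a Bernoulli draw with range checking.
class RandBooleanSketch {
    private static final Random RNG = new Random();

    static boolean genRandBoolean(double p) {
        // The negated form also rejects NaN, which fails every comparison.
        if (!(p >= 0.0 && p <= 1.0)) {
            throw new IllegalArgumentException("p must be in range [0, 1], got " + p);
        }
        // nextDouble() is uniform on [0, 1), so p = 0 never yields true
        // and p = 1 always yields true.
        return RNG.nextDouble() < p;
    }

    public static void main(String[] args) {
        int trues = 0;
        for (int i = 0; i < 10_000; i++) {
            if (genRandBoolean(0.3)) {
                trues++;
            }
        }
        System.out.println("fraction true: " + trues / 10_000.0);
    }
}
```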