public abstract class HybridBaseFunction extends Object implements CriterionFunction
This CriterionFunction implements the basic functionality needed for a majority of the hybrid functions available. It works by first gathering metadata for the data set, such as the cluster sizes, initial cluster assignments, and initial centroids. It then implements update and requires subclasses to implement functions for determining the change in the criterion score due to moving a data point. Hybrid CriterionFunctions use an internal and an external criterion function to balance both objectives and produce a well-balanced clustering.

Modifier and Type | Field and Description |
---|---|
protected int[] | assignments: The cluster assignment of each data point. |
protected DoubleVector[] | centroids: The centroids representing each cluster. |
protected int[] | clusterSizes: The number of data points found in each cluster. |
protected DoubleVector | completeCentroid: The summation vector of all data points. |
protected double[] | e1Costs: The cost computed for each cluster. |
protected double[] | i1Costs: The cost computed for each cluster. |
protected List<DoubleVector> | matrix: The Matrix holding the data points. |
protected double[] | simToComplete: The distance of each centroid to completeCentroid. |
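
The metadata fields listed above (cluster sizes, centroids, and the summation vector of all points) can be derived from the data and the initial assignments in a single pass. The following is a minimal sketch of that bookkeeping, not the library's code: it uses plain `double[][]` rows in place of Matrix and DoubleVector, the class name `ClusterMetadataSketch` is hypothetical, and it assumes each centroid is the mean of its cluster's points, which this page does not confirm.

```java
import java.util.Arrays;

// Hypothetical stand-in for the bookkeeping described above; not the library's code.
final class ClusterMetadataSketch {
    final int[] assignments;          // cluster index of each data point
    final int[] clusterSizes;         // number of data points in each cluster
    final double[][] centroids;       // mean vector of each cluster (assumption)
    final double[] completeCentroid;  // summation vector of all data points

    ClusterMetadataSketch(double[][] data, int[] initialAssignments, int numClusters) {
        int dims = data[0].length;
        // Copy the initial assignments so the caller's array is treated as read only.
        assignments = Arrays.copyOf(initialAssignments, initialAssignments.length);
        clusterSizes = new int[numClusters];
        centroids = new double[numClusters][dims];
        completeCentroid = new double[dims];

        // Accumulate per-cluster sums and the overall sum in one pass.
        for (int i = 0; i < data.length; i++) {
            int c = assignments[i];
            clusterSizes[c]++;
            for (int d = 0; d < dims; d++) {
                centroids[c][d] += data[i][d];
                completeCentroid[d] += data[i][d];
            }
        }
        // Turn the per-cluster sums into means, skipping empty clusters.
        for (int c = 0; c < numClusters; c++) {
            if (clusterSizes[c] == 0) continue;
            for (int d = 0; d < dims; d++) {
                centroids[c][d] /= clusterSizes[c];
            }
        }
    }

    public static void main(String[] args) {
        double[][] data = {{1, 0}, {3, 0}, {0, 2}};
        ClusterMetadataSketch meta =
            new ClusterMetadataSketch(data, new int[] {0, 0, 1}, 2);
        System.out.println(Arrays.toString(meta.clusterSizes));     // [2, 1]
        System.out.println(Arrays.deepToString(meta.centroids));    // [[2.0, 0.0], [0.0, 2.0]]
        System.out.println(Arrays.toString(meta.completeCentroid)); // [4.0, 2.0]
    }
}
```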
Constructor and Description |
---|
HybridBaseFunction() |
Modifier and Type | Method and Description |
---|---|
int[] | assignments(): Returns the cluster assignment indices for each data point in the original matrix passed to setup(Matrix, int[], int). |
DoubleVector[] | centroids(): Returns the final set of centroids computed for the dataset passed to setup. |
int[] | clusterSizes(): Returns the number of data points assigned to each cluster. |
protected abstract BaseFunction | getExternalFunction(): Returns the external CriterionFunction. |
protected abstract BaseFunction | getInternalFunction(): Returns the internal CriterionFunction. |
double | score(): Returns the score computed by this CriterionFunction. |
void | setup(Matrix m, int[] initialAssignments, int numClusters): Creates the cluster centroids and any other metadata needed by this CriterionFunction. |
boolean | update(int currentVectorIndex): Updates the clustering assignment for the data point indexed by currentVectorIndex. |
Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface CriterionFunction: isMaximize
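
The methods summarized above suggest a simple calling pattern: call setup once, then call update for each data point until a full pass leaves every point in place, and finally read assignments and score. The sketch below illustrates that pattern under stated assumptions. `CriterionSketch`, `NearestMeanSketch`, and `RefinementLoopSketch` are hypothetical names, the data is a plain `double[]` of 1-D points rather than a Matrix, and the toy criterion (move a point to the cluster with the nearest mean) stands in for a real hybrid function. The return value of update follows the convention documented on this page: true when the point stays in its cluster.

```java
import java.util.Arrays;

// Hypothetical minimal interface mirroring the methods documented above.
interface CriterionSketch {
    void setup(double[] data, int[] initialAssignments, int numClusters);
    boolean update(int currentVectorIndex); // true if the point stays in its cluster
    int[] assignments();
    double score();
}

// Toy criterion over 1-D points: a point moves to the cluster whose mean is nearest.
final class NearestMeanSketch implements CriterionSketch {
    private double[] data;
    private int[] assignments;
    private int[] sizes;    // number of points per cluster
    private double[] sums;  // sum of the points in each cluster

    public void setup(double[] data, int[] initialAssignments, int numClusters) {
        this.data = data;
        this.assignments = initialAssignments.clone();
        this.sizes = new int[numClusters];
        this.sums = new double[numClusters];
        for (int i = 0; i < data.length; i++) {
            sizes[assignments[i]]++;
            sums[assignments[i]] += data[i];
        }
    }

    public boolean update(int i) {
        int current = assignments[i];
        if (sizes[current] == 1) return true; // never empty a cluster
        int best = current;
        double bestDist = Math.abs(data[i] - sums[current] / sizes[current]);
        for (int c = 0; c < sizes.length; c++) {
            if (sizes[c] == 0) continue; // an empty cluster has no mean
            double dist = Math.abs(data[i] - sums[c] / sizes[c]);
            if (dist < bestDist) {
                best = c;
                bestDist = dist;
            }
        }
        if (best == current) return true;
        // Move the point and keep the per-cluster bookkeeping consistent.
        sizes[current]--;
        sums[current] -= data[i];
        sizes[best]++;
        sums[best] += data[i];
        assignments[i] = best;
        return false;
    }

    public int[] assignments() {
        return assignments.clone();
    }

    // Sum of squared distances from each point to its cluster mean.
    public double score() {
        double total = 0;
        for (int i = 0; i < data.length; i++) {
            int c = assignments[i];
            double diff = data[i] - sums[c] / sizes[c];
            total += diff * diff;
        }
        return total;
    }
}

final class RefinementLoopSketch {
    // Repeats passes over the data until no point moves or maxPasses is reached.
    static int[] cluster(CriterionSketch f, double[] data, int[] init, int k, int maxPasses) {
        f.setup(data, init, k);
        for (int pass = 0; pass < maxPasses; pass++) {
            boolean allStayed = true;
            for (int i = 0; i < data.length; i++) {
                allStayed &= f.update(i); // non-short-circuit: update always runs
            }
            if (allStayed) break;
        }
        return f.assignments();
    }

    public static void main(String[] args) {
        CriterionSketch f = new NearestMeanSketch();
        double[] data = {0.0, 1.0, 10.0, 11.0};
        int[] result = cluster(f, data, new int[] {0, 1, 0, 1}, 2, 10);
        System.out.println(Arrays.toString(result)); // [0, 0, 1, 1]
        System.out.println(f.score());               // 1.0
    }
}
```

The pass limit is a safeguard added for this sketch; whether a given criterion function is guaranteed to converge depends on the function itself.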
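protected List<DoubleVector> matrix
The Matrix holding the data points.

protected int[] assignments
The cluster assignment of each data point.

protected DoubleVector[] centroids
The centroids representing each cluster.

protected int[] clusterSizes
The number of data points found in each cluster.

protected double[] e1Costs
The cost computed for each cluster.

protected double[] i1Costs
The cost computed for each cluster.

protected DoubleVector completeCentroid
The summation vector of all data points.

protected double[] simToComplete
The distance of each centroid to completeCentroid.

public void setup(Matrix m, int[] initialAssignments, int numClusters)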
Creates the cluster centroids and any other metadata needed by this CriterionFunction.
Specified by: setup in interface CriterionFunction
Parameters:
m - The Matrix holding data points. This will be used as read only.
initialAssignments - The cluster assignments for each data point in m. This is used as read only and discarded.

public boolean update(int currentVectorIndex)
Updates the clustering assignment for the data point indexed by currentVectorIndex. This returns true if the data point is left in the same cluster and false if it was relocated to another cluster.
Specified by: update in interface CriterionFunction
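
public int[] assignments()
Returns the cluster assignment indices for each data point in the original matrix passed to setup(Matrix, int[], int).
Specified by: assignments in interface CriterionFunction

public DoubleVector[] centroids()
Returns the final set of centroids computed for the dataset passed to setup.
Specified by: centroids in interface CriterionFunction

public int[] clusterSizes()
Returns the number of data points assigned to each cluster.
Specified by: clusterSizes in interface CriterionFunction

public double score()
Returns the score computed by this CriterionFunction.
Specified by: score in interface CriterionFunction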
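
The class description says that a hybrid function balances an internal and an external criterion, and the two abstract accessors let each subclass choose which pair to use. This page does not state how the two scores are combined, so the sketch below simply assumes a ratio of internal to external score, purely for illustration. `ScoreSource`, `HybridSketch`, and the `of` factory are hypothetical names; the real accessors return BaseFunction.

```java
// Hypothetical stand-in for a criterion function that exposes a score.
interface ScoreSource {
    double score();
}

// Template-method sketch: the base class owns score(), subclasses supply the parts.
abstract class HybridSketch {
    protected abstract ScoreSource getInternalFunction();
    protected abstract ScoreSource getExternalFunction();

    // Assumption: the hybrid score is internal / external. The documented
    // class may combine the two scores differently.
    public double score() {
        return getInternalFunction().score() / getExternalFunction().score();
    }

    // Convenience factory for the sketch: wraps two score sources in a subclass.
    static HybridSketch of(ScoreSource internal, ScoreSource external) {
        return new HybridSketch() {
            @Override
            protected ScoreSource getInternalFunction() {
                return internal;
            }

            @Override
            protected ScoreSource getExternalFunction() {
                return external;
            }
        };
    }

    public static void main(String[] args) {
        HybridSketch hybrid = HybridSketch.of(() -> 6.0, () -> 2.0);
        System.out.println(hybrid.score()); // 3.0
    }
}
```

Keeping the combination in the base class means a subclass only has to name its two component functions, which matches the shape of the abstract methods listed on this page.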

protected abstract BaseFunction getInternalFunction()
Returns the internal CriterionFunction.

protected abstract BaseFunction getExternalFunction()
Returns the external CriterionFunction.

Copyright © 2012. All Rights Reserved.