Adding Points
In some cases it is useful to add new points to an existing dataset without re-running the projection method on the whole dataset. Methods exist for quickly adding new data points based on data that have already been projected. These methods work best when a certain amount of data has already been collected and projected using, for example, PCA or the Sammon map. Note that these methods are rarely needed in typical uses of Simbrain.
Nearest Neighbor Subspace Method
(1) Takes each new point and determines the three points in the current data set that are closest to it.
(2) Finds the projection of the new point into the two-dimensional subspace that contains the three nearest neighbors in the high-dimensional space.
(3) Uses the three nearest neighbors and their corresponding points in the low-dimensional dataset to find an affine map that approximates the full projection method (whichever one is currently being used).
(4) Applies the affine map to the new data point.
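The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Simbrain's actual implementation: the function name nn_subspace_project and its arguments are hypothetical, distances are assumed to be Euclidean, and the affine map is built by expressing the projected point in coordinates relative to the three neighbors and reusing those coordinates in the low-dimensional space.

```python
import math

def _sub(a, b):
    return [x - y for x, y in zip(a, b)]

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def nn_subspace_project(new_point, high_pts, low_pts):
    """Hypothetical sketch: place new_point using its three nearest neighbors.

    high_pts[i] is an already-projected point in the high-dimensional space,
    low_pts[i] is its low-dimensional image.
    """
    # (1) Find the three nearest neighbors (Euclidean distance assumed).
    order = sorted(range(len(high_pts)),
                   key=lambda i: math.dist(new_point, high_pts[i]))
    i0, i1, i2 = order[:3]
    p0, p1, p2 = high_pts[i0], high_pts[i1], high_pts[i2]
    q0, q1, q2 = low_pts[i0], low_pts[i1], low_pts[i2]

    # (2) Orthogonally project new_point onto the plane through p0, p1, p2,
    #     i.e. find a, b minimizing |(new_point - p0) - a*u - b*v|.
    u, v, w = _sub(p1, p0), _sub(p2, p0), _sub(new_point, p0)
    uu, uv, vv = _dot(u, u), _dot(u, v), _dot(v, v)
    det = uu * vv - uv * uv
    if abs(det) < 1e-12:
        raise ValueError("nearest neighbors are collinear; no plane is defined")
    a = (_dot(w, u) * vv - _dot(w, v) * uv) / det
    b = (_dot(w, v) * uu - _dot(w, u) * uv) / det

    # (3) + (4) The affine map sends p0, p1, p2 to q0, q1, q2; applying it to
    #     the projected point amounts to reusing a and b in the low-dim space.
    return [q0[k] + a * (q1[k] - q0[k]) + b * (q2[k] - q0[k])
            for k in range(len(q0))]
```

For example, with neighbors at (0,0,0), (1,0,0) and (0,1,0) whose images are (0,0), (1,0) and (0,1), the new point (0.25, 0.5, 0.3) is first dropped onto the plane of the neighbors and then mapped to approximately (0.25, 0.5).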
Triangulate
The Triangulate method takes each new point and determines which two points in the current dataset are closest to it. Then, if possible, it places the projected image of the new point so that its distances to the projected images of its two nearest neighbors are the same as they were in the high-dimensional space. When the point cannot be projected such that both distances are preserved, the projected image of the new point is placed on the line connecting the projected images of its two nearest neighbors. In this case the position of the new point's image on this line is determined by the relative sizes of the distances between the new point and its two nearest neighbors in the current dataset.
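The triangulation idea can be sketched as a circle-intersection problem in the plane. The code below is a minimal illustration, not Simbrain's own code: the function name triangulate is hypothetical, distances are assumed to be Euclidean, the choice between the two possible intersection points is arbitrary, and the fallback rule (splitting the connecting line in the ratio of the two distances) is one plausible reading of "relative sizes of the distances".

```python
import math

def triangulate(new_point, high_pts, low_pts):
    """Hypothetical sketch: place new_point using its two nearest neighbors."""
    # Find the two nearest neighbors in the high-dimensional space.
    order = sorted(range(len(high_pts)),
                   key=lambda i: math.dist(new_point, high_pts[i]))
    i1, i2 = order[:2]
    d1 = math.dist(new_point, high_pts[i1])
    d2 = math.dist(new_point, high_pts[i2])
    (x1, y1), (x2, y2) = low_pts[i1], low_pts[i2]
    gap = math.hypot(x2 - x1, y2 - y1)  # distance between the two images

    # Both distances can be preserved only if the circle of radius d1 around
    # the first image intersects the circle of radius d2 around the second.
    if gap > 0 and abs(d1 - d2) <= gap <= d1 + d2:
        t = (d1 * d1 - d2 * d2 + gap * gap) / (2 * gap)  # offset along the line
        h = math.sqrt(max(d1 * d1 - t * t, 0.0))         # offset off the line
        ex, ey = (x2 - x1) / gap, (y2 - y1) / gap        # unit direction
        # Assumption: of the two intersection points, take the one to the
        # left of the direction from the first image to the second.
        return [x1 + t * ex - h * ey, y1 + t * ey + h * ex]

    # Fallback (assumed rule): place the image on the connecting line, closer
    # to whichever neighbor was closer in the high-dimensional space.
    total = d1 + d2
    s = d1 / total if total > 0 else 0.5
    return [x1 + s * (x2 - x1), y1 + s * (y2 - y1)]
```

When the two circles intersect, both high-dimensional distances are reproduced exactly in the plane; otherwise only their ratio is kept.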
Refresh
In refresh mode, data points are not added using any special algorithm. Rather, when new data points arrive, the current projection algorithm is re-run on the entire updated dataset. If the current projection algorithm is iterative, like the Sammon map, then coordinate projection is used by default. PCA tends to be useful in refresh mode because it is relatively fast yet takes the entire dataset into account. For better results using coordinate projection, Automatically Select Most Variant Dimensions can be used.
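As a rough illustration of refreshing with coordinate projection, the sketch below re-projects the whole dataset each time a point is added. It assumes that selecting the most variant dimensions means choosing the two coordinates with the largest variance across the current data; the function names (most_variant_dims, coordinate_projection, refresh) are hypothetical and are not taken from Simbrain.

```python
from statistics import pvariance

def most_variant_dims(data, k=2):
    """Return the indices of the k dimensions with the largest variance."""
    n_dims = len(data[0])
    variances = [pvariance([row[d] for row in data]) for d in range(n_dims)]
    top = sorted(range(n_dims), key=lambda d: variances[d], reverse=True)[:k]
    return sorted(top)  # keep the original coordinate order

def coordinate_projection(data, dims):
    """Project each point by keeping only the chosen coordinates."""
    return [[row[d] for d in dims] for row in data]

def refresh(data, new_point):
    """Add new_point, then re-run the projection on the entire dataset."""
    updated = data + [list(new_point)]
    dims = most_variant_dims(updated)
    return updated, coordinate_projection(updated, dims)
```

Because the dimensions are re-selected on every refresh, a single new point can change which coordinates are displayed, and with them the positions of all previously projected points. This is the trade-off of refreshing rather than adding points incrementally.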