Using Data Elements

An IDataElement is the most basic grouping of data in the Comparator. An element is either a scalar or a list of values. The scalar types are null, boolean, integer, string literal, and real. The IDataElement.Type class enumerates the scalar types and the list type. Lists are indexed from 1 through n. An element of a list is again either a scalar or a list of values.

Element data is accessed by calling methods. Scalar data is read by a family of methods, one for each scalar type except null: getBooleanValue(), getIntegralValue(), getLiteralValue(), and getRealValue(). Each of these methods can also take an integer position argument to read list data. For example, getBooleanValue(3) would read the third element in a list as a boolean. In addition to the methods for reading scalar data in lists, the getListValue(int) method reads lists within lists. Reading multiple elements of a list at once is usually more efficient. Use one of the getSlice methods to do this when reading multiple values of the same type.
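The positional reads described above can be sketched with a small stand-in class. ReadSketchElement is hypothetical and is not the Comparator's implementation: the return types, the default values, and the scalar() and list() factory methods are all assumptions made for illustration, and the getSlice methods are not modeled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a data element; not the Comparator's own class.
final class ReadSketchElement {
    // Holds null, Boolean, Long, String, Double, or a List of ReadSketchElement.
    private final Object value;

    private ReadSketchElement(Object value) { this.value = value; }

    // Assumed factory methods, used only to build test data for this sketch.
    static ReadSketchElement scalar(Object v) { return new ReadSketchElement(v); }

    static ReadSketchElement list(ReadSketchElement... items) {
        return new ReadSketchElement(new ArrayList<>(Arrays.asList(items)));
    }

    // Scalar reads: return the value, or an assumed default on a type mismatch.
    boolean getBooleanValue() { return value instanceof Boolean ? (Boolean) value : false; }
    long getIntegralValue()   { return value instanceof Long ? (Long) value : 0L; }
    String getLiteralValue()  { return value instanceof String ? (String) value : ""; }
    double getRealValue()     { return value instanceof Double ? (Double) value : 0.0; }

    // Positional reads: positions run from 1 through n, not from 0.
    boolean getBooleanValue(int position) { return item(position).getBooleanValue(); }
    long getIntegralValue(int position)   { return item(position).getIntegralValue(); }
    String getLiteralValue(int position)  { return item(position).getLiteralValue(); }
    double getRealValue(int position)     { return item(position).getRealValue(); }

    // Reads a list nested inside this list; yields an empty list if the item is not a list.
    ReadSketchElement getListValue(int position) {
        ReadSketchElement e = item(position);
        return e.value instanceof List<?> ? e : list();
    }

    // Maps a 1-based position to the stored item; out-of-range positions yield a null element.
    private ReadSketchElement item(int position) {
        if (value instanceof List<?>) {
            List<?> items = (List<?>) value;
            if (position >= 1 && position <= items.size()) {
                return (ReadSketchElement) items.get(position - 1);
            }
        }
        return new ReadSketchElement(null);
    }

    public static void main(String[] args) {
        ReadSketchElement e = list(scalar(1.5), scalar("abc"), scalar(true),
                                   list(scalar(7L), scalar(8L)));
        System.out.println(e.getBooleanValue(3));                  // third item, read as a boolean
        System.out.println(e.getListValue(4).getIntegralValue(2)); // second item of the nested list
    }
}
```

In this sketch, position 3 maps to the third stored item, and a nested list is reached with getListValue before reading its scalars.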

Each of these access methods has an equivalent method that coerces the data to the requested type if necessary. The names of these methods all start with force, such as forceBooleanValue(), forceSlice, and so on. If a non-force method is used on data that is not of the requested type, it returns the default value for that type. A force method makes a best-effort attempt to interpret the data in a reasonable fashion.
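The difference between the two families of methods can be sketched as follows. CoercionSketch is a hypothetical helper, not part of the Comparator. The default value of 0 and the specific coercion rules (rounding reals, parsing strings, mapping true to 1) are illustrative guesses at what "best effort" might mean, not the documented behavior.

```java
// Hypothetical helper contrasting a strict read with a coercing read.
final class CoercionSketch {
    // Strict read: a value of the wrong type yields the default (assumed here to be 0).
    static long getIntegral(Object value) {
        return value instanceof Long ? (Long) value : 0L;
    }

    // Coercing read: tries to interpret other types before falling back to the default.
    static long forceIntegral(Object value) {
        if (value instanceof Long) {
            return (Long) value;
        }
        if (value instanceof Double) {
            return Math.round(((Double) value).doubleValue()); // assumed rule: round to nearest
        }
        if (value instanceof Boolean) {
            return ((Boolean) value) ? 1L : 0L;                // assumed rule: true is 1
        }
        if (value instanceof String) {
            try {
                return Long.parseLong(((String) value).trim()); // assumed rule: parse the text
            } catch (NumberFormatException e) {
                return 0L;                                      // unparseable text gives the default
            }
        }
        return 0L; // null and anything else
    }

    public static void main(String[] args) {
        System.out.println(getIntegral("42"));   // strict read of a string literal
        System.out.println(forceIntegral("42")); // coercing read of the same value
    }
}
```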

The setValue methods in the IEditableDataElement interface write scalar and list data to an element. Writing values works similarly to reading values. The difference is that you do not have to specify what type of data you are writing. The type is automatically inferred from the type of data you pass to the setValue method.
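One way such type inference can work in Java is through method overloading, where the compiler picks the setValue variant from the static type of the argument. The sketch below assumes that approach. WriteSketchElement, its Type enum, and getType() are hypothetical names, and only scalar writes are modeled. The actual signatures in IEditableDataElement may differ.

```java
// Hypothetical element whose stored type follows from the setValue overload chosen.
final class WriteSketchElement {
    // Assumed names for the scalar types; the list type is not modeled here.
    enum Type { NULL, BOOLEAN, INTEGER, LITERAL, REAL }

    private Object value;          // current scalar value
    private Type type = Type.NULL; // type recorded by the most recent write

    void setValue(boolean v) { value = v; type = Type.BOOLEAN; }
    void setValue(long v)    { value = v; type = Type.INTEGER; }
    void setValue(double v)  { value = v; type = Type.REAL; }
    void setValue(String v)  { value = v; type = (v == null) ? Type.NULL : Type.LITERAL; }

    Type getType() { return type; }
    Object getValue() { return value; }

    public static void main(String[] args) {
        WriteSketchElement e = new WriteSketchElement();
        e.setValue(3);     // an int argument resolves to the long overload
        System.out.println(e.getType());
        e.setValue(3.0);   // a double argument resolves to the double overload
        System.out.println(e.getType());
        e.setValue("3");   // a String argument resolves to the String overload
        System.out.println(e.getType());
    }
}
```

With overloads, the caller never names a type: writing 3 and writing 3.0 store an integer and a real respectively.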

Data Element Implementations

There are five types of data elements already defined by the Comparator. All of these types implement the IDataElement and IEditableDataElement interfaces. Four of the types support any type of data, while the fifth is a specialized type.

The default element type is SparseTreeDataElement, which stores data in a height-balanced binary tree. In some situations, other types are faster or use less memory than SparseTreeDataElement. PackedTreeDataElement also stores element data in a height-balanced binary tree. However, instead of storing the element data directly in the tree, each tree leaf is a small array of storage locations. When the data is dense, this uses less memory than SparseTreeDataElement. SparseTreeDataElement2 and PackedTreeDataElement2 are alternative tree implementations that are faster and use less memory when the number of entries in the element is large. The documentation for these classes gives estimates of when each class is the most efficient.
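The trade-off between the sparse and packed layouts can be illustrated with a rough sketch. It is not the Comparator's implementation: java.util.TreeMap (a red-black tree) stands in for the height-balanced tree, the leaf size of 8 is arbitrary, and the class and method names are hypothetical. Positions are assumed to start at 1.

```java
import java.util.TreeMap;

// Hypothetical comparison of a sparse tree layout and a packed tree layout.
final class StorageSketch {
    // Sparse layout: one tree entry per stored position.
    static final class Sparse {
        private final TreeMap<Integer, Object> entries = new TreeMap<>();

        void set(int position, Object v) { entries.put(position, v); }
        Object get(int position) { return entries.get(position); }
        int treeNodes() { return entries.size(); }
    }

    // Packed layout: each tree entry is a small array covering a run of adjacent positions.
    static final class Packed {
        private static final int LEAF_SIZE = 8; // arbitrary size chosen for this sketch
        private final TreeMap<Integer, Object[]> leaves = new TreeMap<>();

        void set(int position, Object v) {
            int index = position - 1; // positions are 1-based
            Object[] leaf = leaves.computeIfAbsent(index / LEAF_SIZE, k -> new Object[LEAF_SIZE]);
            leaf[index % LEAF_SIZE] = v;
        }

        Object get(int position) {
            int index = position - 1;
            Object[] leaf = leaves.get(index / LEAF_SIZE);
            return leaf == null ? null : leaf[index % LEAF_SIZE];
        }

        int treeNodes() { return leaves.size(); }
    }

    public static void main(String[] args) {
        Sparse sparse = new Sparse();
        Packed packed = new Packed();
        for (int i = 1; i <= 64; i++) { // dense data: every position is filled
            sparse.set(i, (double) i);
            packed.set(i, (double) i);
        }
        System.out.println("sparse tree nodes: " + sparse.treeNodes());
        System.out.println("packed tree nodes: " + packed.treeNodes());
    }
}
```

For dense data the packed layout needs far fewer tree nodes, so the per-node overhead is shared by several values. For widely scattered positions each leaf array is mostly empty, which is where the sparse layout is the better fit.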

PackedDoubleDataElement is a specialized type. It stores element data in an array, but can only store real values. This class is very efficient for dense, real-valued data that is altered infrequently.

If you are adding new types of elements for storage, extend either the DataElement class or the EditableDataElement class. These classes implement nearly all of the functionality required by the IDataElement and IEditableDataElement interfaces, as well as many other convenience methods.

Data Element Format

The format used to save data elements is the basis of the BioSPICE time series format. The general data format described in the BioSPICE time series format document is compatible with the default format for data elements. There is also an example parser in Java for the format.