Reading Data

Learn the process for reading data from a Synnax cluster.

This page walks you through the lifecycle of a read operation. If you’d like a practical guide on reading data using a client library, take a look at the respective pages for Python and TypeScript.

Range Inclusivity

It’s important to note that ranges are start-inclusive and end-exclusive. For example, if the desired range is from 1677433720970863400 to 1677433721870863400, a sample stamped exactly 1677433720970863400 is returned, while a sample stamped exactly 1677433721870863400 is not.
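The half-open rule can be sketched in plain Python. This is only an illustrative model and not the Synnax client API: the `select_range` helper, the 100 ms sample spacing, and the contents of `domain` are all assumptions made for the example.

```python
from bisect import bisect_left

def select_range(timestamps, start, end):
    """Return every timestamp t with start <= t < end from a sorted list."""
    lo = bisect_left(timestamps, start)  # first index with t >= start
    hi = bisect_left(timestamps, end)    # first index with t >= end (excluded)
    return timestamps[lo:hi]

START = 1677433720970863400
END = 1677433721870863400
STEP = 100_000_000  # hypothetical spacing of 100 ms between samples, in nanoseconds

# A made-up domain that begins two samples before START and runs past END.
domain = [START - 2 * STEP + i * STEP for i in range(14)]

subset = select_range(domain, START, END)
print(subset[0] == START)  # the sample at the start bound is included
print(END in subset)       # the sample at the end bound is excluded
```

Because the end bound is exclusive, two back-to-back ranges that share a boundary never return the same sample twice.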

Iterators

Underneath every read operation is an iterator. An iterator allows the caller to traverse through a range of data in a streaming fashion. Iterators are more complex to work with, so we recommend using a single request-response read when possible.

Just like a read, an iterator can be created by providing a range and a list of channels. Creating an iterator will not return any data, but will instead open a persistent connection to the cluster. You can then perform two categories of operation: seeking and reading.

Validity

Iterators maintain a validity state throughout their lifetime. This flag indicates whether the iterator is healthy and has more data to read. An iterator is considered invalid if it:

  1. Has accumulated an error. This typically happens when the iterator is unable to reach the cluster.
  2. Is not pointing at a valid sample. This can occur if the iterator:
    • Has exhausted its data, meaning the end of the range was reached during forward iteration, or the start of the range was reached during reverse iteration.
    • Has not yet been positioned with a seeking call.

An iterator that has been opened but not yet positioned is invalid. To position the iterator, you must call a seeking operation.
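The validity rules above can be condensed into a small model. This is a sketch under stated assumptions: the `ValidityState` class and its field names are hypothetical and are not part of any Synnax client library.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ValidityState:
    """Hypothetical model of the conditions that make an iterator invalid."""
    error: Optional[Exception] = None  # set once an error has accumulated
    positioned: bool = False           # False until a seeking call positions the iterator
    exhausted: bool = False            # True once iteration runs off either end of the range

    @property
    def valid(self) -> bool:
        # Valid only when no error is held and the iterator points at a sample.
        return self.error is None and self.positioned and not self.exhausted

fresh = ValidityState()  # opened but never positioned
positioned = ValidityState(positioned=True)
exhausted = ValidityState(positioned=True, exhausted=True)
failed = ValidityState(error=ConnectionError("cluster unreachable"), positioned=True)

for name, state in [("fresh", fresh), ("positioned", positioned),
                    ("exhausted", exhausted), ("failed", failed)]:
    print(name, state.valid)
```

Only the `positioned` case reports a valid state; the other three each match one of the invalid conditions listed above.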

Seeking

Seeking moves the iterator to a new position in the range. All seeking calls return a boolean indicating the validity state after executing the operation.

| Operation | Arguments | Description |
|---|---|---|
| seek-lt | timestamp | Seeks to the first sample whose timestamp is strictly less than the provided timestamp. |
| seek-ge | timestamp | Seeks to the first sample whose timestamp is greater than or equal to the provided timestamp. |
| seek-first | None | Seeks to the first sample in the range. |
| seek-last | None | Seeks to the last sample in the range. |

Seeking calls can be used to revalidate an iterator after it has been exhausted or positioned to an invalid location. In the case of an accumulated error, this call may or may not succeed.
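The four seeking operations can be modeled over a sorted list of timestamps. The `SeekModel` class below is a hypothetical in-memory sketch, not the cluster's implementation. It also assumes that seek-lt lands on the nearest sample below the given timestamp, which is one reading of the table above.

```python
from bisect import bisect_left

class SeekModel:
    """Hypothetical in-memory model of iterator seeking."""

    def __init__(self, timestamps):
        self.timestamps = sorted(timestamps)
        self.index = None  # None means "not pointing at a valid sample"

    def _set(self, i):
        # Accept the position only if it lands on a real sample.
        self.index = i if 0 <= i < len(self.timestamps) else None
        return self.index is not None  # every seek reports the validity state

    def seek_first(self):
        return self._set(0)

    def seek_last(self):
        return self._set(len(self.timestamps) - 1)

    def seek_ge(self, ts):
        # First sample with timestamp >= ts.
        return self._set(bisect_left(self.timestamps, ts))

    def seek_lt(self, ts):
        # Nearest sample with timestamp strictly < ts (assumed interpretation).
        return self._set(bisect_left(self.timestamps, ts) - 1)

it = SeekModel([10, 20, 30])
ge_ok = it.seek_ge(20)        # lands on 20
ge_index = it.index
lt_ok = it.seek_lt(20)        # lands on 10
lt_index = it.index
before_start = it.seek_lt(10) # nothing is below 10, so the iterator is invalid
past_end = it.seek_ge(31)     # nothing is at or above 31, so the iterator is invalid
revalidated = it.seek_first() # a later seek makes the iterator valid again
```

The last call shows the revalidation behavior described above: a seek that lands on a real sample restores validity after an earlier seek left the model without a position.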

Reading

There are two methods for reading from an iterator. The first reads a fixed number of samples, called the chunk size, which can be set when creating the iterator. Each call to next or prev without any arguments returns the next or previous chunk of data, respectively.

The second is by providing a timespan to read. This is useful for seeking to and reading specific sections of data. When using span-based iteration, be wary of reading too large a span, as this can severely degrade performance. Reading by a span is start-inclusive and end-exclusive, regardless of the direction of iteration.

As with seeking operations, all reads return a boolean indicating the validity state after executing the operation.

| Operation | Arguments | Description |
|---|---|---|
| next | timespan or nothing | If no timespan is provided, reads the next frame of data specified by the chunk size. If a timespan is provided, reads the next frame of data across the span. |
| prev | timespan or nothing | If no timespan is provided, reads the previous chunk of data specified by the chunk size. If a timespan is provided, reads the previous chunk of data whose timespan is less than or equal to the provided timespan. |
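The two reading modes can be compared with a small in-memory model. Everything here is an assumption made for illustration: the `ReadModel` class, its integer timestamps, and the choice to report a read as invalid when it yields no samples. The real iterator talks to the cluster and may behave differently at range boundaries.

```python
from bisect import bisect_left

class ReadModel:
    """Hypothetical model of chunk-based and span-based reads."""

    def __init__(self, timestamps, chunk_size):
        self.timestamps = sorted(timestamps)
        self.chunk_size = chunk_size
        # Start positioned at the first sample, as if seek-first had been called.
        self.pos = self.timestamps[0] if self.timestamps else 0
        self.frame = []

    def next(self, span=None):
        lo = bisect_left(self.timestamps, self.pos)
        if span is None:
            # Chunk mode: take up to chunk_size samples at or after pos.
            hi = min(lo + self.chunk_size, len(self.timestamps))
            self.frame = self.timestamps[lo:hi]
            if self.frame:
                self.pos = self.frame[-1] + 1  # step just past the last sample read
        else:
            # Span mode: take samples in [pos, pos + span).
            hi = bisect_left(self.timestamps, self.pos + span)
            self.frame = self.timestamps[lo:hi]
            self.pos += span
        return bool(self.frame)

    def prev(self, span=None):
        hi = bisect_left(self.timestamps, self.pos)
        if span is None:
            # Chunk mode: take up to chunk_size samples strictly before pos.
            lo = max(hi - self.chunk_size, 0)
            self.frame = self.timestamps[lo:hi]
            if self.frame:
                self.pos = self.frame[0]
        else:
            # Span mode: take samples in [pos - span, pos).
            lo = bisect_left(self.timestamps, self.pos - span)
            self.frame = self.timestamps[lo:hi]
            self.pos -= span
        return bool(self.frame)

ts = [0, 10, 20, 30, 40]

# Chunk-based: read forward two samples at a time until exhausted.
fwd = ReadModel(ts, chunk_size=2)
chunks = []
while fwd.next():
    chunks.append(fwd.frame)

# Span-based: each window is start-inclusive and end-exclusive in both directions.
spanned = ReadModel(ts, chunk_size=2)
spanned.next(20)
a = spanned.frame  # window [0, 20)
spanned.next(20)
b = spanned.frame  # window [20, 40)
spanned.prev(20)
c = spanned.frame  # window [20, 40) again, read in reverse
```

In the span-based run, the sample at 20 is excluded from the first window and included in the second, and the reverse read covers the same half-open window as the forward read before it.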

Accessing the Iterator Value

While read operations do fetch frames from the cluster, they do not return them directly. Instead, the current frame is kept in client-side memory, and can be accessed through the value method on the iterator.

This method returns a frame with the same format as in unary reads. If the iterator is invalid, calls to value have undefined behavior. If the iterator has been positioned but not yet read from, the frame will be empty.
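The split between reading and accessing the value leads to a loop in which each read returns only a boolean and the data is fetched separately. The `FrameIterator` class below is a hypothetical stand-in that holds its frame in memory; it is not the Synnax client and models only the behavior described above.

```python
class FrameIterator:
    """Hypothetical iterator that keeps the current frame client-side."""

    def __init__(self, samples, chunk_size):
        self._samples = list(samples)
        self._chunk_size = chunk_size
        self._cursor = None  # None until a seeking call positions the iterator
        self._frame = []

    def seek_first(self):
        self._cursor = 0
        self._frame = []  # positioned but not yet read from: the frame is empty
        return len(self._samples) > 0

    def next(self):
        if self._cursor is None:
            return False  # unpositioned iterators are invalid
        end = self._cursor + self._chunk_size
        self._frame = self._samples[self._cursor:end]
        self._cursor += len(self._frame)
        return len(self._frame) > 0  # reads return validity, not data

    def value(self):
        return self._frame

it = FrameIterator([1, 2, 3, 4, 5], chunk_size=2)
unpositioned = it.next()  # False: no seeking call has been made yet
it.seek_first()
before_read = it.value()  # empty frame: positioned but not yet read from

collected = []
while it.next():          # the boolean drives the loop
    collected.extend(it.value())  # the frame is accessed separately
print(collected)
```

Guarding every `value` access with the boolean from the preceding read avoids calling `value` on an invalid iterator, which the text above describes as undefined behavior.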