p5control.data.hdf5

This module provides an interface around an hdf5 file opened with h5py.File.

exception p5control.data.hdf5.HDF5FileInterfaceError

Bases: Exception

Exceptions concerning the HDF5FileInterface

class p5control.data.hdf5.HDF5FileInterface(filename: str, callback_queue: Queue | None = None)

Bases: object

Wrapper around an h5py.File with some added functionality for convenience. The file will be open after this class is initialized.

Parameters:

filename (str) – filename to use for storing the data

open()

Open file in mode ‘a’.

close()

Closes the file and removes all callbacks.

append(path: str, arr: ndarray | dict[str, list[Any]], **kwargs) None

Append arr to the dataset specified with path along the first axis. Creates the dataset if no dataset exists at path. The new dataset has the dimensions of the array, with the first axis being infinitely extendable. Does nothing if len(arr) == 0.

Any callback pointing to this path will be called and provided with the newly appended data, but converted to a numpy structured array.

Parameters:
  • path (str) – path in the hdf5 file

  • arr (np.ndarray or dict) – the data to append to the dataset

  • **kwargs – set as attributes of the dataset

Raises:
  • ValueError – if the secondary dimensions are not the same
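The ValueError condition can be pictured as a shape comparison on every axis except the first. The helper below, check_appendable, is a hypothetical illustration of that rule using only numpy. It is not the library's implementation, and the exact place where the error is raised is an assumption.

```python
import numpy as np


def check_appendable(existing_shape, arr):
    """Hypothetical helper: raise ValueError if arr cannot be appended along axis 0."""
    # Only the first axis may differ; all remaining axes must match exactly.
    if arr.shape[1:] != tuple(existing_shape[1:]):
        raise ValueError(
            f"secondary dimensions differ: dataset {tuple(existing_shape[1:])}, "
            f"array {arr.shape[1:]}"
        )


# A dataset of shape (5, 3) accepts any (n, 3) array ...
check_appendable((5, 3), np.zeros((2, 3)))

# ... but not an (n, 4) array.
try:
    check_appendable((5, 3), np.zeros((2, 4)))
    rejected = False
except ValueError:
    rejected = True
```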

Examples

The data provided can be either a numpy array or a dictionary. If you want to add a row with three columns, you can for example use:

>>> append('my_dataset', np.array([[1,2,3]]))

Note that the type of the dataset in the hdf5 file is fixed after creating it. In this case, the type would be int64. If you want to save floating point numbers, make sure to use float values such as 1.0.

For convenience, data can also be provided in the form of a dictionary. This is then automatically converted to a structured array with the keys of the dictionary determining the fields. For example:

>>> append("dic_dset", {"time": 1.2, "freq": 2.3})
>>> get_data("dic_dset")
array([(1.2, 2.3)], dtype=[('time', '<f8'), ('freq', '<f8')])

In addition, you can provide lists instead of single values to add multiple rows at the same time. Internally, Python’s zip is used, so rows are only appended until the shortest list is exhausted.

>>> append("dic_list", {"time": [1.0, 1.1, 1.2], "freq": [2.3, 0.6, 1.6]})
>>> dgw.get_data("dic_list")
array([(1. , 2.3), (1.1, 0.6), (1.2, 1.6)], dtype=[('time', '<f8'), ('freq', '<f8')])
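The consequence of relying on zip can be seen without any hdf5 file at all. The following plain Python sketch uses a made-up dictionary whose lists have unequal length; it only illustrates how zip pairs the values and does not call the library.

```python
# Hypothetical input with lists of unequal length.
data = {"time": [1.0, 1.1, 1.2], "freq": [2.3, 0.6]}

# zip stops at the shortest list, so the third "time" value is silently dropped.
rows = list(zip(*data.values()))
print(rows)  # [(1.0, 2.3), (1.1, 0.6)]
```

If every value must be stored, it is therefore worth checking that all lists have the same length before appending.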

Instead of numbers, you can also add a numpy array as a value. This is restricted to numpy arrays because the underlying h5py library needs to know the size each element will need, and therefore only types with known size are accepted (e.g. you can’t add a list with arbitrary length). Note that if the values of the dictionary are lists, the previous case applies.

>>> append("dic_arr", {"a": 1, "b": np.array([[2,3], [1,0]])})
>>> dgw.get_data("dic_arr")
array([(1, [[2, 3], [1, 0]])], dtype=[('a', '<i8'), ('b', '<i8', (2, 2))])

_convert_to_array(arr: ndarray | dict[str, list[Any]], dtype=None)

Converts a dictionary to a numpy structured array. If given an array, it does nothing and returns the array unchanged.

Parameters:
  • arr (Union[np.ndarray, dict[str, list[Any]]]) – the array or dictionary

  • dtype (optional, None) – if provided, the dictionary is converted to this dtype, otherwise the dtype is automatically determined from the dictionary
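As a rough illustration of what such a conversion involves, the sketch below builds a structured array from a dictionary with plain numpy. The function dict_to_structured is hypothetical and written for this example only; the actual method may infer dtypes differently.

```python
import numpy as np


def dict_to_structured(data, dtype=None):
    """Hypothetical sketch: convert a dict of scalars, lists or arrays to a structured array."""
    # Wrap single values (scalars or arrays) in a list so every entry is a sequence of rows.
    columns = {k: v if isinstance(v, list) else [v] for k, v in data.items()}

    if dtype is None:
        # Infer one field per key from the first element of each column.
        fields = []
        for key, values in columns.items():
            first = np.asarray(values[0])
            if first.shape:
                # Array values become sub-array fields with a fixed shape.
                fields.append((key, first.dtype, first.shape))
            else:
                fields.append((key, first.dtype))
        dtype = np.dtype(fields)

    # zip pairs the columns row by row and stops at the shortest column.
    rows = list(zip(*columns.values()))
    return np.array(rows, dtype=dtype)


single = dict_to_structured({"time": 1.2, "freq": 2.3})
multi = dict_to_structured({"time": [1.0, 1.1, 1.2], "freq": [2.3, 0.6, 1.6]})
nested = dict_to_structured({"a": 1, "b": np.array([[2, 3], [1, 0]])})
```

Here single holds one row, multi holds three rows, and the field "b" of nested carries a fixed (2, 2) sub-shape.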

_create_dataset(path: str, arr: ndarray, maxshape: Tuple | None = None, chunks=True)

Create hdf5 dataset at the specified path with data arr.

Defaults to a dataset where the first axis is infinitely extendable.

Parameters:
  • path (str) – path in the hdf5 file

  • arr (np.ndarray) – the array to save to the dataset

  • maxshape (Tuple, optional) – specify maxshape for the dataset, defaults to shape of arr with first axis infinitely extendable

  • chunks – whether to use chunks

Returns:

the created dataset

Return type:

h5py.Dataset
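For readers unfamiliar with extendable datasets, the following sketch shows the underlying h5py mechanism directly: a maxshape whose first entry is None, followed by a resize and a write to append rows. It assumes h5py and numpy are installed, uses an in-memory file with a made-up name, and is not taken from the implementation of this class.

```python
import h5py
import numpy as np

# In-memory file (core driver without backing store), so nothing is written to disk.
with h5py.File("sketch.h5", "w", driver="core", backing_store=False) as f:
    arr = np.array([[1.0, 2.0, 3.0]])

    # First axis unlimited (None), remaining axes fixed to the shape of arr.
    dset = f.create_dataset(
        "my_dataset", data=arr, maxshape=(None,) + arr.shape[1:], chunks=True
    )

    # Appending means growing the first axis and writing into the new rows.
    new = np.array([[4.0, 5.0, 6.0], [7.0, 8.0, 9.0]])
    old_len = dset.shape[0]
    dset.resize(old_len + len(new), axis=0)
    dset[old_len:] = new

    final_shape = dset.shape
    final_maxshape = dset.maxshape
    last_row = dset[2].tolist()
```

Resizable datasets in h5py have to be chunked, which is one reason a chunks option appears next to maxshape.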

get_data(path: str, indices: slice = (), field: str | None = None)

Return the data from a dataset, indexed first with field and then with indices:

if field:
    return dset[field][indices]
else:
    return dset[indices]

Parameters:
  • path (str) – path in the hdf5 file

  • indices (slice, optional = ()) – slice used to index the desired data, use “()” for all data

  • field (str, optional) – can be used to specify fields for a compound dataset

Raises:
  • HDF5FileInterfaceError – if the hdf5 file object is not a dataset

  • KeyError – if there exists no object at the path
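The field-then-slice order can be tried out on a plain numpy structured array, used here as a stand-in for a compound dataset. The function get_data_sketch is hypothetical and only mirrors the indexing logic described above; it does not open a file or raise the errors listed.

```python
import numpy as np

# Stand-in for a compound dataset with two fields.
dset = np.array(
    [(1.0, 2.3), (1.1, 0.6), (1.2, 1.6)],
    dtype=[("time", "<f8"), ("freq", "<f8")],
)


def get_data_sketch(dset, indices=(), field=None):
    """Hypothetical stand-in for get_data operating on an in-memory array."""
    if field:
        return dset[field][indices]
    return dset[indices]


# The default indices=() selects all rows.
all_rows = get_data_sketch(dset)

# Select the "freq" field first, then take every row from index 1 onward.
last_two_freq = get_data_sketch(dset, slice(1, None), "freq")
```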

get(path: str)

Return an object from the hdf5 file, specified with path. Use this to access its children or attributes. If you want to get data from a dataset, use get_data().

Parameters:

path (str) – path in the hdf5 file

get_keys(path: str)

Return the keys of the group specified with path. The keys are returned as a tuple of strings to minimize the number of rpyc requests.

Parameters:

path (str) – path in the hdf5 file