p5control.data.hdf5
This file provides an interface around an hdf5 file provided by h5py.File.
- exception p5control.data.hdf5.HDF5FileInterfaceError
Bases: Exception
Exceptions concerning the HDF5FileInterface.
- class p5control.data.hdf5.HDF5FileInterface(filename: str, callback_queue: Queue | None = None)
Bases: object
Wrapper around an h5py.File with some added functionality for convenience. The file will be open after this class is initialized.
- Parameters:
filename (str) – filename to use for storing the data
- open()
Open file in mode ‘a’.
- close()
Closes the file and removes all callbacks.
- append(path: str, arr: ndarray | dict[str, list[Any]], **kwargs) None
Append arr to the dataset specified with path along the first axis. Creates the dataset if no dataset at path exists. The new dataset has the dimensions of the array, with the first axis being infinitely extendable. Does nothing if len(arr) == 0.
Any callback pointing to this path will be called and provided with the newly appended data, converted to a numpy structured array.
- Parameters:
path (str) – path in the hdf5 file
arr (np.ndarray or dict) – the data to append to the dataset
**kwargs – set as attributes of the dataset
- Raises:
ValueError – if the secondary dimensions are not the same
Examples
The data provided can be either a numpy array or a dictionary. If you want to add a row with three columns, you can for example use:
>>> append('my_dataset', np.array([[1,2,3]]))
Note that the type of the dataset in the hdf5 file is fixed after creating it. In this case, the type would be int64. If you want to save floating point numbers, make sure to use float literals such as 1.0. For convenience, data can also be provided in the form of a dictionary. This is then automatically converted to a structured array with the keys of the dictionary determining the fields. For example:
>>> append("dic_dset", {"time": 1.2, "freq": 2.3})
>>> get_data("dic_dset")
array([(1.2, 2.3)], dtype=[('time', '<f8'), ('freq', '<f8')])
In addition, you can provide a list instead of a single value for each key to add multiple rows at the same time. Internally, Python's zip is used, so new rows are only appended until the shortest of the lists is exhausted.
>>> append("dic_list", {"time": [1.0, 1.1, 1.2], "freq": [2.3, 0.6, 1.6]})
>>> dgw.get_data("dic_list")
array([(1. , 2.3), (1.1, 0.6), (1.2, 1.6)], dtype=[('time', '<f8'), ('freq', '<f8')])
Instead of numbers, you can also add a numpy array as a value. This is restricted to numpy arrays because the underlying h5py library needs to know the size each element will need, and therefore only types with known size are accepted (e.g. you can't add a list with arbitrary length). Note that if the values of the dictionary are lists, the previous case applies.
>>> append("dic_arr", {"a": 1, "b": np.array([[2,3], [1,0]])})
>>> dgw.get_data("dic_arr")
array([(1, [[2, 3], [1, 0]])], dtype=[('a', '<i8'), ('b', '<i8', (2, 2))])
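The create-or-extend behaviour described for append() can be sketched directly with h5py. This is a minimal illustration under stated assumptions, not the code of HDF5FileInterface: append_rows is a hypothetical helper, the file is kept in memory via the core driver, and callbacks and attributes are left out.

```python
import h5py
import numpy as np


def append_rows(f, path, arr):
    """Hypothetical helper: append arr to the dataset at path along axis 0."""
    if len(arr) == 0:
        return  # nothing to append
    if path not in f:
        # First axis unlimited, remaining axes fixed to the shape of arr.
        f.create_dataset(
            path, data=arr, maxshape=(None,) + arr.shape[1:], chunks=True
        )
        return
    dset = f[path]
    if dset.shape[1:] != arr.shape[1:]:
        raise ValueError("secondary dimensions do not match")
    old_len = dset.shape[0]
    # Grow the first axis, then write the new rows into the added space.
    dset.resize(old_len + arr.shape[0], axis=0)
    dset[old_len:] = arr


# In-memory file: nothing is written to disk.
with h5py.File("sketch.h5", "w", driver="core", backing_store=False) as f:
    append_rows(f, "my_dataset", np.array([[1, 2, 3]]))
    append_rows(f, "my_dataset", np.array([[4, 5, 6], [7, 8, 9]]))
    appended = f["my_dataset"][()]
    stored_maxshape = f["my_dataset"].maxshape
```

The dtype is taken from the first array written, which is why later appends cannot change it.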
- _convert_to_array(arr: ndarray | dict[str, list[Any]], dtype=None)
Converts a dictionary to a numpy structured array. If given an array, it does nothing and returns the array unchanged.
- Parameters:
arr (Union[np.ndarray, dict[str, list[Any]]]) – the array or dictionary
dtype (optional, None) – if provided, the dictionary is converted to this dtype, otherwise the dtype is automatically determined from the dictionary
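One way such a dictionary-to-structured-array conversion could look is sketched below. The helper dict_to_structured is hypothetical and only mirrors the documented behaviour (scalars and numpy arrays give one row, lists give one row per zipped element); the actual _convert_to_array may differ, and the optional dtype argument is omitted here.

```python
import numpy as np


def dict_to_structured(data):
    """Hypothetical helper: build a structured array from a dict of columns."""
    columns = {}
    for key, value in data.items():
        # Scalars and numpy arrays become a single-row column; lists stay as is.
        columns[key] = value if isinstance(value, list) else [value]

    # Infer each field's dtype (and sub-array shape) from its first element.
    fields = []
    for key, values in columns.items():
        first = np.asarray(values[0])
        if first.shape == ():
            fields.append((key, first.dtype))
        else:
            fields.append((key, first.dtype, first.shape))

    # zip stops at the shortest column, so surplus elements are dropped.
    rows = list(zip(*columns.values()))
    return np.array(rows, dtype=fields)


single = dict_to_structured({"time": 1.2, "freq": 2.3})
truncated = dict_to_structured({"time": [1.0, 1.1, 1.2], "freq": [2.3, 0.6]})
nested = dict_to_structured({"a": 1, "b": np.array([[2, 3], [1, 0]])})
```

In this sketch an empty list is not handled, since the dtype is inferred from the first element of each column.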
- _create_dataset(path: str, arr: ndarray, maxshape: Tuple | None = None, chunks=True)
Create hdf5 dataset at the specified path with data arr.
Defaults to a dataset where the first axis is infinitely extendable.
- Parameters:
path (str) – path in the hdf5 file
arr (np.ndarray) – the array to save to the dataset
maxshape (Tuple, optional) – specify maxshape for the dataset, defaults to shape of arr with first axis infinitely extendable
chunks – whether to use chunks
- Returns:
the created dataset
- Return type:
h5py.Dataset
- get_data(path: str, indices: slice = (), field: str | None = None)
Return the data from a dataset, indexed with field and then slice:
if field:
    return dset[field][slice]
else:
    return dset[slice]
- Parameters:
path (str) – path in the hdf5 file
indices (slice, optional = ()) – slice to index the desired data, use “()” for all data
field (str, optional) – can be used to specify fields for a compound dataset
- Raises:
HDF5FileInterfaceError – if the hdf5 file object is not a dataset
KeyError – if there exists no object at the path
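The order of indexing (field first, then slice) can be tried out on a plain numpy structured array, which is used here as a stand-in for a compound dataset. The function read is a hypothetical illustration of the lookup, not the method itself, and it skips the error handling listed above.

```python
import numpy as np

# Stand-in for a compound dataset with two fields.
dset = np.array(
    [(1.0, 2.3), (1.1, 0.6), (1.2, 1.6)],
    dtype=[("time", "<f8"), ("freq", "<f8")],
)


def read(dset, indices=(), field=None):
    """Hypothetical stand-in: index with field first, then with the slice."""
    if field:
        return dset[field][indices]
    return dset[indices]


all_rows = read(dset)  # "()" selects all data
last_two_freq = read(dset, slice(-2, None), "freq")
first_row = read(dset, slice(0, 1))
```

Passing a slice object such as slice(-2, None) is equivalent to writing [-2:] in ordinary indexing.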
- get(path: str)
Return an object from the hdf5 file, specified with path. Use this to access its children or attributes. If you want to get data from a dataset, use get_data().
- Parameters:
path (str) – path in the hdf5 file
- get_keys(path: str)
Return the keys of the group specified with path. Returns a tuple containing the strings to minimize the rpyc requests.
- Parameters:
path (str) – path in the hdf5 file
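As a rough sketch of what collecting the keys into a tuple looks like with plain h5py: the group names below are made up for illustration, and the file lives in memory only. Materializing the keys as a tuple of strings means the caller receives plain values in one go instead of a lazy view onto the file.

```python
import h5py

# In-memory file: nothing is written to disk.
with h5py.File("sketch.h5", "w", driver="core", backing_store=False) as f:
    # Hypothetical layout; intermediate groups are created automatically.
    f.create_group("measurement/adc")
    f.create_group("measurement/dac")
    # Copy the key names out of the file as an immutable tuple of str.
    keys = tuple(f["measurement"].keys())
```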