CatalystNeuro

    Introducing zarr-matlab: Zarr v3, Natively in MATLAB

    Published on 2026-07-03 • 5 min read

    Introducing zarr-matlab: Zarr v3, Natively in MATLAB

    #Introducing zarr-matlab: Zarr v3, Natively in MATLAB

    Modern scientific data lives in the cloud, and the format it increasingly lives in is Zarr: chunked, compressed N-dimensional arrays designed so you can read the piece you need instead of downloading the file you don't. Zarr has first-class implementations in Python, Rust, C++, Julia, and JavaScript. MATLAB — still the daily driver for a huge share of working scientists — has been the conspicuous gap: MathWorks' built-in support covers only the older v2 format.

    Today we're releasing zarr-matlab, a free, open-source, pure-MATLAB implementation of the Zarr v3 specification.

    #What it does

    % Create a compressed, chunked array — and index it like any MATLAB matrix z = zarr.create("weather.zarr", [720 1440], "single", ... Path="temp", ChunkShape=[180 360], ... Codecs={zarr.codecs.ZstdCodec(5)}, FillValue=single(NaN)); z(:, :) = mydata; block = z(1:100, end-99:end); % reads only the chunks it touches % Open data written by zarr-python, xarray, or any v3 implementation g = zarr.open("https://example.org/climate.zarr"); % read-only over HTTP tile = g.item("precipitation").read([1 1], [512 512]);

    The feature list is the full v3 spec:

    • Every data type: integers, floats (including float16), complex, variable-length strings, and datetime64 — stored losslessly
    • Real compression: zstd, blosc (with shuffle/bitshuffle), gzip, and crc32c checksums, with prebuilt binaries for Linux, Windows, and Apple Silicon bundled in the File Exchange installer
    • Sharding: thousands of small chunks stored inside each cloud object, read individually via byte-range requests
    • Cloud-friendly stores: local directories, zip archives, in-memory, and read-only HTTP(S) — plus consolidated metadata, so opening a big hierarchy costs one request instead of hundreds
    • MATLAB-native ergonomics: ordinary 1-based indexing with end and logical masks, disp/tree displays, a .mltbx you can double-click

    #Interoperability is the contract

    The reason to store data in Zarr is that your collaborators, and your future self, can read it from any language. So zarr-matlab treats compatibility with zarr-python as its design contract: on every commit, CI has zarr-python write a 43-case matrix covering every data type × codec × sharding layout, verifies MATLAB reads every byte correctly, then verifies zarr-python can read everything MATLAB writes. Documentation is also executed by the test suite.

    #Where this is going

    zarr-matlab is a foundation as much as a product. Two projects are already building on it:

    • hdmf-zarr-matlab implements the hdmf-zarr conventions for links and object references — the path to reading and writing NWB files stored as Zarr from MATLAB.
    • matzarr attacks a problem every MATLAB lab with cloud data has: reading .mat files from object storage without downloading them. A .mat v7.3 file is secretly HDF5; matzarr indexes its chunk byte-ranges once into a small virtual-Zarr sidecar, after which MATLAB can open the file's entire structure in one request and stream exactly the slices it needs — cell arrays, structs and all.

    If you work with array data in MATLAB — neurophysiology, imaging, climate, anything — give it a try:

    CatalystNeuro builds open-source software for neuroscience data standardization and reuse. If your lab needs help adopting cloud-native data workflows — in Python or MATLAB — get in touch.