Add more data streaming documentation - ADRs, more hardware architecture. #135
base: master
Conversation
@FreddieAkeroyd @GRyall @danielmaclaren did you have thoughts on this PR? Or do we want a full meeting to talk about it?

Reviewing it is on my todo list.
Force-pushed (a0e3d30 to 9fd57d7): Remove mermaid_params configuration from Sphinx.
> ## Decision
>
> We are not going to support the old-style spectra files or any spectrum mapping/grouping in general
The spectra file could also be used to disable collecting from a noisy detector (using spectrum 0) - is this possible via a different route?
If it's a noisy detector we probably don't want it streamed at all - we probably want to just not map it (before it ever hits Kafka)?
Saving spectrum 0 to file was optional, so using spectrum 0 was a workaround for DAE3 to discard data, as it would always send data. Is it easy for a scientist to unmap a detector?
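To illustrate the "just not map it" idea being discussed, here is a minimal sketch of dropping data from unmapped or spectrum-0 detectors before anything is streamed. The mapping table, event shape, and function name are illustrative assumptions, not the real DAE/ISISICP spectra-file format.

```python
# Hypothetical sketch: discard events from unmapped/noisy detectors
# before they are ever produced to Kafka. Names and formats are
# assumptions for illustration only.

# detector id -> spectrum number; spectrum 0 means "discard" (the old
# DAE3 workaround), and an absent detector is treated the same way.
DETECTOR_TO_SPECTRUM = {
    1: 1,
    2: 2,
    3: 0,  # noisy detector mapped to spectrum 0: discarded
    # detector 4 deliberately unmapped: also discarded
}

def filter_events(events):
    """Keep only events whose detector maps to a non-zero spectrum."""
    kept = []
    for det_id, timestamp in events:
        spectrum = DETECTOR_TO_SPECTRUM.get(det_id, 0)  # unmapped -> 0
        if spectrum != 0:
            kept.append((spectrum, timestamp))
    return kept

events = [(1, 10.0), (3, 11.5), (2, 12.0), (4, 13.2)]
print(filter_events(events))  # only detectors 1 and 2 survive
```

The point of the sketch is that the filtering happens in the mapping step, so a noisy detector's data never reaches the streaming layer at all.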
This may be a file-writer concern, but there was a third table, detector.dat, that contained detector angle details; it was similar in idea to a Mantid instrument geometry. ISISICP could read detector.dat or a saved Mantid workspace to extract detector details to add to a NeXus file. Excitations used to adjust these files each cycle post-calibration, so just noting that there would ultimately need to be a way for scientists to adjust detector metadata for an experiment.
> - We are able to use Linux-centric technologies and tools, without needing to spend large amounts of time inventing workarounds for Windows.
> - The OS will be different. Developers will need _some_ understanding of Linux to maintain these servers.
>   * Mitigation: do as little as possible on the host, ideally limit it to just having a container engine installed via a configuration management tool such as Ansible.
> - Data-streaming infrastructure will not be on the NDH/NDX machine with the rest of IBEX. This is fine - EPICS is explicitly designed to run in a distributed way.
I would like to understand a bit more about what will be run on which machine (NDX or Linux) and their potential interactions in various failure/restart modes.
I can add a bit more detail, but I was planning on running everything on the top-level page in Docker on a Linux machine, probably something like Fedora CoreOS, which has Docker installed by default and is auto-updating, so it should be less sysadmin effort to keep it alive and patched.

In terms of failure modes, we'll use health checks etc. to make sure containers don't fall over. We can add monitoring tools to send alerts if they're continually restarting and so on.

Versus just running everything on the NDX, there is a network link that could fail, but this is going to be the Aruba switch, so it is fairly unlikely to fail, I'd say. If it does fail, the FPGAs probably can't stream anything anyway.
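As a sketch of the "alert if containers are continually restarting" idea: the dicts below loosely mimic the shape of a Docker Engine container-inspect response (`RestartCount`, `State.Health.Status`), but the helper, thresholds, and container names are illustrative assumptions, not part of any planned tooling.

```python
# Hypothetical monitoring helper: flag containers that look unhealthy
# or are stuck in a restart loop. The dict shape loosely follows the
# Docker Engine API's inspect output; thresholds are illustrative.

def needs_alert(state, max_restarts=3):
    """Return True if a container has restarted too often or reports unhealthy."""
    if state.get("RestartCount", 0) > max_restarts:
        return True
    health = state.get("Health", {}).get("Status", "none")
    return health == "unhealthy"

# Example snapshot, e.g. as gathered by a periodic cron/systemd timer job.
containers = {
    "kafka":       {"RestartCount": 0, "Health": {"Status": "healthy"}},
    "file-writer": {"RestartCount": 7, "Health": {"Status": "healthy"}},
    "forwarder":   {"RestartCount": 1, "Health": {"Status": "unhealthy"}},
}

alerts = [name for name, state in containers.items() if needs_alert(state)]
print(alerts)  # -> ['file-writer', 'forwarder']
```

In practice this check would sit behind whatever alerting route the group already uses; the sketch only shows the decision logic.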
Happy to join in this discussion too if useful
It was the NDX-Linux interaction behaviour in case of failures/restarts on either end, or other issues that might need resolving, that I was interested in. At the moment we run everything on the NDX and a clean restart is relatively simple.

Rather than Linux vs. just NDX, it was a comparison with having only Kafka on Linux and everything else on the NDX. If it is just Kafka on Linux (and no IOCs/other services), then that seemed a potentially simpler setup? Also, the option of changing the Kafka broker to another cluster or a central hall service as a backup/recovery option, or for testing etc., seemed easier?
Maybe - but we still need a Linux machine for the container stuff. I don't think it's a good idea to run things on Windows or in WSL. I think running in a VM on the NDH would be OK, but we can't do that with their current specifications.
Would the (first) Linux/Kafka machine be in the local (DAE) rack in HRPD-X?
Yes - I think the streaming software should be in the HRPD-X rack. If we get asked to run Kafka, it will probably be in that rack too.
Added considerations for Linux server specifications related to data rates, including disk write performance, network interface speeds, and memory requirements.
Expanded on the need for containerized data streaming software due to new detector technology and the limitations of WSL on Windows.
Added considerations for data streaming stack and container configuration.
Updated status to reflect pending discussions with HRPD-X parties.
Have added pros/cons/risks of each approach - is that any better, @FreddieAkeroyd?
This PR adds a few ADRs to the dev wiki under the data streaming section as we have had a very informative talk with DSG and some things have become much clearer.
I am writing this now for HRPD-X, so there are some ways we can cut corners, i.e. no histogramming and no spectrum mapping, but this will likely change in the future to support the other instruments that we'll roll data streaming out to.
Please leave any initial comments here, but we will have a proper meeting to discuss - I'll organise this soon.
Closes ISISComputingGroup/DataStreaming#15
Closes ISISComputingGroup/DataStreaming#12
Closes ISISComputingGroup/DataStreaming#7
Closes ISISComputingGroup/DataStreaming#3
ISISComputingGroup/DataStreaming#26
and ISISComputingGroup/DataStreaming#24 should be done before creating any other tickets - we have some of the answers now, but the actual operation of the topics for each of those tickets will be done by those two processes. We should create tickets at the end of prototyping to flesh them out.