A technical overview of EMAP can be found in Technical_overview_of_EMAP.md.
There are currently two data sources for EMAP:
- HL7 data
  - Persisted in the Immutable Data Store (IDS), from a copy of specific HL7 message streams.
  - The IDS is read by the HL7 reader (defined in the hl7-reader module), which converts each HL7 message into a source-agnostic format (the interchange message, defined in the emap-interchange module) and publishes it to a RabbitMQ queue for processing by the core processor.
- The Hoover service (defined in the hoover repository) polls hospital databases (Clarity and Caboodle) for data that has changed since the last poll. It converts the query outputs into interchange messages and publishes these to a RabbitMQ queue for processing by the core processor. We can't make the Hoover repository public because the SQL queries contain the intellectual property of the hospital patient record system, EPIC.
The core processor (defined in the core module) is responsible for processing the interchange messages and updating the emap database (defined in the emap-star module).
The core processor compares what is already known in the EMAP database with the data in the interchange message and updates the EMAP database accordingly. HL7 messages can arrive out of order, so the processor must be able to handle this.
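One common way to tolerate out-of-order messages is to compare the time carried by the message with the time the stored state became valid, and to ignore anything older. The sketch below illustrates that idea only. It is not the EMAP implementation: `OutOfOrderSketch`, `StoredLocation` and `applyIfNewer` are hypothetical names, and an in-memory map stands in for the database.

```java
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: names and logic are illustrative assumptions, not EMAP classes.
class OutOfOrderSketch {
    /** Current state held for one patient, with the time it became valid. */
    static final class StoredLocation {
        final String location;
        final Instant validFrom;

        StoredLocation(String location, Instant validFrom) {
            this.location = location;
            this.validFrom = validFrom;
        }
    }

    // In-memory stand-in for the database table.
    private final Map<String, StoredLocation> database = new HashMap<>();

    /**
     * Apply an update only if nothing is stored yet, or the message is newer than the stored state.
     * @return true if the stored state changed
     */
    boolean applyIfNewer(String patientId, String location, Instant messageTime) {
        StoredLocation existing = database.get(patientId);
        if (existing != null && !messageTime.isAfter(existing.validFrom)) {
            return false; // stale or duplicate message: keep what is already known
        }
        database.put(patientId, new StoredLocation(location, messageTime));
        return true;
    }

    String currentLocation(String patientId) {
        StoredLocation existing = database.get(patientId);
        return existing == null ? null : existing.location;
    }

    public static void main(String[] args) {
        OutOfOrderSketch processor = new OutOfOrderSketch();
        Instant earlier = Instant.parse("2024-01-01T10:00:00Z");
        Instant later = Instant.parse("2024-01-01T11:00:00Z");
        processor.applyIfNewer("patient-1", "Ward B", later);   // later message arrives first
        processor.applyIfNewer("patient-1", "Ward A", earlier); // earlier message is ignored
        System.out.println(processor.currentLocation("patient-1")); // prints "Ward B"
    }
}
```

Real clinical data usually needs more than a single timestamp comparison (for example, cancellations or corrections), so treat this as the simplest possible case.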
All the EMAP services are written in Java and use the Spring Boot framework. Setup instructions are found in the emap repo, with additional information for Hoover in the hoover repo.
A decision log for technical choices for a module can be found in its dev/design_choices.md file.
Each HL7 message can produce one or more interchange messages. Depending on the type of the message, different patterns are used in the codebase to process it.
As an example, the following diagram shows the processing of an ORM^O01 HL7 message type which can either result in
a single ConsultRequest interchange message or a list of LabOrderMsg interchange messages
(these have been simplified in the diagram).
Flow of the processing:
- All HL7 messages are processed by the `mainLoop` method of the `AppHl7` class, which delegates reading and processing of HL7 messages into interchange messages to the `IdsOperations` class, and then publishes to the queue using the `Publisher` class.
- The `IdsOperations` class is responsible for reading the HL7 messages from the IDS and delegates the processing of HL7. In this case `ORM` messages are an Order Message, so the message is routed to the `OrderAndResultService`.
- The `OrderAndResultService` can determine the source and type of the message, which can delegate to the `ConsultFactory` for a consultation request, or `LabFunnel` for a lab order.
  - If this is a consultation request, the `ConsultFactory` will create a `ConsultRequest` interchange message and return this up the call stack for publishing.
- The `LabFunnel` will use the `OrderCodingSystem` to route the HL7 message type to the correct `LabOrderBuilder` subclass.
  - Each builder extracts common elements from the HL7 message, using its parent class' methods to create one or more `LabOrderMsg` interchange messages.
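The "one message in, one or more interchange messages out" shape of this flow can be sketched as follows. This is a simplified illustration, not the hl7-reader code: the `OrmMessage` type, its `isConsult` flag and the `process` method are hypothetical, and the real service decides between a consultation request and a lab order from the content of the HL7 message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: simplified stand-ins, not the real hl7-reader classes.
class OrmRoutingSketch {
    /** Marker type for anything that can be published to the queue. */
    interface InterchangeMessage { }

    static final class ConsultRequest implements InterchangeMessage {
        final String consultCode;

        ConsultRequest(String consultCode) {
            this.consultCode = consultCode;
        }
    }

    static final class LabOrderMsg implements InterchangeMessage {
        final String testCode;

        LabOrderMsg(String testCode) {
            this.testCode = testCode;
        }
    }

    /** Stand-in for a parsed ORM^O01 message; the boolean flag is an assumption. */
    static final class OrmMessage {
        final boolean isConsult;
        final List<String> orderCodes;

        OrmMessage(boolean isConsult, List<String> orderCodes) {
            this.isConsult = isConsult;
            this.orderCodes = orderCodes;
        }
    }

    /** Plays the role of the order service: one consult request, or one lab order per code. */
    static List<InterchangeMessage> process(OrmMessage message) {
        List<InterchangeMessage> output = new ArrayList<>();
        if (message.isConsult) {
            output.add(new ConsultRequest(message.orderCodes.get(0)));
        } else {
            for (String code : message.orderCodes) {
                output.add(new LabOrderMsg(code));
            }
        }
        return output;
    }

    public static void main(String[] args) {
        OrmMessage labOrder = new OrmMessage(false, Arrays.asList("FBC", "UE"));
        System.out.println(process(labOrder).size() + " interchange messages"); // prints "2 interchange messages"
    }
}
```

Returning a list in both cases keeps the caller uniform: whatever comes back is published to the queue message by message.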
- For service testing, fake HL7 messages are manually created for each message type and stored in the `src/test/resources` directory.
- To reduce repetition of configuration and annotations, the `TestHl7MessageStream` class is extended in each test class.
  - This contains a `processSingleAdtMessage` method which takes the path of the fake HL7 message and processes it into an interchange message for assertions.
  - This method tests at the `IdsOperations` level, so the `Publisher` does not need to be mocked.
- Unless there are very tricky areas of logic, we don't unit test message processing. Instead, we set up test cases of HL7 -> interchange messages and check that these are processed as expected.
- To have certainty that our end-to-end testing from hl7-reader -> core -> emap-star database works correctly, test methods are added to the `TestHl7ParsingMatchesInterchangeFactoryOutput` test class, which takes in HL7 messages as an input and serialised interchange messages in yaml format as an expected output and asserts that they match.
  - These serialised interchange messages are then used in emap core testing.
- To ensure that all serialised interchange messages from the hl7-reader have an HL7 message that produces them, the `InterchangeMessageFactory` is created with Monitored Files. An exception will be thrown if any interchange messages have not been read while running the test class.
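The monitored-files idea can be illustrated with a small tracker that records which expected-output files were read and fails if any were never touched. This is a sketch of the technique under assumed names (`MonitoredFilesSketch`, `read`, `assertAllRead`), not the real `InterchangeMessageFactory` API, and the YAML loading is replaced by a placeholder string.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of file monitoring; not the real InterchangeMessageFactory.
class MonitoredFilesSketch {
    private final Set<String> expectedFiles;
    private final Set<String> accessedFiles = new HashSet<>();

    MonitoredFilesSketch(Collection<String> expectedFiles) {
        this.expectedFiles = new TreeSet<>(expectedFiles);
    }

    /** Record the access and return a placeholder for the deserialised message. */
    String read(String filename) {
        if (!expectedFiles.contains(filename)) {
            throw new IllegalArgumentException("Unknown file: " + filename);
        }
        accessedFiles.add(filename);
        return "contents of " + filename; // the real factory would parse YAML here
    }

    /** Call once all tests in the class have run. */
    void assertAllRead() {
        Set<String> unread = new TreeSet<>(expectedFiles);
        unread.removeAll(accessedFiles);
        if (!unread.isEmpty()) {
            throw new IllegalStateException("Interchange files never read: " + unread);
        }
    }

    public static void main(String[] args) {
        Set<String> files = new TreeSet<>();
        files.add("lab_order.yaml");   // hypothetical file names
        files.add("consult.yaml");
        MonitoredFilesSketch factory = new MonitoredFilesSketch(files);
        factory.read("lab_order.yaml");
        try {
            factory.assertAllRead();
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // reports that consult.yaml was never read
        }
    }
}
```

The benefit is that an expected-output file cannot silently go stale: if no HL7 test produces it, the test class fails.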
The Hoover service requires native queries to be written for Clarity and Caboodle, so local development uses Docker containers running SQL Server with fake data to test the queries. These are defined in test-files/clarity and test-files/caboodle.
Be sure to follow the local setup instructions before starting work on the service.
Each data type processed by Hoover is represented by its own class that implements the QueryStrategy interface, and
has a SQL file in the src/main/resources/sql directory.
An example is shown in the following diagram (simplified):
- The `Application` class creates a Spring Component that is an instance of the `Processor` class, initialised with a `QueryStrategy` instance, in this case `LocationMetadataQueryStrategy`.
  - This component is taken as an argument to the `runBatchProcessor` method, which delegates the scheduling of the database polling to the `BatchProcessor` class.
- The `Processor` class uses the `QueryStrategy` interface to get the previous progress for the data type, and query for any new data since the most recent progress.
- Defining the SQL query and how this data is processed is implemented by the `LocationMetadataQueryStrategy` class.
  - To allow for sqlfluff linting of SQL queries, the SQL is persisted to the `src/main/resources/sql` directory, and linked to by the `getQueryFilename` method.
  - The `getBatchOfInterchangeMessages` method queries the database and returns a list of Data Transfer Objects (DTOs), in this case `LocationMetadataDTO`.
  - In this case, the `LocationMetadataDTO` can build a `LocationMetadata` interchange message and these are returned up the call stack for publishing using the `Publisher` class.
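The strategy pattern described above can be sketched in plain Java. The method names `getQueryFilename` and `getBatchOfInterchangeMessages` come from the text, but their parameters and return types here are assumptions, as are the `pollOnce` method, the SQL filename, and the in-memory rows that stand in for a Clarity or Caboodle query. Scheduling and publishing are left out.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: signatures are assumptions, not the real Hoover interfaces.
class QueryStrategySketch {
    /** Reduced strategy interface: where the SQL lives and how to fetch one batch. */
    interface QueryStrategy<T> {
        String getQueryFilename();

        List<T> getBatchOfInterchangeMessages(Instant from, Instant until);
    }

    static final class LocationMetadata {
        final String location;
        final Instant updated;

        LocationMetadata(String location, Instant updated) {
            this.location = location;
            this.updated = updated;
        }
    }

    static final class LocationMetadataQueryStrategy implements QueryStrategy<LocationMetadata> {
        private final List<LocationMetadata> fakeRows; // stand-in for the hospital database

        LocationMetadataQueryStrategy(List<LocationMetadata> fakeRows) {
            this.fakeRows = fakeRows;
        }

        @Override
        public String getQueryFilename() {
            return "location_metadata.sql"; // hypothetical filename
        }

        @Override
        public List<LocationMetadata> getBatchOfInterchangeMessages(Instant from, Instant until) {
            List<LocationMetadata> batch = new ArrayList<>();
            for (LocationMetadata row : fakeRows) {
                // Keep rows changed in the window (from, until]
                if (row.updated.isAfter(from) && !row.updated.isAfter(until)) {
                    batch.add(row);
                }
            }
            return batch;
        }
    }

    /** Tracks progress for one data type and asks the strategy for anything newer. */
    static final class Processor<T> {
        private final QueryStrategy<T> strategy;
        private Instant progress;

        Processor(QueryStrategy<T> strategy, Instant startFrom) {
            this.strategy = strategy;
            this.progress = startFrom;
        }

        List<T> pollOnce(Instant now) {
            List<T> batch = strategy.getBatchOfInterchangeMessages(progress, now);
            progress = now; // the real service would publish the batch before saving progress
            return batch;
        }
    }

    public static void main(String[] args) {
        Instant start = Instant.parse("2024-01-01T00:00:00Z");
        List<LocationMetadata> rows = new ArrayList<>();
        rows.add(new LocationMetadata("Ward A", start.plusSeconds(60)));
        Processor<LocationMetadata> processor =
                new Processor<>(new LocationMetadataQueryStrategy(rows), start);
        System.out.println(processor.pollOnce(start.plusSeconds(3600)).size()); // prints "1"
    }
}
```

Because the `Processor` only depends on the interface, supporting a new data type means adding one strategy class and one SQL file rather than changing the polling logic.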
- For each data type, as a minimum you should carry out testing from a query within a time window and assert that the expected data is returned.
  - The expected data should include the serialised interchange messages in yaml format that will be read by the core processor during testing.
  - As the databases are static, creating specific test conditions within a time window is the easiest way to have specific tests.
- Each test class should implement the `TestQueryStrategy` interface, which adds in default tests once you have implemented the required methods for metadata about the test.
  - This also gives helper methods to be able to test a time window of data and assert that the batch of interchange messages matches the yaml files.
Message processing within core follows a general pattern of:
- Read message from queue and delegate to processor class
- Processor class uses one or more controllers to update or create the relevant entities from the interchange message
- Controllers use repositories to carry out business logic for what exists in the database, and what should be updated or created.
- Repositories use Spring Data JPA to interact with the database tables directly.
An example is shown in the following diagram (simplified):
- Each interchange message uses double dispatch so that the class of the interchange message can be known at runtime without checks. The `InformDbOperations` class implements the `EmapOperationMessageProcessor` interface to enable the double dispatch. The `processMessage` method delegates to a `Processor` class for each family of messages, in this case the `LabProcessor` is used.
- Each processor class has one or more `Controller` classes, which allow for the business logic of comparing the current and previous state from the database and making a decision on what the correct outcome is. The processor class uses these delegated classes to update or create entities which are used by other controllers.
- Each controller class can use other controllers (for complex data types which span 6+ tables), interact with tables using a `Cache` component or directly interact with the database using a `Repository` interface.
  - In this case, the `LabController` uses the `LabCache` component. This is because Spring Boot caching annotations are ignored when a method call is made by the same class. Breaking this out into a separate component allows for data type metadata to be cached to reduce the number of queries to the database and improve performance. The `LabCache` component itself can interact with `Repositories` because it is a Spring Component.
  - The `LabController` also uses the `LabOrderController`, which then uses specific `Repositories` to interact with the SQL tables in emap star.
- Most tables in emap star have an `Audit` equivalent. This allows for the history of each entity to be tracked.
  - We have defined an `@AuditTable` to generate Java classes for these entities that are then represented as tables in emap star. This can be found in the emap-star/emap-star-annotations maven module.
  - An Audit table must extend the `TemporalCore` class. Generics are used to link the entity class and its audit entity class.
  - The `RowState` class acts as a wrapper around an entity to help with determining if differences should be persisted to the database (and if it already exists, audited).
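Double dispatch, as used for interchange messages above, can be shown in a few lines. This is a generic sketch of the pattern rather than the EMAP classes: the `processSelf` method name, the `MessageProcessor` and `Operations` types and the string return values are assumptions chosen to keep the example small.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of double dispatch; type and method names are illustrative.
class DoubleDispatchSketch {
    /** One overload per concrete message class. */
    interface MessageProcessor {
        String processMessage(LabOrderMsg message);

        String processMessage(ConsultRequest message);
    }

    /** Every message knows how to hand itself to the processor. */
    interface InterchangeMessage {
        String processSelf(MessageProcessor processor);
    }

    static final class LabOrderMsg implements InterchangeMessage {
        @Override
        public String processSelf(MessageProcessor processor) {
            return processor.processMessage(this); // 'this' is statically a LabOrderMsg
        }
    }

    static final class ConsultRequest implements InterchangeMessage {
        @Override
        public String processSelf(MessageProcessor processor) {
            return processor.processMessage(this); // 'this' is statically a ConsultRequest
        }
    }

    /** Stand-in for the class that routes each message family to its processor. */
    static final class Operations implements MessageProcessor {
        @Override
        public String processMessage(LabOrderMsg message) {
            return "lab";
        }

        @Override
        public String processMessage(ConsultRequest message) {
            return "consult";
        }
    }

    public static void main(String[] args) {
        Operations operations = new Operations();
        List<InterchangeMessage> queue = Arrays.asList(new LabOrderMsg(), new ConsultRequest());
        for (InterchangeMessage message : queue) {
            // No instanceof checks: the message picks the right overload itself.
            System.out.println(message.processSelf(operations));
        }
    }
}
```

The first dispatch is the virtual call on the message, and the second is the overload chosen inside `processSelf`, where the compiler knows the concrete type. Adding a new message class then forces a new overload on the processor interface, so an unhandled type is a compile error rather than a runtime surprise.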
- All testing is carried out from interchange messages persisted in yaml format, as used by the source services.
- Test classes should extend the `MessageProcessingBase`, which provides class fields and configuration for testing.
  - The `messageFactory` field is used to create interchange messages from yaml files.
  - There are `processSingleMessage` and `processMessages` methods which take interchange messages and process the message(s) into an in-memory database for testing.
- For complex message flows such as Lab Orders (where there are several messages with different data to and from the lab system and EPIC), a test class has been created, which uses a class that extends the `OrderPermutationBase` class.
  - In this case the test class is `TestLabsProcessingUnorderedMessages` and the permutation class is the `LabsPermutationTestProducer`. This has test methods which take in a set of yaml filenames, which are processed in every possible non-repeating order permutation.
  - This ensures that the processing of the messages is not dependent on the order of the messages, and that the correct outcome is reached regardless of the order of the messages.
  - Admissions, discharges and transfers are also checked using this method.
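The permutation approach can be sketched without any test framework: generate every ordering of a set of filenames, run each ordering through the processor, and check that the final state is always the same. The code below is an illustration only. `PermutationSketch`, `permutations` and `processInOrder` are hypothetical names, the filenames are invented, and the toy processor simply keeps the lexicographically greatest filename as its final state.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of order-permutation testing; not the OrderPermutationBase API.
class PermutationSketch {
    /** All non-repeating orderings of the given items. */
    static <T> List<List<T>> permutations(List<T> items) {
        List<List<T>> result = new ArrayList<>();
        if (items.isEmpty()) {
            result.add(new ArrayList<>());
            return result;
        }
        for (int i = 0; i < items.size(); i++) {
            List<T> rest = new ArrayList<>(items);
            T head = rest.remove(i); // remove by index, leaving the other items
            for (List<T> tail : permutations(rest)) {
                List<T> permutation = new ArrayList<>();
                permutation.add(head);
                permutation.addAll(tail);
                result.add(permutation);
            }
        }
        return result;
    }

    /** Toy order-independent processor: the greatest filename wins, whatever the arrival order. */
    static String processInOrder(List<String> filenames) {
        String state = null;
        for (String filename : filenames) {
            if (state == null || filename.compareTo(state) > 0) {
                state = filename;
            }
        }
        return state;
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("01_order.yaml", "02_collected.yaml", "03_result.yaml");
        Set<String> finalStates = new HashSet<>();
        for (List<String> ordering : permutations(files)) {
            finalStates.add(processInOrder(ordering));
        }
        // A single distinct final state means the outcome did not depend on message order.
        System.out.println(finalStates);
    }
}
```

The number of orderings grows factorially (3 files give 6 orderings, 6 files give 720), so this style of test is best kept to small, carefully chosen sets of messages.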
The EMAP services are deployed using Docker containers, which can interact with each other using Docker Compose. To simplify the configuration and deployment of the containers, we use the emap-setup Python package. This also has functionality to deploy a validation run of EMAP, setting a specific start and end date for the data to be processed from all sources.
As all input data during development is created by the developer and this is clinical data, a validation run is always required before changes reach the running codebase. If this is an entirely new data type with no effect on existing data, then feature flags can be used to disable the processing of the messages in production. For a change to an existing data type, or to release into production, follow the validation SOP.
Deployment is carried out using the emap-setup tool. Follow the release procedure SOP.