...
Send an Alert Message and cancel it
Alerts timeout and escalation
Even if there is no technical limitation, Alert Messages are not designed to support responses. The purpose of an alert is to notify a set of recipients and, most of the time, to expect an ACK from them. In the context of an Alert, then, what determines the timeout/escalation is not the absence of a response, but the alertStatus of the Alert Message.
If, when the timeout period is reached, the status of the Alert Message is not "Acknowledged", the escalation mechanism should be triggered.
Design
Alerting Interface Design
The Alerting interface is similar to the Client interface already implemented in UCS. This interface is a one-way mechanism to execute a command on UCS. The available command in our NiFi implementation will be: updateAlertMessage.
An option for the implementation of this interface is to implement it as a specific command inside Client workflow.
A specific processor must be implemented to hold the logic related to this operation: UCSAlertingUpdateAlertMessage.
UCSAlerting Interface Design
...
If we decide to re-use the Client Interface workflow, then we need to introduce a new processor into this workflow: UCSAlertingUpdateAlertMessage. This processor will implement the required logic to update an Alert Message in UCS. Another processor that needs to be implemented in this workflow (no matter if we decide to re-use it for the Alerting interface or not) is UCSCancelMessage. This new processor will cancel an existing AlertMessage.
Two new output ports must be added to this workflow so other parts of UCS can be notified about the modification and cancellation of an Alert Message (i.e. UCS Alerting Interface workflow).
In order to allow clients to register UCSAlerting callback interfaces into UCS, a new processor must be implemented and added into Client Interface workflow: UCSRegisterAlertingCallback.
The previous image shows how Client Interface workflow will look after the modifications for the Alerting Interface are implemented.
...
In this approach, the two previously mentioned processors - UCSAlertingUpdateAlertMessage and UCSRegisterAlertingCallback - are implemented in a specific workflow called "Alerting Interface". The structure of this workflow is similar to the Client Interface.
As you can see in the previous image, the new workflow has a similar structure to Client Interface: a message is received via HTTP, parsed and processed according to its content. The 2 available "commands" in this workflow are: registerUCSAlertingInterface (or registerUCSAlertingCallback) and updateAlertMessage.
Info: In the screenshot, UCSUpdateAlertMessage should be named UCSAlertingUpdateAlertMessage.
UCS Alerting Interface
This workflow is in charge of notifying any previously registered UCSAlerting interface about alert-related events happening inside UCS. The supported events are: a new Alert Message is present in UCS, an Alert Message was modified inside UCS, and an Alert Message was canceled inside UCS. Each of these events is represented in the new workflow as an input port.
...
Just like UCS Client Interface workflow, this new workflow notifies any previously registered UCSAlerting callback about an alert-related event that happened inside UCS.
New Processors
...
UCSAlertingUpdateAlertMessage
This processor is the most important processor in the entire Alerting Interface implementation. This processor receives an AlertMessage as its input, and generates a FlowFile containing 2 AlertMessages.
The incoming AlertMessage MUST have a MessageId that is known inside UCS. The original Alert Message is then retrieved and a diff is performed between the old and the new version of the message.
Note: This implementation will only diff the alertStatus property of the messages.
UPDATE: The NiFi UCSAlertingUpdateAlertMessage processor will only take as input parameters the id and the new status of the message to be updated. Receiving an entire message as input made the mechanism we currently have in place to process commands complicated and error-prone. This implementation also makes the intention of the processor clearer. The mismatch between what the UCS specification defines for Alerting.updateMessage() and the way NiFi implements it will be hidden behind the concrete implementation of ucs-nifi-api.
The status of an Alert Message is processed as follows:
1.- If the alertStatus property of the original alert message in UCS is other than "Pending" or "Acknowledged", the processor will fail. The incoming flowfile is redirected to REL_STATUS_MISSMATCH (using UCSCreateException.routeFlowFileToException()).
...
2.- If the alertStatus property of the new alert message is other than "Acknowledged", the processor will fail. The incoming flowfile is redirected to REL_STATUS_MISSMATCH (using UCSCreateException.routeFlowFileToException()).
3.- If the alertStatus property of the new message is different from the alertStatus property of the original message, the property in the original message is updated. The message is updated in UCSControllerService by invoking updateMessage(), and the flowfile is routed to REL_SUCCESS.
4.- If the alertStatus property of the new message is equal to the alertStatus property of the original message, the flowfile will be directed to a REL_NO_UPDATE relationship.
If there is no error during its execution, the output of this processor is a flowfile containing a serialized version of both the original and the updated message.
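The four checks above can be sketched in plain Java, without any NiFi dependency. This is only an illustration under stated assumptions: step 1 is read as a check on the status stored in UCS and step 2 as a check on the incoming status. The class UpdateAlertSketch, its in-memory map (standing in for UCSControllerService) and the exception thrown for an unknown id are all hypothetical, and the serialized original/updated pair that the real processor emits is not modeled.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these types come from the UCS or NiFi code bases.
final class UpdateAlertSketch {
    enum AlertStatus { PENDING, ACKNOWLEDGED, RETRACTED, EXPIRED }

    // Names mirror the relationships mentioned in the text.
    enum Route { REL_STATUS_MISSMATCH, REL_SUCCESS, REL_NO_UPDATE }

    // In-memory stand-in for the messages held by UCSControllerService.
    private final Map<String, AlertStatus> store = new HashMap<>();

    void put(String messageId, AlertStatus status) {
        store.put(messageId, status);
    }

    AlertStatus statusOf(String messageId) {
        return store.get(messageId);
    }

    Route update(String messageId, AlertStatus newStatus) {
        AlertStatus original = store.get(messageId);
        if (original == null) {
            // Assumption: the text only says the id MUST be known; this sketch throws.
            throw new IllegalArgumentException("Unknown messageId: " + messageId);
        }
        // Step 1: the stored message must still be Pending or Acknowledged.
        if (original != AlertStatus.PENDING && original != AlertStatus.ACKNOWLEDGED) {
            return Route.REL_STATUS_MISSMATCH;
        }
        // Step 2: the only status a client may request is Acknowledged.
        if (newStatus != AlertStatus.ACKNOWLEDGED) {
            return Route.REL_STATUS_MISSMATCH;
        }
        // Step 3: the status changed, so the stored message is updated.
        if (newStatus != original) {
            store.put(messageId, newStatus);
            return Route.REL_SUCCESS;
        }
        // Step 4: nothing to change.
        return Route.REL_NO_UPDATE;
    }
}
```

With this reading, REL_NO_UPDATE is only reachable when an already acknowledged alert is acknowledged again.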
...
Similar to UCSGetUCSClientCallbacks, this processor retrieves any previously registered UCSAlerting callback URL from UCSControllerService and generates a flowfile for each of them.
UCSCancelMessage
This processor implements the expected behavior of the Client.cancelMessage() operation. It retrieves the messageId from the incoming FlowFile, retrieves the related message from UCSControllerService and performs the following operations:
1.- If there is no message in UCSControllerService with the specified id, the processor will route the incoming FlowFile to a REL_UNKNOWN_MESSAGE relationship.
2.- If the message in UCSControllerService with the specified id is not an AlertMessage, the processor will route the incoming FlowFile to a BAD_MESSAGE relationship.
3.- If the alertStatus property of the message in UCSControllerService with the specified id is already "Retracted", a new FlowFile containing the message will be directed to a REL_NO_UPDATE relationship.
4.- If the alertStatus property of the message in UCSControllerService with the specified id is "Pending", the alertStatus value is changed to "Retracted" and a new FlowFile with the serialized message will be directed to a REL_CANCELLED relationship.
5.- If the alertStatus property of the message in UCSControllerService with the specified id is other than "Pending" or "Retracted", the original FlowFile will be directed to a REL_INVALID_STATE relationship.
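The five cases above form a small decision chain, sketched here in plain Java. Everything in the block is hypothetical: CancelMessageSketch, its Message model and the in-memory map standing in for UCSControllerService are illustrations only, and the route names simply mirror the relationships named in the list.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these types come from the UCS or NiFi code bases.
final class CancelMessageSketch {
    enum AlertStatus { PENDING, ACKNOWLEDGED, RETRACTED, EXPIRED }

    // Names mirror the relationships mentioned in the text.
    enum Route { REL_UNKNOWN_MESSAGE, BAD_MESSAGE, REL_NO_UPDATE, REL_CANCELLED, REL_INVALID_STATE }

    // Reduced message model: alertStatus is only meaningful when alert is true.
    static final class Message {
        final boolean alert;
        AlertStatus alertStatus;

        Message(boolean alert, AlertStatus alertStatus) {
            this.alert = alert;
            this.alertStatus = alertStatus;
        }
    }

    // In-memory stand-in for the messages held by UCSControllerService.
    private final Map<String, Message> store = new HashMap<>();

    void put(String messageId, Message message) {
        store.put(messageId, message);
    }

    Route cancel(String messageId) {
        Message message = store.get(messageId);
        if (message == null) {
            return Route.REL_UNKNOWN_MESSAGE;   // case 1: unknown id
        }
        if (!message.alert) {
            return Route.BAD_MESSAGE;           // case 2: not an AlertMessage
        }
        if (message.alertStatus == AlertStatus.RETRACTED) {
            return Route.REL_NO_UPDATE;         // case 3: already cancelled
        }
        if (message.alertStatus == AlertStatus.PENDING) {
            message.alertStatus = AlertStatus.RETRACTED;
            return Route.REL_CANCELLED;         // case 4: cancel it now
        }
        return Route.REL_INVALID_STATE;         // case 5: any other status
    }
}
```

Cancelling the same Pending alert twice would therefore yield REL_CANCELLED first and REL_NO_UPDATE afterwards.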
Alerts timeout and escalation
Given that the current implementation ACKs an Alert Message as soon as the first recipient ACKs it, there is only one escalation possibility: onNoResponseAll.
When an Alert Message is persisted - UCSPersistMessage processor - a new cron job needs to be scheduled if the message fulfils the following requirements:
1.- The Message is an AlertMessage instance
2.- The respondBy property of the message's header is > 0
3.- The receiptNotification property of the message's header is true
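The three requirements above amount to a single predicate evaluated when the message is persisted. The sketch below is a hypothetical illustration: TimeoutSchedulingSketch and PersistedMessage are not part of UCS, and respondBy is assumed to be a numeric value whose unit the text does not state.

```java
// Hypothetical sketch: none of these types come from the UCS or NiFi code bases.
final class TimeoutSchedulingSketch {
    // Reduced view of a persisted message and the two header fields that matter here.
    static final class PersistedMessage {
        final boolean alertMessage;        // true when the message is an AlertMessage
        final long respondBy;              // assumption: numeric, unit not stated in the text
        final boolean receiptNotification;

        PersistedMessage(boolean alertMessage, long respondBy, boolean receiptNotification) {
            this.alertMessage = alertMessage;
            this.respondBy = respondBy;
            this.receiptNotification = receiptNotification;
        }
    }

    // A timeout job is scheduled only when all three requirements hold.
    static boolean needsTimeoutJob(PersistedMessage message) {
        return message.alertMessage
                && message.respondBy > 0
                && message.receiptNotification;
    }
}
```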
The cron job that gets scheduled implements the following logic:
1.- If the alertStatus property of the message is "Acknowledged", "Retracted" or "Expired", the cron job ends silently.
2.- If the alertStatus property of the message is "Pending", every Message present in the onNoResponseAll list is passed to UCSControllerService.notifyAboutMessageWithResponseTimeout(). The timeOutReason of the generated TimedOutMessage is NO_RESPONSES.
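The job logic above can be sketched as follows. This is a hypothetical illustration, not the actual job: the Consumer stands in for UCSControllerService.notifyAboutMessageWithResponseTimeout(), whose real signature is not given here, escalation messages are reduced to strings, and the TimedOutMessage class only carries the two fields needed for the example.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: none of these types come from the UCS or NiFi code bases.
final class TimeoutJobSketch {
    enum AlertStatus { PENDING, ACKNOWLEDGED, RETRACTED, EXPIRED }

    // Reduced stand-in for the TimedOutMessage mentioned in the text.
    static final class TimedOutMessage {
        final String message;
        final String timeOutReason;

        TimedOutMessage(String message, String timeOutReason) {
            this.message = message;
            this.timeOutReason = timeOutReason;
        }
    }

    // Returns how many escalation messages were handed to the notifier.
    static int run(AlertStatus current, List<String> onNoResponseAll,
                   Consumer<TimedOutMessage> notifier) {
        if (current != AlertStatus.PENDING) {
            return 0; // Acknowledged, Retracted or Expired: end silently
        }
        for (String escalation : onNoResponseAll) {
            notifier.accept(new TimedOutMessage(escalation, "NO_RESPONSES"));
        }
        return onNoResponseAll.size();
    }
}
```

Passing the notifier in as a parameter keeps the sketch independent of how the controller service is actually looked up.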