...
- Alerting Interface: alert clients can update the state of an alert.
- UCS Alerting Interface: client side interface to receive alert-related events from UCS server.
Info | ||
---|---|---|
| ||
The documentation is not clear on how the methods between the 2 interfaces are related. Alerting defines a single updateAlert() method but UCSAlerting defines 3: receiveAlertMessage(), updateAlertMessage() and cancelAlertMessage(). |
...
JH - the Alerting interface is provided by UCOM to allow alertable recipients to acknowledge the alert. The UCS Alerting interface must be provided by the alertable recipient to receive alerts (and updates/cancellations) - these are the methods UCOM will call in response to the sender issuing send/cancel/updateMessage requests (from 5.3) with an alert message type. EA - How does the Alerting interface allow alertable recipients to acknowledge the alert? The updateAlert() method in the Alerting interface receives only an AlertMessage as a parameter. After discussing this with the team, we came to the conclusion that Alerting.updateAlert() is too broad for this task. A new method needs to be introduced in this interface. |
According to the documentation, the following is a usual "Send Alert with ACK" scenario:
JH - the Alerting and UCS Alerting labels are wrong (note made to fix in spec)
The "send message" path is clear. A Sender uses Client.sendMessage() to send a new AlertMessage to UCS. UCS uses a specific adapter to send the message to one or more Receivers. (Note that our NiFi implementation will not have any adapter for the alerting part. All the messages will be kept inside UCS).
When the Receiver wants to ACK the message it uses UCSAlerting.updateAlertMessage() (Note: it is not clear which parameters the Receiver uses for the invocation. JH - just the alert message itself) to modify the status of the AlertMessage in UCS. UCS then notifies the clients using, according to the documentation, the UCSClient.handleResponse() method. This is where I think the documentation is wrong. The method that UCS should use to notify a client about a modification in an AlertMessage should be UCSAlerting.updateAlertMessage(). JH - no, this is not a method in 5.3, see above (EA - Then how is a client notified about the ACK? UCSClient.handleResponse()?? -> UCSClient.updateAlertMessage()). This analysis document will use UCSAlerting.updateAlertMessage() and not UCSClient.handleResponse() to notify clients about a modification in an AlertMessage.
...
public enum AlertStatus {
Acknowledged,
New,
Expired,
Retracted,
Pending
...
This method is invoked whenever an AlertMessage is updated in UCS. An AlertMessage is modified using Alerting.updateAlertMessage().
Info | ||
---|---|---|
| ||
The Client interface also defines a method called updateMessage, but the intention of this method is to update a message that is being prepared BEFORE it is actually sent to UCS (JH - not strictly. We allow updates even after a message has been sent by UCOM, however for it to be meaningful, it has to make sense in the modality used to transmit the message). In the current NiFi implementation, Client.updateMessage is not related in any way to UCSAlerting.updateAlertMessage. EA - (From 5.3: "[Client.updateMessage()]This operation is used to modify a message. Unsent messages are considered fully mutable except for the message ID and creation information. Once a message is sent, the message is immutable."). |
cancelAlertMessage
It is not clear from the documentation which action in UCS triggers this method. The Client interface defines a method called cancelMessage, but this method is used to cancel a message that was being prepared BEFORE it was actually sent to UCS.
A way this method could be triggered is when a client uses Alerting.updateAlertMessage() specifying a status of Retracted. We need to investigate further.
...
Send and receive an Alert Message
Send an Alert Message and
...
ACK
Send an Alert Message and update it
Warning | ||
---|---|---|
| ||
This operation is vaguely described in the specification. The implications of the modifications of a message are too complex to be included in this implementation. We are not going to support this operation. |
Send an Alert Message and cancel it
...
Alerts timeout and escalation
Even though there is no technical limitation, Alert Messages are not designed to support responses. The purpose of an alert is to notify a set of recipients and, most of the time, to expect an ACK from the recipients. In the context of an Alert, then, what determines the timeout/escalation is not the absence of a response, but the alertStatus of the Alert Message.
If, when the timeout period is reached, the status of the Alert Message is not "Acknowledged", the escalation mechanism should be triggered.
Design
Alerting Interface Design
The Alerting interface is similar to the Client interface already implemented in UCS. This interface is a one-way mechanism to execute a command on UCS. The available command in our NiFi implementation will be: updateAlertMessage.
An option for the implementation of this interface is to implement it as a specific command inside Client workflow.
A specific processor must be implemented to hold the logic related to this operation: UCSAlertingUpdateAlertMessage.
UCSAlerting Interface Design
...
If we decide to re-use the Client Interface workflow, then we need to introduce a new processor into this workflow: UCSAlertingUpdateAlertMessage. This processor will implement the required logic to update an Alert Message in UCS. Another processor that needs to be implemented in this workflow (no matter whether we decide to re-use it for the Alerting interface or not) is UCSCancelMessage. This new processor will cancel an existing AlertMessage.
Two new output ports must be added to this workflow so other parts of UCS can be notified about the modification and cancellation of an Alert Message (i.e. UCS Alerting Interface workflow).
In order to allow clients to register UCSAlerting callback interfaces into UCS, a new processor must be implemented and added into Client Interface workflow: UCSRegisterAlertingCallback.
The previous image shows how Client Interface workflow will look after the modifications for the Alerting Interface are implemented.
...
In this approach, the two previously mentioned processors - UCSAlertingUpdateAlertMessage and UCSRegisterAlertingCallback - are implemented in a specific workflow called "Alerting Interface". The structure of this workflow is similar to the Client Interface.
As you can see in the previous image, the new workflow has a similar structure to Client Interface: a message is received via HTTP, parsed and processed according to its content. The 2 available "commands" in this workflow are: registerUCSAlertingInterface (or registerUCSAlertingCallback) and updateAlertMessage.
Info | ||
---|---|---|
| ||
In the screenshot, UCSUpdateAlertMessage should be named UCSAlertingUpdateAlertMessage. |
UCS Alerting Interface
This workflow is in charge of notifying any previously registered UCSAlerting interface about alert-related events happening inside UCS. The supported events are: a new Alert Message is present in UCS, an Alert Message was modified inside UCS and an Alert Message was canceled inside UCS. Each of these events is represented in the new workflow as an input port.
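The relation between the three events and the three UCSAlerting methods named earlier (receiveAlertMessage(), updateAlertMessage() and cancelAlertMessage()) can be sketched as below. This is only an illustration under assumptions: the event names, the String payload and the `notifyCallbacks` helper are hypothetical, and the real workflow performs this dispatch through NiFi input ports and HTTP callbacks rather than direct Java calls.

```java
import java.util.ArrayList;
import java.util.List;

final class AlertEventDispatch {
    // Hypothetical names for the three events described in the text.
    enum AlertEvent { NEW, MODIFIED, CANCELED }

    // Method names come from the UCSAlerting interface; the String
    // parameter is an assumption standing in for a serialized AlertMessage.
    interface UCSAlerting {
        void receiveAlertMessage(String alertMessage);
        void updateAlertMessage(String alertMessage);
        void cancelAlertMessage(String alertMessage);
    }

    // Notifies every registered callback about a single event.
    static void notifyCallbacks(AlertEvent event, String alertMessage,
                                List<UCSAlerting> callbacks) {
        for (UCSAlerting callback : callbacks) {
            switch (event) {
                case NEW:
                    callback.receiveAlertMessage(alertMessage);
                    break;
                case MODIFIED:
                    callback.updateAlertMessage(alertMessage);
                    break;
                case CANCELED:
                    callback.cancelAlertMessage(alertMessage);
                    break;
            }
        }
    }

    // Test double that records which method was invoked.
    static final class RecordingCallback implements UCSAlerting {
        final List<String> calls = new ArrayList<>();
        public void receiveAlertMessage(String m) { calls.add("receive:" + m); }
        public void updateAlertMessage(String m) { calls.add("update:" + m); }
        public void cancelAlertMessage(String m) { calls.add("cancel:" + m); }
    }
}
```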
...
Just like UCS Client Interface workflow, this new workflow notifies any previously registered UCSAlerting callback about an alert-related event that happened inside UCS.
New Processors
...
UCSAlertingUpdateAlertMessage
This processor is the most important processor in the entire Alerting Interface implementation. This processor receives an AlertMessage as its input, and generates a FlowFile containing 2 AlertMessages.
The incoming AlertMessage MUST have a MessageId that is known inside UCS. The original Alert Message is then retrieved and a diff is performed between the old and the new version of the message.
Note | ||
---|---|---|
|
...
This implementation will only diff the alertStatus |
...
property of the messages. |
UPDATE: The NiFi UCSAlertingUpdateAlertMessage processor will only take as input parameters the id and the new status of the message to be updated. Receiving an entire message as input made the mechanism we currently have in place to process commands complicated and error-prone. This implementation also makes the intention of the processor clearer. The mismatch between what the UCS specification defines for Alerting.updateMessage() and the way NiFi implements it will be hidden behind the concrete implementation of ucs-nifi-api.
The status of an Alert Message is processed as follows:
1.- If the alertStatus property of the original alert message in UCS is other than "Pending" or "Acknowledged", the processor will fail. The incoming flowfile is redirected to REL_STATUS_MISSMATCH (using UCSCreateException.routeFlowFileToException())
2.- If the alertStatus property of the original message, the property in the original message is updated.
2.- (Only executed if the previous step was not)The statusByReciever entries of the new message are analyzed and updated in the original message. If, after the property is updated, the following scenarios must be evaluated:
...
new alert message is other than "Acknowledged", the processor will fail. The incoming flowfile is redirected to REL_STATUS_MISSMATCH (using UCSCreateException.routeFlowFileToException())
3.- If the alertStatus property of the new message is different from the alertStatus property of the original message, the property in the original message is updated. Use REL_SUCCESS. The message is updated in UCSControllerService by invoking updateMessage().
4.- If the alertStatus property of the new message is equal to the alertStatus property of the original message, the flowfile will be directed to a REL_NO_UPDATE relationship.
If there is no error during the execution of this processor, its output is a flowfile containing a serialized version of both the original and the updated message.
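The routing decision of this processor, as it reads from the rules above, can be sketched as a pure function. This is an illustrative sketch, not the processor's actual code: the `route` helper is hypothetical, relationships are represented as plain strings instead of NiFi Relationship objects, and the check that the requested status must be "Acknowledged" is an interpretation of the second rule.

```java
final class UpdateAlertRouting {
    // Values copied from the AlertStatus enum listed earlier in this page.
    enum AlertStatus { Acknowledged, New, Expired, Retracted, Pending }

    // Relationship names as written in the text (spelling preserved).
    static final String REL_SUCCESS = "REL_SUCCESS";
    static final String REL_NO_UPDATE = "REL_NO_UPDATE";
    static final String REL_STATUS_MISSMATCH = "REL_STATUS_MISSMATCH";

    // Hypothetical helper: decides where the incoming flowfile would go,
    // given the status stored in UCS and the status requested by the caller.
    static String route(AlertStatus original, AlertStatus requested) {
        // The stored message must be Pending or Acknowledged.
        if (original != AlertStatus.Pending
                && original != AlertStatus.Acknowledged) {
            return REL_STATUS_MISSMATCH;
        }
        // Assumption: the only status a recipient may request is Acknowledged.
        if (requested != AlertStatus.Acknowledged) {
            return REL_STATUS_MISSMATCH;
        }
        // Same status on both sides: nothing to update.
        if (original == requested) {
            return REL_NO_UPDATE;
        }
        // Statuses differ: the stored message would be updated.
        return REL_SUCCESS;
    }
}
```

Under these assumptions, acknowledging a "Pending" alert yields REL_SUCCESS, while acknowledging an already "Acknowledged" alert yields REL_NO_UPDATE.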
...
Similar to UCSGetUCSClientCallbacks, this processor retrieves any previously registered UCSAlerting callback URL from UCSControllerService and generates a flowfile for each of them.
UCSCancelMessage
This processor implements the expected behavior of Client.cancelMessage() operation. This processor retrieves the messageId from the incoming FlowFile, retrieves the related message from UCSControllerService and performs the following operations:
1.- If there is no message in UCSControllerService with the specified id, the processor will route the incoming FlowFile to a REL_UNKNOWN_MESSAGE relationship.
2.- If the message in UCSControllerService with the specified id is not an AlertMessage, the processor will route the incoming FlowFile to a BAD_MESSAGE relationship.
3.- If the alertStatus property of the message in UCSControllerService with the specified id is already "Retracted", a new FlowFile containing the message will be directed to a REL_NO_UPDATE relationship.
4.- If the alertStatus property of the message in UCSControllerService with the specified id is "Pending", the alertStatus value is changed to "Retracted" and a new FlowFile with the serialized message will be directed to a REL_CANCELLED relationship.
5.- If the alertStatus property of the message in UCSControllerService with the specified id is other than "Pending" or "Retracted", the original FlowFile will be directed to a REL_INVALID_STATE relationship.
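The five rules above can be sketched as a single decision function. This is a minimal illustration under assumptions: `StoredMessage` is a hypothetical stand-in for whatever UCSControllerService returns, a `null` argument models "no message with that id", and relationships are plain strings rather than NiFi Relationship objects.

```java
final class CancelMessageRouting {
    // Values copied from the AlertStatus enum listed earlier in this page.
    enum AlertStatus { Acknowledged, New, Expired, Retracted, Pending }

    // Hypothetical stand-in for a message stored in UCSControllerService.
    static final class StoredMessage {
        final boolean alert;      // true if the message is an AlertMessage
        AlertStatus alertStatus;  // only meaningful when alert is true

        StoredMessage(boolean alert, AlertStatus alertStatus) {
            this.alert = alert;
            this.alertStatus = alertStatus;
        }
    }

    // Returns the relationship name the flowfile would be routed to.
    // A null message models "no message with the specified id".
    static String route(StoredMessage message) {
        if (message == null) {
            return "REL_UNKNOWN_MESSAGE";              // rule 1
        }
        if (!message.alert) {
            return "BAD_MESSAGE";                      // rule 2
        }
        if (message.alertStatus == AlertStatus.Retracted) {
            return "REL_NO_UPDATE";                    // rule 3
        }
        if (message.alertStatus == AlertStatus.Pending) {
            message.alertStatus = AlertStatus.Retracted;
            return "REL_CANCELLED";                    // rule 4
        }
        return "REL_INVALID_STATE";                    // rule 5
    }
}
```

Note that only rule 4 mutates the stored message; every other outcome leaves it untouched.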
Alerts timeout and escalation
Given that the current implementation ACKs an Alert Message as soon as the first recipient ACKs it, there is only one escalation possibility: onNoResponseAll.
When an Alert Message is persisted - UCSPersistMessage processor - a new cron job needs to be scheduled if the message fulfils the following requirements:
1.- The Message is an AlertMessage instance
2.- The respondBy property of the message's header is > 0
3.- The receiptNotification property of the message's header is true
The cron job that gets scheduled implements the following logic:
1.- If the alertStatus property of the message is "Acknowledged", "Retracted" or "Expired", the cron job ends silently.
2.- If the alertStatus property of the message is "Pending", every Message present in the onNoResponseAll list is passed to UCSControllerService.notifyAboutMessageWithResponseTimeout(). The timeOutReason of the generated TimedOutMessage is NO_RESPONSES.
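The scheduling requirements and the cron job logic can be sketched together as follows. This is only a sketch under assumptions: the `TimeoutNotifier` interface and its two-argument signature are hypothetical stand-ins for UCSControllerService.notifyAboutMessageWithResponseTimeout(), messages are represented as strings, and the "New" status (not covered by the rules above) is treated here as ending silently.

```java
import java.util.List;

final class AlertEscalation {
    // Values copied from the AlertStatus enum listed earlier in this page.
    enum AlertStatus { Acknowledged, New, Expired, Retracted, Pending }

    // Hypothetical stand-in for the controller service call; the real
    // signature of notifyAboutMessageWithResponseTimeout() may differ.
    interface TimeoutNotifier {
        void notifyAboutMessageWithResponseTimeout(String message,
                                                   String timeOutReason);
    }

    // A timeout job is scheduled only when all three requirements hold.
    static boolean needsTimeoutJob(boolean isAlertMessage, long respondBy,
                                   boolean receiptNotification) {
        return isAlertMessage && respondBy > 0 && receiptNotification;
    }

    // Body of the scheduled job. Returns the number of escalation
    // messages handed to the notifier.
    static int runTimeoutJob(AlertStatus status, List<String> onNoResponseAll,
                             TimeoutNotifier notifier) {
        // Acknowledged, Retracted and Expired end silently. Assumption:
        // any other non-Pending status (New) is treated the same way.
        if (status != AlertStatus.Pending) {
            return 0;
        }
        for (String escalation : onNoResponseAll) {
            notifier.notifyAboutMessageWithResponseTimeout(escalation,
                                                           "NO_RESPONSES");
        }
        return onNoResponseAll.size();
    }
}
```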