Aim
Query the Prometheus server of the installed service mesh to get real-time values of the four golden signals.
Based on these values and user-defined thresholds, alert the user whenever a value crosses its threshold.
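As a rough illustration of the flow described above, the sketch below builds an instant query against the Prometheus HTTP API (GET /api/v1/query), reads the first sample from a vector result, and compares it with a threshold. The server address, the PromQL expression, and the helper names (build_query_url, parse_scalar, fetch_metric, crosses_threshold) are assumptions made for this example and are not taken from the RoostApi code. It also assumes that "crossing" means the value is strictly greater than the threshold.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url: str, promql: str) -> str:
    """Return the URL of a Prometheus instant query (GET /api/v1/query)."""
    query_string = urllib.parse.urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{query_string}"


def parse_scalar(response_body: str) -> float:
    """Extract the first sample value from an instant-query response.

    Assumes a vector result, where each sample looks like
    {"metric": {...}, "value": [timestamp, "value as a string"]}.
    """
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    result = payload["data"]["result"]
    if not result:
        raise ValueError("query returned no samples")
    return float(result[0]["value"][1])


def fetch_metric(base_url: str, promql: str, timeout: float = 5.0) -> float:
    """Run the query against a live Prometheus server and return one value."""
    url = build_query_url(base_url, promql)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_scalar(resp.read().decode("utf-8"))


def crosses_threshold(value: float, threshold: float) -> bool:
    """Assumption: an alert fires when the value is strictly above the threshold."""
    return value > threshold


# Hypothetical usage; the address and the PromQL expression depend on the
# installed service mesh and are placeholders only:
# rate = fetch_metric("http://localhost:9090", "sum(rate(request_total[1m]))")
# if crosses_threshold(rate, 20):
#     print("requestRate threshold crossed")
```

Keeping the parsing separate from the network call makes it possible to check the threshold logic against a saved response without a running Prometheus server.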
The 4 Golden Signals and Their Thresholds
The four golden signals we are looking at are:
...
{
  "requestRate": 20,
  "errorRate": 10,
  "latency": 5000,
  "saturation": 15,
  "interval_time": 1
}
The RoostApi endpoint that receives this data is
/api/metricThresholds
This JSON structure is subject to change once we start supporting application-specific thresholds. For now, these values are system-wide thresholds.
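A minimal sketch of how a client might prepare a request that carries the threshold JSON shown above to /api/metricThresholds. The HTTP method (POST), the Content-Type header, the base URL, and the requirement that all five keys be present are assumptions for illustration; this page does not specify them. The helper name build_threshold_request is hypothetical.

```python
import json
import urllib.request

# Keys taken from the threshold JSON on this page.
THRESHOLD_KEYS = ("requestRate", "errorRate", "latency", "saturation", "interval_time")


def build_threshold_request(base_url: str, thresholds: dict, method: str = "POST"):
    """Build (but do not send) a request carrying the system-wide thresholds.

    Assumption: the endpoint accepts a JSON body via POST.
    """
    missing = [key for key in THRESHOLD_KEYS if key not in thresholds]
    if missing:
        raise ValueError(f"missing threshold keys: {missing}")
    body = json.dumps(thresholds).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/metricThresholds",
        data=body,
        headers={"Content-Type": "application/json"},
        method=method,
    )


thresholds = {
    "requestRate": 20,
    "errorRate": 10,
    "latency": 5000,
    "saturation": 15,
    "interval_time": 1,
}

# "http://roost.example" is a placeholder address, not a real RoostApi host.
request = build_threshold_request("http://roost.example", thresholds)
# To actually send it: urllib.request.urlopen(request, timeout=5)
```

Validating the keys before sending keeps a partial payload from silently replacing the system-wide thresholds, although the real endpoint may apply its own defaults.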
How to Use/Test the Alerts
For testing purposes, we have modified our ballot image so that any vote for k3d returns a status code of 500, which counts as an error.
You can get the modified ballot image by running docker pull ashrr108/ballot:latest or by changing the image field in ballot.yaml to ashrr108/ballot:latest.
...
You will be able to see the metrics in the RoostApi logs as well:
2021/07/05 15:07:42 ROOSTAPI:asm_amd64.s:1374 goexit(): GoRoutine-29945: INFO: map[ErrorRate:100 Latency:1940.5 RequestRate:7.552380952380952]
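To check a logged sample against the thresholds by hand, something like the following sketch could be used. It parses the map[...] portion of the log line above and reports which thresholds are exceeded. The mapping between threshold keys and logged metric names, the "strictly greater than" comparison, and the helper names (parse_metrics, breached) are assumptions for this example; saturation is skipped because it does not appear in the logged map.

```python
import re

# Assumed mapping from threshold keys (JSON payload) to logged metric names.
THRESHOLD_TO_METRIC = {
    "requestRate": "RequestRate",
    "errorRate": "ErrorRate",
    "latency": "Latency",
}


def parse_metrics(log_line: str) -> dict:
    """Parse the Go-style map[Key:value ...] section of a RoostApi log line."""
    match = re.search(r"map\[([^\]]*)\]", log_line)
    if match is None:
        return {}
    metrics = {}
    for pair in match.group(1).split():
        name, _, raw_value = pair.partition(":")
        metrics[name] = float(raw_value)
    return metrics


def breached(metrics: dict, thresholds: dict) -> list:
    """Return the threshold keys whose metric is strictly above its threshold."""
    return sorted(
        key
        for key, metric_name in THRESHOLD_TO_METRIC.items()
        if metric_name in metrics and metrics[metric_name] > thresholds[key]
    )


log_line = (
    "2021/07/05 15:07:42 ROOSTAPI:asm_amd64.s:1374 goexit(): GoRoutine-29945: "
    "INFO: map[ErrorRate:100 Latency:1940.5 RequestRate:7.552380952380952]"
)
thresholds = {"requestRate": 20, "errorRate": 10, "latency": 5000, "saturation": 15}

metrics = parse_metrics(log_line)
alerts = breached(metrics, thresholds)
print(alerts)  # ['errorRate']
```

With the sample thresholds from this page, only the error rate (100 against a threshold of 10) is above its limit, which matches the test setup where votes for k3d return a 500.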
RoadMap
Right now, the threshold values refer to system-wide metrics. In the future, we should support application-specific thresholds.
Error requests are requests with a 5XX response code. Which response codes count as errors depends on the application, so we may need to make this configurable based on user input.
We will also support auto-scaling of deployments when the request rate becomes too high or too low. In that case, the affected deployment will be scaled up or down to keep the request rate under the threshold.
...