CICD Harness Pipelines & Stack availability
under review
Prepared Alligator
We discussed this with Harness and were advised to create an enhancement request.
We need a report/view that displays the health state of Harness CICD pipelines and their stages, and that sends automated notifications for consumption by the CIO domains.
Display a banner in Harness if any stage or pipeline is unavailable.
• Pipeline available
• Delegates
• Tools integrated with Pipeline (Snyk, Sonarsource, Deputy etc.)
• Availability of GitHub, Artifactory
• EKS tenant clusters and their health
Prepared Alligator
Below is the list of visualisations that will help Harness come up with a single dashboard view.
The filters below, and the data within them, will help us represent the data the way we want.
Dropdown selector (allow multiple selection):
• Time range
• Org ID
• Project ID
• Pipeline Name
• Pipeline Status
• Stage Name
• Stage Status
• Step Name
• Step Status
• Error Message
• Error Point
• Tag Name/Value
Filter (to apply the conditions):
• Stage Name
• Stage Status
• Step Name
• Step Status
• Tag Name/Value
Time charts (default to 80th percentile, but adjustable to show other percentiles)
• Pipeline Execution duration line chart
• Pipeline Execution duration area chart (stacked by stage)
• Pipeline Execution status breakdown area chart
• Stage Execution duration line chart
• Stage Execution duration area chart (stacked by step)
• Stage Execution status breakdown area chart
• Step Execution status breakdown area chart
Table:
• Pipeline Execution count overall
• Pipeline Execution count breakdown by project
• Pipeline Execution duration overall
• Pipeline Execution duration breakdown by project
• Pipeline error count breakdown by error message
• Pipeline Success/Failure rate % overall
• Pipeline Success/Failure rate % breakdown by project
• Stage Execution duration overall
• Stage Execution duration breakdown by project
• Stage Success/Failure rate % overall
• Stage Success/Failure rate % breakdown by project
• Stage error count breakdown by error message
• Step Execution duration overall
• Step Execution duration breakdown by project
• Step Success/Failure rate % overall
• Step Success/Failure rate % breakdown by project
• Step error count breakdown by error message
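To make the table metrics above concrete, here is a minimal Python sketch of a per-project breakdown: execution count, 80th percentile duration and failure rate. The record layout, field names and sample values are assumptions for illustration only, not the Harness data model.

```python
from collections import defaultdict

# Hypothetical execution records; field names and values are illustrative only.
executions = [
    {"project": "alpha", "status": "Success", "duration_s": 100},
    {"project": "alpha", "status": "Success", "duration_s": 120},
    {"project": "alpha", "status": "Success", "duration_s": 140},
    {"project": "alpha", "status": "Failed", "duration_s": 200},
    {"project": "alpha", "status": "Success", "duration_s": 300},
    {"project": "beta", "status": "Success", "duration_s": 60},
    {"project": "beta", "status": "Failed", "duration_s": 90},
]


def percentile(values, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, so no floating-point rounding is involved.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def summarise_by_project(records, pct=80):
    """Group records by project and compute count, duration percentile and failure rate."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["project"]].append(record)

    summary = {}
    for project, rows in grouped.items():
        failures = sum(1 for r in rows if r["status"] != "Success")
        summary[project] = {
            "count": len(rows),
            "duration_pct_s": percentile([r["duration_s"] for r in rows], pct),
            "failure_rate_pct": round(100 * failures / len(rows), 1),
        }
    return summary


summary = summarise_by_project(executions)
```

The same grouping could be keyed on stage or step name instead of project to produce the stage and step rows of the table, and `pct` can be changed to show other percentiles.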
Prepared Alligator
Feedback from the meeting on 08/11 with Ruchira is below:
Collapsible dashboard: all 3 dashboards combined into 1 view, showing data by Pipeline > Stage > Step
Must utilise Step output variables as dimensions in the dashboard
Must have Status of Pipeline execution/Stage execution/Step execution as a dimension
Prod2 or Sandbox access as early as possible – 27/11 (date TBC from Harness)
Prepared Alligator
Requirements:
Consumers of the data are Executives, Service Owners, Engineering Managers, Tech Leads and Engineers on NEF2.0 across NAB.
For the Pipeline stack dashboard, we are expecting data at a minimum at pipeline level:
- step level status/duration/error, grouped by stage
- stage level status/duration/error, grouped by execution
- execution level status/duration/error, grouped by pipeline and Kubernetes platform
Visualisation based on data:
- horizontal comparison of step metrics, e.g. how long the Initialize step takes in each stage; identify the stage that has the highest error rate in the Initialize step
- vertical comparison of step metrics, e.g. stacked view of how long each step takes within a stage; identify if a particular step is trending up in error rate
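As a rough illustration of the two comparisons above, the following Python sketch computes a horizontal view (one named step compared across stages) and a vertical view (all steps stacked within one stage). The step records, field names and stage names are hypothetical and only show the shape of the aggregation we have in mind.

```python
from collections import defaultdict

# Hypothetical step-level records; names and values are illustrative only.
steps = [
    {"stage": "build", "step": "Initialize", "status": "Success", "duration_s": 30},
    {"stage": "build", "step": "Initialize", "status": "Failed", "duration_s": 50},
    {"stage": "build", "step": "Compile", "status": "Success", "duration_s": 200},
    {"stage": "deploy", "step": "Initialize", "status": "Success", "duration_s": 20},
    {"stage": "deploy", "step": "Initialize", "status": "Success", "duration_s": 40},
    {"stage": "deploy", "step": "Rollout", "status": "Success", "duration_s": 90},
]


def horizontal(records, step_name):
    """Compare one step across stages: mean duration and error rate per stage."""
    by_stage = defaultdict(list)
    for record in records:
        if record["step"] == step_name:
            by_stage[record["stage"]].append(record)

    stats = {}
    for stage, rows in by_stage.items():
        errors = sum(1 for r in rows if r["status"] == "Failed")
        stats[stage] = {
            "mean_duration_s": sum(r["duration_s"] for r in rows) / len(rows),
            "error_rate": errors / len(rows),
        }
    return stats


def worst_stage(records, step_name):
    """Return the stage where the named step has the highest error rate."""
    stats = horizontal(records, step_name)
    return max(stats, key=lambda stage: stats[stage]["error_rate"])


def vertical(records, stage_name):
    """Stacked view: total duration of each step within one stage."""
    totals = defaultdict(int)
    for record in records:
        if record["stage"] == stage_name:
            totals[record["step"]] += record["duration_s"]
    return dict(totals)


initialize_by_stage = horizontal(steps, "Initialize")
build_stack = vertical(steps, "build")
```

Running the vertical view per time bucket (for example per week) would give the series needed to see whether a step is trending up in duration or error rate.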
Requirements to view data at Org/Project Level:
- For pipelines in Harness, the requirement is to group the data of the various Projects under an Org and display the aggregate/average of pipeline stages by status/duration/error per Org.
- Users must be able to view the step metrics and identify the stage that has the highest error rate for each pipeline at Project and Org level.
- Users must be able to see how long each step takes within a stage; this must be displayed as a pipeline aggregate/average at Org level.
- The view must allow the user to drill down from Org to Project level and investigate/view the metrics for pipeline stages and steps.
- Must be able to compare durations over one or more periods of time for each pipeline stage and step at Project and Org level.
- Must have the ability to configure notifications, and their frequency, to be sent to one or more NAB team members based on status/error. The notification data should contain the details of the Pipeline/Stage/Step.
- Must have the ability to display a banner within a project if a Pipeline stage/step has errors. If there are multiple errors, a compressed view should be shown, which the user can expand.
- Must display the status of a delegate when it is down (not connected), aggregated at Org level, i.e. when looking at delegate data at Org level, the user must be able to see the count of delegates within the Org that are down and not available, and drill down further to see which projects have delegates that are not connected.
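The delegate requirement above (count at Org level, then drill down to Project) could be sketched roughly as follows in Python. The delegate records, the `connected` flag and the org/project names are assumptions made up for this example; they do not reflect the actual Harness delegate data.

```python
from collections import Counter

# Hypothetical delegate records; names and fields are illustrative only.
delegates = [
    {"org": "org-a", "project": "proj-1", "name": "delegate-1", "connected": True},
    {"org": "org-a", "project": "proj-1", "name": "delegate-2", "connected": False},
    {"org": "org-a", "project": "proj-2", "name": "delegate-3", "connected": False},
    {"org": "org-b", "project": "proj-3", "name": "delegate-4", "connected": True},
]


def down_by_org(records):
    """Org-level view: number of delegates that are not connected, per Org."""
    return dict(Counter(d["org"] for d in records if not d["connected"]))


def down_by_project(records, org):
    """Drill-down view: not-connected delegates per Project within one Org."""
    return dict(
        Counter(
            d["project"]
            for d in records
            if d["org"] == org and not d["connected"]
        )
    )


org_view = down_by_org(delegates)
project_view = down_by_project(delegates, "org-a")
```

An Org with no entry in `org_view` has all of its delegates connected, which is also the condition under which no banner or notification would be needed.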
Prepared Alligator
- Must be able to display status/duration/error for APP ID and APP Name, with grouping available at step level, stage level and execution level; it must then be possible to group at Project level and display at Org level. (This is to understand how build and deployment times differ across the multiple microservices under an APP ID.)
Prepared Alligator
For the Pipeline stack dashboard we are expecting data at a minimum of:
- step level status/duration/error, grouped by stage
- stage level status/duration/error, grouped by execution
- execution level status/duration/error, grouped by pipeline and Kubernetes platform
Visualisation based on data at a minimum of:
- horizontal comparison of step metrics, e.g. how long the Initialize step takes in each stage; identify the stage that has the highest error rate in the Initialize step
- vertical comparison of step metrics, e.g. stacked view of how long each step takes within a stage; identify if a particular step is trending up in error rate
As an example, we extract the step level execution data via Harness API and wrap it in a pipeline in Harness prod(https://aus01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fnab.harness.io%2Fng%2Faccount%2FtnA9DBjWSIue1XjX_c-3Ww%2Fall%2Forgs%2Fet%2Fprojects%2Fnef%2Fpipelines%2Ftrmstatscisteps%2Fpipeline-studio%2F%3FstoreType%3DREMOTE%26connectorRef%3Dgithubbootstrapprod%26repoName%3Dharness%26branch%3Dmaster&data=05%7C02%7CMohammed.Khan1%40nab.com.au%7Ced1c60babbce44d6f03208dcfc4a8721%7C48d6943f580e40b1a0e1c07fa3707873%7C0%7C0%7C638662644264610677%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C0%7C%7C%7C&sdata=3UpP0s4bm37P8k1fmfxc56pWbNpInbrl57DOGcQkhnA%3D&reserved=0)
that periodically scans CI pipelines. The attached file contains sample data extracted from one execution. The structure and format of the data here are only for reference; what matters is that the metrics are searchable and can be aggregated.
We send the extracted data to an internal logging platform. This is a sample dashboard that tracks the duration of the Initialize step across all stages, on both our EKS and AKS platforms. We can see that in October the Initialize step took much longer than usual (we had a network issue), which shows how we can use the data to triage and troubleshoot.
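For illustration, the kind of check that such a dashboard lets us do by eye can be sketched in Python as below. The monthly values are made-up placeholders (not our real measurements), and the rule of flagging anything above twice the median is only one possible assumption.

```python
from statistics import median

# Hypothetical monthly mean durations (seconds) of the Initialize step.
monthly_mean_s = {"Aug": 30, "Sep": 32, "Oct": 95, "Nov": 31}


def unusual_months(means, factor=2.0):
    """Return the months whose mean duration exceeds factor x the median month."""
    baseline = median(means.values())
    return [month for month, value in means.items() if value > factor * baseline]


flagged = unusual_months(monthly_mean_s)
```

A check like this only works if the step metrics are searchable and can be aggregated, which is the core of what we are asking for.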
Prepared Alligator
We would also require notifications to be sent to the NEF (NAB) team so that it can action the items that need to be addressed.
Shylaja Sundararajan
under review