How do I become a management genius?

Since Atlassian acquired the incident management software Opsgenie, the system has continued to grow in features and capabilities. For many organizations, Opsgenie is the best solution on the market, both for companies that are new to always-on web software and for experienced teams that were not satisfied with their previous tools. Here are ten particularly strong reasons to choose Opsgenie.

1. Flexibility for individual workflows

When it comes to incident response, no two teams work the same way. Individual processes depend on many factors: experience, the source of an incident, its payload, the time of day, and so on. Opsgenie offers flexible control mechanisms that let teams work in the way that best suits them and their products. For example, if an incident occurs after business hours, Opsgenie can alert all team members if the case is urgent, or delay the alert until the start of the workday if the priority is low. This flexibility ensures that each alert gets exactly the amount of attention it deserves.
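
To make the after-hours example concrete, here is a minimal sketch of such a routing rule in Python. It is only an illustration of the decision logic, not how Opsgenie implements it: the business hours, the function name `route_alert` and the returned labels are assumptions made for this example. In Opsgenie itself, such rules are configured in the product rather than written as code.

```python
from datetime import datetime, time

# Hypothetical business hours, assumed for this sketch only.
WORK_START = time(9, 0)
WORK_END = time(17, 0)


def route_alert(priority: str, occurred_at: datetime) -> str:
    """Decide how to notify for an alert (illustrative labels only).

    priority follows a P1 (critical) to P5 (informational) scale.
    """
    is_weekday = occurred_at.weekday() < 5  # Monday=0 ... Friday=4
    in_hours = is_weekday and WORK_START <= occurred_at.time() < WORK_END
    if in_hours:
        # During business hours the on-call person is notified as usual.
        return "notify_on_call"
    if priority in ("P1", "P2"):
        # Urgent after hours: alert the whole team.
        return "notify_whole_team"
    # Low priority after hours: hold the alert until work starts.
    return "delay_until_start_of_work"
```

With these assumptions, a P1 alert on a Saturday night goes to the whole team, while a P4 alert on a Tuesday evening waits until the next morning.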

2. Seamless integration with other Atlassian products

Opsgenie can be connected in both directions with other Atlassian products, so that teams can establish an end-to-end incident management solution:

  • Issues can be created in Jira Software or Jira Service Desk from Opsgenie so that important tasks are documented and traceable.
  • New issues in Jira Software or Jira Service Desk can trigger alerts and escalations in Opsgenie and thus shorten response times.
  • The status information from third-party systems that are monitored with Statuspage can be integrated into Opsgenie, so that a central overview of the status of all important tools is created. If a malfunction occurs in one of these services, Opsgenie can notify the relevant on-call person directly.
  • During an incident, Opsgenie can automatically post updates to the status page and thus inform customers more quickly and transparently about the current status.

3. More meaningful alerts for faster response

Thanks to Opsgenie's deep integration options, users can reformat alert messages regardless of their source and make them easier to understand. A typical AWS CloudWatch alarm is a good example:

Maximum ApproximateNumberOfMessagesVisibleGreaterThanOrEqualtoThreshold 4.0 for QueueName Production

This message can be reformatted and made easier to read:

The Production Message Queue has more than three messages in it.

This customized message can be used for all notifications (voice, SMS, email, mobile push notifications). In addition, the team can add tags and custom fields to its alerts and attach charts, logs, runbooks, etc. to provide more context.
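
The following Python sketch shows what such a rewrite could look like in principle. It is an illustration only: in Opsgenie the reformatting is set up in the integration settings, not in your own code. The regular expression, the function names and the wording "at least 4" (equivalent to "more than three" for whole messages) are assumptions for this example. The payload keys `message`, `description`, `tags` and `details` are modeled on the fields of an Opsgenie alert, but treat the exact shape as an assumption and check the API documentation before relying on it.

```python
import re

RAW = ("Maximum ApproximateNumberOfMessagesVisibleGreaterThanOrEqualto"
       "Threshold 4.0 for QueueName Production")

# Assumed pattern for this one message format; other alarms need their own.
PATTERN = re.compile(
    r"ApproximateNumberOfMessagesVisible\s*GreaterThanOrEqualtoThreshold\s+"
    r"(?P<threshold>\d+(?:\.\d+)?)\s+for\s+QueueName\s+(?P<queue>\S+)"
)


def readable_message(raw: str) -> str:
    """Turn the raw alarm text into a sentence; pass unknown formats through."""
    match = PATTERN.search(raw)
    if match is None:
        return raw
    threshold = float(match.group("threshold"))
    # Show 4.0 as 4, but keep fractional thresholds as they are.
    shown = int(threshold) if threshold.is_integer() else threshold
    queue = match.group("queue")
    return f"The {queue} Message Queue has at least {shown} messages in it."


def build_alert(raw: str, tags: list[str], details: dict[str, str]) -> dict:
    """Assemble an alert payload with extra context (hypothetical shape)."""
    return {
        "message": readable_message(raw),
        "description": raw,  # keep the original text for reference
        "tags": tags,
        "details": details,
    }
```

Keeping the raw text in the description means responders can still see exactly what the monitoring tool reported, while the notification itself stays readable.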

4. Role-based access for easy scaling

Opsgenie enables teams to create their own schedules, rules and policies. You don't have to wait for an account administrator or experiment with makeshift workarounds; instead, you can create custom roles with granular permissions. This lays the groundwork for low-friction, time-saving scaling across the company.

5. Continuous monitoring with heartbeats

Mistakes happen in a development / IT environment. How can teams be certain that their monitoring systems are doing their job? Opsgenie Heartbeats verifies that monitoring tools are active and connected and that scheduled tasks complete as planned. If an expected signal does not arrive within a certain time window, Opsgenie flags the problem immediately.
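
The idea behind a heartbeat can be sketched in a few lines of Python. This is a simplified local model of the concept, not Opsgenie's implementation: the class `HeartbeatMonitor`, its methods and the ten-minute window used below are assumptions for illustration. In practice, the monitored system pings Opsgenie and Opsgenie tracks the time window.

```python
from datetime import datetime, timedelta


class HeartbeatMonitor:
    """Tracks the last ping per source and reports sources that went silent."""

    def __init__(self, interval: timedelta) -> None:
        self.interval = interval  # maximum allowed gap between pings
        self.last_ping: dict[str, datetime] = {}

    def ping(self, name: str, at: datetime) -> None:
        # Record that the named tool or job reported in at the given time.
        self.last_ping[name] = at

    def expired(self, now: datetime) -> list[str]:
        # A source is expired if its last ping is older than the interval.
        return sorted(
            name
            for name, seen in self.last_ping.items()
            if now - seen > self.interval
        )
```

A nightly backup job, for example, would send one ping when it finishes. If no ping arrives within the window, the job appears in `expired()` and an alert can be raised, even though the job itself never reported an error.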

6. Secure connection with on-premise applications

It's not uncommon for teams to use a mix of self-hosted and cloud-based monitoring and ITSM tools. Linking these systems often means opening firewall connections and exposing the host server to the public Internet. This carries risks that most companies would prefer to avoid. As a result, the on-premise tools remain silos, and users are forced to update systems manually and copy data back and forth.

Opsgenie solves this challenge with the Opsgenie Edge Connector (OEC), which allows secure and seamless connections to on-premise systems, including Jira, Nagios, Solarwinds and others. All connections are outbound, so there is no need to expose high-risk ingress ports and protocols to the Internet. In addition, custom scripts can be triggered via OEC, which lets responders quickly run executables that support troubleshooting and automated recovery.

7. Systematic reports for continuous learning

When incidents occur, things can get chaotic and stressful. But at the same time, every incident creates an opportunity to learn. Opsgenie tracks alerts and incidents across their entire life cycle and aggregates them into helpful reports. These reports not only help the team to identify the source of an alert, they also support the assessment of team performance and the distribution of the on-call workload, all without users having to leave the application. Among other things, Opsgenie offers these analysis functions:

  • Operational efficiency
  • Monthly overview
  • User and team productivity
  • Distribution of on-call work
  • State of services and infrastructure
  • Post-incident analysis
  • Conference attendance and efficiency metrics

Because the Opsgenie reports are built on Looker technology, they can be filtered to focus on specific aspects. Users can access the underlying data with one click.

8. Opsgenie Actions for accelerated recovery

Teams that operate always-on services need to respond quickly to prevent small problems from growing into large outages that affect customers. Recovery often involves a known set of system or infrastructure actions, but these usually require manual, repetitive intervention. Opsgenie Actions provides an easy way to automate these manual tasks right from the Opsgenie console or mobile app.

For example, Opsgenie Actions can run automation documents from AWS Systems Manager to adjust AWS resources, such as starting an EC2 instance. If parameters are required, Opsgenie Actions can prompt the user for input via a list, a checkbox selection or a free text field. If the team uses other automation tools, that's no problem either: Opsgenie Actions integrates via REST with a wide range of third-party software.

9. Data security through edge encryption

Incidents and failures are exhausting enough. It helps if you don't have to worry about data security as well, thanks to Opsgenie Edge Encryption. It protects alert and incident information while it is in transit to and from the Opsgenie cloud service. Control over the keys used to encrypt and decrypt sensitive information always remains with the team.

10. Cost factor

Opsgenie is inexpensive. Organizations that choose Opsgenie pay three to five times less than they would for alternative products - and as the business scales, so do the cost savings (see graph below). A particular highlight of Opsgenie is the unlimited number of free stakeholder licenses in the Enterprise plan. This makes it possible to include all relevant people in the company (and beyond) and to establish transparent incident management that is visible to everyone.

We are your partner for Atlassian Opsgenie

Do you want to learn more about Atlassian Opsgenie? Would you like to talk directly about setting up a structured alerting process for specific applications in your company? Then contact us: We are an Atlassian Platinum Solution Partner and would be happy to support you in all aspects of systematic incident management in your organization.

