Franco R. Negri

J2EE Journal: Article

In Search of Operational Excellence

The Industry Challenge: Realigning IT and Business

Companies that aspire to lead their industries continue to find ways to optimize themselves in an endless pursuit of excellence. The race is on for many leading companies to leverage technology and provide enriched online services to their customers in order to maintain their positions and, in some cases, distinguish themselves from their competitors. It sometimes seems as though every month brings new innovation that enables and enhances a company's ability to expose its unique value through online Web services.

With the advent and standardization of technology inspired by the Internet, companies have access to a common business medium that allows flexible, rich products and services to be offered directly to customers and business partners. The enablers of these business services are mostly technology dependent and based on powerful and complex architectures. As a result, leading companies are turning to their IT departments to develop, provision, and maintain high-quality standards for these customer-facing applications and services.

Where Business and IT Operations Disconnect
One of the most profound problems in IT today is the pronounced disconnect between IT operations and the business lines that rely on IT to proactively manage business services in order to thrive in a highly competitive environment. Because of the complexity involved, IT has traditionally managed applications and infrastructure from the perspective of specific devices and components, including networks and systems, applications, and databases. Today, modern Web services give customers and business partners automated online methods to access information and services directly and far more efficiently. On the business side, the business lines drive the requirements for this complex online world, expecting IT to create, deploy, and assure high-quality, consistent online services that operate without outages and perform optimally. On the operational side, IT often manages these business services by monitoring the health of individual components rather than from the more holistic perspective of the business services or processes those components enable.

This is where the misalignment between business and IT is amplified, and where IT operations become a bottleneck. This is particularly true if IT engages legacy approaches and tools designed to manage applications and supporting infrastructure in technology silos, as though they aren't related to the very business they support.

With the advent of new accelerated business models, like that of e-business, IT has had the difficult, if not impossible, task of keeping up with business demands. The time has come for a new operational approach to application and infrastructure management that enables IT operations to perform in lockstep with the business lines it is supporting - running IT as a business. This article outlines a new business-centric approach to IT operations management, along with a new software solution designed from the ground up that enables this approach, and a use case of a technology leader, BEA Systems, Inc. BEA adopted PANACYA's eBusiness Application Performance Management Suite to achieve business-focused operational excellence and help maintain not only its technology leadership but also its continued leadership in serving customers and business partners.

The BEA Business Driver
BEA wanted to extend its leadership in the application infrastructure market, which enables enterprises to rapidly develop, deploy, manage, integrate, and secure enterprise applications. To do so, they set out to enhance the quality and reach of services to their users and business partners with enriched online services, including a developer community (dev2dev), a self-service and support portal (eSupport), a partner network (PartnerNET), and www.bea.com, to mention a few. Focusing on customer service, BEA established an "operational excellence" initiative to define and establish operational processes and management services to support these critical online services. The idea was to extend BEA's high-quality standards for excellence in its products to operations and its own online services. BEA's IT operations team subsequently set out to find enabling technologies to assist in enhancing services and operational efficiencies.

A High Standard for Application and Infrastructure Management
As with many progressive IT organizations, BEA's IT operations team wanted more than just the ability to detect the availability status of systems and network components. They wanted proactive visibility across all applications and supporting infrastructure in order to detect performance bottlenecks and faults before they impacted their end users' online experience. With the bar set very high, and consistent with its own high-quality business standards, the BEA operations team set stringent guidelines and requirements for the management system that would empower IT operations to support its business-critical online services.

After evaluating solutions from 13 of the leading infrastructure, applications, and network management vendors, the team selected PANACYA as an integral part of the solution set. Following are some of the selection criteria and also the guiding principles for IT operational excellence.

  • Proactive problem identification and resolution
    - Granular and proactive application visibility and control: The ability for a management solution to provide granular visibility into applications down to the management of J2EE components, applets, and methods.
    - Intelligent root-cause analysis: The ability to not only detect probable faults but also determine the detailed root cause of an emerging problem before it causes a fault and/or service outage.
    - Correlating faults between applications and infrastructure: The ability to easily correlate information among the components that deliver online services from the network to the provisioning applications themselves and ultimately with the customer experience in real time.
    - End-to-end service-level monitoring: The ability to view critical applications and associated infrastructure components holistically and from a service perspective. BEA wanted to understand the relationships and dependencies among components and how they interact in real time to provide online services. BEA also wanted the ability to enable this service-centric monitoring across IT functional areas of responsibility to provide a cohesive and collaborative business-focused operations capability.
    - Managing the entire service delivery stack: The ability for the management system to support the entire infrastructure stack and any application. Also, the ability to support not only the current environment but also one that would adapt to future requirements and associated applications.
    - Alerting, notification, escalation, and reporting: The ability to alert and notify operations personnel and business line managers with information pertinent to their area of responsibility. BEA also wanted an automated alerting and escalation capability in accordance with their best-practices operational policy. Additionally, BEA required a robust real-time and historical reporting capability.
    - Quick time-to-value: The solution needed to be implemented quickly and efficiently and enable high value with low maintenance.

  • Managing applications and infrastructure to business objectives: In order to raise the bar and provide the highest possible service to customers and business partners, BEA was searching for an enabling management suite to coordinate the IT operations organization and focus it on the online customer services. The solution needed to provide proactive and detailed information about the key performance indicators of service quality and the customer experience. Leveraging this approach would enable BEA's operations staff to troubleshoot problems in real time, and optimize applications and infrastructure components in order to meet or exceed the high-quality standards they set.

  • Operational collaboration across functional areas of responsibility: The management solution needed to provide a mechanism for providing accurate and timely information to a diverse group of users - database administrators, application support personnel, application developers, network and systems administrators, IT managers, and business managers. Each functional group was to use one integrated management solution in order to gain proactive insight into their own domain and across the entire infrastructure to enable operational collaboration and hence the highest possible service to partners and customers.

    The PANACYA Solution
    Note: The examples used in this document have been modified to protect the confidentiality and intellectual property of BEA and PANACYA.

    PANACYA met BEA's requirements and operational business needs by implementing a solution that integrates Internet applications and associated infrastructure and business-centric service management into one cohesive management paradigm. The following details demonstrate three simple yet powerful steps the two teams used to deliver a solution enabled by PANACYA's bAWARE Application Performance Management Suite.

    Step 1: Model IT infrastructure to business-centric service views
    First, the PANACYA and BEA teams identified and organized application components and infrastructure into cohesive managed-service models. They used the PANACYA Management Suite's modeling capability to create visual models of all the objects that contribute to delivering online services. These models consisted of BEA's customer-facing applications and online services, including sales, customer support, developer center, and business partner services, as well as applications that support internal operational functions. The modeling enabled not only visual mapping of online services to applications and infrastructure but also detailed modeling of relationships and dependencies among application components and infrastructure components as they relate to business services and processes. Figure 1 depicts an example of these models.

    PANACYA's Application Performance Management Suite enabled BEA administrators to define the relationships among application and infrastructure components with respect to online processes and services. Figure 1 shows a logical model of critical services as well as associated application and infrastructure component relationships and dependencies. Models are created by simply dragging discovered objects from the object palette, located on the left side, onto the Modeling Canvas located on the right side (main view). By moving away from the more traditional physical topology type of visualization techniques used by other management solutions - which are brittle, require constant change, and do not associate infrastructure with business processes and services - the BEA team logically modeled its critical applications, processes, key performance indicators, and infrastructure as a starting point to create an operationally easy-to-understand and collaborative solution.

    PANACYA is the first management vendor to incorporate semantic modeling as a core feature of its solution. There are many benefits of this model-based approach to application management, but BEA exemplifies perhaps the two most significant - proactive visibility into the complete service delivery infrastructure and operational collaboration. PANACYA's modeling capability enabled BEA's environment to be visually documented and published to users across many functional areas, including operations personnel responsible for managing databases, networks and systems, and application platforms, as well as application developers. The entire operational and development community benefited because they could easily visualize online services and all their component parts in one place and understand how components within the applications relate to one another and, more importantly, how they impact online services in real time.

    PANACYA's advanced modeling and visualization technology supports three types of models: logical infrastructure, business process, and workflow. These models can be easily associated with one another to provide true visualization and correlation among complex applications, infrastructure, and the business processes they support. A model can document a simple relationship, such as an application server's dependency on its underlying operating system or database server, or instead define a more complex relationship among supporting applications in sequential steps of a business process. Building relationships is a simple drag-and-drop operation via the suite's modeling interface.
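
    As a rough illustration of the dependency relationships such a model documents, the Python sketch below represents a logical model as a "depends on" graph and asks which services a given component could affect. The ServiceModel class and the server, host, and database names are hypothetical; only eSupport and dev2dev are taken from this article, and nothing here reflects PANACYA's actual data model or BEA's real topology.

```python
class ServiceModel:
    """Hypothetical logical model: nodes are services or components,
    and each edge means 'dependent relies on dependency'."""

    def __init__(self):
        self._depends_on = {}

    def add_dependency(self, dependent, dependency):
        # Register both ends so that leaf components are known nodes too.
        self._depends_on.setdefault(dependent, set()).add(dependency)
        self._depends_on.setdefault(dependency, set())

    def dependencies_of(self, node):
        """Everything `node` relies on, directly or transitively."""
        seen = set()
        stack = [node]
        while stack:
            current = stack.pop()
            for dep in self._depends_on.get(current, ()):
                if dep not in seen:
                    seen.add(dep)
                    stack.append(dep)
        return seen

    def impacted_by(self, component):
        """Every node that relies on `component`, directly or transitively."""
        return {
            node
            for node in self._depends_on
            if component in self.dependencies_of(node)
        }


# Illustrative topology only (assumed, not BEA's).
model = ServiceModel()
model.add_dependency("eSupport", "weblogic-1")
model.add_dependency("weblogic-1", "solaris-host-1")
model.add_dependency("weblogic-1", "support-db")
model.add_dependency("dev2dev", "weblogic-2")
model.add_dependency("weblogic-2", "support-db")

print(sorted(model.impacted_by("support-db")))
# ['dev2dev', 'eSupport', 'weblogic-1', 'weblogic-2']
```

    Storing only the direct "depends on" edges and deriving the transitive impact on demand keeps the model small, which is one plausible reason a logical model is less brittle than a hand-maintained physical topology map.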

    Beyond the stated benefits, PANACYA's modeling and visualization technology is greatly enhanced by applying advanced peer-to-peer intelligence to it. PANACYA has pioneered distributed intelligence and advances the state of the art in management technology by applying prepackaged intelligence modules (Intelligent Behavior Experts - BeXs) to these customer-defined models. Instead of using fixed thresholds to detect faults and performance problems, an approach common to most management solutions, PANACYA leverages patent-pending BeX technology, which automatically adapts to the environment and the varying load conditions consistent with Web architectures. BeXs enable performance, availability, and resource utilization to be proactively monitored across the boundaries of infrastructure components - end-to-end and from the customer experience through the application components and platforms to the back-end databases.
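
    The difference between a fixed threshold and a check that adapts to load can be sketched in a few lines of Python. This is not the BeX algorithm, which the article does not describe; it is a minimal stand-in that assumes "normal" is the mean of recent samples and "abnormal" is anything more than a chosen number of standard deviations away. The function names, the 450 ms limit, and the sample response times are all invented for illustration.

```python
import statistics


def fixed_threshold_alert(value, limit):
    """Conventional check: alert whenever the metric exceeds a static limit."""
    return value > limit


def adaptive_alert(history, value, tolerance=3.0):
    """Alert when `value` deviates from recently observed behavior.

    Assumption: normal behavior is summarized by the mean and sample
    standard deviation of `history`.
    """
    if len(history) < 2:
        return False  # not enough data to know what normal looks like
    mean = statistics.mean(history)
    spread = statistics.stdev(history)
    if spread == 0:
        return value != mean
    return abs(value - mean) > tolerance * spread


# Response times in milliseconds (made-up numbers).
quiet_period = [100, 110, 95, 105, 90]
busy_period = [420, 450, 430, 470, 440]

# 400 ms during a quiet period: under the static limit, yet far from normal.
print(fixed_threshold_alert(400, limit=450))   # False
print(adaptive_alert(quiet_period, 400))       # True

# 460 ms during a busy period: over the static limit, yet ordinary for that load.
print(fixed_threshold_alert(460, limit=450))   # True
print(adaptive_alert(busy_period, 460))        # False
```

    The same static limit misses the first anomaly and raises a false alarm on the second, which is the weakness of fixed thresholds under the varying load conditions described above.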

    Using PANACYA technology, intelligent models can be easily created that logically define infrastructure relationships and dependencies like the model depicted in Figure 2 or with a process orientation such as the model in Figure 3. These models can then be associated with each other to relate infrastructure relationships and performance behaviors to critical processes. In either scenario, PANACYA's modeling capability brings business context to application and infrastructure management, while its distributed peer-to-peer intelligence enables proactive performance and fault detection and root-cause analysis.

    This advanced capability enables users, for the first time, to visualize critical services in a way that is most understandable to their job functions, while gaining insight into how business processes, applications, and infrastructure work together holistically to provide online services. PANACYA's model-driven visualization technology not only makes it possible to understand complex application environments but also helps to address and mitigate the finger-pointing that is pervasive in operations today. By receiving simple-to-understand, holistic views of services and their component parts, operations personnel can focus immediately on the pressing issues and accurately pinpoint and correct problems that may impact services, instead of following the more common practice of managing systems and devices independently of the state of business services. This service modeling and visualization approach enables BEA's system, network, database, and application support staff to collaborate in order to isolate and fix problems much faster than previously possible.

    Step 2: Deploy distributed peer-to-peer out-of-the-box intelligence to proactively monitor applications and infrastructure
    After organizing the infrastructure into cohesive service-delivery models, the BEA team deployed PANACYA's out-of-the-box BeXs to the previously modeled services. PANACYA has created BeXs for the most popular Web servers, application servers, databases, network devices, and systems. Highly intelligent modules, BeXs automatically self-adapt to the environment and learn the desired behaviors of application platforms and infrastructure components without configuration. Then, instead of bombarding operators with thousands of events and reports sent to centralized consoles, BeXs alert operators only to abnormal events, judged against the learned normal behavior, that could impact components and services. BeXs are highly deterministic and often detect emerging problems and trends long before they become faults and affect services.

    BEA deployed out-of-the-box BeXs to manage BEA WebLogic Server, custom J2EE application components, enterprise databases, Cisco routers and switches, Solaris and Dell servers, and user transactions that traverse the infrastructure stack. By leveraging the information provided by the service models regarding component relationships and dependencies, and by leveraging the peer-to-peer distributed architecture, BeXs work together in an intelligence network to detect and pinpoint problems across the entire service-delivery stack.
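
    One simple way to picture how dependency information can narrow a problem down is the heuristic sketched below in Python: among all components currently reporting abnormal behavior, keep only those with no abnormal component anywhere beneath them in the dependency graph. This is an assumed, deliberately naive rule for illustration, not PANACYA's peer-to-peer correlation logic, and the component names other than eSupport are hypothetical.

```python
def probable_root_causes(depends_on, abnormal):
    """Return the abnormal components that have no abnormal dependency.

    depends_on: dict mapping a component to the components it relies on.
    abnormal:   set of components currently reporting abnormal behavior.
    """

    def has_abnormal_dependency(node, seen):
        for dep in depends_on.get(node, ()):
            if dep in seen:
                continue  # already visited; also guards against cycles
            seen.add(dep)
            if dep in abnormal or has_abnormal_dependency(dep, seen):
                return True
        return False

    return {
        node
        for node in abnormal
        if not has_abnormal_dependency(node, {node})
    }


# Illustrative stack (assumed): service -> app server -> database -> host.
depends_on = {
    "eSupport": ["weblogic-1"],
    "weblogic-1": ["support-db", "solaris-host-1"],
    "support-db": ["solaris-host-2"],
}

# The service, the application server, and the database all look abnormal.
abnormal = {"eSupport", "weblogic-1", "support-db"}

print(probable_root_causes(depends_on, abnormal))
# {'support-db'}
```

    Three symptoms collapse into one candidate, so the operator starts with the database rather than with three separate alarms.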

    When BeXs detect emerging problems, they display what they have learned through the service models so that operators always consider performance anomalies and faults in the context of business process models and services.

    Figure 4 depicts BeXs reporting a potential service-threatening performance problem and its impact on the service model. We also see how BeXs provide detailed drill-downs into the root cause of the problem. Everything IT operations needs to proactively monitor applications and services is provided in one integrated and intelligent management console - allowing it to detect and fix problems across the infrastructure stack in real time.

    In addition to gaining proactive control of the application and infrastructure environment, PANACYA provides a unique solution to define and manage the environment according to business rules and policy via BeX Studio. Depicted in Figure 5, BeX Studio is an advanced visual analytical tool that enables both operations personnel and analysts to encapsulate and automate operational procedures, service level agreements, and advanced business policy rules that establish and define best practices in operational automation.

    As an example, BEA is using BeX Studio to automate the detection of abnormal network behaviors; correlate abnormal contention between BEA WebLogic and many of its enterprise databases; and encapsulate, automate, and scale a multitude of operational and institutional best practices and knowledge. Instead of creating custom scripts that become difficult to maintain, BeX Studio uses "scriptless" intelligent objects that can be easily deployed, replicated, and managed. It provides a vehicle to scale operational expertise and automation and dramatically enhances operational efficiencies while negating the need for many subject matter experts. Additionally, BeX Studio is tightly integrated with PANACYA's bAWARE Suite and leverages the advanced analytical capabilities of PANACYA's intelligent distributed peer-to-peer architecture.

    Step 3: Provide policy-driven service views of application and infrastructure performance to operations and business staff members
    Empowering BEA's operations staff with integrated service views of the entire service-delivery stack enabled the business-focused proactive control and operational collaboration necessary to achieve "operational excellence." The final step in the project was to create roles, responsibilities, and custom views for operations staff and managers responsible for managing the online services. PANACYA's drag-and-drop policy modeling capability eased this process at BEA.

    The most difficult part of the process was determining who had access to PANACYA's Management Suite and their associated roles and responsibilities in the context of the specified services and/or components of the managed services. Because of PANACYA's integrated and object-oriented design, defining and deploying operational policy was a simple matter. Roles were first created for IT managers, administrators, operators, and business managers. Attributes for these roles were assigned according to job function and responsibilities. Then, using a drag-and-drop metaphor, the policy objects were applied to the many service models. Instead of using laborious methods for defining and implementing alerts, notification, and usage rights policies, PANACYA's approach enabled BEA operational policy to be defined and deployed in less than a day.
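
    The roles-then-attributes-then-apply sequence described above maps naturally onto a small data structure. The Python sketch below is a hypothetical rendering of that idea: the Role and ServicePolicy classes, the three severity levels, and the specific attributes are assumptions for illustration, not PANACYA's policy objects.

```python
from dataclasses import dataclass, field

# Assumed severity scale, lowest to highest.
SEVERITY_ORDER = {"info": 0, "warning": 1, "critical": 2}


@dataclass(frozen=True)
class Role:
    name: str
    min_severity: str     # lowest severity this role is notified about
    can_drill_down: bool  # whether the role may open root-cause details


@dataclass
class ServicePolicy:
    service: str
    roles: list = field(default_factory=list)

    def apply(self, role):
        """Attach a role to this service model (the 'drop' in drag-and-drop)."""
        self.roles.append(role)

    def recipients(self, severity):
        """Names of the roles that should be notified of an event."""
        level = SEVERITY_ORDER[severity]
        return [
            role.name
            for role in self.roles
            if level >= SEVERITY_ORDER[role.min_severity]
        ]


operator = Role("operator", min_severity="info", can_drill_down=True)
administrator = Role("administrator", min_severity="warning", can_drill_down=True)
business_manager = Role("business manager", min_severity="critical", can_drill_down=False)

policy = ServicePolicy("eSupport")
for role in (operator, administrator, business_manager):
    policy.apply(role)

print(policy.recipients("warning"))
# ['operator', 'administrator']
print(policy.recipients("critical"))
# ['operator', 'administrator', 'business manager']
```

    Because a role is defined once and then applied to many service models, changing who hears about what means editing one object rather than many per-service alert rules.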

    PANACYA's bAWARE Application Performance Management Suite is a complete and highly integrated solution. It not only enabled BEA to quickly deploy the solution across all Web services but also provided the BEA operations staff with integrated views of the entire service-delivery stack.

    PANACYA provided BEA with many different views into their application infrastructure to accommodate the diverse needs of the user community. Pictured in Figure 6 is PANACYA's Web-based Event Console. This enables users to pick and choose services and components from an object palette and view performance information either in real time or historically across the entire infrastructure stack. Events are categorized by severity and color-coded for easy viewing and to reflect state. Because PANACYA's underlying monitoring system is highly intelligent, the views are generally "quiet" and reflect only abnormal performance conditions unless the user requests all information (normal and abnormal behaviors) from the intelligent monitoring BeX technology. Additionally, PANACYA's BeXs store the learned normal behaviors of all metrics across all components by hour, day, week, and month (dynamic baselining). Operators use this information both to proactively detect abnormal performance behaviors in real time and to perform capacity-planning tasks.
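
    To make the idea of storing normal behavior by time period concrete, here is a minimal Python sketch that keeps one baseline per hour of the day and uses it for both purposes mentioned above: judging a new reading against what is usual for that hour, and finding the busiest hour for capacity planning. The HourlyBaseline class, the 50 percent tolerance, and the sample figures are assumptions; how BeXs actually store and evaluate baselines is not described in this article.

```python
from collections import defaultdict
from statistics import mean


class HourlyBaseline:
    """Hypothetical store of observed metric values, bucketed by hour of day."""

    def __init__(self):
        self._samples = defaultdict(list)  # hour (0-23) -> observed values

    def record(self, hour, value):
        self._samples[hour].append(value)

    def normal(self, hour):
        """Learned normal value for this hour, or None if nothing was recorded."""
        values = self._samples.get(hour)
        return mean(values) if values else None

    def is_abnormal(self, hour, value, tolerance=0.5):
        """True when `value` differs from the hour's baseline by more than
        `tolerance` (a fraction of the baseline)."""
        baseline = self.normal(hour)
        if baseline is None:
            return False
        return abs(value - baseline) > tolerance * baseline

    def busiest_hour(self):
        """Hour with the highest baseline; a starting point for capacity planning."""
        if not self._samples:
            return None
        return max(self._samples, key=lambda hour: mean(self._samples[hour]))


# Requests per second (made-up numbers) at 03:00 and 14:00.
baseline = HourlyBaseline()
for value in (20, 24):
    baseline.record(3, value)
for value in (200, 220):
    baseline.record(14, value)

print(baseline.is_abnormal(3, 150))    # True: far above the usual night-time load
print(baseline.is_abnormal(14, 150))   # False: within the usual afternoon range
print(baseline.busiest_hour())         # 14
```

    Daily, weekly, and monthly baselines would follow the same pattern with a different bucket key.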

    PANACYA's eService Management Portal also supports Radar Service Views. Shown in Figure 7, the Radar View displays services and all their associated infrastructure components in this single view. If service levels are violated or if any component exhibits abnormal performance, it is automatically displayed in the inner bands of the Radar. The closer to the center an object appears, the more adverse an impact it has on the managed service. Additionally, operators can drill down into the details and the root cause by simply clicking on the object. The drill-down in Figure 7 shows normal behavior in gray and the actual component behaviors in red.
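
    The placement rule of the Radar View (more adverse impact means closer to the center) can be expressed as a small mapping from an impact score to a ring. The Python sketch below assumes a normalized impact score between 0 and 1 and four rings; both are invented for illustration, since the article does not say how impact is scored or how many bands the view has.

```python
def radar_band(impact, bands=4):
    """Map an impact score to a ring index on a radar-style view.

    impact: assumed score from 0.0 (no impact) to 1.0 (most adverse).
    Returns 0 for the innermost ring and bands - 1 for the outermost.
    """
    if not 0.0 <= impact <= 1.0:
        raise ValueError("impact must be between 0.0 and 1.0")
    # Invert the score so that high impact lands near the center.
    return min(bands - 1, int((1.0 - impact) * bands))


print(radar_band(1.0))  # 0: innermost ring
print(radar_band(0.5))  # 2
print(radar_band(0.0))  # 3: outermost ring
```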

    All of the performance management information in PANACYA's Management Suite is stored and displayed this way in real time and historically. PANACYA provides many different views and tools to aid in proactively monitoring and resolving problems. The accuracy of the information, the correlation of events across applications and infrastructure, and the root-cause analysis provided would not be possible without PANACYA's distributed, intelligent peer-to-peer architecture.

    The Outcome: Operational Excellence
    PANACYA has provided BEA with a significant capability and enabling technology to achieve operational excellence in order to provide the highest possible service levels to customers and business partners. This includes the ability to proactively view and control distributed and mission-critical applications, and correlate applications and infrastructure to business processes and services in real time. This may seem like an obvious outcome, but it's difficult to find this kind of integrated and proactive management solution deployed across an infrastructure and "operationalized," even in the most successful and progressive of companies. PANACYA endeavors to seek out other leaders in the industry who want to challenge current thinking and conventional wisdom and who, like BEA, aspire to achieve operational excellence.

    Franco Negri is the founder, CTO, and chief strategist of PANACYA. In a 23-year career with leading suppliers and consumers of advanced management technology, Franco developed a keen understanding of market needs and a strong vision for the next generation. He was most recently VP, Product Marketing and VP, Research & Development at Computer Associates, where he was responsible for Unicenter TNG - CA's flagship Enterprise Systems Management product line.
