
Infrastructure and platform management: ITIL 4 Practice Guide

1401/07/12 · Updated on 2023/06/03


1. About this document

This document is split into five main sections, covering:

  • general information about the practice
  • the practice’s processes and activities and their roles in the service value chain
  • the organizations and people involved in the practice
  • the information and technology supporting the practice
  • considerations for partners and suppliers for the practice.

ITIL® 4 qualification scheme

Selected content of this document is examinable as a part of the following syllabus:

  • ITIL Specialist: High-velocity IT

Please refer to the respective syllabus document for details.


2. General information


2.1 PURPOSE AND DESCRIPTION

Key message

The purpose of the infrastructure and platform management practice is to oversee the infrastructure and platforms used by an organization. When carried out properly, this practice enables the monitoring of technology solutions available to the organization, including the technology of external service providers.

The infrastructure and platform management practice ensures that the organization has a high-quality IT infrastructure that efficiently meets its current and anticipated needs. ‘IT infrastructure’ as a concept includes all of the hardware, software, networks, and facilities that are required to develop, test, deliver, monitor, manage, and support IT services.

Depending on the architecture of the organization’s IT infrastructure, this practice may focus on the management of the physical environment, physical equipment, or digital infrastructure solutions, which may be the organization’s own resources or services provided by suppliers and partners. Often, IT infrastructure solutions are managed as services; in these cases, the infrastructure and platform management practice may include dedicated teams acting as service providers for the application and/or product teams within the organization. If this approach is taken, it is important to ensure that the infrastructure and platform teams are closely involved in the overall service delivery activities of the organization and follow the ITIL guiding principles: focus on value, think and work holistically, and collaborate and promote visibility. Members of these teams should understand the wider context of the organization and its service value system (SVS).

This practice covers all stages of the infrastructure solutions lifecycle, from ideation and gathering requirements to delivery and support. At every stage, it is used in conjunction with other practices (including the business analysis, architecture management, service design, availability management, capacity and performance management, service continuity management, information security management, risk management practices, and others). The importance of high-quality infrastructure and platforms for service delivery cannot be overstated; this practice is vital for the success of the organization’s digital services and digitized business processes.


2.2 TERMS AND CONCEPTS

The infrastructure and platform management practice provides the structure to deliver and support stable and well-performing technology services. Infrastructure and platform management is provided directly to the business, or supports the applications used by the business. With a robust infrastructure and platform management practice, an organization can enable value creation with the confidence that the underlying technology will meet the organization’s and service consumers’ needs.

Definition: IT infrastructure

All of the hardware, software, networks, and facilities that are required to develop, test, deliver, monitor, manage, and support IT services.

A wide range of activities are used to run and manage IT infrastructure effectively. These activities range from understanding the organization’s requirements and developing and planning infrastructure and platforms, to performing routine maintenance and overseeing infrastructure performance.

Definition: Operation

The routine of running and managing an activity, product, service, or other configuration item.

A large portion of the operational activities can be automated. Automation tools can monitor the environment, identify changes, distribute patches and other updates, provide asset inventory, and schedule and automate jobs.
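
As a hedged illustration of the kind of job an automation tool might schedule, the sketch below compares an asset inventory with a required patch level and lists the hosts that still need an update. The inventory layout, the host names, and the `hosts_needing_patch` helper are hypothetical and are not taken from ITIL or from any specific tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    """One inventory record (hypothetical layout, for illustration only)."""
    hostname: str
    package: str
    version: tuple  # version as a tuple of integers, e.g. (3, 0, 8)


def hosts_needing_patch(inventory, package, required):
    """Return hostnames whose installed version of `package` is below `required`."""
    return sorted(
        asset.hostname
        for asset in inventory
        if asset.package == package and asset.version < required
    )


# Invented sample inventory.
inventory = [
    Asset("web-01", "openssl", (3, 0, 8)),
    Asset("web-02", "openssl", (3, 0, 13)),
    Asset("db-01", "openssl", (1, 1, 1)),
]

outdated = hosts_needing_patch(inventory, "openssl", (3, 0, 13))
print(outdated)
```

In practice the inventory would come from a discovery or asset management tool rather than a hard-coded list; the comparison logic is the part that lends itself to scheduling.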


2.2.1 Business alignment for infrastructure and platform solutions

Infrastructure and platform solutions are designed to meet specific quality criteria defined to support the organization’s needs. The infrastructure and platform management practice is closely connected with the architecture management practice, ensuring that all infrastructure and platform solutions comply with the chosen architectural approach, model and standards, as well as sharing knowledge on the innovation available and feeding possible infrastructure and platform solutions into architecture management. The infrastructure and platform management practice must support application architecture, data architecture, and business architecture as well as align to the organization’s overall vision and principles.

To ensure alignment to the overall architectural model, standardized infrastructure and platform solutions are defined to meet the organization’s needs in a repeatable manner, to simplify delivery and ongoing management for these services. Standardized services allow for efficient provisioning through repeatability and automation. Many infrastructure services are designed to enable speed and agility. Self-service capabilities leverage automation to allow users or other IT staff to request and receive items without manual steps behind the scenes. This should account for the majority of the services that are utilized in the environment. Examples of standardized solutions may include storage systems, application servers, database platforms, authentication systems, single sign-on, and others.

In integration with the architecture management practice, the infrastructure and platform management practice should ensure the development or outsourcing and cost-efficient operation of flexible and compatible core infrastructure and platform solutions. These solutions should be easily deployable and easily configured or merged to support the organization’s services or products, serving as building blocks for complex solutions, products, and services. One example of implementing such an approach is the use of microservices, which are “small in size, messaging-enabled, bounded by contexts, autonomously developed, independently deployable, decentralized and built and released with automated processes”.[1]

When the standard solution does not align with the business, a tailored or customized solution must be developed. The selection of a non-standard service delays the delivery of the solution and increases the ongoing effort and cost to the business for support of the solution. These non-standard solutions should be deployed and managed as exceptions due to the additional overhead they require.

In cases where the technology is not currently in place, the solution must be designed together with the architecture management and service design practices for conceptual and detailed design. During design, the infrastructure and platform management practice aligns business and technical requirements and determines the recommended infrastructure and platform solutions. As the solution is not currently available within the environment, additional steps are taken to address the procurement, build, sourcing, and support of the solution. The solution should be evaluated by infrastructure and enterprise architecture to determine whether it should be offered to additional consumers or remain an exception to the existing documented standards.

When an organization needs an infrastructure and platform solution, the infrastructure and platform management practice ensures that a solution is designed and delivered to meet the organization’s requirements. There are several ways to provide a solution. For requests that can be fulfilled using documented standard packages, the solution is provided through defined provisioning methods.

2.2.2 Infrastructure and platform solution technologies – physical and virtual

The technology used for infrastructure and platform solutions is either physical or virtual. Physical resources run directly from the hardware, such as an operating system that is installed directly on the hardware. This operating system can either host the application or services directly or virtual systems can run on top of it.

Virtualization allows for additional systems to be built on the physical system. Virtualization software runs on the hardware and allows for additional operating systems that are isolated and separated to be installed, creating multiple servers residing on the physical server. All virtual systems may run on the same or different hardware, but the virtual capabilities allow for dynamic workload placement and other capabilities; it also allows for better utilization of the hardware. The logical structure that connects the virtual servers and the physical servers should be accounted for in the configuration management database (CMDB). Additional capabilities that allow for dynamic moving of workloads should also be represented in the data model.
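
One possible way to picture the data model described above is sketched below: each virtual server is recorded with a "runs on" relationship to a physical host, and a dynamic workload move simply updates that relationship. The class, method, and server names are illustrative assumptions, not the schema of any CMDB product.

```python
class MiniCmdb:
    """Toy CMDB holding 'virtual server runs on physical host' relationships."""

    def __init__(self):
        self._host_of = {}  # virtual server name -> physical host name

    def place(self, vm, host):
        """Record (or update) the physical host a virtual server runs on."""
        self._host_of[vm] = host

    def host_of(self, vm):
        """Return the physical host currently recorded for a virtual server."""
        return self._host_of[vm]

    def vms_on(self, host):
        """List the virtual servers that would be affected if `host` failed."""
        return sorted(vm for vm, h in self._host_of.items() if h == host)


cmdb = MiniCmdb()
cmdb.place("vm-app-1", "phys-a")
cmdb.place("vm-app-2", "phys-a")
cmdb.place("vm-db-1", "phys-b")

# A dynamic workload move is recorded as an updated relationship.
cmdb.place("vm-app-2", "phys-b")

print(cmdb.vms_on("phys-a"))
print(cmdb.vms_on("phys-b"))
```

Keeping the relationship current is what makes impact questions, such as which virtual servers depend on a given physical host, answerable after workloads have moved.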

Infrastructure technologies, such as software-defined networking, virtual servers, and object storage, simplify the provisioning of infrastructure services. This allows the organization to deliver services quickly through automation.

Virtualization has greatly improved provisioning, performance, capacity, and availability for solutions. Further development in the virtualization direction is the usage of infrastructure-as-code (IaC) solutions. IaC is a way of managing and provisioning IT infrastructure and platforms by using machine-readable definition files rather than physically configuring hardware components. IaC solutions significantly speed up design (including hypothesis testing), development, building, provisioning and changing the infrastructure and platform solutions. Such solutions also usually make the infrastructure more reliable and fault resistant.
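
To illustrate the idea of machine-readable definition files, the following minimal sketch treats a definition as plain data and computes the actions needed to bring the current environment in line with it. The definition format, the server names, and the `plan` function are hypothetical; this is not the syntax or behaviour of any real IaC tool.

```python
import json

# Desired state, as it might be stored in a version-controlled definition file.
DEFINITION = json.loads("""
{
  "servers": {
    "web-01": {"cpus": 2, "memory_gb": 4},
    "web-02": {"cpus": 2, "memory_gb": 4},
    "db-01":  {"cpus": 8, "memory_gb": 32}
  }
}
""")


def plan(desired, actual):
    """Compare desired and actual server specs and return the required actions."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions


# Current state of the environment (hard-coded here; a real tool would query it).
actual_state = {
    "web-01": {"cpus": 2, "memory_gb": 4},
    "db-01": {"cpus": 4, "memory_gb": 16},
    "old-batch": {"cpus": 1, "memory_gb": 2},
}

actions = plan(DEFINITION["servers"], actual_state)
print(actions)
```

Because the definition is data, it can be reviewed, versioned, and tested like application code, and running the plan against an environment that already matches the definition yields no actions.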

2.2.3 Infrastructure and platform solution delivery models

Advancements in technical capabilities have changed how services are delivered. Service providers have embraced the ability to scale services. As organizations move to service offerings that allow for flexibility in terms of how the service is provided, the organization can choose the model that best aligns to its strategic goals. Often, the preferred model is a combination of internally and externally provided services. This complexity drives the need for a comprehensive management approach that ensures end-to-end delivery meets customer expectations.

There are many models for providing infrastructure and platform solutions, ranging from in-house dedicated data centres to fully outsourced cloud environments. Many organizations continue to provide and support infrastructure residing in their internal data centres. They can also use solutions external to their organization. Cloud solutions provide offerings that allow systems and applications to run in internal and external data centres. Most enterprises use public cloud providers for at least part of their infrastructure. Cloud providers offer many solutions based on the expected needs of the business. An application may be accessed through the cloud, leaving infrastructure management activities beyond connecting to the cloud to be done externally by the application provider. Cloud offerings can include platforms for application development and infrastructure-specific services like storage or backup as a service.

There is usually a mix of public and private cloud services in any organization. Both cloud services and outsourcing can provide infrastructure and platform services. Cloud services provide technical capabilities whereas outsourcing performs IT functions in a similar manner to internal teams. The contract defines the outsourcing scope and service levels. Instead of managing technology directly, internal IT teams focus on managing the contractual obligations and interactions with internal teams in an outsourced environment.

2.2.4 Agile methods in infrastructure and platform management

Recent technology innovations have enabled changes to how infrastructure is delivered and supported. Development practices have been adopted by teams providing infrastructure and platform solutions. Engineering and support functions rely heavily on coding and other development capabilities for automation.

Along with a focus on development from a system perspective, many organizations have also moved into models that blend development and infrastructure capabilities on one team to provide coverage throughout the lifecycle. DevOps and site reliability engineering (SRE) are examples of these models.

Specifically, DevOps brings a robust landscape of tools to automate the tracking, building, and deploying of small, agile-based releases. Agile is a development framework, but DevOps includes the infrastructure components and operational activities. DevOps focuses on the opportunities across all technology components and drives automation to enable rapid system updates. Infrastructure can now fully benefit from structured development practices.

By accounting for the end-to-end development and management of the solution, this approach allows for operational improvements to be included in the development releases. Machine learning and AIOps leverage data collected on solutions to automate, address issues, or manage requests without development. Through operational visibility and development capabilities, the overall system is managed in a more comprehensive and consistent manner through automation.

When using DevOps for infrastructure and platform management, special attention must be paid to obsolete systems and monolithic solutions that require manual operation and, therefore, slow down all management processes and changes. There should be a clear roadmap for decommissioning and replacing such solutions or replacing the manual activities with automation. One way to do this is to have an SRE team run operations.

SRE is a discipline that incorporates aspects of software engineering and applies them to infrastructure and operations problems with the goal of creating ultra-scalable and highly reliable solutions. SRE tries to bridge the gap between development and operations and to reconcile their opposing objectives: developing and releasing solutions fast, and having a stable solution to support. SRE teams usually include software developers who must support the solutions they develop, which motivates them to automate most manual support and management tasks (in the course of reducing toil: manual, repetitive, automatable, non-creative work). As a result, infrastructure and platform solutions become more manageable, require less manual work, and gain agility in changes, delivery, and support. Probably one of the most important gains of SRE operations is that scaling out the infrastructure does not lead to a corresponding linear growth in team size, as often happens in classical operations.

Key message

The practice is involved throughout the lifecycle of products and services. Figure 2.1, from “The Site Reliability Workbook” by Google, illustrates how SRE teams are involved during the lifecycle. With minor variations, this illustration is applicable to other approaches to infrastructure and platform management.
Figure 2.1 Infrastructure and platform management during product and service lifecycle

For products and components sourced outside the organization, development environments can be out of the organization’s control. For products and services delivered to service consumers outside of the organization, control over the live environment can be limited. Other variations are possible.

2.2.5 Continuous integration, continuous delivery, and continuous deployment (CI/CD)

Once the solution is in production, the primary focus of the team supporting and operating the infrastructure is to ensure high-quality delivery through managing the ongoing performance and functionality of the infrastructure and platform solutions. This team may be a dedicated infrastructure team or a dedicated product team. The products and services rely on the solution’s availability and performance to support them. In production, the organization has high expectations for uptime and very little tolerance for any impact of any type on service or product. To meet these demands, the solutions must be reliable and maintainable. Beyond infrastructure and platform configurations to support reliability and maintainability, the infrastructure and platform management practice must ensure the solution is supportable. Supportability addresses the organization’s requirements to ensure that the solution is functional and ready to support products and services.

Reliability is designed with the system. Reliability requirements are aligned to the uptime and performance requirements defined by the capacity and performance management practice. These requirements ensure the solutions are built to support the organization’s requirements. For example, this may include high availability or redundant network connectivity.

Definition: Reliability

The ability of a product, service, or other configuration item to perform its intended function for a specified period of time or number of cycles.
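
As a worked illustration of why redundancy is a common design measure, the sketch below estimates the combined availability of several redundant network links. It assumes that the links fail independently and that one working link is sufficient; both are simplifying assumptions, and the 99% figure is invented for the example.

```python
def parallel_availability(single, count):
    """Availability of `count` redundant components, each with availability `single`.

    Assumes independent failures and that one working component is sufficient,
    so the group is down only when every component is down at the same time.
    """
    if not 0.0 <= single <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if count < 1:
        raise ValueError("count must be at least 1")
    return 1.0 - (1.0 - single) ** count


one_link = parallel_availability(0.99, 1)
two_links = parallel_availability(0.99, 2)
print(f"one link:  {one_link:.4f}")
print(f"two links: {two_links:.4f}")
```

Under these assumptions a second link reduces the expected unavailability from 1% to 0.01%. Real links often share failure causes, such as a common power feed or cable route, so the actual gain is usually smaller.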

Maintainability of a system should be addressed during the design of a new system and tested before being transitioned to production. There could be rules agreed for an infrastructure and platform solution, ensuring maintainability based on the organization’s requirements and industry practices. One example is the existence of a monitoring tool to identify issues, or general monitorability of the solution planned at the design phase. Other examples could be the existence of tools used to configure, deploy, and provision the solutions. These rules could also be used to manage partners and suppliers responsible for infrastructure and platform service components.

Definition: Maintainability

The ease with which a service or other entity can be repaired or modified.

If maintainability is not addressed during the initial design and as part of daily operations, higher support costs, extended outages, and negative impacts to performance will affect the production environment. Maintainability is improved through appropriate monitoring configurations, automation, and utilization of standards.

Another aspect of maintainability involves ensuring the solution is recoverable to meet availability targets. This aspect is tightly aligned with the service continuity management practice. This may mean, for example, ensuring that the hardware contract supports on-site replacements within a set timeframe. It may also cover having on-site resources performing the repair. When committing to availability targets, the parts and resources needed to restore service need to be factored in and be in place throughout the solution lifecycle. The infrastructure and platform management practice requires that the right pieces are in place to diagnose, repair, and recover in order to restore services on time.
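
The link between an availability target and repair arrangements can be made concrete with simple arithmetic. The sketch below converts a target into an annual downtime allowance and checks whether an expected number of failures, each with a given restore time, would fit within it. The target, failure count, and restore times are illustrative assumptions, and a 365-day year is used.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_budget_hours(target):
    """Maximum downtime per year (in hours) allowed by an availability target (0..1)."""
    return (1.0 - target) * HOURS_PER_YEAR


def fits_budget(target, failures_per_year, restore_hours):
    """True if the expected yearly downtime stays within the downtime budget."""
    return failures_per_year * restore_hours <= downtime_budget_hours(target)


budget = downtime_budget_hours(0.999)
print(f"A 99.9% target allows about {budget:.2f} hours of downtime per year")

# Two failures a year: a 4-hour restore fits the budget, a 6-hour restore does not.
print(fits_budget(0.999, failures_per_year=2, restore_hours=4))
print(fits_budget(0.999, failures_per_year=2, restore_hours=6))
```

A calculation like this shows why contract terms such as on-site replacement within a set timeframe need to be agreed before an availability target is committed to.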

Automation is also used to improve a system’s maintainability. Repeatable actions are excellent candidates for automation. Software development and management tools and techniques, such as Agile and DevOps, can be applied to infrastructure and platform management to drive frequent updates to systems and configurations. By addressing opportunities as they are identified and implementing solutions in small releases, benefits are realized quickly.


2.3 SCOPE

The scope of the infrastructure and platform management practice includes:

  • activities used to plan, design, develop, deliver, maintain, and support infrastructure and platform technology
  • infrastructure and platform technology including:
    • hardware (servers, desktops, routers, switches, storage, cabling, and data centre)
    • software (operating systems, desktop applications, and middleware)
    • management tools (monitoring, management tools, deployment, inventory)
    • web hosting
    • cloud infrastructure and platform
    • identification systems and single sign-on (SSO).
  • infrastructure and platform management skills, including:
    • technical architecture and engineering
    • technical administration and operations
    • execution and enforcement of policies and procedures connected to infrastructure and platform management (planning, decision making, oversight).
  • integration with other practices
  • skills required for infrastructure and platform management, including infrastructure architecture, engineering, and administration.

There are many activities and areas of responsibility that are not included in the infrastructure and platform management practice, although they are still closely related to infrastructure and platform management. These are listed in Table 2.1, along with references to the practices in which they can be found. It is important to remember that ITIL practices combine value chain activities through value streams to deliver value.

Table 2.1 Activities related to the infrastructure and platform management practice described in other practice guides

Activity | Practice guide
Restoration of infrastructure and platform technology and services, including major incidents | Incident management
Defining permanent resolution or workarounds for infrastructure and platform known errors | Problem management
Management of changes to the infrastructure and platforms | Change enablement
Tracking and management of infrastructure and platform assets | IT asset management
Tracking of infrastructure and platform configurations in relationship to other configuration items (CIs) | Service configuration management
Monitoring, event management, and log management for infrastructure and platform technologies | Monitoring and event management
Infrastructure and platform design | Service design
Defining requirements for infrastructure and platform solutions | Business analysis
Definition of standards and road map for infrastructure and platforms | Architecture management

2.4 Practice success factors

Definition: Practice success factor

A complex functional component of a practice that is required for the practice to fulfil its purpose.

A practice success factor (PSF) is more than a task or activity; it includes components from all four dimensions of service management. The nature of the activities and resources of PSFs within a practice may differ, but together they ensure that the practice is effective.

The infrastructure and platform management practice includes the following PSFs:

  • establishing an infrastructure and platform management approach to meet evolving organizational needs
  • ensuring that the infrastructure and platform solutions meet the organization’s current and anticipated needs.

2.4.1 Establishing an infrastructure and platform management approach to meet evolving organizational needs

The needs of organizations and their customers are continually changing, which leads to continual transformation in the technology industry. The changes may result from industry trends, changes within organizations, business process innovation, or changes to business volumes. The infrastructure and platform management practice ensures that infrastructure and platform solutions are flexible and scalable so that they are aligned with demand. Organizational infrastructure and platforms meet this demand through optimized solutions that are designed for and used by all parts of the organization.

To properly design these solutions, teams delivering infrastructure and platform change must be aware of new technologies and techniques. The evolution of technology can be seen in examples like email, virtual server farms, storage arrays, single sign-on, and cloud platforms. When solutions are identified based on requirements, requests are promptly fulfilled. With virtual server technology that is used both internally and for cloud offerings, the turnaround time for requests can be reduced to minutes. Technological progress, such as virtualization, containers, continuous integration/continuous delivery (CI/CD), and IaC, significantly impacts the rate of change and innovation.

Organizations that deliver and support infrastructure and platform solutions have evolved through models, such as DevOps and SRE; they eliminate the use of traditional waterfall techniques in favour of end-to-end development and management within one team. Crucially, the organization’s structure and technology components must align with its overall strategic direction in order to ensure the consistent delivery and support of infrastructure and platform solutions.

It is important to plan how infrastructure and platform teams will identify, design, and introduce innovation into the environment at the solution and strategic levels. Depending on the current needs, infrastructure and platform management might need initial research and testing so that, when the need is presented, there is a clear plan of action. If the need is pressing, the technology may be selected, purchased, designed, and configured before any official requests are received.

The infrastructure and platform management practice should ensure that the infrastructure and platforms are built to promote experimentation, quick technology adoption, and the ability to test theories and hypotheses, change the infrastructure and platform iteratively with feedback, fail fast, and learn from experience and errors in a safe environment. Each organization should define its innovation and risk appetite and consider its financial constraints for innovation in the infrastructure and platform areas.

2.4.2 Ensuring that the infrastructure and platform solutions meet the organization’s current and anticipated needs

The main focus of the infrastructure and platform management practice should be ensuring that stakeholders receive value throughout the infrastructure and platform solution lifecycle. Stakeholders must be engaged from the initiation of a request or project until the solution’s retirement. Understanding stakeholder expectations, from design to the ongoing management and support of the solutions, is an essential aspect of delivering infrastructure and platform solutions. This ongoing relationship will drive improvement opportunities and ensure value continues to be co-created as the solution evolves.

When the organization needs a technical solution, requirements are defined in order to ensure that the solution meets the organization’s needs. The solution design should include technical and business requirements. The infrastructure and platform management practice is involved in analysing requirements to create a high-level design (in conjunction with the architecture management, business analysis, and service design practices, and others).

The requirements for infrastructure and platform solutions may come from different sources, including:

  • architectural standards and guidelines
  • compliance requirements, if the organization is subject to legislation
  • direct requirements from customers, if a solution is a service or service component that will be directly released to customers.

Where possible, the infrastructure and platform management practice ensures that standards can be defined and utilized in order to simplify the management of infrastructure and platform solutions. The enforcement of these standards ensures the reliability and maintainability of solutions. Standards enable efficient and effective operations and may include the hardware and software versions, configuration settings, management and monitoring tools, and support structures. Through standards, solutions are easier to operate, monitor, and upgrade.

Designs should be assessed against current and planned standards and validated against the current and anticipated levels of availability, performance, capacity, information security, and so on. Management practices supporting these should have active involvement.

Standard infrastructure solution packages should be utilized wherever possible. Any portion of the solution that is not standard increases cost, delays delivery, and requires customized support throughout the life of the solution. Exceptions to standards may result in extended downtime or other impacts to the customer. They may also delay teams responsible for performing other activities for other infrastructure and platform solutions.

If there are multiple exceptions to a standard, a review should be conducted to ensure that the standard still meets the organization’s needs. If it does not, a new standard should be designed and its implementation should be planned. Retiring the standard may include planning the removal of current systems that were installed as part of the retired offering in order to reduce technical debt and the potential risk to the environment. The development and maintenance of the standards and standard packages are also within the scope of the infrastructure and platform management practice.

Part of the practice’s focus is to manage risk to the organization throughout the infrastructure and platform. As part of this effort, input from practices such as information security, service continuity, and risk management is taken to ensure that risks are managed throughout the lifecycle of the solution. This ongoing management includes, for example, ensuring that network devices are configured based on defined security policies, controls are tested periodically, and risks are identified and effectively managed. Requirements are handled on an ongoing basis to prevent adverse impacts, such as extended service downtime or a security breach of confidential information.

The overall management of infrastructure and platform solutions often includes internal and third-party solutions and components. Understanding the overall structure of these solutions and ensuring that the overall level of service provided meets customer expectations is critical.

Management needs visibility to validate that solutions are performing at acceptable levels and to highlight opportunities. These may include addressing any issues and identifying areas that could be improved. The infrastructure and platform management practice should give stakeholders visibility of performance and improvement plans. This practice interacts with other practices to ensure that any issues or requests on solutions are resolved promptly. For this reason, the practice participates in agreeing targets for incident response, restoration, and request fulfilment times to align with customer expectations. This practice may include managing and reporting on the ability of solutions to meet targets. This visibility also provides an opportunity to improve performance in this area through automation or process refinement.

This practice also contributes to ensuring that the agreed-upon levels of service are met. The scope of this effort includes any internal or external components used in the solution. Third-party services must align with customer expectations, or the expectations must be reset. External providers must meet the service levels in their contracts. By managing performance levels across internal and external services, the practice is able to report performance and other outcomes to the business.
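
As a rough illustration of why internal and external service levels need to be managed together, the sketch below estimates the end-to-end availability of a solution from its components. It assumes the components are all required for the service (a serial dependency) and fail independently, in which case the combined availability is the product of the individual values. The component names, figures, and target are hypothetical.

```python
import math

# Hypothetical availability figures for one solution (fractions of time available).
components = {
    "internal_app_platform": 0.999,
    "external_hosting": 0.995,
    "external_network": 0.998,
}


def end_to_end_availability(availabilities):
    """Combined availability of serially dependent, independent components."""
    return math.prod(availabilities)


combined = end_to_end_availability(components.values())
target = 0.99  # hypothetical agreed service level
meets_target = combined >= target
```

Under these assumptions the combined figure is lower than any single component's availability, which is why a third-party service level that looks adequate in isolation may still require customer expectations to be reset.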

The infrastructure and platform management practice ensures that solutions within its scope effectively contribute to overall financial targets. Infrastructure and platform solutions should be benchmarked against cloud offerings and external provider solutions. From a technology perspective, automation, consolidation, and standardization simplify the infrastructure and platforms and release resources, which can then be used to drive value. The current and potential partnerships with external providers can also be evaluated and existing agreements optimized.

2.5 Key metrics

The effectiveness and performance of the ITIL practices should be assessed within the context of the value streams to which each practice contributes. As with the performance of any tool, the practice’s performance can only be assessed within the context of its application. However, tools can differ greatly in design and quality, and these differences define a tool’s potential or capability to be effective when used according to its purpose. Further guidance on metrics, key performance indicators (KPIs), and other techniques that can help with this can be found in the measurement and reporting practice guide.

Key metrics for infrastructure and platform management are mapped to its PSFs. They can be used as KPIs in the context of value streams to assess the contribution of the practice to the effectiveness and efficiency of those value streams. Some examples of key metrics are given in Table 2.3.

Table 2.3 Examples of key metrics for the practice success factors
Practice success factor: Establishing an infrastructure and platform management approach to meet evolving organizational needs
Key metrics:
  • Stakeholder satisfaction with the approach to management of infrastructure and platforms
  • Alignment of the infrastructure and platform management approach with the organization’s strategy and architecture
  • Number and impact of deviations from the organization’s strategy and architecture road map
  • Level of benefits, costs, and risks associated with the approach to management of infrastructure and platforms

Practice success factor: Ensuring that the infrastructure and platform solutions meet the organization’s current and anticipated needs
Key metrics:
  • Stakeholder satisfaction with infrastructure and platform solutions
  • Number and impact of infrastructure incidents
  • Number and impact of constraints imposed by infrastructure and platform solutions
  • Number and impact of deviations from the agreed approach

The correct aggregation of metrics into complex indicators will make them easier to use for the ongoing management of value streams and for the periodic assessment and continual improvement of the infrastructure and platform management practice. There is no single best solution. Metrics will be based on the overall service strategy and priorities of an organization, as well as on the goals of the value streams to which the practice contributes.
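
One possible way to aggregate metrics into a single indicator is a weighted average of normalized scores. The sketch below is only an illustration of that idea: the metric names, scales, and weights are hypothetical and would have to be chosen by the organization according to its own strategy and priorities.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Metric:
    name: str
    value: float   # observed value
    worst: float   # value that maps to a score of 0
    best: float    # value that maps to a score of 1
    weight: float  # relative importance (an organizational choice)


def normalize(metric: Metric) -> float:
    """Map the observed value onto a 0..1 scale, clamped at both ends.

    Setting worst > best handles "lower is better" metrics such as incident counts.
    """
    score = (metric.value - metric.worst) / (metric.best - metric.worst)
    return max(0.0, min(1.0, score))


def composite_indicator(metrics: List[Metric]) -> float:
    """Weighted average of the normalized metric scores."""
    total_weight = sum(m.weight for m in metrics)
    return sum(normalize(m) * m.weight for m in metrics) / total_weight


metrics = [
    Metric("stakeholder_satisfaction", value=4.0, worst=1.0, best=5.0, weight=2.0),
    Metric("infrastructure_incidents", value=10.0, worst=40.0, best=0.0, weight=1.0),
]
indicator = composite_indicator(metrics)
```

A single number like this is convenient for dashboards, but the underlying metrics should remain visible so that a change in the indicator can be traced back to its cause.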

[1] Nadareishvili, I., Mitra, R., McLarty, M., Amundsen, M., Microservice Architecture: Aligning Principles, Practices, and Culture, O’Reilly 2016

3. Value Streams and processes


3.1 Value stream contribution

Like any other ITIL management practice, the infrastructure and platform management practice contributes to multiple value streams. Remember, no value stream is made up of a single practice. The infrastructure and platform management practice combines with other practices to provide high-quality services to consumers. The main value chain activities to which this practice contributes are:

  • deliver and support
  • design and transition
  • obtain/build
  • plan.

The contribution of the infrastructure and platform management practice to the service value chain is shown in Figure 3.1.

Table 2.4 Examples of metrics for the practice success factors
Practice success factor: Establishing and maintaining effective approaches to the deployment of services and service components across the organization
Key metrics:
  • Level of stakeholders’ satisfaction with the rate of change of products and services supported by deployments
  • Rate of adoption of the agreed approach to deployment across the organization
  • Level of key partners’ and service consumers’ alignment with deployment approaches
  • Number of audit findings and external compliance issues caused by deployments

Practice success factor: Ensuring effective deployment of services and service components in the context of the organization’s value streams
Key metrics:
  • Level of stakeholders’ satisfaction with lead time to deploy
  • Percentage of successful deployments/number of deployment errors/failures
  • Number/percentage of incidents related to deployments
  • Timeliness/adherence to deployments schedule
  • Deployment backlog throughput
  • Level of stakeholders’ satisfaction with quality of deployments

The correct aggregation of metrics into complex indicators will make it easier to use the data for the ongoing management of value streams, and for the periodic assessment and continual improvement of the deployment management practice. There is no single best solution. Metrics will be based on the overall service strategy and priorities of an organization, as well as on the goals of the value streams to which the practice contributes.


Figure 3.1 Heat map of the contribution of the infrastructure and platform management practice to value chain activities

3.2 Processes

Each practice may include one or more processes and activities that may be necessary to fulfil the purpose of that practice.

Definition: Process

A set of interrelated or interacting activities that transform inputs into outputs. A process takes one or more defined inputs and turns them into defined outputs. Processes define the sequence of actions and their dependencies.

There are numerous models to structure activities of the infrastructure and platform management practice. These span several decades and range from waterfall and manual, to iterative and incremental.

This practice is one of the two ITIL practices (the other is the software development and management practice) where activities do not always form processes that could be described as sequences at the level of detail appropriate to this guide. This is because the infrastructure and platform management activities are always performed in the context of one value stream or another, and always in conjunction with other practices. However, activities of this practice can be categorized into three groups:

  • technology planning
  • product development
  • technology operations.
3.2.1 Technology planning activities

Technology planning activities ensure that the organization has a technology management approach and a roadmap for infrastructure development and improvement. These activities ensure the organization’s financial, architectural, and resource plans are aligned. With formalized and repeatable planning and effective integration with other practices, infrastructure and platform solutions will continually support alignment with the strategic goals of the organization. Table 3.1 shows how the activities transform the inputs into outputs.

Table 3.1 Inputs, activities, and outputs of technology planning
Key inputs:
  • Organization’s principles, policies, and vision
  • Organizational strategy
  • Organizational structure
  • Product and service portfolio
  • Customer portfolio
  • Business analysis records and review reports
  • Audit reports

Activities:
  • Analyse the organization’s strategy and architecture
  • Develop and agree the infrastructure and platform management approach
  • Review the infrastructure and platform management approach

Key outputs:
  • Infrastructure and platform management approach and roadmap
  • Improvement initiatives and requests for changes

Figure 3.2 shows a workflow diagram of the process.


Table 3.2 provides an example of the technology planning activities.
Activity: Analyse the organization’s strategy and architecture
Example: IT leaders of the organization analyse the organization’s strategy, architecture road map, and portfolios and define requirements for the infrastructure and platform management approach.

Activity: Develop and agree the infrastructure and platform management approach
Example: Business analysts, architects, product owners, and infrastructure experts agree and communicate an infrastructure and platform approach, including scope, sourcing strategy, methods and techniques, procedures, and responsibilities.

Activity: Review the infrastructure and platform management approach
Example: Based on infrastructure review reports, periodic reviews, and audit reports, product owners and infrastructure experts review the effectiveness of the infrastructure and platform management approach and provide input to the ‘analyse the organization’s strategy and architecture’ activity, and/or initiate required changes.
3.2.2 Product development activities

In many organizations these activities are performed within product development value streams in conjunction with other practices. The infrastructure and platform management practice serves as a source of technical expertise and other resources to support product ideation, design, development, and deployment. In other organizations, infrastructure and platform solutions are developed in a separate value stream and provided as services to product teams and their products. The activities of the infrastructure and platform management practice are similar in these scenarios. In many cases, infrastructure solutions are sourced from external developers; the activities of the practice are then focused on ensuring that the solutions meet the organization’s requirements and constraints.

This group includes the activities outlined in Table 3.3 and transforms the inputs into outputs.

Table 3.3 Inputs, activities, and outputs of product development

Key inputs:
  • Infrastructure and platform management approach
  • Solution requirements
  • Budget and other resources and constraints
  • Sourcing and supplier management policies
  • Sourcing and build policies and guidance
  • Operational standards
  • Success criteria
  • Project structure (schedule, assignment, methods)

Activities:
  • Create a basic solution design
  • Create a detailed solution design
  • Source/develop/configure the components
  • Source/build/configure the solution
  • Support validation and testing
  • Support deployment and release
  • Review solution development and implementation

Key outputs:
  • Basic and detailed design
  • Agreed service level objectives
  • Components and solutions
  • Solution documentation
  • Setup in management tools, including monitoring and ITSM tools
  • Operational run books
  • Reports and scheduled reviews

The focus of technology delivery and engineering is on designing, building, and transitioning infrastructure and platform services. These activities may vary, depending on how the services will be delivered and how the organization applies these steps, as is outlined in Table 3.4.

Table 3.4 Technology delivery and engineering activities
Activity: Create a basic solution design
  • Internal build and sourced: Based on the requirements identified by business analysts and the product owner, infrastructure specialists agree service level objectives for the infrastructure solution and create a basic solution design. The basic design is approved by the product owner.

Activity: Create a detailed solution design
  • Internal build and sourced: Infrastructure specialists and/or site reliability engineers create a detailed solution design, ensuring that the reliability, efficiency, scalability, and other quality characteristics required by the agreed SLOs and the organization’s infrastructure management approach are met. The resulting design includes a recommended sourcing and delivery model for the components and the solution.

Activity: Source/develop/configure the components
  • Internal build: Agreed components are developed and configured by infrastructure specialists according to the design.
  • Sourced: Agreed components are procured and configured by a supplier according to the design; their work is monitored and accepted by infrastructure specialists.

Activity: Source/build/configure the solution
  • Internal build: Agreed solution/system is built/configured by infrastructure specialists according to the design; their work is accepted by the product owner.
  • Sourced: Agreed solution/system is built/configured by a supplier according to the design; their work is monitored and accepted by infrastructure specialists and the product owner.

Activity: Support validation and testing
  • Internal build: Infrastructure specialists participate in the validation and testing of the components and the solution at all stages of the solution development, ensuring effective integration with the service validation and testing practice.
  • Sourced: Infrastructure specialists participate in the validation and testing of the components and the solution at all stages of the solution development, ensuring effective integration with the service validation and testing practice and the supplier management practice.

Activity: Support deployment and release
  • Internal build: Infrastructure specialists participate in the deployment and release of the solution, ensuring effective integration with the respective practices.
  • Sourced: Infrastructure specialists participate in the deployment and release of the solution, ensuring effective integration with the supplier management practice.

Activity: Review solution development and implementation
  • Internal build: Infrastructure specialists, product owners, and application developers review the infrastructure solution development activities and outcomes. The resulting report is used as an input to the technology planning activities and other improvement initiatives.
  • Sourced: Infrastructure specialists, product owners, application developers, and supplier representatives review the infrastructure solution development activities and outcomes. The resulting report is used as an input to the technology planning activities, supplier management improvements, and other improvement initiatives.

Product development activities ensure the delivery of a supportable solution that meets the organization’s needs and agreed SLOs. Even if an external provider provides a solution, steps are taken to ensure it fits into the overall delivery and support model.

3.2.3 Technology operation activities

The technology operations activities are performed after the solution goes into the live environment. These activities include planned maintenance and unplanned support activities. Maintenance focuses on the normal operations of the solution, such as administration and monitoring. Support focuses on addressing events, incidents, alerts, and other areas that are not performing as planned. In an organization that is not functioning well, the unplanned activities typically take most, if not all, of the available resource time. A more mature organization will focus on planned activities that will result in less unplanned work.

This group includes the following activities, and transforms the following inputs into outputs:

Table 3.5 Inputs, activities, and outputs of the technology operation
Key inputs:
  • Solutions and support documentation, such as operational run books
  • Policies and guidelines
  • Monitoring data
  • Queries (incidents, problems, and so on)
  • SLAs

Activities:
  • Manage queues of queries and events
  • Perform scheduled tasks
  • Patch and update the system

Key outputs:
  • Reports
  • Closed tickets and events
  • Scheduled job completion
  • Backup completion
  • Updated solution and support documentation
  • Automation
  • Improvements

Table 3.6 provides example descriptions of the technology operation activities.

Table 3.6 Technology operation activities
Activity: Manage queues of queries and events
Example: Infrastructure management teams and tools process incoming queries and events, ensuring timely and successful resolution of detected incidents, alerts, and other events requiring a response. Logs and reports reflecting this activity are created as agreed in the infrastructure and platform management approach and solution documentation.
Examples of this work include:
  • rolling back a bad software push
  • blocking or rate-limiting unwanted traffic
  • bringing up additional serving capacity
  • using the monitoring systems (for alerting and dashboards)
  • solving incidents
  • analysing problems
  • conducting post-mortems.

Activity: Perform scheduled tasks
Example: Several actions are performed by infrastructure management teams or tools on a scheduled basis, such as daily backups or a data transfer between systems. Logs and reports reflecting this activity are created as agreed in the infrastructure and platform management approach and solution documentation.
Examples of this work include:
  • administering production jobs
  • describing the architecture, various components, and dependencies of the services
  • testing back-up restoration
  • training users
  • reviewing supplier performance
  • reviewing solution performance.

Activity: Patch and update the system
Example: Patches and system updates are released to the environment in a structured manner. Typically, patches are deployed to the lower environments for testing and then deployed to production. Despite this structure, there are exceptions where systems are not patched as part of this scheduled release due to an application incompatibility, business usage of the solution, or issues identified through testing. It is important to track the solutions that are not at current levels. These updates should be completed promptly to maintain overall supportability. Up-to-date solutions reduce the risk of downtime or security breaches.
There are also situations where system updates or patches are installed to resolve an incident and then need to be rolled out to the rest of the organization. Applying patches and updates reactively creates a non-standard environment.
The infrastructure specialist manages these exceptions and identifies a plan to address them. Understanding and addressing these deviations is a vital part of technology management.
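
Tracking solutions that are not at current patch levels, as described for the patching activity above, can be sketched as a simple comparison against an agreed baseline. This is a minimal illustration only: the `System` record, the (year, month) patch-level encoding, and the inventory entries are hypothetical, and a real inventory would normally come from patch deployment or configuration management tooling.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class System:
    name: str
    patch_level: Tuple[int, int]            # hypothetical (year, month) of last patch release applied
    exception_reason: Optional[str] = None  # recorded reason for deferring the patch, if any


def behind_baseline(systems: List[System], baseline: Tuple[int, int]) -> List[System]:
    """Return the systems whose patch level is older than the agreed baseline."""
    return [s for s in systems if s.patch_level < baseline]


def undocumented_exceptions(outdated: List[System]) -> List[System]:
    """Outdated systems with no recorded reason are the first to need a plan."""
    return [s for s in outdated if not s.exception_reason]


inventory = [
    System("web-01", (2024, 5)),
    System("db-01", (2024, 3), exception_reason="application incompatibility"),
    System("file-01", (2024, 2)),
]
outdated = behind_baseline(inventory, baseline=(2024, 5))
needs_plan = undocumented_exceptions(outdated)
```

Here `outdated` lists every deviation from the baseline, while `needs_plan` narrows it to those with no documented exception, which an infrastructure specialist might review first.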

The technology operation activities ensure that solutions are available and functioning as designed, from acceptance into the live environment through to retirement. Technical experts and technical coordinators perform the activities in this process.

4. Organizations and people


4.1 Roles, competencies, and responsibilities

The practice guides do not describe the roles of practice owners or managers that should exist for all practices. They focus instead on specialist roles specific to each practice. The structure and naming of each role may differ from organization to organization, so any roles defined in ITIL should not be treated as mandatory, or even recommended. Remember, roles are not job titles. One person can take on multiple roles and one role can be assigned to multiple people.

Roles are described in the context of processes and activities. Each role is characterized with a competency profile based on the model shown in Table 4.1.

Table 4.1 Competency codes and profiles
Competency code: Competency profile (activities and skills)
  • L: Leader. Decision-making, delegating, overseeing other activities, providing incentives and motivation, and evaluating outcomes
  • A: Administrator. Assigning and prioritizing tasks, record-keeping, ongoing reporting, and initiating basic improvements
  • C: Coordinator/communicator. Coordinating multiple parties, maintaining communication between stakeholders, and running awareness campaigns
  • M: Methods and techniques expert. Designing and implementing work techniques, documenting procedures, consulting on processes, work analysis, and continual improvement
  • T: Technical expert. Providing technical (IT) expertise and conducting expertise-based assignments

Table 4.2 Examples of the roles involved in infrastructure and platform management activities
Technology planning

Activity: Analyse the organization’s strategy and architecture
  • Responsible roles: Architects, business analysts, product owners, infrastructure specialists
  • Competency profile: TC
  • Specific skills: Good knowledge of the organization and its environment, portfolios, products, resources, and customers; understanding of the current infrastructure architecture and architecture roadmap; analytical skills; good knowledge of current and available technology

Activity: Develop and agree the infrastructure and platform management approach
  • Responsible roles: Architects, business analysts, product owners, infrastructure specialists, consultants
  • Competency profile: TLMC
  • Specific skills: Good knowledge of the organization and its environment, portfolios, products, resources, and customers; excellent knowledge of current and available infrastructure and platform solutions; good knowledge of infrastructure and technology services suppliers and market

Activity: Review the infrastructure and platform management approach
  • Responsible roles: Architects, business analysts, product owners, infrastructure specialists, consultants
  • Competency profile: TCA
  • Specific skills: Good knowledge of the organization and its environment, portfolios, products, resources, and customers; understanding of the current infrastructure architecture and architecture roadmap; analytical skills; good knowledge of current and available technology

Product development

Activity: Create a basic solution design
  • Responsible roles: Solution architects, infrastructure specialists, site reliability engineers, product owners
  • Competency profile: TA
  • Specific skills: Understanding of the requirements; good knowledge of the infrastructure and platform management approach; expertise in the available technology

Activity: Create a detailed solution design
  • Responsible roles: Solution architects, infrastructure specialists, site reliability engineers, product owners
  • Competency profile: TA
  • Specific skills: Understanding of the requirements; good knowledge of the infrastructure and platform management approach; expertise in the available technology and services

Activity: Source/develop/configure the components
  • Responsible roles: Infrastructure specialists, site reliability engineers, product owners, suppliers
  • Competency profile: TC
  • Specific skills: Technical expertise; communication and collaboration skills

Activity: Source/build/configure the solution
  • Responsible roles: Infrastructure specialists, site reliability engineers, product owners, suppliers
  • Competency profile: TC
  • Specific skills: Technical expertise; communication and collaboration skills

Activity: Support validation and testing
  • Responsible roles: Infrastructure specialists, site reliability engineers, product owners, suppliers
  • Competency profile: TC
  • Specific skills: Technical expertise; communication and collaboration skills

Activity: Support deployment and release
  • Responsible roles: Infrastructure specialists, site reliability engineers, product owners, suppliers
  • Competency profile: TC
  • Specific skills: Technical expertise; communication and collaboration skills

Activity: Review solution development and implementation
  • Responsible roles: Solution architects, infrastructure specialists, site reliability engineers, product owners
  • Competency profile: TCA
  • Specific skills: Good knowledge of the infrastructure and platform management approach; technical expertise; good knowledge of the organization and its environment, portfolios, products, resources, and customers

Technology operations

Activity: Manage queues of queries and events
  • Responsible roles: Infrastructure specialists, site reliability engineers
  • Competency profile: TA
  • Specific skills: Technical knowledge; understanding of business and customer context; communication and coordination skills

Activity: Perform scheduled tasks
  • Responsible roles: Infrastructure specialists, site reliability engineers
  • Competency profile: TA
  • Specific skills: Technical administration knowledge

Activity: Patch and update the system
  • Responsible roles: Infrastructure specialists, site reliability engineers
  • Competency profile: TA
  • Specific skills: Knowledge of security policies, standards, and requirements; technical knowledge

4.1.1 Infrastructure specialist

The key role for this practice is infrastructure specialist. This is a generic term for roles that can be specified either by technology (for example, network specialist, site reliability engineer, or virtualization specialist) or by the phase in the product lifecycle, such as design, testing, or operations (for example, infrastructure designer/development specialist, testing specialist, or operations administrator).

Those distinctions are defined by the organization’s size and structure, but the general set of competencies is similar and usually includes:

  • technology subject matter expertise
  • good understanding of the organization’s architecture
  • knowledge of the frameworks and techniques adopted by the organization
  • knowledge of the organization’s products and services
  • service mindset
  • good knowledge of the organization’s operating model and value streams.

Examples of other roles which can be involved in infrastructure and platform management activities are listed in Table 4.2, together with the associated competency profiles and specific skills.


4.2 Organizational structures and teams

Infrastructure and platform management specialists often form a dedicated team (or teams). However, in some organizations they are included in product teams and focused on infrastructure solutions supporting respective products. Regardless of the organizational solution, it is important to maintain a shared view and shared responsibility across infrastructure and product teams.

Key message

“Rigid boundaries between “application development” and “production” (sometimes called programmers and operators) are counterproductive. This is especially true if the segregation of responsibilities and classification of ops as a cost centre leads to power imbalances or discrepancies in esteem or pay.
(…) Ideally, both product development and SRE teams should have a holistic view of the stack—the frontend, backend, libraries, storage, kernels, and physical machine—and no team should jealously own single components. It turns out that you can get a lot more done if you “blur the lines” and have SREs instrument JavaScript, or product developers qualify kernels: knowledge of how to make changes and the authority to do so are much more widespread, and incentives to jealously guard any particular function are removed.”
This quote from “The Site Reliability Workbook” by Google refers specifically to SRE teams. However, it is valid for any other approach to infrastructure and platform management.

The infrastructure and platform management practice needs to allow for organizational variations while ensuring some level of consistency across infrastructure teams. The teams may be split by geography, type of technology, or business service. Having an overall structure to manage practice changes and communication is important to keep the overall service functioning in an optimal manner. This may be done with an overall governance group or through representation in an infrastructure committee.


5. Information and technology


5.1 Information exchange

The effectiveness of the infrastructure and platform management practice is based on the quality of the information used. This information includes, but is not limited to:

  • business services and processes
  • customers and users
  • partners and suppliers, including contracts and service levels
  • SLAs
  • architecture and design documentation
  • portfolio and project management plans
  • policies, requirements, and controls
  • change records
  • incident records
  • request records
  • problem records
  • release records
  • financial information
  • application development and testing information
  • system information (versions, baselines, configurations)
  • monitoring and event information
  • IT assets and inventory information.

5.2 Automation and tooling

In most cases, the infrastructure and platform management practice can significantly benefit from automation. Where this is possible and effective, it may involve the solutions outlined in Table 5.1.

Table 5.1 Automation solutions for infrastructure and platform management activities
Technology planning

Process activity: Analyse the organization’s strategy and architecture
  • Means of automation: Communication and collaboration tools; analytical systems; knowledge management tools
  • Key functionality: Collection, processing, and presentation of data from diverse sources
  • Impact on the effectiveness of the practice: High

Process activity: Develop and agree the infrastructure and platform management approach
  • Means of automation: Communication and collaboration tools
  • Key functionality: Collaboration and information sharing
  • Impact on the effectiveness of the practice: Medium

Process activity: Review the infrastructure and platform management approach
  • Means of automation: Communication and collaboration tools; analytical systems; knowledge management tools
  • Key functionality: Collection, processing, and presentation of data from diverse sources; reporting engines; dashboard systems
  • Impact on the effectiveness of the practice: High

Product development

Process activity: Create a basic solution design
  • Means of automation: Workflow tools including task assignment, routing, approvals, tracking, and notifications
  • Key functionality: Ability to assign design tasks and approval for planning activities, including status tracking, notifications, and reporting to ensure actions are on task and the design is approved
  • Impact on the effectiveness of the practice: High

Process activity: Create a detailed solution design
  • Means of automation: Workflow tools including task assignment, routing, approvals, tracking and notifications, contract management with templates, approvals, and review schedules
  • Key functionality: Ability to assign tasks and approval for planning activities, including status tracking, notifications, and reporting to ensure actions are on task
  • Impact on the effectiveness of the practice: High

Process activity: Source/develop/configure the components
  • Means of automation: Automated provisioning, building, and configuring tools
  • Key functionality: Ability to receive approved request and to build a solution with no or limited manual intervention, ensuring consistent and timely delivery
  • Impact on the effectiveness of the practice: High

Process activity: Source/build/configure the solution
  • Means of automation: Automated provisioning, building, and configuring tools
  • Key functionality: Ability to receive approved request and to build a solution with no or limited manual intervention, ensuring consistent and timely delivery
  • Impact on the effectiveness of the practice: High

Process activity: Support validation and testing
  • Means of automation: Automated testing and defect tracking
  • Key functionality: Automated testing, reporting, and logging into the defect management system
  • Impact on the effectiveness of the practice: High

Process activity: Support deployment and release
  • Means of automation: Deployment tools
  • Key functionality: Automated deployment from testing to implementation, including submission of change request
  • Impact on the effectiveness of the practice: High

Process activity: Review solution development and implementation
  • Means of automation: Workflow tools including task assignment, routing, approvals, tracking, and notifications; system health monitoring and reporting tools
  • Key functionality: Dashboards and reports, trend analysis
  • Impact on the effectiveness of the practice: Medium to high

Technology operation

Process activity: Manage queues of queries and events
  • Means of automation: Automated request provisioning, automated resolution, ChatOps, AIOps, workflow tools
  • Key functionality: Ability to close repeat tickets automatically and assign the tickets automatically to the correct group without manual triage steps; task assignment, routing, approvals, tracking, and notifications
  • Impact on the effectiveness of the practice: High

Process activity: Perform scheduled tasks
  • Means of automation: Job scheduling tools and scripts for backup, batch, and other automated tasks; vulnerability tools and report and testing automation for compliance, automated solution recovery, and testing; ITSM report and dashboard automation; automated report consolidation and generation, customer feedback surveys; workflow tools including task assignment, routing, approvals, tracking, and notifications
  • Key functionality: Automation of scheduled tasks, including notification on failures, reducing the potential for missed procedure execution; ability to automatically verify and test solutions for security hardening, recoverability, and controls
  • Impact on the effectiveness of the practice: High

Process activity: Patch and update the system
  • Means of automation: System and security patch deployment and inventory tools, software distribution and inventory tools
  • Key functionality: Ability to automatically deploy and report on installation status for patches and system updates
  • Impact on the effectiveness of the practice: High
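
The automation listed for managing queues of queries and events (closing repeat tickets and assigning tickets to the correct group without manual triage) can be illustrated with a very small rule-based sketch. The group names, keywords, and ticket format below are hypothetical; real ITSM or AIOps tools typically use richer matching than exact-text duplicates and keyword rules.

```python
# Hypothetical routing rules: support group -> keywords that suggest it.
ROUTING_RULES = [
    ("network", ("vpn", "switch", "dns")),
    ("storage", ("disk", "backup", "volume")),
]
DEFAULT_GROUP = "service-desk"


def assign_group(summary: str) -> str:
    """Pick the first group whose keywords appear in the ticket summary."""
    text = summary.lower()
    for group, keywords in ROUTING_RULES:
        if any(keyword in text for keyword in keywords):
            return group
    return DEFAULT_GROUP


def triage(tickets):
    """Close repeats of an already seen summary; route everything else."""
    seen = set()
    decisions = []
    for ticket_id, summary in tickets:
        key = summary.strip().lower()
        if key in seen:
            decisions.append((ticket_id, "closed-as-duplicate"))
        else:
            seen.add(key)
            decisions.append((ticket_id, assign_group(summary)))
    return decisions


decisions = triage([
    (1, "VPN down"),
    (2, "Disk full on db01"),
    (3, "vpn down"),
    (4, "Printer jam"),
])
```

Tickets that match no rule fall back to a default group, so manual triage is still available for anything the rules do not cover.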

6. Partners and suppliers

Very few services are delivered using only an organization’s own resources. Most, if not all, depend on other services, often provided by third parties outside the organization (see section 2.4 of ITIL Foundation: ITIL 4 Edition for a model of a service relationship).

The infrastructure and platform management practice allows for many outsourcing options, from both an activity perspective and a technology perspective. Table 6.1 provides examples of areas that are candidates for outsourcing.

Table 6.1 Infrastructure and platform management sourcing considerations

Activity | Opportunity | Applicability
Provisioning | Delivery of desktops, servers, computer, network, and storage services or other technology services | Outsourcing is most effective when standards are in place. Outsourcing may be selectively used for remote locations.
Support | Restoration and prevention of incidents for in-scope technologies | Support for the entire capability may be outsourced or focused on specific roles. Providers should adhere to standard service desk processes for a consistent customer experience. This works well for remote sites, especially for desktop support.
Administration | Performing routine tasks based on operational procedures and requests | Administrative tasks need to be well documented and sufficient access must be provided.
Operations centre | Outsourcing the operations centre function reduces the need to ensure adequate coverage with internal staff, especially if it is provided at all times. This function can provide monitoring, systems management, job scheduling, or other activities | This reduces internal staffing requirements. This function must be well documented and have adequate access; frequent touchpoints are recommended to understand any open issues or improvement opportunities.
Backup/restore | Provider configures and manages backup jobs and repositories, addresses backup failures, and restores files as needed | Providers may leverage internal backup tools or may include backup solutions and storage as part of the agreement.
Systems management, patching, or other updates | Manage systems to keep up to date for versions, configurations, and patches | Standards and configurations must be well documented, and access provided. Access to management tools is required.
Technology ownership | Technology can be leased through subscription services, reducing the capital required to implement and maintain technology | With cloud offerings, this is a prominent trend in the industry. This allows for service levels and capabilities to be delivered without the overhead of building and supporting technology internally.

With so many opportunities in this space, understanding and managing outsourcing risks is important to ensure that services meet customer expectations.

Some examples of these risks are:

  • loss of flexibility due to constraints of agreement
  • additional unplanned costs if the scope needs to be modified or if consumption exceeds the contractual terms
  • contractual service levels may not align with customer expectations
  • security and policy adherence of providers
  • loss of internal talent as roles move from performing activities to oversight of those activities
  • lack of visibility.
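
The risk of unplanned costs when consumption exceeds the contractual terms can be made concrete with a small calculation. The sketch below assumes a simple agreement with an included allowance and a flat per-unit overage rate; the function names and the figures are hypothetical, and real agreements often use tiers or minimum commitments instead.

```python
def overage_cost(included_units: float, used_units: float, rate_per_unit: float) -> float:
    """Return the charge for consumption above the contracted allowance."""
    excess = max(0.0, used_units - included_units)
    return excess * rate_per_unit


def allowance_used(included_units: float, used_units: float) -> float:
    """Return consumption as a fraction of the contracted allowance."""
    return used_units / included_units


if __name__ == "__main__":
    # Hypothetical month: 1000 units included, 1250 used, 0.5 per extra unit.
    print(overage_cost(1000, 1250, 0.5))
    print(allowance_used(1000, 1250))
```

Tracking the fraction of the allowance used during the period, rather than only the final charge, gives the internal team time to adjust scope before the overage is incurred.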

Although all functions can be outsourced, it is recommended to retain oversight and architecture functions. Oversight ensures providers are delivering to their committed levels and allows insight into potential improvements to the existing agreement. To support and continue delivering services effectively, the internal team must understand how solutions connect across providers. As detailed knowledge of specific technologies moves to the provider, there should be an internal architectural role that understands the design and operations of the infrastructure environment.
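
The oversight function described above amounts to comparing what a provider committed to with what was actually measured. The following is a minimal sketch of such a check. The `Commitment` type, the metric names, and the target values are assumptions made for illustration and do not come from any specific agreement or tool.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Commitment:
    metric: str
    target: float
    higher_is_better: bool = True  # False for metrics such as restore time


def find_breaches(
    commitments: List[Commitment], measured: Dict[str, float]
) -> List[Tuple[str, str]]:
    """Return (metric, reason) pairs for commitments that were missed or unreported."""
    breaches = []
    for c in commitments:
        value = measured.get(c.metric)
        if value is None:
            # Missing data is flagged too, since lack of visibility is itself a risk.
            breaches.append((c.metric, "no data reported"))
            continue
        met = value >= c.target if c.higher_is_better else value <= c.target
        if not met:
            breaches.append((c.metric, f"measured {value}, committed {c.target}"))
    return breaches


if __name__ == "__main__":
    commitments = [
        Commitment("availability_pct", 99.9),
        Commitment("restore_hours", 4.0, higher_is_better=False),
        Commitment("patch_compliance_pct", 95.0),
    ]
    measured = {"availability_pct": 99.95, "restore_hours": 6.0}
    for metric, reason in find_breaches(commitments, measured):
        print(metric, "->", reason)
```

Treating an unreported metric as a finding, rather than silently skipping it, reflects the point that oversight depends on visibility into what the provider delivers.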


7. Important reminder

Most of the content of the practice guides should be taken as a suggestion of areas that an organization might consider when establishing and nurturing their own practices. The practice guides are catalogues of topics that organizations might think about, not a list of answers. When using the content of the ITIL practice guides, organizations should always follow the ITIL guiding principles:

  • focus on value
  • start where you are
  • progress iteratively with feedback
  • collaborate and promote visibility
  • think and work holistically
  • keep it simple and practical
  • optimize and automate.

More information on the guiding principles and their application can be found in section 4.3 of ITIL Foundation: ITIL 4 Edition.


All material and intellectual rights to this site belong to یگانه ارتباطات پیشرو.