
AI Infrastructure Lead Architect
Job Description
YOU ARE
As a Lead and Principal Infrastructure Architect, you own end-to-end responsibility for designing optimized compute infrastructure for large-scale AI and machine learning systems, including large-scale distributed training environments.
You are the authority who translates business goals, SLAs, and client standards into infrastructure architectures that perform at scale while being deliberately engineered for cost-efficiency. Drawing on deep experience, you weigh multiple viable solutions for any given problem — across compute, networking, storage, orchestration, and model serving — and make rational, well-justified architectural decisions tailored to each client's situation, constraints, and standards. You architect and optimize the full computational stack for performance, power, cost, and scalability; design and tune large-scale GPU clusters and distributed training systems; and ensure infrastructure meets security, compliance, and regulatory requirements.
As the recognized AI infrastructure expert in at least one hyperscaler cloud (such as AWS, Azure, or Google Cloud), you bring authoritative knowledge of that platform's AI/ML services, accelerators, networking, and cost levers, and apply it to deliver best-in-class solutions. Beyond design, you set technical direction and standards, lead and mentor engineers and architects, partner with clients and stakeholders to shape the infrastructure roadmap, and are ultimately accountable for delivering AI/ML infrastructure that meets business SLAs, controls cost, and scales to enterprise and frontier workloads.
THE WORK
Own the end-to-end architecture and design of optimized compute infrastructure for large-scale AI/ML systems, including large-scale distributed training environments, from concept through delivery.
Develop and evaluate architecture alternatives, weighing trade-offs across compute, networking, storage, orchestration, and model serving to make rational, well-justified decisions tailored to each client's situation and standards.
Lead architecture assessments and reviews of existing and proposed environments, identifying gaps, risks, bottlenecks, and optimization opportunities, and recommending remediation.
Drive architectural decision-making, documenting rationale, trade-offs, and assumptions so decisions are transparent, defensible, and aligned with business SLAs and standards.
Define and maintain the AI infrastructure roadmap, planning capacity, scaling, and technology evolution in step with business and product goals.
Architect and optimize the full computational stack for performance, power, cost, and scalability, ensuring infrastructure meets business SLAs while being deliberately engineered for cost-efficiency.
Design and tune large-scale GPU clusters and distributed training systems, including accelerator selection, interconnect/networking, and storage for high-throughput training workloads.
Serve as the authoritative AI infrastructure expert in at least one hyperscaler cloud (AWS, Azure, or GCP), applying deep knowledge of its AI/ML services, accelerators, networking, and cost levers.
Design deployment, automation, and CI/CD strategies for reliable, repeatable, and scalable releases of AI systems, models, and data pipelines into production.
Establish AI monitoring and observability strategy across InfraOps and MLOps, defining SLAs, SLOs, alerting, and performance/cost tracking, and driving continuous optimization.
Integrate AI/ML systems into enterprise environments, ensuring interoperability, security, compliance, and adherence to regulatory and client standards.
Lead capacity planning and cost modeling, forecasting compute needs and engineering cost-efficiency into the architecture without compromising performance.
Collaborate with clients, stakeholders, and engineering teams to align infrastructure decisions with business outcomes, translating requirements into actionable architecture and standards.
Set technical direction, standards, and best practices, mentoring engineers and architects and leading design and code reviews across the team.
EDUCATION
• Bachelor's Degree in Computer Science, Computer Engineering, or a related Engineering field
BASIC (REQUIRED) QUALIFICATIONS
Solid background in coding, building, monitoring, and troubleshooting applications of AI/ML models, and in selecting and designing the infrastructure for deploying and running them on premises or in the public cloud.
Strong understanding of AI and machine learning as a subject.
Strong understanding of computing infrastructure as a subject; knowledge of AI infrastructure is preferred.
Good proficiency in programming languages such as Python, Java, or C++.
Experience with data pipeline and workflow management tools (e.g., Apache Airflow, Kubeflow).
Strong problem-solving skills and ability to work in a fast-paced environment.
Excellent communication and collaboration skills.
Significant experience in AI/ML infrastructure engineering or related roles on a hyperscaler platform, deploying large-scale solutions.
Proven experience in leading and managing AI projects and teams.
Strong project management skills, with the ability to manage multiple projects simultaneously.
Demonstrated experience in evaluating and selecting AI technologies and frameworks.
Ability to work with cross-functional teams and drive project alignment.
About Accenture
Accenture is a leading global professional services company that helps the world’s leading businesses, governments and other organizations build their digital core, optimize their operations, accelerate revenue growth and enhance citizen services—creating tangible value at speed and scale. We are a talent- and innovation-led company with approximately 791,000 people serving clients in more than 120 countries. Technology is at the core of change today, and we are one of the world’s leaders in helping drive that change, with strong ecosystem relationships. We combine our strength in technology and leadership in cloud, data and AI with unmatched industry experience, functional expertise and global delivery capability. Our broad range of services, solutions and assets across Strategy & Consulting, Technology, Operations, Industry X and Song, together with our culture of shared success and commitment to creating 360° value, enable us to help our clients reinvent and build trusted, lasting relationships. We measure our success by the 360° value we create for our clients, each other, our shareholders, partners and communities.
Visit us at www.accenture.com
Equal Employment Opportunity Statement
We believe that no one should be discriminated against because of their differences. All employment decisions shall be made without regard to age, race, creed, color, religion, sex, national origin, ancestry, disability status, sexual orientation, gender identity or expression, marital status, citizenship status or any other basis as protected by applicable law. Our rich diversity makes us more innovative, more competitive, and more creative, which helps us better serve our clients and our communities.