Microsoft UK • Reading, United Kingdom

Principal AI Operations Lead

Employment type:  Full time
Transparency ranking: 8.4/10

Job Description

Overview
Are you a customer-obsessed, AI-curious problem-solver who thrives in an inclusive, collaborative global team? The Engineering Operations (Eng Ops) team’s mission is to transform Microsoft Cloud customers into fans. Through our deep engineering engagements with customers and teams across Microsoft, we analyze and amplify customer needs and drive the vision to improve Cloud quality, security, and reliability. Our culture of growth mindset and empowerment is central to who we are and how we work.

Microsoft's mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond. In alignment with our Microsoft values, we are committed to cultivating an inclusive work environment for all employees to positively impact our culture every day.

This is a critical role that will co-design and lead the execution of the Artificial Intelligence (AI) Operations strategy in partnership with the Senior Engineering Operations Director (to whom this role reports) for a critical, very high-visibility UK government customer. The candidate will also need to seek input from key parties, namely internal Operations teams, the Site Reliability Engineering (SRE) team, the program team, customer stakeholders, and external partners that are part of the overall support ecosystem for the customer.

In partnership with the Operations and SRE teams and partners in particular, the candidate will focus on delivering operational efficiencies through the use of AI, Machine Learning (ML), and internal Microsoft AI platforms and tools. Key areas for advancement include improved operational automation for incidents, predictive analysis, Large Language Model (LLM) enhancement and/or creation to accelerate triage and escalation for operational service incidents, and guided responses using historical incidents and telemetry. The candidate should also be laser-focused on driving the team to upskill and mature in the 'Agentic' space and on driving progress towards incident prevention. This is a unique 'AI Operations' role for a very exciting customer.

Responsibilities

  • Design and implement the AI Operations strategy, and craft operational and user-facing features powered by Generative AI, taking full ownership of their development from inception to delivery and ensuring exceptional quality and seamless implementation. You’ll be working closely with some of the world’s most recognisable brands, helping them catapult into the AI transformation era.
  • Because of the highly complex system architecture, there will be multiple opportunities to make both short- and long-term efficiency and optimisation gains through the use of AI/ML and Large Language Models (LLMs). Clear test and deployment plans will be required, along with the ability to showcase the wins and articulate the benefits with respect to protecting critical KPIs.
  • Upskill the Operations and SRE teams and enable them to be a part of, and contribute towards, development of the agents and execution of the AI Operations strategy.
  • Ideal candidates will have a balanced combination of theoretical AI knowledge and practical software development experience, enabling them to design, develop, and deploy sophisticated software systems at scale, with an experimentation and iterative mindset.
  • Candidates for this role should also demonstrate customer passion, excellent teamwork and relationship management skills, a bias for action, and an ability to work effectively across organisational boundaries.


Qualifications

Required Qualifications

  • Bachelor's Degree in Computer Science or related technical field AND significant years of technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.
  • Competence with DevOps practices, including CI/CD pipelines, containerisation, and infrastructure-as-code.
  • Solid understanding and successful demonstration of system security, scalability, reliability, and maintainability.
  • Practical Experience with AI/LLMs: Experience designing and implementing ML/LLM-based solutions in production environments.
  • Experience leveraging generative AI technologies to develop innovative and user-focused product features.

Preferred Qualifications

  • Master's Degree in Computer Science or related technical field AND significant years of technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR Bachelor's Degree in Computer Science or related technical field AND significant years of technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.
  • Capable of optimising, prompting, and fine-tuning AI-based solutions for performance, accuracy, and scalability.
  • Experience coaching and growing engineers within the team.

Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter. #EngOps #EngOpsACES

This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.



Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.

Company benefits

Wellbeing allowance
Health insurance
Dental coverage
Gym membership
Mental health platform access
Buy or sell annual leave
Shared parental leave
Charity donation scheme
Employee assistance programme
Employee discounts
Volunteer days – 3 days a year
Fertility treatment leave
Open to compressed hours
Open to job sharing
Fertility benefits
Enhanced sick pay
Enhanced sick days
Compassionate leave
Travel insurance
20 days annual leave + bank holidays
Enhanced maternity leave – 26 weeks paid
Enhanced paternity leave – 6 weeks paid
Adoption leave – 24 weeks paid
Childcare credits
Carer’s leave – 4 weeks paid
Cycle to work scheme
Faith rooms
Annual bonus
Annual pay rises
Company car
Hackathons
Open to part-time employees
Pregnancy loss leave
Life insurance
Equity packages
Financial coaching
Relocation packages
Sabbaticals
Enhanced pension match/contribution
Family health insurance
LinkedIn Learning licence
In-house training
Personal development days
Pregnancy support

Working at Microsoft UK

Company employees: 228,000 globally

Gender diversity (m:f): 67:33

Hiring in countries

Germany

Netherlands

Spain

United Kingdom

Awards & Accreditations

Family Friendly (Flexa awards 2025)
Career Progression (Flexa awards 2025)
Most flexible companies (Flexa100 2024)
