Site Reliability Engineer (Remote Europe)

Flexible hours

Various work from home options

Dog friendly

Employment type
Full time

Who We Are

Element is the startup that employs the core team behind matrix.org — the leading project for secure, open decentralised communication.

Matrix’s mission is to make messaging as open as email — allowing everyone to choose where their data is hosted, enjoy private conversations thanks to advanced encryption, and ultimately be in control of their own communication.

Matrix powers our flagship messaging apps for the web, iOS & Android, along with Element Matrix Services, our SaaS platform for personal & professional use.

We build things for everyone, and we know we can’t succeed without a diverse team. Our hiring process is designed to give you the best chance to show us what you can do. If we ever fall down on this, please let us know.

About Your Team

Today we are a small team of five engineers working hard at transforming how operations and infrastructure are done within the organisation. We come from various backgrounds and are a remote-first team, with all of us working from different countries across Europe and the UK.

As part of our day-to-day operations, we use or touch on (in no particular order) AWS, UpCloud, Postgres, Grafana, Prometheus, Loki, Elasticsearch, PagerDuty, Python, AWS EKS, Red Hat OpenShift, GitLab, GitHub, Ansible, AWX, Terraform, Keycloak, Linux, Containers, Golang, HAProxy and Nginx, to name a few.

The Team Today

  • We manage both internal and client infrastructure across private and public clouds, in private data centers and on Kubernetes clusters. Translation - we ssh into boxes, apply Ansible, use Terraform, manage Kubernetes clusters, manage configurations and release roll-outs.
  • We react to and resolve various issues within the infrastructure. Translation - we respond to alerts and pages, go on-call, look at Grafana dashboards, isolate and debug production issues, roll out mitigations where we can, etc. We are predominantly responsible for the availability of most services deployed in the organisation.
  • We are responsible for internal IT. Translation - we help onboard new employees and manage things like mail, calendaring access, SSO, etc.
  • We help our clients understand their needs, identify bottlenecks and manage their on-premise Matrix services. Translation - we do a bit of consulting work with our Professional Services team and also manage services on-premise on behalf of our customers.
  • We are working with intent towards our tomorrow. Translation - we dedicate time where we can to automate, modernise or fix our current assets, improve our processes/platforms and build best practices for our engineering teams.

The Team Tomorrow

  • We are cloud native and container first. Translation - our focus is on developing and delivering artefacts and automation predominantly targeting cloud environments. In particular, we are focused on container native environments running on both managed and self-managed Kubernetes clusters at scale.
  • We are focused on developer enablement. Translation - we focus on enhancing developer experiences by improving CI/CD pipelines, sharing cloud native development expertise and codifying our expertise in this area. We provide reliable infrastructure and a platform for developers to build and deploy services to production. Developers are responsible for the day-to-day automation of their services, assisted by the tools and processes provided by us.
  • We are focused on Site Reliability. Translation - we codify the operational tasks, automate the recovery from incidents and we manage cattle not pets. We provide infrastructure, platform and tools for services to run at scale. And we enable automated operations across most if not all our mission critical services.

Requirements

About you

We are presently in the process of working towards our tomorrow. And we want to bring more team members along for this journey. We want to work with collaborative and kind people who do not mind experimenting with the unknown.

What we care about

  • You are kind, empathetic and willing to share your knowledge and experience.
  • You are willing to ask for help and to provide it when you can.
  • You are keen on learning new things and figuring out how to improve the status quo.
  • We try (operative word here) to focus on getting things done right, but this does not mean we get it right the first time, as being pragmatic is important to us.

What are the basics you need

  • We do not need you to have any experience in decentralised communication, nor do we expect you to be experienced and knowledgeable in everything. However, there are a few things you should be familiar with to get your job done.
  • Linux Servers - You can ssh into a machine, update packages, get at logs, figure out why it is misbehaving.
  • Containers - You have built your own containers before, used them in anger, and understand the basics of how they work.
  • Infrastructure automation - You have worked with at least one of Terraform or Ansible. We are also happy if you have worked with similar tools, like Puppet, Chef or Saltstack etc.
  • Public/Private Cloud Providers - You have used at least one of AWS / Azure / GCP. Hopefully, you have used Terraform to automate infrastructure on them.
  • Programming languages - You have written some meaningful code that did some of your automation for you. Preferably in Python or Go. You are able to look at an unknown code base and understand it enough to try debugging it in production.

About The Process

  • Opportunity fit. At this stage you will be talking to the person you will be reporting to and a member of our people team. We will talk a bit about the company, about you, about the role etc. You will get a chance to ask questions and understand better if the role fits what you are looking for. Obviously, we will have a few questions for you as well, but no whiteboards or algorithms involved.
  • Offline coding exercise. We will send you a coding exercise, nothing overly complex. Something that might take someone 1-2 hours to complete. We are looking for your approach, an insight into how you solve a problem and basic smell tests on your coding practices.
  • Interview with the team. In this stage, you will be talking to a couple of members from the team. We will talk about your coding exercise submission and talk about improvements etc. You will get an opportunity to talk to the team about the day-to-day, the organisation, the team, their experiences etc. We will definitely ask some technical questions here. We are looking for insights into how you communicate your ideas and technical solutions. We expect the conversation to be two-way.
  • Architecture. Here, you will be talking to our VP of Engineering and the Founder/CTO. Expect conversations around architecture, building at scale etc. You will get the opportunity to understand more about Element’s history, its future direction etc.

If you have any questions before making an application, reach out to Mohand (@mohandbouhadouf:matrix.org) via https://app.element.io

Benefits

Our general approach is to treat people like adults and acknowledge that by being flexible we create an environment for people to do their best work. For more details, here is our manifesto. That said, here are some specific points that differentiate us.

  • Our projects are almost entirely Free and Open Source Software, with high visibility and a large, enthusiastic community.
  • We fully support remote and flexible work, but also maintain offices in London and Rennes.
  • We strive to create a family-friendly environment. Many of the team have small children, and we look to accommodate that as best we can.
  • People tend to stay with the company for a long time. We take this as a sign that we have a cohesive, supportive culture, that we have engaging, challenging work, and that people can develop their skills and careers here for the long term.
  • Since our technology is relevant to anything that requires real-time comms, the role provides exposure to a wide range of domains from more traditional web and app development down to VoIP and IoT.

Element does not discriminate on the basis of race, sex, colour, religion, age, national origin, marital status, disability, veteran status, genetic information, sexual orientation, gender identity or any other reason prohibited by law in provision of employment opportunities and benefits.

