The infrastructure team enables Chainlink development and maintains services that support the health of the most widely adopted oracle network in the world. As a Site Reliability Engineer, you will help us solve some of the unique challenges of blockchain oracle architecture and be primarily responsible for the off-chain part of the Chainlink ecosystem.
We are distributed across time zones and continents, and we embrace remote work. In the infrastructure team, we follow the infrastructure-as-code approach and practice GitOps. Our on-call rotation uses the follow-the-sun pattern: you will be on call some of the time, but there should not be any overnight shifts.
We all have different backgrounds and are determined to help you succeed no matter where you are or who you are. If you think you would do a great job at Chainlink, we are looking forward to speaking with you, even if you don’t match 100% of the job requirements: those describe people we’ve usually had a great time working with, but they’re not a tick-box exercise.
- Support monitoring services that watch over the entire Chainlink network.
- Deploy and maintain various externally facing services like reference Chainlink nodes used by developers and customers (including critical services such as Chainlink VRF).
- Improve the reliability and observability of our internal infrastructure.
- Provide our engineers with a reliable release pipeline and empower them to release and deploy Chainlink and adjacent tools extremely quickly.
- 5+ years of relevant professional experience. You have a software engineering background or an operations background and have worked as an SRE (or in a very close position) before.
- Experience with system architecture. You can create a design document for a cross-region load-balancing app with five microservices, a PostgreSQL cluster, a caching layer, and a Kafka queue—and then implement it on AWS.
- Experience with CI/CD pipelines. You can troubleshoot an existing pipeline or build your own, and you’ve probably worked on both software delivery and cloud-based services deployment.
- Experience with distributed systems and container orchestration. You have built or maintained complex Kubernetes clusters before.
- Ability to read and write code. You can understand precisely why a recent code change led to degraded performance; you can write scripts and tools to automate routine tasks and eliminate toil.
- Strong communication skills. You can give and receive constructive feedback, and you do not shy away from planning meetings and code reviews.
- Professional experience with Golang, TypeScript, or both.
- Excitement for blockchain, Web 3.0, and similar decentralized technologies.
- Experience running a blockchain full node as an operator is a big plus.
- Experience with Chainlink as a developer or a node operator is a big plus.
- Comfort working with network protocols, proxies, and load balancers.
- Experience with information security and DevSecOps.
- Experience working remotely in a distributed team.
- For this particular opening, we are giving slight preference to candidates who live in the UTC to UTC+8 range because of our on-call schedule.
Some of the tools and services we use daily or almost daily are:
AWS; Terraform/Terragrunt; Kubernetes, Calico and ArgoCD; Prometheus and Grafana; GitHub Actions; Packer
We expect you to be comfortable with most of those tools and very proficient in several of them.