BigPanda helps NOC and Operations teams to make faster and smarter decisions when (or before) outages occur. Some of the world's largest enterprises, including retailers, airlines, media companies and technology vendors, rely on us to provide a SaaS platform that is highly available and highly scalable, processing thousands of alerts per second. The number and size of our customers are growing fast, as is the complexity of systems that they monitor.
About the Role:
We are seeking an Operations Engineer to work within our Development teams and help the company maintain and evolve its ever-expanding, highly scalable infrastructure.
Some of the challenges you'll be facing:
- Redesigning and changing our architecture to scale out indefinitely.
- Maintaining our event-driven, high-throughput production environment with automation and code.
- Taking an active role in migrating self-hosted infrastructure to cloud-based managed services.
- Taking an active part in an agile dev team's day-to-day missions and providing robust operational solutions with the team.
What would make you a good fit?
- You're passionate about owning and managing the reliability and scale of mission-critical production environments.
- You're a self-learner and can work independently on missions given the right context.
- You see the bigger picture and think about how architecture, software and infrastructure work together.
- You enjoy engaging and collaborating with developers and understand how quality production code can drive the operational side of the platform.
- At least 2 years of experience maintaining production environments.
- 1-2 years (or more) of proven experience writing code in one of the following: Python, Ruby or Go.
- Strong knowledge of Linux environments.
- Experience with at least one configuration management tool (Ansible, Chef, Puppet, Salt, etc.).
- Experience managing CI/CD with tools like Jenkins, Travis, CircleCI, etc.
- Deep knowledge of managing cloud-based services and infrastructure.
- Hands-on experience with Kubernetes.
Nice to Have:
- Hands-on experience maintaining Node.js or JVM-based applications running with the following: MongoDB, Elasticsearch, Kafka or RabbitMQ.