We are seeking an experienced Site Reliability Engineer with extensive experience deploying and monitoring Infrastructure as Code (IaC) to lead the development of new cloud-native tools and services.
In this role, you will architect, design, build, and manage physical data structures designed for flexibility, scalability, and resiliency to support the current and future business needs of all BlackLine Accounts Receivable products and initiatives in automated, repeatable ways. Your role encompasses aspects of capacity planning, technical project execution, performance monitoring, site reliability, security, and software engineering.
Tools and services built by Cloud Engineering will provide the cloud-native foundation for future BlackLine products and services. You will also design, build, and maintain processes and components of a data pipeline to support analytics, focusing on data quality and governance, pipeline performance, and best practices for democratized data access. In addition, you will collaborate with and influence partner teams on design and architecture in line with our IaC principles, and recommend improvements.
If you’re an expert in DevOps principles, containers, event-driven technology, traditional data warehousing, ETL, and/or big data pipelines and processing, you’re exactly who we’re seeking. Technical capabilities aside, if you’re a self-starter who’s comfortable with ambiguity, able to think big without overlooking minute details, and who thrives in a fast-paced environment, you’re perfect for our team.
Roles and Responsibilities
- Design and build BlackLine's cloud infrastructure platforms using infrastructure-as-code methods to accelerate software engineering and data science teams' ability to deliver new products and services.
- Design and build cloud tools that democratize access to the cloud in order to accelerate software engineering and data science teams' ability to deliver new products and services.
- Improve the BlackLine SaaS service experience by discovering and highlighting optimization opportunities in existing code or architectural design to address application availability, performance, observability, efficiency, and security challenges.
- Develop tools and systems to automate the identification, analysis, and remediation of application events, infrastructure issues, or requests.
- Ensure compliance with centrally defined security and operational risk standards (e.g., network, firewall, OS, logging, monitoring, availability, resiliency).
- Build and support continuous integration (CI), continuous delivery (CD) and continuous testing activities
- Support non-functional requirements such as serviceability, supportability, logging, monitoring, and alerting.
- Ensure good change management practice is implemented as specified by central standards.
- Provide impact assessments, where requested, for changes proposed to the Microsoft Azure core platform.
- Take part in Release Management activities, including a rota for deployments to Production on Sundays (with time off during the week provided).
- Provide out-of-hours support on a rotational on-call basis as part of our 24x7 coverage.
Years of Experience in Related Field: Minimum 5 Years
Education: Master's Preferred
Technical/Specialized Knowledge, Skills, and Abilities:
- Strong understanding of DevOps principles and of platform-as-code and infrastructure-as-code concepts and techniques, based on Microsoft Azure SaaS and IaaS components.
- Experience using Azure DevOps, Containers, Git, Jenkins, CI/CD and available tools.
- Knowledge and experience in monitoring, alerting, and observability of cloud-native applications and components within Microsoft Azure for performance, stability, and security.
- Knowledge of security and compliance, e.g., IAM and cloud compliance, auditing, and monitoring tools.
- Proficient with Azure DevOps and a modern scripting language (preferably PowerShell or Azure CLI and YAML) for automation of build tasks.
- Experience with deployment methodologies, configuration management tools (Azure DevOps, CI/CD, Chef, Puppet, Ansible, etc.), logging tools (ELK), and monitoring tools (Azure Application Insights, etc.).
- Experience with SQL Managed Instances and other cloud database technologies in Microsoft Azure (nice to have).
- Experience with relational databases, NoSQL databases, and/or big data technologies (nice to have).
- Ability to implement automation tools and frameworks (Azure DevOps, Jenkins, CI/CD pipelines).
- Experience building a range of services on a cloud service provider (ideally Microsoft Azure).
- Able to apply practices such as risk management, clustering, load balancing, disaster recovery, and failover.
- Experience with static code analysis tools, e.g., Veracode, Snyk, SonarQube (nice to have).
- Ability to conduct system tests for security, performance, and availability.
- Good interpersonal and communication skills; ability to build strong relationships with application teams, cross-functional IT, and global/local IT teams.
- A track record of constantly looking for ways to do things better and an excellent understanding of the mechanisms necessary to successfully implement change.
- A record of setting and achieving short-, medium-, and long-term goals that meet the standards of the field.
- Excellent written and spoken communication skills; an ability to communicate with impact, ensuring complex information is articulated in a meaningful way to wide and varied audiences.
- Working knowledge of IP and storage networking, including SDN, Linux, application networking, DNS, SAN, and hybrid technologies.
- Knowledge of networking principles and protocols such as IP subnetting, routing, firewall rules, virtual private clouds, load balancers, cloud DNS, and cloud CDNs such as Cloudflare.
- Motivation to enable and help others within the company to be data-driven.
- 3+ years supporting a critical, revenue-generating SaaS or hosting environment.
- 3+ years' experience working in a strictly change-controlled, 24/7 environment.
- Demonstrable cloud service provider experience (ideally Microsoft Azure), including infrastructure build and configuration of a variety of services such as compute, storage, and SDN (VPC and XPN).
- Experience working with continuous integration (CI), continuous delivery (CD), and continuous testing tools.
- We run in Microsoft Azure Cloud and rely heavily on Azure SQL Managed Instances, Cloud Storage and our internal ETL frameworks to automate tasks. Experience with these technologies is a plus.
- Experience working within an Agile environment
- Automation scripting (using scripting languages such as PowerShell and Azure CLI)
- Server administration (either Linux or Windows)
- Ability to quickly acquire new skills and tools
- Standard office hours
- Participation in rotation for 24/7 support escalations
- Participation in rota for weekend software deployments – time off in lieu provided