Upwork is hiring Contract: Senior Site Reliability Engineer 2022 in Bangladesh
Job Responsibilities:
- Incident Response. Candidates will help us improve our monitoring tools and automation, and with them our site reliability, by identifying weaknesses and working with our development team to address those gaps.
- Help us manage the process of handling any type of incident impacting upwork.com, including coordination, communication, debugging, and remediation.
- This role will participate in our on-call rotation during your daytime hours and on some weekends (about once every 3 weeks).
- Project-oriented work (Chaos Engineering, Observability, Auto-Remediation, Resilience)
- General SRE ticket work with a particular focus on assisting our developers. This includes supporting and monitoring new and existing services, platforms, and application stacks; automation scripting; writing Chef code; using AWS services and tools; managing nginx load balancers; managing DNS; configuring our CDN; and assisting in debugging code in collaboration with developers.
Upwork is the world’s work marketplace. We serve everyone from one-person startups to over 30% of the Fortune 100 with a powerful, trust-driven platform that enables companies and talent to work together in new ways that unlock their potential. Our talent community earned over $3.3 billion on Upwork in 2021 across more than 10,000 skills in categories including website & app development, creative & design, customer support, finance & accounting, consulting, and operations.
Location:
Countrywide
Benefits
Check the official link
Eligibilities
- Proven track record and years of experience in a Reliability Engineering or DevOps role
- Passion for designing and building reliable systems
- Hands-on experience in at least one programming language (Go, Python, Ruby, etc.)
- Deep systems and infrastructure knowledge
- Automation advocate: you truly believe in removing operational load with software
- Excellent troubleshooting and problem-solving skills
- Experience with scale testing, disaster recovery, and capacity planning
- Familiarity with microservices architecture and container orchestration with Kubernetes
- Expert in Linux System Administration
- Experience in AWS (EC2, S3, ECS, VPC, Elasticsearch, Lambda), Cloudflare, PagerDuty
- Grafana/Prometheus/Atlas/Icinga monitoring tools
- Jenkins, Chef, Git/Bitbucket, Apache/NGINX
- Excellent verbal and written communication skills (English)
- Ability to size up a situation, assess the effectiveness of various tactics and strategies, and make rapid decisions on appropriate courses of action
- An out-of-the-box, critical thinker: you don’t just understand present challenges but also know what to plan and do to improve in the future
- Available to work in the PST time zone
- Bonus: You have done reliability engineering through “Chaos” and have shown good results (you have a detailed story to tell)