Linux, Jenkins, AWS, SRE, Prometheus, Docker, Python, Ansible, Git, Kubernetes, Terraform, OpenStack, SQL, NoSQL, Azure, GCP, DNS, Elastic, Network, Virtualization. DevOps Interview Questions
Updated Jan 25, 2023 - Python
Site reliability engineering (SRE) is a set of principles and practices that incorporates aspects of software engineering and applies them to infrastructure and operations problems. The main goals are to create scalable and highly reliable software systems. Site reliability engineering is closely related to DevOps, a set of practices that combine software development and IT operations, and SRE has also been described as a specific implementation of DevOps.
A curated list of Site Reliability and Production Engineering resources.
A curated collection of publicly available resources on how technology and tech-savvy organizations around the world practice Site Reliability Engineering (SRE)
Compilation of public failure/horror stories related to Kubernetes
At LinkedIn, we use this curriculum to onboard our entry-level talent into the SRE role.
Terraform Pull Request Automation
StackStorm (aka "IFTTT for Ops") is event-driven automation for auto-remediation, incident responses, troubleshooting, deployments, and more for DevOps and SREs. Includes rules engine, workflow, 160 integration packs with 6000+ actions (see https://exchange.stackstorm.org) and ChatOps. Installer at https://docs.stackstorm.com/install/index.html
Enable Self-Service Operations: Give specific users access to your existing tools, services, and scripts
Site Reliability Engineer Interview Preparation Guide
DevOps Roadmap for 2022, with learning resources
Chaos Engineering Toolkit & Orchestration for Developers
[Moved to cloudprober/cloudprober] Active monitoring software to detect failures before your customers do.
Cloud Native DataOps & AIOps Platform | Cloud-native data-intelligence operations platform
A collection of postmortem templates
Learning Shell, Python, Golang, System, Network
Web UI for Jaeger
DevOps by Example
A checklist for anyone practicing Site Reliability Engineering