DevOps
DevSecOps = DevOps + Security
Good Read:
- Is DevOps a Title?
- What is ‘Site Reliability Engineering’?
- SRE Book by Google
- The devops transformation
- https://medium.com/cracking-the-data-science-interview/how-operating-systems-work-10-concepts-you-should-know-as-a-developer-8d63bb38331f
DevOps Overview:
- Code: code development and review, source code management tools, code merging
- Build: continuous integration tools, build status
- Test: continuous testing tools that provide feedback on business risks
- Package: artifact repository, application pre-deployment staging
- Release: change management, release approvals, release automation
- Configure: infrastructure configuration and management, Infrastructure as Code tools
- Monitor: application performance monitoring, end-user experience
DevOps Movement Core Values -> CAMS:
- Culture
- Automation
- Measurement
- Sharing
Deployment/Release Strategy
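- Canary Release: roll out a new version to a small subset of users or servers first, then widen the rollout in stages
- Blue-Green Deployment: reduces downtime and risk by running two identical production environments called Blue and Green, with only one of them serving live traffic at a time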
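A minimal sketch of the two strategies above, assuming a router that decides per request where traffic goes. The names `pick_backend`, `switch_live` and the `canary_weight` parameter are hypothetical and only illustrate the idea; real setups usually do this in a load balancer or service mesh.

```python
import random


def pick_backend(canary_weight: float, rng: random.Random) -> str:
    """Canary release: send roughly `canary_weight` of requests to the new version."""
    # rng.random() is in [0.0, 1.0), so weight 0.0 never picks the canary
    # and weight 1.0 always does.
    return "canary" if rng.random() < canary_weight else "stable"


def switch_live(live: str) -> str:
    """Blue-green deployment: flip live traffic to the other, idle environment."""
    return "green" if live == "blue" else "blue"


if __name__ == "__main__":
    rng = random.Random(42)
    sample = [pick_backend(0.1, rng) for _ in range(1000)]
    print("canary share:", sample.count("canary") / len(sample))
    print("live after switch:", switch_live("blue"))
```

Raising `canary_weight` step by step (for example 0.01, 0.1, 0.5, 1.0) corresponds to the staged rollout; rolling back is setting it to 0.0, or switching live back to the previous colour.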
Infrastructure as Code
CI/CD
Virtualization
- Docker: containerization
- VirtualBox: virtualization/hypervisor
- Vagrant: create and configure portable development environment
Automation
- Rundeck: Self-Service Operations Console
- SonarQube: Continuous Code Quality
- SumoLogic: machine data analytics platform
- Scalyr: log search and management
- Stackdriver: Monitoring and management for services, containers, applications, and infrastructure
- Prometheus: metrics monitoring and alerting
Log Journal
IAM
Ref:
- https://github.com/kdeldycke/awesome-iam
- https://wso2.com/whitepapers/identity-architect-ground-rules-ten-iam-design-principles/
- https://github.com/ory/oathkeeper
Access Control Model:
- Attribute-Based Access Control
- Role-Based Access Control
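To make the difference between the two models concrete, here is a small sketch. The roles, attributes and the policy itself are made up for illustration: RBAC decides from the role alone, while ABAC evaluates attributes of the subject and the resource.

```python
# Hypothetical role table: role name -> allowed actions.
ROLE_PERMISSIONS = {
    "viewer": {"read"},
    "editor": {"read", "write"},
}


def rbac_allows(role: str, action: str) -> bool:
    """Role-Based Access Control: the decision depends only on the role."""
    return action in ROLE_PERMISSIONS.get(role, set())


def abac_allows(subject: dict, resource: dict, action: str) -> bool:
    """Attribute-Based Access Control: the decision depends on attributes.

    Assumed example policy: anyone in the resource's department may read,
    only the owner may write.
    """
    if action == "read":
        dept = subject.get("department")
        return dept is not None and dept == resource.get("department")
    if action == "write":
        user_id = subject.get("id")
        return user_id is not None and user_id == resource.get("owner_id")
    return False


if __name__ == "__main__":
    alice = {"id": "alice", "department": "ops"}
    runbook = {"owner_id": "bob", "department": "ops"}
    print(rbac_allows("viewer", "write"))         # role has no write permission
    print(abac_allows(alice, runbook, "read"))    # same department
    print(abac_allows(alice, runbook, "write"))   # not the owner
```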
Observability
APM
= Application Performance Monitoring
APM Vendor:
- Logging: actionable logs
- Instrumentation: meaningful numbers/metrics
RED Method (for request-driven services):
- (Request) Rate: the number of requests per second your services are serving.
- (Request) Errors: the number of failed requests per second.
- (Request) Duration: the distribution of the amount of time each request takes.
USE Method (for resources):
- Utilization: the average time that the resource was busy servicing work, as a percent over a time interval, e.g. “one disk is running at 90% utilization”.
- Saturation: the degree to which the resource has extra work which it can’t service, often queued, measured as a queue length, e.g. “the CPUs have an average run queue length of four”.
- Errors: the count of error events.
- Resource: all physical server functional components (CPUs, disks, buses, …)
Four Golden Signals:
- Latency: the time it takes to service a request.
- Traffic: a measure of how much demand is being placed on the system.
- Errors: the rate of failed requests.
- Saturation: a measure of how “full” a service is, often indicated by rising latency.
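As a rough illustration of the metrics listed above, the sketch below derives request rate, error rate and a duration summary from a batch of request records, plus utilization as a percentage of a time window. The `Request` record and the function names are assumptions for this example; in practice a metrics library or APM agent collects these.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Request:
    duration_s: float  # how long the request took, in seconds
    ok: bool           # False if the request failed


def request_metrics(requests: list, window_s: float) -> dict:
    """Rate, errors and duration for requests observed during `window_s` seconds."""
    durations = [r.duration_s for r in requests]
    failed = sum(1 for r in requests if not r.ok)
    return {
        "rate_per_s": len(requests) / window_s,
        "errors_per_s": failed / window_s,
        "median_duration_s": median(durations) if durations else None,
        "max_duration_s": max(durations) if durations else None,
    }


def utilization_percent(busy_s: float, window_s: float) -> float:
    """Share of the window the resource spent servicing work, as a percent."""
    return 100.0 * busy_s / window_s


if __name__ == "__main__":
    batch = [Request(0.1, True), Request(0.2, True),
             Request(0.3, False), Request(0.4, True)]
    print(request_metrics(batch, window_s=2.0))
    print(utilization_percent(busy_s=54, window_s=60))
```

A real system would normally report duration as a histogram or percentiles rather than only a median and a maximum, since averages hide slow outliers.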
ELK Stack
(now called Elastic Stack)
- Elasticsearch: data storage and search
- Logstash: log ingestion and processing pipeline
- Kibana: visualization
- Beats: lightweight data shippers
TIG Stack
- Telegraf
- InfluxDB
- Grafana
- Kapacitor: alerting and stream processing (part of the related TICK Stack)
rsyslog: forwards log messages over an IP network
Tracing:
Twelve Factor App
- Codebase: One codebase tracked in revision control, many deploys
- Dependencies: Explicitly declare and isolate dependencies
- Config: Store config in the environment
- Backing services: Treat backing services as attached resources
- Build, release, run: Strictly separate build and run stages
- Processes: Execute the app as one or more stateless processes
- Port binding: Export services via port binding
- Concurrency: Scale out via the process model
- Disposability: Maximize robustness with fast startup and graceful shutdown
- Dev/prod parity: Keep development, staging, and production as similar as possible
- Logs: Treat logs as event streams
- Admin processes: Run admin/management tasks as one-off processes
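The "store config in the environment" factor can be sketched as follows. The variable names `DATABASE_URL`, `PORT` and `DEBUG` and their defaults are assumptions chosen for the example, not something the Twelve-Factor text prescribes.

```python
import os


def load_config(env=None) -> dict:
    """Read settings from environment variables instead of a checked-in file."""
    # Accept an explicit mapping so the function is easy to exercise in tests;
    # fall back to the real process environment otherwise.
    env = os.environ if env is None else env
    return {
        "database_url": env.get("DATABASE_URL", "sqlite:///local.db"),
        "port": int(env.get("PORT", "8000")),
        "debug": env.get("DEBUG", "false").lower() == "true",
    }


if __name__ == "__main__":
    print(load_config())
```

The same build can then run in development, staging and production with only the environment differing, which also supports the dev/prod parity factor.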