Conftest is a tool that helps you write tests against structured configuration data. It relies on Rego, a query language that ships with many ready-to-use built-in functions. With it, you can write tests against the configuration types below:
YAML/JSON
INI
TOML
HOCON
HCL/HCL2
CUE
Dockerfile
EDN
XML
When it comes to Conftest's pros and cons, it has some unique features that other testing tools lack.
Pros: You can:
write more declarative tests (policies) rather than simple assertions.
write tests against many kinds of configuration formats.
use the --combine flag to merge different files into one context so their values can be referenced globally.
use the parse command to see how your inputs are parsed.
combine different input types in one test run and apply a combined policy against them.
pull/push policies from different kinds of sources such as S3, a Docker registry, a GitHub file, etc.
find real-world examples in the examples/ folder (see the command sketch after this list for typical usage).
Cons:
Learning Rego can be a bit time-consuming.
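To make the workflow concrete, here is a hedged sketch of typical Conftest commands; the file names and the remote URL are placeholders, and the exact pull URL format depends on the source you use:

# Test a manifest against the Rego policies in the local policy/ directory (Conftest's default location).
conftest test deployment.yaml

# See how Conftest parses a given input.
conftest parse deployment.yaml

# Combine several files into one context so a policy can reason across them.
conftest test --combine deployment.yaml service.yaml

# Pull policies from a remote source, e.g. a Docker/OCI registry (URL is a placeholder).
conftest pull registry.example.com/policies:latest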
Finally, I encourage folks to look at both Conftest's source code and the Rego language. It's a simple, single-threaded command-line tool. I recommend integrating it into your organization; PRs are welcome as well.
In this article, I'm going to show you how we transformed our ETL processes into Spark jobs running as Kubernetes pods.
Before that, we preferred custom Python code for our ETLs. The problem with that approach was the need for a distributed key-value store: when we picked a solution like Redis, it created too much internal I/O between the slave Docker containers and Redis, whereas the performance with Spark is much better. In addition, the master created a number of slaves and managed their containers. Sometimes the docker-py library failed to communicate with the Docker engine, so the master couldn't delete the slave or Redis containers, which caused idempotency problems. You also had to distribute the slave containers across your Docker cluster yourself, which meant putting too many cross-functional requirements next to your business code.
We inspected the Spark documentation for Kubernetes because we were already using Kubernetes in our production environment. We use version 2.3.3 of Spark on Kubernetes; you can have a look at https://spark.apache.org/docs/2.3.3/running-on-kubernetes.html. Even though the Spark documentation says the feature is still experimental, we started to run Spark jobs on our Kubernetes cluster.
This feature allows us to run Spark across our cluster:
Easy to use.
Secure, because you have to create a dedicated service account for the Spark driver and executors.
Enough parameters for Kubernetes (node selector for placement, core limits, number of executors, etc.)
We bundled the spark-submit scripts together with our artifact JAR. After this step, the Docker container can make a request to the Kubernetes master and start the driver pod, and the driver pod creates the executors from the same image. This allows us to bundle everything in one image. If the code changes, CI creates a new bundle and publishes it to the registry. The image below describes the architecture.
First of all, you have to create a base image. Download "spark-2.3.3-bin-hadoop2.7" from https://spark.apache.org/downloads.html, extract it, and build an image from it:
./bin/docker-image-tool.sh -r internal-registry-url.com:5000 -t base build
./bin/docker-image-tool.sh -r internal-registry-url.com:5000 -t base push
We created a multi-stage Dockerfile like this:
FROM hseeberger/scala-sbt:11.0.1_2.12.7_1.2.6 AS build-env
COPY . /app
WORKDIR /app
ENV SPARK_APPLICATION_MAIN_CLASS Main
RUN sbt update && \
sbt clean assembly
RUN SPARK_APPLICATION_JAR_LOCATION=`find /app/target -iname '*-assembly-*.jar' | head -n1` && \
export SPARK_APPLICATION_JAR_LOCATION && \
mkdir /publish && \
cp -R ${SPARK_APPLICATION_JAR_LOCATION} /publish/ && \
ls -la ${SPARK_APPLICATION_JAR_LOCATION} && \
ls -la /publish
FROM internal-registry-url.com:5000/spark:base
RUN apk add --no-cache tzdata
ENV TZ=Europe/Istanbul
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone
COPY --from=build-env /publish/* /opt/spark/examples/jars/
COPY --from=build-env /app/secrets/* /opt/spark/secrets/
COPY --from=build-env /app/run.sh /opt/spark/
WORKDIR /opt/spark
CMD [ "/opt/spark/run.sh" ]
Notice that you have to place the secrets in the secrets/ folder in order to create the pods from a single image. After the driver pod is created, it uses the internal executor pod creation scripts that are also shipped in the spark:base image, as described in the Spark-on-Kubernetes documentation.
We created the pipelines as build-push -> run-on-qa-cluster -> run-on-preprod-cluster -> run-on-prod-cluster
The run scripts in the pipeline pass the parameters to run.sh, and we run it like this:
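The exact command isn't reproduced here; a minimal sketch of what run.sh can look like on Spark 2.3.3, matching the limits described below (the API server address, service account, job name, image and JAR names are placeholders/assumptions):

#!/bin/bash
# Sketch of run.sh: submit the bundled assembly JAR to the Kubernetes API server.
# ${KUBERNETES_MASTER}, the job name and the JAR name are placeholders; the main class matches the Dockerfile above.
/opt/spark/bin/spark-submit \
  --master k8s://https://${KUBERNETES_MASTER}:6443 \
  --deploy-mode cluster \
  --name etl-job \
  --class Main \
  --conf spark.executor.instances=5 \
  --conf spark.kubernetes.driver.limit.cores=2 \
  --conf spark.kubernetes.executor.limit.cores=2 \
  --conf spark.kubernetes.container.image=internal-registry-url.com:5000/spark:base \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
  local:///opt/spark/examples/jars/etl-assembly-1.0.jar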
This command creates one driver pod with a core limit of 2. After that, 5 executor pods are created from spark:base, each of which also has a core limit of 2.
The HPA (Horizontal Pod Autoscaler) determines whether we need more pods and scales the number of Pods. You can scale on CPU and memory metrics using the Kubernetes Metrics Server.
In addition, Kubernetes 1.6 added support for using custom metrics in the Horizontal Pod Autoscaler. With custom metrics, you can attach InfluxDB, Prometheus, or another third-party time-series database.
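For a quick sanity check (assuming a custom metrics adapter such as the Prometheus adapter is installed and jq is available), you can list the metrics exposed to the HPA and watch the autoscaler react:

# List the metrics exposed through the custom metrics API.
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq .

# Watch the HPA status change as load changes.
kubectl get hpa -w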
Here are some instances of our Grafana dashboards, which are connected to Prometheus and show the autoscaling in Kubernetes. You can inspect the pod memory there, and the newly created pods can be seen as well. At 07:56 and 08:00 people started to use the Search API more, and after the scaling process the metrics returned to normal.
It's been a long time since I wrote my last post. In this period, I mostly dug into Kubernetes. Kubernetes is a deployment automation system that manages containers in distributed environments. It simplifies common tasks like deployment, scaling, configuration, versioning, log management, and a lot more.
In this article, you will see how a dotnetcore app can be put into Kubernetes using blue-green deployment and pipeline-as-code. In this case, I used GoCD and its YAML plugin: https://github.com/tomzo/gocd-yaml-config-plugin
First of all, you have to dockerise your dotnetcore app. Here is an example snippet:
FROM microsoft/dotnet:2.0.5-sdk-2.1.4 AS build-env
WORKDIR /workdir
COPY . /workdir
RUN dotnet restore ./WebApp.sln
RUN dotnet test ./src/tests/WebApp.IntegrationTests
RUN dotnet test ./src/tests/WebApp.UnitTests
RUN dotnet publish ./src/WebApp/WebApp.csproj -c Release -o /publish
FROM microsoft/dotnet:2.0.5-runtime
WORKDIR /app
COPY --from=build-env ./publish .
EXPOSE 3333/tcp
CMD ["dotnet", "WebApp.dll", "--server.urls", "http://*:3333"]
After that, put a "kubernetes" folder in your project's root. The folder structure can look like this:
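The original folder listing isn't reproduced here; based on the files referenced later in this article, it presumably looks something like this (the names are assumptions):

kubernetes/
├── deployment.yaml         # Deployment + HorizontalPodAutoscaler (shown below)
├── service.yaml            # Service with the blue/green selector
└── switch_environment.sh   # blue-green switch script

The deployment manifest, with the autoscaler merged into the same file, can look like this: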
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: webapp-${ENV}
spec:
  replicas: ${PODS}
  template:
    metadata:
      labels:
        app: webapp
        ENV: ${ENV}
    spec:
      containers:
        - name: webapp
          image: yourdockerregistry:5000/webapp:${IMAGE_TAG}
          resources:
            requests:
              cpu: "750m"
          ports:
            - containerPort: 3333
          readinessProbe:
            tcpSocket:
              port: 3333
            initialDelaySeconds: 15
            periodSeconds: 5
          livenessProbe:
            httpGet:
              path: /status
              port: 3333
            initialDelaySeconds: 15
            periodSeconds: 10
      terminationGracePeriodSeconds: 30
---
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: webapp-${ENV}
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta1
    kind: Deployment
    name: webapp-${ENV}
  minReplicas: 10
  maxReplicas: 25
  metrics:
    - type: Pods
      pods:
        metricName: cpu_usage # metric coming from Prometheus. List the available metrics with: kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq .
        targetAverageValue: 0.6 # if average pod CPU usage goes above 0.6 (60%), more Pods will be created
In this snippet, you will see some environment variables used for parametric values like the image tag, the deployment environment, the blue-green environment, etc. You could also use Helm for rolling deployments and version bump-ups, but I will use something much simpler: envsubst (see the sketch below).
The other mechanism is horizontal scaling in the cluster. You can merge the deployment and the scaling in one YAML file. In this instance, I used the Kubernetes custom metrics API.
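A minimal sketch of how the pipeline can render and apply the manifest with envsubst (the variable values here are placeholders supplied by the pipeline):

# Substitute the ${ENV}, ${PODS} and ${IMAGE_TAG} placeholders and apply the result.
export ENV=blue PODS=10 IMAGE_TAG=1.0.42
envsubst < kubernetes/deployment.yaml | kubectl apply -f -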
We will use Kubernetes selectors in order to get a blue-green switch for deployments. The selector picks the matching pods and binds them to the service. I used a NodePort service because the services are bound to an external load balancer.
You can bind it like this: http://servicedns.com → AGENTIP1:30333, AGENTIP2:30333, AGENTIP3:30333.
You don't have to give each agent's IP to the load balancer, because Kubernetes also does internal load balancing. (Listing every agent IP isn't a great approach anyway; letting Kubernetes manage the load balancing is simply better.)
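The service.yaml used by the switch script below isn't reproduced in the article; a minimal sketch of what it can look like, assuming NodePort 30333 and the same app/ENV labels as the deployment above:

# Sketch of service.yaml: the ENV selector is what the switch script flips between blue and green.
apiVersion: v1
kind: Service
metadata:
  name: webapp
spec:
  type: NodePort
  selector:
    app: webapp
    ENV: ${ENV}
  ports:
    - port: 3333
      targetPort: 3333
      nodePort: 30333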
Your "switch_environment.sh" file can be like this.
#!/bin/bash
# Fail early if no service name was supplied.
if [ -z "$1" ]; then
  echo "No argument supplied"
  exit 1
fi

if ! kubectl get svc "$1"; then
  echo "No service found : ${1}"
  exit 1
fi

# Extract the currently selected environment from the service's selector.
ENVIRONMENT=$(kubectl describe svc "$1" | grep ENV | awk '{print $2}' | cut -d"," -f1 | cut -d"=" -f2)

if [ "$ENVIRONMENT" == "blue" ]; then
  ENV=green envsubst < service.yaml | kubectl apply -f -
  echo "Switched to green"
else
  ENV=blue envsubst < service.yaml | kubectl apply -f -
  echo "Switched to blue"
fi
Finally, bind all these items together in one gocd.yaml file.
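The original pipeline file isn't reproduced here; a rough sketch of what a minimal gocd.yaml for the gocd-yaml-config-plugin can look like (the repository URL, stage names, and script names are placeholders, and a real pipeline would also apply the deployment manifest per environment):

format_version: 3
pipelines:
  webapp:
    group: webapp
    materials:
      webapp-repo:
        git: https://your-git-server/webapp.git   # placeholder repository URL
        branch: master
    stages:
      - build-and-push:
          jobs:
            build:
              tasks:
                - exec:
                    command: ./build.sh            # builds, tests and pushes the Docker image
      - blue-green-switch:
          jobs:
            switch:
              tasks:
                - exec:
                    command: ./kubernetes/switch_environment.sh
                    arguments:
                      - webapp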
You can define your build script to build and dockerise the application. If you have Test and Staging environments, put them in the gocd.yaml too (I removed those lines to keep things simple). That's it! After that, you have:
A dockerised dotnetcore app
Kubernetes deployment pipelines
A blue-green switch pipeline that controls the Kubernetes service (you have to configure kubectl on the GoCD agents)
A Horizontal Pod Autoscaler (a CPU-based autoscaling mechanism in the cluster)
Hi guys, in this article I'm going to talk about how microservices are managed by Istio and why we should prefer it. Before Istio, people who owned a microservice architecture were complaining about the management of microservices, visualizing the service mesh, monitoring distributed services, service discovery, and so on. The announcement of Istio came at a good time because of those issues.
Istio is a platform hosted on top of your Kubernetes cluster. You deploy your applications (containers) with a special sidecar proxy throughout your environment; the proxy intercepts all network communication between microservices and is configured and managed using Istio's control plane functionality. In this way, you can use Envoy for managing the network, routing rules, etc. Service discovery is also supported by Istio, so you don't have to think about whether or not the network was configured correctly.
I did the POC described on their website: https://istio.io/docs/guides/bookinfo.html. It's easy to understand because it's hosted on top of Kubernetes, and you can use your favorite cloud provider (I used GCP's fully managed Kubernetes cluster). Content-based request routing (A/B testing, for example), traffic shifting, and fault injection are the most satisfying parts of using Istio.
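As an illustration of traffic shifting (written against the current Istio networking API, which may differ from the version used in this POC; the subsets assume matching DestinationRule definitions), a rule for the Bookinfo reviews service can look roughly like this:

# Rough sketch: send 80% of traffic to reviews v1 and 20% to v2.
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
    - reviews
  http:
    - route:
        - destination:
            host: reviews
            subset: v1
          weight: 80
        - destination:
            host: reviews
            subset: v2
          weight: 20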
Istio also has ready-to-use plugins and add-ons; have a look at the add-ons section: Grafana and Prometheus for monitoring, Jaeger for tracing, dotviz for visualizing the service mesh... All of them are ready to inject, and you manage them and apply custom configurations via their provisioning YAMLs.
Check out those URLs if you are interested.
In this article, I want to focus on Continuous Delivery. As described by Martin Fowler, Continuous Delivery is a software development discipline where you build software in such a way that it can be released to production at any time. This means our packages should be battle-tested, reliable, automatically deployable, and configurable. That's why we do Continuous Integration. Frequent builds, in turn, lead to more frequent releases. At that point, I'm on the side of trunk-based development rather than git flow. In my opinion, each commit should be deployed to the environments instead of waiting for a silly manual merging operation. We gain more agility, and each change becomes simpler and lower risk.
From the business perspective, this idea is simply perfect, because it allows organizations to adjust rapidly to changing market conditions.
From the developer perspective, we need to develop and deploy more carefully; adding more test suites to our pipeline, not only unit tests but also integration, contract, security tests, and so on, is good. Maybe we should do more pair programming, which is like continuous code review. Frequent production releases make us more aware and help us discover new approaches, and the approaches we find are, in fact, Continuous Delivery best practices.
Although it looks like a recently rising trend in the software world, at its core it actually consists of a set of concepts and disciplined processes. The sum of these can really be called a culture.
If that still feels a bit vague, let's examine the topic in more depth.
The definition that resonates most with me is this: developers being responsible for deploying the code they write themselves. From this point of view, DevOps is not a responsibility that rests only on a few people, but a trait that every developer in your organization should have. Looked at through this lens, DevOps is actually a process, like Agile or Waterfall.
Spotify sees this as part of its Agile culture, and its teams are responsible for all of their products' processes end-to-end: design, development, deployment, and so on. The teams are completely cross-functional. The automation journey starts with automating the teams' operational work and then extends to machine management, immutable infrastructure, and infrastructure as code. These teams, made up of Site Reliability Engineers, write the tools that enable this automation. The goal is self-healing systems and auto-scalable infrastructure. At this point we see manual configuration changes by humans reduced to zero, and software systems now managing the datacenters. Google's Borg, for instance, successfully orchestrates all operations across multiple datacenters: instead of sysadmins, software systems are in charge. For a similar technology, see http://mesos.apache.org/
The point to note is that Site Reliability Engineers are not a team that blocks the product-delivery teams; on the contrary, they develop technology for those teams and make it available for developers to use.