<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="http://itsamit.online/feed.xml" rel="self" type="application/atom+xml" /><link href="http://itsamit.online/" rel="alternate" type="text/html" /><updated>2026-03-24T10:57:20+00:00</updated><id>http://itsamit.online/feed.xml</id><title type="html">Amit Kumar DevOps Blog</title><subtitle>DevOps, Kubernetes and Cloud Engineering Notes</subtitle><author><name>Amit Kumar</name></author><entry><title type="html">EFK Stack on AKS — Production Ready Centralized Logging</title><link href="http://itsamit.online/azure/kubernetes/devops/observability/2026/03/24/EFK-setup-AKS.html" rel="alternate" type="text/html" title="EFK Stack on AKS — Production Ready Centralized Logging" /><published>2026-03-24T00:00:00+00:00</published><updated>2026-03-24T00:00:00+00:00</updated><id>http://itsamit.online/azure/kubernetes/devops/observability/2026/03/24/EFK-setup-AKS</id><content type="html" xml:base="http://itsamit.online/azure/kubernetes/devops/observability/2026/03/24/EFK-setup-AKS.html"><![CDATA[<h1 id="efk-stack-on-aks--production-ready-setup">EFK Stack on AKS — Production Ready Setup</h1>

<p>A complete guide to deploying <strong>Elasticsearch, Fluent Bit, and Kibana</strong> on Azure Kubernetes Service for centralized logging.</p>

<hr />

<h2 id="table-of-contents">Table of Contents</h2>

<ul>
  <li><a href="#architecture">Architecture</a></li>
  <li><a href="#prerequisites">Prerequisites</a></li>
  <li><a href="#cluster-setup">Cluster Setup</a></li>
  <li><a href="#deploy-elasticsearch-eck">Deploy Elasticsearch (ECK)</a></li>
  <li><a href="#deploy-kibana">Deploy Kibana</a></li>
  <li><a href="#deploy-fluent-bit">Deploy Fluent Bit</a></li>
  <li><a href="#azure-blob-archival">Azure Blob Archival</a></li>
  <li><a href="#index-lifecycle-management-ilm">Index Lifecycle Management (ILM)</a></li>
  <li><a href="#kibana-dashboards">Kibana Dashboards</a></li>
  <li><a href="#kql-queries-reference">KQL Queries Reference</a></li>
  <li><a href="#elasticsearch-dev-tools-queries">Elasticsearch Dev Tools Queries</a></li>
  <li><a href="#security">Security</a></li>
  <li><a href="#scaling--ha">Scaling &amp; HA</a></li>
  <li><a href="#monitoring-the-stack">Monitoring the Stack</a></li>
  <li><a href="#troubleshooting">Troubleshooting</a></li>
  <li><a href="#cost-optimization">Cost Optimization</a></li>
  <li><a href="#useful-commands">Useful Commands</a></li>
</ul>

<hr />

<h2 id="architecture">Architecture</h2>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>┌──────────────────────────────────────────────────────┐
│                    AKS Cluster                       │
│                                                      │
│  ┌───────────┐  ┌───────────┐  ┌───────────┐         │
│  │ App Pods  │  │ App Pods  │  │ App Pods  │         │
│  │ (stdout)  │  │ (stdout)  │  │ (stdout)  │         │
│  └─────┬─────┘  └─────┬─────┘  └─────┬─────┘         │
│        └──────────────┼──────────────┘               │
│                       ▼                              │
│         ┌──────────────────────────┐                 │
│         │  Fluent Bit (DaemonSet)  │                 │
│         │  /var/log/containers/*   │                 │
│         └──────────┬───────┬───────┘                 │
│                    │       │                         │
│                    ▼       ▼                         │
│  ┌─────────────────────┐  ┌──────────────────┐       │
│  │   Elasticsearch     │  │  Azure Blob      │       │
│  │   (StatefulSet)     │  │  (archive)       │       │
│  │   master + data     │  └──────────────────┘       │
│  └──────────┬──────────┘                             │
│             ▼                                        │
│  ┌─────────────────────┐                             │
│  │   Kibana            │                             │
│  │   (Deployment)      │                             │
│  └─────────────────────┘                             │
└──────────────────────────────────────────────────────┘
</code></pre></div></div>

<h3 id="key-concepts">Key Concepts</h3>

<table>
  <thead>
    <tr>
      <th>Component</th>
      <th>K8s Resource</th>
      <th>Purpose</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>ECK Operator</td>
      <td>Deployment</td>
      <td>Manages ES &amp; Kibana lifecycle automatically</td>
    </tr>
    <tr>
      <td>ECK CRDs</td>
      <td>Custom Resource Definitions</td>
      <td>Teaches K8s what <code class="language-plaintext highlighter-rouge">Elasticsearch</code> and <code class="language-plaintext highlighter-rouge">Kibana</code> resources are</td>
    </tr>
    <tr>
      <td>Elasticsearch Master</td>
      <td>StatefulSet (via ECK)</td>
      <td>Cluster brain: tracks metadata and allocates shards. Run 3 so a majority quorum survives the loss of one</td>
    </tr>
    <tr>
      <td>Elasticsearch Data</td>
      <td>StatefulSet (via ECK)</td>
      <td>Stores and searches actual logs. Scale as needed</td>
    </tr>
    <tr>
      <td>Fluent Bit</td>
      <td>DaemonSet (via Helm)</td>
      <td>Lightweight log collector — one per node</td>
    </tr>
    <tr>
      <td>Kibana</td>
      <td>Deployment (via ECK)</td>
      <td>Web UI for searching and visualizing logs</td>
    </tr>
  </tbody>
</table>

<blockquote>
  <p><strong>Note</strong>: ES master/data nodes are pods on your AKS worker nodes. They are NOT related to AKS control plane master nodes (which are Azure-managed and invisible).</p>
</blockquote>
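
<p>The table above sizes the master tier at 3 for quorum. That is majority arithmetic: with <em>n</em> master-eligible nodes, <code class="language-plaintext highlighter-rouge">floor(n/2) + 1</code> of them must be reachable to elect a master. A minimal sketch of that arithmetic follows; the function names <code class="language-plaintext highlighter-rouge">quorum</code> and <code class="language-plaintext highlighter-rouge">tolerated</code> are made up for illustration and are not part of any Elasticsearch or ECK tooling.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Majority needed among n master-eligible nodes: floor(n/2) + 1
quorum() { echo $(( $1 / 2 + 1 )); }

# Node failures the cluster can absorb while a majority remains
tolerated() { echo $(( $1 - ($1 / 2 + 1) )); }

for n in 1 2 3 4 5; do
  echo "masters=$n quorum=$(quorum "$n") tolerated=$(tolerated "$n")"
done
</code></pre></div></div>

<p>Three masters tolerate one failure while two tolerate none, which is why an odd count of at least 3 is the usual choice.</p>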

<h3 id="what-needs-persistent-storage">What Needs Persistent Storage?</h3>

<table>
  <thead>
    <tr>
      <th>Component</th>
      <th>PV Required?</th>
      <th>Reason</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Elasticsearch</td>
      <td>Yes</td>
      <td>Stores log data — must survive pod restarts</td>
    </tr>
    <tr>
      <td>Fluent Bit</td>
      <td>No</td>
      <td>Stateless — reads and forwards logs</td>
    </tr>
    <tr>
      <td>Kibana</td>
      <td>No</td>
      <td>Stateless — just a UI querying Elasticsearch</td>
    </tr>
  </tbody>
</table>

<hr />

<h2 id="prerequisites">Prerequisites</h2>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Azure CLI</span>
curl <span class="nt">-sL</span> https://aka.ms/InstallAzureCLIDeb | <span class="nb">sudo </span>bash

<span class="c"># kubectl</span>
az aks install-cli

<span class="c"># Helm 3</span>
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
</code></pre></div></div>

<hr />

<h2 id="cluster-setup">Cluster Setup</h2>

<h3 id="create-aks-cluster">Create AKS Cluster</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">RESOURCE_GROUP</span><span class="o">=</span><span class="s2">"rg-efk-prod"</span>
<span class="nv">CLUSTER_NAME</span><span class="o">=</span><span class="s2">"aks-efk-cluster"</span>
<span class="nv">LOCATION</span><span class="o">=</span><span class="s2">"centralindia"</span>

az group create <span class="nt">--name</span> <span class="nv">$RESOURCE_GROUP</span> <span class="nt">--location</span> <span class="nv">$LOCATION</span>

az aks create <span class="se">\</span>
  <span class="nt">--resource-group</span> <span class="nv">$RESOURCE_GROUP</span> <span class="se">\</span>
  <span class="nt">--name</span> <span class="nv">$CLUSTER_NAME</span> <span class="se">\</span>
  <span class="nt">--node-count</span> 3 <span class="se">\</span>
  <span class="nt">--node-vm-size</span> Standard_D4s_v3 <span class="se">\</span>
  <span class="nt">--enable-managed-identity</span> <span class="se">\</span>
  <span class="nt">--network-plugin</span> azure <span class="se">\</span>
  <span class="nt">--network-policy</span> calico <span class="se">\</span>
  <span class="nt">--generate-ssh-keys</span> <span class="se">\</span>
  <span class="nt">--zones</span> 1 2 3

az aks get-credentials <span class="nt">--resource-group</span> <span class="nv">$RESOURCE_GROUP</span> <span class="nt">--name</span> <span class="nv">$CLUSTER_NAME</span>
</code></pre></div></div>

<h3 id="create-dedicated-node-pool-for-elasticsearch">Create Dedicated Node Pool for Elasticsearch</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>az aks nodepool add <span class="se">\</span>
  <span class="nt">--resource-group</span> <span class="nv">$RESOURCE_GROUP</span> <span class="se">\</span>
  <span class="nt">--cluster-name</span> <span class="nv">$CLUSTER_NAME</span> <span class="se">\</span>
  <span class="nt">--name</span> espooldata <span class="se">\</span>
  <span class="nt">--node-count</span> 3 <span class="se">\</span>
  <span class="nt">--node-vm-size</span> Standard_E4s_v3 <span class="se">\</span>
  <span class="nt">--labels</span> <span class="nv">role</span><span class="o">=</span>elasticsearch-data <span class="se">\</span>
  <span class="nt">--node-taints</span> <span class="nv">elasticsearch</span><span class="o">=</span><span class="nb">true</span>:NoSchedule <span class="se">\</span>
  <span class="nt">--zones</span> 1 2 3
</code></pre></div></div>

<blockquote>
  <p><strong>Taint</strong> keeps non-ES pods out. <strong>Toleration</strong> (in ES spec) lets ES pods in. <strong>nodeSelector</strong> forces ES pods to only run here.</p>
</blockquote>

<h3 id="create-namespace">Create Namespace</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl create namespace logging
</code></pre></div></div>

<hr />

<h2 id="deploy-elasticsearch-eck">Deploy Elasticsearch (ECK)</h2>

<h3 id="install-eck-operator">Install ECK Operator</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># CRDs — teach K8s new resource types</span>
kubectl create <span class="nt">-f</span> https://download.elastic.co/downloads/eck/2.14.0/crds.yaml

<span class="c"># Operator — the specialist that manages ES &amp; Kibana</span>
kubectl apply <span class="nt">-f</span> https://download.elastic.co/downloads/eck/2.14.0/operator.yaml

kubectl <span class="nt">-n</span> elastic-system get pods
</code></pre></div></div>

<h3 id="elasticsearch-resource">Elasticsearch Resource</h3>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># elasticsearch.yaml</span>
<span class="na">apiVersion</span><span class="pi">:</span> <span class="s">elasticsearch.k8s.elastic.co/v1</span>
<span class="na">kind</span><span class="pi">:</span> <span class="s">Elasticsearch</span>
<span class="na">metadata</span><span class="pi">:</span>
  <span class="na">name</span><span class="pi">:</span> <span class="s">efk-cluster</span>
  <span class="na">namespace</span><span class="pi">:</span> <span class="s">logging</span>
<span class="na">spec</span><span class="pi">:</span>
  <span class="na">version</span><span class="pi">:</span> <span class="s">8.15.0</span>
  <span class="na">nodeSets</span><span class="pi">:</span>
    <span class="c1"># --- Master Nodes (cluster brain, no data) ---</span>
    <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">master</span>
      <span class="na">count</span><span class="pi">:</span> <span class="m">3</span>
      <span class="na">config</span><span class="pi">:</span>
        <span class="na">node.roles</span><span class="pi">:</span> <span class="pi">[</span><span class="s2">"</span><span class="s">master"</span><span class="pi">]</span>
        <span class="na">node.store.allow_mmap</span><span class="pi">:</span> <span class="no">false</span>
      <span class="na">podTemplate</span><span class="pi">:</span>
        <span class="na">spec</span><span class="pi">:</span>
          <span class="na">containers</span><span class="pi">:</span>
            <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">elasticsearch</span>
              <span class="na">resources</span><span class="pi">:</span>
                <span class="na">requests</span><span class="pi">:</span>
                  <span class="na">memory</span><span class="pi">:</span> <span class="s">2Gi</span>
                  <span class="na">cpu</span><span class="pi">:</span> <span class="s">500m</span>
                <span class="na">limits</span><span class="pi">:</span>
                  <span class="na">memory</span><span class="pi">:</span> <span class="s">4Gi</span>
                  <span class="na">cpu</span><span class="pi">:</span> <span class="m">2</span>
              <span class="na">env</span><span class="pi">:</span>
                <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">ES_JAVA_OPTS</span>
                  <span class="na">value</span><span class="pi">:</span> <span class="s2">"</span><span class="s">-Xms2g</span><span class="nv"> </span><span class="s">-Xmx2g"</span>
          <span class="na">initContainers</span><span class="pi">:</span>
            <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">sysctl</span>
              <span class="na">securityContext</span><span class="pi">:</span>
                <span class="na">privileged</span><span class="pi">:</span> <span class="no">true</span>
                <span class="na">runAsUser</span><span class="pi">:</span> <span class="m">0</span>
              <span class="na">command</span><span class="pi">:</span> <span class="pi">[</span><span class="s1">'</span><span class="s">sh'</span><span class="pi">,</span> <span class="s1">'</span><span class="s">-c'</span><span class="pi">,</span> <span class="s1">'</span><span class="s">sysctl</span><span class="nv"> </span><span class="s">-w</span><span class="nv"> </span><span class="s">vm.max_map_count=262144'</span><span class="pi">]</span>
          <span class="na">tolerations</span><span class="pi">:</span>
            <span class="pi">-</span> <span class="na">key</span><span class="pi">:</span> <span class="s2">"</span><span class="s">elasticsearch"</span>
              <span class="na">operator</span><span class="pi">:</span> <span class="s2">"</span><span class="s">Equal"</span>
              <span class="na">value</span><span class="pi">:</span> <span class="s2">"</span><span class="s">true"</span>
              <span class="na">effect</span><span class="pi">:</span> <span class="s2">"</span><span class="s">NoSchedule"</span>
          <span class="na">nodeSelector</span><span class="pi">:</span>
            <span class="na">role</span><span class="pi">:</span> <span class="s">elasticsearch-data</span>
      <span class="na">volumeClaimTemplates</span><span class="pi">:</span>
        <span class="pi">-</span> <span class="na">metadata</span><span class="pi">:</span>
            <span class="na">name</span><span class="pi">:</span> <span class="s">elasticsearch-data</span>
          <span class="na">spec</span><span class="pi">:</span>
            <span class="na">accessModes</span><span class="pi">:</span> <span class="pi">[</span><span class="s2">"</span><span class="s">ReadWriteOnce"</span><span class="pi">]</span>
            <span class="na">storageClassName</span><span class="pi">:</span> <span class="s">managed-premium</span>
            <span class="na">resources</span><span class="pi">:</span>
              <span class="na">requests</span><span class="pi">:</span>
                <span class="na">storage</span><span class="pi">:</span> <span class="s">10Gi</span>

    <span class="c1"># --- Data Nodes (stores and searches logs) ---</span>
    <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">data</span>
      <span class="na">count</span><span class="pi">:</span> <span class="m">3</span>
      <span class="na">config</span><span class="pi">:</span>
        <span class="na">node.roles</span><span class="pi">:</span> <span class="pi">[</span><span class="s2">"</span><span class="s">data"</span><span class="pi">,</span> <span class="s2">"</span><span class="s">ingest"</span><span class="pi">]</span>
        <span class="na">node.store.allow_mmap</span><span class="pi">:</span> <span class="no">false</span>
      <span class="na">podTemplate</span><span class="pi">:</span>
        <span class="na">spec</span><span class="pi">:</span>
          <span class="na">containers</span><span class="pi">:</span>
            <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">elasticsearch</span>
              <span class="na">resources</span><span class="pi">:</span>
                <span class="na">requests</span><span class="pi">:</span>
                  <span class="na">memory</span><span class="pi">:</span> <span class="s">4Gi</span>
                  <span class="na">cpu</span><span class="pi">:</span> <span class="m">2</span>
                <span class="na">limits</span><span class="pi">:</span>
                  <span class="na">memory</span><span class="pi">:</span> <span class="s">8Gi</span>
                  <span class="na">cpu</span><span class="pi">:</span> <span class="m">4</span>
              <span class="na">env</span><span class="pi">:</span>
                <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">ES_JAVA_OPTS</span>
                  <span class="na">value</span><span class="pi">:</span> <span class="s2">"</span><span class="s">-Xms4g</span><span class="nv"> </span><span class="s">-Xmx4g"</span>
          <span class="na">tolerations</span><span class="pi">:</span>
            <span class="pi">-</span> <span class="na">key</span><span class="pi">:</span> <span class="s2">"</span><span class="s">elasticsearch"</span>
              <span class="na">operator</span><span class="pi">:</span> <span class="s2">"</span><span class="s">Equal"</span>
              <span class="na">value</span><span class="pi">:</span> <span class="s2">"</span><span class="s">true"</span>
              <span class="na">effect</span><span class="pi">:</span> <span class="s2">"</span><span class="s">NoSchedule"</span>
          <span class="na">nodeSelector</span><span class="pi">:</span>
            <span class="na">role</span><span class="pi">:</span> <span class="s">elasticsearch-data</span>
      <span class="na">volumeClaimTemplates</span><span class="pi">:</span>
        <span class="pi">-</span> <span class="na">metadata</span><span class="pi">:</span>
            <span class="na">name</span><span class="pi">:</span> <span class="s">elasticsearch-data</span>
          <span class="na">spec</span><span class="pi">:</span>
            <span class="na">accessModes</span><span class="pi">:</span> <span class="pi">[</span><span class="s2">"</span><span class="s">ReadWriteOnce"</span><span class="pi">]</span>
            <span class="na">storageClassName</span><span class="pi">:</span> <span class="s">managed-premium</span>
            <span class="na">resources</span><span class="pi">:</span>
              <span class="na">requests</span><span class="pi">:</span>
                <span class="na">storage</span><span class="pi">:</span> <span class="s">100Gi</span>
</code></pre></div></div>
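
<p>Both nodeSets above set the JVM heap to half of the container memory limit (2g of 4Gi, 4g of 8Gi), in line with the common guidance to keep the heap at no more than 50% of available memory so the rest is left for the filesystem cache and off-heap use. A small sketch of that sizing rule, assuming limits are given in whole Gi; the helper name <code class="language-plaintext highlighter-rouge">heap_opts</code> is hypothetical:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Print ES_JAVA_OPTS with the heap set to half of a memory limit given in Gi.
# Works in MiB so odd limits such as 3Gi still divide cleanly.
heap_opts() {
  limit_mib=$(( $1 * 1024 ))
  heap_mib=$(( limit_mib / 2 ))
  echo "-Xms${heap_mib}m -Xmx${heap_mib}m"
}

heap_opts 4   # master nodeSet limit (4Gi)
heap_opts 8   # data nodeSet limit (8Gi)
</code></pre></div></div>

<p>The values printed for 4 and 8 (2048m and 4096m) are the same sizes as the 2g and 4g used in the manifest.</p>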

<blockquote>
  <p><strong>For dev/demo</strong> (2 CPU / 8GB): Use a single nodeSet with <code class="language-plaintext highlighter-rouge">count: 1</code>, combined roles, <code class="language-plaintext highlighter-rouge">managed</code> storageClass, and reduced resources. See <a href="#cost-optimization">Cost Optimization</a>.</p>
</blockquote>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl apply <span class="nt">-f</span> elasticsearch.yaml
kubectl <span class="nt">-n</span> logging get elasticsearch
kubectl <span class="nt">-n</span> logging get pods <span class="nt">-l</span> elasticsearch.k8s.elastic.co/cluster-name<span class="o">=</span>efk-cluster
</code></pre></div></div>

<h3 id="get-credentials">Get Credentials</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Username is always: elastic</span>
<span class="c"># Password:</span>
kubectl <span class="nt">-n</span> logging get secret efk-cluster-es-elastic-user <span class="nt">-o</span> <span class="nv">jsonpath</span><span class="o">=</span><span class="s1">'{.data.elastic}'</span> | <span class="nb">base64</span> <span class="nt">-d</span>
</code></pre></div></div>
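
<p>Kubernetes stores Secret values base64-encoded, which is why the command above pipes the field through <code class="language-plaintext highlighter-rouge">base64 -d</code>. The round trip below illustrates this with a placeholder string, not a real credential:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Placeholder standing in for the generated password
plain='example-password'

# The API returns the base64 form in .data.elastic
encoded=$(printf '%s' "$plain" | base64)

# base64 -d recovers the original string
decoded=$(printf '%s' "$encoded" | base64 -d)

echo "encoded=$encoded"
echo "decoded=$decoded"
</code></pre></div></div>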

<h3 id="verify">Verify</h3>

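<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># port-forward blocks, so run it in one terminal and curl in another</span>
kubectl <span class="nt">-n</span> logging port-forward svc/efk-cluster-es-http 9200
<span class="c"># -k skips TLS verification (ECK's default certificate is self-signed); the URL is quoted so the shell leaves ? alone</span>
curl <span class="nt">-k</span> <span class="nt">-u</span> <span class="s2">"elastic:&lt;password&gt;"</span> <span class="s2">"https://localhost:9200/_cluster/health?pretty"</span>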
</code></pre></div></div>
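
<p>The field to check in the health response is <code class="language-plaintext highlighter-rouge">status</code>. The sketch below pulls it out with <code class="language-plaintext highlighter-rouge">sed</code> and prints what each colour means. The <code class="language-plaintext highlighter-rouge">response</code> string is a shortened, hypothetical sample rather than output captured from a real cluster; in practice it would be the output of the curl command above.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical, shortened sample; a real response carries more fields
response='{"cluster_name":"efk-cluster","status":"green","number_of_nodes":6}'

# Extract the value of the "status" field (also works on ?pretty output)
status=$(printf '%s' "$response" | sed -n 's/.*"status"[ ]*:[ ]*"\([a-z]*\)".*/\1/p')

case "$status" in
  green)  echo "green: all primary and replica shards are allocated" ;;
  yellow) echo "yellow: primaries are allocated, some replicas are not" ;;
  red)    echo "red: at least one primary shard is unallocated" ;;
  *)      echo "could not read status from the response" ;;
esac
</code></pre></div></div>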

<hr />

<h2 id="deploy-kibana">Deploy Kibana</h2>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># kibana.yaml</span>
<span class="na">apiVersion</span><span class="pi">:</span> <span class="s">kibana.k8s.elastic.co/v1</span>
<span class="na">kind</span><span class="pi">:</span> <span class="s">Kibana</span>
<span class="na">metadata</span><span class="pi">:</span>
  <span class="na">name</span><span class="pi">:</span> <span class="s">efk-kibana</span>
  <span class="na">namespace</span><span class="pi">:</span> <span class="s">logging</span>
<span class="na">spec</span><span class="pi">:</span>
  <span class="na">version</span><span class="pi">:</span> <span class="s">8.15.0</span>
  <span class="na">count</span><span class="pi">:</span> <span class="m">2</span>
  <span class="na">elasticsearchRef</span><span class="pi">:</span>
    <span class="na">name</span><span class="pi">:</span> <span class="s">efk-cluster</span>    <span class="c1"># ECK auto-injects ES credentials</span>
  <span class="na">podTemplate</span><span class="pi">:</span>
    <span class="na">spec</span><span class="pi">:</span>
      <span class="na">containers</span><span class="pi">:</span>
        <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">kibana</span>
          <span class="na">resources</span><span class="pi">:</span>
            <span class="na">requests</span><span class="pi">:</span>
              <span class="na">memory</span><span class="pi">:</span> <span class="s">768Mi</span>
              <span class="na">cpu</span><span class="pi">:</span> <span class="s">200m</span>
            <span class="na">limits</span><span class="pi">:</span>
              <span class="na">memory</span><span class="pi">:</span> <span class="s">1Gi</span>
              <span class="na">cpu</span><span class="pi">:</span> <span class="s">500m</span>
          <span class="na">env</span><span class="pi">:</span>
            <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">NODE_OPTIONS</span>
              <span class="na">value</span><span class="pi">:</span> <span class="s2">"</span><span class="s">--max-old-space-size=768"</span>    <span class="c1"># caps the Node.js heap at 768 MB, below the 1Gi container limit</span>
</code></pre></div></div>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl apply <span class="nt">-f</span> kibana.yaml
kubectl <span class="nt">-n</span> logging port-forward svc/efk-kibana-kb-http 5601
<span class="c"># Open https://localhost:5601 and log in with elastic/&lt;password&gt;</span>
</code></pre></div></div>

<h3 id="expose-via-ingress-production">Expose via Ingress (Production)</h3>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># kibana-ingress.yaml</span>
<span class="na">apiVersion</span><span class="pi">:</span> <span class="s">networking.k8s.io/v1</span>
<span class="na">kind</span><span class="pi">:</span> <span class="s">Ingress</span>
<span class="na">metadata</span><span class="pi">:</span>
  <span class="na">name</span><span class="pi">:</span> <span class="s">kibana-ingress</span>
  <span class="na">namespace</span><span class="pi">:</span> <span class="s">logging</span>
  <span class="na">annotations</span><span class="pi">:</span>
    <span class="na">nginx.ingress.kubernetes.io/backend-protocol</span><span class="pi">:</span> <span class="s2">"</span><span class="s">HTTPS"</span>
    <span class="na">nginx.ingress.kubernetes.io/ssl-redirect</span><span class="pi">:</span> <span class="s2">"</span><span class="s">true"</span>
    <span class="na">cert-manager.io/cluster-issuer</span><span class="pi">:</span> <span class="s2">"</span><span class="s">letsencrypt-prod"</span>
<span class="na">spec</span><span class="pi">:</span>
  <span class="na">ingressClassName</span><span class="pi">:</span> <span class="s">nginx</span>
  <span class="na">tls</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="na">hosts</span><span class="pi">:</span>
        <span class="pi">-</span> <span class="s">kibana.yourdomain.com</span>
      <span class="na">secretName</span><span class="pi">:</span> <span class="s">kibana-tls</span>
  <span class="na">rules</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="na">host</span><span class="pi">:</span> <span class="s">kibana.yourdomain.com</span>
      <span class="na">http</span><span class="pi">:</span>
        <span class="na">paths</span><span class="pi">:</span>
          <span class="pi">-</span> <span class="na">path</span><span class="pi">:</span> <span class="s">/</span>
            <span class="na">pathType</span><span class="pi">:</span> <span class="s">Prefix</span>
            <span class="na">backend</span><span class="pi">:</span>
              <span class="na">service</span><span class="pi">:</span>
                <span class="na">name</span><span class="pi">:</span> <span class="s">efk-kibana-kb-http</span>
                <span class="na">port</span><span class="pi">:</span>
                  <span class="na">number</span><span class="pi">:</span> <span class="m">5601</span>
</code></pre></div></div>

<hr />

<h2 id="deploy-fluent-bit">Deploy Fluent Bit</h2>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># fluent-bit-values.yaml</span>
<span class="na">image</span><span class="pi">:</span>
  <span class="na">repository</span><span class="pi">:</span> <span class="s">fluent/fluent-bit</span>
  <span class="na">tag</span><span class="pi">:</span> <span class="s2">"</span><span class="s">3.1"</span>

<span class="na">daemonSetVolumes</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">varlog</span>
    <span class="na">hostPath</span><span class="pi">:</span>
      <span class="na">path</span><span class="pi">:</span> <span class="s">/var/log</span>
  <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">varlibdockercontainers</span>
    <span class="na">hostPath</span><span class="pi">:</span>
      <span class="na">path</span><span class="pi">:</span> <span class="s">/var/lib/docker/containers</span>

<span class="na">daemonSetVolumeMounts</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">varlog</span>
    <span class="na">mountPath</span><span class="pi">:</span> <span class="s">/var/log</span>
  <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">varlibdockercontainers</span>
    <span class="na">mountPath</span><span class="pi">:</span> <span class="s">/var/lib/docker/containers</span>
    <span class="na">readOnly</span><span class="pi">:</span> <span class="no">true</span>

<span class="na">config</span><span class="pi">:</span>
  <span class="na">service</span><span class="pi">:</span> <span class="pi">|</span>
    <span class="s">[SERVICE]</span>
        <span class="s"># Send buffered logs every 5 seconds</span>
        <span class="s">Flush         5</span>
        <span class="s">Log_Level     info</span>
        <span class="s">Daemon        off</span>
        <span class="s">Parsers_File  /fluent-bit/etc/parsers.conf</span>
        <span class="s"># Built-in HTTP server that exposes the metrics endpoint</span>
        <span class="s">HTTP_Server   On</span>
        <span class="s">HTTP_Listen   0.0.0.0</span>
        <span class="s">HTTP_Port     2020</span>
        <span class="s">Health_Check  On</span>

  <span class="na">inputs</span><span class="pi">:</span> <span class="pi">|</span>
    <span class="s">[INPUT]</span>
        <span class="s"># Tail log files</span>
        <span class="s">Name              tail</span>
        <span class="s">Tag               kube.*</span>
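        <span class="s"># All container logs on the node</span>
        <span class="s">Path              /var/log/containers/*.log</span>
        <span class="s"># AKS nodes run containerd, which writes logs in CRI format</span>
        <span class="s">Parser            cri</span>
        <span class="s"># Tracks the read position across restarts</span>
        <span class="s">DB                /var/log/flb_kube.db</span>
        <span class="s"># Caps in-memory buffering so the pod is not OOM-killed</span>
        <span class="s">Mem_Buf_Limit     50MB</span>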
        <span class="s">Skip_Long_Lines   On</span>
        <span class="s">Refresh_Interval  10</span>
        <span class="s">Read_from_Head    False</span>

    <span class="s">[INPUT]</span>
        <span class="s">Name              systemd</span>
        <span class="s">Tag               node.systemd.*</span>
        <span class="s">Systemd_Filter    _SYSTEMD_UNIT=kubelet.service</span>
        <span class="s">Read_From_Tail    On</span>

  <span class="na">filters</span><span class="pi">:</span> <span class="pi">|</span>
    <span class="s">[FILTER]</span>
        <span class="s"># Enrich records with Kubernetes metadata</span>
        <span class="s">Name                kubernetes</span>
        <span class="s">Match               kube.*</span>
        <span class="s">Kube_URL            https://kubernetes.default.svc:443</span>
        <span class="s">Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt</span>
        <span class="s">Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token</span>
        <span class="s">Kube_Tag_Prefix     kube.var.log.containers.</span>
        <span class="s"># Parse JSON log bodies automatically</span>
        <span class="s">Merge_Log           On</span>
        <span class="s">Merge_Log_Key       log_processed</span>
        <span class="s">K8S-Logging.Parser  On</span>
        <span class="s">K8S-Logging.Exclude On</span>
        <span class="s">Labels              On</span>
        <span class="s">Annotations         Off</span>
        <span class="s">Buffer_Size         0</span>

    <span class="s">[FILTER]</span>
        <span class="s">Name          modify</span>
        <span class="s">Match         kube.*</span>
        <span class="s"># Tag every record with the cluster name</span>
        <span class="s">Add           cluster aks-efk-cluster</span>
        <span class="s">Add           environment production</span>

    <span class="s">[FILTER]</span>
        <span class="s">Name          nest</span>
        <span class="s">Match         kube.*</span>
        <span class="s">Operation     lift</span>
        <span class="s"># Lift the nested kubernetes fields to the top level</span>
        <span class="s">Nested_under  kubernetes</span>

  <span class="na">outputs</span><span class="pi">:</span> <span class="pi">|</span>
    <span class="s">[OUTPUT]</span>
        <span class="s">Name            es</span>
        <span class="s">Match           kube.*</span>
        <span class="s">Host            efk-cluster-es-http.logging.svc.cluster.local</span>
        <span class="s">Port            9200</span>
        <span class="s">HTTP_User       elastic</span>
        <span class="s">HTTP_Passwd     ${ES_PASSWORD}</span>
        <span class="s"># creates daily indices, e.g. k8s-logs-2026.03.24</span>
        <span class="s">Logstash_Format On</span>
        <span class="s">Logstash_Prefix k8s-logs</span>
        <span class="s">Suppress_Type_Name On</span>
        <span class="s">tls             On</span>
        <span class="s">tls.verify      Off</span>
        <span class="s">Retry_Limit     5</span>
        <span class="s">Replace_Dots    On</span>
        <span class="s">Buffer_Size     512KB</span>

    <span class="s">[OUTPUT]</span>
        <span class="s">Name            es</span>
        <span class="s">Match           node.systemd.*</span>
        <span class="s">Host            efk-cluster-es-http.logging.svc.cluster.local</span>
        <span class="s">Port            9200</span>
        <span class="s">HTTP_User       elastic</span>
        <span class="s">HTTP_Passwd     ${ES_PASSWORD}</span>
        <span class="s">Logstash_Format On</span>
        <span class="s"># separate index for node logs</span>
        <span class="s">Logstash_Prefix node-logs</span>
        <span class="s">Suppress_Type_Name On</span>
        <span class="s">tls             On</span>
        <span class="s">tls.verify      Off</span>

  <span class="na">customParsers</span><span class="pi">:</span> <span class="pi">|</span>
    <span class="s">[PARSER]</span>
        <span class="s">Name        cri</span>
        <span class="s">Format      regex</span>
        <span class="s">Regex       ^(?&lt;time&gt;[^ ]+) (?&lt;stream&gt;stdout|stderr) (?&lt;logtag&gt;[^ ]*) (?&lt;message&gt;.*)$</span>
        <span class="s">Time_Key    time</span>
        <span class="s">Time_Format %Y-%m-%dT%H:%M:%S.%L%z</span>

    <span class="s">[PARSER]</span>
        <span class="s">Name        json</span>
        <span class="s">Format      json</span>
        <span class="s">Time_Key    time</span>
        <span class="s">Time_Format %Y-%m-%dT%H:%M:%S.%L</span>
        <span class="s">Time_Keep   On</span>

    <span class="s">[PARSER]</span>
        <span class="s">Name        docker</span>
        <span class="s">Format      json</span>
        <span class="s">Time_Key    time</span>
        <span class="s">Time_Format %Y-%m-%dT%H:%M:%S.%L</span>
        <span class="s">Time_Keep   On</span>

<span class="na">env</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">ES_PASSWORD</span>
    <span class="na">valueFrom</span><span class="pi">:</span>
      <span class="na">secretKeyRef</span><span class="pi">:</span>
        <span class="na">name</span><span class="pi">:</span> <span class="s">efk-cluster-es-elastic-user</span>
        <span class="na">key</span><span class="pi">:</span> <span class="s">elastic</span>

<span class="na">resources</span><span class="pi">:</span>
  <span class="na">limits</span><span class="pi">:</span>
    <span class="na">cpu</span><span class="pi">:</span> <span class="s">200m</span>
    <span class="na">memory</span><span class="pi">:</span> <span class="s">256Mi</span>
  <span class="na">requests</span><span class="pi">:</span>
    <span class="na">cpu</span><span class="pi">:</span> <span class="s">100m</span>
    <span class="na">memory</span><span class="pi">:</span> <span class="s">128Mi</span>

<span class="na">tolerations</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="na">operator</span><span class="pi">:</span> <span class="s">Exists</span>    <span class="c1"># run on ALL nodes</span>

<span class="na">serviceMonitor</span><span class="pi">:</span>
  <span class="na">enabled</span><span class="pi">:</span> <span class="no">true</span>    <span class="c1"># requires the Prometheus Operator CRDs in the cluster</span>
</code></pre></div></div>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>helm repo add fluent https://fluent.github.io/helm-charts
helm repo update

helm <span class="nb">install </span>fluent-bit fluent/fluent-bit <span class="se">\</span>
  <span class="nt">--namespace</span> logging <span class="se">\</span>
  <span class="nt">-f</span> fluent-bit-values.yaml

kubectl <span class="nt">-n</span> logging get pods <span class="nt">-l</span> app.kubernetes.io/name<span class="o">=</span>fluent-bit
kubectl <span class="nt">-n</span> logging logs <span class="nt">-l</span> app.kubernetes.io/name<span class="o">=</span>fluent-bit <span class="nt">--tail</span><span class="o">=</span>20
</code></pre></div></div>

<hr />
<h2 id="kibana-interface">Kibana Interface</h2>
<p><img width="1792" height="1043" alt="Screenshot from 2026-03-24 16-24-09" src="https://github.com/user-attachments/assets/c5b6c1f5-01b5-4350-a38f-2802ffa744cf" />
<img width="1794" height="338" alt="Screenshot from 2026-03-24 16-25-03" src="https://github.com/user-attachments/assets/9552ac9e-7dba-4ec2-a432-d52528c7660d" /></p>

<h2 id="azure-blob-archival">Azure Blob Archival</h2>

<p>Send logs to both Elasticsearch (short-term search) and Azure Blob Storage (long-term archive).</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Fluent Bit ──▶ Elasticsearch (last 90 days, searchable in Kibana)
     │
     └──────▶ Azure Blob (forever, cheap, manual access only)
</code></pre></div></div>

<blockquote>
  <p>Kibana CANNOT read from Blob Storage directly. To search archived logs, re-ingest them into Elasticsearch temporarily.</p>
</blockquote>

<h3 id="setup">Setup</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Create storage account</span>
az storage account create <span class="se">\</span>
  <span class="nt">--name</span> efklogsarchive <span class="se">\</span>
  <span class="nt">--resource-group</span> <span class="nv">$RESOURCE_GROUP</span> <span class="se">\</span>
  <span class="nt">--location</span> centralindia <span class="se">\</span>
  <span class="nt">--sku</span> Standard_LRS

<span class="c"># Create container</span>
az storage container create <span class="nt">--name</span> logs-archive <span class="nt">--account-name</span> efklogsarchive

<span class="c"># Store key as K8s secret (no copy-paste errors)</span>
az storage account keys list <span class="nt">--account-name</span> efklogsarchive <span class="nt">--query</span> <span class="s2">"[0].value"</span> <span class="nt">-o</span> tsv | <span class="se">\</span>
  xargs <span class="nt">-I</span> <span class="o">{}</span> kubectl <span class="nt">-n</span> logging create secret generic azure-blob-secret <span class="se">\</span>
  <span class="nt">--from-literal</span><span class="o">=</span><span class="nv">shared_key</span><span class="o">={}</span>
</code></pre></div></div>

<h3 id="add-to-fluent-bit-config">Add to Fluent Bit Config</h3>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Add to env section</span>
<span class="na">env</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">AZURE_BLOB_KEY</span>
    <span class="na">valueFrom</span><span class="pi">:</span>
      <span class="na">secretKeyRef</span><span class="pi">:</span>
        <span class="na">name</span><span class="pi">:</span> <span class="s">azure-blob-secret</span>
        <span class="na">key</span><span class="pi">:</span> <span class="s">shared_key</span>

<span class="c1"># Add to outputs section (alongside existing ES output)</span>
<span class="na">outputs</span><span class="pi">:</span> <span class="pi">|</span>
    <span class="s"># ... existing ES outputs ...</span>

    <span class="s">[OUTPUT]</span>
        <span class="s">Name              azure_blob</span>
        <span class="s">Match             kube.*</span>
        <span class="s">account_name      efklogsarchive</span>
        <span class="s">shared_key        ${AZURE_BLOB_KEY}</span>
        <span class="s">container_name    logs-archive</span>
        <span class="s">path              year=%Y/month=%m/day=%d</span>
        <span class="s">auto_create_container  On</span>
        <span class="s">blob_type         blockblob</span>
        <span class="s">tls               On</span>
        <span class="s">Retry_Limit       3</span>
</code></pre></div></div>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>helm upgrade fluent-bit fluent/fluent-bit <span class="nt">--namespace</span> logging <span class="nt">-f</span> fluent-bit-values.yaml

<span class="c"># Verify</span>
az storage blob list <span class="nt">--account-name</span> efklogsarchive <span class="nt">--container-name</span> logs-archive <span class="nt">--output</span> table
</code></pre></div></div>

<hr />

<h2 id="index-lifecycle-management-ilm">Index Lifecycle Management (ILM)</h2>

<p>ILM keeps the disk from filling up by moving each index through hot, warm, and cold phases and deleting it after 90 days.</p>

<p>Run the following requests in <strong>Kibana Dev Tools</strong>:</p>

<h3 id="create-ilm-policy">Create ILM Policy</h3>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="err">PUT</span><span class="w"> </span><span class="err">_ilm/policy/k</span><span class="mi">8</span><span class="err">s-logs-policy</span><span class="w">
</span><span class="p">{</span><span class="w">
  </span><span class="nl">"policy"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="nl">"phases"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
      </span><span class="nl">"hot"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
        </span><span class="nl">"min_age"</span><span class="p">:</span><span class="w"> </span><span class="s2">"0ms"</span><span class="p">,</span><span class="w">
        </span><span class="nl">"actions"</span><span class="p">:</span><span class="w"> </span><span class="p">{}</span><span class="w">
      </span><span class="p">},</span><span class="w">
      </span><span class="nl">"warm"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
        </span><span class="nl">"min_age"</span><span class="p">:</span><span class="w"> </span><span class="s2">"3d"</span><span class="p">,</span><span class="w">
        </span><span class="nl">"actions"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
          </span><span class="nl">"shrink"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nl">"number_of_shards"</span><span class="p">:</span><span class="w"> </span><span class="mi">1</span><span class="w"> </span><span class="p">},</span><span class="w">
          </span><span class="nl">"forcemerge"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nl">"max_num_segments"</span><span class="p">:</span><span class="w"> </span><span class="mi">1</span><span class="w"> </span><span class="p">}</span><span class="w">
        </span><span class="p">}</span><span class="w">
      </span><span class="p">},</span><span class="w">
      </span><span class="nl">"cold"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
        </span><span class="nl">"min_age"</span><span class="p">:</span><span class="w"> </span><span class="s2">"30d"</span><span class="p">,</span><span class="w">
        </span><span class="nl">"actions"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nl">"freeze"</span><span class="p">:</span><span class="w"> </span><span class="p">{}</span><span class="w"> </span><span class="p">}</span><span class="w">
      </span><span class="p">},</span><span class="w">
      </span><span class="nl">"delete"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
        </span><span class="nl">"min_age"</span><span class="p">:</span><span class="w"> </span><span class="s2">"90d"</span><span class="p">,</span><span class="w">
        </span><span class="nl">"actions"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nl">"delete"</span><span class="p">:</span><span class="w"> </span><span class="p">{}</span><span class="w"> </span><span class="p">}</span><span class="w">
      </span><span class="p">}</span><span class="w">
    </span><span class="p">}</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>
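
<p>To make the policy timeline concrete, here is a minimal sketch that maps an index age in days to the phase it should be in. <code class="language-plaintext highlighter-rouge">ilm_phase</code> is a hypothetical helper, not an Elasticsearch command; its thresholds simply mirror the <code class="language-plaintext highlighter-rouge">min_age</code> values above. ILM checks indices periodically, so real transitions can lag slightly behind these boundaries.</p>

```shell
# ilm_phase: hypothetical helper, thresholds copied from k8s-logs-policy.
# Input: index age in whole days. Output: the phase name.
ilm_phase() {
  age_days="$1"
  if [ "$age_days" -lt 3 ]; then
    echo hot
  elif [ "$age_days" -lt 30 ]; then
    echo warm
  elif [ "$age_days" -lt 90 ]; then
    echo cold
  else
    echo delete
  fi
}

ilm_phase 45   # prints "cold"
```

<p>An index that is 45 days old is still searchable in Kibana; once it reaches 90 days, only the Azure Blob copy remains.</p>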

<h3 id="create-index-template">Create Index Template</h3>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="err">PUT</span><span class="w"> </span><span class="err">_index_template/k</span><span class="mi">8</span><span class="err">s-logs-template</span><span class="w">
</span><span class="p">{</span><span class="w">
  </span><span class="nl">"index_patterns"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">"k8s-logs-*"</span><span class="p">],</span><span class="w">
  </span><span class="nl">"template"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="nl">"settings"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
      </span><span class="nl">"number_of_shards"</span><span class="p">:</span><span class="w"> </span><span class="mi">3</span><span class="p">,</span><span class="w">
      </span><span class="nl">"number_of_replicas"</span><span class="p">:</span><span class="w"> </span><span class="mi">1</span><span class="p">,</span><span class="w">
      </span><span class="nl">"index.lifecycle.name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"k8s-logs-policy"</span><span class="p">,</span><span class="w">
      </span><span class="nl">"index.lifecycle.rollover_alias"</span><span class="p">:</span><span class="w"> </span><span class="s2">"k8s-logs"</span><span class="w">
    </span><span class="p">},</span><span class="w">
    </span><span class="nl">"mappings"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
      </span><span class="nl">"properties"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
        </span><span class="nl">"@timestamp"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nl">"type"</span><span class="p">:</span><span class="w"> </span><span class="s2">"date"</span><span class="w"> </span><span class="p">},</span><span class="w">
        </span><span class="nl">"message"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nl">"type"</span><span class="p">:</span><span class="w"> </span><span class="s2">"text"</span><span class="w"> </span><span class="p">},</span><span class="w">
        </span><span class="nl">"level"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nl">"type"</span><span class="p">:</span><span class="w"> </span><span class="s2">"keyword"</span><span class="w"> </span><span class="p">},</span><span class="w">
        </span><span class="nl">"kubernetes.pod_name"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nl">"type"</span><span class="p">:</span><span class="w"> </span><span class="s2">"keyword"</span><span class="w"> </span><span class="p">},</span><span class="w">
        </span><span class="nl">"kubernetes.namespace_name"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nl">"type"</span><span class="p">:</span><span class="w"> </span><span class="s2">"keyword"</span><span class="w"> </span><span class="p">},</span><span class="w">
        </span><span class="nl">"kubernetes.container_name"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nl">"type"</span><span class="p">:</span><span class="w"> </span><span class="s2">"keyword"</span><span class="w"> </span><span class="p">},</span><span class="w">
        </span><span class="nl">"cluster"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nl">"type"</span><span class="p">:</span><span class="w"> </span><span class="s2">"keyword"</span><span class="w"> </span><span class="p">},</span><span class="w">
        </span><span class="nl">"environment"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nl">"type"</span><span class="p">:</span><span class="w"> </span><span class="s2">"keyword"</span><span class="w"> </span><span class="p">}</span><span class="w">
      </span><span class="p">}</span><span class="w">
    </span><span class="p">}</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<hr />

<h2 id="kibana-dashboards">Kibana Dashboards</h2>

<h3 id="first-create-data-view">First: Create Data View</h3>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Stack Management → Data Views → Create
  Name:         k8s-logs
  Pattern:      k8s-logs-*
  Time field:   @timestamp
</code></pre></div></div>

<h3 id="recommended-dashboard-panels">Recommended Dashboard Panels</h3>

<table>
  <thead>
    <tr>
      <th>Panel</th>
      <th>Chart Type</th>
      <th>Config</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Log volume over time</td>
      <td>Line</td>
      <td>X: @timestamp, Y: count(), Breakdown: namespace_name.keyword</td>
    </tr>
    <tr>
      <td>Error count</td>
      <td>Metric (big number)</td>
      <td>Filter: <code class="language-plaintext highlighter-rouge">stream: "stderr"</code></td>
    </tr>
    <tr>
      <td>Top pods by log volume</td>
      <td>Bar vertical</td>
      <td>X: pod_name.keyword (top 10), Y: count()</td>
    </tr>
    <tr>
      <td>Logs by namespace</td>
      <td>Pie</td>
      <td>Slice: namespace_name.keyword</td>
    </tr>
    <tr>
      <td>Logs by container image</td>
      <td>Bar horizontal</td>
      <td>Y: container_image.keyword (top 10), X: count()</td>
    </tr>
    <tr>
      <td>Recent errors table</td>
      <td>Table</td>
      <td>Filter: <code class="language-plaintext highlighter-rouge">stream: "stderr"</code>, Columns: @timestamp, namespace_name, pod_name, message</td>
    </tr>
  </tbody>
</table>

<hr />

<h2 id="kql-queries-reference">KQL Queries Reference</h2>

<p>Use these in the KQL bar in <strong>Discover</strong> or on a <strong>Dashboard</strong>.</p>

<h3 id="pod-logs">Pod Logs</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Logs from specific pod</span>
pod_name.keyword: <span class="s2">"myapp-7d8f9b6c5-x2k4n"</span>

<span class="c"># Logs from specific namespace</span>
namespace_name.keyword: <span class="s2">"production"</span>

<span class="c"># Logs from specific app (by label)</span>
labels.app.keyword: <span class="s2">"payment-service"</span>

<span class="c"># Logs from specific container</span>
container_name.keyword: <span class="s2">"nginx"</span>

<span class="c"># Logs from specific node</span>
host.keyword: <span class="s2">"aks-nodepool1-12345-vmss000000"</span>

<span class="c"># Logs from specific container image</span>
container_image.keyword: <span class="s2">"myregistry.azurecr.io/myapp:v2.1"</span>
</code></pre></div></div>

<h3 id="error-hunting">Error Hunting</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># All stderr output (most reliable for errors)</span>
stream: <span class="s2">"stderr"</span>

<span class="c"># If apps log structured JSON with level field</span>
level: <span class="s2">"error"</span> OR level: <span class="s2">"ERROR"</span>

<span class="c"># Keyword search in message</span>
message: <span class="k">*</span><span class="nb">timeout</span><span class="k">*</span>
message: <span class="s2">"connection refused"</span>
message: <span class="k">*</span>OOMKilled<span class="k">*</span>
message: <span class="k">*</span>CrashLoopBackOff<span class="k">*</span>
message: <span class="k">*</span>error<span class="k">*</span>
message: <span class="k">*</span>exception<span class="k">*</span>
message: <span class="k">*</span>failed<span class="k">*</span>

<span class="c"># Probe failures</span>
message: <span class="s2">"probe failed"</span>
</code></pre></div></div>

<h3 id="combined-filters">Combined Filters</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Errors in production namespace</span>
namespace_name.keyword: <span class="s2">"production"</span> AND stream: <span class="s2">"stderr"</span>

<span class="c"># Timeout errors in specific app</span>
labels.app.keyword: <span class="s2">"api-gateway"</span> AND message: <span class="k">*</span><span class="nb">timeout</span><span class="k">*</span>

<span class="c"># All errors except from noisy pods</span>
stream: <span class="s2">"stderr"</span> AND NOT pod_name.keyword: health-checker-<span class="k">*</span>

<span class="c"># Exclude system namespaces</span>
NOT namespace_name.keyword: <span class="s2">"kube-system"</span> AND NOT namespace_name.keyword: <span class="s2">"logging"</span>

<span class="c"># Multiple namespaces</span>
namespace_name.keyword: <span class="o">(</span><span class="s2">"production"</span> OR <span class="s2">"staging"</span><span class="o">)</span>
</code></pre></div></div>

<h3 id="cluster--environment">Cluster / Environment</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Specific cluster</span>
cluster: <span class="s2">"aks-efk-cluster"</span>

<span class="c"># Specific environment</span>
environment: <span class="s2">"production"</span>
</code></pre></div></div>

<hr />

<h2 id="elasticsearch-dev-tools-queries">Elasticsearch Dev Tools Queries</h2>

<p>Run in <strong>Kibana → Dev Tools</strong>.</p>

<h3 id="health--status">Health &amp; Status</h3>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="err">#</span><span class="w"> </span><span class="err">Cluster</span><span class="w"> </span><span class="err">health</span><span class="w">
</span><span class="err">GET</span><span class="w"> </span><span class="err">_cluster/health?pretty</span><span class="w">

</span><span class="err">#</span><span class="w"> </span><span class="err">List</span><span class="w"> </span><span class="err">all</span><span class="w"> </span><span class="err">indices</span><span class="w"> </span><span class="err">with</span><span class="w"> </span><span class="err">sizes</span><span class="w">
</span><span class="err">GET</span><span class="w"> </span><span class="err">_cat/indices?v&amp;s=store.size:desc</span><span class="w">

</span><span class="err">#</span><span class="w"> </span><span class="err">Node</span><span class="w"> </span><span class="err">resource</span><span class="w"> </span><span class="err">usage</span><span class="w">
</span><span class="err">GET</span><span class="w"> </span><span class="err">_cat/nodes?v&amp;h=name,heap.percent,ram.percent,cpu,load_</span><span class="mi">1</span><span class="err">m,disk.used_percent</span><span class="w">

</span><span class="err">#</span><span class="w"> </span><span class="err">Shard</span><span class="w"> </span><span class="err">allocation</span><span class="w">
</span><span class="err">GET</span><span class="w"> </span><span class="err">_cat/shards?v</span><span class="w">

</span><span class="err">#</span><span class="w"> </span><span class="err">Disk</span><span class="w"> </span><span class="err">allocation</span><span class="w"> </span><span class="err">per</span><span class="w"> </span><span class="err">node</span><span class="w">
</span><span class="err">GET</span><span class="w"> </span><span class="err">_cat/allocation?v</span><span class="w">

</span><span class="err">#</span><span class="w"> </span><span class="err">Check</span><span class="w"> </span><span class="err">ILM</span><span class="w"> </span><span class="err">policy</span><span class="w">
</span><span class="err">GET</span><span class="w"> </span><span class="err">_ilm/policy/k</span><span class="mi">8</span><span class="err">s-logs-policy</span><span class="w">
</span></code></pre></div></div>

<h3 id="search-queries">Search Queries</h3>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="err">#</span><span class="w"> </span><span class="err">Count</span><span class="w"> </span><span class="err">total</span><span class="w"> </span><span class="err">logs</span><span class="w">
</span><span class="err">GET</span><span class="w"> </span><span class="err">k</span><span class="mi">8</span><span class="err">s-logs-*/_count</span><span class="w">

</span><span class="err">#</span><span class="w"> </span><span class="err">Latest</span><span class="w"> </span><span class="mi">5</span><span class="w"> </span><span class="err">logs</span><span class="w">
</span><span class="err">GET</span><span class="w"> </span><span class="err">k</span><span class="mi">8</span><span class="err">s-logs-*/_search</span><span class="w">
</span><span class="p">{</span><span class="w">
  </span><span class="nl">"size"</span><span class="p">:</span><span class="w"> </span><span class="mi">5</span><span class="p">,</span><span class="w">
  </span><span class="nl">"sort"</span><span class="p">:</span><span class="w"> </span><span class="p">[{</span><span class="nl">"@timestamp"</span><span class="p">:</span><span class="w"> </span><span class="s2">"desc"</span><span class="p">}]</span><span class="w">
</span><span class="p">}</span><span class="w">

</span><span class="err">#</span><span class="w"> </span><span class="err">Search</span><span class="w"> </span><span class="err">for</span><span class="w"> </span><span class="err">errors</span><span class="w">
</span><span class="err">GET</span><span class="w"> </span><span class="err">k</span><span class="mi">8</span><span class="err">s-logs-*/_search</span><span class="w">
</span><span class="p">{</span><span class="w">
  </span><span class="nl">"size"</span><span class="p">:</span><span class="w"> </span><span class="mi">10</span><span class="p">,</span><span class="w">
  </span><span class="nl">"query"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="nl">"match"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nl">"stream"</span><span class="p">:</span><span class="w"> </span><span class="s2">"stderr"</span><span class="w"> </span><span class="p">}</span><span class="w">
  </span><span class="p">},</span><span class="w">
  </span><span class="nl">"sort"</span><span class="p">:</span><span class="w"> </span><span class="p">[{</span><span class="nl">"@timestamp"</span><span class="p">:</span><span class="w"> </span><span class="s2">"desc"</span><span class="p">}]</span><span class="w">
</span><span class="p">}</span><span class="w">

</span><span class="err">#</span><span class="w"> </span><span class="err">Logs</span><span class="w"> </span><span class="err">from</span><span class="w"> </span><span class="err">specific</span><span class="w"> </span><span class="err">pod</span><span class="w">
</span><span class="err">GET</span><span class="w"> </span><span class="err">k</span><span class="mi">8</span><span class="err">s-logs-*/_search</span><span class="w">
</span><span class="p">{</span><span class="w">
  </span><span class="nl">"size"</span><span class="p">:</span><span class="w"> </span><span class="mi">10</span><span class="p">,</span><span class="w">
  </span><span class="nl">"query"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="nl">"term"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nl">"pod_name.keyword"</span><span class="p">:</span><span class="w"> </span><span class="s2">"myapp-xyz"</span><span class="w"> </span><span class="p">}</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">

</span><span class="err">#</span><span class="w"> </span><span class="err">Full-text</span><span class="w"> </span><span class="err">search</span><span class="w"> </span><span class="err">in</span><span class="w"> </span><span class="err">message</span><span class="w">
</span><span class="err">GET</span><span class="w"> </span><span class="err">k</span><span class="mi">8</span><span class="err">s-logs-*/_search</span><span class="w">
</span><span class="p">{</span><span class="w">
  </span><span class="nl">"size"</span><span class="p">:</span><span class="w"> </span><span class="mi">10</span><span class="p">,</span><span class="w">
  </span><span class="nl">"query"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="nl">"match"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nl">"message"</span><span class="p">:</span><span class="w"> </span><span class="s2">"timeout"</span><span class="w"> </span><span class="p">}</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>
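
<p>Because <code class="language-plaintext highlighter-rouge">Logstash_Format On</code> creates one index per day, a request can target a single day's index instead of <code class="language-plaintext highlighter-rouge">k8s-logs-*</code>. The sketch below builds that index name from a prefix and a date. <code class="language-plaintext highlighter-rouge">daily_index</code> is a hypothetical helper, and it assumes the default date suffix (<code class="language-plaintext highlighter-rouge">YYYY.MM.DD</code>) shown in the Fluent Bit output section.</p>

```shell
# daily_index: hypothetical helper.
# Input: index prefix and a date as YYYY-MM-DD. Output: the daily index name.
daily_index() {
  prefix="$1"
  day="$2"
  year=${day%%-*}      # text before the first "-"
  rest=${day#*-}       # text after the first "-"
  month=${rest%%-*}
  dom=${rest#*-}
  printf '%s-%s.%s.%s\n' "$prefix" "$year" "$month" "$dom"
}

daily_index k8s-logs 2026-03-24   # prints k8s-logs-2026.03.24
```

<p>The result can be used directly in Dev Tools, for example <code class="language-plaintext highlighter-rouge">GET k8s-logs-2026.03.24/_count</code>.</p>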

<h3 id="aggregations-analytics">Aggregations (Analytics)</h3>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="err">#</span><span class="w"> </span><span class="err">Error</span><span class="w"> </span><span class="err">count</span><span class="w"> </span><span class="err">per</span><span class="w"> </span><span class="err">namespace</span><span class="w">
</span><span class="err">GET</span><span class="w"> </span><span class="err">k</span><span class="mi">8</span><span class="err">s-logs-*/_search</span><span class="w">
</span><span class="p">{</span><span class="w">
  </span><span class="nl">"size"</span><span class="p">:</span><span class="w"> </span><span class="mi">0</span><span class="p">,</span><span class="w">
  </span><span class="nl">"query"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nl">"match"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nl">"stream"</span><span class="p">:</span><span class="w"> </span><span class="s2">"stderr"</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="p">},</span><span class="w">
  </span><span class="nl">"aggs"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="nl">"by_namespace"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
      </span><span class="nl">"terms"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nl">"field"</span><span class="p">:</span><span class="w"> </span><span class="s2">"namespace_name.keyword"</span><span class="w"> </span><span class="p">}</span><span class="w">
    </span><span class="p">}</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">

</span><span class="err">#</span><span class="w"> </span><span class="err">Top</span><span class="w"> </span><span class="mi">10</span><span class="w"> </span><span class="err">pods</span><span class="w"> </span><span class="err">by</span><span class="w"> </span><span class="err">log</span><span class="w"> </span><span class="err">volume</span><span class="w">
</span><span class="err">GET</span><span class="w"> </span><span class="err">k</span><span class="mi">8</span><span class="err">s-logs-*/_search</span><span class="w">
</span><span class="p">{</span><span class="w">
  </span><span class="nl">"size"</span><span class="p">:</span><span class="w"> </span><span class="mi">0</span><span class="p">,</span><span class="w">
  </span><span class="nl">"aggs"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="nl">"top_pods"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
      </span><span class="nl">"terms"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
        </span><span class="nl">"field"</span><span class="p">:</span><span class="w"> </span><span class="s2">"pod_name.keyword"</span><span class="p">,</span><span class="w">
        </span><span class="nl">"size"</span><span class="p">:</span><span class="w"> </span><span class="mi">10</span><span class="w">
      </span><span class="p">}</span><span class="w">
    </span><span class="p">}</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">

</span><span class="err">#</span><span class="w"> </span><span class="err">Log</span><span class="w"> </span><span class="err">count</span><span class="w"> </span><span class="err">per</span><span class="w"> </span><span class="err">hour</span><span class="w"> </span><span class="err">(histogram)</span><span class="w">
</span><span class="err">GET</span><span class="w"> </span><span class="err">k</span><span class="mi">8</span><span class="err">s-logs-*/_search</span><span class="w">
</span><span class="p">{</span><span class="w">
  </span><span class="nl">"size"</span><span class="p">:</span><span class="w"> </span><span class="mi">0</span><span class="p">,</span><span class="w">
  </span><span class="nl">"aggs"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="nl">"logs_over_time"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
      </span><span class="nl">"date_histogram"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
        </span><span class="nl">"field"</span><span class="p">:</span><span class="w"> </span><span class="s2">"@timestamp"</span><span class="p">,</span><span class="w">
        </span><span class="nl">"calendar_interval"</span><span class="p">:</span><span class="w"> </span><span class="s2">"hour"</span><span class="w">
      </span><span class="p">}</span><span class="w">
    </span><span class="p">}</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">

</span><span class="err">#</span><span class="w"> </span><span class="err">Unique</span><span class="w"> </span><span class="err">pod</span><span class="w"> </span><span class="err">count</span><span class="w"> </span><span class="err">per</span><span class="w"> </span><span class="err">namespace</span><span class="w">
</span><span class="err">GET</span><span class="w"> </span><span class="err">k</span><span class="mi">8</span><span class="err">s-logs-*/_search</span><span class="w">
</span><span class="p">{</span><span class="w">
  </span><span class="nl">"size"</span><span class="p">:</span><span class="w"> </span><span class="mi">0</span><span class="p">,</span><span class="w">
  </span><span class="nl">"aggs"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="nl">"by_namespace"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
      </span><span class="nl">"terms"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nl">"field"</span><span class="p">:</span><span class="w"> </span><span class="s2">"namespace_name.keyword"</span><span class="w"> </span><span class="p">},</span><span class="w">
      </span><span class="nl">"aggs"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
        </span><span class="nl">"unique_pods"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
          </span><span class="nl">"cardinality"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nl">"field"</span><span class="p">:</span><span class="w"> </span><span class="s2">"pod_name.keyword"</span><span class="w"> </span><span class="p">}</span><span class="w">
        </span><span class="p">}</span><span class="w">
      </span><span class="p">}</span><span class="w">
    </span><span class="p">}</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<h3 id="disk-management">Disk Management</h3>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="err">#</span><span class="w"> </span><span class="err">Emergency:</span><span class="w"> </span><span class="err">relax</span><span class="w"> </span><span class="err">disk</span><span class="w"> </span><span class="err">watermarks</span><span class="w">
</span><span class="err">PUT</span><span class="w"> </span><span class="err">_cluster/settings</span><span class="w">
</span><span class="p">{</span><span class="w">
  </span><span class="nl">"transient"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="nl">"cluster.routing.allocation.disk.watermark.low"</span><span class="p">:</span><span class="w"> </span><span class="s2">"90%"</span><span class="p">,</span><span class="w">
    </span><span class="nl">"cluster.routing.allocation.disk.watermark.high"</span><span class="p">:</span><span class="w"> </span><span class="s2">"95%"</span><span class="p">,</span><span class="w">
    </span><span class="nl">"cluster.routing.allocation.disk.watermark.flood_stage"</span><span class="p">:</span><span class="w"> </span><span class="s2">"97%"</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">

</span><span class="err">#</span><span class="w"> </span><span class="err">Delete old indices (Elasticsearch 8+ rejects wildcard deletes unless action.destructive_requires_name is false)</span><span class="w">
</span><span class="err">DELETE</span><span class="w"> </span><span class="err">k</span><span class="mi">8</span><span class="err">s-logs</span><span class="mf">-2025.12</span><span class="err">.*</span><span class="w">

</span><span class="err">#</span><span class="w"> </span><span class="err">Check</span><span class="w"> </span><span class="err">index</span><span class="w"> </span><span class="err">size</span><span class="w">
</span><span class="err">GET</span><span class="w"> </span><span class="err">_cat/indices/k</span><span class="mi">8</span><span class="err">s-logs-*?v&amp;s=store.size:desc&amp;h=index,store.size,docs.count</span><span class="w">
</span></code></pre></div></div>
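
<p>The relaxed watermarks are transient overrides, so it is usually worth reverting them once space has been freed. A minimal sketch, assuming the same <code class="language-plaintext highlighter-rouge">elastic</code> credentials and local port-forward that the other <code class="language-plaintext highlighter-rouge">curl</code> examples in this guide rely on; setting a value to <code class="language-plaintext highlighter-rouge">null</code> restores the Elasticsearch default:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Reset the emergency watermarks back to their defaults
curl -k -u elastic:&lt;pass&gt; -X PUT "https://localhost:9200/_cluster/settings" \
  -H "Content-Type: application/json" \
  -d '{
    "transient": {
      "cluster.routing.allocation.disk.watermark.low": null,
      "cluster.routing.allocation.disk.watermark.high": null,
      "cluster.routing.allocation.disk.watermark.flood_stage": null
    }
  }'

# Confirm that no transient overrides remain
curl -k -u elastic:&lt;pass&gt; "https://localhost:9200/_cluster/settings?pretty"
</code></pre></div></div>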

<hr />

<h2 id="security">Security</h2>

<h3 id="elasticsearch-rbac">Elasticsearch RBAC</h3>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="err">#</span><span class="w"> </span><span class="err">Writer role for Fluent Bit (the "manage" privilege goes beyond write-only; drop it if Fluent Bit only needs to index documents)</span><span class="w">
</span><span class="err">POST</span><span class="w"> </span><span class="err">_security/role/fluent_bit_writer</span><span class="w">
</span><span class="p">{</span><span class="w">
  </span><span class="nl">"cluster"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">"monitor"</span><span class="p">,</span><span class="w"> </span><span class="s2">"manage_index_templates"</span><span class="p">,</span><span class="w"> </span><span class="s2">"manage_ilm"</span><span class="p">],</span><span class="w">
  </span><span class="nl">"indices"</span><span class="p">:</span><span class="w"> </span><span class="p">[{</span><span class="w">
    </span><span class="nl">"names"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">"k8s-logs-*"</span><span class="p">,</span><span class="w"> </span><span class="s2">"node-logs-*"</span><span class="p">],</span><span class="w">
    </span><span class="nl">"privileges"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">"create_index"</span><span class="p">,</span><span class="w"> </span><span class="s2">"create"</span><span class="p">,</span><span class="w"> </span><span class="s2">"write"</span><span class="p">,</span><span class="w"> </span><span class="s2">"manage"</span><span class="p">]</span><span class="w">
  </span><span class="p">}]</span><span class="w">
</span><span class="p">}</span><span class="w">

</span><span class="err">#</span><span class="w"> </span><span class="err">Read-only</span><span class="w"> </span><span class="err">role</span><span class="w"> </span><span class="err">for</span><span class="w"> </span><span class="err">developers</span><span class="w">
</span><span class="err">POST</span><span class="w"> </span><span class="err">_security/role/log_reader</span><span class="w">
</span><span class="p">{</span><span class="w">
  </span><span class="nl">"indices"</span><span class="p">:</span><span class="w"> </span><span class="p">[{</span><span class="w">
    </span><span class="nl">"names"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">"k8s-logs-*"</span><span class="p">],</span><span class="w">
    </span><span class="nl">"privileges"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">"read"</span><span class="p">,</span><span class="w"> </span><span class="s2">"view_index_metadata"</span><span class="p">]</span><span class="w">
  </span><span class="p">}]</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>
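
<p>A role only takes effect once a user or API key is mapped to it. The sketch below creates a dedicated user through the Elasticsearch security API and checks that it can authenticate. The user name <code class="language-plaintext highlighter-rouge">fluent_bit</code> and the password placeholder are assumptions for illustration, not values taken from this setup:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Create a user bound to the fluent_bit_writer role (user name is hypothetical)
curl -k -u elastic:&lt;pass&gt; -X POST "https://localhost:9200/_security/user/fluent_bit" \
  -H "Content-Type: application/json" \
  -d '{
    "password": "&lt;strong-password&gt;",
    "roles": ["fluent_bit_writer"]
  }'

# Verify the new credentials and the roles attached to them
curl -k -u fluent_bit:&lt;strong-password&gt; "https://localhost:9200/_security/_authenticate?pretty"
</code></pre></div></div>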

<h3 id="network-policy">Network Policy</h3>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">apiVersion</span><span class="pi">:</span> <span class="s">networking.k8s.io/v1</span>
<span class="na">kind</span><span class="pi">:</span> <span class="s">NetworkPolicy</span>
<span class="na">metadata</span><span class="pi">:</span>
  <span class="na">name</span><span class="pi">:</span> <span class="s">elasticsearch-allow</span>
  <span class="na">namespace</span><span class="pi">:</span> <span class="s">logging</span>
<span class="na">spec</span><span class="pi">:</span>
  <span class="na">podSelector</span><span class="pi">:</span>
    <span class="na">matchLabels</span><span class="pi">:</span>
      <span class="na">elasticsearch.k8s.elastic.co/cluster-name</span><span class="pi">:</span> <span class="s">efk-cluster</span>
  <span class="na">ingress</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="na">from</span><span class="pi">:</span>
        <span class="pi">-</span> <span class="na">podSelector</span><span class="pi">:</span>
            <span class="na">matchLabels</span><span class="pi">:</span>
              <span class="na">app.kubernetes.io/name</span><span class="pi">:</span> <span class="s">fluent-bit</span>
      <span class="na">ports</span><span class="pi">:</span>
        <span class="pi">-</span> <span class="na">port</span><span class="pi">:</span> <span class="m">9200</span>
    <span class="pi">-</span> <span class="na">from</span><span class="pi">:</span>
        <span class="pi">-</span> <span class="na">podSelector</span><span class="pi">:</span>
            <span class="na">matchLabels</span><span class="pi">:</span>
              <span class="na">kibana.k8s.elastic.co/name</span><span class="pi">:</span> <span class="s">efk-kibana</span>
      <span class="na">ports</span><span class="pi">:</span>
        <span class="pi">-</span> <span class="na">port</span><span class="pi">:</span> <span class="m">9200</span>
    <span class="pi">-</span> <span class="na">from</span><span class="pi">:</span>
        <span class="pi">-</span> <span class="na">podSelector</span><span class="pi">:</span>
            <span class="na">matchLabels</span><span class="pi">:</span>
              <span class="na">elasticsearch.k8s.elastic.co/cluster-name</span><span class="pi">:</span> <span class="s">efk-cluster</span>
      <span class="na">ports</span><span class="pi">:</span>
        <span class="pi">-</span> <span class="na">port</span><span class="pi">:</span> <span class="m">9300</span>
</code></pre></div></div>
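
<p>A NetworkPolicy is only enforced when the cluster runs a network policy engine, so it can help to confirm that before relying on the rules above. A short sketch, assuming the manifest is saved as <code class="language-plaintext highlighter-rouge">elasticsearch-networkpolicy.yaml</code> (a hypothetical file name) and that <code class="language-plaintext highlighter-rouge">$RG</code> and <code class="language-plaintext highlighter-rouge">$CLUSTER</code> are set as in the AKS commands later in this guide:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Apply the policy and review the rules Kubernetes recorded
kubectl apply -f elasticsearch-networkpolicy.yaml
kubectl -n logging describe networkpolicy elasticsearch-allow

# Show which network policy engine, if any, the AKS cluster uses
az aks show --resource-group $RG --name $CLUSTER --query networkProfile.networkPolicy -o tsv
</code></pre></div></div>

<p>If the last command prints nothing, the policy is likely stored but not enforced.</p>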

<hr />

<h2 id="scaling--ha">Scaling &amp; HA</h2>

<h3 id="pod-disruption-budgets">Pod Disruption Budgets</h3>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">apiVersion</span><span class="pi">:</span> <span class="s">policy/v1</span>
<span class="na">kind</span><span class="pi">:</span> <span class="s">PodDisruptionBudget</span>
<span class="na">metadata</span><span class="pi">:</span>
  <span class="na">name</span><span class="pi">:</span> <span class="s">elasticsearch-pdb</span>
  <span class="na">namespace</span><span class="pi">:</span> <span class="s">logging</span>
<span class="na">spec</span><span class="pi">:</span>
  <span class="na">maxUnavailable</span><span class="pi">:</span> <span class="m">1</span>
  <span class="na">selector</span><span class="pi">:</span>
    <span class="na">matchLabels</span><span class="pi">:</span>
      <span class="na">elasticsearch.k8s.elastic.co/cluster-name</span><span class="pi">:</span> <span class="s">efk-cluster</span>
</code></pre></div></div>
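
<p>ECK typically creates and manages its own default PodDisruptionBudget for each Elasticsearch resource, and evictions can fail when more than one budget selects the same pod. Before applying a custom budget, a quick check along these lines (assuming the <code class="language-plaintext highlighter-rouge">logging</code> namespace used throughout) shows what already exists:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># List every PodDisruptionBudget in the namespace, including any created by the operator
kubectl -n logging get pdb

# Inspect the selector and allowed disruptions of the custom budget
kubectl -n logging describe pdb elasticsearch-pdb
</code></pre></div></div>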

<h3 id="storage-classes">Storage Classes</h3>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Production: Premium SSD</span>
<span class="na">storageClassName</span><span class="pi">:</span> <span class="s">managed-premium</span>

<span class="c1"># Dev/Test: Standard SSD (cheaper)</span>
<span class="na">storageClassName</span><span class="pi">:</span> <span class="s">managed</span>

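<span class="c1"># Cluster default: on current AKS versions this is also Standard SSD; Standard HDD needs a custom StorageClass (skuName: Standard_LRS)</span>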
<span class="na">storageClassName</span><span class="pi">:</span> <span class="s">default</span>
</code></pre></div></div>

<h3 id="volume-expansion-when-disk-fills-up">Volume Expansion (when disk fills up)</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># StorageClass must have: allowVolumeExpansion: true</span>
kubectl <span class="nt">-n</span> logging patch pvc elasticsearch-data-efk-cluster-es-data-0 <span class="se">\</span>
  <span class="nt">-p</span> <span class="s1">'{"spec":{"resources":{"requests":{"storage":"200Gi"}}}}'</span>
</code></pre></div></div>
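
<p>The patch only succeeds when the StorageClass permits expansion, and the new size can take a while to show up. A minimal sketch for checking both, assuming the <code class="language-plaintext highlighter-rouge">managed-premium</code> class and the PVC name from the command above:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Check whether the StorageClass allows expansion
kubectl get storageclass managed-premium -o jsonpath='{.allowVolumeExpansion}'

# After patching, watch the PVC until its capacity matches the request
kubectl -n logging get pvc elasticsearch-data-efk-cluster-es-data-0 -w

# Resize progress and any errors appear in the PVC events
kubectl -n logging describe pvc elasticsearch-data-efk-cluster-es-data-0
</code></pre></div></div>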

<hr />

<h2 id="monitoring-the-stack">Monitoring the Stack</h2>

<h3 id="key-metrics">Key Metrics</h3>

<table>
  <thead>
    <tr>
      <th>Metric</th>
      <th>Alert Threshold</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Fluent Bit output errors</td>
      <td>Any increase</td>
    </tr>
    <tr>
      <td>Fluent Bit retry count</td>
      <td>&gt; 0 sustained</td>
    </tr>
    <tr>
      <td>ES cluster health</td>
      <td>Yellow = warning, Red = critical</td>
    </tr>
    <tr>
      <td>ES JVM heap usage</td>
      <td>&gt; 85%</td>
    </tr>
    <tr>
      <td>ES disk usage</td>
      <td>&gt; 80%</td>
    </tr>
  </tbody>
</table>

<h3 id="fluent-bit-metrics">Fluent Bit Metrics</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl <span class="nt">-n</span> logging <span class="nb">exec</span> <span class="nt">-it</span> &lt;fb-pod&gt; <span class="nt">--</span> curl http://localhost:2020/api/v1/metrics
kubectl <span class="nt">-n</span> logging <span class="nb">exec</span> <span class="nt">-it</span> &lt;fb-pod&gt; <span class="nt">--</span> curl http://localhost:2020/api/v1/health
</code></pre></div></div>
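
<p>If the Fluent Bit image does not ship <code class="language-plaintext highlighter-rouge">curl</code> (minimal images often omit it), the same endpoints can be reached through a port-forward instead. A sketch, assuming the built-in HTTP server listens on port 2020 as in the commands above:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Forward the Fluent Bit HTTP server to the local machine
kubectl -n logging port-forward &lt;fb-pod&gt; 2020:2020

# In a second terminal: JSON metrics and the Prometheus-format equivalent
curl -s http://localhost:2020/api/v1/metrics
curl -s http://localhost:2020/api/v1/metrics/prometheus
</code></pre></div></div>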

<hr />

<h2 id="troubleshooting">Troubleshooting</h2>

<h3 id="fluent-bit-not-sending-logs">Fluent Bit not sending logs</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl <span class="nt">-n</span> logging logs <span class="nt">-l</span> app.kubernetes.io/name<span class="o">=</span>fluent-bit <span class="nt">--tail</span><span class="o">=</span>50
kubectl <span class="nt">-n</span> logging <span class="nb">exec</span> <span class="nt">-it</span> &lt;fb-pod&gt; <span class="nt">--</span> curl <span class="nt">-k</span> https://efk-cluster-es-http:9200
</code></pre></div></div>

<h3 id="pod-wont-schedule-insufficient-cpu">Pod won’t schedule (Insufficient CPU)</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl describe node &lt;node-name&gt; | <span class="nb">grep</span> <span class="nt">-A</span> 5 <span class="s2">"Allocated resources"</span>
<span class="c"># Fix: Lower resource requests or scale up node</span>
</code></pre></div></div>
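
<p>To find out which pods are stuck and why, the scheduler events are usually the quickest source. A short sketch, assuming the stack runs in the <code class="language-plaintext highlighter-rouge">logging</code> namespace:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># List pods that have not been scheduled yet
kubectl -n logging get pods --field-selector=status.phase=Pending

# The Events section names the resource the scheduler could not satisfy
kubectl -n logging describe pod &lt;pod-name&gt; | grep -A 10 "Events"

# Compare against what each node can actually allocate
kubectl get nodes -o custom-columns=NAME:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory
</code></pre></div></div>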

<h3 id="kibana-oom-javascript-heap-out-of-memory">Kibana OOM (JavaScript heap out of memory)</h3>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">env</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">NODE_OPTIONS</span>
    <span class="na">value</span><span class="pi">:</span> <span class="s2">"</span><span class="s">--max-old-space-size=768"</span>
</code></pre></div></div>

<h3 id="elasticsearch-disk-full">Elasticsearch disk full</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Check disk</span>
curl <span class="nt">-k</span> <span class="nt">-u</span> elastic:&lt;pass&gt; <span class="s2">"https://localhost:9200/_cat/allocation?v"</span>
<span class="c"># Delete old indices or expand PVC</span>
</code></pre></div></div>

<h3 id="no-level-field-in-kibana">No <code class="language-plaintext highlighter-rouge">level</code> field in Kibana</h3>

<p>Filter on <code class="language-plaintext highlighter-rouge">stream: "stderr"</code> instead. The <code class="language-plaintext highlighter-rouge">level</code> field only exists when applications emit structured JSON logs that include a level key.</p>

<hr />

<h2 id="cost-optimization">Cost Optimization</h2>

<h3 id="resource-sizing">Resource Sizing</h3>

<table>
  <thead>
    <tr>
      <th>Environment</th>
      <th>ES Spec</th>
      <th>Kibana Spec</th>
      <th>Storage</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Dev/Demo (2 CPU/8GB)</td>
      <td>1 pod, 200m CPU, 1.5Gi mem</td>
      <td>1 pod, 50m CPU, 512Mi mem</td>
      <td>5-10Gi <code class="language-plaintext highlighter-rouge">managed</code></td>
    </tr>
    <tr>
      <td>Production Small</td>
      <td>3+3 pods, 500m CPU, 4Gi mem</td>
      <td>2 pods, 200m CPU, 1Gi mem</td>
      <td>100Gi <code class="language-plaintext highlighter-rouge">managed-premium</code></td>
    </tr>
    <tr>
      <td>Production Large</td>
      <td>3+6 pods, 2 CPU, 8Gi mem</td>
      <td>3 pods, 500m CPU, 2Gi mem</td>
      <td>500Gi+ <code class="language-plaintext highlighter-rouge">managed-premium</code></td>
    </tr>
  </tbody>
</table>

<h3 id="reduce-log-volume">Reduce Log Volume</h3>

<div class="language-ini highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Drop debug logs
</span><span class="nn">[FILTER]</span>
    <span class="err">Name</span>    <span class="err">grep</span>
    <span class="err">Match</span>   <span class="err">kube.*</span>
    <span class="err">Exclude</span> <span class="err">log</span> <span class="py">level</span><span class="p">=</span><span class="s">debug</span>

<span class="c"># Drop health check logs
</span><span class="nn">[FILTER]</span>
    <span class="err">Name</span>    <span class="err">grep</span>
    <span class="err">Match</span>   <span class="err">kube.*</span>
    <span class="err">Exclude</span> <span class="err">log</span> <span class="err">GET</span> <span class="err">/healthz</span>
</code></pre></div></div>

<hr />

<h2 id="useful-commands">Useful Commands</h2>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># ── Elasticsearch ──</span>
kubectl <span class="nt">-n</span> logging get elasticsearch                         <span class="c"># cluster status</span>
kubectl <span class="nt">-n</span> logging get pods <span class="nt">-l</span> elasticsearch.k8s.elastic.co/cluster-name<span class="o">=</span>efk-cluster
curl <span class="nt">-k</span> <span class="nt">-u</span> elastic:&lt;pass&gt; <span class="s2">"https://localhost:9200/_cluster/health?pretty"</span>
curl <span class="nt">-k</span> <span class="nt">-u</span> elastic:&lt;pass&gt; <span class="s2">"https://localhost:9200/_cat/indices?v&amp;s=store.size:desc"</span>
curl <span class="nt">-k</span> <span class="nt">-u</span> elastic:&lt;pass&gt; <span class="s2">"https://localhost:9200/_cat/nodes?v&amp;h=name,heap.percent,ram.percent,cpu,disk.used_percent"</span>

<span class="c"># ── Kibana ──</span>
kubectl <span class="nt">-n</span> logging get kibana
kubectl <span class="nt">-n</span> logging port-forward svc/efk-kibana-kb-http 5601

<span class="c"># ── Fluent Bit ──</span>
kubectl <span class="nt">-n</span> logging get pods <span class="nt">-l</span> app.kubernetes.io/name<span class="o">=</span>fluent-bit
kubectl <span class="nt">-n</span> logging logs <span class="nt">-l</span> app.kubernetes.io/name<span class="o">=</span>fluent-bit <span class="nt">--tail</span><span class="o">=</span>20
kubectl <span class="nt">-n</span> logging <span class="nb">exec</span> <span class="nt">-it</span> &lt;fb-pod&gt; <span class="nt">--</span> curl http://localhost:2020/api/v1/metrics

<span class="c"># ── Credentials ──</span>
kubectl <span class="nt">-n</span> logging get secret efk-cluster-es-elastic-user <span class="nt">-o</span> <span class="nv">jsonpath</span><span class="o">=</span><span class="s1">'{.data.elastic}'</span> | <span class="nb">base64</span> <span class="nt">-d</span>

<span class="c"># ── AKS ──</span>
az aks nodepool list <span class="nt">--resource-group</span> <span class="nv">$RG</span> <span class="nt">--cluster-name</span> <span class="nv">$CLUSTER</span> <span class="nt">-o</span> table
kubectl describe node &lt;node&gt; | <span class="nb">grep</span> <span class="nt">-A</span> 5 <span class="s2">"Allocated resources"</span>
kubectl top pods <span class="nt">-n</span> logging
</code></pre></div></div>

<hr />

<h2 id="license">License</h2>

<p>MIT</p>]]></content><author><name>Amit Kumar</name></author><category term="Azure" /><category term="Kubernetes" /><category term="DevOps" /><category term="Observability" /><category term="AKS" /><category term="EFK" /><category term="Elasticsearch" /><category term="Fluent Bit" /><category term="Kibana" /><category term="ECK" /><category term="Logging" /><category term="ILM" /><category term="Azure Blob" /><summary type="html"><![CDATA[EFK Stack on AKS — Production Ready Setup]]></summary></entry><entry><title type="html">Secure AKS Ingress using Application Gateway and Let’s Encrypt</title><link href="http://itsamit.online/azure/kubernetes/devops/2026/03/08/aks-agic-letsencrypt.html" rel="alternate" type="text/html" title="Secure AKS Ingress using Application Gateway and Let’s Encrypt" /><published>2026-03-08T00:00:00+00:00</published><updated>2026-03-08T00:00:00+00:00</updated><id>http://itsamit.online/azure/kubernetes/devops/2026/03/08/aks-agic-letsencrypt</id><content type="html" xml:base="http://itsamit.online/azure/kubernetes/devops/2026/03/08/aks-agic-letsencrypt.html"><![CDATA[<h1 id="secure-aks-ingress-using-application-gateway-and-lets-encrypt">Secure AKS Ingress using Application Gateway and Let’s Encrypt</h1>

<h2 id="introduction">Introduction</h2>

<p>When deploying applications in Azure Kubernetes Service (AKS), exposing
services to the internet securely is an important requirement.</p>

<p>Azure provides Application Gateway Ingress Controller (AGIC) which
allows Azure Application Gateway to function as an ingress controller
for Kubernetes workloads.</p>

<p>In this guide we will cover the complete workflow:</p>

<ol>
  <li>Enable Application Gateway Ingress Controller</li>
  <li>Deploy a demo NGINX application</li>
  <li>Expose the application using Kubernetes Ingress</li>
  <li>Verify application access over HTTP</li>
  <li>Install cert-manager</li>
  <li>Configure Let’s Encrypt TLS certificates</li>
  <li>Enable HTTPS for secure access</li>
</ol>

<p>By the end of this guide, your application will be accessible securely
using HTTPS.</p>

<hr />

<h1 id="architecture-overview">Architecture Overview</h1>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>User
↓
DNS (test.yourdomain.com)
↓
Azure Application Gateway
↓
Application Gateway Ingress Controller (AGIC)
↓
Kubernetes Service
↓
NGINX Pod
</code></pre></div></div>

<p>For HTTPS certificate issuance:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cert-manager 
    ↓ 
Let's Encrypt 
    ↓
TLS Certificate 
    ↓ 
Application Gateway HTTPS Listener
</code></pre></div></div>

<hr />

<h1 id="prerequisites">Prerequisites</h1>

<p>Before starting this tutorial, ensure the following requirements are
satisfied:</p>

<ul>
  <li>An Azure Kubernetes Service (AKS) cluster is already created and
running</li>
  <li>You have access to Azure Portal</li>
  <li>kubectl is installed and connected to the cluster</li>
  <li>helm is installed</li>
  <li>A domain name (required later for HTTPS)</li>
</ul>

<p>Verify cluster connectivity:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl get nodes
</code></pre></div></div>

<hr />

<h1 id="step-1--enable-application-gateway-ingress">Step 1 — Enable Application Gateway Ingress</h1>

<p>The first step is to enable Application Gateway Ingress Controller
(AGIC).</p>

<p>Steps:</p>

<ol>
  <li>Navigate to Azure Portal</li>
  <li>Open your AKS Cluster</li>
  <li>Select Networking</li>
  <li>Locate Application Gateway Ingress</li>
  <li>Click Enable Application Gateway Ingress</li>
</ol>

<p>Azure will automatically deploy the AGIC controller inside the AKS
cluster.
<img width="826" height="479" alt="Screenshot from 2026-03-08 17-11-13" src="https://github.com/user-attachments/assets/a548d4ba-d7a6-45f6-846d-2732588c0b7d" /></p>
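
<p>The same add-on can also be enabled from the Azure CLI, which is easier to repeat than portal clicks. A minimal sketch: every value in angle brackets is a placeholder, and the subnet CIDR is only an example that must be replaced with a free range in your virtual network:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>az aks enable-addons \
  --resource-group &lt;resource-group&gt; \
  --name &lt;aks-cluster-name&gt; \
  --addons ingress-appgw \
  --appgw-name &lt;app-gateway-name&gt; \
  --appgw-subnet-cidr "10.225.0.0/16"
</code></pre></div></div>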

<hr />

<h1 id="step-2--verify-agic-controller">Step 2 — Verify AGIC Controller</h1>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl get pods -n kube-system
</code></pre></div></div>
<p>Expected output:</p>

<p><code class="language-plaintext highlighter-rouge">ingress-appgw-xxxxx Running</code></p>
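
<p>On a busy cluster the <code class="language-plaintext highlighter-rouge">kube-system</code> listing is long, so filtering helps. The sketch below assumes the add-on deployment is named <code class="language-plaintext highlighter-rouge">ingress-appgw-deployment</code>; adjust the name if your cluster shows something different:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl get pods -n kube-system | grep ingress-appgw
kubectl logs -n kube-system deployment/ingress-appgw-deployment --tail=20
</code></pre></div></div>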

<hr />

<h1 id="step-3--deploy-demo-nginx-application">Step 3 — Deploy Demo NGINX Application</h1>

<p>Create <code class="language-plaintext highlighter-rouge">nginx-deployment.yaml</code>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-demo
  template:
    metadata:
      labels:
        app: nginx-demo
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
</code></pre></div></div>
<p>Apply:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl apply -f nginx-deployment.yaml
</code></pre></div></div>
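<p>Before moving on, it is worth confirming that the pod actually started. A quick check using the deployment name and label from the manifest above:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl rollout status deployment/nginx-demo
kubectl get pods -l app=nginx-demo
</code></pre></div></div>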
<hr />

<h1 id="step-4--create-kubernetes-service">Step 4 — Create Kubernetes Service</h1>

<p>Create <code class="language-plaintext highlighter-rouge">service.yaml</code>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apiVersion: v1
kind: Service
metadata:
  name: nginx-demo-service
spec:
  selector:
    app: nginx-demo
  ports:
  - port: 80
    targetPort: 80
</code></pre></div></div>
<p>Apply:</p>

<p><code class="language-plaintext highlighter-rouge">kubectl apply -f service.yaml</code></p>
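
<p>A Service with a mismatched selector is a common reason for an empty backend pool on the gateway. A short check that the Service exists and has picked up the NGINX pod as an endpoint:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl get service nginx-demo-service
kubectl get endpoints nginx-demo-service
</code></pre></div></div>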

<hr />

<h1 id="step-5--create-ingress-resource">Step 5 — Create Ingress Resource</h1>

<p>Create <code class="language-plaintext highlighter-rouge">ingress.yaml</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nginx-demo-ingress
  annotations:
    kubernetes.io/ingress.class: azure/application-gateway

spec:
  rules:
  - host: test.yourdomain.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: nginx-demo-service
            port:
              number: 80
</code></pre></div></div>

<p>Apply:</p>

<p><code class="language-plaintext highlighter-rouge">kubectl apply -f ingress.yaml</code></p>

<hr />

<h1 id="step-6--verify-application-http">Step 6 — Verify Application (HTTP)</h1>

<p>Ensure DNS points to the Application Gateway public IP.</p>

<p>Open:</p>

<p>http://test.yourdomain.com</p>

<p>You should see the NGINX welcome page.</p>
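
<p>The check can also be done from a terminal. A minimal sketch, assuming the Ingress name from Step 5 and that <code class="language-plaintext highlighter-rouge">test.yourdomain.com</code> has been replaced with your real host name:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Public IP that AGIC assigned to the Ingress
kubectl get ingress nginx-demo-ingress -o jsonpath='{.status.loadBalancer.ingress[0].ip}'

# The DNS record should resolve to that same IP
nslookup test.yourdomain.com

# Request only the response headers from the application
curl -I http://test.yourdomain.com
</code></pre></div></div>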

<hr />

<h1 id="step-7--install-cert-manager">Step 7 — Install cert-manager</h1>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>helm repo add jetstack https://charts.jetstack.io
helm repo update

helm install cert-manager jetstack/cert-manager --namespace cert-manager \
  --create-namespace --set installCRDs=true
</code></pre></div></div>

<p>Verify:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl get pods -n cert-manager
</code></pre></div></div>
<p>Expected output:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cert-manager
cert-manager-webhook
cert-manager-cainjector
</code></pre></div></div>
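<p>Creating a ClusterIssuer fails if the CRDs are missing or the webhook is not ready yet. A short check, assuming the Helm release name <code class="language-plaintext highlighter-rouge">cert-manager</code> used above:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl get crd | grep cert-manager.io
kubectl -n cert-manager rollout status deployment/cert-manager-webhook
</code></pre></div></div>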
<hr />

<h1 id="step-8--create-clusterissuer">Step 8 — Create ClusterIssuer</h1>

<p>Create <code class="language-plaintext highlighter-rouge">cluster-issuer.yaml</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    email: your-email@example.com
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
    - http01:
        ingress:
          class: azure/application-gateway
</code></pre></div></div>

<p>Apply:</p>

<p><code class="language-plaintext highlighter-rouge">kubectl apply -f cluster-issuer.yaml</code></p>

<p>Verify:</p>

<p><code class="language-plaintext highlighter-rouge">kubectl get clusterissuer</code></p>
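
<p>The issuer is usable only once it has registered an account with the ACME server. As a rough check, assuming a kubectl version that supports <code class="language-plaintext highlighter-rouge">kubectl wait</code> on custom resources, the following blocks until the <code class="language-plaintext highlighter-rouge">Ready</code> condition is reported or the timeout expires:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Wait for the ClusterIssuer to report the Ready condition
kubectl wait --for=condition=Ready clusterissuer/letsencrypt-prod --timeout=60s

# Show status conditions and recent events if it does not become ready
kubectl describe clusterissuer letsencrypt-prod
</code></pre></div></div>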

<hr />

<h1 id="step-9--enable-tls-in-ingress">Step 9 — Enable TLS in Ingress</h1>

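<p>Update <code class="language-plaintext highlighter-rouge">ingress.yaml</code> with the cert-manager annotations and a <code class="language-plaintext highlighter-rouge">tls</code> section, then re-apply it with <code class="language-plaintext highlighter-rouge">kubectl apply -f ingress.yaml</code>.</p>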

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nginx-demo-ingress
  annotations:
    kubernetes.io/ingress.class: azure/application-gateway
    cert-manager.io/cluster-issuer: letsencrypt-prod
    acme.cert-manager.io/http01-edit-in-place: "true"
    appgw.ingress.kubernetes.io/ssl-redirect: "true"
    cert-manager.io/acme-challenge-type: http01

spec:
  tls:
  - hosts:
    - test.yourdomain.com
    secretName: test.yourdomain.com-tls

  rules:
  - host: test.yourdomain.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: nginx-demo-service
            port:
              number: 80
</code></pre></div></div>

<hr />

<h1 id="step-10--verify-certificate">Step 10 — Verify Certificate</h1>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl get certificate
</code></pre></div></div>
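<p>Expected (the <code class="language-plaintext highlighter-rouge">READY</code> column should show <code class="language-plaintext highlighter-rouge">True</code> once issuance completes):</p>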

<p>test.yourdomain.com-tls True</p>

<p>Check secret:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl get secret test.yourdomain.com-tls
</code></pre></div></div>
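
<p>To confirm what was actually issued, the certificate stored in the secret can be decoded locally. A sketch, assuming <code class="language-plaintext highlighter-rouge">openssl</code> and <code class="language-plaintext highlighter-rouge">base64</code> are installed on your workstation and the secret lives in the current namespace:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Extract tls.crt from the secret, decode it, and print issuer, subject and validity dates
kubectl get secret test.yourdomain.com-tls -o jsonpath='{.data.tls\.crt}' \
  | base64 -d \
  | openssl x509 -noout -issuer -subject -dates
</code></pre></div></div>

<p>The issuer line should name Let’s Encrypt, and the subject should match <code class="language-plaintext highlighter-rouge">test.yourdomain.com</code>.</p>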
<hr />

<h1 id="step-11--verify-https">Step 11 — Verify HTTPS</h1>

<p>Open browser:</p>

<p>https://test.yourdomain.com</p>

<p>You should see a secure connection, with a certificate issued by Let’s Encrypt.</p>
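
<p>The same check can be done from a terminal, which is handy on machines without a browser. A minimal sketch, assuming <code class="language-plaintext highlighter-rouge">curl</code> and <code class="language-plaintext highlighter-rouge">openssl</code> are available locally:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># HTTPS should answer with a trusted certificate (curl fails if validation fails)
curl -sI https://test.yourdomain.com

# Plain HTTP should answer with a redirect to HTTPS because of the ssl-redirect annotation
curl -sI http://test.yourdomain.com

# Print the issuer and validity dates of the certificate served by the gateway
echo | openssl s_client -connect test.yourdomain.com:443 -servername test.yourdomain.com 2&gt;/dev/null \
  | openssl x509 -noout -issuer -dates
</code></pre></div></div>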

<h1 id="troubleshooting">Troubleshooting</h1>
<h2 id="certificate-not-issued">Certificate not issued</h2>

<p>Check:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl get challenge
kubectl get orders
kubectl describe certificate
</code></pre></div></div>
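
<p>If a challenge stays pending, its events and the cert-manager controller logs usually indicate why. A sketch, assuming cert-manager was installed with the release name and namespace from Step 7:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Show the state and events of outstanding ACME challenges
kubectl describe challenge

# Tail the cert-manager controller logs for issuance errors
kubectl logs -n cert-manager deploy/cert-manager --tail=50

# Confirm the Ingress still has an address assigned by the gateway
kubectl get ingress nginx-demo-ingress
</code></pre></div></div>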
<h2 id="dns-issues">DNS issues</h2>

<p>Ensure the domain resolves to the Application Gateway public IP:</p>

<p><code class="language-plaintext highlighter-rouge">nslookup test.yourdomain.com</code></p>
<h2 id="agic-issues">AGIC issues</h2>

<p>Check logs:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl logs -n kube-system deploy/ingress-appgw-deployment
</code></pre></div></div>
<hr />

<h1 id="conclusion">Conclusion</h1>

<p>In this guide we:</p>

<ul>
  <li>Enabled Application Gateway Ingress Controller</li>
  <li>Deployed an NGINX application</li>
  <li>Exposed it using Ingress</li>
  <li>Verified HTTP access</li>
  <li>Installed cert-manager</li>
  <li>Configured Let’s Encrypt</li>
  <li>Enabled HTTPS for secure access</li>
</ul>

<p>This setup provides automated TLS certificate management for
applications running in AKS.</p>]]></content><author><name>Amit Kumar</name></author><category term="Azure" /><category term="Kubernetes" /><category term="DevOps" /><category term="AKS" /><category term="AGIC" /><category term="cert-manager" /><category term="LetsEncrypt" /><summary type="html"><![CDATA[Secure AKS Ingress using Application Gateway and Let’s Encrypt]]></summary></entry></feed>