<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Posts on Eric Daly&#39;s Blog</title>
    <link>https://blog.dalydays.com/post/</link>
    <description>Recent content in Posts on Eric Daly&#39;s Blog</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en</language>
    <lastBuildDate>Mon, 23 Feb 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://blog.dalydays.com/post/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>GitLab Runners On EKS Using Cluster Autoscaler</title>
      <link>https://blog.dalydays.com/post/gitlab-runners-on-eks-with-cluster-autoscaler/</link>
      <pubDate>Mon, 23 Feb 2026 00:00:00 +0000</pubDate>
      
      <guid>https://blog.dalydays.com/post/gitlab-runners-on-eks-with-cluster-autoscaler/</guid>
      <description>A deep dive on using EKS to power GitLab runners with autoscaling, and how I overcame some gotchas.</description>
      <content:encoded><![CDATA[<h1 id="background">Background</h1>
<p>Inheriting legacy infrastructure is an adventure. The legacy runners had been in use for many years, since before I started with the company. They had some issues, but overall they served their purpose. They were based on docker+machine, which allowed for dynamically scaling EC2 instances in AWS. Those runners checked all the boxes originally - they ran jobs, they autoscaled, and they were easy to configure when needed. It was a great starting point, maybe even the best available option at the time. But over the years the pain points continued to grow, and I was already passively considering alternatives. Some of the pain points:</p>
<ol>
<li>One job per EC2 instance - I was disappointed when I found this out, to say the least</li>
<li>Slow scale up - it would take about 5 minutes for a new EC2 instance to bootstrap before it could accept a job</li>
<li>docker+machine weirdness - it would orphan EC2 instances forever, requiring periodic manual cleanup</li>
<li>No shared image cache - job startup time was abysmal due to a combination of insane image sizes and essentially no reuse of cached images (due to new EC2 instances for every job)</li>
<li>Base AMI images needed to be updated manually when the security team asked (bad, we should obviously be automating and doing this proactively, but nobody had time or desire to prioritize this)</li>
<li>docker+machine sometimes got in a weird state where it could not scale up new EC2 instances and instead would get stuck in a retry loop, preventing any new jobs in that queue from running until we resolved it manually</li>
<li>docker+machine has been deprecated for several years - we tried updating to a later version but things broke. It wasn&rsquo;t worth tracking down why when we could look for a better path forward instead.</li>
<li>Observability. We just don&rsquo;t have a good way to measure performance or monitor for issues (though admittedly that is largely an &ldquo;us&rdquo; issue)</li>
<li>Long lived EC2 instances. In our configuration, docker+machine idles down to 1 node per queue. In practice, this means that the first ever EC2 instance that was deployed for that queue remains online as long as possible, storing up image cache and other ephemeral data that is unneeded. Even with 200GB disks, we&rsquo;ve had many occasions where it was necessary to SSH into the EC2 instances and run prune commands. We tried automating this but something about <code>docker system prune -af</code> non-interactively is unreliable over an SSH session.</li>
<li>Troubleshooting is tricky. We can SSH into the host running docker+machine, get a list of EC2 instances, then <code>docker-machine ssh instance-name</code> to access the instance and troubleshoot.</li>
<li>Cost (and waste). One EC2 instance per job adds up, especially when the vast majority of those jobs use under 10% of the allocated resources.</li>
</ol>
<p>This project kind of developed organically out of multiple needs that seemed to come in all around the same time. I was already bored with the idea of the existing maintenance plan, and we were not proactive about it partly due to the toil involved. Developers started complaining about slow startup time - this was because of the lack of shared image caching, combined with developers pulling multi-gigabyte images for test jobs. Sure, it&rsquo;s easy to say they should optimize their images (and it couldn&rsquo;t be more true). Security started pushing on us to &ldquo;patch&rdquo; the runners, which meant updating the base AMI images manually, and if we needed to keep this up to date monthly, we were going to need to automate somewhere. Now there was a compelling reason for me to get buy-in from leadership to assign this project, and I already had a plan in mind for how we could replace docker+machine and simplify the maintenance burden - Kubernetes. I had hopes that it would solve every pain point we had with docker+machine:</p>
<ol>
<li>Workers could run multiple jobs, making better use of available resources. After all, that was supposed to be one of the main benefits of using Docker.</li>
<li>Fast scale up. Larger EC2 instances per worker meant that adding one more EC2 instance allowed X more jobs to run as soon as the new instance was ready, plus we could use lightweight base images rather than our bulky AMI.</li>
<li>Cluster Autoscaler should be capable of handling scaling in production environments, and it is actively maintained unlike docker+machine.</li>
<li>Better shared image cache. Since many jobs would now run per EC2 instance, all of those jobs would share the image cache that was pulled to that instance. There are better ways, but this was still a good start, and significantly better than the existing option.</li>
<li>Using AWS provided AMIs meant we no longer have to maintain or build custom AMIs with security and logging agents, etc.</li>
<li>Ideally, eliminate docker+machine weirdness by simply eliminating docker+machine itself.</li>
<li>EKS is supported by AWS, along with their AMIs. Kubernetes is actively maintained. Cluster Autoscaler is actively maintained. GitLab runners support Kubernetes.</li>
<li>We can get insight into overall health and performance using EKS container insights. It&rsquo;s a major leap forward from what we had before, although not the most impressive tool in the world either.</li>
<li>No more long-lived EC2 instances. Cluster Autoscaler happily scales down any of them, and given even a low amount of activity throughout the days and weeks it does a nice job balancing instances and cycling them out, which I love to see. I don&rsquo;t need stale workers exhibiting weird issues whose root causes I probably don&rsquo;t care about.</li>
<li>Maybe this one is personal preference, but I find it much easier to troubleshoot within Kubernetes. I use <code>k9s</code> which makes it amazingly easy (dare I say fun) to see the state of all pods running in the namespace, exec into them if desired, and easily get a picture of the pods, nodes, resource utilization, logs, etc. in realtime.</li>
<li>Thanks to the magic of Cluster Autoscaler, we can idle down the worker nodes very, very low while there are no jobs demanding resources. On average, I estimated this to save at least 50%, assuming the exact same workload we ran last year, with potential to save even more if we continue monitoring and tuning. And this still includes all the other benefits including less maintenance and faster performance.</li>
</ol>
<p>Our team also explored a simpler option where we could build and maintain a few EC2 instances running Docker and use the docker executor. This would be a lot more in line with what the original engineer thought was going to happen when he decided on docker+machine years ago, running more than 1 job per EC2 instance. This would solve some of our pain points, but still leave us with some maintenance burden, having to care about long-lived EC2 instances, and still having to patch or automate a rolling build. It would provide better scaling and resource utilization, but still require us to manually right-size forever. While the team built and benchmarked a POC on this idea, I was testing on EKS. We figured we knew which option was going to win, but of course you don&rsquo;t know until you try.</p>
<p>It turns out that yes, EKS is better in almost every way. It really did solve the pain points, got us into a much better security posture, and reduced our maintenance burden by well over 50%. Job startup time decreased on average, and job performance even increased significantly. I may elaborate on that piece later on.</p>
<p>It was also the perfect way to introduce a production Kubernetes environment to the organization (yes, this is the first thing we have ever done as a company with Kubernetes) because of its manageable level of complexity.</p>
<h1 id="design">Design</h1>
<p>This was the goal: to build a scalable, maintainable platform on which to run stateless jobs for GitLab as efficiently as possible. I was solving for all of the pain points listed above, but primarily thinking about job startup performance, security, and long term maintenance. Cost savings is always good, but for this project it was just the icing on the cake.</p>
<h2 id="build-tooling">Build Tooling</h2>
<p>I&rsquo;m not new to Kubernetes, but I was new to EKS. I wanted to use a GitLab pipeline to build the EKS infrastructure. After some quick research, I found <code>eksctl</code> to be the easiest way to build a cluster. It simplifies many aspects that would otherwise require a fair bit of CloudFormation, but is flexible <strong>enough</strong> to get the job done, with only minor effort required outside of that.</p>
<p>Other options included CloudFormation or CDK, but why bother when there is a purpose-built tool that serves as the official CLI for EKS.</p>
<p>I built a pipeline in GitLab to deploy EKS using <code>eksctl</code> and AWS CLI. It relies on a YAML file that is specific to <code>eksctl</code> to define some specifics that I needed. I needed to use the AWS CLI to change the EKS upgrade policy to STANDARD. The pipeline is idempotent, so that I can safely run it as many times as I want without making unnecessary changes (think &ldquo;if cluster exists, then <code>eksctl upgrade cluster</code> else <code>eksctl create cluster</code>&rdquo;). I built some logic around nodegroups, so that when I change or add nodegroups they are deployed properly, but existing nodegroups are not changed or removed. I also opted to use access entries for authentication, so by including the name of the role in the config file, the pipeline automatically ensures the proper access entry is added for that role. That allows us to authenticate with the control plane through our AWS SSO configuration.</p>
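<p>A minimal sketch of that create-or-upgrade logic, assuming a hypothetical cluster name and config file (the real pipeline also handles nodegroups and access entries):</p>

```shell
#!/usr/bin/env bash
# Sketch of the idempotent "create or upgrade" pipeline step.
# The cluster name and config file below are hypothetical placeholders.
set -euo pipefail

deploy_cluster() {
  local name="$1" config="$2"
  if eksctl get cluster --name "$name" >/dev/null 2>&1; then
    echo "cluster $name exists, upgrading"
    eksctl upgrade cluster --config-file "$config" --approve
  else
    echo "cluster $name not found, creating"
    eksctl create cluster --config-file "$config"
  fi
}

# Example (requires AWS credentials and an eksctl config file):
# deploy_cluster gitlab-runners cluster.yaml
```

<p>The same pattern extends to nodegroups: list what exists, diff against the config, and only create what&rsquo;s missing.</p>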
<h2 id="networking">Networking</h2>
<p>Originally, I was leaning toward putting everything in its own VPC. However, that would have meant additional work to get the GitLab VPC talking to the EKS VPC, and more importantly, I quickly realized that some jobs were connecting to external systems over the public internet with IP allow listing. I needed to reuse the existing EIP if at all possible.</p>
<p>I chose to place the EKS workers in the same VPC subnet as the existing runners due to IP allow listing on external systems from the NAT gateway EIP. It was easier if all outbound traffic came from the same IP. Don&rsquo;t get me started on why we send all of this traffic over the public internet to begin with, that&rsquo;s a battle for another day.</p>
<h2 id="compute">Compute</h2>
<p>The other primary decision to make was compute. EKS delivers the control plane, but you still need somewhere to run your stuff. Our two options were EC2 and Fargate.</p>
<p>Based on this project&rsquo;s goal, EC2 was the obvious choice because spinning up a single EC2 instance instantly provides capacity for X number of jobs to run in parallel. Maybe even more importantly, jobs within the EC2 instance can share the image cache, greatly increasing the odds that a given job will start significantly faster because its target image is already cached.</p>
<p>Within EC2, I had 3 options: bring your own EC2, managed nodegroups, or EKS Auto Mode. At the time, Auto Mode was so new that I wasn&rsquo;t ready to commit to it for production workloads. Perhaps that would have turned out great, but in my opinion it wasn&rsquo;t ready. Bring-your-own has well-known downsides, so managed nodegroups were the clear choice. It&rsquo;s yet another thing that simplifies some of the deployment and maintenance burden, and we generally trust AWS solutions to work well, which has been the case with managed nodegroups for us.</p>
<h2 id="access-control">Access Control</h2>
<p>I&rsquo;m talking about specifically how to authenticate to the cluster API. It was a no-brainer to map IAM roles that our team already uses with SSO to Access Entries on the cluster. This allows us to go into the terminal, run <code>assume</code> (a granted.dev tool) and authenticate to AWS SSO.</p>
<p>I evaluated IRSA, but there was more effort to integrate with our OIDC provider and we didn&rsquo;t have a need for fine-grained access controls up front.</p>
<p>RBAC is an area where we have lots of room for improvement. For now, it&rsquo;s like giving the whole team root access to the whole cluster. Granted it&rsquo;s only our team who has any access, but we still don&rsquo;t follow the principle of least privilege here at this time, partly due to the process requiring manual updates via Helm, etc. No other teams require any level of access to the cluster, so we aren&rsquo;t concerned with taking a tiered approach in this particular situation.</p>
<h2 id="config-management">Config Management</h2>
<p>Let&rsquo;s talk about updating <code>config.toml</code>. I originally included the GitLab runner config within the same project repo that builds and deploys the EKS cluster. I now realize my mistake, because changing a runner config (maybe we need to increase the log limit) means running the EKS deployment pipeline. While it is idempotent, it&rsquo;s still awkward at best. Also, the process is to update the repo, run through the pipeline, and on top of that we still have a manual process to authenticate and manually run <code>helm</code> commands to upgrade deployments.</p>
<p>I see a couple of opportunities for improvement:</p>
<ul>
<li>I should have put gitlab runner jobs in their own namespace. I did create a namespace for everything gitlab runner related, but this also includes the gitlab-runner deployment itself which is a minor annoyance, and doesn&rsquo;t help when aggregating things like performance metrics.</li>
<li>Config for gitlab-runner deployments (used with the Helm charts) are mixed into the same repo as the EKS infrastructure build. This should go in its own repo.</li>
<li>GitOps. What was I thinking? I should have used FluxCD up front. Even fixing some issues with config management and splitting out namespaces properly leaves us with a significant amount of manual effort to make simple config changes.
<ul>
<li>The process is more tedious than I would expect. With the old system it was: SSH into the test GitLab host &gt; backup and live edit <code>config.toml</code> &gt; test &gt; repeat for production. With the new system, the process is longer: checkout a feature branch &gt; update the config &gt; manually test the config by running <code>helm upgrade</code> commands (assuming I&rsquo;m already set up with kubectl, kubeconfig, authentication, helm, helm repos, etc.) &gt; test &gt; merge to main &gt; repeat for production.</li>
<li>Why bother updating a config in a repo when you&rsquo;re going to change it manually in production? We all hope nobody ever makes a mistake and they get out of sync, but I&rsquo;ve never seen this work well, anywhere. I assume there will ALWAYS be drift because we built a system that allows that there CAN be drift. What if there&rsquo;s some production outage, and the on-call needs to fix it quickly overnight? The chances it will go back into the repo later are low, despite everyone&rsquo;s best intentions.</li>
<li>All this complexity, all a waste of time and effort. What if there was a system that could proactively sync what&rsquo;s in the config repo with what&rsquo;s currently running in the cluster? Welcome to GitOps.
<ul>
<li>This is on the roadmap before we do anything new in the cluster. The new process will be: update the config in the repo &gt; open a MR &gt; test (in the cluster) &gt; merge to main &gt; validate in production</li>
<li>Anyone can do it</li>
<li>Nobody needs to install special tools or question IAM permissions, including junior engineers or even interns</li>
<li>We can enforce change approvals</li>
<li>Visibility - we have an audit trail. If it&rsquo;s not in the repo, it&rsquo;s not in the cluster</li>
</ul>
</li>
<li>I haven&rsquo;t even talked about what else is deployed in the cluster beyond gitlab runners. It&rsquo;s not much, but it&rsquo;s still something. FluxCD can and should also manage Cluster Autoscaler and Metrics Server.</li>
<li>FluxCD flips the traditional deployment model on its head, which has major security implications. If we grant FluxCD enough privileges to do its job, then we no longer need every engineer on our team to have full root access to the cluster via kubectl commands (any changes need to go through the repo, through FluxCD). Let&rsquo;s go into read only mode for debugging/troubleshooting but leave updates/writes to GitOps.
<ul>
<li>Sure, we still need a way in, maybe a break-glass option. One simple approach would be to authenticate to the AWS account where EKS lives, manually add a new access entry, and boom, you&rsquo;re in. Document this. Or even automate this, put a manual job in the repo pipeline called &ldquo;break glass&rdquo; with clear, simple instructions. It should be easy for anyone to find and execute, but also auditable.</li>
</ul>
</li>
</ul>
</li>
</ul>
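<p>As a sketch of where that GitOps roadmap leads, a FluxCD setup for this could look something like the following. The repo URL, names, and path are hypothetical; Flux would then continuously reconcile whatever the repo declares (gitlab-runner releases, Cluster Autoscaler, Metrics Server) against the cluster:</p>

```yaml
# Hypothetical Flux objects: watch the config repo and apply its manifests.
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: runner-config
  namespace: flux-system
spec:
  interval: 1m
  url: https://gitlab.example.com/infra/runner-config.git
  ref:
    branch: main
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: runner-config
  namespace: flux-system
spec:
  interval: 5m
  sourceRef:
    kind: GitRepository
    name: runner-config
  path: ./clusters/production
  prune: true   # anything removed from the repo is removed from the cluster
```

<p>With <code>prune: true</code>, &ldquo;if it&rsquo;s not in the repo, it&rsquo;s not in the cluster&rdquo; becomes literally true.</p>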
<h2 id="all-together">All Together</h2>
<p>I&rsquo;ve talked through most of the design, much of which came from the initial POC but some of which comes from having been running this in production for over a year and what I&rsquo;ve learned as new needs arise.</p>
<p>This also doesn&rsquo;t give the full picture of everything involved in making it work. Beyond deploying EKS, managed nodegroups, cluster autoscaler, metrics server, and gaining access to the cluster, we still have to deploy gitlab-runners and wire them up to gitlab to accept jobs. We also have to consider nuances when it comes to how cluster autoscaler works and some quirks that it has.</p>
<p>Deploying this was a very low risk deployment due to the ability to add the new EKS infrastructure in parallel with existing runners, allowing us to phase out the old stuff tag by tag. EKS runners have new, unique tags, and eventually we would add legacy tags, and finally phase out the old runners once we validated that jobs worked reliably.</p>
<p>In the end it has been a success, despite having some snags along the way. We learned new things, and always came up with a new, better way to move forward. One of the major benefits to this architecture is the flexibility. It feels unlimited, there doesn&rsquo;t seem to be any hurdle too big to get past, and most hurdles have been fairly minor.</p>
<p>Other factors I considered throughout this process:</p>
<ul>
<li>Skillset. Our team is not experienced in Kubernetes, though several on the team are working towards the CKA certification. But with a team of Linux Admins, I know everyone is capable of following the steps. Once you get past the initial hurdle of getting authenticated and setting up local tools, the rest is pretty straightforward even for inexperienced engineers.
<ul>
<li>The way I see it, this is a good way to level up our team in general. In many ways already, it demonstrates best practices and the way things could/should be. It gives everyone production experience with Kubernetes and this only makes us better.</li>
</ul>
</li>
<li>Unknown behavior with migrating all of the different pipeline jobs to the new system. In theory it should work the same because it has the same default CPU/RAM resources available. In practice, resource limits are handled very differently. We are essentially introducing new limits or enforcing stricter limits than what we had previously. And this is essential so that we can tune scaling, establish safe defaults, and manage cost over time.</li>
<li>New maintenance schedule and process. EKS upgrades. Add-on upgrades. Component upgrades. Nodegroup AMI upgrades. The great news here is that it&rsquo;s possible to do all of these while workloads continue running, so we just unlocked significant maintenance improvements - and better security by extension - by being able to keep things updated more frequently with less friction. And we can patch zero-day vulnerabilities any time with very little risk.</li>
<li>Migration from legacy runners. The plan was to introduce the new runners and ask developers to start testing. We had mixed results. The backup plan was to start switching tags to the new runners, and then removing those tags from the old runners, phasing them out. Once we phase the old runners out and haven&rsquo;t had any outages or break/fix tickets come in, we can finally decommission the old infrastructure and fully realize the security and cost savings.</li>
</ul>
<h1 id="gotchas">Gotchas</h1>
<h2 id="-eksctl">&gt; eksctl</h2>
<p>This is a handy utility, but it is not idempotent by design. If you deploy a new cluster with this tool and later decide to change something about that cluster (but don&rsquo;t want to rebuild), you can&rsquo;t necessarily update your config and redeploy. Even the command is different: new builds use <code>eksctl create cluster</code>, while certain changes can be done with <code>eksctl upgrade cluster</code>. Some changes to managed nodegroups can be done with <code>eksctl update nodegroup</code>, but others require completely rebuilding the nodegroup. In the end, though, the total cluster config and deployment pipeline is significantly simpler to build and understand than it would have been using CloudFormation.</p>
<h2 id="-eks-built-in-cni">&gt; EKS Built-in CNI</h2>
<p>The AWS VPC CNI that comes out of the box assigns a unique IP from your VPC subnet to every pod. I was not expecting this, and during initial POC testing I quickly found out when I started scaling up parallel jobs. Luckily it&rsquo;s possible to add a new CIDR block to the existing VPC, so I went ahead and added a new block, split it into 2 subnets in different AZs, then wired those up to the existing route table and out through the existing NAT gateway. Crisis averted. We could have switched CNIs as well, but I was ready to move forward, and since this was supported it was a little less effort in the short term. If we start going beyond 1000 parallel jobs then I&rsquo;ll revisit this and move to something else. For now, we&rsquo;re well below that, so it&rsquo;s noted and we can move on.</p>
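<p>The workaround can be sketched with the AWS CLI. All resource IDs, AZs, and CIDR ranges below are hypothetical:</p>

```shell
#!/usr/bin/env bash
# Sketch: attach a secondary CIDR to the VPC and carve out two pod subnets.
# All resource IDs, AZs, and CIDR ranges are hypothetical.
set -euo pipefail

add_pod_cidr() {
  local vpc_id="$1" rtb_id="$2"

  # Attach a secondary CIDR block to the existing VPC.
  aws ec2 associate-vpc-cidr-block --vpc-id "$vpc_id" --cidr-block 100.64.0.0/16

  # Split it into two subnets in different AZs, wired to the existing
  # route table (and therefore out through the existing NAT gateway).
  for pair in "us-east-1a 100.64.0.0/17" "us-east-1b 100.64.128.0/17"; do
    read -r az cidr <<< "$pair"
    subnet_id=$(aws ec2 create-subnet --vpc-id "$vpc_id" --availability-zone "$az" \
      --cidr-block "$cidr" --query 'Subnet.SubnetId' --output text)
    aws ec2 associate-route-table --route-table-id "$rtb_id" --subnet-id "$subnet_id"
  done
}
```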
<h2 id="-manual-changes">&gt; Manual Changes</h2>
<p>Manual changes still need to be made in the current state. After building the cluster (fully automated), it&rsquo;s in a vanilla state and requires you to manually deploy your resources. For us this includes Metrics Server, Cluster Autoscaler and gitlab-runners. This is an area where GitOps would help tremendously.</p>
<h2 id="-spot-instances">&gt; Spot Instances</h2>
<p>This isn&rsquo;t too difficult. In the eksctl config, enable <code>spot: true</code> and be sure to specify at least 2 or 3 different instance types within the same family with the same minimum CPU/RAM specs. That way, if one type is not available when the spot request is made, another can be picked so that jobs are not stuck pending - I ran into exactly this. You&rsquo;re already saving so much money; this is not the time to be a stickler.</p>
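<p>In eksctl config terms, the relevant fragment looks roughly like this (nodegroup name, sizes, and instance types are illustrative):</p>

```yaml
# Hypothetical eksctl managed nodegroup using spot capacity.
managedNodeGroups:
  - name: spot-workers
    spot: true
    # Several same-family, same-size types so the spot request can fall
    # back to another type instead of leaving jobs stuck pending.
    instanceTypes: ["m5.2xlarge", "m5d.2xlarge", "m5a.2xlarge"]
    minSize: 1
    maxSize: 10
```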
<p>I think that&rsquo;s it for spot instances.</p>
<h2 id="-oomkilled-jobs">&gt; OOMKilled Jobs</h2>
<p>In our case, going from EC2 instances with 8GB RAM and 4GB swap to pods with a strict 8GB limit caused a couple of OOMKills here and there. This turned out to be one of the unknowns we had anticipated going into the migration.</p>
<p>Why does this happen if the RAM limit matches? On the EC2 instance, the job has full rein of the VM it runs on, so not only is the Linux kernel more conservative about preventing OOMKills, it also has a 4GB swap buffer that it can exhaust before giving up.</p>
<p>This is actually one of the major factors I found to be behind the significant performance increase for some workloads on EKS, despite no changes to the pipeline and similar resource limits from a CPU/RAM perspective. Why would a job run 25% faster? It could be that we were silently exceeding the resource limits on legacy runners - not badly enough to break jobs, just badly enough to make them slow.</p>
<p>Does this mean EKS is worse or less reliable in a way? No. This makes it obvious that jobs were already overprovisioned, but nobody was aware of it. Nobody was even aware of the performance degradation due to heavy over-utilization of swap. I see it as a feature, it highlights the need for better visibility, and it highlights the importance of understanding what we <strong>truly</strong> need from a resource perspective to run jobs. No more throwing hardware at the problem.</p>
<p>We should understand what jobs cost from a resource and financial perspective, because that will allow us properly tune autoscaling, make things run much more efficiently, likely save significantly on cost, and make informed decisions on resource quota increases if needed.</p>
<p>We&rsquo;re almost done with OOMKilled, but not quite. Guess what: it&rsquo;s super easy to override requests and limits in GitLab pipelines to ask for more CPU or RAM. I have this documented in a simple-to-follow guide with screenshots and specific examples of different messages developers might see that could be related. I set upper limits on CPU and RAM requests. I explain why you should probably not set CPU limits, especially with Java workloads. And I just handed them self-service autonomy that frees our team from troubleshooting OOMKill issues and instead gives developers what they need to keep working 24/7.</p>
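<p>As an illustration, a self-service override in a job looks something like this. The variable names are the Kubernetes executor&rsquo;s overwrite variables; the job name and values are made up, and the runner&rsquo;s <code>config.toml</code> must set the corresponding <code>*_overwrite_max_allowed</code> caps for the overrides to take effect:</p>

```yaml
# .gitlab-ci.yml - a job requesting more memory than the default.
heavy-test-job:
  script: ./run-tests.sh
  variables:
    KUBERNETES_MEMORY_REQUEST: 12Gi
    KUBERNETES_MEMORY_LIMIT: 12Gi
    KUBERNETES_CPU_REQUEST: "2"
    # Deliberately no CPU limit - CPU throttling hurts, especially for Java.
```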
<p>There&rsquo;s more to talk about on this topic, including setting lower default limits once developers are more comfortable overriding these values. Looking at utilization, 8GB RAM is a pretty high limit which means we are very loosely packing jobs and wasting a lot of resources still. Keep in mind this still fits within the 50% savings estimate, but we could be doing significantly better by tightening up requests for most jobs that only need 1-2GB, and packing more of those jobs into a single worker. Another approach to this problem is encouraging developers who will listen to override their requests for lightweight jobs to lower values, proactively. Even if we leave the default at 8GB, if developers are proactive about this it can still offer them tangible benefits. They are more likely to get their jobs scheduled sooner, because a 2GB slot is easier to schedule than an 8GB slot, etc.</p>
<h2 id="-cluster-autoscaler">&gt; Cluster Autoscaler</h2>
<p>To me, Cluster Autoscaler is the source of the most interesting gotchas I ran into. These took the most time and caused most of the hurdles that we have had to overcome. Here&rsquo;s what I ran into, in order.</p>
<h3 id="-scaling-up">&raquo; Scaling Up</h3>
<p>This is the easy part. Deploy Cluster Autoscaler (CA) and configure its min/max thresholds. However, there&rsquo;s a little bit of glue you need to tie this together with the Managed Nodegroups and the AutoScalingGroups (ASG) they manage. It&rsquo;s not 100% ready out of the box. Here&rsquo;s a recap of what needs to happen and how it works:</p>
<ul>
<li>CA needs permission to read and make changes to the ASG. This is done via an IAM role attached to the EC2 instance powering your worker nodes (every worker node has the same profile). If you specify <code>autoScaler: true</code> in the eksctl config, this is all handled automatically.</li>
<li>Cluster Autoscaler (CA) interacts directly with the ASG, increasing desired capacity to scale up. It identifies the correct ASG by matching with tags on the ASG. These are already set in the eksctl config.</li>
<li>It&rsquo;s pretty easy to check if this works. If you submit more jobs from GitLab, wait a minute and then check to see if the ASG requested more instances. If that doesn&rsquo;t work, move to troubleshooting (checking CA logs, IAM permissions, etc.).</li>
</ul>
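<p>In the eksctl config, that glue amounts to a fragment like this (the nodegroup name, instance type, and sizes are hypothetical; the add-on policy attaches the ASG permissions CA needs to the node role):</p>

```yaml
# Hypothetical eksctl fragment wiring a nodegroup up for Cluster Autoscaler.
managedNodeGroups:
  - name: job-workers
    instanceType: m5.2xlarge
    minSize: 1
    maxSize: 20
    iam:
      withAddonPolicies:
        autoScaler: true   # IAM permissions for CA to read and modify the ASG
```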
<h3 id="-scaling-down">&raquo; Scaling Down</h3>
<p>This is where it starts to get really interesting. During initial testing, scale-up and scale-down worked great. I believe the default was 10 minutes before scaling down, but I didn&rsquo;t experience any problems. After running more workloads over time, I got a ticket that jobs were pending for a long time. I quickly realized that autoscaling wasn&rsquo;t working. The online worker nodes were taking jobs just fine, but CA wasn&rsquo;t doing what I expected.</p>
<p>What went wrong here?</p>
<p>The initial design was intentionally simple (KISS). I deployed a single managed nodegroup and deployed everything on it, including all gitlab-runner deployments, and CA itself. Now it&rsquo;s obvious in hindsight. When you run important things on the very nodes you are potentially scaling down and terminating, bad things can happen. CA terminated the node where the CA deployment itself lived, and it got stuck in a bad state.</p>
<ul>
<li>&ldquo;If we were your kids, we&rsquo;d punish ourselves.&rdquo; - Little Rascals, 1994</li>
</ul>
<p>CA was punishing itself and I gave it no other choice.</p>
<p>The solution? What if I deployed another, smaller managed nodegroup (2 nodes for HA), labeled it management, and isolated it from CA? So that&rsquo;s exactly what I did. Add that to the EKS deployment, add labels and taints, and update taints/tolerations on the CA deployment. I protected it from itself, and this problem is solved.</p>
<h4 id="-management-nodegroup">&gt; Management Nodegroup</h4>
<p>Without going into too much detail, just keep in mind that this means we have to manage taints and tolerations so that management workloads go to management nodes, and job workloads go to job nodes. This involves node selectors and tolerations on the deployment side of things.</p>
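<p>A rough sketch of both sides, with hypothetical label and taint names: the nodegroup taint keeps job pods off the management nodes, and the CA deployment&rsquo;s node selector and toleration pin it there.</p>

```yaml
# eksctl side: a small, tainted management nodegroup.
managedNodeGroups:
  - name: management
    minSize: 2
    maxSize: 2
    labels:
      role: management
    taints:
      - key: role
        value: management
        effect: NoSchedule
---
# Deployment side (e.g. Helm values for cluster-autoscaler):
nodeSelector:
  role: management
tolerations:
  - key: role
    operator: Equal
    value: management
    effect: NoSchedule
```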
<h3 id="-scaling-down-to-0">&raquo; Scaling Down To 0</h3>
<p>This actually began as a request for a brand new type of runner we had not implemented before. There was a new need to build Dockerfiles on <code>arm64</code>, so I began evaluating options. Just because you can doesn&rsquo;t mean you should, and while my first option was running a QEMU based build which could spit out multi-platform images from a single <code>amd64</code> node, it wasn&rsquo;t efficient and wasn&rsquo;t worth the effort. The developers already had a completely separate Dockerfile with different build steps, so why not just give them a runner to build on <code>arm64</code> natively?</p>
<p>I decided to create another managed nodegroup, but given the type of workload, the infrequency, and risk if jobs get interrupted, I decided it was worthwhile to base this on spot instances, plus while we&rsquo;re at it let&rsquo;s scale this one down to 0 since most of the time it&rsquo;s unused and would be sitting idle, wasting money.</p>
<h4 id="-scaling-back-up-from-0">&gt; Scaling Back Up From 0</h4>
<p>The easy part is scaling down; that was already functioning. But once you remove all instances from the ASG, there&rsquo;s a problem. How does CA know which ASG to modify? Previously, it would use a running Kubernetes node to discover ASG info. Once the ASG has no worker nodes running in the cluster, CA has less visibility, requiring you to add a specific tag to the ASG that it uses to identify it.</p>
<p>Specifically, you need to manually add a <strong>tag</strong> on the ASG, <code>k8s.io/cluster-autoscaler/node-template/label/nodegroup: [your-nodegroup-name]</code>, substituting the name of your nodegroup. You also need a matching <strong>label</strong> on your managed nodegroup, which can be predefined in the eksctl config when you create the nodegroup.</p>
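<p>Assuming you use the AWS CLI for the manual step, it looks something like this (the ASG and nodegroup names are placeholders):</p>
<pre><code class="language-shell"># Hypothetical example: tag the ASG so CA can build a node template
# for a nodegroup that is currently scaled to 0.
aws autoscaling create-or-update-tags --tags \
  "ResourceId=eks-runner-arm64-asg,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/node-template/label/nodegroup,Value=runner-arm64,PropagateAtLaunch=true"
</code></pre>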
<p>I may have missed how to do this. I scoured the documentation and spent time on trial and error, but was unable to find a way to fully automate this tagging in the build pipeline. Therefore I documented that whenever you use a managed nodegroup that scales to 0, you need to manually add the appropriate tag on the ASG. It&rsquo;s one extra manual step. It would still be nice to automate, and perhaps Copilot is right when it suggests it could be done by adding the right label in the right format within the eksctl config.</p>
<p>I should revisit this some time. Even one simple manual step can become a larger issue over time.</p>
<h3 id="-scaling-down-again">&raquo; Scaling Down, Again</h3>
<p>Wait, there&rsquo;s more?! Yes, there&rsquo;s more. This is the one I find the most interesting. I&rsquo;m not sure if favorite is the right word but&hellip;</p>
<p>So we&rsquo;re finally in a pretty good state, and people know what to do if they see OOMKilled (they have documentation and practice adjusting limits). I receive another troubleshooting ticket. This time a job has failed because the pod was terminated unexpectedly. Let&rsquo;s dive into what happened.</p>
<p>Starting with the failed job log, it clearly states that the pod was terminated before the timeout, which was unexpected. I looked at the cluster autoscaler logs to find that it scaled down the very node this job was running on.</p>
<p>Back to the documentation: <a href="https://docs.aws.amazon.com/eks/latest/best-practices/cas.html">https://docs.aws.amazon.com/eks/latest/best-practices/cas.html</a></p>
<ul>
<li>I find a section named &ldquo;Prevent Scale Down Eviction&rdquo; which clearly states that expensive to evict pods should have the annotation <code>cluster-autoscaler.kubernetes.io/safe-to-evict=false</code></li>
<li>I apply the annotation and test in a test environment (run several long-running jobs in parallel to trigger scale-up, then wait)</li>
<li>It doesn&rsquo;t help, and Cluster Autoscaler is still resulting in premature job termination during scale-down</li>
</ul>
<p>What&rsquo;s wrong with Cluster Autoscaler on EKS, and why doesn&rsquo;t it behave as advertised in the AWS documentation? I can&rsquo;t answer that question, to be honest. The documentation is simply wrong here; the annotation doesn&rsquo;t work. Is there a bug in Cluster Autoscaler?</p>
<p>I found an issue on Cluster Autoscaler&rsquo;s GitHub page that matches my symptoms: <a href="https://github.com/kubernetes/autoscaler/issues/8196">https://github.com/kubernetes/autoscaler/issues/8196</a></p>
<p>If I understand the analysis correctly, the root cause is related to AZRebalance and how AWS terminates nodes when decreasing the desired capacity on the ASG. It&rsquo;s not a bug that can be fixed in Cluster Autoscaler, and AWS is not going to address this issue on their end.</p>
<p>OK&hellip; so what can we do?</p>
<p>I figured since there will be no official fix, we should try to work around the issue. PodDisruptionBudget came to mind. What if you can apply a PDB with <code>maxUnavailable: 0</code>? I tried it, and this time it worked. Great.</p>
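<p>For reference, a minimal sketch of such a PDB. The PDB name is illustrative; the <code>eviction-protection-pdb</code> label, which the job pods carry, is the one that shows up in the warning quoted later:</p>
<pre><code class="language-yaml"># Sketch: a PDB that denies voluntary evictions of labeled job pods.
# maxUnavailable: 0 means the eviction API always refuses to drain them.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: gitlab-runner-job-pdb   # illustrative name
spec:
  maxUnavailable: 0
  selector:
    matchLabels:
      eviction-protection-pdb: "true"
</code></pre>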
<p>I posted my workaround in the GitHub issue so that hopefully it helps some people: <a href="https://github.com/kubernetes/autoscaler/issues/8196#issuecomment-3353620324">https://github.com/kubernetes/autoscaler/issues/8196#issuecomment-3353620324</a></p>
<p>One last thing to note about this workaround: the documentation explains that <code>.spec.maxUnavailable</code> is only supported when the selected pods are managed by one of the built-in controllers. We are in a gray area, because while the gitlab-runner deployment is responsible for creating job pods, the pods themselves aren&rsquo;t controlled by something like a ReplicaSet or Deployment. So having the label/selector on the pod, combined with <code>.spec.maxUnavailable</code>, does work as expected. However, there are <code>UnmanagedPods</code> warnings in the PDB event history:
<code>Pods selected by this PodDisruptionBudget (selector: &amp;LabelSelector{MatchLabels:map[string]string{eviction-protection-pdb: true,},MatchExpressions:[]LabelSelectorRequirement{},}) were found to be unmanaged. As a result, the status of the PDB cannot be calculated correctly, which may result in undefined behavior. To account for these pods please set &quot;.spec.minAvailable&quot; field of the PDB to an integer value.</code></p>
<p>I&rsquo;m OK with this warning because this works exactly as I expect it to. I have this documented thoroughly for future reference, and I consider this one mitigated.</p>
<h3 id="-azrebalance">&raquo; AZRebalance</h3>
<p>Here we go again. Another ticket comes in, another job terminated prematurely. I thought we fixed autoscaling (and we kind of did), but now the AZRebalance feature is biting us.</p>
<p>This is not exactly related to Cluster Autoscaler, but it&rsquo;s similar in a way. Instead of scaling up or down, AZRebalance rebalances, which is a fancy way of saying it adds a new node in a different AZ, then terminates a node in the hot-spot AZ.</p>
<p>Reviewing the failed job log, the job again failed prematurely. Digging deeper, we looked at ASG events and CloudTrail, and found a tight correlation: a node was terminated seconds after the job failed unexpectedly. Full disclosure, <code>kiro-cli</code> is an amazing tool for digging through event history and logs in AWS and correlating data with event times. It basically did all the work in this case, although it initially matched completely the wrong event; I called out the timing mismatch, and it found another AZRebalance event that lined up perfectly.</p>
<p>What happened in this case?</p>
<ul>
<li>As nodes are scaled up and down, it&rsquo;s possible to find yourself in a situation where there&rsquo;s an unbalanced number of nodes in a single AZ.</li>
<li>Independently, AZRebalance is checking for this case. Its job is to rebalance when things are not balanced.</li>
<li>In order to balance things, AZRebalance will spin up a new node in another AZ with fewer nodes, and then it will terminate the extras in the hot-spot AZ.</li>
<li>AZRebalance doesn&rsquo;t discriminate, it doesn&rsquo;t even check what&rsquo;s running. It cares not about your workloads (and rightfully so, we need to build around this understanding). In this case, the node it selected as tribute was one that had production jobs running on it.</li>
</ul>
<p>What can we do about it?</p>
<ul>
<li>We can add yet another tool, AWS Node Termination Handler. That also involves setting up an SQS Queue and EventBridge. Not the end of the world, but one more thing to maintain.
<ul>
<li>More importantly, we are under pressure to deliver on other things right now, so building this out requires more engineering effort.</li>
</ul>
</li>
<li>OR the quick and dirty approach is to disable AZRebalance. It&rsquo;s not ideal, but it got us in a more reliable state immediately. The caveat is that we document this, and make sure it&rsquo;s included in our backlog to resolve later on. This still leaves us in a potentially bad situation where nodes can hot-spot in a single AZ, exposing us to more risk if that AZ has an outage. But we are OK with the tradeoffs for the short term.</li>
</ul>
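<p>For completeness, the quick-and-dirty disable is a one-liner per ASG with the AWS CLI (the ASG name is a placeholder); normal scale-up/scale-down continues to work:</p>
<pre><code class="language-shell"># Suspend only the AZRebalance process; launch/terminate scaling is unaffected.
aws autoscaling suspend-processes \
  --auto-scaling-group-name eks-runner-asg \
  --scaling-processes AZRebalance
</code></pre>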
<p>My tentative plan is to incorporate AWS Node Termination Handler. It appears to be relatively straightforward to implement and deploy. Obviously this would be built right into the pipeline that builds the EKS cluster, automating all of the setup. I&rsquo;m not sure if I would jump to CloudFormation or the CDK, but whichever seems simpler to configure and maintain is what I would lean toward, starting with CloudFormation just because we are using it for another small piece of the pipeline already.</p>
<p>Disabling the rebalancer is not maintainable long term, as any new managed nodegroup would have it enabled by default and require someone to remember to disable it (never ideal). Even upgrading a nodegroup, as we did when moving from Bottlerocket to Amazon Linux 2023, would have the same effect.</p>
<h1 id="conclusion">Conclusion</h1>
<p>What began as an oversimplified idea in my head evolved into a production-ready, scalable project that delivered noticeably better performance, a significantly reduced maintenance burden, better observability, and real cost savings. Despite thinking this would be a relatively straightforward intro to Kubernetes for our team, we still ran into lots of mostly small gotchas. Luckily, similar to Linux, Kubernetes is extremely flexible and powerful, making it possible to find a way forward in every circumstance. This is where Kubernetes shines, helping us manage a sufficiently complex architecture and standardize deployment, troubleshooting, and observability. Developers are happy, the business is happy, our team is happy, and we learned a lot along the way.</p>
]]></content:encoded>
    </item>
    
    <item>
      <title>Kubernetes Storage - OpenEBS Replicated Storage Mayastor</title>
      <link>https://blog.dalydays.com/post/kubernetes-storage-with-openebs/</link>
      <pubDate>Tue, 21 Jan 2025 00:00:00 +0000</pubDate>
      
      <guid>https://blog.dalydays.com/post/kubernetes-storage-with-openebs/</guid>
      <description>Let&amp;rsquo;s walk through deploying OpenEBS Replicated Storage with the Mayastor engine on Talos Linux!</description>
      <content:encoded><![CDATA[<h1 id="intro-and-prerequisites">Intro and Prerequisites</h1>
<p>In a previous post, I mentioned that I struggled to get OpenEBS working on Talos and instead went with democratic-csi. In recent weeks, I decided to revisit this and figure out how to get OpenEBS replicated storage working in order to evaluate replicated storage in my cluster. I now have multiple disks that I can dedicate to my Kubernetes cluster, and I wanted to avoid the single point of failure of running democratic-csi against a TrueNAS VM.</p>
<p>If you are following along, I will assume you are familiar with deploying Talos Linux itself and have talosctl installed with an existing cluster running. If you need more details on how to do that, check out <a href="https://blog.dalydays.com/post/kubernetes-homelab-series-part-1-talos-linux-proxmox/">https://blog.dalydays.com/post/kubernetes-homelab-series-part-1-talos-linux-proxmox/</a>.</p>
<h1 id="dedicated-storage-node">Dedicated Storage Node</h1>
<p>It&rsquo;s not absolutely necessary to use a dedicated storage node. I&rsquo;m doing this because I want to pass a disk directly to a VM for storage on each physical host, I want to keep storage somewhat isolated from other worker nodes, and I can spare the few extra resources to dedicate to this purpose. If you want to use existing worker nodes, just follow this process for your existing nodes instead of creating new ones.</p>
<h2 id="create-new-talos-nodes">Create New Talos Node(s)</h2>
<ul>
<li>Create a VM in Proxmox with 4GB RAM and 4 vCPU cores. (2GB RAM is not enough because enabling hugepages reserves 2GB, and you would see oom-kills otherwise. You also need a dedicated CPU core just for the io-engine, along with everything else that runs; I tried with 2 vCPUs and the io-engine pod wouldn&rsquo;t schedule due to insufficient resources.) I named my first one talos-storage-1
<ul>
<li>Attach a Talos ISO to the CD ROM and boot from it</li>
<li>Get the IP address from the node</li>
</ul>
</li>
<li>Install Talos using the worker.yaml template used for other worker nodes (you may want to get a current or updated version of Talos from the image factory):
<ul>
<li><code>talosctl apply-config --insecure -n 10.0.50.135 --file _out/worker.yaml</code></li>
</ul>
</li>
<li>Apply a patch to set a static IP and node label, e.g.</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="c"># ./patches/storage1.patch</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">machine</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">network</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">hostname</span><span class="p">:</span><span class="w"> </span><span class="l">talos-storage-1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">interfaces</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">deviceSelector</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">busPath</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;0*&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">addresses</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="m">10.0.50.31</span><span class="l">/24</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">routes</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="nt">network</span><span class="p">:</span><span class="w"> </span><span class="m">0.0.0.0</span><span class="l">/0</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">gateway</span><span class="p">:</span><span class="w"> </span><span class="m">10.0.50.1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">nameservers</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="m">192.168.1.22</span><span class="w">
</span></span></span></code></pre></div><ul>
<li><code>talosctl patch mc -n 10.0.50.135 --patch @patches/storage1.patch</code>
<ul>
<li>I&rsquo;m having trouble here with the node name changing, and I have to manually delete the random name from the cluster:</li>
<li>e.g. <code>kubectl delete node talos-lry-si8</code></li>
<li>Also, you might need to update the label <code>openebs.io/nodename</code> if you already have OpenEBS running and are adding/changing nodes
<ul>
<li><code>kubectl edit node talos-storage-1</code> and change the value to the current node name</li>
</ul>
</li>
</ul>
</li>
<li>Apply a patch to set the machine config pieces OpenEBS needs, which include hugepages, a nodeLabel for where the Mayastor engine should run, and the <code>/var/local</code> bind mount:</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="c"># ./patches/openebs.patch</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">machine</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">sysctls</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">vm.nr_hugepages</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;1024&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">nodeLabels</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">openebs.io/engine</span><span class="p">:</span><span class="w"> </span><span class="l">mayastor</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">extraMounts</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">destination</span><span class="p">:</span><span class="w"> </span><span class="l">/var/local</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">type</span><span class="p">:</span><span class="w"> </span><span class="l">bind</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">source</span><span class="p">:</span><span class="w"> </span><span class="l">/var/local</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">options</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span>- <span class="l">rbind</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span>- <span class="l">rshared</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span>- <span class="l">rw</span><span class="w">
</span></span></span></code></pre></div><ul>
<li><code>talosctl patch mc -n 10.0.50.31 --patch @patches/openebs.patch</code></li>
<li>If you have an additional disk to use with OpenEBS, you&rsquo;ll need to pass it directly to the Talos node VM. I&rsquo;m using Proxmox
<ul>
<li>SSH into the Proxmox host and find the disk ID to be passed. I just run <code>ls -lh /dev/disk/by-id/</code> and grab the root disk (the entry not containing any &quot;_1&quot; or &quot;_part*&quot;), for example <code>/dev/disk/by-id/nvme-Samsung_SSD_970_EVO_Plus_2TB_S59CNM0W635077P</code></li>
<li>Pass the disk directly to the Talos VM, where 511 is the VM ID, assuming you only have 1 disk already on scsi0: <code>qm set 511 -scsi1 /dev/disk/by-id/nvme-Samsung_SSD_970_EVO_Plus_2TB_S59CNM0W635077P</code></li>
<li>Checking the hardware tab on VM 511 in Proxmox, you should see this new disk. Double click it and make sure to check &ldquo;Advanced&rdquo;, &ldquo;Discard&rdquo;, and &ldquo;SSD emulation&rdquo;
<ul>
<li>If you see orange on these settings, you will need to shut down the VM, then power it back on for the changes to apply. Rebooting won&rsquo;t do it.</li>
</ul>
</li>
<li>Now that the disk has been added, look for it with talosctl: <code>talosctl get disks -n 10.0.50.31</code>
<ul>
<li>In my case I see a disk named <code>sdb</code> which is 2.0TB with model &ldquo;QEMU HARDDISK&rdquo;</li>
</ul>
</li>
<li>Mount the disk to be passed to containers with appropriate privileges. This is required for openebs-io-engine to access the extra disk.</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="c"># ./patches/mount-sdb.patch</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">machine</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">disks</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">device</span><span class="p">:</span><span class="w"> </span><span class="l">/dev/sdb</span><span class="w">
</span></span></span></code></pre></div><ul>
<li>Apply: <code>talosctl patch mc -n 10.0.50.31 --patch @patches/mount-sdb.patch</code> - At this point the Talos node will reboot and should come back up healthy in a minute.</li>
<li>View the console or check the dashboard with <code>talosctl dashboard -n 10.0.50.31</code></li>
<li>If you see an error about being unable to mount the disk or the partition being the wrong type, etc. you will need to wipe the disk and create a fresh GPT partition. As of Talos 1.9.0 this can be done with <code>talosctl wipe disk sdb -n 10.0.50.31</code>, otherwise you would need to do this outside of Talos.
<ul>
<li><code>talosctl wipe disk sdb -n 10.0.50.31</code> - where <code>sdb</code> is the device, you confirmed this right? Confirm using <code>talosctl get disks -n 10.0.50.31</code></li>
<li>Otherwise, shut down the VM and do this from the Proxmox host: <code>wipefs -a /dev/yourdev</code>, then use <code>fdisk /dev/yourdev</code> &gt; <code>g</code> &gt; <code>w</code> (<code>g</code> creates a new GPT table, <code>w</code> writes it to disk). Now power Talos back on and it should do its thing.</li>
<li>Yet another option would be to boot into a different Linux ISO on the VM and use a tool like Gparted. Whatever you like best.</li>
</ul>
</li>
</ul>
</li>
</ul>
<h3 id="lets-verify-our-disk-mount">Let&rsquo;s Verify Our Disk Mount</h3>
<p>When Talos successfully mounts the extra disk, we should see it listed with <code>lsblk</code> without any partitions. We want to pass the raw disk to OpenEBS. In order to check, run a debug pod on your storage node and check the bind mounts.</p>
<ul>
<li><code>kubectl debug node/talos-storage-1 -it --image=alpine -- /bin/sh</code></li>
<li><code>apk add lsblk</code></li>
<li><code>lsblk</code></li>
<li>Check for the mount, showing the full capacity of your disk.</li>
</ul>
<p>Now repeat this whole process for any other Talos nodes you need. I have 3, so I&rsquo;m doing <code>talos-storage-1</code>, <code>talos-storage-2</code> and <code>talos-storage-3</code>.</p>
<h2 id="worker-nodes-also-need-varlocal-mounted">Worker Nodes Also Need /var/local Mounted</h2>
<p>Certain OpenEBS components run on any node, and this requires all worker nodes to have <code>/var/local</code> mounted.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="c"># ./patches/mount-var-local.patch</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">machine</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">kubelet</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">extraMounts</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">destination</span><span class="p">:</span><span class="w"> </span><span class="l">/var/local</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">type</span><span class="p">:</span><span class="w"> </span><span class="l">bind</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">source</span><span class="p">:</span><span class="w"> </span><span class="l">/var/local</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">options</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="l">rbind</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="l">rshared</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="l">rw</span><span class="w">
</span></span></span></code></pre></div><p>In my case, I applied this to my 3 worker nodes. I don&rsquo;t think a reboot is required, but you could if you wanted to:</p>
<ul>
<li><code>talosctl patch mc -n 10.0.50.21 --patch @patches/mount-var-local.patch</code></li>
<li><code>talosctl patch mc -n 10.0.50.22 --patch @patches/mount-var-local.patch</code></li>
<li><code>talosctl patch mc -n 10.0.50.23 --patch @patches/mount-var-local.patch</code></li>
</ul>
<p>In order to check the other bind mount for <code>/var/local</code>, we have to wait until after deploying OpenEBS, because the mount isn&rsquo;t utilized until a pod is deployed with a HostPath volume at or below this path. Specifically, the <code>openebs-io-engine-*</code> DaemonSet maps to this path.</p>
<h1 id="installing-openebs">Installing OpenEBS</h1>
<p>This was a pain to figure out. Documentation from OpenEBS is lacking, and so is documentation from Talos on the same topic. Here&rsquo;s what I found to work. You need a privileged namespace, bind mounts on all worker nodes, then DiskPools before you can start testing PVCs.</p>
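<p>To set expectations for the DiskPool piece: a Mayastor DiskPool is a custom resource tying a node to a raw disk. A sketch, assuming the <code>openebs.io/v1beta2</code> API version (check the CRDs your chart version installs) and my node/disk names:</p>
<pre><code class="language-yaml"># Sketch of a DiskPool on the first storage node
apiVersion: openebs.io/v1beta2
kind: DiskPool
metadata:
  name: pool-talos-storage-1
  namespace: openebs
spec:
  node: talos-storage-1
  disks: ["/dev/sdb"]
</code></pre>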
<h2 id="privileged-namespace">Privileged Namespace</h2>
<p>OpenEBS requires privileges, and the easiest way to handle that is by making the namespace privileged (rather than messing with machine configs).</p>
<ul>
<li>Add a new privileged namespace. The Helm chart wants you to use <code>openebs</code> so do this:</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="c"># namespace.yaml</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Namespace</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">openebs</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">labels</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">pod-security.kubernetes.io/enforce</span><span class="p">:</span><span class="w"> </span><span class="l">privileged</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">pod-security.kubernetes.io/warn</span><span class="p">:</span><span class="w"> </span><span class="l">privileged</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">pod-security.kubernetes.io/audit</span><span class="p">:</span><span class="w"> </span><span class="l">privileged</span><span class="w">
</span></span></span></code></pre></div><ul>
<li><code>kubectl apply -f namespace.yaml</code></li>
</ul>
<h2 id="helm-installation">Helm Installation</h2>
<ul>
<li><code>helm repo add openebs https://openebs.github.io/openebs</code></li>
<li><code>helm repo update</code></li>
<li>Grab the values from the Helm chart (<code>helm show values openebs/openebs &gt; values.yaml</code>), or use this. I have already modified the config to disable initContainers which is a known issue with Talos, and disabled local provisioners that I&rsquo;m not interested in.</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="c"># values.yaml</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">openebs-crds</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">csi</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">volumeSnapshots</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">enabled</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">keep</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="c"># Refer to https://github.com/openebs/dynamic-localpv-provisioner/blob/v4.1.2/deploy/helm/charts/values.yaml for complete set of values.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">localpv-provisioner</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">rbac</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">create</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="c"># Refer to https://github.com/openebs/zfs-localpv/blob/v2.6.2/deploy/helm/charts/values.yaml for complete set of values.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">zfs-localpv</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">crds</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">zfsLocalPv</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">enabled</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">csi</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">volumeSnapshots</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">enabled</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="c"># Refer to https://github.com/openebs/lvm-localpv/blob/lvm-localpv-1.6.2/deploy/helm/charts/values.yaml for complete set of values.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">lvm-localpv</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">crds</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">lvmLocalPv</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">enabled</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">csi</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">volumeSnapshots</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">enabled</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="c"># Refer to https://github.com/openebs/mayastor-extensions/blob/v2.7.2/chart/values.yaml for complete set of values.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">mayastor</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">csi</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">node</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">initContainers</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">enabled</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">etcd</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c"># -- Kubernetes Cluster Domain</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">clusterDomain</span><span class="p">:</span><span class="w"> </span><span class="l">cluster.local</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">localpv-provisioner</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">enabled</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">crds</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">enabled</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="c"># -- Configuration options for pre-upgrade helm hook job.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">preUpgradeHook</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">image</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c"># -- The container image registry URL for the hook job</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">registry</span><span class="p">:</span><span class="w"> </span><span class="l">docker.io</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c"># -- The container repository for the hook job</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">repo</span><span class="p">:</span><span class="w"> </span><span class="l">bitnami/kubectl</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c"># -- The container image tag for the hook job</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">tag</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;1.25.15&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c"># -- The imagePullPolicy for the container</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">pullPolicy</span><span class="p">:</span><span class="w"> </span><span class="l">IfNotPresent</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">engines</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">local</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">lvm</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">enabled</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">zfs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">enabled</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">replicated</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">mayastor</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">enabled</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span></code></pre></div><ul>
<li><code>helm install openebs -n openebs openebs/openebs -f values.yaml</code></li>
<li>Verify: <code>kubectl get po -n openebs</code> and it should look something like this:</li>
</ul>
<pre tabindex="0"><code>NAME                                          READY   STATUS    RESTARTS      AGE
openebs-agent-core-74d4ddc7c5-hjnxl           2/2     Running   0             9m23s
openebs-agent-ha-node-f9bsb                   1/1     Running   0             9m23s
openebs-agent-ha-node-gjdbt                   1/1     Running   0             9m23s
openebs-agent-ha-node-mwjq9                   1/1     Running   0             9m23s
openebs-agent-ha-node-rfjrw                   1/1     Running   0             93s
openebs-api-rest-757d87d4bd-zd2ms             1/1     Running   0             9m23s
openebs-csi-controller-58c7dfcd5b-6jtcq       6/6     Running   0             9m23s
openebs-csi-node-fmmwg                        2/2     Running   0             9m23s
openebs-csi-node-j95f5                        2/2     Running   2 (60s ago)   93s
openebs-csi-node-jxkvq                        2/2     Running   0             9m23s
openebs-csi-node-xtsnt                        2/2     Running   0             9m23s
openebs-etcd-0                                1/1     Running   0             9m23s
openebs-etcd-1                                1/1     Running   0             9m23s
openebs-etcd-2                                1/1     Running   0             9m23s
openebs-io-engine-lb8zr                       2/2     Running   0             9m23s
openebs-localpv-provisioner-657c44878-wjmwr   1/1     Running   0             9m23s
openebs-loki-0                                1/1     Running   0             9m23s
openebs-nats-0                                3/3     Running   0             9m23s
openebs-nats-1                                3/3     Running   0             9m23s
openebs-nats-2                                3/3     Running   0             9m23s
openebs-obs-callhome-8665bb8f6f-4ntrd         2/2     Running   0             9m23s
openebs-operator-diskpool-6d44884f8f-52rrx    1/1     Running   0             9m23s
openebs-promtail-2g6kz                        1/1     Running   0             9m23s
openebs-promtail-d72cx                        1/1     Running   0             93s
openebs-promtail-hsxlc                        1/1     Running   0             9m23s
openebs-promtail-npw7f                        1/1     Running   0             9m23s
</code></pre><ul>
<li>If pods are stuck initializing after a few minutes, start by checking the logs of <code>openebs-etcd-0</code>, since many of the other components depend on etcd being up before they will initialize.
<ul>
<li>If the <code>/var/local</code> bind mounts were not added to the worker nodes, <code>openebs-etcd-*</code> will have Warning events about &ldquo;FailedMount&rdquo;, stating that a PVC doesn&rsquo;t exist. Look closely at the path: if it starts with <code>/var/local/...</code>, you&rsquo;re missing the <code>/var/local</code> bind mount on one or more Talos worker/storage nodes.</li>
<li>The fix is to add the bind mount on the affected worker nodes (you can identify the node from the pod having issues) and then reboot them if needed. There&rsquo;s no need to change the OpenEBS deployment.</li>
<li>After fixing it, you might still see stale pods in an error state. It&rsquo;s safe to delete them; the Deployment or DaemonSet responsible for them will recreate them.</li>
</ul>
</li>
</ul>
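<p>To make that concrete, these are the commands I&rsquo;d start with (pod names will differ in your cluster):</p>
<pre tabindex="0"><code># Events on the etcd pod will show FailedMount warnings if the bind mount is missing
kubectl describe pod openebs-etcd-0 -n openebs

# Check etcd logs directly
kubectl logs openebs-etcd-0 -n openebs

# Or list FailedMount events across the whole namespace
kubectl get events -n openebs --field-selector reason=FailedMount
</code></pre>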
<h2 id="add-diskpools">Add DiskPool(s)</h2>
<p>I was excited at this point to test with a PVC, but then was confused about why it wouldn&rsquo;t provision anything. The Talos docs feel a bit sparse here and seem to imply that now is the time to test with a PVC, but if you pay close attention they only mention testing the local provisioner, which added to my confusion. It turns out you need to add DiskPools first, which makes sense in hindsight. If you&rsquo;ve ever used Longhorn, there is a similar configuration step after the initial install, so the system knows what disk capacity it has to work with.</p>
<ul>
<li>Earlier, we mounted that 2TB disk in the talos-storage-1 node. Now we&rsquo;ll use that for our first DiskPool</li>
<li>Get the disk ID by exec-ing into the openebs-io-engine pod
<ul>
<li>Identify one of the io-engine pods: <code>kubectl get po -l app=io-engine -n openebs</code></li>
<li>Exec into the pod: <code>kubectl exec -it openebs-io-engine-jpnrh -c io-engine -n openebs -- /bin/sh</code></li>
<li><code>ls -lh /dev/disk/by-id/</code> - grab the one pointing to <code>/dev/sdb</code> in our case, which for me is <code>scsi-0QEMU_QEMU_HARDDISK_drive-scsi1</code></li>
</ul>
</li>
<li>FYI I went with uring instead of aio since it&rsquo;s the new kid on the block</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="c"># diskpool-1.yaml</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;openebs.io/v1beta2&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">DiskPool</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">pool-1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">namespace</span><span class="p">:</span><span class="w"> </span><span class="l">openebs</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">node</span><span class="p">:</span><span class="w"> </span><span class="l">talos-storage-1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">disks</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">&#34;uring:///dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi1&#34;</span><span class="p">]</span><span class="w">
</span></span></span></code></pre></div><ul>
<li><code>kubectl apply -f diskpool-1.yaml</code></li>
<li>Verify: <code>kubectl get dsp -n openebs</code> - within a few seconds it should show a STATE of Created and a POOL_STATUS of Online</li>
</ul>
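<p>As a side note, the CAPACITY and AVAILABLE columns from <code>kubectl get dsp</code> are in bytes. A quick conversion is a handy sanity check that the full disk was picked up (the value below is what my 2TB pool reported):</p>
<pre tabindex="0"><code># Convert the reported pool capacity from bytes to TiB and TB
capacity=1998443249664
awk -v b="$capacity" 'BEGIN { printf "%.2f TiB (%.2f TB)\n", b/1024^4, b/1e12 }'
# prints 1.82 TiB (2.00 TB)
</code></pre>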
<p>Repeat this process for any/all storage nodes you have. Since I&rsquo;m virtualizing Talos, the disk path is exactly the same on all 3 nodes so I can reuse the config, just updating the pool name and the node name.</p>
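<p>For example, my second node&rsquo;s manifest differs from <code>diskpool-1.yaml</code> in only those two fields (the names here match my setup; adjust for yours):</p>
<pre tabindex="0"><code># diskpool-2.yaml
apiVersion: "openebs.io/v1beta2"
kind: DiskPool
metadata:
  name: pool-2
  namespace: openebs
spec:
  node: talos-storage-2
  disks: ["uring:///dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi1"]
</code></pre>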
<h3 id="troubleshooting">Troubleshooting</h3>
<p>Sorry I can&rsquo;t be a ton of help here, since I&rsquo;ve only done limited troubleshooting myself. If you run into diskpools stuck in the Creating state, start by describing the dsp and checking the io-engine logs.</p>
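<p>Concretely, that looks something like this (the pool name and labels here match this setup):</p>
<pre tabindex="0"><code># Inspect the stuck diskpool; the events near the bottom usually name the problem
kubectl describe dsp pool-1 -n openebs

# Check the io-engine logs, since the io-engine does the actual disk work
kubectl logs -l app=io-engine -c io-engine -n openebs
</code></pre>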
<p>What I can say is that I&rsquo;m running a homelab, which means I have four different disks from three different manufacturers. Three of them worked in this configuration, but one did not. The disk itself is fine and fairly new, and I tried wiping it multiple times, multiple ways, but it just would not work with the diskpool. I tried using it as a bind mount at /mnt/local/nvme2tb (this approach requires adding the volume to the io-engine DaemonSet), mounting /dev/sdb, and mounting /dev/sdb1, everything I could think of, but the pool would not create. I rebuilt the Talos node but got the same results. I switched to a different disk, changed nothing else, and it works totally fine. For posterity, these are the disks I have and how each fared in this configuration.</p>
<ul>
<li>Samsung 970 EVO Plus 2TB - no problems</li>
<li>Samsung 990 EVO 2TB - no problems</li>
<li>WD BLACK SN770 2TB - no problems</li>
<li>Crucial P3 Plus 2TB (CT2000P3PSSD8) - COULD NOT GET THIS WORKING :(</li>
</ul>
<p>If you are reading this and you know or think you might know why this didn&rsquo;t work, please reach out! I&rsquo;m interested in understanding why this wouldn&rsquo;t work and how I could troubleshoot better.</p>
<h2 id="testing-a-replicated-pvc">Testing A Replicated PVC</h2>
<p>If you are here, you have at least one working diskpool and are ready to test that PVC provisioning works and can be attached to a running pod. Let&rsquo;s test that.</p>
<ul>
<li>Verify diskpools: <code>kubectl get dsp -n openebs</code> - for my 3 storage nodes with 2TB volumes, I&rsquo;m seeing this</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sh" data-lang="sh"><span class="line"><span class="cl">NAME     NODE              STATE     POOL_STATUS   CAPACITY        USED   AVAILABLE
</span></span><span class="line"><span class="cl">pool-1   talos-storage-1   Created   Online        <span class="m">1998443249664</span>   <span class="m">0</span>      <span class="m">1998443249664</span>
</span></span><span class="line"><span class="cl">pool-2   talos-storage-2   Created   Online        <span class="m">1998443249664</span>   <span class="m">0</span>      <span class="m">1998443249664</span>
</span></span><span class="line"><span class="cl">pool-3   talos-storage-3   Created   Online        <span class="m">1998443249664</span>   <span class="m">0</span>      <span class="m">1998443249664</span>
</span></span></code></pre></div><ul>
<li>Check what storage classes are available: <code>kubectl get sc</code></li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sh" data-lang="sh"><span class="line"><span class="cl">NAME                     PROVISIONER               RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
</span></span><span class="line"><span class="cl">mayastor-etcd-localpv    openebs.io/local          Delete          WaitForFirstConsumer   <span class="nb">false</span>                  6h15m
</span></span><span class="line"><span class="cl">mayastor-loki-localpv    openebs.io/local          Delete          WaitForFirstConsumer   <span class="nb">false</span>                  6h15m
</span></span><span class="line"><span class="cl">openebs-hostpath         openebs.io/local          Delete          WaitForFirstConsumer   <span class="nb">false</span>                  6h15m
</span></span><span class="line"><span class="cl">openebs-single-replica   io.openebs.csi-mayastor   Delete          Immediate              <span class="nb">true</span>                   6h15m
</span></span></code></pre></div><ul>
<li>For now, we&rsquo;re interested in testing that <code>openebs-single-replica</code> SC that uses Mayastor, so write this file. Note that this uses the default namespace:</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="c"># test-pvc-openebs-single-replica.yaml</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nn">---</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">PersistentVolumeClaim</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">openebs-testpvc</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">storageClassName</span><span class="p">:</span><span class="w"> </span><span class="l">openebs-single-replica</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">accessModes</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">ReadWriteOnce</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">resources</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">requests</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">storage</span><span class="p">:</span><span class="w"> </span><span class="l">10Gi</span><span class="w">
</span></span></span></code></pre></div><ul>
<li>Apply it: <code>kubectl apply -f test-pvc-openebs-single-replica.yaml</code></li>
<li>Check PV and PVC:
<ul>
<li><code>kubectl get pv</code></li>
<li><code>kubectl get pvc</code> - you should see a PVC named <code>openebs-testpvc</code> with status Bound and storage class <code>openebs-single-replica</code></li>
</ul>
</li>
<li>Deploy a test pod to attach the PVC to - this pod is also in the default namespace:</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="c"># pod-using-testpvc.yaml</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nn">---</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Pod</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">testlogger</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">containers</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">testlogger</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">image</span><span class="p">:</span><span class="w"> </span><span class="l">alpine:3.20</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">command</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">&#34;/bin/ash&#34;</span><span class="p">]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">args</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">&#34;-c&#34;</span><span class="p">,</span><span class="w"> </span><span class="s2">&#34;while true; do echo \&#34;$(date) - test log\&#34; &gt;&gt; /mnt/test.log &amp;&amp; sleep 1; done&#34;</span><span class="p">]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">volumeMounts</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">testvol</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">mountPath</span><span class="p">:</span><span class="w"> </span><span class="l">/mnt</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">volumes</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">testvol</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">persistentVolumeClaim</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">claimName</span><span class="p">:</span><span class="w"> </span><span class="l">openebs-testpvc</span><span class="w">
</span></span></span></code></pre></div><ul>
<li><code>kubectl apply -f pod-using-testpvc.yaml</code></li>
<li>Verify it&rsquo;s running: <code>kubectl get po testlogger</code></li>
<li>Exec into the test pod: <code>kubectl exec -it testlogger -- /bin/sh</code></li>
<li>Look at your mounts with <code>df -h /mnt</code>. Since OpenEBS Mayastor uses NVMe-oF, you should see the mount path <code>/mnt</code> backed by what looks like an NVMe block device.</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sh" data-lang="sh"><span class="line"><span class="cl">Filesystem                Size      Used Available Use% Mounted on
</span></span><span class="line"><span class="cl">/dev/nvme0n1              9.7G     28.0K      9.2G   0% /mnt
</span></span></code></pre></div><ul>
<li>Hmm. The size doesn&rsquo;t quite add up: with 9.7G total and only 28.0K used, you&rsquo;d expect more than 9.2G available. Hmm&hellip;</li>
<li>Cleanup:
<ul>
<li><code>kubectl delete -f pod-using-testpvc.yaml</code></li>
<li><code>kubectl delete -f test-pvc-openebs-single-replica.yaml</code></li>
</ul>
</li>
</ul>
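<p>The odd <code>df</code> numbers from the test pod are worth a quick calculation. My best guess (an assumption, not something I verified) is ext4&rsquo;s default reserved blocks: mke2fs reserves 5% of an ext4 filesystem for root, and <code>df</code> subtracts that from Available while still counting it in the total. If the volume was formatted as ext4, the numbers line up:</p>
<pre tabindex="0"><code># Reserved fraction implied by the df output: (total - used - available) / total
awk 'BEGIN { total=9.7; used=28/1024/1024; avail=9.2; printf "%.1f%%\n", (total-used-avail)/total*100 }'
# prints 5.2%, i.e. roughly the 5% ext4 reserves by default
</code></pre>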
<h2 id="what-is-a-single-replica-anyway">What Is A &ldquo;Single&rdquo; Replica Anyway?</h2>
<blockquote>
<p>A replica is an exact reproduction of something&hellip;</p>
</blockquote>
<p>A replica by definition cannot exist without copying something that already exists, which inherently means there must be at least 2. In the context of OpenEBS, a &ldquo;single replica&rdquo; just means you have only ONE copy of the data, which gets randomly assigned to one of the available diskpools. If that disk fails, the data is gone. But we are using OpenEBS for the purpose of replication, so how do we get more replicas??? Follow me.</p>
<ul>
<li>Create a new storage class with 2 replicas (feel free to do 3 or any value at this point):</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="c"># openebs-2-replicas-sc.yaml</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nn">---</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">storage.k8s.io/v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">StorageClass</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">openebs-2-replicas</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">parameters</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">protocol</span><span class="p">:</span><span class="w"> </span><span class="l">nvmf</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">repl</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;2&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">provisioner</span><span class="p">:</span><span class="w"> </span><span class="l">io.openebs.csi-mayastor</span><span class="w">
</span></span></span></code></pre></div><ul>
<li><code>kubectl apply -f openebs-2-replicas-sc.yaml</code></li>
<li>Verify: <code>kubectl get sc</code></li>
<li>Test - deploy another test-pvc using the new SC:</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="c"># test-pvc-openebs-2-replicas.yaml</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nn">---</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">PersistentVolumeClaim</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">openebs-testpvc</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">storageClassName</span><span class="p">:</span><span class="w"> </span><span class="l">openebs-2-replicas</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">accessModes</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">ReadWriteOnce</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">resources</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">requests</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">storage</span><span class="p">:</span><span class="w"> </span><span class="l">10Gi</span><span class="w">
</span></span></span></code></pre></div><ul>
<li><code>kubectl apply -f test-pvc-openebs-2-replicas.yaml</code></li>
<li>Test with a pod, using the same pod from earlier - <code>kubectl apply -f pod-using-testpvc.yaml</code></li>
<li>Exec into the test pod: <code>kubectl exec -it testlogger -- /bin/sh</code></li>
<li>Check your mounted disk: <code>df -h /mnt</code></li>
</ul>
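<p>One last hedged check: since <code>kubectl get dsp</code> reports per-pool usage, re-running it after the 2-replica PVC binds should (assuming replicas reserve their capacity up front) show non-zero USED on two different pools, one per replica:</p>
<pre tabindex="0"><code>kubectl get dsp -n openebs
# Expect USED &gt; 0 on two of the pools with repl=2
</code></pre>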
<h1 id="conclusion">Conclusion</h1>
<p>I&rsquo;m going to stop here. I didn&rsquo;t go into any detail about how to check which diskpools hold the replica(s) for a given PV, but I assume there is a way to do that. I also did not look at how to recover from a diskpool or storage node failure. Assuming you had 3 replicas and one went down, there should be no data loss.</p>
<p>I didn&rsquo;t cover performance, monitoring, recovery, or anything else that you probably care about long term. That could be a future post, but my next stop is actually evaluating Longhorn with the v2 engine. As of today, they have released 1.8.0-rc5, which enables support for their v2 engine with Talos (this just means they support NVMe-oF). If Longhorn now works with NVMe-oF and Talos, to me that is a much more mature and feature-rich product, with more community support than OpenEBS. I believe it also supports snapshots and other features that OpenEBS does not currently offer.</p>
<p>My next post will be all about blowing this setup away and doing it all over with Longhorn. Hopefully by then the stable 1.8.0 will have been released.</p>
]]></content:encoded>
    </item>
    
    <item>
      <title>Upgrade Talos Linux and Kubernetes</title>
      <link>https://blog.dalydays.com/post/kubernetes-talos-upgrades/</link>
      <pubDate>Tue, 21 Jan 2025 00:00:00 +0000</pubDate>
      
      <guid>https://blog.dalydays.com/post/kubernetes-talos-upgrades/</guid>
      <description>A quick rundown of Talos OS upgrades and Kubernetes version upgrades</description>
      <content:encoded><![CDATA[<h1 id="upgrade-kubernetes-version">Upgrade Kubernetes Version</h1>
<p><a href="https://www.talos.dev/v1.9/kubernetes-guides/upgrading-kubernetes/">https://www.talos.dev/v1.9/kubernetes-guides/upgrading-kubernetes/</a></p>
<blockquote>
<p>Upgrading Kubernetes is non-disruptive to the cluster workloads.</p>
</blockquote>
<p>You can do this live, assuming you don&rsquo;t have single-replica workloads that are node-specific.</p>
<p>Today I will be upgrading to Kubernetes version <code>v1.31.5</code>. I&rsquo;m currently on <code>v1.30.0</code>, but I want to make sure I&rsquo;m running the same version being tested on the CKA exam I&rsquo;m studying for, which is currently 1.31.</p>
<p>Talos recommends using the <code>talosctl upgrade-k8s</code> command, which automatically upgrades the entire cluster and has built-in safety checks. They explain how to do it manually, but I chose Talos Linux partly for the ease of ongoing maintenance and upgrades, so I will be using the easy button here!</p>
<ul>
<li>Check current version: <code>kubectl get node</code></li>
<li>Upgrade: <code>talosctl -n 10.0.50.11 upgrade-k8s --to 1.31.5</code>
<ul>
<li>You only need to specify a single control plane node, but this will upgrade the whole cluster</li>
<li>You need to choose a real Kubernetes version - <a href="https://kubernetes.io/releases/">https://kubernetes.io/releases/</a></li>
<li>This will take a while, so try to be patient.</li>
</ul>
</li>
<li>Verify version: <code>kubectl get node</code></li>
</ul>
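<p>Since <code>upgrade-k8s</code> expects a real release number, a quick sanity check on the version string before kicking off a long cluster-wide upgrade can be sketched like this (the node IP and target version are the ones from my cluster):</p>

```shell
CP_NODE="10.0.50.11"   # any single control plane node
K8S_VERSION="1.31.5"   # must be a real release from kubernetes.io/releases

# Guard against common typos like "v1.31.5" or "1.31" before starting the upgrade
case "${K8S_VERSION}" in
  [0-9]*.[0-9]*.[0-9]*)
    CMD="talosctl -n ${CP_NODE} upgrade-k8s --to ${K8S_VERSION}"
    echo "${CMD}"
    ;;
  *)
    echo "Not a plain x.y.z version: ${K8S_VERSION}" >&2
    exit 1
    ;;
esac
```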
<p>I like to update my local talosconfig repo, which was used to deploy the original Talos cluster and also includes the secrets needed to recover from any problems. This is a good time to update the Kubernetes version in controlplane.yaml and worker.yaml for any new nodes you deploy.</p>
<h1 id="upgrade-talos-os">Upgrade Talos OS</h1>
<p><a href="https://www.talos.dev/v1.9/talos-guides/upgrading-talos/">https://www.talos.dev/v1.9/talos-guides/upgrading-talos/</a></p>
<p>The Talos team recommends using the same version of <code>talosctl</code> that your nodes are running. You will then upgrade <code>talosctl</code> after the node upgrades are complete.</p>
<p>Be sure to upgrade one node at a time and check that it&rsquo;s healthy before moving on. You can blast through them using a for loop, or do them by hand. Just don&rsquo;t do them all at the same time :)</p>
<ul>
<li>Check versions:
<ul>
<li>Client: <code>talosctl version --client</code></li>
<li>Server: <code>kubectl get node -o wide</code> (OS-IMAGE column)</li>
</ul>
</li>
<li>Get a new Talos OS image from the factory: <a href="https://factory.talos.dev">https://factory.talos.dev</a>
<ul>
<li>Make sure to add any existing extensions you&rsquo;re using, such as <code>iscsi-tools</code></li>
<li>Copy the image string under the &ldquo;Upgrading Talos Image&rdquo; header. In my case this looks like <code>factory.talos.dev/installer/88d1f7a5c4f1d3aba7df787c448c1d3d008ed29cfb34af53fa0df4336a56040b:v1.9.2</code></li>
</ul>
</li>
<li>Upgrade one node: <code>talosctl upgrade -n 10.0.50.11 --image factory.talos.dev/installer/88d1f7a5c4f1d3aba7df787c448c1d3d008ed29cfb34af53fa0df4336a56040b:v1.9.2 --preserve</code>
<ul>
<li><code>-n</code>: Specify the node to upgrade</li>
<li><code>--image</code>: Specify the factory image to use</li>
<li><code>--preserve</code>: Don&rsquo;t wipe extraMounts if applicable. I default to using this unless I have a specific reason to wipe additional mounts.</li>
</ul>
</li>
<li>Repeat the upgrade command for each node, one at a time, until all nodes have been upgraded.</li>
</ul>
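<p>The image reference is just the factory schematic ID plus the target Talos version, so it can be assembled once and reused for every node. A sketch using the schematic ID and version from my cluster (yours will differ):</p>

```shell
# The schematic ID comes from factory.talos.dev and encodes your extensions
SCHEMATIC_ID="88d1f7a5c4f1d3aba7df787c448c1d3d008ed29cfb34af53fa0df4336a56040b"
TALOS_VERSION="v1.9.2"
IMAGE="factory.talos.dev/installer/${SCHEMATIC_ID}:${TALOS_VERSION}"
echo "${IMAGE}"
# Then, per node: talosctl upgrade -n <node-ip> --image "${IMAGE}" --preserve
```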
<p>In my homelab, I am comfortable blasting through upgrades with a for loop, so my upgrade command looks like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="k">for</span> node in <span class="m">11</span> <span class="m">12</span> <span class="m">13</span> <span class="m">21</span> <span class="m">22</span> <span class="m">23</span> <span class="m">31</span> <span class="m">32</span> 33<span class="p">;</span> <span class="k">do</span> talosctl upgrade -n 10.0.50.<span class="nv">$node</span> --image factory.talos.dev/installer/88d1f7a5c4f1d3aba7df787c448c1d3d008ed29cfb34af53fa0df4336a56040b:v1.9.2 --preserve<span class="p">;</span> <span class="k">done</span>
</span></span></code></pre></div><p>Once all nodes have been upgraded, upgrade the <code>talosctl</code> client so its version matches the Talos node version.</p>
<ul>
<li><code>rm /usr/local/bin/talosctl</code></li>
<li><code>curl -sL https://talos.dev/install | sh</code> (this gets the latest version)
<ul>
<li>You can also download a specific release from <a href="https://github.com/siderolabs/talos/releases">https://github.com/siderolabs/talos/releases</a>, e.g. <code>curl -LJO https://github.com/siderolabs/talos/releases/download/v1.8.3/talosctl-linux-amd64</code></li>
</ul>
</li>
<li>Verify: <code>talosctl version --client</code></li>
</ul>
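<p>If you prefer to pin <code>talosctl</code> to the exact version your nodes are now running instead of grabbing the latest, the release URL can be assembled like this (the version and architecture here are assumptions for my setup):</p>

```shell
TALOSCTL_VERSION="v1.9.2"  # match the Talos version running on your nodes
OS="linux"
ARCH="amd64"
URL="https://github.com/siderolabs/talos/releases/download/${TALOSCTL_VERSION}/talosctl-${OS}-${ARCH}"
echo "${URL}"
# Then: curl -sLo /usr/local/bin/talosctl "${URL}" && chmod +x /usr/local/bin/talosctl
```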
<p>I like to update my local talosconfig repo, which was used to deploy the original Talos cluster and also includes the secrets needed to recover from any problems. This is a good time to update the factory image in controlplane.yaml and worker.yaml for any new nodes you deploy.</p>
<p>That&rsquo;s it!</p>
]]></content:encoded>
    </item>
    
    <item>
      <title>Kubernetes - External Services And Ingress</title>
      <link>https://blog.dalydays.com/post/kubernetes-ingress-to-external-service/</link>
      <pubDate>Sun, 08 Dec 2024 00:00:00 +0000</pubDate>
      
      <guid>https://blog.dalydays.com/post/kubernetes-ingress-to-external-service/</guid>
      <description>A quick overview of using Ingress to proxy to a service that doesn&amp;rsquo;t live in the Kubernetes cluster.</description>
      <content:encoded><![CDATA[<h1 id="about-external-services">About External Services</h1>
<p>External services are anything you want to route traffic to that does not live in your Kubernetes cluster. For example, you might be running MinIO in a VM and accessing it via IP address, but you want a reverse proxy in front of it. You can use Kubernetes Ingress as a reverse proxy to pretty much anything, even if it doesn&rsquo;t live within your cluster.</p>
<p>It&rsquo;s pretty straightforward to do this; you just need to create a few resources: <code>Service</code>, <code>EndpointSlice</code>, and <code>Ingress</code>. For the sake of organization, I like to keep external service resources in a dedicated namespace. If you&rsquo;re doing this, just create a new namespace as your first step.</p>
<h1 id="minio-example">MinIO Example</h1>
<p>I have MinIO running outside my Kubernetes cluster. For now, I basically just want a reverse proxy in front of it, but I want to use Kubernetes Ingress since I already have it running and configured to get TLS certificates from Let&rsquo;s Encrypt.</p>
<ul>
<li>Optionally, create a new <code>Namespace</code>. I&rsquo;m using &ldquo;externalservices&rdquo;
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Namespace</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">externalservices</span><span class="w">
</span></span></span></code></pre></div></li>
<li>Create an <code>EndpointSlice</code>. This object is the newer replacement for <code>Endpoints</code>, which is now managed automatically, so it&rsquo;s no longer recommended to create <code>Endpoints</code> directly. <code>EndpointSlice</code> requires a name and a label that matches the service name you will be using. The label&rsquo;s key is <code>kubernetes.io/service-name</code>. <code>namespace</code> is optional but recommended. Then you just specify the address type (IPv4 or IPv6), ports, and the actual endpoint with the IP address. Here&rsquo;s an example of my EndpointSlice manifest where MinIO is listening at 192.168.1.35 on port 80. I&rsquo;m using a namespace called &ldquo;externalservices&rdquo; and the service name is minio-service.
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">discovery.k8s.io/v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">EndpointSlice</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">minio-service</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">namespace</span><span class="p">:</span><span class="w"> </span><span class="l">externalservices</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">labels</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">kubernetes.io/service-name</span><span class="p">:</span><span class="w"> </span><span class="l">minio-service</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">addressType</span><span class="p">:</span><span class="w"> </span><span class="l">IPv4</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">ports</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="m">80</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">endpoints</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">addresses</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="s2">&#34;192.168.1.35&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">conditions</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">ready</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span></code></pre></div></li>
<li>Next, create a ClusterIP <code>Service</code> without a selector. Omitting the selector allows the service to attach directly to an EndpointSlice that you create, instead of only being able to attach to pods. The service finds the EndpointSlice via the <code>kubernetes.io/service-name</code> label in the EndpointSlice manifest, which has to exactly match the name of this service (in this case minio-service). This service just maps port 80 to targetPort 80; HTTP to HTTPS redirection and TLS termination will be handled in the Ingress resource.
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Service</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">minio-service</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">namespace</span><span class="p">:</span><span class="w"> </span><span class="l">externalservices</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">type</span><span class="p">:</span><span class="w"> </span><span class="l">ClusterIP</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">ports</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="m">80</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">protocol</span><span class="p">:</span><span class="w"> </span><span class="l">TCP</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">targetPort</span><span class="p">:</span><span class="w"> </span><span class="m">80</span><span class="w">
</span></span></span></code></pre></div></li>
<li>Finally, create an <code>Ingress</code>. Here&rsquo;s a basic example using a subdomain, a Let&rsquo;s Encrypt ClusterIssuer, and traefik as the ingress class.
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">networking.k8s.io/v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Ingress</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">minio-service-ingress</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">namespace</span><span class="p">:</span><span class="w"> </span><span class="l">externalservices</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">annotations</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">cert-manager.io/cluster-issuer</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;letsencrypt-staging&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">traefik.ingress.kubernetes.io/router.entrypoints</span><span class="p">:</span><span class="w"> </span><span class="l">websecure</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">ingressClassName</span><span class="p">:</span><span class="w"> </span><span class="l">traefik</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">tls</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">hosts</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">minio.example.com</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">secretName</span><span class="p">:</span><span class="w"> </span><span class="l">tls-example-com</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">rules</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">host</span><span class="p">:</span><span class="w"> </span><span class="l">minio.example.com</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">http</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">paths</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">backend</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">service</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">minio-service</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">port</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">              </span><span class="nt">number</span><span class="p">:</span><span class="w"> </span><span class="m">80</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">path</span><span class="p">:</span><span class="w"> </span><span class="l">/</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">pathType</span><span class="p">:</span><span class="w"> </span><span class="l">Prefix</span><span class="w">
</span></span></span></code></pre></div></li>
</ul>
<p>TL;DR: put it all together into a single manifest:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Namespace</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">externalservices</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nn">---</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">discovery.k8s.io/v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">EndpointSlice</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">minio-service</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">namespace</span><span class="p">:</span><span class="w"> </span><span class="l">externalservices</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">labels</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">kubernetes.io/service-name</span><span class="p">:</span><span class="w"> </span><span class="l">minio-service</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">addressType</span><span class="p">:</span><span class="w"> </span><span class="l">IPv4</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">ports</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="m">80</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">endpoints</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">addresses</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="s2">&#34;192.168.1.35&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">conditions</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">ready</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nn">---</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Service</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">minio-service</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">namespace</span><span class="p">:</span><span class="w"> </span><span class="l">externalservices</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">type</span><span class="p">:</span><span class="w"> </span><span class="l">ClusterIP</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">ports</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="m">80</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">protocol</span><span class="p">:</span><span class="w"> </span><span class="l">TCP</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">targetPort</span><span class="p">:</span><span class="w"> </span><span class="m">80</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nn">---</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">networking.k8s.io/v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Ingress</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">minio-service-ingress</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">namespace</span><span class="p">:</span><span class="w"> </span><span class="l">externalservices</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">annotations</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">cert-manager.io/cluster-issuer</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;letsencrypt-staging&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">traefik.ingress.kubernetes.io/router.entrypoints</span><span class="p">:</span><span class="w"> </span><span class="l">websecure</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">ingressClassName</span><span class="p">:</span><span class="w"> </span><span class="l">traefik</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">tls</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">hosts</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">minio.example.com</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">secretName</span><span class="p">:</span><span class="w"> </span><span class="l">tls-example-com</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">rules</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">host</span><span class="p">:</span><span class="w"> </span><span class="l">minio.example.com</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">http</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">paths</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">backend</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">service</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">minio-service</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">port</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">              </span><span class="nt">number</span><span class="p">:</span><span class="w"> </span><span class="m">80</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">path</span><span class="p">:</span><span class="w"> </span><span class="l">/</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">pathType</span><span class="p">:</span><span class="w"> </span><span class="l">Prefix</span><span class="w">
</span></span></span></code></pre></div><ul>
<li>Apply <code>kubectl apply -f minio-external-service.yaml</code></li>
<li>Shortly after deploying the resources, update DNS or your local hosts file so minio.example.com points at your ingress. It may take a bit for the issuer to generate a signed certificate, but you should still be able to route to your endpoint through the Ingress and verify that it&rsquo;s working.</li>
</ul>
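<p>Before DNS has propagated, you can test the Ingress directly with curl&rsquo;s <code>--resolve</code> flag, which pins the hostname to an IP for that one request. A sketch; the hostname matches the example manifests above, and the ingress IP is a placeholder you would replace with your ingress controller&rsquo;s external IP:</p>

```shell
HOST="minio.example.com"    # hostname from the Ingress rule above
INGRESS_IP="203.0.113.10"   # placeholder: your ingress controller's external IP
# --resolve pins the hostname to the ingress IP without touching DNS;
# -k skips certificate verification while the staging issuer is in use
CMD="curl -vk --resolve ${HOST}:443:${INGRESS_IP} https://${HOST}/"
echo "${CMD}"
```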
<p>That&rsquo;s it. That&rsquo;s the end. You are now done.</p>
]]></content:encoded>
    </item>
    
    <item>
      <title>Kubernetes Homelab Series Part 7 - Backups With Velero</title>
      <link>https://blog.dalydays.com/post/kubernetes-homelab-series-part-7-backups-with-velero/</link>
      <pubDate>Sat, 30 Nov 2024 00:00:00 +0000</pubDate>
      
      <guid>https://blog.dalydays.com/post/kubernetes-homelab-series-part-7-backups-with-velero/</guid>
      <description>A look into backing up etcd in a Talos Linux cluster, along with full cluster resource backups using Velero</description>
      <content:encoded><![CDATA[<h1 id="what-do-you-back-up-in-kubernetes">What Do You Back Up In Kubernetes?</h1>
<p>It&rsquo;s easy to gloss over backups, especially when you are setting up a new cluster, finally getting stuff running on it, and feeling excited that you can now deploy a service and have certificates, ingress, and everything else that used to be a manual pain handled automatically. But as soon as something fails, you will wish you had spent a little more time not only thinking about backups, but also documenting step by step how to recover from a disaster.</p>
<p>In Kubernetes, we have two main things to back up: persistent data, and the state of the cluster resources.</p>
<p>With Velero, backups in Kubernetes are EASY. Just do it now, before you put stuff on the cluster, and write down the steps to recover. Why not?</p>
<h1 id="etcd-backups-in-talos-linux">etcd Backups In Talos Linux</h1>
<p>Before we dive into Velero, I want to mention that theoretically, you could just use Velero for backups and never worry about etcd. However, backing up etcd is a best practice, and since it&rsquo;s easy to do, there&rsquo;s no reason not to have another recovery point.</p>
<p>What is etcd (I think it&rsquo;s pronounced &ldquo;et-see-dee&rdquo;)? It&rsquo;s the database the control plane nodes use to keep track of any and all resources in the cluster, along with their status: every node, pod, deployment, secret, and so on. If you had a catastrophe and lost all your cluster resource state, or had something get corrupted, you should be able to restore the etcd database and get back to the exact state it was in when the backup was taken.</p>
<p>Side note: etcd tracks real-time data, so whatever storage the etcd cluster is running on (the actual disks the control plane nodes store this data on) needs to be pretty fast. If it&rsquo;s slow, you can run into issues where <code>kubectl</code> commands are very slow, or stuff stops working as expected in the cluster. Maybe this note belongs in the Talos Linux setup section, but here we are.</p>
<p>Talos Linux, unlike a traditional kubeadm install, does not run etcd as a static pod. Instead, it&rsquo;s hidden behind the Talos API, meaning you have to use <code>talosctl</code> to manage it: <a href="https://www.talos.dev/v1.8/advanced/etcd-maintenance/">https://www.talos.dev/v1.8/advanced/etcd-maintenance/</a></p>
<h2 id="how-to-back-up-etcd-in-talos">How To Back Up etcd In Talos</h2>
<p><a href="https://www.talos.dev/v1.8/advanced/disaster-recovery/#backup">https://www.talos.dev/v1.8/advanced/disaster-recovery/#backup</a></p>
<ul>
<li><code>talosctl -n &lt;IP&gt; etcd snapshot /backup_path/etcd.snapshot.$(/bin/date +%Y%m%d)</code>
<ul>
<li>You need to pass any one of the control plane node IPs to the <code>-n</code> flag</li>
</ul>
</li>
</ul>
<p>You could automate this somewhere so you have regular backups. Highly recommended! If you do a bash script, you&rsquo;ll need to pass the <code>--talosconfig /path/to/talosconfig</code> option.</p>
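<p>Here&rsquo;s a minimal sketch of what that automation could look like as a nightly snapshot script with simple retention. The node IP, talosconfig path, and backup directory are assumptions to adjust for your environment:</p>

```shell
#!/usr/bin/env bash
# Sketch: nightly Talos etcd snapshot with basic retention.
# NODE_IP, TALOSCONFIG, and BACKUP_DIR are placeholders for your environment.
set -euo pipefail

TALOSCONFIG=${TALOSCONFIG:-$HOME/.talos/config}
NODE_IP=${NODE_IP:-10.0.50.11}              # any control plane node IP
BACKUP_DIR=${BACKUP_DIR:-/tmp/etcd-backups} # use real backup storage here
RETENTION_DAYS=30

mkdir -p "$BACKUP_DIR"
SNAPSHOT="$BACKUP_DIR/etcd.snapshot.$(date +%Y%m%d)"

# Take the snapshot (skipped gracefully if talosctl isn't on the PATH)
if command -v talosctl >/dev/null 2>&1; then
  talosctl --talosconfig "$TALOSCONFIG" -n "$NODE_IP" etcd snapshot "$SNAPSHOT"
fi

# Drop snapshots older than the retention window
find "$BACKUP_DIR" -name 'etcd.snapshot.*' -mtime +"$RETENTION_DAYS" -delete
echo "snapshot target: $SNAPSHOT"
```

<p>A crontab entry like <code>0 3 * * * /usr/local/bin/etcd-snapshot.sh</code> would run it nightly at 3am.</p>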
<p>If you&rsquo;re currently in a Disaster Recovery scenario and the snapshot command is failing, copy the etcd database directly: <a href="https://www.talos.dev/v1.8/advanced/disaster-recovery/#disaster-database-snapshot">https://www.talos.dev/v1.8/advanced/disaster-recovery/#disaster-database-snapshot</a></p>
<ul>
<li><code>talosctl -n &lt;IP&gt; cp /var/lib/etcd/member/snap/db .</code></li>
</ul>
<h2 id="how-to-recover-etcd-in-talos">How To Recover etcd In Talos</h2>
<p><a href="https://www.talos.dev/v1.8/advanced/disaster-recovery/#recovery">https://www.talos.dev/v1.8/advanced/disaster-recovery/#recovery</a></p>
<p>I haven&rsquo;t done this process yet (and hopefully I don&rsquo;t need it). Follow the official guide.</p>
<h1 id="whats-velero">What&rsquo;s Velero?</h1>
<p><a href="https://velero.io/">https://velero.io/</a></p>
<p>Velero is an open source backup and disaster recovery tool for Kubernetes. It backs up all cluster resources (or whatever subset you specify) and can be used for backup, restore, disaster recovery, and migration to new clusters. It&rsquo;s also capable of backing up PVs. Backups are stored on external endpoints (mostly S3-compatible services, including MinIO), and backup/restore is managed through CRDs. There are two components: the server-side cluster resources and a client-side utility, <code>velero</code>.</p>
<p>Don&rsquo;t trust their blog to be up to date because the current release according to the blog is 1.11, released April 26, 2023, although I&rsquo;m currently running 1.15 which was released November 5, 2024.</p>
<p>Get an accurate changelog right from the source: <a href="https://github.com/vmware-tanzu/velero/releases">https://github.com/vmware-tanzu/velero/releases</a></p>
<h1 id="velero-installationsetup">Velero Installation/Setup</h1>
<p>I&rsquo;ll walk through installing Velero with some options you might want to consider. Then I&rsquo;ll dive into running a backup and restore manually, scheduling backups, and backing up persistent volumes with Kopia.</p>
<h2 id="prerequisites">Prerequisites</h2>
<ul>
<li>Access to a Kubernetes cluster, v1.16 or later, with DNS and container networking enabled. For more information on supported Kubernetes versions, see the Velero <a href="https://github.com/vmware-tanzu/velero#velero-compatibility-matrix">compatibility matrix</a>.</li>
<li><code>kubectl</code> installed locally</li>
<li>Object storage. I&rsquo;m running MinIO in my lab. You could use MinIO, Ceph, AWS S3, Backblaze B2 (S3 API), or any other object storage you prefer.</li>
</ul>
<h2 id="installing-velero---client-side">Installing Velero - Client Side</h2>
<p><a href="https://velero.io/docs/v1.15/basic-install/">https://velero.io/docs/v1.15/basic-install/</a></p>
<ul>
<li>Install the <code>velero</code> client utility
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="nv">VERSION</span><span class="o">=</span>v1.15.0
</span></span><span class="line"><span class="cl">curl -LJO https://github.com/vmware-tanzu/velero/releases/download/<span class="nv">$VERSION</span>/velero-<span class="nv">$VERSION</span>-linux-amd64.tar.gz
</span></span><span class="line"><span class="cl">tar zxvf velero-<span class="nv">$VERSION</span>-linux-amd64.tar.gz
</span></span><span class="line"><span class="cl">install -m <span class="m">755</span> velero-<span class="nv">$VERSION</span>-linux-amd64/velero /usr/local/bin/velero
</span></span><span class="line"><span class="cl">rm -rf velero-<span class="nv">$VERSION</span>-linux-amd64
</span></span><span class="line"><span class="cl">rm -rf velero-<span class="nv">$VERSION</span>-linux-amd64.tar.gz
</span></span><span class="line"><span class="cl">velero version
</span></span></code></pre></div></li>
</ul>
<h2 id="installing-velero---server-side">Installing Velero - Server Side</h2>
<p>There are two installation methods: the Helm chart, or the <code>velero</code> utility. Let&rsquo;s use the <code>velero</code> CLI, since the Helm method seems complicated based on what I can find in the docs.</p>
<h3 id="minio-setup">MinIO Setup</h3>
<p>We need to start with an access key for the S3 backup target. I&rsquo;m using MinIO: <a href="https://velero.io/docs/v1.15/contributions/minio/">https://velero.io/docs/v1.15/contributions/minio/</a></p>
<ul>
<li>Create a bucket <code>velero-talos</code></li>
<li>Create a group <code>velero-svc</code></li>
<li>Create a service account user <code>velero</code> and add to <code>velero-svc</code> group</li>
<li>Create a policy <code>velero-rw</code>. Raw Policy:</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-json" data-lang="json"><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;Version&#34;</span><span class="p">:</span> <span class="s2">&#34;2012-10-17&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;Statement&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="nt">&#34;Effect&#34;</span><span class="p">:</span> <span class="s2">&#34;Allow&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="nt">&#34;Action&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">                <span class="s2">&#34;s3:ListBucket&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">                <span class="s2">&#34;s3:PutObject&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">                <span class="s2">&#34;s3:DeleteObject&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">                <span class="s2">&#34;s3:GetObject&#34;</span>
</span></span><span class="line"><span class="cl">            <span class="p">],</span>
</span></span><span class="line"><span class="cl">            <span class="nt">&#34;Resource&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">                <span class="s2">&#34;arn:aws:s3:::velero-talos&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">                <span class="s2">&#34;arn:aws:s3:::velero-talos/*&#34;</span>
</span></span><span class="line"><span class="cl">            <span class="p">]</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">]</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></div><ul>
<li>Assign that policy to the <code>velero-svc</code> group</li>
<li>Create an access key on the <code>velero</code> user</li>
<li>Create a file named <code>minio-access-key.txt</code>, replacing the values from your access key</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-ini" data-lang="ini"><span class="line"><span class="cl"><span class="k">[default]</span>
</span></span><span class="line"><span class="cl"><span class="na">aws_access_key_id</span> <span class="o">=</span> <span class="s">minio</span>
</span></span><span class="line"><span class="cl"><span class="na">aws_secret_access_key</span> <span class="o">=</span> <span class="s">minio123</span>
</span></span></code></pre></div><ul>
<li>Decide how best to store this file (or the credentials themselves) securely. I would suggest using SOPS, but I will leave the implementation up to you.</li>
</ul>
<h3 id="install-using-velero-utility">Install Using <code>velero</code> Utility</h3>
<ul>
<li>Arguments reference:
<ul>
<li><code>--provider</code> - Use the AWS provider which is also used for any generic S3 object storage, including MinIO</li>
<li><code>--secret-file</code> - Specifies the secrets to use for authentication against the MinIO storage</li>
<li><code>--plugins</code> - Required for Velero to connect to an S3 backend</li>
<li><code>--bucket</code> - specifies the name of the bucket on the S3 backend</li>
<li><code>--use-volume-snapshots=true</code> - Enables volume snapshots</li>
<li><code>--backup-location-config</code> - Configures the connection URL and options for the S3 backend</li>
<li><code>--features=EnableCSI</code> - Enables Velero to use an existing CSI snapshot driver, such as democratic-csi, and take snapshots via a VolumeSnapshotClass. To be clear, Velero creates a VolumeSnapshot, and it is stored wherever democratic-csi stores snapshots (in our case, another dataset on the TrueNAS instance). This is the easy button if you already have CSI snapshots enabled; the alternative is using Kopia and storing snapshot data on the MinIO backend.</li>
</ul>
</li>
<li>Install:
<ul>
<li><code>velero install --provider aws --secret-file minio-access-key.txt --plugins velero/velero-plugin-for-aws:v1.11.0 --bucket velero-talos --use-volume-snapshots=true --backup-location-config region=us-east-1,s3ForcePathStyle=&quot;true&quot;,s3Url=http://192.168.1.35:9000 --features=EnableCSI</code>
<ul>
<li>You may need to update the plugin version and s3Url.</li>
<li>Optionally, set <code>use-volume-snapshots=false</code> if you don&rsquo;t want to back up PVs.</li>
</ul>
</li>
</ul>
</li>
<li>Verify:
<ul>
<li><code>kubectl get all -n velero</code></li>
<li><code>kubectl -n velero get pod -l deploy=velero</code></li>
<li>Check logs, in particular to verify that Velero can reach the backup location
<ul>
<li><code>kubectl -n velero logs -l deploy=velero</code></li>
</ul>
<pre tabindex="0"><code>level=info msg=&#34;BackupStorageLocations is valid, marking as available&#34;
</code></pre></li>
</ul>
</li>
<li>Post install:
<ul>
<li>If you are using democratic-csi for snapshots, you also need to add a label on the VolumeSnapshotClass to let Velero know which one to use by default. This label must be set on exactly one VolumeSnapshotClass:
<ul>
<li><code>kubectl edit volumesnapshotclass truenas-iscsi</code></li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="w">  </span><span class="nt">velero.io/csi-volumesnapshot-class</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;true&#34;</span><span class="w">
</span></span></span></code></pre></div></li>
<li>If you&rsquo;re ADDING CSI snapshot functionality after you&rsquo;ve already installed Velero, you need to enable the <code>EnableCSI</code> feature on both the server deployment and the client:
<ul>
<li>Server:
<ul>
<li><code>kubectl edit -n velero deploy/velero</code></li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl">--<span class="l">features=EnableCSI</span><span class="w">
</span></span></span></code></pre></div></li>
<li>Client:
<ul>
<li><code>velero client config set features=EnableCSI</code></li>
<li><code>velero client config get features</code></li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
</ul>
<h2 id="testing">Testing</h2>
<p>I&rsquo;ll go through a specific example that should work to follow along if you&rsquo;ve followed the series up to now. Otherwise, there are some good examples in the docs that you can reference: <a href="https://velero.io/docs/v1.15/examples/">https://velero.io/docs/v1.15/examples/</a></p>
<h3 id="manual-backuprestore">Manual Backup/Restore</h3>
<p>Let&rsquo;s back up the traefik namespace:</p>
<ul>
<li><code>velero backup create traefik-backup --include-namespaces traefik</code></li>
<li>Verify: <code>velero backup describe traefik-backup</code>
<ul>
<li>Looking for Phase: Complete</li>
</ul>
</li>
<li>Test a DR scenario:
<ul>
<li>Blow away the traefik namespace: <code>kubectl delete ns traefik</code></li>
<li>Restore traefik-backup with Velero: <code>velero restore create --from-backup traefik-backup</code></li>
<li>Verify: <code>kubectl get ns</code> or <code>velero restore describe traefik-backup-20241130181633</code> (get the restore name from the previous restore command)</li>
</ul>
</li>
</ul>
<p>Create a full backup: <code>velero backup create full-20241130</code> - If you don&rsquo;t specify namespaces, etc. then everything gets backed up.</p>
<h2 id="create-a-backup-schedule">Create A Backup Schedule</h2>
<p>I like to do daily full backups at 7am. TTL is the expiration time for each backup created; 720h is 30 days. So I get daily backups that fall off after 30 days.</p>
<ul>
<li>Create schedule: <code>velero schedule create daily-full --schedule &quot;0 7 * * *&quot; --ttl 720h</code></li>
<li>Verify: <code>velero get schedules</code></li>
</ul>
<p>Alternatively, you can use a manifest to define your backup schedule. You may not want to include certain namespaces in your daily backups since most likely you won&rsquo;t need/want to restore namespaces like <code>kube-system</code>, <code>kube-public</code>, and <code>kube-node-lease</code>. Let&rsquo;s look at defining a backup schedule using a manifest:</p>
<ul>
<li>Create the Schedule manifest, <code>schedule.yaml</code>:
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">velero.io/v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Schedule</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">daily-skipping-system-namespaces</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">schedule</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;0 12 * * *&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">skipImmediately</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">template</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">excludedNamespaces</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="l">kube-public</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="l">kube-system</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="l">kube-node-lease</span><span class="w">
</span></span></span></code></pre></div></li>
<li>Apply the schedule: <code>kubectl apply -f schedule.yaml</code></li>
<li>Verify: <code>velero get schedules</code></li>
</ul>
<h3 id="restore-from-scheduled-backup">Restore From Scheduled Backup</h3>
<ul>
<li><code>velero restore create --from-schedule SCHEDULE_NAME</code></li>
</ul>
<h2 id="restores-in-general">Restores In General</h2>
<p>See previous examples for specific restore commands. The basic syntax will be <code>velero restore create [RESTORE_NAME] [--from-backup BACKUP_NAME | --from-schedule SCHEDULE_NAME] [flags]</code></p>
<p>You can restore from a manually created backup or from a scheduled backup. You can also restore specific items from a backup. Say you just want to restore the secret key for your sealed secrets from a daily-full scheduled backup; you can use flags to specify exactly what to restore from that backup. For example:</p>
<ul>
<li><code>velero restore create restore-sealed-secret-key --from-schedule daily-full --selector sealedsecrets.bitnami.com/sealed-secrets-key=active</code></li>
</ul>
<h2 id="back-up-pvcs-using-democratic-csi-snapshots">Back Up PVCs Using democratic-csi Snapshots</h2>
<p>If you enabled snapshots in democratic-csi and also enabled snapshots in Velero as described above, then anything you back up with Velero that includes a PVC will be snapshotted. I tested this by deploying a pod/deployment in the default namespace, attached to a test PVC, then ran a Velero backup against it.</p>
<ul>
<li><code>velero backup create full-backup</code></li>
</ul>
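<p>If you want to reproduce that test, a throwaway workload like this works. The names and <code>storageClassName</code> here are assumptions, so match the storage class to the one democratic-csi created in your cluster:</p>

```yaml
# Hypothetical test workload: a PVC plus a pod that writes to it
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: velero-test-pvc
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: truenas-iscsi   # assumption: use your democratic-csi storage class
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: velero-test-pod
spec:
  containers:
    - name: writer
      image: busybox
      command: ["sh", "-c", "date >> /data/log.txt && sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: velero-test-pvc
```

<p>After the backup completes, <code>kubectl get volumesnapshot -A</code> should show a snapshot created for the PVC.</p>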
<h2 id="kopia">Kopia</h2>
<p>Velero uses Kopia (as a plugin) to copy existing snapshots to your S3 backend. This means that if you&rsquo;re not already creating &ldquo;regular&rdquo; snapshots outside of Velero, you need to have EnableCSI enabled and working so that a VolumeSnapshot is created, which can then be copied out to S3. In a homelab environment this may be redundant if your MinIO storage is backed by the same storage you&rsquo;re using for snapshots, but for the sake of completeness I will run through how we can make this work and test it.</p>
]]></content:encoded>
    </item>
    
    <item>
      <title>Kubernetes Homelab Series Part 6 - Storage With democratic-csi</title>
      <link>https://blog.dalydays.com/post/kubernetes-homelab-series-part-6-storage-with-democratic-csi/</link>
      <pubDate>Mon, 25 Nov 2024 00:00:00 +0000</pubDate>
      
      <guid>https://blog.dalydays.com/post/kubernetes-homelab-series-part-6-storage-with-democratic-csi/</guid>
      <description>Diving into the depths of Kubernetes storage, then walking through using democratic-csi for iSCSI and NFS with Talos Linux.</description>
      <content:encoded><![CDATA[<h1 id="whats-so-hard-about-storage">What&rsquo;s So Hard About Storage?</h1>
<p>There are several questions to answer when deciding how to handle storage for Kubernetes.</p>
<ul>
<li>How many disks do you have?</li>
<li>Do you want/need replicated storage?</li>
<li>What are your storage capacity requirements?</li>
<li>What are your performance requirements?</li>
<li>Do you need dynamically provisioned storage or will you be doing it manually?</li>
<li>NFS or iSCSI or NVMe-oF?</li>
<li>How will you back up your data?</li>
<li>Do you need snapshots?</li>
<li>Does your storage need to be highly available?</li>
<li>Does it need to be accessible from any node in the cluster, or are you good with node local storage?</li>
</ul>
<p>Ultimately it&rsquo;s not actually hard, just complex, especially if you want anything like what you get with EBS volumes in AWS, but in your homelab. Here&rsquo;s how I always try to approach complex problems: start as simple as possible, make it work, then add complexity only as needed.</p>
<h2 id="my-requirements">My Requirements</h2>
<ul>
<li>To have dynamically provisioned persistent volumes</li>
<li>To have that persistent volume be accessible from any node in my cluster</li>
<li>Keep it as simple as reasonably possible</li>
</ul>
<h2 id="how-am-i-doing-it">How Am I Doing It?</h2>
<p>Currently I have a single 2TB NVMe disk and I want to use that for dynamically provisioned storage. I&rsquo;m not worried about replication right now since I have backups in place and this is just for my homelab. If I wanted replicated storage, I might consider a Ceph cluster, but that realistically requires a decent amount of hardware and fast network interconnectivity (greater than 1Gb, ideally 10Gb minimum, for replication to keep up).</p>
<p>In order to manage the disk I&rsquo;m using TrueNAS Scale, which is basically ZFS on Linux with a nice web GUI to manage things. This actually provides the option of using a zpool with two mirrored disks, or even a RAIDZ2 (the ZFS analog of RAID6), as your storage target to easily solve for replication across disks.</p>
<p>In Proxmox, I&rsquo;m passing the disk itself directly to the TrueNAS VM. You <strong>should</strong> pass an entire disk controller if you need to pool multiple drives together in ZFS; if you&rsquo;re dealing with a single disk, it&rsquo;s OK to do it this way.</p>
<p>Spin up the VM, format the disk, ZFS, etc. and you&rsquo;re ready to go. In my case, this is on the Kubernetes VLAN and assigned a static IP. Originally I had a second NIC attached to the primary VLAN but this caused some weird web UI performance issues so I removed that one.</p>
<h1 id="truenas-setup-in-proxmox">TrueNAS Setup In Proxmox</h1>
<p>This isn&rsquo;t intended to be a TrueNAS tutorial, so I&rsquo;ll just list the steps at a high level. Basically, get a TrueNAS server running and then proceed.</p>
<ul>
<li>Pass a disk (or HBA controller) to a VM in Proxmox</li>
<li>Install TrueNAS Scale</li>
<li>Create a zpool</li>
<li>Generate an API Key - in the top right corner go to Admin &gt; API Keys</li>
<li>Make sure the network is accessible from your Kubernetes cluster</li>
</ul>
<h1 id="install-democratic-csi-with-truenas">Install democratic-csi With TrueNAS</h1>
<p><a href="https://github.com/democratic-csi/democratic-csi">https://github.com/democratic-csi/democratic-csi</a></p>
<p>This is a straightforward CSI provider that focuses on dynamically provisioned storage from TrueNAS or generic ZFS on Linux backends. Protocols include NFS, iSCSI, and NVMe-oF. I&rsquo;ll show you how to use the API variation and do NFS and iSCSI shares, plus talk about almost getting NVMe-oF working.</p>
<p>You will install a separate Helm chart for each provisioner, and you can actually run multiple at the same time, which is what I will be doing with both NFS and iSCSI. This is helpful since NFS even supports RWX volumes (if you actually have a use case for that), while iSCSI is a good default for RWO volumes.</p>
<h2 id="volumesnapshot-support">VolumeSnapshot Support</h2>
<p>This is optional, but if you want to utilize volume snapshots (which became GA as of Kubernetes 1.20), you will need to install the CRDs which aren&rsquo;t included with vanilla Talos, along with installing the &ldquo;snapshotter.&rdquo; This implements Volume Snapshots - <a href="https://kubernetes.io/docs/concepts/storage/volume-snapshots/">https://kubernetes.io/docs/concepts/storage/volume-snapshots/</a></p>
<ul>
<li>Clone repo: <code>git clone https://github.com/kubernetes-csi/external-snapshotter.git</code></li>
<li><code>cd external-snapshotter</code></li>
<li>Apply CRDs: <code> kubectl kustomize client/config/crd | kubectl create -f -</code></li>
<li>Install snapshotter into kube-system: <code>kubectl -n kube-system kustomize deploy/kubernetes/snapshot-controller | kubectl create -f -</code></li>
<li>Verify: <code>kubectl get deploy snapshot-controller -n kube-system</code></li>
</ul>
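<p>With the snapshot controller running, snapshot behavior is configured through a VolumeSnapshotClass. A minimal sketch, assuming the default democratic-csi iSCSI driver name (it must match the <code>csiDriver.name</code> value in your democratic-csi config):</p>

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: truenas-iscsi
driver: org.democratic-csi.iscsi   # assumption: must match your csiDriver.name
deletionPolicy: Delete
```

<p>Verify with <code>kubectl get volumesnapshotclass</code>.</p>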
<h2 id="dynamic-iscsi-provisioner-with-freenas-api-iscsi">Dynamic iSCSI Provisioner With freenas-api-iscsi</h2>
<p>My single 2TB disk is in a pool named <code>nvme2tb</code>. I created a dataset in TrueNAS named <code>iscsi</code>. Those may vary in your case, so pay attention to the configuration and update those values according to your environment.</p>
<p>Don&rsquo;t forget, your Talos installation needs to include the <code>iscsi-tools</code> system extension or the nodes won&rsquo;t be able to connect to TrueNAS.</p>
<ul>
<li>Create a dataset named <code>iscsi</code></li>
<li>Make sure Block (iSCSI) Shares Targets is running, and click Configure</li>
<li>Save the defaults for Target Global Configuration</li>
<li>Add a portal on 0.0.0.0:3260 named <code>k8s-democratic-csi</code></li>
<li>Add an Initiator Group, Allow all initiators, and name it something like <code>k8s-talos</code></li>
<li>Create a Target named <code>donotdelete</code> and alias <code>donotdelete</code>, then add iSCSI group selecting the Portal and Initiator Group you just created. This prevents TrueNAS from deleting the Initiator Group if you&rsquo;re testing and you delete the one and only PV.</li>
<li>Make note of the portal ID and the Initiator Group ID and update these values in the file <code>freenas-api-iscsi.yaml</code> if needed
<ul>
<li>During testing, the manually created Initiator Group was getting deleted whenever deleting the last PV. This appears to be a bug in TrueNAS somewhere according to <a href="https://github.com/democratic-csi/democratic-csi/issues/412">https://github.com/democratic-csi/democratic-csi/issues/412</a>. Essentially TrueNAS deletes the Initiator Group automatically if an associated Target is deleted and no others exist. If you followed the instructions and created a manual Target this won&rsquo;t be an issue :)</li>
</ul>
</li>
<li>Create the democratic-csi namespace: <code>kubectl create ns democratic-csi</code></li>
<li>Make that namespace privileged: <code>kubectl label --overwrite namespace democratic-csi pod-security.kubernetes.io/enforce=privileged</code></li>
<li>Create <code>freenas-api-iscsi.yaml</code> and update <code>apiKey</code>, <code>host</code>, <code>targetPortal</code>, <code>datasetParentName</code>, and <code>detachedSnapshotsDatasetParentName</code>. Other common settings in the storage class config are <code>storageClasses.defaultClass</code> (true/false; only one storage class can be the default) and <code>storageClasses.reclaimPolicy</code> (Delete/Retain). Setting Retain reduces the chance that data is deleted when, for example, you delete a pod, but it also makes you responsible for deleting volumes manually in TrueNAS when you no longer need them.
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">driver</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">config</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">driver</span><span class="p">:</span><span class="w"> </span><span class="l">freenas-api-iscsi</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">httpConnection</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">protocol</span><span class="p">:</span><span class="w"> </span><span class="l">https</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">apiKey</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="l">your-truenas-api-key-here]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">host</span><span class="p">:</span><span class="w"> </span><span class="m">10.0.50.99</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="m">443</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">allowInsecure</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">zfs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">datasetParentName</span><span class="p">:</span><span class="w"> </span><span class="l">nvme2tb/iscsi/volumes</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">detachedSnapshotsDatasetParentName</span><span class="p">:</span><span class="w"> </span><span class="l">nvme2tb/iscsi/snapshots</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">zvolCompression</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">zvolDedup</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">zvolEnableReservation</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">zvolBlockSize</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">iscsi</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">targetPortal</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;10.0.50.99:3260&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">targetPortals</span><span class="p">:</span><span class="w"> </span><span class="p">[]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">interface</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">namePrefix</span><span class="p">:</span><span class="w"> </span><span class="l">csi-</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">nameSuffix</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;-talos&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">targetGroups</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span>- <span class="nt">targetGroupPortalGroup</span><span class="p">:</span><span class="w"> </span><span class="m">1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">targetGroupInitiatorGroup</span><span class="p">:</span><span class="w"> </span><span class="m">5</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">targetGroupAuthType</span><span class="p">:</span><span class="w"> </span><span class="l">None</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">targetGroupAuthGroup</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">extentInsecureTpc</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">extentXenCompat</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">extentDisablePhysicalBlocksize</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">extentBlocksize</span><span class="p">:</span><span class="w"> </span><span class="m">512</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">extentRpm</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;SSD&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">extentAvailThreshold</span><span class="p">:</span><span class="w"> </span><span class="m">0</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">csiDriver</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c"># should be globally unique for a given cluster</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;org.democratic-csi.freenas-api-iscsi&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">storageClasses</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">truenas-iscsi</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">defaultClass</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">reclaimPolicy</span><span class="p">:</span><span class="w"> </span><span class="l">Delete</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">volumeBindingMode</span><span class="p">:</span><span class="w"> </span><span class="l">Immediate</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">allowVolumeExpansion</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">parameters</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">fsType</span><span class="p">:</span><span class="w"> </span><span class="l">ext4</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">detachedVolumesFromSnapshots</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;false&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">mountOptions</span><span class="p">:</span><span class="w"> </span><span class="p">[]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">secrets</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">provisioner-secret</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">controller-publish-secret</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">node-stage-secret</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">node-publish-secret</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">controller-expand-secret</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">volumeSnapshotClasses</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">truenas-iscsi</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">parameters</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">detachedSnapshots</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;true&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">node</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">hostPID</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">driver</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">extraEnv</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">ISCSIADM_HOST_STRATEGY</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">value</span><span class="p">:</span><span class="w"> </span><span class="l">nsenter</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">ISCSIADM_HOST_PATH</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">value</span><span class="p">:</span><span class="w"> </span><span class="l">/usr/local/sbin/iscsiadm</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">iscsiDirHostPath</span><span class="p">:</span><span class="w"> </span><span class="l">/usr/local/etc/iscsi</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">iscsiDirHostPathType</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;&#34;</span><span class="w">
</span></span></span></code></pre></div></li>
<li>Deploy: <code>helm upgrade --install --namespace democratic-csi --values freenas-api-iscsi.yaml truenas-iscsi democratic-csi/democratic-csi</code></li>
<li>Verify:
<ul>
<li>You&rsquo;re looking to see that everything is fully running. It may take a minute to spin up.</li>
<li><code>kubectl get all -n democratic-csi</code></li>
<li><code>kubectl get storageclasses</code> or <code>kubectl get sc</code></li>
</ul>
</li>
</ul>
<h3 id="test---deploy-a-pvc">Test - Deploy A PVC</h3>
<ul>
<li>Test with a simple PVC, targeting our new <code>truenas-iscsi</code> storage class, <code>test-pvc-truenas-iscsi.yaml</code>:
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">PersistentVolumeClaim</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">testpvc-iscsi</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">storageClassName</span><span class="p">:</span><span class="w"> </span><span class="l">truenas-iscsi</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">accessModes</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">ReadWriteOnce</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">resources</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">requests</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">storage</span><span class="p">:</span><span class="w"> </span><span class="l">5Gi</span><span class="w">
</span></span></span></code></pre></div><ul>
<li><code>kubectl apply -f test-pvc-truenas-iscsi.yaml</code></li>
<li>Be patient, this can take a minute to provision the new Zvol on TrueNAS (looks something like <code>nvme2tb/iscsi/volumes/pvc-25e70f84-91c7-4e49-a9f1-e324681a3b7d</code>) and get everything mapped in Kubernetes.</li>
<li>Check the Persistent Volume itself: <code>kubectl get pv</code>
<ul>
<li>Looking for a new entry here</li>
</ul>
</li>
<li>Check the Persistent Volume Claim: <code>kubectl get pvc</code>
<ul>
<li>Looking for status Bound to the newly created PV</li>
</ul>
</li>
<li>If you need to investigate, look at <code>kubectl describe pvc</code> and <code>kubectl describe pv</code> next, or check the TrueNAS UI to see if a new Zvol has been created</li>
</ul>
</li>
</ul>
<h3 id="test---deploy-a-pod">Test - Deploy A Pod</h3>
<p>At this point there should be a PV and a PVC, but they are not yet attached to a pod. Only when a pod claims the PVC does the node the pod is scheduled on actually mount the iSCSI target, and this is where the <code>iscsi-utils</code> extension comes into play in Talos Linux. Let&rsquo;s test to make sure we can actually connect to the PVC from a pod.</p>
<p>This test pod uses a small Alpine image and writes to a log file every second. The two lines commented out at the bottom are there in case you want to target a specific node. If you&rsquo;re not sure all your Talos Linux nodes are configured properly for iSCSI, I recommend targeting each node in turn and verifying from a pod on each one. You can delete the pod while preserving the PVC; if you then reconnect to the PVC from another pod, even one running on a different node, it should still contain the same data.</p>
<ul>
<li>Create <code>pod-using-testpvc-iscsi.yaml</code>:
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Pod</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">testlogger-iscsi</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">containers</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">testlogger-iscsi</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">image</span><span class="p">:</span><span class="w"> </span><span class="l">alpine:3.20</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">command</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">&#34;/bin/ash&#34;</span><span class="p">]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">args</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">&#34;-c&#34;</span><span class="p">,</span><span class="w"> </span><span class="s2">&#34;while true; do echo \&#34;$(date) - test log\&#34; &gt;&gt; /mnt/test.log &amp;&amp; sleep 1; done&#34;</span><span class="p">]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">volumeMounts</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">testvol</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">mountPath</span><span class="p">:</span><span class="w"> </span><span class="l">/mnt</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">volumes</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">testvol</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">persistentVolumeClaim</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">claimName</span><span class="p">:</span><span class="w"> </span><span class="l">testpvc-iscsi</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="c">#    nodeSelector:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="c">#      kubernetes.io/hostname: taloswk1</span><span class="w">
</span></span></span></code></pre></div></li>
<li>Deploy: <code>kubectl apply -f pod-using-testpvc-iscsi.yaml</code></li>
<li>Verify the pod is running: <code>kubectl get po</code>
<ul>
<li>Check which node it&rsquo;s on with <code>kubectl get po -o wide</code> or <code>kubectl describe po testlogger-iscsi | grep Node:</code></li>
</ul>
</li>
<li>Validate data is being written to the PVC:
<ul>
<li>Exec into the pod: <code>kubectl exec -it testlogger-iscsi -- /bin/sh</code></li>
<li>Look at the file: <code>cat /mnt/test.log</code></li>
<li>Show line count: <code>wc -l /mnt/test.log</code></li>
</ul>
</li>
</ul>
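<p>The chart values above also define a <code>truenas-iscsi</code> volume snapshot class. If you have a snapshot controller installed (the NFS section below covers installing one), you can snapshot the test PVC with something along these lines. This is a sketch, and the <code>testsnap-iscsi</code> name is made up:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"># Hypothetical snapshot of the test PVC above; requires a snapshot controller
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: testsnap-iscsi
spec:
  volumeSnapshotClassName: truenas-iscsi
  source:
    persistentVolumeClaimName: testpvc-iscsi
</code></pre></div>
<p>Check it with <code>kubectl get volumesnapshot</code>; the <code>READYTOUSE</code> column should become <code>true</code> once TrueNAS has taken the snapshot.</p>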
<h3 id="test---cleanup">Test - Cleanup</h3>
<ul>
<li>Delete pod: <code>kubectl delete -f pod-using-testpvc-iscsi.yaml</code></li>
<li>Delete PVC: <code>kubectl delete -f test-pvc-truenas-iscsi.yaml</code></li>
</ul>
<h2 id="dynamic-nfs-provisioner-with-freenas-api-nfs">Dynamic NFS Provisioner With freenas-api-nfs</h2>
<p>This one&rsquo;s a little simpler than iSCSI, since NFS support is built into Talos out of the box and there&rsquo;s less setup on the TrueNAS side.</p>
<ul>
<li>Create a dataset named <code>nfs</code></li>
<li>Create the democratic-csi namespace: <code>kubectl create ns democratic-csi</code></li>
<li>Make that namespace privileged: <code>kubectl label --overwrite namespace democratic-csi pod-security.kubernetes.io/enforce=privileged</code></li>
<li>Create <code>freenas-api-nfs.yaml</code> and update <code>apiKey</code>, <code>host</code>, <code>shareHost</code>, <code>datasetParentName</code>, and <code>detachedSnapshotsDatasetParentName</code>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">driver</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">config</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">driver</span><span class="p">:</span><span class="w"> </span><span class="l">freenas-api-nfs</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">httpConnection</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">protocol</span><span class="p">:</span><span class="w"> </span><span class="l">https</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">apiKey</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="l">api-key-goes-here]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">host</span><span class="p">:</span><span class="w"> </span><span class="m">10.0.50.99</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="m">443</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">allowInsecure</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">zfs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">datasetParentName</span><span class="p">:</span><span class="w"> </span><span class="l">nvme2tb/nfs</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">detachedSnapshotsDatasetParentName</span><span class="p">:</span><span class="w"> </span><span class="l">nvme2tb/nfs/snaps</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">datasetEnableQuotas</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">datasetEnableReservation</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">datasetPermissionsMode</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;0777&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">datasetPermissionsUser</span><span class="p">:</span><span class="w"> </span><span class="m">0</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">datasetPermissionsGroup</span><span class="p">:</span><span class="w"> </span><span class="m">0</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">nfs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">shareCommentTemplate</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;{{ parameters.[csi.storage.k8s.io/pvc/namespace] }}-{{ parameters.[csi.storage.k8s.io/pvc/name] }}&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">shareHost</span><span class="p">:</span><span class="w"> </span><span class="m">10.0.50.99</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">shareAlldirs</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">shareAllowedHosts</span><span class="p">:</span><span class="w"> </span><span class="p">[]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">shareAllowedNetworks</span><span class="p">:</span><span class="w"> </span><span class="p">[]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">shareMaprootUser</span><span class="p">:</span><span class="w"> </span><span class="l">root</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">shareMaprootGroup</span><span class="p">:</span><span class="w"> </span><span class="l">root</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">shareMapallUser</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">shareMapallGroup</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">csiDriver</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c"># should be globally unique for a given cluster</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;org.democratic-csi.freenas-api-nfs&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">storageClasses</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">truenas-nfs</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">defaultClass</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">reclaimPolicy</span><span class="p">:</span><span class="w"> </span><span class="l">Delete</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">volumeBindingMode</span><span class="p">:</span><span class="w"> </span><span class="l">Immediate</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">allowVolumeExpansion</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">parameters</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">fsType</span><span class="p">:</span><span class="w"> </span><span class="l">nfs</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">mountOptions</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="l">noatime</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="l">nfsvers=4</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">volumeSnapshotClasses</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">truenas-nfs</span><span class="w">
</span></span></span></code></pre></div></li>
<li>Since we are using snapshots, you also need to install a snapshot controller such as <a href="https://github.com/democratic-csi/charts/tree/master/stable/snapshot-controller">https://github.com/democratic-csi/charts/tree/master/stable/snapshot-controller</a>
<ul>
<li>You can skip this step if you disable snapshots in your YAML file</li>
<li><code>helm upgrade --install --namespace kube-system --create-namespace snapshot-controller democratic-csi/snapshot-controller</code></li>
<li><code>kubectl -n kube-system logs -f -l app=snapshot-controller</code></li>
</ul>
</li>
<li>Deploy: <code>helm upgrade --install --namespace democratic-csi --values freenas-api-nfs.yaml truenas-nfs democratic-csi/democratic-csi</code></li>
<li>Verify:
<ul>
<li>You&rsquo;re looking to see that everything is fully running. It may take a minute to spin up.</li>
<li><code>kubectl get all -n democratic-csi</code></li>
<li><code>kubectl get storageclasses</code> or <code>kubectl get sc</code></li>
</ul>
</li>
</ul>
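<p>With the snapshot controller from the step above in place, the <code>truenas-nfs</code> snapshot class can be exercised the same way: snapshot a PVC, then restore it into a new PVC via <code>dataSource</code>. This is a sketch with made-up names (<code>testsnap-nfs</code>, <code>testpvc-nfs-restored</code>), and it assumes a PVC named <code>testpvc-nfs</code> already exists:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"># Hypothetical snapshot of an existing NFS-backed PVC
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: testsnap-nfs
spec:
  volumeSnapshotClassName: truenas-nfs
  source:
    persistentVolumeClaimName: testpvc-nfs
---
# Hypothetical restore: a new PVC populated from the snapshot
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: testpvc-nfs-restored
spec:
  storageClassName: truenas-nfs
  dataSource:
    name: testsnap-nfs
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
</code></pre></div>
<p>Apply the snapshot first and wait for <code>READYTOUSE</code> to become <code>true</code> (via <code>kubectl get volumesnapshot</code>) before creating the restore PVC.</p>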
<h3 id="test---deploy-a-pvc-1">Test - Deploy A PVC</h3>
<ul>
<li>Test with a simple PVC, targeting our new <code>truenas-nfs</code> storage class, <code>test-pvc-truenas-nfs.yaml</code>:
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">PersistentVolumeClaim</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">testpvc-nfs</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">storageClassName</span><span class="p">:</span><span class="w"> </span><span class="l">truenas-nfs</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">accessModes</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">ReadWriteOnce</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">resources</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">requests</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">storage</span><span class="p">:</span><span class="w"> </span><span class="l">5Gi</span><span class="w">
</span></span></span></code></pre></div><ul>
<li><code>kubectl apply -f test-pvc-truenas-nfs.yaml</code></li>
<li>Be patient: provisioning the new dataset and NFS share on TrueNAS and getting everything mapped in Kubernetes can take a minute.</li>
<li>Check the Persistent Volume itself: <code>kubectl get pv</code>
<ul>
<li>Looking for a new entry here</li>
</ul>
</li>
<li>Check the Persistent Volume Claim: <code>kubectl get pvc</code>
<ul>
<li>Looking for status Bound to the newly created PV</li>
</ul>
</li>
<li>If you need to investigate, next look at <code>kubectl describe pvc</code> and <code>kubectl describe pv</code>, or go look in the TrueNAS UI to see if a new disk has been created</li>
</ul>
</li>
</ul>
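<p>If you&rsquo;d rather not poll by hand, <code>kubectl wait</code> can block until the claim binds (the jsonpath form shown here requires a reasonably recent kubectl):</p>

```shell
# Block for up to 2 minutes until the test PVC reports Bound
kubectl wait --for=jsonpath='{.status.phase}'=Bound pvc/testpvc-nfs --timeout=120s
```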
<h3 id="test---deploy-a-pod-1">Test - Deploy A Pod</h3>
<p>At this point there should be a PV and PVC, but they are not actually connected to a pod yet. The moment a pod claims the PVC is when the node the pod is running on actually mounts the NFS share.</p>
<p>This test pod uses a small Alpine image and writes to a log file every second. The two lines commented out at the bottom let you target a specific node. If you&rsquo;re not sure all your Talos Linux nodes are configured properly for NFS, I recommend targeting each node in turn and verifying from a pod on each one. You can delete the pod while preserving the PVC; if you reconnect to the PVC from another pod, even one running on a different node, it should still contain the same data.</p>
<ul>
<li>Create <code>pod-using-testpvc-nfs.yaml</code>:
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Pod</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">testlogger-nfs</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">containers</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">testlogger-nfs</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">image</span><span class="p">:</span><span class="w"> </span><span class="l">alpine:3.20</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">command</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">&#34;/bin/ash&#34;</span><span class="p">]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">args</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">&#34;-c&#34;</span><span class="p">,</span><span class="w"> </span><span class="s2">&#34;while true; do echo \&#34;$(date) - test log\&#34; &gt;&gt; /mnt/test.log &amp;&amp; sleep 1; done&#34;</span><span class="p">]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">volumeMounts</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">testvol</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">mountPath</span><span class="p">:</span><span class="w"> </span><span class="l">/mnt</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">volumes</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">testvol</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">persistentVolumeClaim</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">claimName</span><span class="p">:</span><span class="w"> </span><span class="l">testpvc-nfs</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="c">#    nodeSelector:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="c">#      kubernetes.io/hostname: taloswk1</span><span class="w">
</span></span></span></code></pre></div></li>
<li>Deploy: <code>kubectl apply -f pod-using-testpvc-nfs.yaml</code></li>
<li>Verify <code>kubectl get po</code>
<ul>
<li>Check which node it&rsquo;s on with <code>kubectl get po -o wide</code> or <code>kubectl describe po testlogger-nfs | grep Node:</code></li>
</ul>
</li>
<li>Validate data is being written to the PVC:
<ul>
<li>Exec into the pod: <code>kubectl exec -it testlogger-nfs -- /bin/sh</code></li>
<li>Look at the file: <code>cat /mnt/test.log</code></li>
<li>Show line count: <code>wc -l /mnt/test.log</code></li>
</ul>
</li>
</ul>
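<p>Incidentally, you can sanity-check the logging one-liner itself outside the cluster; the same pattern runs in any POSIX shell (here a bounded loop writing three lines to a temp file instead of the PVC mount):</p>

```shell
# Same write pattern as the pod, but bounded and pointed at a temp file
LOG=$(mktemp)
for i in 1 2 3; do
  echo "$(date) - test log" >> "$LOG"
done
grep -c "test log" "$LOG"   # prints 3
```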
<h3 id="test---cleanup-1">Test - Cleanup</h3>
<ul>
<li><code>kubectl delete -f pod-using-testpvc-nfs.yaml</code></li>
<li><code>kubectl delete -f test-pvc-truenas-nfs.yaml</code></li>
</ul>
<h2 id="dynamic-nvme-of-storage-for-kubernetes-i-was-unable-to-make-this-work">Dynamic NVMe-oF Storage For Kubernetes (I was unable to make this work)</h2>
<p>I spent some time trying to get this to work. TrueNAS doesn&rsquo;t currently support NVMe-oF through the interface, but since it&rsquo;s just a Linux box you can (almost) simply install the extra packages needed and configure them as root. After doing that, I tested manually by connecting from another Linux machine to validate that I could indeed mount NVMe over TCP using TrueNAS.</p>
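<p>For anyone attempting the same thing, that manual client-side validation looked roughly like the following; the address and NQN are placeholders based on my setup, and exact flags may differ between nvme-cli versions:</p>

```shell
# Load the NVMe/TCP initiator module, then discover and connect to the target
modprobe nvme-tcp
nvme discover -t tcp -a 10.0.50.99 -s 4420
nvme connect -t tcp -a 10.0.50.99 -s 4420 -n nqn.2003-01.org.linux-nvme:testvol
nvme list    # the new namespace should appear as /dev/nvmeXnY
```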
<p>From there, I figured out the configuration needed for the democratic-csi <code>zfs-generic-nvmeof</code> driver and started testing. I got as far as having it provision a new dataset on TrueNAS, create the mount, and create the PV and PVC in the cluster, showing as Bound. However, when I actually attempted to connect to it from a pod, it would fail. It may have something to do with how democratic-csi performs the mount from the node, or I may have something wrong in my configuration that I can&rsquo;t figure out.</p>
<p>If I could get this working, I might not even bother running a TrueNAS instance and just run some lightweight Linux server to interface between democratic-csi and the disk(s).</p>
<p>Here are some extra details on exactly what I tried:</p>
<ul>
<li><a href="https://github.com/siderolabs/talos/issues/9255">https://github.com/siderolabs/talos/issues/9255</a></li>
<li><a href="https://github.com/democratic-csi/democratic-csi/issues/418">https://github.com/democratic-csi/democratic-csi/issues/418</a></li>
</ul>
<p>Please help me if you know how to make this work, as I&rsquo;d much rather be using this than iSCSI :)</p>
<p>Here&rsquo;s my almost working config for reference:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">csiDriver</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;org.democratic-csi.nvmeof&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">storageClasses</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">truenas-nvmeof</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">defaultClass</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">reclaimPolicy</span><span class="p">:</span><span class="w"> </span><span class="l">Delete</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">volumeBindingMode</span><span class="p">:</span><span class="w"> </span><span class="l">Immediate</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">allowVolumeExpansion</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">parameters</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">fsType</span><span class="p">:</span><span class="w"> </span><span class="l">ext4</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">mountOptions</span><span class="p">:</span><span class="w"> </span><span class="p">[]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">secrets</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">provisioner-secret</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">controller-publish-secret</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">node-stage-secret</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">node-publish-secret</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">controller-expand-secret</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">volumeSnapshotClasses</span><span class="p">:</span><span class="w"> </span><span class="p">[]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">driver</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">config</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">driver</span><span class="p">:</span><span class="w"> </span><span class="l">zfs-generic-nvmeof</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">sshConnection</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">host</span><span class="p">:</span><span class="w"> </span><span class="m">10.0.50.99</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="m">22</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">username</span><span class="p">:</span><span class="w"> </span><span class="l">root</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">privateKey</span><span class="p">:</span><span class="w"> </span><span class="p">|</span><span class="sd">
</span></span></span><span class="line"><span class="cl"><span class="sd">        -----BEGIN RSA PRIVATE KEY-----
</span></span></span><span class="line"><span class="cl"><span class="sd">        REDACTED!
</span></span></span><span class="line"><span class="cl"><span class="sd">        -----END RSA PRIVATE KEY-----</span><span class="w">        
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">zfs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">datasetParentName</span><span class="p">:</span><span class="w"> </span><span class="l">nvme2tb/k8s/nvmeof</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">detachedSnapshotsDatasetParentName</span><span class="p">:</span><span class="w"> </span><span class="l">nvme2tb/k8s/nvmeof-snapshots</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">zvolCompression</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">zvolDedup</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">zvolEnableReservation</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">zvolBlocksize</span><span class="p">:</span><span class="w"> </span><span class="l">16K</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">nvmeof</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">transports</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span>- <span class="l">tcp://0.0.0.0:4420</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">namePrefix</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">nameSuffix</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">shareStrategy</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;nvmetCli&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">shareStrategyNvmetCli</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">nvmetcliPath</span><span class="p">:</span><span class="w"> </span><span class="l">nvmetcli</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">configIsImportedFilePath</span><span class="p">:</span><span class="w"> </span><span class="l">/var/run/nvmet-config-loaded</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">configPath</span><span class="p">:</span><span class="w"> </span><span class="l">/etc/nvmet/config.json</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">basename</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;nqn.2003-01.org.linux-nvme&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">ports</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="s2">&#34;1&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">subsystem</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">attributes</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">allow_any_host</span><span class="p">:</span><span class="w"> </span><span class="m">1</span><span class="w">
</span></span></span></code></pre></div><h1 id="what-about-other-options">What About Other Options?</h1>
<h2 id="what-about-local-storage">What About Local Storage?</h2>
<p>This fails to meet one of my main requirements: being able to mount a persistent volume from any node. The whole point of Kubernetes, at least for me, is the ability to take a node offline with no actual downtime of any services. If a pod is connected to a PVC on a certain node and needs to move to another node, it has to be able to reattach to the existing persistent volume, and local storage doesn&rsquo;t allow for that.</p>
<h2 id="what-about-proxmox-csi-plugin">What About proxmox-csi-plugin?</h2>
<p>I also looked at the <a href="https://github.com/sergelogvinov/proxmox-csi-plugin">proxmox-csi-plugin</a>, but that has the same problem as with node local storage so doesn&rsquo;t fit my requirements.</p>
<h2 id="what-about-rancher-longhorn">What About Rancher Longhorn?</h2>
<p>Longhorn is nice and easy. It uses either NFS or iSCSI. The Talos team recommends against both NFS and iSCSI (although either can be used). I would say to use Longhorn if you need replicated storage in a homelab (like 3 disks) but don&rsquo;t want to figure out Ceph, etc. It&rsquo;s straightforward and well supported. As for downsides, I don&rsquo;t have firsthand experience. I hear it&rsquo;s great for usability, but some users complain that it&rsquo;s not reliable.</p>
<p>It&rsquo;s also a little more complicated to set up on Talos, but Longhorn provides dedicated installation instructions for Talos, so that shouldn&rsquo;t be a major blocker.</p>
<p>At some point I might try this out, but for now I&rsquo;m sticking with the simple approach of using TrueNAS Scale with any type of zpool you want, and dynamically provisioning NFS or iSCSI using democratic-csi.</p>
<h2 id="what-about-mayastor">What About Mayastor?</h2>
<p>I tried getting this to work because the Talos team recommends it if you don&rsquo;t want to do a full-blown Ceph cluster, etc. I was also super interested in the fact that it uses NVMe-oF, which is a newer protocol (basically a modern replacement for iSCSI). I wasn&rsquo;t able to get it working, but I discovered a little bit about it and decided to keep it simpler for the following reasons:</p>
<ul>
<li>I only have a single disk currently, so I don&rsquo;t need any replicated storage</li>
<li>It seems to be more resource intensive than democratic-csi and has more components.</li>
<li>The documentation is kind of hectic. It used to be Mayastor, but now it&rsquo;s OpenEBS Replicated PV Mayastor or something. When deploying, it&rsquo;s hard to tell if I&rsquo;m deploying other OpenEBS stuff I don&rsquo;t need or exactly which PV type I need. I think you need one type of PV to store data for one of the components even if you are ultimately trying to run the replicated PV type (mayastor) for your primary cluster storage. I don&rsquo;t know, it was confusing.</li>
</ul>
<p>My failure to get this working is 100% a skill issue, but going back to my requirements I really don&rsquo;t need this for my homelab at this point. I may revisit this in the future.</p>
<h2 id="what-about-ceph">What About Ceph?</h2>
<p>I would need enough disks and resources to run Ceph. It&rsquo;s something I really want to test out and potentially use, although probably way overkill for my homelab. I&rsquo;m currently running 2 Minisforum MS-01 servers and planning on getting a third to do a true Proxmox cluster and replace my old 2U power hungry server. At that point, I might actually give Ceph a shot (trying both the Proxmox Ceph installation and also Rook/Ceph on top of Kubernetes). This would solve for truly HA storage, plus meet all other requirements I have, assuming I don&rsquo;t add a requirement for low resource utilization just to run storage.</p>
]]></content:encoded>
    </item>
    
    <item>
      <title>Kubernetes Homelab Series Part 5 - Ingress Controllers With Traefik</title>
      <link>https://blog.dalydays.com/post/kubernetes-homelab-series-part-5-ingress-controllers-with-traefik/</link>
      <pubDate>Sat, 23 Nov 2024 00:00:00 +0000</pubDate>
      
      <guid>https://blog.dalydays.com/post/kubernetes-homelab-series-part-5-ingress-controllers-with-traefik/</guid>
      <description>A look into building an ingress controller and the purpose behind it.</description>
      <content:encoded><![CDATA[<h1 id="what-is-ingress-and-ingress-controller">What Is Ingress and Ingress Controller?</h1>
<p>Ingress is a way to expose your applications running in Kubernetes to something outside the cluster. It&rsquo;s like a reverse proxy for Kubernetes. It defines how traffic is routed from the point that it hits the cluster, specifically the ingress controller, to the running pod. You might already know about the different Kubernetes service types such as ClusterIP, NodePort and LoadBalancer, but Ingress takes it to another level, handling TLS termination and advanced routing rules, among other nice features.</p>
<p>An ingress controller is the software that contains the actual logic to implement Ingress rules and functionality. These can vary depending on the ingress controller you pick.</p>
<p>The awesome thing about this component is that it deploys a LoadBalancer type service, and if you use MetalLB or kube-vip, you get a virtual IP dynamically assigned to one of your ingress controller replicas. If the pod running the ingress controller goes down (assuming you have more than one replica or use a DaemonSet), the IP automatically moves to another replica. In other words, between MetalLB and Traefik (or your tool of choice for either), you get highly available ingress to your Kubernetes cluster with no manual effort. Pretty nifty!</p>
<h2 id="what-about-nginx">What About NGINX?</h2>
<p>I feel like I should mention that one of the most widely known ingress controllers is NGINX. Before you go down that rabbit hole, know that there are two different ingress controllers based on NGINX. One used to be called <code>nginx-ingress</code>, the official open source NGINX ingress controller (now <code>kubernetes-ingress</code>: <a href="https://github.com/nginxinc/kubernetes-ingress">https://github.com/nginxinc/kubernetes-ingress</a>), maintained by the engineers at NGINX. Hard to go wrong there. The other popular option is <code>ingress-nginx</code>, the community-maintained, NGINX-supported project. It&rsquo;s been a while since I tried either one, but I hear good things about ingress-nginx because there&rsquo;s a large community behind it and it should be easier to find answers if you have questions. If you need something more production ready (and have your heart set on NGINX), then maybe consider <code>kubernetes-ingress</code>.</p>
<h1 id="why-traefik">Why Traefik?</h1>
<p>A valid question you might ask, and I&rsquo;ve asked myself, is why would I consider using Traefik instead of many people&rsquo;s go-to NGINX? These are my reasons:</p>
<ul>
<li>I settled on Traefik years ago when it was still v1 for things I was running in Docker. It provided the ability to EASILY reverse proxy to my applications by simply setting a few labels in my docker-compose.yml files, INCLUDING the ability to automatically get certificates from Let&rsquo;s Encrypt using DNS-01 challenge (offering legitimate certificates even for internal-only applications).</li>
<li>It&rsquo;s open-source, it&rsquo;s fast, and it just works (I&rsquo;ve never had issues or run into bugs personally using Traefik)</li>
<li>When I first went to Traefik, it had more features than NGINX. Notably it can dynamically add routes/rules without restarting or reloading. Maybe the NGINX controller functions the same way today, but in the Docker compose world that was not the case.</li>
<li>Native support for HTTP/3 - it seems to be more on the cutting edge, supporting new technologies faster than NGINX. I know - newer isn&rsquo;t always better, just look at Debian. But for my homelab, newer is almost always better :)</li>
<li>Middleware - more features readily available such as IP whitelisting, WAF options, etc. that you can easily plug into routes for additional functionality</li>
</ul>
<p>In Kubernetes it matters a lot less since the Ingress resource is another layer of abstraction, and either way your stuff will get routed. But at some point you may have specific requirements, certain middleware functionality, performance requirements, etc., and that may drive the decision to go with whatever tool best meets those requirements.</p>
<h2 id="why-not-nginx-or-kong-or-something-else">Why Not NGINX Or Kong Or Something Else?</h2>
<p>I have no argument against using NGINX or anything else. I just like the Traefik project and it does what I need in my homelab, so I will continue using it for now :)</p>
<h1 id="installationsetup">Installation/Setup</h1>
<p>You can install the helm chart with no extra options and it will &ldquo;just work.&rdquo; There are also some good options you might want to at least know about. These can also be changed later, so it&rsquo;s not a big deal if you change your mind.</p>
<p><a href="https://doc.traefik.io/traefik/getting-started/install-traefik/#use-the-helm-chart">https://doc.traefik.io/traefik/getting-started/install-traefik/#use-the-helm-chart</a></p>
<h2 id="vanilla-installation">Vanilla Installation</h2>
<ul>
<li>For any helm installation, always start by adding the helm repo:
<ul>
<li><code>helm repo add traefik https://traefik.github.io/charts</code></li>
<li><code>helm repo update</code></li>
</ul>
</li>
<li>Vanilla Installation:
<ul>
<li><code>helm install traefik traefik/traefik</code></li>
</ul>
</li>
</ul>
<p>You can start deploying Ingress resources. But keep reading to understand some of the ingress controller options and why you might consider changing them.</p>
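<p>As a quick smoke test, a minimal Ingress routed through Traefik looks roughly like this (the <code>whoami</code> Service and hostname are placeholders, and the ingress class name should match whatever your chart release created; check <code>kubectl get ingressclass</code>):</p>

```yaml
# Minimal Ingress smoke test; names are placeholders for your own app
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: whoami
spec:
  ingressClassName: traefik   # verify with: kubectl get ingressclass
  rules:
    - host: whoami.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: whoami   # an existing Service exposing your app
                port:
                  number: 80
```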
<h2 id="deployment-vs-daemonset">Deployment Vs. DaemonSet</h2>
<p>By default the Helm chart creates a Deployment with 1 replica, based on the <a href="https://github.com/traefik/traefik-helm-chart/blob/master/traefik/values.yaml">values.yaml file</a>. You could scale this up to 3 replicas if you&rsquo;re looking for something more highly available, or you could consider a DaemonSet, where there will always be one Traefik instance on each worker node to handle routing. Keep in mind with a DaemonSet that if you ever scale up the number of worker nodes, that will also scale up the total number of ingress controller pods.</p>
<p>If you wanted to do something a little more elaborate like only run 3 replicas, but make sure they never run on the same node as each other (3 different physical nodes), you would need to get into some node affinity rules that I won&rsquo;t explain here because I haven&rsquo;t done that (yet).</p>
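<p>For reference, here is a sketch of what that could look like in the chart&rsquo;s values.yaml; I haven&rsquo;t run this myself, and the <code>affinity</code> key and pod labels should be double-checked against the chart version you deploy:</p>

```yaml
deployment:
  kind: Deployment
  replicas: 3

# Hard rule: never schedule two Traefik replicas on the same node
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app.kubernetes.io/name: traefik
        topologyKey: kubernetes.io/hostname
```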
<p>When customizing Helm charts, I tend to prefer downloading the values.yaml file, customizing it, then deploying using the custom values file. That way I can store that in version control and not have to think about messy JSON stuff in my shell commands.</p>
<ul>
<li>
<p>Download values file: <code>helm show values traefik/traefik &gt; values.yaml</code></p>
</li>
<li>
<p>Customize. Let&rsquo;s change to a DaemonSet by changing deployment.kind from Deployment to DaemonSet. Capitalization matters here! DaemonSet with a capital S:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nn">...</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">deployment</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c"># -- Enable deployment</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">enabled</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c"># -- Deployment or DaemonSet</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">DaemonSet</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nn">...</span><span class="w">
</span></span></span></code></pre></div></li>
<li>
<p>Let&rsquo;s say you have 10 worker nodes and you want Traefik to be replicated, but you don&rsquo;t need 10 instances running. In that case you could do a Deployment and set the number of replicas to 3.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nn">...</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">deployment</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c"># -- Enable deployment</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">enabled</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c"># -- Deployment or DaemonSet</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Deployment</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">replicas</span><span class="p">:</span><span class="w"> </span><span class="m">3</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nn">...</span><span class="w">
</span></span></span></code></pre></div></li>
<li>
<p>Deploy: <code>helm install traefik traefik/traefik -f values.yaml</code></p>
</li>
</ul>
<h2 id="namespace">Namespace</h2>
<p>Another consideration when deploying Traefik is whether you want to use a custom namespace. Using namespaces for security and organization is a best practice, rather than dumping everything into the default namespace. It&rsquo;s easy enough to split out Traefik, so why not?</p>
<ul>
<li>Create a namespace: <code>kubectl create ns traefik</code></li>
<li>Deploy: <code>helm install traefik traefik/traefik -f values.yaml -n traefik</code></li>
</ul>
<p>OR do it all in a single step by adding <code>--create-namespace</code></p>
<ul>
<li><code>helm install traefik traefik/traefik -f values.yaml -n traefik --create-namespace</code></li>
</ul>
<h2 id="verify-your-deployment-or-daemonset">Verify Your Deployment (or DaemonSet)</h2>
<ul>
<li>Vanilla install: <code>kubectl get deploy</code>
<pre tabindex="0"><code>NAME      READY   UP-TO-DATE    AVAILABLE   AGE
traefik   1/1     1             1           3m
</code></pre></li>
<li>Namespaced Deployment: <code>kubectl get deploy -n traefik</code>
<pre tabindex="0"><code>NAME      READY   UP-TO-DATE    AVAILABLE   AGE
traefik   3/3     3             3           14s
</code></pre></li>
<li>Namespaced DaemonSet: <code>kubectl get daemonset -n traefik</code>
<pre tabindex="0"><code>NAME      DESIRED   CURRENT   READY   UP-TO-DATE    AVAILABLE   NODE SELECTOR   AGE
traefik   3         3         3       3             3           &lt;none&gt;          12s
</code></pre></li>
</ul>
<h2 id="traefik-dashboard">Traefik Dashboard</h2>
<p>Traefik has a cool dashboard showing details about routes, middlewares, services, and other things. It&rsquo;s all read-only, but it&rsquo;s nice to look at and sometimes helpful when troubleshooting why a route isn&rsquo;t working as expected or what exactly it&rsquo;s routing to. But you have to enable access to it, and you should consider what kind of security you want in place for dashboard access.</p>
<p>Traefik doesn&rsquo;t recommend enabling the dashboard in production, but if you do, at least secure it. The easiest way to expose it while keeping it somewhat secure is basic authentication. Sounds good enough for me; let&rsquo;s try it out!</p>
<p><a href="https://github.com/traefik/traefik-helm-chart/blob/master/EXAMPLES.md#publish-and-protect-traefik-dashboard-with-basic-auth">https://github.com/traefik/traefik-helm-chart/blob/master/EXAMPLES.md#publish-and-protect-traefik-dashboard-with-basic-auth</a></p>
<ul>
<li>You can throw this in its own yaml file and apply this specific config to the existing installation. Let&rsquo;s call this <code>dashboard-basicauth.yaml</code>:
<ul>
<li>You&rsquo;ll need a DNS record pointing the host in the matchRule to the IP address on Traefik&rsquo;s LoadBalancer service.</li>
<li>Using websecure without specifying TLS options will result in a self-signed certificate and you&rsquo;ll get a warning in the browser.</li>
<li>Changing these things is beyond the scope of this tutorial - this is where I draw the line :)</li>
</ul>
</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="c"># Create an IngressRoute for the dashboard</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">ingressRoute</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">dashboard</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">enabled</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c"># Custom match rule with host domain</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">matchRule</span><span class="p">:</span><span class="w"> </span><span class="l">Host(`traefik-dashboard.example.com`)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">entryPoints</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">&#34;web&#34;</span><span class="p">,</span><span class="s2">&#34;websecure&#34;</span><span class="p">]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c"># Add custom middlewares : authentication and redirection</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">middlewares</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">traefik-dashboard-auth</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="c"># Create the custom middlewares used by the IngressRoute dashboard (can also be created in another way).</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="c"># /!\ Yes, you need to replace &#34;changeme&#34; password with a better one. /!\</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">extraObjects</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Secret</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">traefik-dashboard-auth-secret</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">type</span><span class="p">:</span><span class="w"> </span><span class="l">kubernetes.io/basic-auth</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">stringData</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">username</span><span class="p">:</span><span class="w"> </span><span class="l">admin</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">password</span><span class="p">:</span><span class="w"> </span><span class="l">changeme</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">traefik.io/v1alpha1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Middleware</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">traefik-dashboard-auth</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">basicAuth</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">secret</span><span class="p">:</span><span class="w"> </span><span class="l">traefik-dashboard-auth-secret</span><span class="w">
</span></span></span></code></pre></div><ul>
<li>Apply the config: <code>helm upgrade --reuse-values -n traefik -f dashboard-basicauth.yaml traefik traefik/traefik</code></li>
</ul>
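<p>As an aside, if you&rsquo;d rather not keep the password in plain text, my understanding is that Traefik&rsquo;s basicAuth middleware also accepts htpasswd-format entries in a secret&rsquo;s <code>users</code> field. One way to generate such an entry is with openssl (the <code>admin</code>/<code>changeme</code> credentials below are placeholders):</p>
<pre tabindex="0"><code># Generate an htpasswd-style line for user "admin" (apr1/MD5 hash)
echo "admin:$(openssl passwd -apr1 changeme)"
</code></pre>
<p>The output (something like <code>admin:$apr1$...</code>) is what would go into the <code>users</code> field instead of a plaintext password.</p>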
<p>Note that the IngressRoute you set up in this config will NOT appear in your cluster as an Ingress resource. So if you run <code>kubectl get ingress -A</code> you won&rsquo;t find it and can&rsquo;t verify it that way. It&rsquo;s a Traefik-specific custom resource, so check for it with <code>kubectl get ingressroutes -n traefik</code> instead.</p>
<h2 id="http-redirect-to-https">HTTP Redirect To HTTPS</h2>
<p>Redirecting HTTP to HTTPS is another common thing that most people probably want. You can do this on a per-service basis, or configure Traefik to automatically redirect HTTP to HTTPS for everything. In the latter case, if you happen to have something that you don&rsquo;t want redirected, you would need to create another entrypoint on a different port besides 80.</p>
<p><a href="https://doc.traefik.io/traefik/routing/entrypoints/#redirection">https://doc.traefik.io/traefik/routing/entrypoints/#redirection</a></p>
<p>I&rsquo;m confused about the mismatch between the values.yaml file and the documentation. In values.yaml there&rsquo;s a field called <code>redirectTo</code> that expects an object. I worked out that it requires a <code>port</code> and may also take a <code>permanent</code> field (true/false). However, when I upgraded the Helm release with these values, it didn&rsquo;t seem to have any effect. <a href="https://doc.traefik.io/traefik/routing/entrypoints/#redirection">The documentation</a>, on the other hand, describes a structure that looks like <code>entryPoints.web.http.redirections.entryPoint.[to|scheme|permanent|priority]</code>. Since that&rsquo;s how it&rsquo;s configured everywhere else, why not use the same structure in values.yaml?</p>
<p>We can (and should?) ignore <code>redirectTo</code> and just go right to <code>additionalArguments</code> instead, using the familiar syntax in the docs:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">additionalArguments</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="s2">&#34;--entrypoints.web.http.redirections.entryPoint.to=:443&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="s2">&#34;--entrypoints.web.http.redirections.entryPoint.scheme=https&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="s2">&#34;--entrypoints.web.http.redirections.entryPoint.permanent=true&#34;</span><span class="w">
</span></span></span></code></pre></div><p>Note that I used <code>:443</code> as the redirect target instead of the entryPoint <strong>name</strong>, since using <code>websecure</code> here will actually cause a redirect to port 8443. I&rsquo;m not sure whether that&rsquo;s a bug in the Helm chart, but this works and makes sense.</p>
<p>And in case you&rsquo;re not familiar, &ldquo;permanent&rdquo; determines which HTTP response is used: a normal redirect (HTTP 302) or a permanent redirect (HTTP 301).</p>
<h2 id="logging-externaltrafficpolicy-and-ipallowlist-middleware">Logging, externalTrafficPolicy and IPAllowList Middleware</h2>
<blockquote>
<p>Note: This is an update as of 3/10/2025</p>
</blockquote>
<p>Access logs are not enabled by default. If you want them, add this section to your values.yaml file:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">logs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">access</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">enabled</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span></code></pre></div><p>This makes it so you can view access logs from the pod or deployment using something like <code>kubectl -n traefik logs deployments/traefik</code>.</p>
<p>After enabling this, I realized all requests had the same source IP of 10.244.1.0. I was trying to use the IPAllowList middleware for a <code>/admin</code> endpoint on a service but it wasn&rsquo;t working due to Traefik being unable to see the true source IP. After a bit of research, I found that I could change the <code>externalTrafficPolicy</code> from Cluster to Local. Read <a href="https://kubernetes.io/docs/tasks/access-application-cluster/create-external-load-balancer/#preserving-the-client-source-ip">https://kubernetes.io/docs/tasks/access-application-cluster/create-external-load-balancer/#preserving-the-client-source-ip</a> for more info.</p>
<p>There are pros and cons to this approach. If you need to preserve the source IP, I found this to be the best option. The downside is that traffic can only come in on a node where one of the Traefik pods lives, so the earlier decision about running multiple replicas or a DaemonSet comes into play here. If you run 2-3 replicas or use a DaemonSet, this is not a problem.</p>
<p>Change externalTrafficPolicy to Local:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">service</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">externalTrafficPolicy</span><span class="p">:</span><span class="w"> </span><span class="l">Local</span><span class="w">
</span></span></span></code></pre></div><p>I read that it might be possible to use X-Forwarded-For instead, though I didn&rsquo;t dive very far into the X-Forwarded-For settings or the middleware options. It might be possible to leave this policy set to Cluster and use the header info instead. However, consider the security implications: you can&rsquo;t necessarily trust the X-Forwarded-For header on internet traffic, as it can easily be spoofed.</p>
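<p>If you do go down the X-Forwarded-For road, the <code>ipAllowList</code> middleware (covered in the next section) has an <code>ipStrategy</code> option for choosing which IP from the header gets checked. This is an untested sketch; <code>depth: 1</code> assumes exactly one trusted proxy sits between the client and Traefik:</p>
<pre tabindex="0"><code># Middleware spec sketch (not verified) - check an IP from the
# X-Forwarded-For header instead of the connection's source IP
spec:
  ipAllowList:
    sourceRange:
      - 192.168.0.0/23
    ipStrategy:
      depth: 1   # take the IP that is 1 position from the right of X-Forwarded-For
</code></pre>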
<h3 id="ipallowlist-middleware">IPAllowList Middleware</h3>
<p>So anyway, we added logging and changed the externalTrafficPolicy to Local so that Traefik sees the true source IP of each request and we can see it in the logs. Now I want to show a quick example of how to restrict access to a certain path on a service while allowing all traffic to reach every other page. While I&rsquo;m sure there are many ways to tackle this problem, this is a straightforward way to block access to <code>/admin</code> for a service.</p>
<p>I have an ecommerce shop that I want to make available on the internet. It has an admin login page at <code>/admin</code>, so I want to block any public traffic from reaching that page, no exceptions. I&rsquo;m using an IngressRoute in this example:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">cert-manager.io/v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Certificate</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">evershop.example.com</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">namespace</span><span class="p">:</span><span class="w"> </span><span class="l">evershop</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">secretName</span><span class="p">:</span><span class="w"> </span><span class="l">evershop.example.com-secret</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">issuerRef</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">letsencrypt</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">ClusterIssuer</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">dnsNames</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">evershop.example.com</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nn">---</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">traefik.io/v1alpha1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">IngressRoute</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">evershop-service-ingressroute</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">namespace</span><span class="p">:</span><span class="w"> </span><span class="l">evershop</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">annotations</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">traefik.ingress.kubernetes.io/router.entrypoints</span><span class="p">:</span><span class="w"> </span><span class="l">websecure</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">tls</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">secretName</span><span class="p">:</span><span class="w"> </span><span class="l">evershop.example.com-secret</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">routes</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">match</span><span class="p">:</span><span class="w"> </span><span class="l">Host(`evershop.example.com`) &amp;&amp; PathPrefix(`/admin`)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Rule</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">services</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">app</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="m">3000</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">scheme</span><span class="p">:</span><span class="w"> </span><span class="l">http</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">middlewares</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">admin-ipallowlist</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">match</span><span class="p">:</span><span class="w"> </span><span class="l">Host(`evershop.example.com`)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Rule</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">services</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">app</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="m">3000</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">scheme</span><span class="p">:</span><span class="w"> </span><span class="l">http</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nn">---</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">traefik.io/v1alpha1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Middleware</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">admin-ipallowlist</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">namespace</span><span class="p">:</span><span class="w"> </span><span class="l">evershop</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">ipAllowList</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">sourceRange</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="m">192.168.0.0</span><span class="l">/23</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="m">10.0.50.0</span><span class="l">/24</span><span class="w">
</span></span></span></code></pre></div><p>To break down what is happening there: we have a Certificate resource, which is needed to get a signed certificate through Cert-Manager. At the bottom we have a Middleware, a Traefik-specific CRD; this one uses <code>ipAllowList</code>, and I specified two local subnets that I want to allow. Finally, we have an IngressRoute. Within that, there are two routes that can be matched, and the one that includes the PathPrefix of <code>/admin</code> in the match rule has an extra <code>middlewares:</code> section that references <code>admin-ipallowlist</code>. That tells Traefik that any inbound traffic to this service with <code>/admin</code> in the path should also be checked against the IPAllowList <strong>before</strong> being routed to the backend <code>app</code> service. If the source IP doesn&rsquo;t match the sourceRange, Traefik returns a 403 Forbidden page.</p>
<p>You can customize the 403 Forbidden page, but I didn&rsquo;t want to do that at this time. Just know that it can be routed to another page (or another service) by using another Middleware of type <code>errors</code>: <a href="https://doc.traefik.io/traefik/middlewares/http/errorpages/">https://doc.traefik.io/traefik/middlewares/http/errorpages/</a></p>
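<p>For reference, a custom error page setup would look roughly like this sketch. The <code>error-pages</code> service is hypothetical, and I haven&rsquo;t tested this myself:</p>
<pre tabindex="0"><code># Middleware sketch (not verified) - send 403 responses to a custom page
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: custom-403
  namespace: evershop
spec:
  errors:
    status:
      - "403"
    service:
      name: error-pages   # hypothetical service serving static error pages
      port: 80
    query: /{status}.html  # e.g. fetches /403.html from that service
</code></pre>
<p>You would then add <code>custom-403</code> to the <code>middlewares:</code> list of the route, after the allowlist.</p>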
<h2 id="other-options">Other Options</h2>
<p>Some of the options you might notice depend on persistent storage. Just be sure you have storage configured in your cluster before using any of those options (I&rsquo;ll talk about storage in my next post).</p>
<p>Specifically I noticed the persistence section in the values.yaml file, but that only pertains to storing certificates acquired directly through Traefik. I would just recommend ignoring that option and sticking with cert-manager.</p>
<p>Traefik provides some examples of common use cases which might also be helpful: <a href="https://github.com/traefik/traefik-helm-chart/blob/master/EXAMPLES.md">https://github.com/traefik/traefik-helm-chart/blob/master/EXAMPLES.md</a></p>
<p>Beyond all of that, if you need to configure custom entryPoints (say you need ingress on a weird port like 4567), read the Traefik docs, make the change in the Helm values and run the upgrade. I might come back later and add a section explaining this in more detail, but for now I just want to mention it&rsquo;s a possibility. I&rsquo;ve used it for the GitLab container registry among other things.</p>
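<p>As a taste, adding a custom entrypoint in the Helm values looks roughly like this. The <code>registry</code> name and port 4567 are just examples, and the exact <code>expose</code> syntax varies between chart versions, so double-check against your chart&rsquo;s values.yaml:</p>
<pre tabindex="0"><code># values.yaml sketch (not verified) - custom entrypoint on port 4567
ports:
  registry:
    port: 4567          # port Traefik listens on inside the container
    expose:
      default: true     # expose it on the default (LoadBalancer) service
    exposedPort: 4567   # port exposed on the service
    protocol: TCP
</code></pre>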
<h2 id="changed-your-mind-after-installing-or-just-need-to-upgrade">Changed Your Mind After Installing Or Just Need To Upgrade?</h2>
<p>This is basic Helm stuff: just update your values and run the same command as before, replacing <code>install</code> with <code>upgrade</code>.</p>
<ul>
<li>Update values.yaml</li>
<li>Upgrade Helm chart: <code>helm upgrade -n traefik -f values.yaml traefik traefik/traefik</code></li>
</ul>
<p>It almost feels like you just type &ldquo;traefik traefik traefik&rdquo; over again a few times, but slightly more nuanced. See the Helm docs for more details.</p>
<p><a href="https://helm.sh/docs/helm/helm_upgrade/">https://helm.sh/docs/helm/helm_upgrade/</a></p>
<h2 id="traefik-version-upgrade">Traefik Version Upgrade</h2>
<p>Always read release notes before upgrading! Sometimes there is more to it than just running the Helm upgrade command. See <a href="https://github.com/traefik/traefik-helm-chart?tab=readme-ov-file#upgrading">https://github.com/traefik/traefik-helm-chart?tab=readme-ov-file#upgrading</a> for more details around CRDs and other considerations.</p>
<h1 id="tldr-deploy-using-options-im-using">TLDR; Deploy Using Options I&rsquo;m Using</h1>
<ul>
<li>Deployment: <code>Deployment</code> with 2 replicas
<ul>
<li>I don&rsquo;t want to run too many replicas with a DaemonSet if I scale up the number of workers.</li>
<li>This still helps if a pod crashes: another replica keeps serving traffic while the failed pod gets recycled.</li>
</ul>
</li>
<li>Namespace: <code>traefik</code>
<ul>
<li>I&rsquo;m trying to follow best practices and use separate namespaces for different things.</li>
</ul>
</li>
<li>Dashboard: Yes
<ul>
<li>I like having access to this in my homelab, but I will protect it with basic auth as a baseline security best practice.</li>
</ul>
</li>
<li>HTTP Permanent Redirect: True
<ul>
<li>I don&rsquo;t have any special use cases for plain HTTP right now; if something needed to be non-TLS, I would just create a new entrypoint on a different port.</li>
</ul>
</li>
</ul>
<h2 id="write-valuesyaml">Write values.yaml</h2>
<ul>
<li><code>values.yaml</code> ONLY needs to contain what you&rsquo;re changing, and defaults will be used automatically. So KISS and don&rsquo;t include default values.
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="c"># Deployment with 2 replicas</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">deployment</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">replicas</span><span class="p">:</span><span class="w"> </span><span class="m">2</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="c"># Dashboard with basic auth</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">ingressRoute</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">dashboard</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">enabled</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">matchRule</span><span class="p">:</span><span class="w"> </span><span class="l">Host(`traefik-dashboard.example.com`)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">entryPoints</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">&#34;web&#34;</span><span class="p">,</span><span class="s2">&#34;websecure&#34;</span><span class="p">]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">middlewares</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">traefik-dashboard-auth</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">extraObjects</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Secret</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">traefik-dashboard-auth-secret</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">type</span><span class="p">:</span><span class="w"> </span><span class="l">kubernetes.io/basic-auth</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">stringData</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="c"># Change the password PLEASE!!!</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">username</span><span class="p">:</span><span class="w"> </span><span class="l">admin</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">password</span><span class="p">:</span><span class="w"> </span><span class="l">changeme</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">traefik.io/v1alpha1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Middleware</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">traefik-dashboard-auth</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">basicAuth</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">secret</span><span class="p">:</span><span class="w"> </span><span class="l">traefik-dashboard-auth-secret</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="c"># HTTP permanent redirect</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">additionalArguments</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="s2">&#34;--entrypoints.web.http.redirections.entryPoint.to=:443&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="s2">&#34;--entrypoints.web.http.redirections.entryPoint.scheme=https&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="s2">&#34;--entrypoints.web.http.redirections.entryPoint.permanent=true&#34;</span><span class="w">
</span></span></span></code></pre></div></li>
<li>Deploy using the one-size-fits-all command:
<ul>
<li><code>helm upgrade --install -n traefik --create-namespace -f values.yaml traefik traefik/traefik</code></li>
</ul>
</li>
</ul>
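<p>As an aside on the basic-auth secret above: the plaintext <code>kubernetes.io/basic-auth</code> secret works, but if you&rsquo;d rather not store the password in plaintext, Traefik&rsquo;s basicAuth middleware also accepts a secret whose <code>users</code> key contains htpasswd-formatted entries. A minimal sketch of generating one, assuming <code>openssl</code> is available (the <code>admin</code>/<code>changeme</code> credentials are placeholders):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"># Produce an htpasswd-style "user:hash" entry using openssl's apr1 scheme,
# the same MD5-based format that `htpasswd -m` emits and Traefik accepts.
HASH=$(openssl passwd -apr1 changeme)
echo "admin:$HASH"
</code></pre></div>
<p>Put the resulting line under a <code>users</code> key in the secret&rsquo;s <code>stringData</code> instead of the plaintext <code>username</code>/<code>password</code> fields.</p>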
<h1 id="testing-ingress">Testing Ingress</h1>
<p>Finally, the easy part. Deploy something, then deploy an Ingress resource, and test. Let&rsquo;s try it.</p>
<h2 id="deployment">Deployment</h2>
<p>Create a test deployment with 2 replicas. Traefik provides a small utility image called &ldquo;whoami&rdquo; that echoes request details back to the client, which makes it handy for testing. This manifest also includes a Service (defaulting to type ClusterIP), which must exist before the Ingress resource can route to it.</p>
<ul>
<li>Create <code>whoami-deployment.yaml</code>:
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">apps/v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Deployment</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">whoami-deployment</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">selector</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">matchLabels</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">app</span><span class="p">:</span><span class="w"> </span><span class="l">whoami</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">replicas</span><span class="p">:</span><span class="w"> </span><span class="m">2</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">template</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">labels</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">app</span><span class="p">:</span><span class="w"> </span><span class="l">whoami</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">containers</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">image</span><span class="p">:</span><span class="w"> </span><span class="l">traefik/whoami:latest</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">whoami</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">ports</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span>- <span class="nt">containerPort</span><span class="p">:</span><span class="w"> </span><span class="m">80</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nn">---</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Service</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">whoami-svc</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">ports</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="m">80</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">targetPort</span><span class="p">:</span><span class="w"> </span><span class="m">80</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">protocol</span><span class="p">:</span><span class="w"> </span><span class="l">TCP</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">selector</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">app</span><span class="p">:</span><span class="w"> </span><span class="l">whoami</span><span class="w">
</span></span></span></code></pre></div></li>
<li>Create the deployment: <code>kubectl apply -f whoami-deployment.yaml</code></li>
<li>Verify: <code>kubectl get deploy</code></li>
</ul>
<h2 id="ingress">Ingress</h2>
<p>This is a super basic Ingress resource which can be useful for testing. It includes two annotations:</p>
<ul>
<li>
<p>One for cert-manager to issue a certificate from the letsencrypt-staging ClusterIssuer for the hostname specified under Ingress.spec.tls.hosts. If you had used an Issuer instead of ClusterIssuer, you would need to change that line to <code>cert-manager.io/issuer: &quot;letsencrypt-staging&quot;</code></p>
</li>
<li>
<p>One to specify which entryPoints Traefik should route traffic on. This can be a single entrypoint, or a comma separated list like <code>web,websecure</code> - see <a href="https://doc.traefik.io/traefik/routing/providers/kubernetes-ingress/#on-ingress">https://doc.traefik.io/traefik/routing/providers/kubernetes-ingress/#on-ingress</a></p>
</li>
<li>
<p>Create <code>whoami-ingress.yaml</code>, updating the host in both places (and the cluster issuer name if yours differs):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">networking.k8s.io/v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Ingress</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">whoami-ingress</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">annotations</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">cert-manager.io/cluster-issuer</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;letsencrypt-staging&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">traefik.ingress.kubernetes.io/router.entrypoints</span><span class="p">:</span><span class="w"> </span><span class="l">websecure</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">ingressClassName</span><span class="p">:</span><span class="w"> </span><span class="l">traefik</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">tls</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">hosts</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">whoami.example.com</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">secretName</span><span class="p">:</span><span class="w"> </span><span class="l">whoami-example-tls</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">rules</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">host</span><span class="p">:</span><span class="w"> </span><span class="l">whoami.example.com</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">http</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">paths</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">path</span><span class="p">:</span><span class="w"> </span><span class="l">/</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">pathType</span><span class="p">:</span><span class="w"> </span><span class="l">Prefix</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">backend</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">service</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">whoami-svc</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">port</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">              </span><span class="nt">number</span><span class="p">:</span><span class="w"> </span><span class="m">80</span><span class="w">
</span></span></span></code></pre></div></li>
<li>
<p>Verify: <code>kubectl get ingress</code></p>
</li>
<li>
<p>Test</p>
<ul>
<li>Update DNS or your local hosts file to point your host (whoami.example.com) to the IP address of Traefik&rsquo;s LoadBalancer service (<code>kubectl get svc -n traefik</code>)</li>
<li>Navigate to whoami.example.com in your browser. If you test too soon, you&rsquo;ll get Traefik&rsquo;s default self-signed certificate. Once the real certificate is issued, if you&rsquo;re still using the staging issuer you&rsquo;ll still have to click through the browser warning, but the certificate should show as issued by Let&rsquo;s Encrypt Staging. With a production issuer, the connection should be secure with no warnings.</li>
</ul>
</li>
</ul>
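<p>If you don&rsquo;t want to touch real DNS while testing, a hosts-file entry is enough. A sketch, where the IP address is a placeholder for your Traefik LoadBalancer&rsquo;s external IP:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"># /etc/hosts (Linux/macOS) or C:\Windows\System32\drivers\etc\hosts (Windows)
# 192.0.2.10 is a placeholder - use the EXTERNAL-IP from `kubectl get svc -n traefik`
192.0.2.10  whoami.example.com
</code></pre></div>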
<h2 id="ingressroute">IngressRoute</h2>
<p>This is a Traefik-specific way to create an Ingress. It makes things easier if you want to use Traefik features such as Middleware. I&rsquo;ve been using this method for everything just to be consistent, plus for some reason my certificates are issued more quickly by cert-manager with this method than with an Ingress and annotations. Let&rsquo;s recreate the previous Ingress example as an IngressRoute. You will need to add a separate Certificate resource.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="c"># whoami-ingressroute.yaml</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">cert-manager.io/v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Certificate</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">whoami.example.com</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">namespace</span><span class="p">:</span><span class="w"> </span><span class="l">whoami</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">secretName</span><span class="p">:</span><span class="w"> </span><span class="l">whoami.example.com-secret</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">issuerRef</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">letsencrypt-staging</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">ClusterIssuer</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">dnsNames</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">whoami.example.com</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nn">---</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">traefik.io/v1alpha1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">IngressRoute</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">whoami-service-ingressroute</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">namespace</span><span class="p">:</span><span class="w"> </span><span class="l">whoami</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">entryPoints</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">websecure</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">tls</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">secretName</span><span class="p">:</span><span class="w"> </span><span class="l">whoami.example.com-secret</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">routes</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">match</span><span class="p">:</span><span class="w"> </span><span class="l">Host(`whoami.example.com`)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Rule</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">services</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">whoami-svc</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="m">80</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">scheme</span><span class="p">:</span><span class="w"> </span><span class="l">http</span><span class="w">
</span></span></span></code></pre></div><ul>
<li>Apply using <code>kubectl apply -f whoami-ingressroute.yaml</code></li>
<li>View IngressRoute <code>kubectl get ingressroute -n whoami</code>
<ul>
<li>This will not appear as an Ingress anymore, since it&rsquo;s an IngressRoute CRD now. So <code>kubectl get ing -n whoami</code> will not list it.</li>
</ul>
</li>
</ul>
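<p>Since the main selling point of IngressRoute is easy access to Middleware, here&rsquo;s a hypothetical sketch of attaching one: it reuses the <code>traefik-dashboard-auth</code> basic-auth Middleware from the values.yaml above to password-protect whoami. Note that referencing a Middleware in another namespace requires the Kubernetes CRD provider&rsquo;s <code>allowCrossNamespace</code> option to be enabled, so you may prefer to create the Middleware in the <code>whoami</code> namespace instead:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml">apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: whoami-service-ingressroute
  namespace: whoami
spec:
  entryPoints:
    - websecure
  tls:
    secretName: whoami.example.com-secret
  routes:
  - match: Host(`whoami.example.com`)
    kind: Rule
    middlewares:
      - name: traefik-dashboard-auth
        namespace: traefik   # cross-namespace reference - requires allowCrossNamespace
    services:
      - name: whoami-svc
        port: 80
</code></pre></div>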
<h1 id="what-about-gateway-api">What About Gateway API?</h1>
<p>This Kong article has a lot of good information on the topic: <a href="https://konghq.com/blog/engineering/gateway-api-vs-ingress">https://konghq.com/blog/engineering/gateway-api-vs-ingress</a></p>
<p>I have read Gateway API described as the successor to ingress controllers, but the official FAQ says that Gateway API will not replace the Ingress API, so we are safe to continue using Ingress (<a href="https://gateway-api.sigs.k8s.io/faq/#will-gateway-api-replace-the-ingress-api">https://gateway-api.sigs.k8s.io/faq/#will-gateway-api-replace-the-ingress-api</a>). There is no need to use Gateway API if all you need is Ingress.</p>
<p>Gateway API solves some inherent limitations of Ingress, primarily that Ingress only handles Layer 7 (HTTP) traffic, while Gateway API also covers Layer 4 (TCP/UDP) routing.</p>
<h2 id="gateway-api-tutorial">Gateway API Tutorial?</h2>
<p>Nah.</p>
<p>Gateway API would need a separate tutorial of its own, so I won&rsquo;t even try to begin here. The Traefik Helm chart examples show that you have to explicitly enable Gateway API support in the Traefik installation before using it: <a href="https://github.com/traefik/traefik-helm-chart/blob/master/EXAMPLES.md#use-kubernetes-gateway-api">https://github.com/traefik/traefik-helm-chart/blob/master/EXAMPLES.md#use-kubernetes-gateway-api</a></p>
<p>Also - How to migrate from Ingress to Gateway API: <a href="https://gateway-api.sigs.k8s.io/guides/migrating-from-ingress/">https://gateway-api.sigs.k8s.io/guides/migrating-from-ingress/</a></p>
]]></content:encoded>
    </item>
    
    <item>
      <title>Kubernetes Homelab Series Part 4.5 - Debug Pod</title>
      <link>https://blog.dalydays.com/post/kubernetes-homelab-series-part-4.5-debug-pod/</link>
      <pubDate>Wed, 20 Nov 2024 00:00:00 +0000</pubDate>
      
      <guid>https://blog.dalydays.com/post/kubernetes-homelab-series-part-4.5-debug-pod/</guid>
      <description>A quick look at running a debug pod in lieu of SSH for Talos Linux.</description>
      <content:encoded><![CDATA[<h1 id="why-is-this-needed">Why is this needed?</h1>
<p>If you&rsquo;re like me and you have a habit of SSHing into an environment to troubleshoot an issue, you may be at a loss with Talos Linux since one does not simply SSH into a Talos node. Lord Of The Rings reference right there :)</p>
<p>What you can do is run a &ldquo;debug&rdquo; container with elevated privileges in the kube-system namespace, which gives you roughly the same capability. Here&rsquo;s how.</p>
<p>Real quick: Talos themselves wrote an article on this, including recommendations for alternative methods. I&rsquo;m forging ahead with the &ldquo;old&rdquo; way, but don&rsquo;t discount the new approach either, as this type of thing is likely the future and we need to adapt! Talos offers a lot of excellent troubleshooting tools via their talosctl API, and if all you need is something like netstat, they&rsquo;ve got you covered.</p>
<p><a href="https://www.siderolabs.com/blog/how-to-ssh-into-talos-linux/">https://www.siderolabs.com/blog/how-to-ssh-into-talos-linux/</a></p>
<p>On the other hand, I just needed to test manually mounting NVMe over TCP when democratic-csi wasn&rsquo;t working as expected, and this was the only way :)</p>
<p>In case anyone is wondering why not just run <code>kubectl debug</code> and target the node: that method can&rsquo;t mount <code>/dev</code>, which was needed on Talos to debug CSI driver issues. The alternative is to run a pod with <code>/dev</code> mounted and then attach to it.</p>
<h1 id="show-me-the-money">Show Me The Money!</h1>
<p><a href="https://kubernetes.io/docs/tasks/debug/debug-application/debug-running-pod/#static-profile">https://kubernetes.io/docs/tasks/debug/debug-application/debug-running-pod/#static-profile</a></p>
<p>Anyway, here&rsquo;s how to deploy the debug container.</p>
<ul>
<li>Write a new file <code>debugpod.yaml</code> and replace [nodename] with the name of the specific node you want this deployed to in your cluster. Optionally change the image if you don&rsquo;t want to use Alpine. This is my go-to default because of how small and fast it is, and installing packages with APK is about as fast as it gets.
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Pod</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">debugpod</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">namespace</span><span class="p">:</span><span class="w"> </span><span class="l">kube-system</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">hostPID</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">containers</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">debugcontainer</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">image</span><span class="p">:</span><span class="w"> </span><span class="l">alpine:3.20</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">stdin</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">tty</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">securityContext</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">privileged</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">volumeMounts</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">dev-mount</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">mountPath</span><span class="p">:</span><span class="w"> </span><span class="l">/dev</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">volumes</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">dev-mount</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">hostPath</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">path</span><span class="p">:</span><span class="w"> </span><span class="l">/dev</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">nodeSelector</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">kubernetes.io/hostname</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="l">nodename]</span><span class="w">
</span></span></span></code></pre></div></li>
<li>Deploy the pod: <code>kubectl apply -f debugpod.yaml</code></li>
<li>Drop into a shell inside the container: <code>kubectl exec -it debugpod -n kube-system -- /bin/sh</code></li>
<li>Alternatively, run an ephemeral container in the pod with sysadmin privileges (root): <code>kubectl debug -it debugpod --image=alpine:3.21 --target=debugcontainer -n kube-system --profile=sysadmin</code></li>
<li>Do something useful. In my case, I wanted to check whether a pod running in this cluster could reach the Cloudflare and Google public DNS servers, so:
<ul>
<li>Install telnet: <code>apk add --update busybox-extras</code></li>
<li><code>telnet 1.1.1.1 53</code> - Ctrl+C, then e to exit</li>
<li><code>telnet 8.8.8.8 53</code> - Ctrl+C, then e to exit</li>
</ul>
</li>
<li>Whatever else you want to test :) For example, check this democratic-csi issue for what I did from here to test nvmeof mounts: <a href="https://github.com/democratic-csi/democratic-csi/issues/418">https://github.com/democratic-csi/democratic-csi/issues/418</a></li>
</ul>
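<p>One step the list above leaves out: this pod runs privileged with host access, so don&rsquo;t leave it lying around once you&rsquo;re done. A quick cleanup, using the pod name and namespace from the manifest above:</p>

```shell
# Remove the privileged debug pod when you're finished with it
kubectl delete pod debugpod -n kube-system
```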
]]></content:encoded>
    </item>
    
    <item>
      <title>Kubernetes Homelab Series Part 4 - Certificates With cert-manager and Let&#39;s Encrypt</title>
      <link>https://blog.dalydays.com/post/kubernetes-homelab-series-part-4-certificates-with-cert-manager-and-lets-encrypt/</link>
      <pubDate>Tue, 19 Nov 2024 00:00:00 +0000</pubDate>
      
      <guid>https://blog.dalydays.com/post/kubernetes-homelab-series-part-4-certificates-with-cert-manager-and-lets-encrypt/</guid>
      <description>A look into managing TLS certificates with cert-manager and Let&amp;rsquo;s Encrypt, avoiding manual renewals.</description>
      <content:encoded><![CDATA[<h1 id="what-is-cert-manager">What is cert-manager?</h1>
<p><a href="https://cert-manager.io/">https://cert-manager.io/</a><br>
cert-manager is a certificate controller for Kubernetes that can handle all of your certificate needs. That includes acquiring and automatically renewing certificates from Let&rsquo;s Encrypt, of course, but it can also act as a local CA (certificate authority) issuing private certificates between services.</p>
<h1 id="installationsetup">Installation/Setup</h1>
<p><a href="https://cert-manager.io/docs/">https://cert-manager.io/docs/</a><br>
I run Pi-hole internally on my network and also use the DNS challenge for internal-only hostnames. This means that when I request a new certificate and cert-manager attempts to look up the DNS challenge record, it can&rsquo;t resolve it through my Pi-hole.</p>
<p>The solution is to configure cert-manager to specifically use public DNS servers to do the lookup, and that is done by setting <code>dns01-recursive-nameservers</code> and <code>dns01-recursive-nameservers-only</code>.</p>
<p>For that reason, I don&rsquo;t use the &ldquo;Default static install&rdquo; method by just installing the manifest using kubectl. Instead, I use the Helm chart so that I can apply those overrides during the installation.</p>
<p>If you don&rsquo;t need to override the nameservers used for the DNS challenge, I would recommend using their &ldquo;Default static install&rdquo; method: <a href="https://cert-manager.io/docs/installation/#default-static-install">https://cert-manager.io/docs/installation/#default-static-install</a></p>
<p>If you&rsquo;re still with me and want to apply using the Helm chart, follow along!</p>
<ul>
<li>Make sure you have Helm version 3 or later: <a href="https://helm.sh/docs/intro/install/">https://helm.sh/docs/intro/install/</a></li>
<li>Add the helm repo:
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sh" data-lang="sh"><span class="line"><span class="cl">helm repo add jetstack https://charts.jetstack.io --force-update
</span></span></code></pre></div></li>
<li>Install the Helm chart. Note the <code>extraArgs</code> for DNS nameserver settings:
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sh" data-lang="sh"><span class="line"><span class="cl">helm install <span class="se">\
</span></span></span><span class="line"><span class="cl"><span class="se"></span>  cert-manager jetstack/cert-manager <span class="se">\
</span></span></span><span class="line"><span class="cl"><span class="se"></span>  --namespace cert-manager <span class="se">\
</span></span></span><span class="line"><span class="cl"><span class="se"></span>  --create-namespace <span class="se">\
</span></span></span><span class="line"><span class="cl"><span class="se"></span>  --version v1.16.1 <span class="se">\
</span></span></span><span class="line"><span class="cl"><span class="se"></span>  --set crds.enabled<span class="o">=</span><span class="nb">true</span> <span class="se">\
</span></span></span><span class="line"><span class="cl"><span class="se"></span>  --set <span class="s1">&#39;extraArgs={--dns01-recursive-nameservers-only,--dns01-recursive-nameservers=8.8.8.8:53\,1.1.1.1:53}&#39;</span>
</span></span></code></pre></div><ul>
<li>If you&rsquo;re doing anything fancy on your network like me, double check that your Kubernetes nodes running cert-manager are able to reach the dns01-recursive-nameservers on port 53</li>
</ul>
</li>
<li>Verify deployment: <code>kubectl get all -n cert-manager</code></li>
<li>(optional) Verify dns01-recursive-nameserver settings were applied: <code>kubectl -n cert-manager describe deploy cert-manager | grep dns01</code></li>
</ul>
<h2 id="verify-installation">Verify Installation</h2>
<p>At this point cert-manager should be running and ready to issue self-signed certificates. We can test that.</p>
<ul>
<li>Test with this file named <code>cert-manager-verify.yaml</code>:
<ul>
<li>Apply: <code>kubectl apply -f cert-manager-verify.yaml</code></li>
<li>Verify: <code>kubectl describe cert -n cert-manager-test</code></li>
<li>Cleanup: <code>kubectl delete -f cert-manager-verify.yaml</code></li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Namespace</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">cert-manager-test</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nn">---</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">cert-manager.io/v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Issuer</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">test-selfsigned</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">namespace</span><span class="p">:</span><span class="w"> </span><span class="l">cert-manager-test</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">selfSigned</span><span class="p">:</span><span class="w"> </span>{}<span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nn">---</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">cert-manager.io/v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Certificate</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">selfsigned-cert</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">namespace</span><span class="p">:</span><span class="w"> </span><span class="l">cert-manager-test</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">dnsNames</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">example.com</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">secretName</span><span class="p">:</span><span class="w"> </span><span class="l">selfsigned-cert-tls</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">issuerRef</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">test-selfsigned</span><span class="w">
</span></span></span></code></pre></div></li>
</ul>
<h1 id="getting-certificates-from-lets-encrypt-using-cloudflare-dns">Getting Certificates From Let&rsquo;s Encrypt (using Cloudflare DNS)</h1>
<p>Now we are ready to hook up Let&rsquo;s Encrypt so we can get certificates that are signed by a trusted authority. It&rsquo;s always a good idea to start with the staging issuer when testing Let&rsquo;s Encrypt so you don&rsquo;t run into rate limits with their production infrastructure.</p>
<p>I&rsquo;m using Cloudflare for my DNS nameservers so these steps will be based on that setup.</p>
<h2 id="getting-a-cloudflare-api-token">Getting A Cloudflare API Token</h2>
<ul>
<li>Create a Kubernetes secret resource with your Cloudflare API token. This allows cert-manager to add custom _acme-challenge records for domain validation. This will be the same API token you use for both Staging and Production certificates, so you only need to create one of these.
<ul>
<li>In Cloudflare, you can generate a new API token by navigating to your user account &gt; My Profile &gt; API Tokens &gt; Create Token. Make sure token permissions include <code>All zones - Zone Settings:Read, Zone:Read, DNS:Edit</code> or you can limit to a specific zone if you have multiple and want to use different API Tokens for different zones.</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Secret</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">letsencrypt-cloudflare-api-token-secret</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">namespace</span><span class="p">:</span><span class="w"> </span><span class="l">cert-manager</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">type</span><span class="p">:</span><span class="w"> </span><span class="l">Opaque</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">stringData</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">api-token</span><span class="p">:</span><span class="w"> </span><span class="l">&lt;api-token-goes-here&gt;</span><span class="w">
</span></span></span></code></pre></div></li>
<li>Apply the secret manifest: <code>kubectl apply -f letsencrypt-cloudflare-api-token-secret.yaml</code></li>
</ul>
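<p>Before wiring the token into cert-manager, it&rsquo;s worth confirming the token itself is valid. Cloudflare exposes a token verification endpoint for exactly this; a quick check, assuming the token is in a <code>CF_API_TOKEN</code> environment variable:</p>

```shell
# Ask Cloudflare to verify the API token; a valid token returns
# a JSON response containing "status": "active"
curl -s -H "Authorization: Bearer $CF_API_TOKEN" \
  https://api.cloudflare.com/client/v4/user/tokens/verify
```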
<h2 id="deploying-a-cert-manager-issuer---staging">Deploying A cert-manager Issuer - Staging</h2>
<p>cert-manager has two types of issuers: Issuer and ClusterIssuer. An Issuer is namespaced and gives you more fine-grained control, while a ClusterIssuer, as the name implies, can be used from anywhere in the Kubernetes cluster. Since I&rsquo;m not doing anything too complicated, and I want to be able to easily use cert-manager from multiple namespaces, I&rsquo;m using ClusterIssuer. If you want to use an Issuer instead, just pick a namespace; the steps are almost identical to this.</p>
<p>It&rsquo;s strongly recommended to test with Let&rsquo;s Encrypt&rsquo;s staging server first so you don&rsquo;t run into rate limits with their production infrastructure. Once you get staging working, it&rsquo;s simply a matter of updating the ACME server URL to the production endpoint and changing the name on the Issuer/ClusterIssuer (or you can run both at the same time).</p>
<p><a href="https://letsencrypt.org/docs/staging-environment/">https://letsencrypt.org/docs/staging-environment/</a></p>
<p>You can use any valid email address. When you request a new cert, the email address is registered with that particular request, and that&rsquo;s where they will send renewal/expiry notices.</p>
<ul>
<li>Create a manifest file for the staging issuer <code>letsencrypt-staging-clusterissuer.yaml</code>. Update your email address:
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">cert-manager.io/v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">ClusterIssuer</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">letsencrypt-staging</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">acme</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c"># The ACME server URL</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">server</span><span class="p">:</span><span class="w"> </span><span class="l">https://acme-staging-v02.api.letsencrypt.org/directory</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c"># Email address used for ACME registration</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">email</span><span class="p">:</span><span class="w"> </span><span class="l">mail@example.com</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c"># Name of a secret used to store the ACME account private key</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">privateKeySecretRef</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">letsencrypt-staging-issuer-secret</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c"># Enable the DNS-01 challenge provider</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">solvers</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">dns01</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">cloudflare</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">apiTokenSecretRef</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">letsencrypt-cloudflare-api-token-secret</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">key</span><span class="p">:</span><span class="w"> </span><span class="l">api-token</span><span class="w">
</span></span></span></code></pre></div></li>
<li>Apply the staging issuer: <code>kubectl apply -f letsencrypt-staging-clusterissuer.yaml</code></li>
<li>Verify your ClusterIssuer exists (or Issuer if that&rsquo;s what you&rsquo;re using): <code>kubectl get clusterissuer</code></li>
</ul>
<h2 id="request-a-real-staging-certificate-from-lets-encrypt">Request A Real (Staging) Certificate From Let&rsquo;s Encrypt</h2>
<p>Now we need to deploy a Certificate resource again, this time to test our Let&rsquo;s Encrypt staging ClusterIssuer. This is similar to the verify manifest from before, but it only needs to include the Certificate resource. We also need to update a couple of things, notably adding the field <code>Certificate.spec.issuerRef.kind</code> and setting its value to <code>ClusterIssuer</code> (see <a href="https://cert-manager.io/docs/usage/certificate/">https://cert-manager.io/docs/usage/certificate/</a>).</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">cert-manager.io/v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Certificate</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">letsencrypt-staging-certificate</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">namespace</span><span class="p">:</span><span class="w"> </span><span class="l">default</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">dnsNames</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">justkidding.example.com</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">secretName</span><span class="p">:</span><span class="w"> </span><span class="l">letsencrypt-staging-cert-tls</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">issuerRef</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">ClusterIssuer</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">letsencrypt-staging</span><span class="w">
</span></span></span></code></pre></div><ul>
<li>Apply the Certificate manifest: <code>kubectl apply -f letsencrypt-staging-certificate.yaml</code></li>
<li>Check the Certificate is ready: <code>kubectl get certificate</code> - this will show Ready=False until it&rsquo;s fully issued</li>
<li>Be patient. This can sometimes be quick, but sometimes can take a while. My most recent attempt took 38 minutes (see Troubleshooting section below to dive into what&rsquo;s happening during the process).</li>
</ul>
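<p>While you wait, you can check whether the _acme-challenge TXT record is visible to public resolvers yet (shown with the placeholder domain from the manifest above; substitute your own):</p>

```shell
# Query a public resolver directly for the DNS-01 challenge record;
# an empty result means it hasn't been created or propagated yet
dig +short TXT _acme-challenge.justkidding.example.com @1.1.1.1
```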
<p>Once you get a certificate issued using the staging issuer, you are ready to move to production!</p>
<h2 id="deploying-a-cert-manager-issuer---production">Deploying A cert-manager Issuer - Production</h2>
<p>Follow the same steps as in Staging, but using the production ACME URL and probably not using the name &ldquo;staging&rdquo;.</p>
<ul>
<li>Create a manifest file for the production issuer <code>letsencrypt-production-clusterissuer.yaml</code>. Update your email address:
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">cert-manager.io/v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">ClusterIssuer</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">letsencrypt-production</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">acme</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c"># The ACME server URL</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">server</span><span class="p">:</span><span class="w"> </span><span class="l">https://acme-v02.api.letsencrypt.org/directory</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c"># Email address used for ACME registration</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">email</span><span class="p">:</span><span class="w"> </span><span class="l">mail@example.com</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c"># Name of a secret used to store the ACME account private key</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">privateKeySecretRef</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">letsencrypt-production-issuer-secret</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c"># Enable the DNS-01 challenge provider</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">solvers</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">dns01</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">cloudflare</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">apiTokenSecretRef</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">letsencrypt-cloudflare-api-token-secret</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">key</span><span class="p">:</span><span class="w"> </span><span class="l">api-token</span><span class="w">
</span></span></span></code></pre></div></li>
<li>Apply the production issuer: <code>kubectl apply -f letsencrypt-production-clusterissuer.yaml</code></li>
<li>Verify your ClusterIssuer exists (or Issuer if that&rsquo;s what you&rsquo;re using): <code>kubectl get clusterissuer</code></li>
</ul>
<h2 id="request-a-real-production-certificate-from-lets-encrypt">Request A Real (Production) Certificate From Let&rsquo;s Encrypt</h2>
<p>This is exactly the same as for staging, except we make the request from the production ClusterIssuer (the last line in the yaml file says which ClusterIssuer to use).</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">cert-manager.io/v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Certificate</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">letsencrypt-production-certificate</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">namespace</span><span class="p">:</span><span class="w"> </span><span class="l">default</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">dnsNames</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">justkidding.example.com</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">secretName</span><span class="p">:</span><span class="w"> </span><span class="l">letsencrypt-production-cert-tls</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">issuerRef</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">ClusterIssuer</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">letsencrypt-production</span><span class="w">
</span></span></span></code></pre></div><ul>
<li>Apply the Certificate manifest: <code>kubectl apply -f letsencrypt-production-certificate.yaml</code></li>
<li>Check that the Certificate is ready: <code>kubectl get certificate</code> - it will show Ready=False until it&rsquo;s fully issued</li>
<li>Be patient. Issuance is sometimes quick, but it can take a while depending on a lot of factors. See Troubleshooting below for some tips.</li>
</ul>
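<p>To avoid polling by hand, you can block until the Certificate is issued. A small sketch, using the names from the example manifest above (substitute your own):</p>
<pre tabindex="0"><code># Wait up to 10 minutes for the Certificate to reach Ready=True
kubectl wait --for=condition=Ready \
  certificate/letsencrypt-production-certificate \
  --namespace default --timeout=10m

# Once issued, the TLS key pair lands in the referenced Secret
kubectl get secret letsencrypt-production-cert-tls --namespace default
</code></pre>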
<h1 id="troubleshooting-certificate-requests">Troubleshooting Certificate Requests</h1>
<ul>
<li>Definitive guide: <a href="https://cert-manager.io/docs/troubleshooting/">https://cert-manager.io/docs/troubleshooting/</a>
<ul>
<li>Also for Let&rsquo;s Encrypt: <a href="https://cert-manager.io/docs/troubleshooting/acme/">https://cert-manager.io/docs/troubleshooting/acme/</a></li>
</ul>
</li>
<li><code>kubectl get certificaterequests</code></li>
<li><code>kubectl describe certificaterequests</code></li>
<li><code>kubectl get order</code></li>
<li><code>kubectl describe order</code></li>
<li><code>kubectl get challenge</code></li>
<li><code>kubectl describe challenge</code>
<ul>
<li>This level will tell you if it&rsquo;s stuck waiting on the DNS-01 challenge (essentially, waiting for the _acme-challenge TXT record to be created and then validated by cert-manager; this is where you can hit timeouts if your network doesn&rsquo;t allow cert-manager to query public DNS servers).</li>
<li>My troubleshooting here has mostly been verifying that the DNS record exists in Cloudflare while the challenge is pending. If it does, wait a while to see if the challenge eventually resolves. If not, you might need to dive deeper into the dns01-recursive-nameservers settings (see above) or otherwise make sure your internal DNS resolver is able to query the new record.</li>
<li>In the output, look for Challenge.Spec.Key, which is the value you should see in the _acme-challenge TXT record.</li>
</ul>
</li>
<li>You can also check the logs on the cert-manager container
<ul>
<li>Find the pod name: <code>kubectl get po -n cert-manager</code></li>
<li><code>kubectl logs cert-manager-556766675f-pt123 -n cert-manager</code></li>
</ul>
</li>
<li>See cert-manager issue for more discussion: <a href="https://github.com/cert-manager/cert-manager/issues/5917">https://github.com/cert-manager/cert-manager/issues/5917</a></li>
</ul>
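<p>When a challenge looks stuck, it can also help to query the TXT record yourself from outside the cluster. A sketch, using the example hostname from earlier (replace with your own domain; assumes <code>dig</code> is installed):</p>
<pre tabindex="0"><code># Ask a public resolver for the ACME challenge record
dig +short TXT _acme-challenge.justkidding.example.com @1.1.1.1

# The returned value should match Challenge.Spec.Key
# from `kubectl describe challenge`
</code></pre>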
<h1 id="what-about-ingress">What About Ingress?</h1>
<p>I&rsquo;ll revisit this topic after we get to Ingress Controllers (using Traefik) and how to get certificates from Let&rsquo;s Encrypt that way. It&rsquo;s significantly easier than this, and since we already have this production ClusterIssuer in place, you basically just add an annotation to your Ingress resource pointing at the ClusterIssuer, and cert-manager does the rest!</p>
]]></content:encoded>
    </item>
    
    <item>
      <title>Kubernetes Homelab Series Part 3 - LoadBalancer With MetalLB</title>
      <link>https://blog.dalydays.com/post/kubernetes-homelab-series-part-3-loadbalancer-with-metallb/</link>
      <pubDate>Fri, 18 Oct 2024 00:00:00 +0000</pubDate>
      
      <guid>https://blog.dalydays.com/post/kubernetes-homelab-series-part-3-loadbalancer-with-metallb/</guid>
      <description>A dive into implementing the LoadBalancer service type without a cloud provider using MetalLB.</description>
      <content:encoded><![CDATA[<h1 id="whats-a-loadbalancer">What&rsquo;s a LoadBalancer?</h1>
<p>There are a few different types of &ldquo;Services&rdquo; in Kubernetes. They provide solutions for different kinds of network communication, both between resources inside the cluster and between the cluster and the outside world. Depending on what you need to connect to what, you will use a suitable Service type to provide that connection. I&rsquo;m deliberately not covering the other types because this isn&rsquo;t really a tutorial on Kubernetes or on choosing a Service type. But you will most likely need/want a LoadBalancer Service if you need external traffic to reach into your cluster.</p>
<p>One of the first things you might run into after building a new Kubernetes cluster is that as soon as you deploy something with a LoadBalancer-type Service, it&rsquo;s stuck in a Pending status. That&rsquo;s because you don&rsquo;t have a controller that implements this Service type. It&rsquo;s not included out of the box.</p>
<p>Why not include LoadBalancer out of the box? Because it requires some environment-specific configuration - most importantly, in our case, a pool of IP addresses dedicated for this purpose. So how do we tell Kubernetes what IPs it can use for LoadBalancer Services? MetalLB - <a href="https://metallb.io/">https://metallb.io/</a>. There&rsquo;s also the question of how to advertise provisioned IPs to the rest of the network so that traffic can be routed properly. MetalLB supports two options: layer 2 mode and BGP. I don&rsquo;t have a need for BGP in my homelab, so I&rsquo;m just using layer 2 mode, which uses ARP on IPv4 networks and NDP on IPv6 networks - <a href="https://metallb.io/concepts/">https://metallb.io/concepts/</a></p>
<p>There are other options such as kube-vip (<a href="https://kube-vip.io/">https://kube-vip.io/</a>), which is also a great project. I&rsquo;m going to use MetalLB because that&rsquo;s what I have already figured out. You could even use kube-vip as a control plane load balancer in addition to MetalLB as a service load balancer. The former is dedicated to load balancing the control plane, which means you can set one IP address for all 3 control plane nodes as an interface between <code>kubectl</code> and your cluster&rsquo;s API. But what we&rsquo;re focused on here is Service load balancing - the same idea, but for all the stuff you&rsquo;re actually running on your cluster, like websites. And eventually this will tie into Ingress and Gateway API, but let&rsquo;s not get too far into the weeds yet.</p>
<h2 id="see-for-yourself">See For Yourself</h2>
<p>If you want to see the behavior when you create a LoadBalancer Service without something that implements this Service type (such as MetalLB), follow along. The symptom is that instead of getting an IP assigned, you will see the EXTERNAL-IP &ldquo;stuck&rdquo; as <code>&lt;pending&gt;</code>.</p>
<ul>
<li>Save this as <code>test-lb-svc.yaml</code> and then run <code>kubectl apply -f test-lb-svc.yaml</code>. This creates a Deployment running 2 replicas of <code>whoami</code> (a simple web server from the Traefik project that echoes details about the incoming connection), then creates a LoadBalancer Service in front of it.</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">apps/v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Deployment</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">whoami-deployment</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">labels</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">app</span><span class="p">:</span><span class="w"> </span><span class="l">whoami</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">replicas</span><span class="p">:</span><span class="w"> </span><span class="m">2</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">selector</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">matchLabels</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">app</span><span class="p">:</span><span class="w"> </span><span class="l">whoami</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">template</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">labels</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">app</span><span class="p">:</span><span class="w"> </span><span class="l">whoami</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">containers</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">whoami</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">image</span><span class="p">:</span><span class="w"> </span><span class="l">traefik/whoami:latest</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">ports</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span>- <span class="nt">containerPort</span><span class="p">:</span><span class="w"> </span><span class="m">80</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nn">---</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Service</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">whoami-loadbalancer</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">type</span><span class="p">:</span><span class="w"> </span><span class="l">LoadBalancer</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">selector</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">app</span><span class="p">:</span><span class="w"> </span><span class="l">whoami</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">ports</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">protocol</span><span class="p">:</span><span class="w"> </span><span class="l">TCP</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="m">80</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">targetPort</span><span class="p">:</span><span class="w"> </span><span class="m">80</span><span class="w">
</span></span></span></code></pre></div><ul>
<li>Check the service status with <code>kubectl get svc</code> and look at the EXTERNAL-IP for the <code>whoami-loadbalancer</code> service. Notice the <code>&lt;pending&gt;</code> instead of an actual IP address:</li>
</ul>
<pre tabindex="0"><code>NAME                  TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
kubernetes            ClusterIP      10.96.0.1       &lt;none&gt;        443/TCP        40h
whoami-loadbalancer   LoadBalancer   10.99.103.170   &lt;pending&gt;     80:30652/TCP   3s
</code></pre><ul>
<li>You can delete this and try again later after deploying MetalLB, or leave it and it will automatically get an IP assigned by MetalLB once it&rsquo;s running.</li>
</ul>
<h1 id="metallb-setup">MetalLB Setup</h1>
<p><a href="https://metallb.universe.tf/installation/">https://metallb.universe.tf/installation/</a></p>
<p>The core steps are to make sure your cluster networking will support MetalLB, install MetalLB, and then apply your environment-specific configuration. Talos ships with Flannel, which works with MetalLB, so no other networking changes are required before you start installing. There is also no kube-proxy ConfigMap to edit; kube-proxy is already configured appropriately in Talos and will work fine.</p>
<ul>
<li>Install the MetalLB manifest (check the documentation for the current version): <code>kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.14.8/config/manifests/metallb-native.yaml</code></li>
<li>Configure your IP address pool for MetalLB to use. You can list multiple ranges if you want. See documentation for examples: <a href="https://metallb.io/configuration/#defining-the-ips-to-assign-to-the-load-balancer-services">https://metallb.io/configuration/#defining-the-ips-to-assign-to-the-load-balancer-services</a></li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">metallb.io/v1beta1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">IPAddressPool</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">first-pool</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">namespace</span><span class="p">:</span><span class="w"> </span><span class="l">metallb-system</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">addresses</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="m">10.0.50.64</span><span class="l">/28</span><span class="w">
</span></span></span></code></pre></div><ul>
<li><code>kubectl apply -f metallb-ipaddresspool.yaml</code></li>
<li>Configure how MetalLB will announce new IPs to your network. See documentation for details or use this for layer 2 mode: <a href="https://metallb.io/configuration/#announce-the-service-ips">https://metallb.io/configuration/#announce-the-service-ips</a></li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">metallb.io/v1beta1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">L2Advertisement</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">metallb-advertise</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">namespace</span><span class="p">:</span><span class="w"> </span><span class="l">metallb-system</span><span class="w">
</span></span></span></code></pre></div><ul>
<li><code>kubectl apply -f metallb-l2advertisement.yaml</code></li>
</ul>
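<p>Before testing a Service, it&rsquo;s worth confirming that the MetalLB pods are up and that the configuration resources we just applied were accepted. For example:</p>
<pre tabindex="0"><code># The controller plus one speaker per node should be Running
kubectl get pods -n metallb-system

# The pool and advertisement should both be listed
kubectl get ipaddresspools,l2advertisements -n metallb-system
</code></pre>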
<p>You should now have MetalLB running and ready to provision IP addresses for your LoadBalancer Services. If you ran the demo earlier and left the Service up, check again to verify that there is now an IP assigned to the <code>whoami-loadbalancer</code> Service. Navigate to that IP in the browser (HTTP only) to verify you can see the page.</p>
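<p>You can also test from the command line. A sketch, where <code>10.0.50.65</code> is a hypothetical address from the <code>10.0.50.64/28</code> pool defined above - use whatever EXTERNAL-IP <code>kubectl get svc</code> actually shows:</p>
<pre tabindex="0"><code># Find the assigned EXTERNAL-IP
kubectl get svc whoami-loadbalancer

# Hit the load-balanced IP directly (HTTP only)
curl http://10.0.50.65/
</code></pre>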
<p>For example, I&rsquo;m seeing something like this:</p>
<pre tabindex="0"><code>Hostname: whoami-deployment-1a2b3c4d5e-hxx4g
IP: 127.0.0.1
IP: ::1
IP: 10.244.4.8
IP: fe80::10fe:20ff:fe43:68de
RemoteAddr: 10.244.5.0:20130
GET / HTTP/1.1
Host: 10.0.50.180
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8
Accept-Encoding: gzip, deflate
Accept-Language: en-US,en
Cache-Control: max-age=0
Connection: keep-alive
Sec-Gpc: 1
Upgrade-Insecure-Requests: 1
</code></pre>]]></content:encoded>
    </item>
    
    <item>
      <title>Kubernetes Homelab Series Part 2 - Secrets With SOPS and age</title>
      <link>https://blog.dalydays.com/post/kubernetes-homelab-series-part-2-sops-and-age/</link>
      <pubDate>Thu, 17 Oct 2024 00:00:00 +0000</pubDate>
      
      <guid>https://blog.dalydays.com/post/kubernetes-homelab-series-part-2-sops-and-age/</guid>
      <description>In this part we talk about encrypting secrets in key-value files like YAML so they can be stored securely in public places like GitHub.</description>
      <content:encoded><![CDATA[<h1 id="what-is-sops-and-age">What Is SOPS and age?</h1>
<p><a href="https://github.com/getsops/sops">https://github.com/getsops/sops</a><br>
SOPS is a tool to encrypt values in key-value formats such as YAML. That means you can encrypt specific parts while leaving the YAML structure intact, which also makes the file safe to store in a public place like GitHub.</p>
<p><a href="https://github.com/FiloSottile/age">https://github.com/FiloSottile/age</a><br>
(Pronounced &ldquo;ah-gay&rdquo;, it does not rhyme with rage)<br>
This is a modern encryption tool that can be used to encrypt whole files, or, in our case, used by SOPS to encrypt values within YAML files. SOPS supports other encryption backends such as AWS KMS, most of which are cloud based. Since we are in a homelab, age seems to be a good option with a strong, modern encryption algorithm. The only other offline alternative for us homelabbers is PGP, but age is newer so it must be better, right?</p>
<p>I will also mention that you could use AWS KMS, which is available on their free tier for up to 20,000 requests per month as of this writing.</p>
<h1 id="installation-and-setup">Installation And Setup</h1>
<p>This is a quick walk through of getting SOPS and age installed and how to use SOPS to encrypt/decrypt YAML files. We&rsquo;ll do this for Talos Linux secrets.yaml and for machine configs so they can be safely stored in a public repo.</p>
<p>If you are looking for a video for more depth into SOPS, check out <a href="https://www.youtube.com/watch?v=V2PRhxphH2w">https://www.youtube.com/watch?v=V2PRhxphH2w</a></p>
<p>There is also a VSCode extension that allows you to edit plaintext versions of files, but it will automatically encrypt/decrypt the saved file if this is better for your workflow. Check out <a href="https://marketplace.visualstudio.com/items?itemName=signageos.signageos-vscode-sops">https://marketplace.visualstudio.com/items?itemName=signageos.signageos-vscode-sops</a></p>
<p>I&rsquo;m just sticking to basic usage and running sops commands manually for now, at least to get started. You could also build some of these processes into a pipeline.</p>
<h2 id="installation">Installation</h2>
<ul>
<li>Curl the latest <code>SOPS</code> binary and install
<ul>
<li><code>curl -LJO https://github.com/getsops/sops/releases/download/v3.9.1/sops-v3.9.1.linux.amd64</code></li>
<li><code>install -m 755 sops-v3.9.1.linux.amd64 /usr/local/bin/sops</code></li>
<li>Verify sops version: <code>sops --version</code></li>
</ul>
</li>
<li>Curl the latest <code>age</code> binary and install
<ul>
<li><code>curl -LJO 'https://dl.filippo.io/age/latest?for=linux/amd64'</code> (the <code>-J</code> flag saves it under the real filename, e.g. <code>age-v1.2.0-linux-amd64.tar.gz</code>)</li>
<li><code>tar zxvf age-v1.2.0-linux-amd64.tar.gz</code></li>
<li><code>install -m 755 age/age /usr/local/bin/age &amp;&amp; install -m 755 age/age-keygen /usr/local/bin/age-keygen</code></li>
<li>Verify age version: <code>age --version</code></li>
<li>Verify age-keygen version: <code>age-keygen --version</code></li>
</ul>
</li>
</ul>
<h2 id="setup">Setup</h2>
<ul>
<li>Create a private key similar to how you would with SSH
<ul>
<li><code>mkdir -p ~/.config/sops/age</code></li>
<li><code>age-keygen -o ~/.config/sops/age/keys.txt</code></li>
<li>Copy the private/public key somewhere such as in Bitwarden or other secure location</li>
<li>Copy down the public key, which is needed to encrypt. The public key is included in the private key file in case you need to reference it in the future.</li>
</ul>
</li>
<li>Create a config file locally <code>.sops.yaml</code> (usually stored and committed in the repo where it is used) which tells SOPS which files to encrypt, which keys to encrypt within the file, and which encryption to use. See the documentation for more details around the config file.
<ul>
<li><code>path_regex</code> tells sops which files to encrypt if not specified</li>
<li>(optional) <code>encrypted_regex</code> defines which keys within the file to encrypt - omit this to encrypt all values, otherwise ONLY specific values will be encrypted and others will be left in plain text</li>
<li><code>age</code> is the public key used for encryption</li>
<li>Sample <code>.sops.yaml</code> file configured for Talos Linux secrets.yaml and machine config files:</li>
</ul>
</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="w">  </span>---<span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">creation_rules</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">path_regex</span><span class="p">:</span><span class="w"> </span><span class="l">.*secrets(\.encrypted)?\.yaml$</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">age</span><span class="p">:</span><span class="w"> </span><span class="l">replace-with-your-public-key</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">path_regex</span><span class="p">:</span><span class="w"> </span><span class="l">.*controlplane(\.encrypted)?\.yaml$</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">encrypted_regex</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;^(token|crt|key|id|secret|secretboxEncryptionSecret)$&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">age</span><span class="p">:</span><span class="w"> </span><span class="l">replace-with-your-public-key</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">path_regex</span><span class="p">:</span><span class="w"> </span><span class="l">.*worker(\.encrypted)?\.yaml$</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">encrypted_regex</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;^(token|crt|key|id|secret|secretboxEncryptionSecret)$&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">age</span><span class="p">:</span><span class="w"> </span><span class="l">replace-with-your-public-key</span><span class="w">
</span></span></span></code></pre></div><ul>
<li>Create <code>.gitignore</code> to make sure you don&rsquo;t commit unencrypted secrets to your repo!</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-ini" data-lang="ini"><span class="line"><span class="cl"><span class="na">secrets.yaml</span>
</span></span><span class="line"><span class="cl"><span class="na">talosconfig</span>
</span></span><span class="line"><span class="cl"><span class="na">controlplane.yaml</span>
</span></span><span class="line"><span class="cl"><span class="na">worker.yaml</span>
</span></span><span class="line"><span class="cl"><span class="na">kubeconfig</span>
</span></span></code></pre></div><ul>
<li>Encrypt secrets.yaml: <code>sops --encrypt secrets.yaml &gt; secrets.encrypted.yaml</code></li>
<li>Encrypt controlplane.yaml: <code>sops --encrypt _out/controlplane.yaml &gt; _out/controlplane.encrypted.yaml</code></li>
<li>Encrypt worker.yaml: <code>sops --encrypt _out/worker.yaml &gt; _out/worker.encrypted.yaml</code></li>
<li>Initialize the git repo: <code>git init</code></li>
<li>Stage your files: <code>git add .</code></li>
<li>Double check that only encrypted yaml files are staged: <code>git status</code></li>
<li>Commit: <code>git commit -m &quot;initial commit with encrypted cluster secrets&quot;</code></li>
</ul>
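<p>As a belt-and-suspenders check before committing, you can list any staged files that match the plaintext filenames from the <code>.gitignore</code> above:</p>
<pre tabindex="0"><code># List any staged files that look like plaintext secrets
# (no output means you are safe to commit)
git diff --cached --name-only | grep -E '(^|/)(secrets|controlplane|worker)\.yaml$'
</code></pre>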
<h1 id="encrypting-and-decrypting-files">Encrypting and Decrypting Files</h1>
<p>Later, when you pull the repo to make changes to your cluster, you might only have the encrypted version of the files. You will need to decrypt the files, operate on the decrypted versions, and when you&rsquo;re done, re-encrypt them. At that point, you can commit your changes to the git repo and push them up.</p>
<p>Now you are ready to actually encrypt and decrypt values in your files. This is a good point to settle on a naming convention if you want to keep separate copies of the decrypted files, and to make sure the decrypted versions are in <code>.gitignore</code> - the whole point of this is to avoid committing plaintext secrets to a repo anywhere (public or private).</p>
<ul>
<li>Encryption: <code>sops --encrypt secrets.yaml &gt; secrets.encrypted.yaml</code></li>
<li>Decryption: <code>sops --decrypt secrets.encrypted.yaml &gt; secrets.yaml</code></li>
</ul>
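<p>If you&rsquo;d rather not keep plaintext copies on disk at all, SOPS can also decrypt a file into your editor and re-encrypt it on save. A sketch, assuming the key file location from the setup above:</p>
<pre tabindex="0"><code># SOPS looks for the age key at ~/.config/sops/age/keys.txt by default,
# or you can point at it explicitly:
export SOPS_AGE_KEY_FILE=~/.config/sops/age/keys.txt

# Opens the decrypted content in $EDITOR and re-encrypts when you save and exit
sops secrets.encrypted.yaml
</code></pre>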
<h1 id="what-about-sealed-secrets">What about Sealed Secrets?</h1>
<p>Sealed Secrets are a Kubernetes-specific solution for encrypting secrets. They&rsquo;re not exactly a replacement for SOPS because they can&rsquo;t be used for anything outside the Kubernetes cluster - Talos <code>secrets.yaml</code> is a perfect example.</p>
<p>Sealed Secrets, on the other hand, are great for anything you would normally put in a Kubernetes Secret. You can lock down access to the easily decoded base64 Secrets in the cluster, making sure users interact only with Sealed Secrets, which are truly encrypted. Note that Sealed Secrets ultimately create regular Secrets in the cluster, so you still need to be careful about who has access to Secret resources.</p>
<p>You could use both SOPS and Sealed Secrets for different use cases, which is my plan going forward. Sometimes a Helm chart may not directly support Sealed Secrets and while it&rsquo;s possible to make it work, you may find it easier to use SOPS for that type of use case.</p>
]]></content:encoded>
    </item>
    
    <item>
      <title>Kubernetes Homelab Series Part 1 - Introduction and Talos Installation</title>
      <link>https://blog.dalydays.com/post/kubernetes-homelab-series-part-1-talos-linux-proxmox/</link>
      <pubDate>Thu, 10 Oct 2024 00:00:00 +0000</pubDate>
      
      <guid>https://blog.dalydays.com/post/kubernetes-homelab-series-part-1-talos-linux-proxmox/</guid>
      <description>This in depth series will walk through building a Kubernetes cluster beyond the basics, including dynamically provisioned storage, certificate management, and backups.</description>
      <content:encoded><![CDATA[<h1 id="introduction">Introduction</h1>
<p>This series will go through the process of building a Kubernetes cluster on Proxmox using Talos Linux from the ground up. I intend for this to go into a lot more depth than a lot of basic tutorials, and by the end you should have a cluster that you can actually trust to run workloads with high availability, backups, etc.</p>
<p>The goal of this series is not to be a Kubernetes tutorial, but more of a step by step to building a Talos Linux cluster that has all the components you might need, along with some insights I&rsquo;ve learned over time that you might find helpful.</p>
<p>I use Proxmox in my homelab, but this could be followed easily in ESXi or other hypervisors of choice.</p>
<p>Building a bare cluster is easy. Making it work for you is not. Not because it&rsquo;s hard, but because there are a lot of components and decisions to make. As many problems as Kubernetes solves, it introduces many new challenges and new ways of thinking about things. One of the hardest problems in my opinion is storage, which we will talk about and I will show you how I decided to do this in my homelab. It is possible to achieve dynamically provisioned storage with a single disk, and not just with hostPath. You don&rsquo;t have to use Rancher Longhorn, OpenEBS (Mayastor) or Portworx. You don&rsquo;t have to use NFS. It&rsquo;s mix and match!</p>
<p>Other topics will include LoadBalancer services using MetalLB, secret management using Sealed Secrets (along with a look at SOPS with age), backup of the etcd cluster plus Velero to back up cluster resources, and ingress with Traefik (including Gateway API, the successor to the Ingress API).</p>
<p>Finally, and this will actually be the first topic, I will be using Talos Linux because I believe it&rsquo;s the easiest option overall considering the initial build and also ongoing maintenance such as OS upgrades and Kubernetes version upgrades.</p>
<p>Using Talos Linux in theory makes managing the cluster itself very simple, because its API is the only way to make changes and it includes safe upgrade options for both the Talos nodes and Kubernetes versions. It does introduce some complexity, especially relating to storage, but we&rsquo;ll talk about that a little later.</p>
<h1 id="prerequisites-and-assumptions">Prerequisites and Assumptions</h1>
<ul>
<li>You have a homelab of some sort and can download ISOs and create VMs.</li>
<li>You have a disk or virtual disk to dedicate for storage (or you can do clustered storage, we&rsquo;ll talk about it).</li>
<li>You are able to read and follow along :)</li>
</ul>
<h1 id="getting-started-with-talos-overview">Getting Started With Talos (Overview)</h1>
<p><a href="https://www.talos.dev/">https://www.talos.dev/</a><br>
Talos Linux is Linux designed for Kubernetes – secure, immutable, and minimal.</p>
<ul>
<li>Supports cloud platforms, bare metal, and virtualization platforms</li>
<li>All system management is done via an API. No SSH, shell or console</li>
<li>Production ready: supports some of the largest Kubernetes clusters in the world</li>
<li>Open source project from the team at Sidero Labs</li>
</ul>
<p>It&rsquo;s the easiest way to deploy (and maintain) a Kubernetes cluster.</p>
<h2 id="proxmox-setup">Proxmox Setup</h2>
<ul>
<li>Download latest Talos Linux release, grabbing metal-amd64.iso version from the GitHub releases page <a href="https://github.com/siderolabs/talos/releases">https://github.com/siderolabs/talos/releases</a></li>
<li>Upload it to Proxmox ISO storage, or have Proxmox download it directly from the URL</li>
<li>Optionally rename the file manually on the filesystem: /tank/isos/template/iso/</li>
<li>Create VMs in Proxmox (3 control plane nodes, 3 worker nodes)</li>
<li>For the OS, use the Talos metal-amd64 ISO image. It will be replaced later during bootstrapping, so getting the latest image with added extensions now is not important.
<ul>
<li>Enable QEMU guest agent (requires qemu guest extensions in Talos)</li>
<li>For disks, choose something on SSD storage (required for reasonable etcd performance)</li>
<li>Enable Discard and SSD emulation options</li>
<li>20GB should be plenty</li>
<li>4CPU (2 minimum)</li>
<li>4GB RAM (2 minimum)</li>
<li>VLAN 50, DHCP (VLAN is optional, use whatever VLAN you want for your cluster if you have one, otherwise leave this blank)</li>
</ul>
</li>
<li>Start the VMs, and check the console for each to verify they are running (and have an IP assigned on the kubernetes VLAN if applicable)</li>
</ul>
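<p>If you prefer the CLI over the Proxmox web UI, the VM settings above can be sketched as a <code>qm create</code> command. This is a dry-run sketch that only builds and prints the command; the VM ID, storage names (<code>local-lvm</code>, <code>local</code>), bridge name, and ISO filename are assumptions for illustration, so adjust them for your environment before running it on the Proxmox host.</p>

```shell
#!/usr/bin/env bash
# Build (but do not run) a `qm create` command matching the VM settings above.
# vmid, storage names, bridge, and ISO filename are placeholders; adjust them.
vmid=201
name=taloscp1
cmd="qm create $vmid --name $name --cores 4 --memory 4096"
cmd+=" --scsihw virtio-scsi-pci --scsi0 local-lvm:20,discard=on,ssd=1"
cmd+=" --net0 virtio,bridge=vmbr0,tag=50"
cmd+=" --ide2 local:iso/metal-amd64.iso,media=cdrom"
cmd+=" --agent enabled=1"
echo "$cmd"
```

<p>Echoing the command first makes it easy to review (or template per node) before actually creating VMs.</p>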
<h2 id="talos-linux-setup">Talos Linux Setup</h2>
<ul>
<li>Check out the talos project: <a href="https://www.talos.dev/v1.8/talos-guides/install/virtualized-platforms/proxmox/">https://www.talos.dev/v1.8/talos-guides/install/virtualized-platforms/proxmox/</a></li>
<li>On whichever machine you are using to manage Talos, such as your local workstation, install talosctl and verify it&rsquo;s the same version you are deploying
<ul>
<li>See <a href="https://www.talos.dev/v1.8/talos-guides/install/talosctl/">https://www.talos.dev/v1.8/talos-guides/install/talosctl/</a></li>
<li>The recommended method is <code>brew install siderolabs/tap/talosctl</code></li>
<li>The alternative method is <code>curl -sL https://talos.dev/install | sh</code></li>
<li>Note: if you are upgrading talosctl, the <code>curl</code> method will tell you it&rsquo;s already installed with a message saying <code>To force re-downloading, delete '/usr/local/bin/talosctl' then run me again.</code>
<ul>
<li>Delete the current binary with <code>sudo rm -rf /usr/local/bin/talosctl</code></li>
<li>Re-run <code>curl -sL https://talos.dev/install | sh</code></li>
</ul>
</li>
</ul>
</li>
<li>Create a new folder for your Talos cluster. You will be initializing a git repo here and eventually pushing to a remote repo.
<ul>
<li><code>mkdir -p taloscluster &amp;&amp; cd taloscluster</code></li>
</ul>
</li>
<li>Generate new Talos secrets. These are used to authenticate with the Talos cluster.
<ul>
<li><code>talosctl gen secrets</code></li>
<li>This stores a file in the current directory named <code>secrets.yaml</code></li>
</ul>
</li>
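<li>For reference, <code>secrets.yaml</code> is a small YAML document containing the cluster ID and secret, bootstrap tokens, and the CA certificate/key material. Its rough shape is shown below with all values elided (this is an approximation, not a verbatim file); treat it as sensitive and never commit it to a repo unencrypted.

```yaml
# Approximate shape of secrets.yaml; real values are base64/PEM material
cluster:
  id: ...
  secret: ...
secrets:
  bootstraptoken: ...
  secretboxencryptionsecret: ...
trustdinfo:
  token: ...
certs:
  etcd: {crt: ..., key: ...}
  k8s: {crt: ..., key: ...}
  k8saggregator: {crt: ..., key: ...}
  k8sserviceaccount: {key: ...}
  os: {crt: ..., key: ...}
```

</li>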
<li>Build the Talos image you will be using to install Talos to each node. This allows you to add extensions which add capabilities such as iSCSI support and QEMU guest agent for Proxmox.
<ul>
<li>Go to <a href="https://factory.talos.dev/">https://factory.talos.dev/</a> - Bare Metal Machine &gt; 1.8.1 (or the version you want) &gt; AMD64 &gt; Check siderolabs/qemu-guest-agent &gt; Check siderolabs/iscsi-tools &gt; (no customization) &gt; copy the factory image string under &ldquo;Initial Install&rdquo; section, e.g. <code>factory.talos.dev/installer/88d1f7a5c4f1d3aba7df787c448c1d3d008ed29cfb34af53fa0df4336a56040b:v1.8.1</code></li>
</ul>
</li>
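<li>If you&rsquo;d rather script image creation than click through the UI, the same extension choices can be expressed as an Image Factory schematic, a small YAML document you can POST to the factory to get the image ID back. This sketch is based on the Image Factory schematic format; verify it against the factory UI output for your version.

```yaml
# Image Factory schematic equivalent to the UI selections above
customization:
  systemExtensions:
    officialExtensions:
      - siderolabs/qemu-guest-agent
      - siderolabs/iscsi-tools
```

</li>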
<li>Generate base machine configs: <a href="https://www.talos.dev/v1.8/talos-guides/install/virtualized-platforms/proxmox/#generate-machine-configurations">https://www.talos.dev/v1.8/talos-guides/install/virtualized-platforms/proxmox/#generate-machine-configurations</a>
<ul>
<li>The <code>CONTROL_PLANE_IP</code> needs to point to one of the control plane nodes. The best practice is to use a VIP, allowing for HA because the VIP is automatically reassigned to another control plane node if one goes offline (<a href="https://www.talos.dev/v1.8/talos-guides/network/vip/">https://www.talos.dev/v1.8/talos-guides/network/vip/</a>)
<ul>
<li>You can use a hostname that points to the VIP, but I prefer to use the VIP directly.</li>
<li>Adding a VIP is essentially done by configuring <code>machine.network.interfaces.vip.ip</code> in the Talos machine config (see below), which automatically enables the functionality in Talos to load balance this VIP (e.g. if the VIP is on node 1 and that goes down, the VIP will automatically be moved to another node that is online).</li>
<li>It&rsquo;s not recommended to use the VIP for the Talos API (used with <code>talosctl</code>). The main purpose of the VIP is for use with the Kubernetes API (used with <code>kubectl</code>).</li>
<li>If you don&rsquo;t use a VIP, you can just use the IP of one of the control plane nodes, but if that particular node goes down, even if other control plane nodes are still up, you will not be able to run <code>kubectl</code> commands against your cluster without manually editing your <code>kubeconfig</code>. A VIP would also be pointless if you only have a single control plane node.</li>
</ul>
</li>
<li>Feel free to change the cluster name (default is &ldquo;talos-proxmox-cluster&rdquo;)
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="c1"># CONTROL_PLANE_IP should be the VIP you plan on using (not the DHCP assigned), or the IP of one of your control plane nodes if not using a VIP</span>
</span></span><span class="line"><span class="cl"><span class="nb">export</span> <span class="nv">CONTROL_PLANE_IP</span><span class="o">=</span>10.0.50.160
</span></span><span class="line"><span class="cl"><span class="c1"># FACTORY_IMAGE should be updated to the image you want to install</span>
</span></span><span class="line"><span class="cl"><span class="nb">export</span> <span class="nv">FACTORY_IMAGE</span><span class="o">=</span>factory.talos.dev/installer/88d1f7a5c4f1d3aba7df787c448c1d3d008ed29cfb34af53fa0df4336a56040b:v1.9.2
</span></span><span class="line"><span class="cl">talosctl gen config talos-proxmox-cluster https://<span class="nv">$CONTROL_PLANE_IP</span>:6443 --with-secrets secrets.yaml --output-dir _out --install-image <span class="nv">$FACTORY_IMAGE</span>
</span></span></code></pre></div></li>
</ul>
</li>
<li>In Part 2, I discuss encrypting YAML files with SOPS + age, which you can use to encrypt this file and store it securely in a git repo: <a href="https://blog.dalydays.com/post/kubernetes-homelab-series-part-2-sops-and-age/">https://blog.dalydays.com/post/kubernetes-homelab-series-part-2-sops-and-age/</a> - Feel free to do that now if you want to add a little bit of complexity. Otherwise, continue to the end of this one and circle back to that article afterward so you can safely encrypt and store your configs in a git repo.</li>
<li>Create node specific patches (used for defining a static IP, etc.): <a href="https://www.talos.dev/v1.8/talos-guides/configuration/patching/">https://www.talos.dev/v1.8/talos-guides/configuration/patching/</a>
<ul>
<li>Best practice is to install the base config to each node (<code>controlplane.yaml</code> or <code>worker.yaml</code>), then apply patches to customize nodes to set the name, static IP, etc.</li>
<li>To set a static IP, you have to specify either an <code>interface</code> or a <code>deviceSelector</code> (they are mutually exclusive). By default, Predictable Interface Names are enabled, meaning your ethernet interface might be named something like <code>enxSOMETHING</code> based on the MAC address. One way around this is to disable predictable interface names by setting the kernel argument <code>net.ifnames=0</code> on first boot, or you can use a generic <code>deviceSelector</code> like I did in the example below. If you only have 1 NIC, this will work just fine. If you have multiple NICs, you can run <code>talosctl get links -n [node-ip]</code> and select the interface name from the ID column, or the MAC address from the HW ADDR column. I&rsquo;m just using <code>busPath: &quot;0*&quot;</code> to select the default NIC without changing any other options.
<ul>
<li><a href="https://www.talos.dev/v1.8/talos-guides/network/predictable-interface-names/#single-network-interface">https://www.talos.dev/v1.8/talos-guides/network/predictable-interface-names/#single-network-interface</a></li>
</ul>
</li>
<li><code>mkdir -p patches</code></li>
<li>Create patches for each controlplane and worker node:
<ul>
<li>cp1.patch</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">machine</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">network</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">hostname</span><span class="p">:</span><span class="w"> </span><span class="l">taloscp1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">interfaces</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">deviceSelector</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">busPath</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;0*&#34;</span><span class="w"> </span><span class="c"># This is an example; adjust based on your hardware</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">addresses</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="m">10.0.50.161</span><span class="l">/24</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">routes</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="nt">network</span><span class="p">:</span><span class="w"> </span><span class="m">0.0.0.0</span><span class="l">/0</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">gateway</span><span class="p">:</span><span class="w"> </span><span class="m">10.0.50.1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">vip</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">ip</span><span class="p">:</span><span class="w"> </span><span class="m">10.0.50.160</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">nameservers</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="m">192.168.1.22</span><span class="w">
</span></span></span></code></pre></div><ul>
<li>cp2.patch</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">machine</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">network</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">hostname</span><span class="p">:</span><span class="w"> </span><span class="l">taloscp2</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">interfaces</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">deviceSelector</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">busPath</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;0*&#34;</span><span class="w"> </span><span class="c"># This is an example; adjust based on your hardware</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">addresses</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="m">10.0.50.162</span><span class="l">/24</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">routes</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="nt">network</span><span class="p">:</span><span class="w"> </span><span class="m">0.0.0.0</span><span class="l">/0</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">gateway</span><span class="p">:</span><span class="w"> </span><span class="m">10.0.50.1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">vip</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">ip</span><span class="p">:</span><span class="w"> </span><span class="m">10.0.50.160</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">nameservers</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="m">192.168.1.22</span><span class="w">
</span></span></span></code></pre></div><ul>
<li>cp3.patch</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">machine</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">network</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">hostname</span><span class="p">:</span><span class="w"> </span><span class="l">taloscp3</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">interfaces</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">deviceSelector</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">busPath</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;0*&#34;</span><span class="w"> </span><span class="c"># This is an example; adjust based on your hardware</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">addresses</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="m">10.0.50.163</span><span class="l">/24</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">routes</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="nt">network</span><span class="p">:</span><span class="w"> </span><span class="m">0.0.0.0</span><span class="l">/0</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">gateway</span><span class="p">:</span><span class="w"> </span><span class="m">10.0.50.1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">vip</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">ip</span><span class="p">:</span><span class="w"> </span><span class="m">10.0.50.160</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">nameservers</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="m">192.168.1.22</span><span class="w">
</span></span></span></code></pre></div><ul>
<li>wk1.patch</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">machine</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">network</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">hostname</span><span class="p">:</span><span class="w"> </span><span class="l">taloswk1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">interfaces</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">deviceSelector</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">busPath</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;0*&#34;</span><span class="w"> </span><span class="c"># This is an example; adjust based on your hardware</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">addresses</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="m">10.0.50.171</span><span class="l">/24</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">routes</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="nt">network</span><span class="p">:</span><span class="w"> </span><span class="m">0.0.0.0</span><span class="l">/0</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">gateway</span><span class="p">:</span><span class="w"> </span><span class="m">10.0.50.1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">nameservers</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="m">192.168.1.22</span><span class="w">
</span></span></span></code></pre></div><ul>
<li>wk2.patch</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">machine</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">network</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">hostname</span><span class="p">:</span><span class="w"> </span><span class="l">taloswk2</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">interfaces</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">deviceSelector</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">busPath</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;0*&#34;</span><span class="w"> </span><span class="c"># This is an example; adjust based on your hardware</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">addresses</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="m">10.0.50.172</span><span class="l">/24</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">routes</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="nt">network</span><span class="p">:</span><span class="w"> </span><span class="m">0.0.0.0</span><span class="l">/0</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">gateway</span><span class="p">:</span><span class="w"> </span><span class="m">10.0.50.1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">nameservers</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="m">192.168.1.22</span><span class="w">
</span></span></span></code></pre></div><ul>
<li>wk3.patch</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">machine</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">network</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">hostname</span><span class="p">:</span><span class="w"> </span><span class="l">taloswk3</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">interfaces</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">deviceSelector</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">busPath</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;0*&#34;</span><span class="w"> </span><span class="c"># This is an example; adjust based on your hardware</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">addresses</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="m">10.0.50.173</span><span class="l">/24</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">routes</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="nt">network</span><span class="p">:</span><span class="w"> </span><span class="m">0.0.0.0</span><span class="l">/0</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">gateway</span><span class="p">:</span><span class="w"> </span><span class="m">10.0.50.1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">nameservers</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="m">192.168.1.22</span><span class="w">
</span></span></span></code></pre></div></li>
</ul>
</li>
</ul>
<h2 id="talos-linux-installation">Talos Linux Installation</h2>
<p>Finally, time to actually install Talos!</p>
<ul>
<li>Check the console for Talos CP1 in Proxmox to get the IP address (should have been assigned via DHCP). Let&rsquo;s say it&rsquo;s 10.0.50.129</li>
<li>Install using base controlplane.yaml config: <code>talosctl apply-config --insecure --nodes 10.0.50.129 --file _out/controlplane.yaml</code></li>
<li>Follow this same process for each of the other two control plane nodes.</li>
<li>Follow the same process for all worker nodes, but use <code>_out/worker.yaml</code>: <code>talosctl apply-config --insecure --nodes 10.0.50.132 --file _out/worker.yaml</code></li>
<li>Watch the console in Proxmox to see it install and reboot. When you see the Kubernetes version and Kubelet status Healthy on all 6 nodes, you can proceed to patching each node to assign static IPs.</li>
<li>Patch each node using the corresponding patch file. For control plane nodes, the endpoint should be the node you&rsquo;re targeting. For workers, the endpoint needs to be one of the control plane nodes; since the first control plane node has already been updated to 10.0.50.161 in this example, use it as the endpoint when patching all the worker nodes:
<ul>
<li><code>talosctl patch mc --talosconfig _out/talosconfig -e 10.0.50.129 -n 10.0.50.129 --patch @patches/cp1.patch</code></li>
<li><code>talosctl patch mc --talosconfig _out/talosconfig -e 10.0.50.130 -n 10.0.50.130 --patch @patches/cp2.patch</code></li>
<li><code>talosctl patch mc --talosconfig _out/talosconfig -e 10.0.50.131 -n 10.0.50.131 --patch @patches/cp3.patch</code></li>
<li><code>talosctl patch mc --talosconfig _out/talosconfig -e 10.0.50.161 -n 10.0.50.132 --patch @patches/wk1.patch</code></li>
<li><code>talosctl patch mc --talosconfig _out/talosconfig -e 10.0.50.161 -n 10.0.50.133 --patch @patches/wk2.patch</code></li>
<li><code>talosctl patch mc --talosconfig _out/talosconfig -e 10.0.50.161 -n 10.0.50.134 --patch @patches/wk3.patch</code></li>
</ul>
</li>
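<li>Those six commands can also be driven from a small loop. This is a dry-run sketch (it only prints the commands, using the example IPs from this article); drop the final <code>printf</code> and run each command directly if you want to apply the patches this way.

```shell
#!/usr/bin/env bash
# Print the per-node talosctl patch commands (example IPs from this article).
# Control plane nodes use their own pre-patch DHCP IP as the endpoint;
# workers use the first control plane node's new static IP.
nodes=(10.0.50.129 10.0.50.130 10.0.50.131 10.0.50.132 10.0.50.133 10.0.50.134)
endpoints=(10.0.50.129 10.0.50.130 10.0.50.131 10.0.50.161 10.0.50.161 10.0.50.161)
patches=(cp1 cp2 cp3 wk1 wk2 wk3)
cmds=()
for i in "${!nodes[@]}"; do
  cmds+=("talosctl patch mc --talosconfig _out/talosconfig -e ${endpoints[$i]} -n ${nodes[$i]} --patch @patches/${patches[$i]}.patch")
done
printf '%s\n' "${cmds[@]}"
```

</li>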
<li>Note: the VIP doesn&rsquo;t come online until after bootstrapping the cluster. Don&rsquo;t be like me and try to troubleshoot this right now :)</li>
<li>If you want to deploy metrics-server (see <a href="https://www.talos.dev/v1.8/kubernetes-guides/configuration/deploy-metrics-server/">https://www.talos.dev/v1.8/kubernetes-guides/configuration/deploy-metrics-server/</a>) then enable <code>rotate-server-certificates: true</code> on all nodes and add the extra manifests to automatically approve the CSRs and include metrics-server. This is OPTIONAL.
<ul>
<li>Create <code>./patches/kubelet-cert-rotation.patch</code></li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nn">---</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">machine</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">kubelet</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">extraArgs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">rotate-server-certificates</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">cluster</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">extraManifests</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">https://raw.githubusercontent.com/alex1989hu/kubelet-serving-cert-approver/main/deploy/standalone-install.yaml</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml</span><span class="w">
</span></span></span></code></pre></div><ul>
<li><code>talosctl patch mc --talosconfig _out/talosconfig -e 10.0.50.161 -n 10.0.50.161 --patch @patches/kubelet-cert-rotation.patch</code></li>
<li><code>talosctl patch mc --talosconfig _out/talosconfig -e 10.0.50.161 -n 10.0.50.162 --patch @patches/kubelet-cert-rotation.patch</code></li>
<li><code>talosctl patch mc --talosconfig _out/talosconfig -e 10.0.50.161 -n 10.0.50.163 --patch @patches/kubelet-cert-rotation.patch</code></li>
<li><code>talosctl patch mc --talosconfig _out/talosconfig -e 10.0.50.161 -n 10.0.50.171 --patch @patches/kubelet-cert-rotation.patch</code></li>
<li><code>talosctl patch mc --talosconfig _out/talosconfig -e 10.0.50.161 -n 10.0.50.172 --patch @patches/kubelet-cert-rotation.patch</code></li>
<li><code>talosctl patch mc --talosconfig _out/talosconfig -e 10.0.50.161 -n 10.0.50.173 --patch @patches/kubelet-cert-rotation.patch</code></li>
</ul>
</li>
<li>Configure talosctl by either copying talosconfig to $HOME/.talos/config or exporting the ENV variable:
<ul>
<li><code>mv _out/talosconfig ~/.talos/config</code></li>
<li>Or if you want to use ENV variables: <code>export TALOSCONFIG=&quot;_out/talosconfig&quot;</code></li>
</ul>
</li>
<li>Set the endpoints and nodes lists if you get tired of specifying them every time you run talosctl commands:
<ul>
<li><code>talosctl config endpoint 10.0.50.161 10.0.50.162 10.0.50.163</code></li>
<li><code>talosctl config node 10.0.50.161 10.0.50.162 10.0.50.163 10.0.50.171 10.0.50.172 10.0.50.173</code></li>
</ul>
</li>
<li>Bootstrap the cluster:
<ul>
<li><code>talosctl bootstrap -n 10.0.50.161</code></li>
</ul>
</li>
<li>Grab kubeconfig - download to current directory
<ul>
<li><code>talosctl kubeconfig -n 10.0.50.161 .</code></li>
</ul>
</li>
<li>Move kubeconfig if this will be used as your default cluster
<ul>
<li><code>mv kubeconfig ~/.kube/config</code></li>
</ul>
</li>
<li>Install kubectl
<ul>
<li><code>curl -LO &quot;https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl&quot;</code></li>
<li><code>install -m 755 kubectl /usr/local/bin/kubectl</code></li>
<li><code>kubectl version</code></li>
</ul>
</li>
<li>Similar to talosctl, configure kubectl by either copying kubeconfig to $HOME/.kube/config or exporting the ENV variable:
<ul>
<li><code>cp kubeconfig ~/.kube/config</code></li>
<li>OR</li>
<li><code>export KUBECONFIG=&quot;/root/whereami/kubeconfig&quot;</code></li>
</ul>
</li>
<li>Verify you can see the nodes. It can take a few minutes for Kubernetes to fully bootstrap, even though Talos shows Ready/Healthy in the dashboard.
<ul>
<li><code>kubectl get node -o wide</code></li>
</ul>
<pre tabindex="0"><code>NAME       STATUS   ROLES           AGE     VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE         KERNEL-VERSION   CONTAINER-RUNTIME
taloscp1   Ready    control-plane   6m34s   v1.31.1   10.0.50.161   &lt;none&gt;        Talos (v1.8.1)   6.6.54-talos     containerd://2.0.0-rc.5
taloscp2   Ready    control-plane   6m34s   v1.31.1   10.0.50.162   &lt;none&gt;        Talos (v1.8.1)   6.6.54-talos     containerd://2.0.0-rc.5
taloscp3   Ready    control-plane   6m2s    v1.31.1   10.0.50.163   &lt;none&gt;        Talos (v1.8.1)   6.6.54-talos     containerd://2.0.0-rc.5
taloswk1   Ready    &lt;none&gt;          6m26s   v1.31.1   10.0.50.171   &lt;none&gt;        Talos (v1.8.1)   6.6.54-talos     containerd://2.0.0-rc.5
taloswk2   Ready    &lt;none&gt;          6m22s   v1.31.1   10.0.50.172   &lt;none&gt;        Talos (v1.8.1)   6.6.54-talos     containerd://2.0.0-rc.5
taloswk3   Ready    &lt;none&gt;          6m18s   v1.31.1   10.0.50.173   &lt;none&gt;        Talos (v1.8.1)   6.6.54-talos     containerd://2.0.0-rc.5
</code></pre></li>
<li>In testing I noticed that some pods which are supposed to run on every control plane node were only running on some of them. Give it some time and check again. You are looking for <code>kube-apiserver-[node]</code>, <code>kube-controller-manager-[node]</code>, and <code>kube-scheduler-[node]</code> for each control plane node.
<ul>
<li><code>kubectl get po -n kube-system</code></li>
</ul>
</li>
</ul>
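<p>While you wait, you can check that each control plane node is running its own copy of the three static pods. Filtering by name is a quick sanity check (with three control plane nodes you should see nine pods total):</p>
<pre tabindex="0"><code>kubectl get po -n kube-system -o wide | grep -E &#39;kube-(apiserver|controller-manager|scheduler)&#39;
</code></pre>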
<h2 id="test-a-workload">Test A Workload</h2>
<ul>
<li>Create deployment: <code>kubectl create deploy nginx-test --image nginx --replicas 1</code></li>
<li>Expose via NodePort service: <code>kubectl expose deploy nginx-test --type NodePort --port 80</code></li>
<li>Get the node the pod is running on: <code>kubectl get po -o wide</code>
<ul>
<li>Note which worker node from the Node column</li>
</ul>
</li>
<li>Get the port on that node: <code>kubectl get svc</code>
<ul>
<li>Note the larger port number on the right side in the Ports column, likely in the 31000 range</li>
</ul>
</li>
<li>In testing, my pod was deployed to taloswk3 which has IP 10.0.50.173, and the port was 31458
<ul>
<li>Visit http://10.0.50.173:31458 in the browser to confirm you see the &ldquo;Welcome to nginx!&rdquo; page</li>
</ul>
</li>
<li>Clean up test resources:
<ul>
<li><code>kubectl delete svc nginx-test</code></li>
<li><code>kubectl delete deploy nginx-test</code></li>
</ul>
</li>
</ul>
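<p>The node and port lookups above can also be scripted. Here&rsquo;s a sketch using <code>kubectl</code> jsonpath queries (run it before deleting the test resources; it assumes the <code>app=nginx-test</code> label that <code>kubectl create deploy</code> adds by default):</p>
<pre tabindex="0"><code>NODE=$(kubectl get po -l app=nginx-test -o jsonpath=&#39;{.items[0].spec.nodeName}&#39;)
NODE_IP=$(kubectl get node $NODE -o jsonpath=&#39;{.status.addresses[?(@.type==&#34;InternalIP&#34;)].address}&#39;)
PORT=$(kubectl get svc nginx-test -o jsonpath=&#39;{.spec.ports[0].nodePort}&#39;)
curl -s http://$NODE_IP:$PORT | grep -i title
</code></pre>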
<h1 id="ongoing-maintenance">Ongoing Maintenance</h1>
<h2 id="client-certificate-expiration---important">Client Certificate Expiration - IMPORTANT!!!</h2>
<p><a href="https://www.talos.dev/v1.8/talos-guides/howto/cert-management/">https://www.talos.dev/v1.8/talos-guides/howto/cert-management/</a></p>
<blockquote>
<p>Talos Linux automatically manages and rotates all server side certificates for etcd, Kubernetes, and the Talos API. Note however that the kubelet needs to be restarted at least once a year in order for the certificates to be rotated. Any upgrade/reboot of the node will suffice for this effect.</p>
</blockquote>
<p>TL;DR: Your <code>talosconfig</code> and <code>kubeconfig</code> files are going to expire one year from when they are created. You need to regenerate these files, ideally <strong>before</strong> they expire, to avoid issues connecting to the Talos API or the Kubernetes API.</p>
<p>Here&rsquo;s what you need to do. If you get lost, refer to the official Talos documentation linked above for current instructions.</p>
<h3 id="talosconfig">talosconfig</h3>
<p>You will know <code>talosconfig</code> has expired if you see this error message when attempting to run <code>talosctl</code> commands. Wow, that&rsquo;s a lot of errors!!!</p>
<pre tabindex="0"><code>error reading file: rpc error: code = Unavailable desc = last connection error: connection error: desc = &#34;error reading server preface: remote error: tls: expired certificate&#34;
</code></pre><ul>
<li>At least once a year, regenerate <code>talosconfig</code>. You could automate this.</li>
<li>Check config details with <code>talosctl config info</code>
<ul>
<li>If your current <code>talosconfig</code> is still valid: <code>talosctl -n CP1 config new talosconfig-reader --roles os:reader --crt-ttl 24h</code>
<ul>
<li>Here, <code>CP1</code> is the IP of one of your control plane nodes, e.g. <code>10.0.50.161</code></li>
</ul>
</li>
<li>If <code>talosconfig</code> has expired:
<ul>
<li>Generate from <code>secrets</code> bundle: <code>talosctl gen config --with-secrets secrets.yaml --output-types talosconfig -o talosconfig &lt;cluster-name&gt; https://&lt;cluster-endpoint&gt;</code>
<ul>
<li>Here, <code>&lt;cluster-name&gt;</code> should match what you named your Talos cluster earlier. <code>&lt;cluster-endpoint&gt;</code> should be the VIP used for your Talos cluster, e.g. 10.0.50.160. If you didn&rsquo;t elect to use VIP, just pick one of your control plane IPs.</li>
</ul>
</li>
</ul>
</li>
<li>There&rsquo;s a third option in the docs if you need it.</li>
</ul>
</li>
<li>If you store <code>talosconfig</code> in version control, re-encrypt with SOPS and update. This is totally optional and probably unnecessary since you can generate the config by other means if needed.</li>
<li><code>mv talosconfig ~/.talos/config</code></li>
<li>Configure your endpoints and nodes again since they are tied to the config:
<ul>
<li><code>talosctl config endpoint 10.0.50.161 10.0.50.162 10.0.50.163</code></li>
</ul>
</li>
</ul>
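<p>To check how long your current certificate is valid without waiting for an error, recent versions of <code>talosctl config info</code> print the expiry. You can also decode the client certificate directly. This is a sketch that assumes <code>yq</code> and <code>openssl</code> are installed and that you want the certificate for the currently selected context:</p>
<pre tabindex="0"><code>yq -r &#39;.contexts[.context].crt&#39; ~/.talos/config | base64 -d | openssl x509 -noout -enddate
</code></pre>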
<h3 id="kubeconfig">kubeconfig</h3>
<p>You will know <code>kubeconfig</code> has expired if you see this error message when attempting to run <code>kubectl</code> commands:</p>
<pre tabindex="0"><code>error: You must be logged in to the server (the server has asked for the client to provide credentials)
</code></pre><ul>
<li>At least once a year, regenerate <code>kubeconfig</code>. This requires you to have a valid <code>talosconfig</code>.
<ul>
<li>Follow the same steps as before to grab <code>kubeconfig</code>: <code>talosctl kubeconfig -n 10.0.50.161 .</code></li>
<li><code>mv kubeconfig ~/.kube/config</code></li>
</ul>
</li>
</ul>
<h2 id="backing-up-etcd---recommended">Backing Up <code>etcd</code> - Recommended!</h2>
<p>I&rsquo;ll be talking about other backups in more depth later on in part 7 with Velero.</p>
<p><code>etcd</code> is special: it&rsquo;s a real-time representation of all Kubernetes cluster resources, and it&rsquo;s what the API server queries when you run <code>kubectl</code> commands. It&rsquo;s the current running state of the cluster. Backing this up is important, because if something goes wrong with it and you don&rsquo;t have a backup, you will have to rebuild your entire cluster.</p>
<p>I strongly recommend backing up <code>etcd</code> separately. Since <code>etcd</code> doesn&rsquo;t run as part of the cluster, but rather at the Talos level, you can use <code>talosctl</code> to manage and back up the <code>etcd</code> database. This actually simplifies things from a Kubernetes administrator perspective, but it&rsquo;s certainly different from the approach you would take in CKA certification training.</p>
<p>The basic command to snapshot etcd with Talos is this:</p>
<ul>
<li><code>talosctl -n 10.0.50.161 etcd snapshot &lt;path&gt;</code></li>
</ul>
<p>I created a small bash script and scheduled it in a cron job to automate this backup. The resulting file is named like <code>etcd.snapshot.20241023</code>:</p>
<pre tabindex="0"><code>#!/bin/bash

TALOSCONFIG=&#34;/root/.talos/config&#34;
# Use a dedicated variable (not PATH) so we don't clobber the shell's command lookup path
BACKUP_DIR=&#34;/mnt/talos_backup&#34;

/usr/local/bin/talosctl --talosconfig $TALOSCONFIG -n 10.0.50.161 etcd snapshot $BACKUP_DIR/etcd.snapshot.$(/bin/date +%Y%m%d)

# Delete snapshots older than 30 days
/usr/bin/find $BACKUP_DIR -mtime +30 -type f -delete
</code></pre><h1 id="next-steps">Next Steps</h1>
<p>Now you should have a Talos Linux cluster running with each node having its own static IP, along with a VIP for the control plane cluster. You are ready to start installing what I would consider foundational components that will be used to automate tasks for you when deploying actual workloads later on. These include:</p>
<ul>
<li>Secret encryption - Sealed Secrets</li>
<li>LoadBalancer service - MetalLB</li>
<li>Certificate manager - cert-manager</li>
<li>Ingress provider - traefik</li>
<li>CSI driver - democratic-csi</li>
<li>Backups - Velero + etcd backup with talosctl</li>
<li>Grafana (<a href="https://grafana.com/docs/grafana/latest/">https://grafana.com/docs/grafana/latest/</a>) + Prometheus (<a href="https://prometheus.io/docs/introduction/overview/">https://prometheus.io/docs/introduction/overview/</a>)</li>
<li>Loki + Promtail - <a href="https://github.com/grafana/loki">https://github.com/grafana/loki</a></li>
<li>Metrics server (<a href="https://github.com/kubernetes-sigs/metrics-server">https://github.com/kubernetes-sigs/metrics-server</a>) - This is intended for autoscaling, not general monitoring</li>
</ul>
]]></content:encoded>
    </item>
    
    <item>
      <title>Automate Ansible With GitLab</title>
      <link>https://blog.dalydays.com/post/automate-ansible-with-gitlab/</link>
      <pubDate>Mon, 20 Feb 2023 14:22:26 +0000</pubDate>
      
      <guid>https://blog.dalydays.com/post/automate-ansible-with-gitlab/</guid>
      <description>Run your Ansible playbooks directly from GitLab pipelines, allowing you to easily track your playbooks using IaC.</description>
      <content:encoded><![CDATA[<h1 id="my-use-case">My Use Case</h1>
<p>My use case is to deploy new VMs to Proxmox, provision them, and finally bootstrap an RKE2 cluster in HA with 3 server nodes and 3 agents. The goal is for my pipeline to do all the heavy lifting: clone a template to 6 VMs, do all the provisioning such as setting static IPs and installing some packages, then install RKE2 and join all nodes to the cluster. This involves a series of playbooks that run in different pipeline stages, all triggered from a single push. It also means that once my playbooks are stable and I don&rsquo;t want to push just to build a new cluster, I can manually run the pipeline from GitLab with a couple of clicks to build out an entirely new cluster.</p>
<h1 id="why-gitlab-ci-vs-ansible-towerawx">Why GitLab CI vs. Ansible Tower/AWX?</h1>
<p>Ansible on its own is a great tool for provisioning servers and automating any changes you used to make (or still make) over SSH in a terminal. By moving things into an Ansible playbook, you get a reliable, repeatable way to run through a specific set of tasks that you might want to reuse on different servers. A simple example might be installing Docker and Docker Compose on an Ubuntu server.</p>
<p>I&rsquo;ve used Ansible Tower at work and while it&rsquo;s nice, I always felt like it has a weak point: certain data is not tracked in your VCS. You version control playbooks and inventory. But when it comes to schedules, job history, or anything else, that&rsquo;s all stored in Ansible Tower&rsquo;s database. I figure, everything that <em>can</em> be in a VCS <em>should</em> be in a VCS. This includes scheduled jobs and run history, including what variables were included when a playbook was run.</p>
<p>Speaking of using a VCS to keep track of Ansible playbooks, GitLab is awesome. It&rsquo;s a bit heavy on resources for a home lab, but it really gives you all (or almost all) of the features you could want. I already have GitLab running and I already put playbooks in there. So why not run the playbooks directly from a pipeline?</p>
<p>This not only allows you to run a single playbook from a pipeline when pushing to a main branch, but you can also run a series of playbooks at different pipeline stages as you will see a bit later.</p>
<h1 id="what-does-it-take-aka-requirements">What Does It Take (a.k.a. Requirements)?</h1>
<p>You need GitLab. You need to know a little bit about writing Ansible playbooks and creating an inventory file. You need to know a little about GitLab CI pipelines and CI/CD variables, and have GitLab already configured with at least 1 runner. You should probably at least get the basic concept of SSH keys. For this demonstration you need Proxmox and the desire to build a Rancher RKE2 cluster, or at least the motivation to follow along and learn the GitLab and/or Ansible pieces.</p>
<h1 id="show-me-the-money">Show Me The Money</h1>
<h2 id="running-ansible-from-a-container">Running Ansible From A Container</h2>
<p>In order to run Ansible from a pipeline, you just need a Python environment with Ansible installed. I always check <a href="https://hub.docker.com/">Docker Hub</a> for existing (up to date) images, but surprisingly there is no official Ansible image. Maybe it&rsquo;s not recommended, but I can&rsquo;t find any reason why not, which seems silly to me. So the next best thing is to find an official, slim image that we can install Ansible on. Since Ansible requires Python, the official Python image is perfect. I went with the <code>python:3-slim</code> tag since it tracks the latest stable version and is very lightweight.</p>
<p>With that in hand, it&rsquo;s just a matter of installing Ansible from pip with <code>python3 -m pip install --user ansible</code>, plus anything else you might want in your temporary container instance, like openssh-client.</p>
<h2 id="gitlab-setup">GitLab Setup</h2>
<p>Setting up pipelines and runners is pretty far beyond the scope of this post, but if you&rsquo;re running on <a href="https://gitlab.com">GitLab.com</a> you should be able to do this easily.</p>
<ul>
<li>
<p>Start with an empty repo</p>
</li>
<li>
<p>Add a basic Ansible inventory file <code>inventory.ini</code></p>
<ul>
<li>I use <code>ansible_ssh_common_args=&#39;-o StrictHostKeyChecking=no&#39;</code>, but the alternative is to add trusted host keys into a GitLab CI/CD variable, which is pretty annoying to do</li>
</ul>
</li>
<li>
<p>Add a playbook for testing <code>deploy.yaml</code></p>
</li>
<li>
<p>Generate a new SSH key for GitLab to connect to your Proxmox PVE host (or use one you already have)</p>
</li>
<li>
<p>Add a Proxmox API Token credential in Proxmox and record the token ID and secret</p>
</li>
<li>
<p>Add GitLab CI/CD variables</p>
<ul>
<li><code>PVE_API_TOKEN</code> = the actual generated token</li>
<li><code>PVE_API_TOKEN_ID</code> = the name of the token user</li>
<li><code>PVE_API_USER</code> = root@pam (in my case)</li>
<li><code>PVE_HOST</code> = [IP address of your Proxmox host]</li>
<li><code>SSH_PRIVATE_KEY</code> = private SSH key generated earlier (not the public key)</li>
</ul>
</li>
</ul>
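<p>For reference, a minimal <code>inventory.ini</code> for the Proxmox host mentioned above might look like this (the group name and IP are placeholders for your environment):</p>
<pre tabindex="0"><code>[pve]
10.0.50.5

[pve:vars]
ansible_user=root
ansible_ssh_common_args=&#39;-o StrictHostKeyChecking=no&#39;
</code></pre>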
<p>In your <code>.gitlab-ci.yml</code> file, paste the following. <code>before_script</code> and <code>after_script</code> will run before/after every job. This works fine for me since every job I&rsquo;m running is an Ansible playbook. So in the deploy stage, it runs <code>before_script</code>, then the ansible-playbook command, then the <code>after_script</code> cleanup (so we don&rsquo;t leave our private key hanging out in a container on the runner).</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">image</span><span class="p">:</span><span class="w"> </span><span class="l">python:3-slim</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">stages</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="l">deploy</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="l">provision</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="l">install</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">before_script</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="s1">&#39;command -v ssh-agent &gt;/dev/null || ( apt update &amp;&amp; apt install -y openssh-client )&#39;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="l">eval $(ssh-agent -s)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="l">mkdir -p ~/.ssh</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="l">chmod 700 ~/.ssh</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="l">echo &#34;$SSH_PRIVATE_KEY&#34; | tr -d &#39;\r&#39; &gt; ~/.ssh/gitlab_ed25519</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="l">chmod 600 ~/.ssh/gitlab_ed25519</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="l">export PATH=&#34;$HOME/.local/bin:$PATH&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="l">python3 -m pip install --user ansible</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">after_script</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="l">rm -rf ~/.ssh/</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">deploy</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">stage</span><span class="p">:</span><span class="w"> </span><span class="l">deploy</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">script</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">ansible-playbook -i inventory.ini --user root --private-key ~/.ssh/gitlab_ed25519</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="l">-e &#34;api_host=${PVE_HOST} api_user=${PVE_API_USER} api_token_id=${PVE_API_TOKEN_ID}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="l">api_token_secret=${PVE_API_TOKEN}&#34; deploy.yaml</span><span class="w">
</span></span></span></code></pre></div>]]></content:encoded>
    </item>
    
    <item>
      <title>Expense Tracker with Firefly III and Metabase</title>
      <link>https://blog.dalydays.com/post/nocodb-with-metabase-expense-tracker/</link>
      <pubDate>Mon, 20 Feb 2023 13:40:56 +0000</pubDate>
      
      <guid>https://blog.dalydays.com/post/nocodb-with-metabase-expense-tracker/</guid>
      <description>Quickly set up an easy to use, web based expense tracker with nice analytics using Firefly III and Metabase, with a simple docker-compose stack.</description>
      <content:encoded><![CDATA[<h1 id="what-is-firefly-iii">What is Firefly III?</h1>
<p><a href="https://www.firefly-iii.org/">Firefly III</a> is &ldquo;A free and open source personal finance manager&rdquo; that is web based and can be self hosted easily in a Docker/Podman/whatever container. I switched to this after many years of using GnuCash because I wanted to move to a more modern platform, and specifically wanted the ability to access a database instead of the flat file format that GnuCash uses. That makes it easy to integrate with Metabase.</p>
<p>My use case is to hand enter every transaction that appears on my bank account or credit card account, and specify what category each belongs to, with the goal of using that data to track budgets (which I might talk about in the future) and get an easy high level view of how much is being spent per week/month/whatever by category. I should also be able to quickly dive into a problematic spending category and get a breakdown of the transactions it&rsquo;s made up of. This is where Metabase is a great help. Don&rsquo;t even bother looking at the Firefly III built in dashboards. I&rsquo;m sure they&rsquo;re nice, but there&rsquo;s just no comparison to the power and flexibility you get using Metabase to analyze the data.</p>
<h1 id="what-is-metabase">What is Metabase?</h1>
<p><a href="https://www.metabase.com/">Metabase</a> is a free open source BI (Business Intelligence) dashboarding tool, which also happens to run easily in a Docker/Podman/whatever container. This means I can create charts, graphs, dashboards, (dynamic) pivot tables, or whatever else I want based on a set of data. You can easily add filters, and one feature I really like about Metabase is how easy date filters are to work with. The featured image of this post shows an example where the blue is the number of transactions in that category in the past 12 months, and the green is the average transaction cost. This is kind of a weird chart, but also kind of interesting.</p>
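<p>To get you started, here&rsquo;s a minimal <code>docker-compose.yml</code> sketch for running the two together. Treat the images, ports, and environment variables as assumptions to verify against each project&rsquo;s documentation, and replace the placeholder secrets before using it:</p>
<pre tabindex="0"><code>services:
  firefly:
    image: fireflyiii/core:latest
    environment:
      - APP_KEY=CHANGEME_32_CHARACTER_STRING   # must be exactly 32 characters
      - DB_CONNECTION=mysql
      - DB_HOST=db
      - DB_DATABASE=firefly
      - DB_USERNAME=firefly
      - DB_PASSWORD=CHANGEME
    ports:
      - &#34;8080:8080&#34;
    depends_on:
      - db
  db:
    image: mariadb:10
    environment:
      - MYSQL_DATABASE=firefly
      - MYSQL_USER=firefly
      - MYSQL_PASSWORD=CHANGEME
      - MYSQL_RANDOM_ROOT_PASSWORD=yes
    volumes:
      - db_data:/var/lib/mysql
  metabase:
    image: metabase/metabase:latest
    ports:
      - &#34;3000:3000&#34;
volumes:
  db_data:
</code></pre>
<p>Once both are up, point Metabase at the same database as a data source and build your dashboards from there.</p>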
]]></content:encoded>
    </item>
    
    <item>
      <title>Why I Recommend Joplin and Nextcloud</title>
      <link>https://blog.dalydays.com/post/why-i-recommend-joplin-and-nextcloud/</link>
      <pubDate>Fri, 10 Jun 2022 14:37:10 +0000</pubDate>
      
      <guid>https://blog.dalydays.com/post/why-i-recommend-joplin-and-nextcloud/</guid>
      <description>Joplin is a Markdown based note taking app that includes features like checklists, embedding images, and more.</description>
      <content:encoded><![CDATA[<p>In my quest to replace paper notes and paper task lists over the years, I have been trying out different web based solutions. I am mostly opposed to desktop applications you need to install, although I did use VSCode for a while and store .txt files in a folder synced with Nextcloud. But I still wind up not being able to find notes I know I wrote, and it&rsquo;s possible I didn&rsquo;t store them in the same folder for some reason.</p>
<p>Combine that with the fact that I have also been using Nextcloud Deck primarily for task lists (but also for some notes if they are related), maybe the notes I&rsquo;m looking for are there. But it&rsquo;s a little difficult to search there as well. And the interface for Nextcloud Deck was limited because the editor was only available in the right hand column. So that means if you take a lot of notes, you have to do a lot of scrolling. Apparently this just changed in a recent update, but it&rsquo;s about a week too late since I just switched to Joplin&hellip;</p>
<p>I started investigating fully web based solutions, keeping in mind some key deal breakers:</p>
<ul>
<li>I need to be able to self host this for privacy/security.</li>
<li>This needs to be free. I may be willing to pay for a good solution some day, but today I need to know what all the open source, self hosted options are.</li>
<li>It needs to be better than Nextcloud Deck and VSCode .txt files.</li>
</ul>
<p>What I found was that although there does seem to be a wide variety of options, they are not all equal. Some of the more promising looking web based options wound up being wrong for me for a variety of reasons:</p>
<ul>
<li>git based notes are cool but I don&rsquo;t think that is right for my use case</li>
<li>Some options don&rsquo;t seem to have a good way to browse/search/link older notes which makes it bad for historical reference</li>
<li>Some projects did not have any recent commits, so I&rsquo;m hesitant to start using a project that may no longer be maintained.</li>
</ul>
<p>One project that stands out is called Standard Notes. This looked the most promising, but it&rsquo;s not as simple to deploy as almost anything else out there. Because it&rsquo;s very microservices oriented, it&rsquo;s a real chore to deploy, and just for notes I wasn&rsquo;t willing to invest that much time researching everything I needed to understand to run it in production. I may come back to this project, but for now I felt the barrier to entry was too high, and I moved on after a short amount of experimentation.</p>
<p>The last one I tried was Joplin. I read a lot of praise about Joplin and wanted to give it a shot. I have avoided it only because it requires you to install an app locally and configure sync. However, the app is quick to install and sync is very easy to set up, especially if you already run Nextcloud.</p>
<p>Joplin ticks all the boxes for me: it takes notes, I can do checklists, it syncs with a private server, I can easily find things with search, and the desktop editor is really nice while keeping everything in one place.</p>
<p>Here&rsquo;s a tip I didn&rsquo;t realize at first due to my lack of WebDAV experience. When syncing Joplin with Nextcloud, the server URL is something like <code>https://cloud.example.com/remote.php/webdav/</code> but I would <strong>strongly</strong> recommend creating a folder in your account first that Joplin will sync to. For me, that&rsquo;s just a folder called &ldquo;joplin&rdquo; which makes that WebDAV URL <code>https://cloud.example.com/remote.php/webdav/joplin</code></p>
<p>Go forth and take better notes!</p>
]]></content:encoded>
    </item>
    
    <item>
      <title>HA Kubernetes Cluster With Ansible and Kubespray (WIP)</title>
      <link>https://blog.dalydays.com/post/ha-kubernetes-cluster-with-ansible-and-kubespray-wip/</link>
      <pubDate>Fri, 11 Feb 2022 15:11:35 +0000</pubDate>
      
      <guid>https://blog.dalydays.com/post/ha-kubernetes-cluster-with-ansible-and-kubespray-wip/</guid>
      <description>This is an overview of how to build an HA Kubernetes cluster using Ansible with Kubespray for reproducible results.</description>
      <content:encoded><![CDATA[<h1 id="this-is-currently-a-work-in-progress-mostly-for-my-own-benefit-to-track-my-progress-through-this-build-there-may-be-major-steps-missing">This is currently a work in progress, mostly for my own benefit to track my progress through this build. There may be major steps missing!</h1>
<h2 id="prerequisites">Prerequisites</h2>
<p>As a starting point, you need 6 VMs which are accessible via SSH so that Ansible can remote in and make all the necessary changes. VM provisioning and Ansible/SSH keys are beyond the scope of this guide but are a prerequisite. I currently run Proxmox and use Terraform to generate the fresh Ubuntu VMs and inject my Ansible SSH key so that it can do its thing. I strongly recommend using a tool like Terraform that allows you to quickly destroy and recreate fresh VMs since you will need to do this process any time the Kubespray deployment fails or you want to make big changes and deploy again.</p>
<ul>
<li>
<p>Create 6 fresh Ubuntu based VMs to be used for 3 masters and 3 workers (using Terraform for this step)</p>
</li>
<li>
<p>Install SSH keys into the VMs so that Ansible can remote in with sudo privileges (using Terraform for this step)</p>
</li>
<li>
<p>On your Ansible machine, <code>git clone</code> the kubespray repo</p>
</li>
<li>
<p>Build your inventory file for Ansible to connect into the VMs it will be modifying</p>
</li>
<li>
<p>Make changes as needed to your Kubespray deployment.</p>
</li>
<li>
<p>Optionally but strongly recommended, track any customizations in your own git repo. There are some different approaches to this but I&rsquo;m currently using a private repo and tracking the official kubespray repo as an upstream branch.</p>
</li>
<li>
<p>I wanted to use kube-vip for HA on the control plane API which is currently not built into Kubespray, so I created a new playbook that copies a yaml file into the static pod manifest folder which deploys kube-vip for the control plane nodes. I run this playbook which does that task and then calls the kubespray cluster.yml playbook afterward.</p>
</li>
<li>
<p>Run the Kubespray playbook. Something like this: <code>ansible-playbook -i inventory/mycluster/hosts.yaml -b kube-vip-cluster.yml -u ubuntu</code></p>
</li>
<li>
<p>Since I couldn&rsquo;t seem to get kube-vip working as a LoadBalancer service, I&rsquo;m using MetalLB for that. Deploy MetalLB and don&rsquo;t forget to add the config which tells it which IP address pool to use.</p>
<ul>
<li>You can just enable MetalLB in Kubespray, but I opted to deploy using the official manifest in order to decouple it from the core cluster creation. I want to keep the cluster as barebones as possible for deployment, then install all add-ons separately, eventually using GitOps with something like ArgoCD.</li>
</ul>
</li>
<li>
<p>Install an ingress controller such as Traefik.</p>
<ul>
<li>This requires a ClusterRole, ClusterRoleBinding, ServiceAccount, Daemonset or Deployment, and a Service of type LoadBalancer.</li>
</ul>
</li>
<li>
<p>Install Rancher. There are several options for managing clusters such as Lens or K9s. After using Lens for a little while I find it difficult to switch between client machines, especially when you are destroying and recreating a cluster for testing which means you have to reconnect manually over and over. I like the idea of a web based management interface because it is more centralized, and Rancher does the job nicely.</p>
</li>
<li>
<p>Install a distributed block storage solution. I am currently testing Longhorn, but ran into an issue with the RWX mode when deploying bitnami/wordpress from the helm chart (<a href="https://github.com/longhorn/longhorn/issues/3661">Longhorn issue #3661</a>)</p>
</li>
<li>
<p>Install a monitoring solution. I haven&rsquo;t spent enough time with the ELK stack or many other monitoring solutions but that is one of the next steps on my radar.</p>
</li>
<li>
<p>Install a centralized logging solution. I will evaluate Graylog and Logstash/Kibana.</p>
</li>
<li>
<p>Install applications</p>
<ul>
<li>Migrate existing Nextcloud VM instance to Kubernetes</li>
</ul>
</li>
</ul>
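<p>For reference, the MetalLB address pool config mentioned above looks something like the following sketch, using MetalLB&rsquo;s CRD-based configuration (v0.13+). The pool name and address range here are hypothetical and need to match your own network:</p>
<pre><code># metallb-pool.yaml - apply after deploying MetalLB itself:
#   kubectl apply -f metallb-pool.yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: default-pool
  namespace: metallb-system
spec:
  addresses:
  - 192.168.1.240-192.168.1.250   # range MetalLB may hand out to LoadBalancer services
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: default-l2
  namespace: metallb-system
spec:
  ipAddressPools:
  - default-pool
</code></pre>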
]]></content:encoded>
    </item>
    
    <item>
      <title>Using Restic with Backblaze B2 for Off-Site Backup</title>
      <link>https://blog.dalydays.com/post/using-restic-with-backblaze-b2/</link>
      <pubDate>Wed, 20 Oct 2021 00:00:00 +0000</pubDate>
      
      <guid>https://blog.dalydays.com/post/using-restic-with-backblaze-b2/</guid>
      <description>How to use restic with Backblaze B2 and some shell scripts to help with scheduling/automation/monitoring</description>
<content:encoded><![CDATA[<p><a href="https://github.com/linucksrox/restic-scripts">Check out the Github repo here</a></p>
<h1 id="my-use-case">My Use Case</h1>
<p>I have been using restic with Backblaze B2 for just over 3 years and have saved money at the expense of my time. Before that I was running Crashplan home edition until they increased the price to $10/month (at the time I was only storing about 600GB off-site). Backblaze B2 was offering pay-as-you-go pricing at $0.005/GB, so I would be saving about $5/month. Fast forward to today: I&rsquo;m still spending less than $10/month (but slowly approaching that threshold), while retaining encrypted, deduplicated snapshots going back more than 3 years. Today my total off-site backup size is around 1.6TB.</p>
<h1 id="why-restic">Why Restic?</h1>
<p>These are the main benefits of pairing restic with B2 in my opinion:</p>
<ul>
<li>restic encrypts data before sending over the wire, so I don&rsquo;t have to worry about which data storage provider I use</li>
<li>restic deduplicates data, which means it stores less data overall and decreases my monthly backup cost, at the expense of more computing power and longer snapshot times</li>
<li>Backblaze B2 has been 100% reliable in my experience, and their pricing is about as cheap as you can find. When I decided on Backblaze, Wasabi was also offering a similar solution for $5/month per TB, but their prices have increased slightly since then. Wasabi also does not charge egress fees or per-transaction-type fees the way Backblaze does, so you can more easily wind up with additional costs using Backblaze B2 depending on how you run your backups. But in my experience these costs have been very minimal (well under $1 per month even when I use class B transactions heavily).</li>
<li>restic snapshots allow me to keep many versions of my data over time without really worrying about the overall storage usage.</li>
<li>restic features an easy way to prune old snapshots based on a retention policy that can be easily scripted (which you can see in the restic-scripts repository, <code>restic_forgetandprune_bypolicy.sh</code>)</li>
</ul>
<h1 id="getting-started">Getting Started</h1>
<p>Assuming you have already thought through your backup plan (how much data you are backing up, how long you want to keep snapshots, etc.), we can start by getting a machine set up to actually run the backups, checks, logging/monitoring, and restores.</p>
<p>I&rsquo;m not going into specifics about setting up Backblaze B2 (or any other destination you want to use), so that is an exercise for you.</p>
<ol>
<li>Make sure your machine is ready to run restic jobs. Since restic does deduplication, it can be heavy on resources like CPU/RAM and also cache storage. To give you an idea of scale: I have been working with a ~1.6TB backup repository, adding somewhere around 200MB-500MB twice daily, and it&rsquo;s using almost 8GB of cache data, consumes almost all 8GB of RAM, and a fair amount of the 8 CPU cores I allocated to that VM.</li>
<li>Mount any network shares you will be using as a source. You will most likely want to update <code>/etc/fstab</code> so that these will mount automatically on reboot.</li>
<li>Install restic. You might be able to use your distro&rsquo;s package manager, but I would recommend just downloading the latest stable binary from Github: <a href="https://github.com/restic/restic/releases">https://github.com/restic/restic/releases</a></li>
<li>Clone the restic-scripts repo: <a href="https://github.com/linucksrox/restic-scripts">https://github.com/linucksrox/restic-scripts</a></li>
<li>Edit <code>restic_env.sh</code> to point to your Backblaze B2 bucket and configure your backups.</li>
<li>Assuming you&rsquo;re creating a brand new restic repository, you&rsquo;ll need to start by initializing the repo before you can run backups. Run <code>restic_initialize_repo.sh</code> or follow the restic documentation for that: <a href="https://restic.readthedocs.io/en/latest/030_preparing_a_new_repo.html">https://restic.readthedocs.io/en/latest/030_preparing_a_new_repo.html</a></li>
<li>Follow the instructions in the README there, and open an issue if you have any questions or something does not work.</li>
</ol>
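<p>To give a sense of what goes into <code>restic_env.sh</code>, here is a minimal sketch using restic&rsquo;s standard environment variables for a Backblaze B2 backend (check the repo for the script&rsquo;s exact contents; the bucket name and credentials below are placeholders):</p>
<pre><code># restic_env.sh - sourced by the other scripts
export RESTIC_REPOSITORY="b2:my-backup-bucket:/"   # bucket name and path within it
export B2_ACCOUNT_ID="yourKeyId"                   # B2 application key ID
export B2_ACCOUNT_KEY="yourApplicationKey"         # B2 application key
export RESTIC_PASSWORD="your-repository-password"  # encrypts/decrypts the repo
</code></pre>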
<h1 id="use-cases">Use Cases</h1>
<h2 id="initializing-a-new-restic-repository">Initializing A New Restic Repository</h2>
<p>Run <code>restic_initialize_repo.sh</code> which will create a brand new repository if data does not already exist there. If you&rsquo;re familiar with git, this is kind of similar to git init. No data is actually sent, but a config is created with your repository details, ready to track snapshots.</p>
<h2 id="backing-up-data">Backing Up Data</h2>
<p>Run <code>restic_backup.sh</code> for this which will rely on <code>restic_unlock.sh</code>, <code>restic_env.sh</code> and <code>restic_excludelist.txt</code>. This just takes a backup snapshot using all the environment variables you already have configured. This can be scheduled with cron (see the README for a cron example and how to pipe the output to a log file).</p>
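<p>For illustration, a cron entry for twice-daily backups might look like this (the script and log paths are hypothetical; adjust to wherever you cloned the repo):</p>
<pre><code># m h dom mon dow  command
0 2,14 * * * /opt/restic-scripts/restic_backup.sh &gt;&gt; /var/log/restic_backup.log 2&gt;&amp;1
</code></pre>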
<h2 id="listing-snapshots">Listing Snapshots</h2>
<p>Run <code>restic_snapshots.sh</code></p>
<h2 id="checking-your-backup-repo">Checking Your Backup Repo</h2>
<p>Run <code>restic_check.sh</code>. I recommend scheduling this with cron (see the README for an example of this).</p>
<h2 id="prune-old-snaphsots">Prune Old Snapshots</h2>
<p>You can define a backup policy by number of snapshots, number of weeks/months/years and other options like always keeping the latest snapshot. <code>restic_forgetandprune_bypolicy.sh</code> contains the policy I use and will automatically remove old snapshots and associated data blobs, keeping the 4 most recent, 30 most recent days, 8 most recent weeks, 12 most recent months, and 100 years (basically don&rsquo;t ever delete old backups but only retain yearly intervals beyond the most recent 12 months). You can easily adjust this by modifying the options in the file.</p>
<p>There&rsquo;s also <code>restic_forgetandprune_2weeks.sh</code> which removes everything except the past 2 weeks worth of snapshots if that&rsquo;s more your speed.</p>
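<p>A retention policy like the one described above maps directly onto restic&rsquo;s <code>forget</code> flags, roughly like this:</p>
<pre><code>restic forget --keep-last 4 --keep-daily 30 --keep-weekly 8 \
  --keep-monthly 12 --keep-yearly 100 --prune
</code></pre>
<p>The <code>--prune</code> flag removes the now-unreferenced data blobs in the same run instead of requiring a separate <code>restic prune</code>.</p>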
<h2 id="forget-a-specific-snapshot">Forget A Specific Snapshot</h2>
<p>You may need this in a situation where a backup fails to complete or you are getting errors after running <code>restic check</code>. If you don&rsquo;t already know the snapshot id you need to remove, run <code>restic_snapshots.sh</code> to find the snapshot id of the backup you want to remove, then run <code>restic_forget_byid.sh</code> and pass in the snapshot id to have it removed. This script doesn&rsquo;t automatically prune (so it removes the snapshot from the index but doesn&rsquo;t actually remove data blobs). Run <code>restic_prune.sh</code> to prune, obviously.</p>
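<p>The underlying commands the scripts wrap are simple (the snapshot id is whatever <code>restic snapshots</code> reported):</p>
<pre><code>restic forget [snapshot-id]   # remove the snapshot from the index
restic prune                  # actually delete unreferenced data blobs
</code></pre>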
<h2 id="manage-backup-repository-keys">Manage Backup Repository Keys</h2>
<p>Restic uses keys to encrypt/decrypt repository data. You will need to add at least 1 key, but you can use these scripts to add and remove keys if you need to rotate them out for example.</p>
<ul>
<li><code>restic_key_list.sh</code> - list current keys</li>
<li><code>restic_key_add_password.sh</code> - add a password to a key</li>
<li><code>restic_key_change_password.sh</code> - change the password for an existing key</li>
<li><code>restic_key_remove_password.sh</code> - remove password from existing key</li>
</ul>
<h2 id="mount-all-snapshots">Mount All Snapshots</h2>
<p>Using <code>restic_mount.sh</code> you can mount snapshots in order to browse/search for files across all snapshots, restore data, or just test your backups. This enables you to use a file manager or CLI, whichever fits your needs. By default, it expects there to be a <code>mount</code> directory in the same location as this script (<code>./mount</code>), but you can obviously change this to fit your needs in the script.</p>
<h2 id="get-backup-stats">Get Backup Stats</h2>
<p>Run <code>restic_stats.sh</code> to get some stats about the total size of your repo and the total deduplicated data size for comparison.</p>
]]></content:encoded>
    </item>
    
    <item>
      <title>How To Hot Swap ZFS Disks In Proxmox</title>
      <link>https://blog.dalydays.com/post/how-to-hot-swap-zfs-disk-in-proxmox/</link>
      <pubDate>Wed, 13 Oct 2021 00:00:00 +0000</pubDate>
      
      <guid>https://blog.dalydays.com/post/how-to-hot-swap-zfs-disk-in-proxmox/</guid>
      <description>A guide to replacing a failed disk in a ZFS pool without shutting down or rebooting</description>
      <content:encoded><![CDATA[<h2 id="proxmox-and-zfs">Proxmox And ZFS</h2>
<p>ZFS in Proxmox works really well and since I moved away from FreeNAS to Proxmox I have never had any issues. My only complaint is that replacing a disk is a bit more of a manual process than it was in FreeNAS, but you can still hot swap disks with no rebooting or downtime. You might consider that a good thing because it requires you to understand ZFS and storage a little better which can only help you down the road.</p>
<p>Hot swapping disks is not really complicated, but just involves a few steps that you have to follow in a particular order. So here is a detailed breakdown of what to do.</p>
<h2 id="requirements-and-base-understanding">Requirements And Base Understanding</h2>
<ul>
<li>Understanding hot swapping:
<ul>
<li>Unmounting and physically disconnecting a disk while the system is powered on, then plugging in a new disk into the same port and bringing it online with no system downtime.</li>
</ul>
</li>
<li>A supported hardware controller. Hot swapping will not necessarily work on any old hardware you have lying around, and if not then you need to power off to disconnect and reconnect disks.</li>
</ul>
<p>In case it&rsquo;s not obvious, you cannot hot swap a root disk regardless of supported hardware, etc. This guide does not attempt to explain the process for replacing a failed root disk and is only focused on hot swapping failed disks in a ZFS pool that is still online but in a degraded state.</p>
<h1 id="checking-the-status-of-zfs-pools-and-disks">Checking The Status Of ZFS Pools And Disks</h1>
<h2 id="is-a-disk-failing">Is A Disk Failing?</h2>
<ul>
<li>You may be getting email alerts from the SMART monitoring daemon</li>
<li>Run <code>smartctl -a /dev/sdc</code> for the device you want to inspect</li>
<li>Check the zpool status (CLI or Proxmox UI)
<ul>
<li><code>zpool status</code> or <code>zpool status [poolname]</code></li>
</ul>
</li>
</ul>
<h2 id="determining-which-disk-corresponds-to-which-device-letter">Determining Which Disk Corresponds To Which Device Letter</h2>
<ul>
<li>
<p>Go into Proxmox, click the node, then click Disks. This lists out device names and disk info including serial numbers.</p>
</li>
<li>
<p>Run <code>zpool status [poolname]</code> to get a breakout of which devices are in the pool. If you added them by device-id (strongly recommended) then you will see that info including the serial numbers.</p>
</li>
<li>
<p>Determining which disk corresponds to a physical slot on the server is a little more difficult. The best approach is to physically label disks on the outside of the bays and/or keep a spreadsheet of some sort that tracks at a minimum the serial numbers and physical bay locations. A less scientific/reliable method on my HP DL380P to find the bay after things are running and I don&rsquo;t want to shut down:</p>
<ul>
<li>Offline the disk in question (see below), then check for the disk that doesn&rsquo;t light up when there&rsquo;s other activity. It helps to know which disks are in which ZFS pool.</li>
</ul>
</li>
</ul>
<h1 id="replacing-a-failed-disk">Replacing A Failed Disk</h1>
<h2 id="offline-the-failed-disk">Offline The Failed Disk</h2>
<ul>
<li>
<p>Make a note of the existing disk-id (or whatever name is assigned to the disk as far as the zpool is concerned)</p>
<ul>
<li>In the Proxmox UI, click the node, then Disks - ZFS and double click the pool to view the device breakdown</li>
<li>Or on the CLI: <code>zpool status [poolname]</code></li>
</ul>
</li>
<li>
<p>If a disk is not already offline and zpool status is degraded (for example you want to proactively replace a drive that keeps throwing SMART errors but hasn&rsquo;t completely failed), you can set the disk offline manually:</p>
<ul>
<li><code>zpool offline [poolname] [disk-id]</code></li>
</ul>
</li>
</ul>
<h2 id="replace-the-physical-disk">Replace The Physical Disk</h2>
<ul>
<li>If you are more comfortable powering off the server before swapping disks, this is the time to power off.</li>
<li>Pull offline disk out of the bay</li>
<li>Install new disk into the bay</li>
<li>Update spreadsheet/documentation with new disk information and bay location info.</li>
<li>If you powered off the server to replace the disks, now is the time to power back on.</li>
</ul>
<h2 id="initiate-resilvering">Initiate Resilvering</h2>
<ul>
<li>
<p>Confirm which drive letter the new disk is assigned to (should reuse the existing letter from the old disk):</p>
<ul>
<li>In Proxmox under the Node, then Disks, reload and check the device has the new serial number</li>
<li><code>smartctl -a /dev/sdc</code></li>
<li>If the new disk is assigned a new device letter like /dev/sdx then this would need to be modified in /etc/smartd.conf before reloading the daemon (see below)</li>
</ul>
</li>
<li>
<p>Get the new disk-id</p>
<ul>
<li><code>ls -a /dev/disk/by-id</code></li>
</ul>
</li>
<li>
<p>Import the new disk into the pool</p>
<ul>
<li><code>zpool replace -f [poolname] [old-disk-id] [new-disk-id]</code></li>
</ul>
</li>
</ul>
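<p>Putting those steps together, the replacement looks something like this (the pool name and disk-ids here are made up for illustration; use your own from <code>zpool status</code> and <code>/dev/disk/by-id</code>):</p>
<pre><code># find the by-id name of the newly inserted disk (here assumed to be /dev/sdc)
ls -l /dev/disk/by-id | grep sdc

# swap it in for the old disk and kick off resilvering
zpool replace -f tank ata-OLDDISK_SERIAL123 ata-NEWDISK_SERIAL456

# watch the resilvering progress
zpool status tank
</code></pre>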
<h2 id="reload-smart-monitoring">Reload SMART Monitoring</h2>
<ul>
<li>
<p>Reload the SMART monitoring daemon so it doesn&rsquo;t attribute new SMART values to the old disk model, and confirm it picked up the correct disks</p>
<ul>
<li>Update /etc/smartd.conf if the new disk was assigned to a new device letter</li>
<li><code>service smartmontools restart</code></li>
<li><code>tail -n 100 /var/log/syslog</code></li>
</ul>
</li>
</ul>
<h2 id="monitoring-resilvering-process">Monitoring Resilvering Process</h2>
<ul>
<li>
<p>Now that resilvering is started (or it should be), you can monitor the progress either through the Proxmox UI or on the CLI. The initial time estimate is not necessarily accurate, but be patient and allow it to complete before doing any other hardware maintenance.</p>
<ul>
<li>In Proxmox, click on the node, then Disks - ZFS, and double click on the storage pool. This displays the current resilvering progress and should show which disk is being replaced.</li>
<li>Or from the CLI: <code>zpool status [poolname]</code></li>
</ul>
</li>
</ul>
<h2 id="summary">Summary</h2>
<p>I&rsquo;ve done this procedure several times now without any issues. If you&rsquo;re not using the SMART daemon (but why not?) you could omit those steps.</p>
]]></content:encoded>
    </item>
    
    <item>
      <title>Docker Versus Virtual Machines</title>
      <link>https://blog.dalydays.com/post/docker-versus-virtual-machines/</link>
      <pubDate>Fri, 21 Feb 2020 00:00:00 +0000</pubDate>
      
      <guid>https://blog.dalydays.com/post/docker-versus-virtual-machines/</guid>
      <description>A dive into the difference between VMs and Docker containers</description>
      <content:encoded><![CDATA[<h2 id="about-virtual-machines">About Virtual Machines</h2>
<p>In the context of this article, &ldquo;virtual machine&rdquo; refers to a VirtualBox or VMware VM that runs a guest operating system. For example, I run Linux on a laptop and run Windows in a VirtualBox VM as a guest. Virtual machines in this context do not refer to the Java Virtual Machine or any definition other than a virtual host computer which runs a guest operating system.</p>
<h2 id="how-is-docker-different">How Is Docker Different?</h2>
<p>Docker is not a virtual machine like you might think of a VirtualBox VM. It&rsquo;s a tool for running isolated applications within containers based on images. Docker images are similar to ISO images in the sense that they contain a certain application state, but can&rsquo;t be changed. So you can build multiple identical containers based on a single image (or destroy and rebuild a container), just like you might use a single Windows ISO image to install multiple identical VirtualBox guests.</p>
<p>The difference is that Docker is designed to run a single application or service, while a virtual machine runs an entire operating system and potentially multiple applications. Sure, you can run a single application in a virtual machine like you might traditionally do with a web server. Install a Linux virtual machine and then install Apache/PHP. With a VM if you need a web server, you always start with a host OS such as Ubuntu server. Then you install updates, packages, dependencies, and finally environment config. Now you&rsquo;re ready to deploy.</p>
<p>What if you could build all of that extra update/install/config stuff into an image? You can do that with templates, but in order to maintain that template you have to go into the running system, make changes live, and then commit those new changes as a new version. Doing this stuff by hand is not easily reproducible and can lead to issues over time.</p>
<p>Docker can be used to solve a problem like this because a Docker image (kind of like a VM template) can be built using a Dockerfile. It can be reproduced or modified without making live changes on the system itself, but all defined in the Dockerfile that is used to build that image. Another benefit of this is that you can build someone else&rsquo;s image with nothing other than the Dockerfile (a human readable text file) and avoid having to transfer the actual image file.</p>
<p>Docker runs differently than virtual machines. A virtual machine is a complete virtual computer with its own kernel, operating system, etc. Docker is just an isolated environment, sharing the same kernel as the host machine, making it much faster and smaller to run. This is also important to keep in mind, as Docker is not always better than a VM, or the other way around. It always depends on what you need to do, and kernel sharing is not always ideal.</p>
<h2 id="basic-docker-components">Basic Docker Components</h2>
<p><strong>Image -</strong> A Docker image is a snapshot of a container, kind of similar to a live Linux ISO image or a VM template. It is unchangeable, and if you need to make changes you need to build a new image.</p>
<p><strong>Container -</strong> A container is a running instance of an image, kind of similar to a running instance of a live Linux ISO running inside a VM.</p>
<p><strong>Dockerfile -</strong> This is a human readable text file that is used to build images. At its most basic, you start with a FROM command to start with a base image (a different existing Docker image), then you make any modifications to the image, and finally define which command to run when the container is started (this can be a shell script, starting the Apache service, etc.). You can use the <code>docker build</code> tool to build an image based on this Dockerfile.</p>
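<p>As a sketch of what that looks like for the web server scenario above, a minimal Dockerfile can be just a couple of lines (the base image tag and source path here are one reasonable choice, not the only one):</p>
<pre><code># Dockerfile
FROM php:8-apache          # start from the official PHP + Apache image
COPY src/ /var/www/html/   # bake the application code into the image
# the base image already defines the command that starts Apache
</code></pre>
<p>Then <code>docker build -t myapp .</code> produces the image.</p>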
<p><strong>docker-compose -</strong> While you can use <code>docker run ...</code> to run containers, that is tedious at best and hard to remember/reproduce especially during testing. With docker-compose, you can define all your runtime settings (volume mapping, exposed ports, environment variables, which image you want to use, etc) in a file docker-compose.yml and then simply run <code>docker-compose up</code> to get it all running, and <code>docker-compose down</code> to break it all down.</p>
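<p>A docker-compose.yml capturing those runtime settings can be as small as this sketch (the image, port mapping, volume, and environment variable are illustrative):</p>
<pre><code>version: "3"
services:
  web:
    image: php:8-apache
    ports:
      - "8080:80"             # host port 8080 to container port 80
    volumes:
      - ./src:/var/www/html   # map local code into the container
    environment:
      - APP_ENV=dev           # example environment variable
</code></pre>
<p>With that file in place, <code>docker-compose up -d</code> starts everything and <code>docker-compose down</code> tears it all down.</p>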
<h2 id="workflow">Workflow</h2>
<p>I like to differentiate between development and production workflow with Docker. Since Docker is supposed to be a tool to make my life easier, then how is it any easier in Docker to start with a base OS, build on top of that, create an image, then finally run that image in a container? Isn&rsquo;t that the same as running a VM?</p>
<p>This is where Docker is very different than VMs. Starting out, you probably don&rsquo;t need to worry about Dockerfiles at all, at least not during development. Start with <a href="https://hub.docker.com/">Docker Hub</a>, find an image that will do what you need, and adjust volumes, ports and environment variables from there as needed. In other words, with a VM you start with a blank operating system and have to build it up. With Docker, you start with the application you need to run and all other dependencies are already in place.</p>
<p>For example, <a href="https://jellyfin.org/">Jellyfin</a> is a free software media system (kind of like a personal Netflix). You could, if you wanted to, create a new VM, install Ubuntu server, install Apache, etc. and eventually download and set up Jellyfin after configuring a virtual host and all that good stuff. How long did it take you just to get that VM up and running, let alone get everything installed? Or you can take their Docker image and just run it. All the work has been done, so no point reinventing the wheel.</p>
<h3 id="persistence">Persistence</h3>
<p>Unlike a VM, containers should be thought of as disposable. You should be ready to throw away a container and spin up a new one at any point in time. When a container is stopped and removed, everything within it is gone. So if there is any data you want to keep, that needs to be stored outside the container. That is done using volumes.</p>
<p>There are many ways to use volumes, but the simplest thing you can do starting out is just map a local directory as a volume so that changes to that directory are seen live in the container. In the VM world, this would be kind of like sharing a CIFS folder in the host operating system and mapping that in the guest operating system, except it&rsquo;s all set up automatically right when you run the container.</p>
<h3 id="development-versus-production">Development Versus Production</h3>
<p>Your development workflow probably looks different than your production workflow. For example, for a PHP web application, in production you are probably writing a Dockerfile that starts from a PHP base image, copies your code into the container (maybe into /var/www/html) and you build a custom image from that Dockerfile. This is your dockerized application that can be distributed and deployed.</p>
<p>In contrast, during development you are making changes constantly and want to test those changes. It would seem ridiculous to have to rebuild the image every time you make a change, and then restart the container based on the updated image. So what you would probably do here is run the base PHP container and map a volume from your local src directory to the container&rsquo;s /var/www/html directory. Then any changes you make to your src directory are instantly updated in the container and you don&rsquo;t have to rebuild anything.</p>
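<p>That development setup is a one-liner (the image tag and host port are again just an example):</p>
<pre><code># serve the local src directory live from the container; changes show up instantly
docker run --rm -p 8080:80 -v "$(pwd)/src":/var/www/html php:8-apache
</code></pre>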
<p>So why not just use this method for everything, instead of building custom Docker images? You should differentiate between state and data. Data should be mapped with a volume, and state should exist within the container. The reason for mapping volumes for data is obvious, you want to keep that data when the container gets destroyed. The reason for only keeping state inside the container is because that&rsquo;s part of the power of Docker. State exists within the container, and in order to change state (update application code, etc.) you only need to restart that container. The only changes to state are changes to application code, which should generally mean an updated image, and environment settings which are specified when you run the container, so all of that is reproducible.</p>
<h2 id="summary">Summary</h2>
<p>Hopefully this helps explain some of the ideas behind Docker and how it&rsquo;s different than virtual machines. To oversimplify, VMs virtualize an entire operating system, while Docker virtualizes an application.</p>
]]></content:encoded>
    </item>
    
    <item>
      <title>Learning Android Today</title>
      <link>https://blog.dalydays.com/post/learning-android-today/</link>
      <pubDate>Fri, 23 Aug 2019 00:00:00 +0000</pubDate>
      
      <guid>https://blog.dalydays.com/post/learning-android-today/</guid>
      <description>Thoughts about learning Android and how to approach it</description>
      <content:encoded><![CDATA[<p>Back in 2010 when I got my first Android phone (the original HTC Evo 4g, before LTE was a thing) I was interested in learning Android development. I started doing a little bit of reading through the documentation, but was quickly overwhelmed with trying to understand all the new terminology like Activities and Intents. I built a few sample apps and played around a bit with it, but lost interest. I wish I would have kept going with it.</p>
<p>Fast forward to 2017, I decided to commit to learning Android development professionally (in my limited free time) and after two years of spending little bits of time here and there (maybe an hour at a time) I finally feel like I have a good grasp on Android in general. I don&rsquo;t feel stuck anymore when I need to figure out how to do something. Now that I&rsquo;m at this point, I thought it would be interesting to go back and map out what would have been the most efficient way to get to this point. This will probably be a bunch of tips and advice that has already been given elsewhere, but hopefully this can help someone who is new to Android decide where to start and see a bigger picture of how all the pieces fit together with modern development and libraries.</p>
<p>Something I realized with Android architecture components is that often times you need to know one thing in order to use another thing, but the other thing can&rsquo;t be used by itself without knowing other concepts, and it was like running in circles sometimes trying to get a foothold on where to start. So this is the advice I would give my past self had I known what I know today about Android development.</p>
<h2 id="my-recommended-android-learning-path">My recommended Android learning path</h2>
<h3 id="general-thoughts">General Thoughts</h3>
<p>One does not simply &ldquo;learn Android.&rdquo; It takes a lot of time and persistence to become proficient and be able to make something useful. Android is a massive framework and I realize I&rsquo;ll never &ldquo;know&rdquo; Android. It&rsquo;s not like the good ol&rsquo; days where you could read a book on C and &ldquo;know&rdquo; the language. I did that after I graduated high school (C Primer Plus is a great book by the way), but even then it wasn&rsquo;t particularly useful. Lately my mindset is more of making continual progress by chipping away at these things. Even just writing this article I&rsquo;m seeing the vast difference in knowledge I have now compared to when I started seriously learning Android a couple years ago.</p>
<p>Aside from having a realistic mindset, it&rsquo;s very important to work on your own project. You will learn exponentially more than just running through books and tutorials. It can seem counterintuitive because you don&rsquo;t know what you need to know to get started or proceed, but that&rsquo;s what highlights your knowledge gaps and makes it obvious what you don&rsquo;t know but need to know. Sample projects are great, but should not be the only source of learning, and should not be the primary source of learning. I would recommend trying to limit tutorials and app walkthroughs except when you&rsquo;re trying to learn something that you will then implement in your own project. A lot of times I felt like I was learning but ultimately I couldn&rsquo;t remember how or why things worked and had to redo the tutorial anyway, when I was ready to actually use it for my app.</p>
<h3 id="kotlin">Kotlin</h3>
<p>This is one of the most important decisions I&rsquo;ve made: using Kotlin exclusively instead of Java. I was partial to Java since I had studied it extensively in the past and was (so I thought) very well versed in it. But once you start using Kotlin, it makes things a lot easier and faster: less typing, less thinking about the lower level details of how the language works, and more flowing through what your app needs to do. It is just less in the way of what you want to get done.</p>
<h3 id="activities">Activities</h3>
<p>An activity is a screen in an app. It&rsquo;s a basic building block of Android apps, and you have to have at least 1 activity (nowadays a common practice is only having 1 Activity and switching out what you see on the screen within that Activity using Fragments).</p>
<h3 id="fragments">Fragments</h3>
<p>Fragments are like Activities but can be swapped out within an Activity, or shown side by side in an Activity like in a master/detail type interface. I&rsquo;ve decided generally to work with Fragments and forgo multiple Activities because of their flexibility and tools like Navigation that makes the overall development experience a lot easier to brain (for me).</p>
<h3 id="constraint-layout">Constraint Layout</h3>
<p>Design is not my strong suit, so generally Constraint layouts are now the best way to get flexible layouts that are easy to assemble and work well with drag and drop. It can help reduce a lot of complexity where you used to have to put more consideration into Relative layouts and/or combinations of Linear layouts for example. Basically you tend to end up with less nesting and more efficient layouts. Easier to use + more efficient = my preference.</p>
<h3 id="lambdas">Lambdas</h3>
<p>Not specific to Android, but relevant. I came from a Java background, but this was before Java had lambdas. It took me a minute (actually more) to understand what lambdas were and why I should use them. Now I can&rsquo;t imagine going back to manually instantiating inner classes or even just anonymous inner classes, when a lambda can be had in just a couple lines of code and is so much easier to read. This is one of those things where I struggled with understanding how it worked and had to stop worrying about the low level details and just figure out how useful they were and how to use them effectively. The more you use them the more sense it makes.</p>
<h3 id="lifecycles">Lifecycles</h3>
<p>It&rsquo;s important to understand Activity and Fragment lifecycles (create, resume, start, pause, stop, destroy). You shouldn&rsquo;t necessarily have to handle everything lifecycle related all the time, but it&rsquo;s especially important to know in many situations. There might be something you want to save whenever the user leaves an app, so you need to know when to handle that operation. There also might be &ldquo;weird&rdquo; issues you come across and understanding the lifecycle will help you track down the issue.</p>
<h3 id="recyclerview--listadapter">RecyclerView + ListAdapter</h3>
<p>RecyclerView is something you will use in many apps, and it&rsquo;s important to get the hang of it sooner or later. It can seem overwhelming up front, but ultimately you have a list view + layout, a list item + layout, data that goes into the list, and an adapter that ties the actual data to the actual view. Everything in between is for the most part a lot of copy and paste from another implementation. RecyclerView on its own was already good, but you had to worry about refreshing the view when data changed and choosing the right approach, which could affect performance. Adding in ListAdapter, a bunch of that is handled automatically, like working out what changed when the data updates, so you get nice animations built in, and updating is more efficient because only the items that changed are refreshed.</p>
<h3 id="navigation">Navigation</h3>
<p>Now that you&rsquo;re working with Activities and Fragments, learn Navigation because it takes away much of the complexity of manually loading up intents and passing data around. You can have a graphical representation of your screens and how they connect to get a visual overview of the app flow.</p>
<h3 id="data-binding">Data Binding</h3>
<p>Tie your data to your views directly in the XML layout file and stop worrying about manually handling data changes/updates/refreshes at various points in time (or forgetting to code in another refresh when you make changes to your app later on).</p>
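<p>A minimal sketch of what that looks like in a layout file (the <code>viewmodel</code> variable and its <code>title</code> field are hypothetical examples):</p>

```xml
<layout xmlns:android="http://schemas.android.com/apk/res/android">
    <data>
        <!-- Expose a variable the layout can bind to -->
        <variable name="viewmodel" type="com.example.TaskViewModel" />
    </data>

    <TextView
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:text="@{viewmodel.title}" />
</layout>
```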
<h3 id="live-data">Live Data</h3>
<p>LiveData is a way to make your data &ldquo;observable,&rdquo; meaning you can listen for changes in the data and react to them without writing much code. You basically wrap normal data inside LiveData, and it lets you set up an observer (like a listener) that gets notified whenever the data changes. This also works nicely with data binding, so your screen gets updated when the data changes regardless of whether the user caused the change or something on the back end changed it.</p>
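<p>A minimal sketch (androidx.lifecycle, running inside an Activity or Fragment; <code>countTextView</code> is a hypothetical view in your layout):</p>

```java
// Wrap a plain Integer in LiveData, with an initial value of 0.
MutableLiveData<Integer> count = new MutableLiveData<>(0);

// The lambda runs every time the value changes (while this screen is active).
count.observe(this, value -> countTextView.setText(String.valueOf(value)));

// Updating the value from the main thread notifies all observers automatically.
count.setValue(count.getValue() + 1);
```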
<h3 id="coroutines">Coroutines</h3>
<p>This is something that will &ldquo;unlock&rdquo; your ability to move forward with other things such as databases and networking. Without a way to do work off the main thread, many things are simply impossible, and even the things you can do will make your app&rsquo;s performance suffer. In my experience, coroutines are a very straightforward way of doing background work, which is both useful and necessary in different situations. Forget about AsyncTasks and loaders and everything else; coroutines are easy to learn, and you can get more complex over time as needed.</p>
<h3 id="room">Room</h3>
<p>Make sure to learn coroutines before attempting Room, otherwise you&rsquo;ll end up doing very bad things like forcing Android to run queries on the main thread, which you should never do unless you like living on the edge (Room will even let you opt into that, but it&rsquo;s usually better not to play with fire). Basically any time you need to store data for your app (such as for a to do list), you&rsquo;ll want a SQLite database; I started blogging about the old way of doing SQLite databases in Android. There are plenty of ORMs out there (libraries that make it easier to save and retrieve data from a local database), but Room is the official Android one, and it is very nice to use compared to SQLiteOpenHelper, CursorAdapters, and ContentProviders. You lost me at &ldquo;SQLiteOpenHelper,&rdquo; right? Exactly. With Room, you just make your model objects (entities), make a Dao interface which defines the database CRUD operations, and set up a Room database object (mostly copy and paste to get a singleton object) and you&rsquo;re good to go.</p>
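<p>Here&rsquo;s a rough sketch of those three pieces (annotations come from androidx.room; the class and column names are just examples, not from a real project):</p>

```java
// Entity: one table, one row per Product.
@Entity
public class Product {
    @PrimaryKey(autoGenerate = true)
    public long id;
    public String name;
    public double price;
}

// Dao: the CRUD operations, written as annotated methods.
@Dao
public interface ProductDao {
    @Insert
    void insert(Product product);

    @Query("SELECT * FROM Product ORDER BY name")
    List<Product> getAll();
}

// Database: mostly boilerplate; build one instance as a singleton via
// Room.databaseBuilder(...) and call the Dao off the main thread.
@Database(entities = {Product.class}, version = 1)
public abstract class AppDatabase extends RoomDatabase {
    public abstract ProductDao productDao();
}
```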
<h3 id="architecture">Architecture</h3>
<p>This can seem complex, but once you start using an architecture it makes a lot more sense. I would recommend starting with MVVM since that&rsquo;s what Google seems to be pushing; it works well with data binding and makes handling lifecycle changes a lot easier. However, I would recommend <strong>not</strong> getting into architecture until you are comfortably getting apps running just using Activities and/or Fragments. You&rsquo;ll want it once you start building &ldquo;real&rdquo; apps that you have to support beyond the initial creation. This is one of those things that separates the noobs from the pros. More generally, start looking at design patterns and following best practices. This takes time and experience to get good at, but ultimately most computer science problems have been solved already, and it&rsquo;s just a matter of recognizing the problem and applying an appropriate pattern as a solution. I would say watch out for anti-patterns, but that&rsquo;s a deeper topic and requires some understanding of <strong>regular</strong> patterns first. This is a rabbit hole to watch out for.</p>
<h3 id="viewmodel">ViewModel</h3>
<p>This is actually used as part of the MVVM architecture, but on its own it&rsquo;s worth understanding what it is and what benefits it provides. When used correctly, you no longer have to worry about things like screen rotation and manually saving/reloading data across it. However, this is one of the harder things to learn up front, so don&rsquo;t worry about ViewModels or MVVM architecture until you are comfortable building simpler apps.</p>
<h2 id="other-important-concepts-that-i-need-to-start-practicing">Other important concepts that I need to start practicing</h2>
<h3 id="testing">Testing</h3>
<p>Testing is extremely important in production apps that need to be maintained and expanded over time. While there&rsquo;s some debate, test driven development is an excellent workflow for developing any software because of benefits like catching bugs early (make a change, a test fails, and since you know what you just changed it&rsquo;s easy to find the problem), catching regressions, and having complete, evolving documentation that always matches your production code (tests are good examples of how to call your functions). This isn&rsquo;t something I do at this point, but my goal is to use test driven development for any production grade development I do in the future.</p>
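<p>As a tiny plain-Java illustration of the rhythm (on Android you&rsquo;d normally use JUnit, but the idea is the same): write failing checks first, then write just enough code to make them pass. The checks then double as documentation of how the function is called. <code>formatCents</code> is a made-up example:</p>

```java
public class PriceFormatterTest {
    // The function under test: written *after* the checks below were failing.
    static String formatCents(int cents) {
        return String.format("$%d.%02d", cents / 100, cents % 100);
    }

    static void check(boolean condition, String message) {
        if (!condition) throw new AssertionError(message);
    }

    public static void main(String[] args) {
        check(formatCents(0).equals("$0.00"), "zero");
        check(formatCents(5).equals("$0.05"), "pads single-digit cents");
        check(formatCents(1250).equals("$12.50"), "dollars and cents");
        System.out.println("All tests passed");
    }
}
```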
<h3 id="networking">Networking</h3>
<p>Networking is used constantly; try to think of an app that is 100% functional without an internet connection. Just like RecyclerView, networking shows up in almost every app, whether for consuming APIs or for things like cloud sync and online copies of data (Firebase). It&rsquo;s another concept that requires doing background work, so learn coroutines before getting too far into networking, or else your app is going to be uninstalled.</p>
<h3 id="json-library">JSON Library</h3>
<p>This kind of goes along with networking, although you can certainly work with JSON data from a local database or flat file. JSON tends to be the format of choice for consuming APIs, and it maps nicely to data objects. You don&rsquo;t actually need to work with the JSON directly; there are a bunch of libraries that handle the mapping for you, such as Gson and Moshi. I think Google has been pointing people toward Moshi lately, so that&rsquo;s what I&rsquo;ll be using when I get more into the API side of things.</p>
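<p>For a rough idea of what that looks like with Moshi (the <code>User</code> class and the JSON string here are made-up examples; the Moshi calls are the library&rsquo;s actual API):</p>

```java
// Gradle dependency: com.squareup.moshi:moshi
public class User {
    public String name;
    public int age;
}

Moshi moshi = new Moshi.Builder().build();
JsonAdapter<User> adapter = moshi.adapter(User.class);
// fromJson declares IOException, so call it from code that can handle that.
User user = adapter.fromJson("{\"name\": \"Eric\", \"age\": 30}");
```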
<h3 id="image-loading-library">Image Loading Library</h3>
<p>A couple of big names are Picasso and Glide. Basically an image loading library handles all the technical details of decoding media, caching, and other things and makes it easy to do something like use the URL of an image from JSON and display it in an ImageView. There&rsquo;s a lot that goes on behind the scenes, but unless you&rsquo;re interested in developing/maintaining one of these libraries yourself, don&rsquo;t worry about the details and just use the tools provided.</p>
<h3 id="design">Design</h3>
<p>Material Design is a set of guidelines for designing apps that are easy to use, look good, and have a good flow, and there&rsquo;s also a component library that implements it so your app can follow the guidelines without much work. Design is one of my weaknesses, so I really struggle in this area (I know when something doesn&rsquo;t look good, but I struggle to make it better).</p>
<h2 id="conclusion">Conclusion</h2>
<p>I&rsquo;m probably missing some things that other people would find important, and including things they would leave out. The Android ecosystem is massive, and there are limitless rabbit holes to go down. In my opinion this is pretty much the core list of things to know as a general Android developer, and you can pick up what you need as you go from here. I hope this information is helpful to someone, and I hope to dig into more detail by going through the process from start to finish with the app I&rsquo;m close to finishing. It uses modern Android development practices, and it&rsquo;s a to do list that lets you schedule when a checked item returns to the unchecked list, notifying you that it needs to be done again at that time. It sounds simple enough, but a lot goes into it, and this is my first &ldquo;real&rdquo; app that has personal value to me and that I plan to maintain in the future.</p>
]]></content:encoded>
    </item>
    
    <item>
      <title>From Virtual Machines to Docker Stacks</title>
      <link>https://blog.dalydays.com/post/from-virtual-machines-to-docker-stacks/</link>
      <pubDate>Wed, 22 Aug 2018 00:00:00 +0000</pubDate>
      
      <guid>https://blog.dalydays.com/post/from-virtual-machines-to-docker-stacks/</guid>
      <description>Going from everything in a VM to using Docker containers to run services</description>
      <content:encoded><![CDATA[<p>I&rsquo;ve been attempting to learn Docker with the goal of using it in production at home and at work. I see several benefits to doing this:</p>
<ul>
<li>Containers are less tied to the host OS compared to packages available in the repository. For example, in Ubuntu 16.04 you may not be able to get an old enough or new enough version of Redis to satisfy a requirement for your web application, leading to changing the host OS or compiling yourself (time consuming and difficult to update).</li>
<li>Containers don&rsquo;t (shouldn&rsquo;t) hold state. They are disposable which is good for updating and if something goes wrong.</li>
<li>Containers should be easy to work with. I can write a .yml file defining containers and parameters, and then run or &ldquo;deploy&rdquo; the stack using a single command. This is light years ahead of manually installing packages and configuring them manually.</li>
<li>Using containers should save a lot of time overall, even counting the time it takes to define the .yml file and get everything set up.</li>
<li>Once a .yml file is defined, it&rsquo;s easy to reuse this in another environment, and easy to alter requirements.</li>
</ul>
<p>I&rsquo;m sure there are more things, but that&rsquo;s just off the top of my head.</p>
<p>Now in my experience, what I found was that there is a plethora of Docker tutorials that get you going from scratch. It&rsquo;s easy to run individual containers, and sometimes they tell you about mapping volumes or mapping ports so you can access the container from the host. From there it seems like they go directly to Docker Swarm and Kubernetes, along with deploying to a cloud platform like AWS.</p>
<p>What about me? I want to deploy to my own ESXi environment, but I&rsquo;m not interested in hosting my own Kubernetes platform because of the maintenance burden. I also want to take advantage of all the features Docker provides, and if I have to run individual containers one at a time using <code>docker run</code> then it&rsquo;s a no-go for me. Oftentimes, especially with web applications, you&rsquo;ll have several different pieces of software working together. You could build a custom Docker image with exactly the pieces you need, but this would be difficult to maintain and only makes sense if that software is something you are developing. If you just want to run other people&rsquo;s software, such as Nextcloud with a MariaDB backend and Redis caching, you&rsquo;re gonna want to run these as individual &ldquo;services.&rdquo;</p>
<p>I started looking at docker-compose because it seems to be capable of doing exactly what I want: define a stack of services in docker-compose.yml, then deploy it using a single command. However, in the documentation they kind of gloss over docker-compose and suggest going directly into Docker Swarm. Hold on a minute, I don&rsquo;t care about Docker Swarm or clustering. Or do I?</p>
<p>It turns out Docker Swarm is exactly as easy as docker-compose. I guess docker-compose is more geared towards local development and testing, but we should all be using Docker Swarm in production. Basically we write the same docker-compose.yml file, but run a different command. Instead of <code>docker-compose up -d</code> we just run <code>docker stack deploy</code>.</p>
<p>Ok, but why do I care about that? Docker Swarm makes it possible to scale up a service if you ever need to, so if one front end web server isn&rsquo;t enough, just tell it to run 5 replicas and it will spin them up and load balance across them for you (automating that based on load takes extra tooling, but the mechanism is there). The other thing that caught my attention was that with Docker Swarm, you have a manager node and then you can add workers. I can install an Ubuntu VM just for running Docker stacks on one physical host, then install another Ubuntu VM on another host, and let&rsquo;s say one more for good measure. The second and third hosts join the swarm, and all of a sudden not only do I have load balancing, but I have failover across physical hosts, kind of like vMotion if you come from a VMware world. For free.</p>
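<p>As a sketch of what that looks like for the Nextcloud example above (image names are real, but treat the ports, passwords, and environment variables as placeholders to adapt; real deployments also need volumes for persistent data), a docker-compose.yml along these lines can be deployed with <code>docker stack deploy -c docker-compose.yml nextcloud</code>:</p>

```yaml
version: "3.7"
services:
  app:
    image: nextcloud
    ports:
      - "8080:80"
    environment:
      - MYSQL_HOST=db          # abbreviated; see the image docs for the full set
    deploy:
      replicas: 1              # bump this to scale the front end across the swarm
  db:
    image: mariadb
    environment:
      - MYSQL_ROOT_PASSWORD=changeme   # use Docker secrets for real deployments
  cache:
    image: redis
```

The same file works locally with <code>docker-compose up -d</code>; the <code>deploy</code> section is only honored by <code>docker stack deploy</code>.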
]]></content:encoded>
    </item>
    
    <item>
      <title>Let&#39;s Make Lights Out for Android Part 1 - Creating the Layout</title>
      <link>https://blog.dalydays.com/post/lets-make-lights-out-for-android-part-1-creating-the-layout/</link>
      <pubDate>Fri, 02 Feb 2018 00:00:00 +0000</pubDate>
      
      <guid>https://blog.dalydays.com/post/lets-make-lights-out-for-android-part-1-creating-the-layout/</guid>
      <description>A quick start to create a layout for a Lights Out clone</description>
      <content:encoded><![CDATA[<p>I&rsquo;ve always enjoyed the classic game Lights Out by Tiger Electronics. It&rsquo;s a fairly simple idea: there is a grid of 25 buttons which can be lit or not lit. When you press a button, it just toggles the light for the button you pressed plus any adjacent buttons. So you would start with a randomized board (or predetermined pattern), and the goal is to turn off all the lights while pressing the fewest buttons possible.</p>
<p>Ok, maybe I&rsquo;m one of only 4 people who think this is cool, but either way it&rsquo;s a good starter project, so let&rsquo;s make it for Android!</p>
<p>The end goal here is to demonstrate how to create a simple game from scratch, add some features, refactor the code, and eventually release it on the Google Play store. I&rsquo;ll walk through every step start to finish, so you should be able to follow along even if you&rsquo;re new to Android or Java.</p>
<p>For reference, <a href="https://github.com/linucksrox/AndroidLogicPuzzle">here is the current state of the project on Github.</a></p>
<h2 id="1-create-a-new-project">1. Create a new project</h2>
<ul>
<li>
<p>Click Start a new Android Studio project.</p>
</li>
<li>
<p>Set up your project.</p>
<ol>
<li>Name your app. I&rsquo;ll go with Lights Out.</li>
<li>Your personal/business domain (can be anything, but should be unique so that the app&rsquo;s name doesn&rsquo;t conflict with an app with the same name in places like the Play store).</li>
<li>Where are you storing your project? I recommend creating a folder for Android projects.</li>
<li>Click Next.</li>
</ol>
</li>
<li>
<p>Unless you have a specific use case, just use the default settings for Target Android Devices. Click Next.</p>
</li>
<li>
<p>We just need an empty activity for now, so click Next.</p>
</li>
<li>
<p>I&rsquo;ll use the defaults for now. As apps get more complicated we start thinking of better names to organize everything. For now, keep the defaults and click Finish.</p>
</li>
<li>
<p>Wait while Android Studio builds the project.</p>
</li>
<li>
<p>Expand app/java/com.domain.appname and app/res/layout to find the MainActivity.java and activity_main.xml files to get started.</p>
</li>
</ul>
<h2 id="2-create-a-layout">2. Create a layout</h2>
<p>Our layout is going to be very basic at this point, because we just want to show a grid of buttons that we can start interacting with. We&rsquo;ll focus on tweaking the layout later on after implementing some of the functionality. I like to focus on getting things functional before worrying too much about how it looks.</p>
<p>As you might know already, there are a bazillion ways to do layouts, and I&rsquo;m just focused on showing something that&rsquo;s very simple for two reasons:</p>
<ol>
<li>It should be simple to understand.</li>
<li>It shouldn&rsquo;t take too much effort to feel like you&rsquo;re making progress.</li>
</ol>
<ul>
<li>Open the provided layout file activity_main.xml. Android Studio starts you out with a single TextView inside a layout, but we&rsquo;ll completely replace that with our own layout. There are a few things to note about it:</li>
</ul>
<ol>
<li>The outer RelativeLayout uses a vertical orientation, which means that each element can be placed relative to the outer edges and relative to each other. In this case, every inner LinearLayout represents one row of buttons from top to bottom.</li>
<li>The outer RelativeLayout has layout_gravity set to center_horizontal, meaning the entire grid of buttons will be centered on the screen horizontally.</li>
<li>The inner LinearLayouts (each row of buttons) have their orientation set to horizontal, meaning each element inside of them (each button) is placed left to right.</li>
<li>We set the layout_width and layout_height of every button to the same value - buttonWidth - so that every button is a square. We might change this later on so that buttons dynamically shrink or expand depending on screen size, but this method is a little easier for now.</li>
<li>The button background is set to whatever color we assign to colorLightOff, which will be light gray.</li>
<li>The button layout_margin is set to the value of buttonSpacing, which just separates each button by a little bit so they&rsquo;re not all right next to each other.</li>
</ol>
<p>We&rsquo;ll define the dimensions and colors next, so don&rsquo;t worry about any errors you see after pasting this into your layout.</p>
<p>{% gist 2a5e598ba67558385718a27a2f8ab489 layout_main.xml %}</p>
<ul>
<li>
<p>You probably noticed that there are some values we just referenced that don&rsquo;t exist, so we need to create those. We need to define the buttonWidth and buttonSpacing dimensions and the colorLightOn/colorLightOff colors. Starting with the dimensions, right click on app-&gt;res-&gt;values, and click New-&gt;Values resource file. Name it dimensions and click OK.</p>
</li>
<li>
<p>Open dimensions.xml (the one you just created), and add a dimension element named buttonWidth, setting the value to 60dp. Also add buttonSpacing, setting its value to 2dp. You can tweak these to your liking a little bit later.</p>
</li>
</ul>
<p>{% gist 2787528b39b70cbda9bfe6d5685bef17 dimensions.xml %}</p>
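<p>In case the embedded gist doesn&rsquo;t load, the resulting dimensions.xml is small; this matches the values described above:</p>

```xml
<?xml version="1.0" encoding="utf-8"?>
<resources>
    <!-- Size of each square button; tweak to taste -->
    <dimen name="buttonWidth">60dp</dimen>
    <!-- Margin around each button so they aren't touching -->
    <dimen name="buttonSpacing">2dp</dimen>
</resources>
```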
<ul>
<li>Next, we need to define some colors for the lights when they&rsquo;re on or off. Open app-&gt;res-&gt;values-&gt;colors.xml (this was already provided for you when you created your project). Add the two colors named colorLightOn and colorLightOff, leaving the existing colors which are used in other parts of the app.</li>
</ul>
<p>{% gist ca46e1ae8aebbb28d833dd8191c31fab colors.xml %}</p>
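<p>Again in case the gist doesn&rsquo;t load, colors.xml gains two entries. The hex values below are placeholder guesses (light gray for off, any bright color for on); check the gist for the originals:</p>

```xml
<?xml version="1.0" encoding="utf-8"?>
<resources>
    <!-- keep the existing colorPrimary etc. entries and add these two -->
    <color name="colorLightOff">#CCCCCC</color>
    <color name="colorLightOn">#FFEB3B</color>
</resources>
```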
<ul>
<li>Back in the layout preview or layout design mode, you should now see the grid pattern.</li>
</ul>
<p>Congratulations! You made a button grid layout, and you can tweak the colors or dimensions however you want before continuing on to the next part where we&rsquo;ll hook up the logic and actually toggle the lights on and off when you press the buttons.</p>
<p>Leave a comment if you have any questions or want me to elaborate on anything.</p>
]]></content:encoded>
    </item>
    
    <item>
      <title>Databases In Android - Part 4</title>
      <link>https://blog.dalydays.com/post/databases-in-android-part-4/</link>
      <pubDate>Fri, 23 Jun 2017 00:00:00 +0000</pubDate>
      
      <guid>https://blog.dalydays.com/post/databases-in-android-part-4/</guid>
      <description>Part 4 of a mostly code look at using SQLite in Android (not using Room)</description>
      <content:encoded><![CDATA[<p>OK so in our Product sample, the ContentProvider looks like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-java" data-lang="java"><span class="line"><span class="cl"><span class="kd">public</span><span class="w"> </span><span class="kd">class</span> <span class="nc">ProductProvider</span><span class="w"> </span><span class="kd">extends</span><span class="w"> </span><span class="n">ContentProvider</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">public</span><span class="w"> </span><span class="kd">static</span><span class="w"> </span><span class="kd">final</span><span class="w"> </span><span class="n">String</span><span class="w"> </span><span class="n">LOG_TAG</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ProductProvider</span><span class="p">.</span><span class="na">class</span><span class="p">.</span><span class="na">getSimpleName</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">private</span><span class="w"> </span><span class="n">ProductHelper</span><span class="w"> </span><span class="n">mDbHelper</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Set up URI matcher codes</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">private</span><span class="w"> </span><span class="kd">static</span><span class="w"> </span><span class="kd">final</span><span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">PRODUCTS</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">100</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">private</span><span class="w"> </span><span class="kd">static</span><span class="w"> </span><span class="kd">final</span><span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">PRODUCT_ID</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">101</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">private</span><span class="w"> </span><span class="kd">static</span><span class="w"> </span><span class="kd">final</span><span class="w"> </span><span class="n">UriMatcher</span><span class="w"> </span><span class="n">sUriMatcher</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">new</span><span class="w"> </span><span class="n">UriMatcher</span><span class="p">(</span><span class="n">UriMatcher</span><span class="p">.</span><span class="na">NO_MATCH</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">static</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">sUriMatcher</span><span class="p">.</span><span class="na">addURI</span><span class="p">(</span><span class="n">ProductContract</span><span class="p">.</span><span class="na">CONTENT_AUTHORITY</span><span class="p">,</span><span class="w"> </span><span class="n">ProductContract</span><span class="p">.</span><span class="na">PATH_PRODUCT</span><span class="p">,</span><span class="w"> </span><span class="n">PRODUCTS</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">sUriMatcher</span><span class="p">.</span><span class="na">addURI</span><span class="p">(</span><span class="n">ProductContract</span><span class="p">.</span><span class="na">CONTENT_AUTHORITY</span><span class="p">,</span><span class="w"> </span><span class="n">ProductContract</span><span class="p">.</span><span class="na">PATH_PRODUCT</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="s">&#34;/#&#34;</span><span class="p">,</span><span class="w"> </span><span class="n">PRODUCT_ID</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nd">@Override</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">public</span><span class="w"> </span><span class="kt">boolean</span><span class="w"> </span><span class="nf">onCreate</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">mDbHelper</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">new</span><span class="w"> </span><span class="n">ProductHelper</span><span class="p">(</span><span class="n">getContext</span><span class="p">());</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">return</span><span class="w"> </span><span class="kc">true</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nd">@Nullable</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nd">@Override</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">public</span><span class="w"> </span><span class="n">Cursor</span><span class="w"> </span><span class="nf">query</span><span class="p">(</span><span class="nd">@NonNull</span><span class="w"> </span><span class="n">Uri</span><span class="w"> </span><span class="n">uri</span><span class="p">,</span><span class="w"> </span><span class="nd">@Nullable</span><span class="w"> </span><span class="n">String</span><span class="o">[]</span><span class="w"> </span><span class="n">projection</span><span class="p">,</span><span class="w"> </span><span class="nd">@Nullable</span><span class="w"> </span><span class="n">String</span><span class="w"> </span><span class="n">selection</span><span class="p">,</span><span class="w"> </span><span class="nd">@Nullable</span><span class="w"> </span><span class="n">String</span><span class="o">[]</span><span class="w"> </span><span class="n">selectionArgs</span><span class="p">,</span><span class="w"> </span><span class="nd">@Nullable</span><span class="w"> </span><span class="n">String</span><span class="w"> </span><span class="n">sortOrder</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">SQLiteDatabase</span><span class="w"> </span><span class="n">database</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">mDbHelper</span><span class="p">.</span><span class="na">getReadableDatabase</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">Cursor</span><span class="w"> </span><span class="n">cursor</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kt">int</span><span class="w"> </span><span class="n">match</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">sUriMatcher</span><span class="p">.</span><span class="na">match</span><span class="p">(</span><span class="n">uri</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">switch</span><span class="w"> </span><span class="p">(</span><span class="n">match</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="k">case</span><span class="w"> </span><span class="n">PRODUCTS</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="n">cursor</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">database</span><span class="p">.</span><span class="na">query</span><span class="p">(</span><span class="n">ProductEntry</span><span class="p">.</span><span class="na">TABLE_NAME</span><span class="p">,</span><span class="w"> </span><span class="n">projection</span><span class="p">,</span><span class="w"> </span><span class="n">selection</span><span class="p">,</span><span class="w"> </span><span class="n">selectionArgs</span><span class="p">,</span><span class="w"> </span><span class="kc">null</span><span class="p">,</span><span class="w"> </span><span class="kc">null</span><span class="p">,</span><span class="w"> </span><span class="n">sortOrder</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="k">break</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="k">case</span><span class="w"> </span><span class="n">PRODUCT_ID</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="n">selection</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ProductEntry</span><span class="p">.</span><span class="na">_ID</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="s">&#34;=?&#34;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="n">selectionArgs</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">new</span><span class="w"> </span><span class="n">String</span><span class="o">[]</span><span class="w"> </span><span class="p">{</span><span class="n">String</span><span class="p">.</span><span class="na">valueOf</span><span class="p">(</span><span class="n">ContentUris</span><span class="p">.</span><span class="na">parseId</span><span class="p">(</span><span class="n">uri</span><span class="p">))};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="n">cursor</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">database</span><span class="p">.</span><span class="na">query</span><span class="p">(</span><span class="n">ProductEntry</span><span class="p">.</span><span class="na">TABLE_NAME</span><span class="p">,</span><span class="w"> </span><span class="n">projection</span><span class="p">,</span><span class="w"> </span><span class="n">selection</span><span class="p">,</span><span class="w"> </span><span class="n">selectionArgs</span><span class="p">,</span><span class="w"> </span><span class="kc">null</span><span class="p">,</span><span class="w"> </span><span class="kc">null</span><span class="p">,</span><span class="w"> </span><span class="n">sortOrder</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="k">break</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="k">default</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="k">throw</span><span class="w"> </span><span class="k">new</span><span class="w"> </span><span class="n">IllegalArgumentException</span><span class="p">(</span><span class="s">&#34;Cannot query unknown URI &#34;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">uri</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">cursor</span><span class="p">.</span><span class="na">setNotificationUri</span><span class="p">(</span><span class="n">getContext</span><span class="p">().</span><span class="na">getContentResolver</span><span class="p">(),</span><span class="w"> </span><span class="n">uri</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">return</span><span class="w"> </span><span class="n">cursor</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nd">@Nullable</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nd">@Override</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">public</span><span class="w"> </span><span class="n">String</span><span class="w"> </span><span class="nf">getType</span><span class="p">(</span><span class="nd">@NonNull</span><span class="w"> </span><span class="n">Uri</span><span class="w"> </span><span class="n">uri</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">final</span><span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">match</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">sUriMatcher</span><span class="p">.</span><span class="na">match</span><span class="p">(</span><span class="n">uri</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">switch</span><span class="w"> </span><span class="p">(</span><span class="n">match</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="k">case</span><span class="w"> </span><span class="n">PRODUCTS</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="k">return</span><span class="w"> </span><span class="n">ProductEntry</span><span class="p">.</span><span class="na">PRODUCT_LIST_TYPE</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="k">case</span><span class="w"> </span><span class="n">PRODUCT_ID</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="k">return</span><span class="w"> </span><span class="n">ProductEntry</span><span class="p">.</span><span class="na">PRODUCT_ITEM_TYPE</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="k">default</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="k">throw</span><span class="w"> </span><span class="k">new</span><span class="w"> </span><span class="n">IllegalStateException</span><span class="p">(</span><span class="s">&#34;Unknown URI &#34;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">uri</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="s">&#34; with match &#34;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">match</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nd">@Nullable</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nd">@Override</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">public</span><span class="w"> </span><span class="n">Uri</span><span class="w"> </span><span class="nf">insert</span><span class="p">(</span><span class="nd">@NonNull</span><span class="w"> </span><span class="n">Uri</span><span class="w"> </span><span class="n">uri</span><span class="p">,</span><span class="w"> </span><span class="nd">@Nullable</span><span class="w"> </span><span class="n">ContentValues</span><span class="w"> </span><span class="n">contentValues</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">final</span><span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">match</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">sUriMatcher</span><span class="p">.</span><span class="na">match</span><span class="p">(</span><span class="n">uri</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">switch</span><span class="w"> </span><span class="p">(</span><span class="n">match</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="k">case</span><span class="w"> </span><span class="n">PRODUCTS</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="k">return</span><span class="w"> </span><span class="n">insertProduct</span><span class="p">(</span><span class="n">uri</span><span class="p">,</span><span class="w"> </span><span class="n">contentValues</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="k">default</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="k">throw</span><span class="w"> </span><span class="k">new</span><span class="w"> </span><span class="n">IllegalArgumentException</span><span class="p">(</span><span class="s">&#34;Insertion is not supported for &#34;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">uri</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">private</span><span class="w"> </span><span class="n">Uri</span><span class="w"> </span><span class="nf">insertProduct</span><span class="p">(</span><span class="n">Uri</span><span class="w"> </span><span class="n">uri</span><span class="p">,</span><span class="w"> </span><span class="n">ContentValues</span><span class="w"> </span><span class="n">values</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">SQLiteDatabase</span><span class="w"> </span><span class="n">db</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">mDbHelper</span><span class="p">.</span><span class="na">getWritableDatabase</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kt">long</span><span class="w"> </span><span class="n">id</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">db</span><span class="p">.</span><span class="na">insert</span><span class="p">(</span><span class="n">ProductEntry</span><span class="p">.</span><span class="na">TABLE_NAME</span><span class="p">,</span><span class="w"> </span><span class="kc">null</span><span class="p">,</span><span class="w"> </span><span class="n">values</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">getContext</span><span class="p">().</span><span class="na">getContentResolver</span><span class="p">().</span><span class="na">notifyChange</span><span class="p">(</span><span class="n">uri</span><span class="p">,</span><span class="w"> </span><span class="kc">null</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">return</span><span class="w"> </span><span class="n">ContentUris</span><span class="p">.</span><span class="na">withAppendedId</span><span class="p">(</span><span class="n">uri</span><span class="p">,</span><span class="w"> </span><span class="n">id</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nd">@Override</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">public</span><span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="nf">delete</span><span class="p">(</span><span class="nd">@NonNull</span><span class="w"> </span><span class="n">Uri</span><span class="w"> </span><span class="n">uri</span><span class="p">,</span><span class="w"> </span><span class="nd">@Nullable</span><span class="w"> </span><span class="n">String</span><span class="w"> </span><span class="n">whereClause</span><span class="p">,</span><span class="w"> </span><span class="nd">@Nullable</span><span class="w"> </span><span class="n">String</span><span class="o">[]</span><span class="w"> </span><span class="n">whereArgs</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// Get writable database</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">SQLiteDatabase</span><span class="w"> </span><span class="n">database</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">mDbHelper</span><span class="p">.</span><span class="na">getWritableDatabase</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kt">int</span><span class="w"> </span><span class="n">numRowsDeleted</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">0</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">final</span><span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">match</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">sUriMatcher</span><span class="p">.</span><span class="na">match</span><span class="p">(</span><span class="n">uri</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">switch</span><span class="w"> </span><span class="p">(</span><span class="n">match</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="k">case</span><span class="w"> </span><span class="n">PRODUCTS</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="c1">// Delete all rows that match the whereClause and whereArgs</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="n">numRowsDeleted</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">database</span><span class="p">.</span><span class="na">delete</span><span class="p">(</span><span class="n">ProductEntry</span><span class="p">.</span><span class="na">TABLE_NAME</span><span class="p">,</span><span class="w"> </span><span class="n">whereClause</span><span class="p">,</span><span class="w"> </span><span class="n">whereArgs</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="k">break</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="k">case</span><span class="w"> </span><span class="n">PRODUCT_ID</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="c1">// Delete a single row from the pets table using the given ID</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="n">whereClause</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ProductEntry</span><span class="p">.</span><span class="na">_ID</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="s">&#34;=?&#34;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="n">whereArgs</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">new</span><span class="w"> </span><span class="n">String</span><span class="o">[]</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">String</span><span class="p">.</span><span class="na">valueOf</span><span class="p">(</span><span class="n">ContentUris</span><span class="p">.</span><span class="na">parseId</span><span class="p">(</span><span class="n">uri</span><span class="p">))</span><span class="w"> </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="n">numRowsDeleted</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">database</span><span class="p">.</span><span class="na">delete</span><span class="p">(</span><span class="n">ProductEntry</span><span class="p">.</span><span class="na">TABLE_NAME</span><span class="p">,</span><span class="w"> </span><span class="n">whereClause</span><span class="p">,</span><span class="w"> </span><span class="n">whereArgs</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="k">break</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="k">default</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="k">throw</span><span class="w"> </span><span class="k">new</span><span class="w"> </span><span class="n">IllegalArgumentException</span><span class="p">(</span><span class="s">&#34;Deletion is not supported for &#34;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">uri</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">if</span><span class="w"> </span><span class="p">(</span><span class="n">numRowsDeleted</span><span class="w"> </span><span class="o">!=</span><span class="w"> </span><span class="n">0</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">getContext</span><span class="p">().</span><span class="na">getContentResolver</span><span class="p">().</span><span class="na">notifyChange</span><span class="p">(</span><span class="n">uri</span><span class="p">,</span><span class="w"> </span><span class="kc">null</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">return</span><span class="w"> </span><span class="n">numRowsDeleted</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nd">@Override</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">public</span><span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="nf">update</span><span class="p">(</span><span class="nd">@NonNull</span><span class="w"> </span><span class="n">Uri</span><span class="w"> </span><span class="n">uri</span><span class="p">,</span><span class="w"> </span><span class="nd">@Nullable</span><span class="w"> </span><span class="n">ContentValues</span><span class="w"> </span><span class="n">contentValues</span><span class="p">,</span><span class="w"> </span><span class="nd">@Nullable</span><span class="w"> </span><span class="n">String</span><span class="w"> </span><span class="n">s</span><span class="p">,</span><span class="w"> </span><span class="nd">@Nullable</span><span class="w"> </span><span class="n">String</span><span class="o">[]</span><span class="w"> </span><span class="n">strings</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">return</span><span class="w"> </span><span class="n">0</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>A ContentProvider should implement all of the CRUD methods, passing arguments through to the database and passing results back to the caller. In that sense it&rsquo;s a bit like a reverse proxy for your data&hellip;</p>
<p>Focusing on the query method first: it opens a readable handle to the database, then queries either the whole product table or a single product depending on the URI passed in. You can set up arbitrary URI matcher codes at the top, as long as they&rsquo;re unique within the provider. If you noticed cursor.setNotificationUri, that registers the URI the cursor should watch, so a CursorLoader can be notified when the data changes and automatically update. More on that later.</p>
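<p>To make the matcher codes concrete, here&rsquo;s a plain-Java sketch of roughly what Android&rsquo;s UriMatcher does for the two cases above. The <code>PRODUCTS</code>/<code>PRODUCT_ID</code> codes mirror the post, but the <code>com.example.products</code> authority and the regex-based matching are illustrative stand-ins, not the framework&rsquo;s implementation:</p>

```java
import java.util.regex.Pattern;

// A tiny stand-in for Android's UriMatcher: map URI shapes to integer codes.
// The authority "com.example.products" is a hypothetical example.
public class MiniUriMatcher {
    static final int NO_MATCH = -1;
    static final int PRODUCTS = 100;    // content://<authority>/products      -> whole table
    static final int PRODUCT_ID = 101;  // content://<authority>/products/<id> -> single row

    static final Pattern LIST = Pattern.compile("content://com\\.example\\.products/products");
    static final Pattern ITEM = Pattern.compile("content://com\\.example\\.products/products/\\d+");

    static int match(String uri) {
        if (LIST.matcher(uri).matches()) return PRODUCTS;
        if (ITEM.matcher(uri).matches()) return PRODUCT_ID;
        return NO_MATCH;
    }

    public static void main(String[] args) {
        // The provider's switch statement branches on these codes.
        System.out.println(match("content://com.example.products/products"));
        System.out.println(match("content://com.example.products/products/7"));
    }
}
```

<p>The real UriMatcher is registered once in a static block with addURI calls; the point here is just that each URI shape resolves to a unique integer the switch statements can branch on.</p>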
<p>I&rsquo;m tempted to say that the rest of the code is self-explanatory&hellip; Even if that&rsquo;s not quite true, you can use this as a starting point and fill things in from there. It&rsquo;s pretty barebones as it is.</p>
<p>Next time we&rsquo;ll look at CursorLoader, and that should conclude the Databases In Android series.</p>
]]></content:encoded>
    </item>
    
    <item>
      <title>Databases In Android - Part 3</title>
      <link>https://blog.dalydays.com/post/databases-in-android-part-3/</link>
      <pubDate>Mon, 19 Jun 2017 00:00:00 +0000</pubDate>
      
      <guid>https://blog.dalydays.com/post/databases-in-android-part-3/</guid>
      <description>Part 3 of a mostly code look at using SQLite in Android (not using Room)</description>
      <content:encoded><![CDATA[<h2 id="contentproviders">ContentProviders</h2>
<p><a href="https://developer.android.com/guide/topics/providers/content-providers.html">ContentProviders</a> manage access to data and provide a layer of abstraction between the raw data and the application (a widget, other apps, your own app, etc.). At the expense of a little more complexity, you get centralized data security: instead of trusting outside applications to sanitize their input, it can all be done in the provider. You can also change the data backend without changing the provider API, so applications accessing the provider need no changes at all. For example, you could move from a flat-file data store to a SQLite database.</p>
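<p>That backend-swap property can be sketched in plain Java (no Android APIs here; the <code>ProductStore</code> interface and both store classes are hypothetical illustrations of the principle, not real provider code):</p>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// The stable API callers depend on; they never see which backend is behind it.
interface ProductStore {
    void insert(String name);
    List<String> queryAll();
}

// Backend #1: could stand in for a flat-file data store.
class ListBackedStore implements ProductStore {
    private final List<String> names = new ArrayList<>();
    public void insert(String name) { names.add(name); }
    public List<String> queryAll() { return new ArrayList<>(names); }
}

// Backend #2: could stand in for a SQLite-backed store keyed by row id.
class MapBackedStore implements ProductStore {
    private final Map<Long, String> rows = new HashMap<>();
    private long nextId = 1;
    public void insert(String name) { rows.put(nextId++, name); }
    public List<String> queryAll() { return new ArrayList<>(rows.values()); }
}

public class ProviderSketch {
    public static void main(String[] args) {
        // The calling code is identical for both backends.
        for (ProductStore store : new ProductStore[] { new ListBackedStore(), new MapBackedStore() }) {
            store.insert("Widget");
            System.out.println(store.queryAll());
        }
    }
}
```

<p>A ContentProvider plays the role of that interface: as long as the URIs and the query/insert/update/delete contract stay the same, the storage behind it can change freely.</p>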
<!-- raw HTML omitted -->
]]></content:encoded>
    </item>
    
    <item>
      <title>Databases In Android - Part 2</title>
      <link>https://blog.dalydays.com/post/android-sqlite-part-2/</link>
      <pubDate>Fri, 09 Jun 2017 00:00:00 +0000</pubDate>
      
      <guid>https://blog.dalydays.com/post/android-sqlite-part-2/</guid>
      <description>Part 2 of a mostly code look at using SQLite in Android (not using Room)</description>
<content:encoded><![CDATA[<p>Here&rsquo;s my thought of the day: I recently helped out on a public forum for some cloud file sync software I like to run. A guy asked how to move a folder from one partition to another because he had accidentally installed the package on the root partition, which had limited space. After I answered that he could just use the mv command (Ubuntu Server), he responded, I think via email, with a signature and a little about himself, as in his credentials. Apparently he is a Linux Consultant.
<!-- raw HTML omitted -->
I don&rsquo;t say this as an insult to the guy. I appreciate that people ask for help, and are willing to learn (and polite!). There&rsquo;s nothing wrong with that. Maybe he&rsquo;s just getting started, but this is what he wants to become an expert in, so he&rsquo;s in the mindset that he <strong>is</strong> an expert so that he <strong>becomes</strong> an expert.
<!-- raw HTML omitted -->
I guess the point is that my first reaction was &ldquo;This guy is over-selling himself.&rdquo; But maybe part of the reason I think that is because I under-sell myself. Hm&hellip; just a thought. There still needs to be a balance though&hellip; I mean come on. I guess since I&rsquo;ve created my first Hello World app, it&rsquo;s time to go update my resume:</p>
<blockquote>
<p>Android Master</p>
</blockquote>
<!-- raw HTML omitted -->
<p>So in part 1 I talked about creating a contract class which represents the database schema, and started setting up the database helper which defines the database version and handles creation and upgrades/downgrades of the database.</p>
<h2 id="cursoradapters">CursorAdapters</h2>
<p>The next point on the list is the CursorAdapter. I don&rsquo;t think I can explain it any better than the Android documentation:</p>
<blockquote>
<p>Adapter that exposes data from a Cursor to a ListView widget.</p>
</blockquote>
<p>But it doesn&rsquo;t hurt to try. The CursorAdapter maps column values from a Cursor to the views in each row of a ListView.</p>
<!-- raw HTML omitted -->
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-xml" data-lang="xml"><span class="line"><span class="cl"><span class="cp">&lt;?xml version=&#34;1.0&#34; encoding=&#34;utf-8&#34;?&gt;</span><span class="c">&lt;!-- Layout for a single list item in the list of pets --&gt;</span>
</span></span><span class="line"><span class="cl"><span class="nt">&lt;LinearLayout</span> <span class="na">xmlns:android=</span><span class="s">&#34;http://schemas.android.com/apk/res/android&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="na">android:layout_width=</span><span class="s">&#34;match_parent&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="na">android:layout_height=</span><span class="s">&#34;wrap_content&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="na">android:orientation=</span><span class="s">&#34;vertical&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="na">android:padding=</span><span class="s">&#34;@dimen/activity_margin&#34;</span><span class="nt">&gt;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="nt">&lt;TextView</span>
</span></span><span class="line"><span class="cl">        <span class="na">android:id=</span><span class="s">&#34;@+id/tv_product_name&#34;</span>
</span></span><span class="line"><span class="cl">        <span class="na">android:layout_width=</span><span class="s">&#34;wrap_content&#34;</span>
</span></span><span class="line"><span class="cl">        <span class="na">android:layout_height=</span><span class="s">&#34;wrap_content&#34;</span>
</span></span><span class="line"><span class="cl">        <span class="na">android:fontFamily=</span><span class="s">&#34;sans-serif-medium&#34;</span>
</span></span><span class="line"><span class="cl">        <span class="na">android:textAppearance=</span><span class="s">&#34;?android:textAppearanceMedium&#34;</span>
</span></span><span class="line"><span class="cl">        <span class="na">android:textColor=</span><span class="s">&#34;#2B3D4D&#34;</span> <span class="nt">/&gt;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="nt">&lt;/LinearLayout&gt;</span>
</span></span></code></pre></div><p>So the list_item layout only shows a single TextView, which is going to be the product name. Now we can start working on the CursorAdapter.</p>
<p>Here&rsquo;s what my ProductCursorAdapter looks like.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-java" data-lang="java"><span class="line"><span class="cl"><span class="kd">public</span><span class="w"> </span><span class="kd">class</span> <span class="nc">ProductCursorAdapter</span><span class="w"> </span><span class="kd">extends</span><span class="w"> </span><span class="n">CursorAdapter</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">public</span><span class="w"> </span><span class="nf">ProductCursorAdapter</span><span class="p">(</span><span class="n">Context</span><span class="w"> </span><span class="n">context</span><span class="p">,</span><span class="w"> </span><span class="n">Cursor</span><span class="w"> </span><span class="n">c</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">super</span><span class="p">(</span><span class="n">context</span><span class="p">,</span><span class="w"> </span><span class="n">c</span><span class="p">,</span><span class="w"> </span><span class="n">0</span><span class="w"> </span><span class="cm">/* flags */</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nd">@Override</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">public</span><span class="w"> </span><span class="n">View</span><span class="w"> </span><span class="nf">newView</span><span class="p">(</span><span class="n">Context</span><span class="w"> </span><span class="n">context</span><span class="p">,</span><span class="w"> </span><span class="n">Cursor</span><span class="w"> </span><span class="n">cursor</span><span class="p">,</span><span class="w"> </span><span class="n">ViewGroup</span><span class="w"> </span><span class="n">parent</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">return</span><span class="w"> </span><span class="n">LayoutInflater</span><span class="p">.</span><span class="na">from</span><span class="p">(</span><span class="n">context</span><span class="p">).</span><span class="na">inflate</span><span class="p">(</span><span class="n">R</span><span class="p">.</span><span class="na">layout</span><span class="p">.</span><span class="na">list_item</span><span class="p">,</span><span class="w"> </span><span class="n">parent</span><span class="p">,</span><span class="w"> </span><span class="kc">false</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nd">@Override</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">public</span><span class="w"> </span><span class="kt">void</span><span class="w"> </span><span class="nf">bindView</span><span class="p">(</span><span class="n">View</span><span class="w"> </span><span class="n">view</span><span class="p">,</span><span class="w"> </span><span class="n">Context</span><span class="w"> </span><span class="n">context</span><span class="p">,</span><span class="w"> </span><span class="n">Cursor</span><span class="w"> </span><span class="n">cursor</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">TextView</span><span class="w"> </span><span class="n">productNameTV</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="n">TextView</span><span class="p">)</span><span class="w"> </span><span class="n">view</span><span class="p">.</span><span class="na">findViewById</span><span class="p">(</span><span class="n">R</span><span class="p">.</span><span class="na">id</span><span class="p">.</span><span class="na">tv_product_name</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">String</span><span class="w"> </span><span class="n">productNameString</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">cursor</span><span class="p">.</span><span class="na">getString</span><span class="p">(</span><span class="n">cursor</span><span class="p">.</span><span class="na">getColumnIndexOrThrow</span><span class="p">(</span><span class="n">ProductEntry</span><span class="p">.</span><span class="na">COLUMN_NAME_NAME</span><span class="p">));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">productNameTV</span><span class="p">.</span><span class="na">setText</span><span class="p">(</span><span class="n">productNameString</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Notice the bindView method: this is where you tie the data from the cursor to the elements of each list item. In this case all we&rsquo;re doing is finding the tv_product_name TextView and setting its text to productNameString, which comes from the cursor. The CursorAdapter calls bindView automatically for every row in the Cursor, adding as many items to the ListView as needed.</p>
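<p>To see the adapter in action you still have to attach it to a ListView and hand it a Cursor. A rough sketch of that wiring (the adapter class name, the list_view id, and the use of ProductHelper here are illustrative, not from the post):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-java" data-lang="java">// Somewhere in onCreate() of the activity (names are illustrative)
ListView listView = (ListView) findViewById(R.id.list_view);
ProductCursorAdapter adapter = new ProductCursorAdapter(this, null);
listView.setAdapter(adapter);

// Later, once the query has run (ideally off the main thread, or via a CursorLoader):
SQLiteDatabase db = new ProductHelper(this).getReadableDatabase();
Cursor cursor = db.query(ProductEntry.TABLE_NAME, null, null, null, null, null, null);
adapter.swapCursor(cursor); // the adapter re-renders the list from the new cursor
</code></pre></div>
<p>swapCursor is handy because it returns the old cursor without closing it, letting a CursorLoader manage the cursor&rsquo;s lifecycle for you.</p>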
<!-- raw HTML omitted -->
<p>I&rsquo;m out of time today, so I&rsquo;ll continue to ContentProviders next time.</p>
]]></content:encoded>
    </item>
    
    <item>
      <title>Databases In Android - Part 1</title>
      <link>https://blog.dalydays.com/post/android-sqlite/</link>
      <pubDate>Fri, 26 May 2017 00:00:00 +0000</pubDate>
      
      <guid>https://blog.dalydays.com/post/android-sqlite/</guid>
      <description>Part 1 of a mostly code look at using SQLite in Android (not using Room)</description>
<content:encoded><![CDATA[<p>Android provides an <strong>easy</strong> way to connect to SQLite databases. It&rsquo;s easy, at least, as long as you know how to query a database and can follow a guide. But if you need anything more complicated than a few tables with low record counts, you&rsquo;ll find there&rsquo;s a lot more to it. Welcome to <a href="https://developer.android.com/guide/topics/providers/content-providers.html">ContentProviders</a>, <a href="https://developer.android.com/reference/android/content/CursorLoader.html">CursorLoaders</a>, and <a href="https://developer.android.com/reference/android/widget/CursorAdapter.html">CursorAdapters</a>. The good news is that many of these concepts translate to environments beyond Android.</p>
<p>In this post (this might end up being a multi-part series), I want to dive into how all these pieces work together, explore some alternatives, and walk through some specific examples while keeping things relatively simple. The goal is an overview of the concepts that can be referenced later on.</p>
<!-- raw HTML omitted -->
<h2 id="getting-started-with-sqlite">Getting Started With SQLite</h2>
<p>If you know a little bit about writing queries, the SQLite part should be pretty easy. Generally you set up a contract class that models the schema of your database: the outer class defines the database, and the inner classes define each table. BaseColumns, by the way, is used to include _ID as a column name, so that your database plays nicely with certain Android classes like CursorAdapter.</p>
<p>If you&rsquo;re starting out with a new app, you&rsquo;ll need to add a new class and call it something like YourDatabaseNameContract, or in my case ProductContract. Here&rsquo;s an example from a sample database app I made:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-java" data-lang="java"><span class="line"><span class="cl"><span class="kd">public</span><span class="w"> </span><span class="kd">final</span><span class="w"> </span><span class="kd">class</span> <span class="nc">ProductContract</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// private constructor prevents accidental instantiation of the contract class</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">private</span><span class="w"> </span><span class="nf">ProductContract</span><span class="p">()</span><span class="w"> </span><span class="p">{}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// inner class defines the table</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">public</span><span class="w"> </span><span class="kd">static</span><span class="w"> </span><span class="kd">class</span> <span class="nc">ProductEntry</span><span class="w"> </span><span class="kd">implements</span><span class="w"> </span><span class="n">BaseColumns</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">public</span><span class="w"> </span><span class="kd">static</span><span class="w"> </span><span class="kd">final</span><span class="w"> </span><span class="n">String</span><span class="w"> </span><span class="n">TABLE_NAME</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">&#34;product&#34;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">public</span><span class="w"> </span><span class="kd">static</span><span class="w"> </span><span class="kd">final</span><span class="w"> </span><span class="n">String</span><span class="w"> </span><span class="n">COLUMN_NAME_NAME</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">&#34;name&#34;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><!-- raw HTML omitted -->
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-java" data-lang="java"><span class="line"><span class="cl"><span class="kd">public</span><span class="w"> </span><span class="kd">class</span> <span class="nc">ProductHelper</span><span class="w"> </span><span class="kd">extends</span><span class="w"> </span><span class="n">SQLiteOpenHelper</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">public</span><span class="w"> </span><span class="kd">static</span><span class="w"> </span><span class="kd">final</span><span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">DATABASE_VERSION</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">public</span><span class="w"> </span><span class="kd">static</span><span class="w"> </span><span class="kd">final</span><span class="w"> </span><span class="n">String</span><span class="w"> </span><span class="n">DATABASE_NAME</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">&#34;products.db&#34;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">private</span><span class="w"> </span><span class="kd">static</span><span class="w"> </span><span class="kd">final</span><span class="w"> </span><span class="n">String</span><span class="w"> </span><span class="n">SQL_CREATE_ENTRIES</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">&#34;CREATE TABLE &#34;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">ProductEntry</span><span class="p">.</span><span class="na">TABLE_NAME</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="s">&#34; (&#34;</span><span class="w"> </span><span class="o">+</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">ProductEntry</span><span class="p">.</span><span class="na">_ID</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="s">&#34; INTEGER PRIMARY KEY,&#34;</span><span class="w"> </span><span class="o">+</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">ProductEntry</span><span class="p">.</span><span class="na">COLUMN_NAME_NAME</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="s">&#34; TEXT)&#34;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">public</span><span class="w"> </span><span class="nf">ProductHelper</span><span class="p">(</span><span class="n">Context</span><span class="w"> </span><span class="n">context</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">super</span><span class="p">(</span><span class="n">context</span><span class="p">,</span><span class="w"> </span><span class="n">DATABASE_NAME</span><span class="p">,</span><span class="w"> </span><span class="kc">null</span><span class="p">,</span><span class="w"> </span><span class="n">DATABASE_VERSION</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nd">@Override</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">public</span><span class="w"> </span><span class="kt">void</span><span class="w"> </span><span class="nf">onCreate</span><span class="p">(</span><span class="n">SQLiteDatabase</span><span class="w"> </span><span class="n">db</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">db</span><span class="p">.</span><span class="na">execSQL</span><span class="p">(</span><span class="n">SQL_CREATE_ENTRIES</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nd">@Override</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">public</span><span class="w"> </span><span class="kt">void</span><span class="w"> </span><span class="nf">onUpgrade</span><span class="p">(</span><span class="n">SQLiteDatabase</span><span class="w"> </span><span class="n">db</span><span class="p">,</span><span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">oldVersion</span><span class="p">,</span><span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">newVersion</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">onCreate</span><span class="p">(</span><span class="n">db</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><!-- raw HTML omitted -->
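<p>With the contract and helper in place, reading and writing is just a matter of asking the helper for a database handle. A minimal sketch (the surrounding context object and the &#34;Widget&#34; value are placeholders of mine, not from the sample app):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-java" data-lang="java">ProductHelper helper = new ProductHelper(context);

// Insert a row using ContentValues keyed by the contract&#39;s column constants
SQLiteDatabase db = helper.getWritableDatabase();
ContentValues values = new ContentValues();
values.put(ProductEntry.COLUMN_NAME_NAME, &#34;Widget&#34;);
long newRowId = db.insert(ProductEntry.TABLE_NAME, null, values);

// Query it back; the cursor columns match the same constants
Cursor cursor = db.query(ProductEntry.TABLE_NAME,
        new String[] { ProductEntry._ID, ProductEntry.COLUMN_NAME_NAME },
        null, null, null, null, null);
while (cursor.moveToNext()) {
    String name = cursor.getString(cursor.getColumnIndexOrThrow(ProductEntry.COLUMN_NAME_NAME));
}
cursor.close();
</code></pre></div>
<p>Keeping table and column names in the contract means a rename only has to happen in one place.</p>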
<!-- raw HTML omitted -->
]]></content:encoded>
    </item>
    
    <item>
      <title>Connecting to External Databases In Drupal 7</title>
      <link>https://blog.dalydays.com/post/external-databases-in-drupal-7/</link>
      <pubDate>Tue, 13 Sep 2016 00:00:00 +0000</pubDate>
      
      <guid>https://blog.dalydays.com/post/external-databases-in-drupal-7/</guid>
      <description>A demonstration of connecting to an external MySQL database in Drupal 7</description>
<content:encoded><![CDATA[<p>Recently at work I was tasked with finding a way to connect our Drupal 7 intranet site to our client database in order to prepopulate form fields with client data, which in turn makes things easier for the sales team.</p>
<p>We all agreed this was a good idea, except nobody asked me what I thought. I wasn&rsquo;t familiar with connecting to external databases in Drupal, although I am the one who built and expanded our current Drupal site, which is light years ahead of what we had with the old SharePoint site. Sometimes it&rsquo;s good to just get a nudge in a certain direction; I could always come back and tell them it wasn&rsquo;t feasible or would be way too complicated if I needed to. Luckily, Drupal is very easy to work with and easy to customize.</p>
<h2 id="add-the-database">Add The Database</h2>
<p>In <strong>settings.php</strong>, add the database credentials for the external database you&rsquo;ll be connecting to, something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-php" data-lang="php"><span class="line"><span class="cl"><span class="o">&lt;?</span><span class="nx">php</span>
</span></span><span class="line"><span class="cl"> <span class="nv">$databases</span> <span class="o">=</span> <span class="k">array</span> <span class="p">(</span>
</span></span><span class="line"><span class="cl">   <span class="s1">&#39;default&#39;</span> <span class="o">=&gt;</span>
</span></span><span class="line"><span class="cl">   <span class="k">array</span> <span class="p">(</span>
</span></span><span class="line"><span class="cl">     <span class="s1">&#39;default&#39;</span> <span class="o">=&gt;</span>
</span></span><span class="line"><span class="cl">     <span class="k">array</span> <span class="p">(</span>
</span></span><span class="line"><span class="cl">       <span class="s1">&#39;database&#39;</span> <span class="o">=&gt;</span> <span class="s1">&#39;drupal_site&#39;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">       <span class="s1">&#39;username&#39;</span> <span class="o">=&gt;</span> <span class="s1">&#39;drupal_user&#39;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">       <span class="s1">&#39;password&#39;</span> <span class="o">=&gt;</span> <span class="s1">&#39;jfkd***&#39;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">       <span class="s1">&#39;host&#39;</span> <span class="o">=&gt;</span> <span class="s1">&#39;localhost&#39;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">       <span class="s1">&#39;port&#39;</span> <span class="o">=&gt;</span> <span class="s1">&#39;&#39;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">       <span class="s1">&#39;driver&#39;</span> <span class="o">=&gt;</span> <span class="s1">&#39;mysql&#39;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">       <span class="s1">&#39;prefix&#39;</span> <span class="o">=&gt;</span> <span class="s1">&#39;&#39;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">     <span class="p">),</span>
</span></span><span class="line"><span class="cl">   <span class="p">),</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="s1">&#39;myexternaldatabase&#39;</span> <span class="o">=&gt;</span>
</span></span><span class="line"><span class="cl">   <span class="k">array</span> <span class="p">(</span>
</span></span><span class="line"><span class="cl">     <span class="s1">&#39;default&#39;</span> <span class="o">=&gt;</span>
</span></span><span class="line"><span class="cl">     <span class="k">array</span> <span class="p">(</span>
</span></span><span class="line"><span class="cl">       <span class="s1">&#39;database&#39;</span> <span class="o">=&gt;</span> <span class="s1">&#39;dbname&#39;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">       <span class="s1">&#39;username&#39;</span> <span class="o">=&gt;</span> <span class="s1">&#39;dbuser&#39;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">       <span class="s1">&#39;password&#39;</span> <span class="o">=&gt;</span> <span class="s1">&#39;dbpassword&#39;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">       <span class="s1">&#39;host&#39;</span> <span class="o">=&gt;</span> <span class="s1">&#39;db.url.local&#39;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">       <span class="s1">&#39;port&#39;</span> <span class="o">=&gt;</span> <span class="s1">&#39;3306&#39;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">       <span class="s1">&#39;driver&#39;</span> <span class="o">=&gt;</span> <span class="s1">&#39;mysql&#39;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">       <span class="s1">&#39;prefix&#39;</span> <span class="o">=&gt;</span> <span class="s1">&#39;&#39;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">     <span class="p">),</span>
</span></span><span class="line"><span class="cl">   <span class="p">),</span>
</span></span><span class="line"><span class="cl"> <span class="p">);</span>
</span></span></code></pre></div><h2 id="write-a-module">Write A Module</h2>
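<p>Before wiring anything into a module, it can help to sanity-check the new connection on its own. A minimal sketch using Drupal 7&rsquo;s <code>db_set_active()</code> and <code>db_query()</code> (the table name here is a placeholder, and this assumes a bootstrapped Drupal environment, e.g. run via <code>drush php-eval</code> or a test page):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-php" data-lang="php">&lt;?php
// Switch the active connection to the external database defined in settings.php.
db_set_active(&#39;myexternaldatabase&#39;);

// Run any cheap query to prove the credentials and host are right.
$count = db_query(&#34;SELECT COUNT(*) FROM tablename&#34;)-&gt;fetchField();

// Always switch back to Drupal&#39;s own database when done.
db_set_active();

drupal_set_message(t(&#39;External database reachable: @count rows.&#39;, array(&#39;@count&#39; =&gt; $count)));
</code></pre></div>
<p>The important habit is the final <code>db_set_active()</code> with no argument: until you call it, every Drupal query in the request goes to the external database.</p>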
<p>Write a custom module that connects to the external database, does whatever it needs to, then switches back to Drupal&rsquo;s own database when it&rsquo;s done. In my case, I&rsquo;m writing a simple module that will utilize <a href="https://api.drupal.org/api/drupal/modules%21system%21system.api.php/function/hook_form_alter/7.x">hook_form_alter</a>. Here&rsquo;s what mymodule.info looks like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-php" data-lang="php"><span class="line"><span class="cl"><span class="nx">name</span> <span class="o">=</span> <span class="nx">Prefill</span> <span class="nx">Webforms</span>
</span></span><span class="line"><span class="cl"><span class="nx">description</span> <span class="o">=</span> <span class="nx">whatever</span> <span class="nx">you</span> <span class="nx">want</span> <span class="nx">here</span>
</span></span><span class="line"><span class="cl"><span class="nx">core</span> <span class="o">=</span> <span class="mf">7.</span><span class="nx">x</span>
</span></span><span class="line"><span class="cl"><span class="nx">dependencies</span><span class="p">[]</span> <span class="o">=</span> <span class="nx">webform</span>
</span></span></code></pre></div><p>And here&rsquo;s what mymodule.module looks like (so far):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-php" data-lang="php"><span class="line"><span class="cl"><span class="o">&lt;?</span><span class="nx">php</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="sd">/**
</span></span></span><span class="line"><span class="cl"><span class="sd"> * @file
</span></span></span><span class="line"><span class="cl"><span class="sd"> * Pull data from the external database to prefill webforms.
</span></span></span><span class="line"><span class="cl"><span class="sd"> */</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="sd">/**
</span></span></span><span class="line"><span class="cl"><span class="sd"> * Implements hook_help()
</span></span></span><span class="line"><span class="cl"><span class="sd"> *
</span></span></span><span class="line"><span class="cl"><span class="sd"> * Displays help and module information
</span></span></span><span class="line"><span class="cl"><span class="sd"> *
</span></span></span><span class="line"><span class="cl"><span class="sd"> * @param path
</span></span></span><span class="line"><span class="cl"><span class="sd"> *   Which path of the site we&#39;re using to display help
</span></span></span><span class="line"><span class="cl"><span class="sd"> * @param arg
</span></span></span><span class="line"><span class="cl"><span class="sd"> *  Array that holds the current path as returned from arg() function
</span></span></span><span class="line"><span class="cl"><span class="sd"> */</span>
</span></span><span class="line"><span class="cl"><span class="k">function</span> <span class="nf">prefill_webforms_help</span><span class="p">(</span><span class="nv">$path</span><span class="p">,</span> <span class="nv">$arg</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">  <span class="nv">$output</span> <span class="o">=</span> <span class="s1">&#39;&#39;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  <span class="k">switch</span> <span class="p">(</span><span class="nv">$path</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">case</span> <span class="s2">&#34;admin/help#prefill_webforms&#34;</span><span class="o">:</span>
</span></span><span class="line"><span class="cl">      <span class="nv">$output</span> <span class="o">=</span> <span class="nx">t</span><span class="p">(</span><span class="s1">&#39;To use this module, create a webform, then modify the code of this module to handle that particular form upon submission.&#39;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">      <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">  <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  <span class="k">return</span> <span class="nv">$output</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="sd">/**
</span></span></span><span class="line"><span class="cl"><span class="sd"> * Implements hook_form_alter()
</span></span></span><span class="line"><span class="cl"><span class="sd"> *
</span></span></span><span class="line"><span class="cl"><span class="sd"> * @param form
</span></span></span><span class="line"><span class="cl"><span class="sd"> *   Nested array of form elements that comprise the form.
</span></span></span><span class="line"><span class="cl"><span class="sd"> * @param form_state
</span></span></span><span class="line"><span class="cl"><span class="sd"> *   A keyed array containing the current state of the form. The arguments that drupal_get_form() was originally called with are available in the array $form_state[&#39;build_info&#39;][&#39;args&#39;].
</span></span></span><span class="line"><span class="cl"><span class="sd"> * @param form_id
</span></span></span><span class="line"><span class="cl"><span class="sd"> *   String representing the name of the form itself. Typically this is the name of the function that generated the form.
</span></span></span><span class="line"><span class="cl"><span class="sd"> */</span>
</span></span><span class="line"><span class="cl"><span class="k">function</span> <span class="nf">prefill_webforms_form_alter</span><span class="p">(</span><span class="o">&amp;</span><span class="nv">$form</span><span class="p">,</span> <span class="o">&amp;</span><span class="nv">$form_state</span><span class="p">,</span> <span class="nv">$form_id</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">  <span class="c1">//dpm($form);
</span></span></span><span class="line"><span class="cl"><span class="c1"></span>  <span class="k">if</span> <span class="p">(</span><span class="nv">$form_id</span> <span class="o">==</span> <span class="s1">&#39;webform_client_form_229&#39;</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="nv">$form</span><span class="p">[</span><span class="s1">&#39;#validate&#39;</span><span class="p">][]</span> <span class="o">=</span> <span class="s1">&#39;prefill_webforms_submit_handler&#39;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">  <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="sd">/**
</span></span></span><span class="line"><span class="cl"><span class="sd"> * Implements hook_submit_handler()
</span></span></span><span class="line"><span class="cl"><span class="sd"> *
</span></span></span><span class="line"><span class="cl"><span class="sd"> * @param form
</span></span></span><span class="line"><span class="cl"><span class="sd"> *   Nested array of form elements that comprise the form.
</span></span></span><span class="line"><span class="cl"><span class="sd"> * @param form_state
</span></span></span><span class="line"><span class="cl"><span class="sd"> *   A keyed array containing the current state of the form. The arguments that drupal_get_form() was originally called with are available in the array $form_state[&#39;build_info&#39;][&#39;args&#39;].
</span></span></span><span class="line"><span class="cl"><span class="sd"> */</span>
</span></span><span class="line"><span class="cl"><span class="k">function</span> <span class="nf">prefill_webforms_submit_handler</span><span class="p">(</span><span class="nv">$form</span><span class="p">,</span> <span class="o">&amp;</span><span class="nv">$form_state</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">  <span class="c1">// get submitted client ID from form
</span></span></span><span class="line"><span class="cl"><span class="c1"></span>  <span class="nv">$submitted_client_id</span> <span class="o">=</span> <span class="nv">$form_state</span><span class="p">[</span><span class="s1">&#39;input&#39;</span><span class="p">][</span><span class="s1">&#39;submitted&#39;</span><span class="p">][</span><span class="s1">&#39;client_id&#39;</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  <span class="c1">// select the maveric database defined in settings.php
</span></span></span><span class="line"><span class="cl"><span class="c1"></span>  <span class="nx">db_set_active</span><span class="p">(</span><span class="s1">&#39;myexternaldatabase&#39;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">  <span class="c1">// fetch the query using the submitted client ID
</span></span></span><span class="line"><span class="cl"><span class="c1"></span>  <span class="nv">$result</span> <span class="o">=</span> <span class="nx">db_query</span><span class="p">(</span><span class="s2">&#34;SELECT t.name, t.website FROM tablename t WHERE client_id = :client_id&#34;</span><span class="p">,</span> <span class="nx">arrary</span><span class="p">(</span><span class="s1">&#39;client_id&#39;</span><span class="o">=&gt;</span><span class="nv">$submitted_client_id</span><span class="p">));</span>
</span></span><span class="line"><span class="cl">  <span class="c1">// set database connection back to default
</span></span></span><span class="line"><span class="cl"><span class="c1"></span>  <span class="nx">db_set_active</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  <span class="k">foreach</span><span class="p">(</span><span class="nv">$result</span> <span class="k">as</span> <span class="nv">$record</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="nv">$form_state</span><span class="p">[</span><span class="s1">&#39;values&#39;</span><span class="p">][</span><span class="s1">&#39;submitted&#39;</span><span class="p">][</span><span class="s1">&#39;client_name&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="nv">$record</span><span class="o">-&gt;</span><span class="na">name</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="nv">$form_state</span><span class="p">[</span><span class="s1">&#39;values&#39;</span><span class="p">][</span><span class="s1">&#39;submitted&#39;</span><span class="p">][</span><span class="s1">&#39;website&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="nv">$record</span><span class="o">-&gt;</span><span class="na">website</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">  <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  <span class="c1">//dpm($form);
</span></span></span><span class="line"><span class="cl"><span class="c1"></span>  <span class="c1">//dpm($form_state);
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="p">}</span>
</span></span></code></pre></div>]]></content:encoded>
    </item>
    
    <item>
      <title>My Android Development Journey</title>
      <link>https://blog.dalydays.com/post/learning-android-development/</link>
      <pubDate>Mon, 29 Aug 2016 00:00:00 +0000</pubDate>
      
      <guid>https://blog.dalydays.com/post/learning-android-development/</guid>
      <description>My story about learning Android development so far</description>
      <content:encoded><![CDATA[<p>Lately I&rsquo;ve been learning to develop for Android using <a href="https://www.udacity.com/">Udacity</a>. There are a lot of other good resources available, but Udacity seems promising because its Android courses were developed by Google in partnership with Udacity.</p>
<p>I wouldn&rsquo;t consider myself a beginner in programming, but I also don&rsquo;t have a ton of work experience (maybe more than I realize, though). I do some limited JavaScript at work to manipulate <a href="http://www.eclipse.org/birt/">BIRT</a> reports. I also wrote a company portal site using PHP which authenticates against our Active Directory server to log in. I excelled in C and C++ when I was learning them 10+ years ago by reading books and visiting the <a href="http://www.cprogramming.com/">CProgramming</a> forums. In particular, <a href="https://www.amazon.com/Groups-Primer-Mitchell-Stephen-Paperback/dp/B011DC2HNG?SubscriptionId=AKIAILSHYYTFIVPWUY6Q&amp;tag=duckduckgo-d-20&amp;linkCode=xm2&amp;camp=2025&amp;creative=165953&amp;creativeASIN=B011DC2HNG">C Primer Plus</a>, and heck yes, I have the first edition. I found it for $1 at a Goodwill, and it is one of the best programming books I&rsquo;ve read.</p>
<p>So lately I&rsquo;m realizing more and more that my job value is not all about how much I already know. It&rsquo;s about being able to learn what I need to know and being able to get something done. When I think back to any job I&rsquo;ve ever had, that&rsquo;s how it worked. I didn&rsquo;t know everything before getting hired; I took some past experience and applied it to the new environment, adding new skills and knowledge along the way. I used to think I needed to &ldquo;learn&rdquo; a programming language, which to me meant reading books and making sure I could complete all the chapter exercises, and that after I completed the &ldquo;advanced&rdquo; sections I would be capable of programming in C (or whatever language). The reality is that I wasted tons of time memorizing how to do things like memory management in C (malloc), using pointers, and a bunch of other stuff I have never used outside of the practice problems. Sure, it&rsquo;s valuable from an academic standpoint, but not from a career advancement standpoint. It&rsquo;s not about book learning, it&rsquo;s about experience. Learn primarily by doing, not primarily by reading. If you&rsquo;re going to read a book, do while you read. I&rsquo;m realizing this now, and this is what I plan to do about it.</p>
<p>The best thing I can do is start programming things. I don&rsquo;t think I will come up with the best ideas for game changing apps that will change the world, but I can write apps that copy other ideas, just for practice. But that&rsquo;s practice that has real world value. Practice interacting with databases, practice authenticating users, and practice arranging layouts. The more I do now, the less time it will take me on the job.</p>
<p>At this point, I&rsquo;ve decided to learn Android. I also realize I will never be an expert in one area, maybe not even Android. But I am an expert in learning what I need to know to move forward with a problem and get the job done. Right now I will focus on learning Android by going through the courses at Udacity and writing apps. I post everything I do to <a href="https://github.com/linucksrox">GitHub</a> for anyone to check out. I particularly enjoyed the MediaPlayer sample app which I used to play the Rick Roll song, then I disabled the Pause/Stop buttons and installed it on a friend&rsquo;s phone. It&rsquo;s hilarious because you can&rsquo;t stop the music without force closing the app. Don&rsquo;t worry, I won&rsquo;t publish that to the Play store.</p>
<p>I plan to learn Android, and I&rsquo;m committing to writing a blog post once a week, starting now. I want to mention John Sonmez at <a href="https://simpleprogrammer.com/">Simple Programmer</a>. I first saw him on YouTube and wasn&rsquo;t sure what to make of him. After reading his blog and watching more of his videos, though, I found myself agreeing with a lot of what he says, and it started to click. This is me taking action and starting my blog.</p>
]]></content:encoded>
    </item>
    
    <item>
      <title>How To Get Going With Jekyll and GitHub Pages</title>
      <link>https://blog.dalydays.com/post/setting-up-jekyll/</link>
      <pubDate>Fri, 29 Jul 2016 00:00:00 +0000</pubDate>
      
      <guid>https://blog.dalydays.com/post/setting-up-jekyll/</guid>
      <description>How to set up Jekyll for a blog (outdated)</description>
      <content:encoded><![CDATA[<p>I&rsquo;m running <a href="https://xubuntu.org/">Xubuntu 16.04</a>. This method should be the same/similar in any Debian/Ubuntu distro. These are the steps I followed to get this thing working, using the terminal:</p>
<pre tabindex="0"><code>sudo apt install ruby ruby-dev ri bundler build-essential git
sudo gem install jekyll
sudo gem install minima
</code></pre><p>Note that if the Jekyll version is newer than the supported version on GitHub (see the <a href="https://pages.github.com/versions/">GitHub dependency versions</a> page) then you will need to install the latest supported version of jekyll instead, like this:</p>
<pre tabindex="0"><code>sudo gem install jekyll -v 3.1.6
</code></pre><p>Next, you need to create a special repo in GitHub for your site. In GitHub, create a new repo, and name it:</p>
<pre tabindex="0"><code>whatever-your-username-is.github.io
</code></pre><p>Finally, grab a local copy of the new repo:</p>
<pre tabindex="0"><code>git clone https://github.com/whatever-your-username-is/whatever-your-username-is.github.io
</code></pre><p>Create a brand new Jekyll site with the same name:</p>
<pre tabindex="0"><code>jekyll new whatever-your-username-is.github.io
</code></pre><p>Move to the directory:</p>
<pre tabindex="0"><code>cd whatever-your-username-is.github.io
</code></pre><p>Test your site locally by running this command, then go to the site it shows you in the terminal (default http://127.0.0.1:4000):</p>
<pre tabindex="0"><code>jekyll serve
</code></pre><p>Once you know it&rsquo;s working, or after you make whatever changes you want, throw it back up on GitHub!</p>
<pre tabindex="0"><code>git add .
git commit -m &#34;Initial Jekyll site&#34;
git push
</code></pre><p>After you successfully push your newly created Jekyll site to GitHub, wait maybe 2-3 seconds (not really, it&rsquo;s just amazingly fast) and then go to:</p>
<pre tabindex="0"><code>http://whatever-your-username-is.github.io
</code></pre><p>Start blogging!</p>
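<p>If you want a starting point for that first post: Jekyll picks up files in the <code>_posts</code> directory named <code>YYYY-MM-DD-title.md</code>, each beginning with a YAML front matter block. A minimal sketch (the title, date, and body here are just placeholders):</p>
<pre tabindex="0"><code>---
layout: post
title: &#34;My First Post&#34;
date: 2016-07-29
---

Hello from Jekyll! Everything below the front matter is regular Markdown.
</code></pre>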
]]></content:encoded>
    </item>
    
  </channel>
</rss>
