[{"content":"","date":"2026-05-07","externalUrl":null,"permalink":"/blog/","section":"Blog \u0026 Articles","summary":"","title":"Blog \u0026 Articles","type":"blog"},{"content":"","date":"2026-05-07","externalUrl":null,"permalink":"/categories/","section":"Categories","summary":"","title":"Categories","type":"categories"},{"content":"","date":"2026-05-07","externalUrl":null,"permalink":"/tags/deep-learning/","section":"Tags","summary":"","title":"Deep-Learning","type":"tags"},{"content":" I\u0026rsquo;m a Software Engineer and PhD in Distributed Systems, with a focus on Site Reliability Engineering and Application Performance Management. I\u0026rsquo;m passionate about building backend systems, managing cloud infrastructure, and automating everything in between.\nMy journey started at the University of Catania\u0026rsquo;s Distributed Computing Lab. That passion deepened in my industry roles, giving me the space to apply emerging technologies in production. My expertise includes container orchestration (Kubernetes, OpenShift/OKD, Docker), infrastructure automation (Ansible, Helm), and CI/CD design, alongside hands-on experience with Kafka, Airflow, Linux administration, and networking. Currently, I\u0026rsquo;m exploring the AI and LLM space, focusing on fine-tuning models for domain-specific applications and deploying them in production at scale.\nCurrent focus # QoS \u0026amp; SRE SLA-driven decision engines and APM across the edge-cloud continuum. Kubernetes Workload simulation, scheduler evaluation, and failure injection on bare-metal clusters. Edge networks Latency-aware routing and resource allocation, including FANET-style multi-layer scenarios. LLMs On-prem inference (vLLM, Ollama), fine-tuning with Unsloth, privacy-risk analysis on legal text (COAT). 
","date":"2026-05-07","externalUrl":null,"permalink":"/","section":"Hi, ciao","summary":"","title":"Hi, ciao","type":"page"},{"content":"","date":"2026-05-07","externalUrl":null,"permalink":"/tags/index/","section":"Tags","summary":"","title":"Index","type":"tags"},{"content":"","date":"2026-05-07","externalUrl":null,"permalink":"/tags/llm/","section":"Tags","summary":"","title":"LLM","type":"tags"},{"content":"","date":"2026-05-07","externalUrl":null,"permalink":"/categories/machine-learning/","section":"Categories","summary":"","title":"Machine Learning","type":"categories"},{"content":"","date":"2026-05-07","externalUrl":null,"permalink":"/tags/nlp/","section":"Tags","summary":"","title":"NLP","type":"tags"},{"content":"","date":"2026-05-07","externalUrl":null,"permalink":"/tags/self-attention/","section":"Tags","summary":"","title":"Self-Attention","type":"tags"},{"content":"","date":"2026-05-07","externalUrl":null,"permalink":"/tags/","section":"Tags","summary":"","title":"Tags","type":"tags"},{"content":"","date":"2026-05-07","externalUrl":null,"permalink":"/tags/transformer/","section":"Tags","summary":"","title":"Transformer","type":"tags"},{"content":" The paper Attention Is All You Need (Vaswani et al., 2017) redefined the field of deep learning for natural language. It introduced the Transformer, an architecture that completely removes recurrent and convolutional networks, replacing them with a single mechanism: self-attention. This article walks through the intuition behind the Transformer — starting from the problem it solves, through the math of the mechanism, and into the implications for training and inference.\nConceptual prerequisites # Before getting into the Transformer, three foundational concepts.\nWord embeddings: words as vectors # A language model doesn\u0026rsquo;t operate on text strings, but on numerical vectors. Each token (word or sub-word) is mapped to a point in \\(\\mathbb{R}^d\\) via an embedding matrix learned during training. 
In the original Transformer, \\(d_{\\text{model}} = 512\\).\nThe intuition is geometric: words with similar meanings occupy nearby regions of vector space. Operations like \\(\\text{vec}(\\text{king}) - \\text{vec}(\\text{man}) + \\text{vec}(\\text{woman}) \\approx \\text{vec}(\\text{queen})\\) show that directions in the space encode semantic relationships.\nThe sequence-to-sequence problem # Many NLP tasks (translation, summarization, question answering) require transforming an input sequence into an output sequence. Before the Transformer, the dominant architecture was the RNN-based encoder-decoder: an encoder that compresses the input sequence into a fixed representation, and a decoder that generates the output one token at a time. The Transformer keeps this encoder-decoder structure but radically changes the internal mechanism.\nThe RNN bottleneck # Recurrent networks (RNNs, LSTMs, GRUs) process sequences one element at a time, in order. At each position \\(t\\), the model computes a hidden state \\(h_t\\) as a function of the previous state and the current input:\n$$h_t = f(h_{t-1}, x_t)$$\nThis serial dependency has two critical consequences:\nNo parallelism: to compute \\(h_t\\) you must have completed \\(h_{t-1}\\). On modern GPUs this is a devastating efficiency constraint. Signal degradation: information from the first token has to travel through the entire chain to reach the last. Even though LSTM and GRU gates mitigate the vanishing gradient, the path length between two distant positions remains \\(O(n)\\). 
Interactive visualization: Sequential RNN processing\nUse the arrows to step through and watch how each hidden state \\(h_t\\) depends on \\(h_{t-1}\\) being complete.\nThe Transformer architecture: overview # The Transformer replaces recurrence with self-attention: a mechanism that lets every position in the sequence directly access all the others, in a single computational step.\nThe architecture keeps the encoder-decoder structure:\nEncoder (6 identical layers): each layer contains a multi-head self-attention block followed by a feed-forward network. Every sub-layer is wrapped in a residual connection and layer normalization. Decoder (6 identical layers): like the encoder, but with an additional cross-attention block that attends over the encoder output, plus a masked self-attention that prevents the decoder from \u0026ldquo;looking ahead\u0026rdquo; during generation. The depth (6+6 layers) and width (\\(d_{\\text{model}} = 512\\)) are the hyperparameters that define the capacity of the base Transformer.\nSelf-attention: the core mechanism # From O(n) to O(1): comparison with RNNs # The computational advantage of self-attention is immediate: the maximum path length between any two positions drops from \\(O(n)\\) (RNN) to \\(O(1)\\) (self-attention). This means information from the first token can directly influence the last one, with no signal degradation through intermediate states.\nInteractive visualization: RNN vs Transformer\nClick a token to compare the information path in the two architectures: a sequential RNN (chain of hidden states) and a parallel Transformer (self-attention).\nQuery, Key, Value: the intuition # The three vectors Q, K, V are the heart of self-attention. 
The most immediate analogy is a library:\nThe Query (\\(Q\\)) is the question a position asks: \u0026ldquo;what information do I need?\u0026rdquo; The Key (\\(K\\)) is the label every position exposes: \u0026ldquo;this is what my position contains\u0026rdquo; The Value (\\(V\\)) is the actual information content extracted when there\u0026rsquo;s a match Q, K, and V are not defined by hand — they\u0026rsquo;re computed via learned linear projections:\n$$Q = XW^Q, \\quad K = XW^K, \\quad V = XW^V$$\nwhere \\(W^Q, W^K \\in \\mathbb{R}^{d_{\\text{model}} \\times d_k}\\) and \\(W^V \\in \\mathbb{R}^{d_{\\text{model}} \\times d_v}\\) are weight matrices. Training optimizes these so that:\n\\(W^Q\\) produces vectors that \u0026ldquo;ask the right question\u0026rdquo; for each position \\(W^K\\) produces vectors that \u0026ldquo;describe the content\u0026rdquo; in a way compatible with the queries \\(W^V\\) produces vectors carrying the useful information to extract The dot product \\(Q_i \\cdot K_j\\) measures the compatibility between position \\(i\\)\u0026rsquo;s query and position \\(j\\)\u0026rsquo;s key: aligned vectors in the space produce high scores, orthogonal vectors produce zero.\nInteractive visualization — The roles of Query, Key, and Value\nClick each token to see how its \"question\" (Q), its \"label\" (K), and its \"content\" (V) change, and how the attention weights distribute information.\nQuery Key Value Scaled dot-product attention: the math # The full self-attention formula, as presented in the paper, is:\n$$\\text{Attention}(Q, K, V) = \\text{softmax}\\left(\\frac{QK^T}{\\sqrt{d_k}}\\right) V$$\nLet\u0026rsquo;s unpack it step by step.\nStep 1: Dot product \\(QK^T\\) # We compute the dot product between every query vector and every key vector, producing an \\(n \\times n\\) matrix of raw scores. 
The score \\(s_{ij} = Q_i \\cdot K_j\\) measures how much position \\(i\\) should \u0026ldquo;pay attention\u0026rdquo; to position \\(j\\).\nStep 2: Scaling by \\(\\sqrt{d_k}\\) # The factor \\(\\frac{1}{\\sqrt{d_k}}\\) is crucial. Without it, the dot products grow large in magnitude as \\(d_k\\) increases: if the components of the query and key are independent with zero mean and unit variance, their dot product has variance \\(d_k\\). Large scores push the softmax into saturation regions where gradients are nearly zero, hindering training. With \\(d_k = 64\\), the scaling divides by \\(\\sqrt{64} = 8\\), bringing the score variance back to 1 under that assumption.\nStep 3: Softmax # The softmax normalizes each row of the score matrix into a probability distribution:\n$$\\alpha_{ij} = \\frac{\\exp(s_{ij} / \\sqrt{d_k})}{\\sum_{l=1}^{n} \\exp(s_{il} / \\sqrt{d_k})}$$\nThe weights \\(\\alpha_{ij}\\) sum to 1 for every row \\(i\\). A high weight indicates that position \\(j\\) is strongly relevant to position \\(i\\).\nStep 4: Weighted average of Values # The output for each position is a weighted average of the Value vectors, with the weights from the softmax:\n$$z_i = \\sum_{j=1}^{n} \\alpha_{ij} V_j$$\nEach position gets a representation that mixes the contents of all positions, itself included, in proportion to the attention weights.\nInteractive visualization: Scaled dot-product attention step-by-step\nNavigate the 5 steps with the buttons and click the tokens to change the active query.\nMulti-head attention # A single attention head captures only one type of relation between positions. Multi-head attention runs \\(h\\) heads in parallel, each with its own projection matrices, and concatenates the results:\n$$\\text{MultiHead}(Q, K, V) = \\text{Concat}(\\text{head}_1, \\ldots, \\text{head}_h) W^O$$\nwhere each head is:\n$$\\text{head}_i = \\text{Attention}(QW_i^Q, KW_i^K, VW_i^V)$$\nIn the base Transformer: \\(h = 8\\) heads, each with \\(d_k = d_v = d_{\\text{model}} / h = 64\\). 
The total compute cost is similar to that of a single full-dimension head, but the model can simultaneously attend to different relations: one head might capture subject-verb agreement, another coreference, another syntactic structure.\nThe final projection \\(W^O \\in \\mathbb{R}^{hd_v \\times d_{\\text{model}}}\\) maps the concatenation of the \\(h\\) heads back into the \\(d_{\\text{model}}\\)-dimensional space.\nPositional encoding # Since self-attention has no built-in notion of order (permuting the input tokens simply permutes the outputs), the Transformer needs an explicit position signal. The paper uses sinusoidal functions:\n$$PE_{(pos, 2i)} = \\sin\\left(\\frac{pos}{10000^{2i/d_{\\text{model}}}}\\right)$$\n$$PE_{(pos, 2i+1)} = \\cos\\left(\\frac{pos}{10000^{2i/d_{\\text{model}}}}\\right)$$\nEach pair of embedding dimensions \\((2i, 2i+1)\\) receives a sine and a cosine at its own frequency. The authors hypothesized that sinusoids make it easy for the model to attend by relative position: for any fixed offset \\(k\\), the encoding \\(PE_{pos+k}\\) can be expressed as a linear function of \\(PE_{pos}\\).\nThe positional encoding is added to the token embedding before entering the first layer.\nFeed-forward network and residual connections # Every Transformer layer contains, after the attention block, a position-wise feed-forward network:\n$$\\text{FFN}(x) = \\max(0, xW_1 + b_1)W_2 + b_2$$\nTwo linear transformations with a ReLU in between. The inner dimension is \\(d_{ff} = 2048\\) (4× the model dimension). This network is applied independently at each position; this is where the model \u0026ldquo;reasons\u0026rdquo; over a single representation after enriching it with context via attention.\nEvery sub-layer (attention or FFN) is wrapped by:\nResidual connection: \\(x + \\text{SubLayer}(x)\\), which lets gradients flow directly through the layers, stabilizing training of deep networks. Layer normalization: normalizes each position\u0026rsquo;s activations across the feature dimension, which keeps their scale stable from layer to layer. 
The full sub-layer scheme is: \\(\\text{LayerNorm}(x + \\text{SubLayer}(x))\\).\nThe Transformer at inference time # Encoder: a single parallel pass # At inference, the encoder processes the entire input sequence in a single forward pass. Every token \u0026ldquo;sees\u0026rdquo; all the others through self-attention, and the output is a sequence of contextualized representations — every position contains information about the whole sentence.\nDecoder: autoregressive generation # The decoder produces the output one token at a time, autoregressively:\nReceives the start-of-sequence token (e.g. \u0026lt;sos\u0026gt;) Produces a probability distribution over the vocabulary for the next token Selects the token (greedy, beam search, or sampling) Appends it to the decoder input and repeats At each step, the masked self-attention prevents the decoder from looking at future positions. For position \\(t\\), the attention scores toward positions \\(t+1, t+2, \\ldots\\) are set to \\(-\\infty\\) before the softmax, zeroing their weight. This is essential: without the mask, the model would \u0026ldquo;see the answer\u0026rdquo; and never learn to predict.\nThe decoder also contains a cross-attention block that operates on the encoder output: the Queries come from the decoder, while the Keys and Values come from the encoder. This lets every generated position \u0026ldquo;consult\u0026rdquo; the entire input sequence.\nThe Transformer at training time # Teacher forcing # During training, the decoder doesn\u0026rsquo;t generate autoregressively — it uses teacher forcing: it receives the correct target sequence (right-shifted by one position) as input and predicts every token in parallel. 
The masked attention still ensures position \\(t\\) doesn\u0026rsquo;t see future tokens.\nThis makes it possible to compute the loss across all tokens in a single forward pass, fully exploiting the parallelism of self-attention.\nLoss function # The loss is the cross-entropy between the model\u0026rsquo;s predicted distribution and the correct token:\n$$\\mathcal{L} = -\\sum_{t=1}^{T} \\log P(y_t | y_{\u0026lt;t}, X)$$\nwhere \\(y_t\\) is the target token and \\(y_{\u0026lt;t}\\) are the previous tokens. Vaswani et al. also use label smoothing (\\(\\epsilon = 0.1\\)): instead of assigning probability 1 to the correct token, they distribute a small amount of probability mass over the others. This penalizes the model\u0026rsquo;s overconfidence and improves generalization.\nOptimization # The optimizer is Adam (\\(\\beta_1 = 0.9\\), \\(\\beta_2 = 0.98\\), \\(\\epsilon = 10^{-9}\\)) with a learning-rate warm-up schedule:\n$$lr = d_{\\text{model}}^{-0.5} \\cdot \\min(\\text{step}^{-0.5}, \\text{step} \\cdot \\text{warmup_steps}^{-1.5})$$\nThe learning rate grows linearly for the first warmup_steps (4000), then decays proportionally to the inverse square root of the step number. This schedule avoids instability in the early phase of training, when parameters are still far from a good region.\nRegularization # In addition to label smoothing, the Transformer uses dropout (\\(P_{drop} = 0.1\\)) applied to:\nThe output of every sub-layer (before the residual connection) The sum of embedding + positional encoding The attention weights themselves Computational complexity comparison # Self-attention RNN Convolution Per-layer complexity \\(O(n^2 \\cdot d)\\) \\(O(n \\cdot d^2)\\) \\(O(k \\cdot n \\cdot d^2)\\) Sequential operations \\(O(1)\\) \\(O(n)\\) \\(O(1)\\) Maximum path length \\(O(1)\\) \\(O(n)\\) \\(O(\\log_k n)\\) Self-attention pays a quadratic cost in sequence length (\\(n^2\\)), but every operation is parallelizable. 
For typical NLP sequences (\\(n \u0026lt; 1000\\)) at the time of publication this trade-off was clearly favorable. For very long sequences, variants like sparse attention reduce the complexity.\nResults and impact # The base Transformer (65M parameters, 6+6 layers, \\(d_{\\text{model}} = 512\\)) reached a BLEU score of 27.3 on English-to-German translation (WMT 2014), with a training cost of 3.3 days on 8 P100 GPUs. The big Transformer (213M parameters, \\(d_{\\text{model}} = 1024\\)) reached 28.4 BLEU on the same task — state of the art at the time of publication.\nBut the real impact of the paper goes far beyond machine translation. The Transformer architecture became the foundation of:\nBERT (Devlin et al., 2019): encoder-only, bidirectional pre-training with masked language modeling GPT (Radford et al., 2018-2023): decoder-only, autoregressive pre-training, scaled to hundreds of billions of parameters T5 (Raffel et al., 2020): encoder-decoder, every NLP task framed as text-to-text All current Large Language Models (Llama, Claude, Gemini, etc.) References # Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., \u0026amp; Polosukhin, I. (2017). Attention Is All You Need. Advances in Neural Information Processing Systems (NeurIPS). arXiv:1706.03762 Devlin, J., Chang, M., Lee, K., \u0026amp; Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. NAACL-HLT. arXiv:1810.04805 Radford, A., Narasimhan, K., Salimans, T., \u0026amp; Sutskever, I. (2018). Improving Language Understanding by Generative Pre-Training. OpenAI. Brown, T. B., et al. (2020). Language Models are Few-Shot Learners. NeurIPS. arXiv:2005.14165 Bahdanau, D., Cho, K., \u0026amp; Bengio, Y. (2015). Neural Machine Translation by Jointly Learning to Align and Translate. ICLR. arXiv:1409.0473 (The paper that introduced the attention mechanism for RNNs, direct precursor of self-attention.) Ba, J. L., Kiros, J. 
R., \u0026amp; Hinton, G. E. (2016). Layer Normalization. arXiv:1607.06450 Alammar, J. (2018). The Illustrated Transformer. jalammar.github.io (Excellent visual guide, complementary to this article.) Article written with the assistance of Claude (Anthropic). The interactive visualizations were developed during a study session on the original paper.\n","date":"2026-05-07","externalUrl":null,"permalink":"/blog/transformers-explainer/","section":"Blog \u0026 Articles","summary":"","title":"Understanding Transformers: from intuition to the math","type":"blog"},{"content":"","date":"2026-05-05","externalUrl":null,"permalink":"/tags/devops/","section":"Tags","summary":"","title":"Devops","type":"tags"},{"content":" Why immutable systems like Fedora Silverblue and Bottlerocket trade Linux\u0026rsquo;s old habits for atomic, predictable updates - and what /sysroot is really doing under the hood. I never imagined I\u0026rsquo;d have to reboot a system just to install a package.\nThat, after all, is part of why Linux quietly took over daily operations everywhere - cough Hello, Windows cough. Robust, simple, predictable. You install a package you\u0026rsquo;ve never heard of, run a command you copy-pasted from Stack Overflow in a hurry, and everything works. It isn\u0026rsquo;t magic: it\u0026rsquo;s the patient work of countless people who, over thirty years, have made an operating system genuinely solid.\nNow imagine this: after years of Ubuntu LTS humming along on your servers, somebody walks up and tells you, \u0026ldquo;for a more robust system, you need to reboot.\u0026rdquo; Sounds backwards. And yet that\u0026rsquo;s exactly what an immutable OS asks of you.\nIt isn\u0026rsquo;t done out of love for ruining your uptime. Behind that request lies a very specific idea: separate what\u0026rsquo;s sitting on the disk from what the system is using right now. 
Once you grasp that trick, you also start to understand why df -h on a Fedora Silverblue install shows you weird things - like a mount point called /sysroot you\u0026rsquo;d never seen before.\nThis article is the explanation I wish I\u0026rsquo;d had on the afternoon I first stared at that /sysroot and started wondering what the hell was going on.\nThe Hello World Boot Process # To understand what changes in an immutable OS, it\u0026rsquo;s worth reviewing how a regular Linux system boots. Let\u0026rsquo;s consider a typical Linux distro on the disk, such as Ubuntu or Debian.\nThe disk has (at least) two partitions we care about: a small /boot partition with the kernel and bootloader, and a root partition with the actual filesystem: /usr, /etc, /var, and whatever chaos has accumulated in /home.\nWe hit the power button and this is what happens:\nThe firmware (BIOS/UEFI) initializes the hardware and looks for a bootloader on the disk. The bootloader (GRUB, u-boot, systemd-boot) loads two things from the /boot partition: the kernel and a file called the initramfs. Both go into RAM. The kernel starts. The initramfs is a compressed CPIO archive: the kernel extracts it into RAM and uses it as the first /. Inside are just the bare essentials: busybox, init scripts, drivers needed to mount the real root partition. The initramfs has one job: identify the real root partition, mount it, and hand over control. When it succeeds, it performs a switch root: the system root becomes the partition on disk, the initramfs gets unmounted, its RAM freed. systemd starts and the boot proceeds normally. The key point: after boot, / is the disk. One-to-one mapping. The initramfs served as a trampoline and no longer exists. If you cat /etc/hostname, you\u0026rsquo;re reading a file that physically sits on the root partition. 
No surprises.\nThe Problem: Updating Without Praying # How many times have you fired off apt upgrade on a server you actually cared about and held your breath until it finished? Exactly.\nImagine being able to update the system atomically. Either you\u0026rsquo;re on the old version or the new one, never halfway. No more \u0026ldquo;the package manager got interrupted and now the system won\u0026rsquo;t boot.\u0026rdquo; And if the new version turns out broken, you want an instant rollback.\nA naïve solution: keep two complete copies of the OS on disk, and pick one at boot. But now you have a representation problem. How does this present itself to userspace? If you have /usr-version-A and /usr-version-B on disk, programs don\u0026rsquo;t know to look there. They expect /usr and nothing else.\nYou need a layer of indirection.\nThe Solution: Disk as Archive, / as Window # Immutable OSes (Fedora Silverblue, Fedora CoreOS, Fedora IoT, and others) use a technology called ostree that solves exactly this problem. Systems like openSUSE MicroOS take a conceptually similar approach with different technology (btrfs snapshots). The idea is simple and elegant.\nOn disk, the root partition no longer contains /usr, /etc, /var as plain directories. It contains a more elaborate structure:\n(root partition, raw view) ├── boot/ └── ostree/ ├── repo/ ← content-addressed objects └── deploy/\u0026lt;distro\u0026gt;/ ├── deploy/abc123.../ ← complete OS, version N │ ├── usr/ │ ├── etc/ │ └── ... ├── deploy/def456.../ ← complete OS, version N-1 └── var/ ← shared state On disk we have N complete installations of the OS, each in its own directory called a \u0026ldquo;deployment.\u0026rdquo; Each deployment directory is named after the SHA-256 checksum of the ostree commit it was checked out from. Think of each deploy/abc123/ as an unpacked container image on disk, because conceptually that\u0026rsquo;s exactly what it is.\nBut then how does the system make a program believe /usr exists? 
This is where /sysroot enters the picture.\n/sysroot: The Window onto the Warehouse # When an ostree system boots, it makes an elegant move. Instead of mounting the root partition on / like a regular Linux, it mounts it on /sysroot. Then it builds an artificial / made of bind mounts that reach into a specific deployment.\nAt runtime, from inside the system, we see this:\n/ ← artificial view, built at boot ├── usr ← bind mount → /sysroot/.../deploy/abc123/usr ├── etc ← /sysroot/.../deploy/abc123/etc (3-way merge) ├── var ← bind mount → /sysroot/.../var └── sysroot/ ← the actual disk, in its entirety └── ostree/... We get two views onto the same disk:\ncd / shows us only the active deployment, presented as a normal OS. All programs see /usr, /etc, /var exactly where they expect them. Nothing broken, nothing weird for anything running on top. cd /sysroot shows us the physical disk in its entirety, with all deployments, the ostree repo, the shared storage. It\u0026rsquo;s the \u0026ldquo;landlord\u0026rsquo;s\u0026rdquo; view. The trick is all here: a layer of indirection between the disk and the apparent root.\nThe initramfs in an Immutable System # There\u0026rsquo;s a natural question at this point: if / is a view built at boot, who builds it? The answer: the initramfs, with one extra binary called (in ostree) ostree-prepare-root.\nThe flow becomes:\nBootloader loads kernel + initramfs. Identical to the normal case. Kernel starts, initramfs is extracted into RAM. Identical. ostree-prepare-root mounts the disk\u0026rsquo;s root partition on /sysroot. Different: previously it would have been mounted directly on /. It reads the bootloader entries to figure out which deployment is the active one. It builds the bind mounts for /usr, /var, /etc pointing inside that deployment. Switch root into the constructed view. From here on, systemd starts and sees a \u0026ldquo;normal\u0026rdquo; /. 
It doesn\u0026rsquo;t know, and doesn\u0026rsquo;t care, that it\u0026rsquo;s artificial. The rest of the boot proceeds exactly like on any other Linux.\nThe Analogy That Makes It Click # If you\u0026rsquo;re familiar with containers, you already have the mental model: this is exactly how a container works, applied to the boot of the host itself.\nContainer Immutable OS Image layers stored in the container runtime Deployments stored under /sysroot/ostree/ Container sees only its own rootfs System sees only the active deployment as / Host sees all images From /sysroot you see all deployments Switching containers = switching rootfs Updating the OS = switching which deployment is active An immutable OS is, in a sense, a container that promoted itself to host system. The same conceptual primitive - separating the storage of \u0026ldquo;all possible views\u0026rdquo; from the \u0026ldquo;currently exposed view\u0026rdquo; - applied at boot rather than at process runtime.\nWhy You Should Care # Even if you\u0026rsquo;ll never run an immutable OS in production, understanding this model changes how you think about systems. Three intuitions I take away from the moment it clicked:\nThe filesystem isn\u0026rsquo;t the disk. We\u0026rsquo;ve always known this (through proc, tmpfs, bind mounts, containers) but immutable OSes make it the organizing principle. What you see mounted is an interpretation, not the ultimate truth.\nAtomicity requires indirection. Want atomic updates, rollback, A/B testing of the OS? You need to be able to swap the entire view at once. That\u0026rsquo;s impossible if the view is the storage. You need a layer in between that decouples \u0026ldquo;what\u0026rsquo;s on disk\u0026rdquo; from \u0026ldquo;what\u0026rsquo;s currently exposed.\u0026rdquo;\nContainers are Linux applied to itself. Container primitives - namespaces, mounts, overlayfs - were born to isolate processes. 
Immutable OSes show that the same primitives, applied at boot, give you transactionality across the entire system. It\u0026rsquo;s Linux rediscovering its own patterns in a new domain.\nThe next time you find yourself staring at df -h with a mysterious mount point called /sysroot, you\u0026rsquo;ll know what you\u0026rsquo;re looking at: a carefully constructed window onto a much bigger warehouse.\n","date":"2026-05-05","externalUrl":null,"permalink":"/blog/immutable_os/","section":"Blog \u0026 Articles","summary":"","title":"Do you want an immutable OS? You have to reboot to update it.","type":"blog"},{"content":"","date":"2026-05-05","externalUrl":null,"permalink":"/tags/fedora/","section":"Tags","summary":"","title":"Fedora","type":"tags"},{"content":"","date":"2026-05-05","externalUrl":null,"permalink":"/tags/immutable-infra/","section":"Tags","summary":"","title":"Immutable-Infra","type":"tags"},{"content":"","date":"2026-05-05","externalUrl":null,"permalink":"/tags/linux/","section":"Tags","summary":"","title":"Linux","type":"tags"},{"content":"","date":"2026-05-05","externalUrl":null,"permalink":"/tags/ostree/","section":"Tags","summary":"","title":"Ostree","type":"tags"},{"content":"","date":"2026-05-05","externalUrl":null,"permalink":"/tags/sysadmin/","section":"Tags","summary":"","title":"Sysadmin","type":"tags"},{"content":"","date":"2026-02-08","externalUrl":null,"permalink":"/tags/homelab/","section":"Tags","summary":"","title":"Homelab","type":"tags"},{"content":"","date":"2026-02-08","externalUrl":null,"permalink":"/tags/kubernetes/","section":"Tags","summary":"","title":"Kubernetes","type":"tags"},{"content":"","date":"2026-02-08","externalUrl":null,"permalink":"/tags/metallb/","section":"Tags","summary":"","title":"Metallb","type":"tags"},{"content":"","date":"2026-02-08","externalUrl":null,"permalink":"/tags/networking/","section":"Tags","summary":"","title":"Networking","type":"tags"},{"content":" Running a full OpenShift 4.16 cluster on bare 
 metal with a single NIC and one public IP using Proxmox, OVS bridging, pfSense, HAProxy, and MetalLB. TL;DR: This isn\u0026rsquo;t a step-by-step installation guide but a walkthrough of the homelab architecture I\u0026rsquo;m currently using and the decisions behind it. One physical server, one NIC, one public IP: an OpenShift 4.16 cluster running on Proxmox, with OVS bridging to multiplex IPs on a single port, pfSense as the internet gateway, and HAProxy for HTTP routing. I cover how I expose both web apps and TCP services (like databases) to the internet using MetalLB, cert-manager with Let\u0026rsquo;s Encrypt, and Cloudflare DNS. If you\u0026rsquo;re running a bare-metal cluster behind a single public IP, this should save you some debugging time.\n⚠️ Note: This post does not cover the installation or configuration of Proxmox, pfSense networking/bridging, or OpenShift cluster setup.\nThanks to Aleskandro for the help with the setup and debugging.\nThe Challenge # When I started learning Kubernetes, everything seemed simple on a local machine with kind, minikube, or similar tools. I understood how Services and NodePorts worked conceptually, but replicating a LoadBalancer service was a different story. There was no cloud provider to provision an external IP. Just my laptop and a Pending status that never resolved. It made me wonder: how do things actually work in the real world? What happens behind the scenes when you create a LoadBalancer in a cloud environment?\nThat question led me to build this lab. 
The goal: run a production-like OpenShift cluster that can serve both web applications and non-HTTP services (databases, message brokers) to the outside world, with proper TLS, DNS, and routing, all from a single physical server.\nHardware and Virtualization Layer # The entire lab runs on a single physical server using Proxmox VE as the hypervisor. Proxmox was chosen for its KVM-based maturity, built-in ZFS support, and a web UI that makes VM management practical without being heavyweight.\nThe following virtual machines are involved in the setup:\nVM Role IP pfSense Firewall, DNS, VPN, NAT WAN: public IP / LAN: 10.1.0.1 HAProxy HTTP reverse proxy 10.1.0.x (static) master0 OpenShift control plane 10.1.0.109 (static) compute0 OpenShift worker 10.1.0.110 (static) compute1 OpenShift worker 10.1.0.111 (static) All VMs communicate on the same internal network (10.1.0.0/24).\nThe Single NIC Problem and OVS # The server sits on a managed network where, by default, it receives a DHCP address that is only reachable from within the local network, not from the internet. A separate reserved public IP is available for external access. Both addresses need to coexist on the server\u0026rsquo;s single physical ethernet port.\nThis is where Open vSwitch (OVS) comes in. An OVS bridge configured in Proxmox can multiplex multiple network identities over a single physical interface. 
Through the bridge, different VMs bind to different IPs on the same port: Proxmox uses the internal DHCP address for its management dashboard, pfSense holds the reserved public IP for internet-facing traffic, and all internal VMs communicate on the 10.1.0.0/24 subnet, all through one ethernet cable.\nA standard Linux bridge could handle basic switching, but OVS provides cleaner separation between internal and external traffic, VLAN support, and the ability to mirror or inspect traffic when debugging network issues.\nReference: Proxmox OVS Bridge Documentation\nThe Single Public IP Problem # With OVS handling the physical layer, pfSense now owns the public IP. But that\u0026rsquo;s still a single entry point for all internet traffic. Every service, whether it\u0026rsquo;s a web app on port 443 or a PostgreSQL database on port 5432, must enter through the same address.\nEverything else, i.e. HAProxy, the OpenShift nodes, the databases, lives behind pfSense on private addresses. The question becomes: how do you route the right traffic to the right service when there\u0026rsquo;s only one door in?\nThe solution is layered:\nInternet │ ┌──────▼──────┐ │ pfSense │ │ (Public IP)│ └──────┬──────┘ │ ┌────────────┼────────────┐ │ │ │ NAT :80/:443 NAT :5432 NAT :other │ │ │ HAProxy MetalLB VIP MetalLB VIP │ 10.1.0.200 10.1.0.201 │ │ │ OCP Router Database Other TCP │ │ Services ┌────┴────┐ │ app1 app2 app3 │ (Routes) (Pods) pfSense acts as the gatekeeper. It performs NAT (Network Address Translation), forwarding incoming traffic on specific ports to the appropriate internal services. This is configured through port forwarding rules in pfSense\u0026rsquo;s firewall.\nThe key insight is that HTTP and non-HTTP traffic require fundamentally different approaches:\nHTTP/HTTPS (ports 80/443): Can be multiplexed. HAProxy reads the Host header and routes to different backends on the same port. Ten web apps can share ports 80 and 443. TCP (databases, MQTT, etc.): Cannot be multiplexed. 
Each service needs its own unique port. One port equals one service. Reference: pfSense NAT Port Forward Documentation\npfSense: The Network Core # pfSense is doing a lot of heavy lifting in this setup, serving as:\nFirewall: controls what traffic enters and leaves the network NAT gateway: maps public IP ports to internal services DNS resolver: provides internal DNS for all VMs (e.g., master0.lab.local → 10.1.0.109) DHCP server: manages IP assignments on the 10.1.0.0/24 subnet (range 10.1.0.10 - 10.1.0.100) VPN server: allows remote access to the internal network The VPN capability is particularly relevant to me. Since all the VMs have internal IPs only, VPN access through pfSense is the only way to manage the cluster remotely, whether that\u0026rsquo;s SSH to nodes, accessing the OpenShift web console, or connecting to internal services.\nNAT Rules # The NAT configuration in pfSense is minimal by design:\nWAN Port Destination Purpose 80 HAProxy HTTP traffic 443 HAProxy HTTPS traffic 5432 10.1.0.200 (MetalLB VIP) TimescaleDB Each new TCP service that needs internet exposure requires one additional NAT rule. This is intentional. It acts as a deliberate security gate. Not every internal service should be internet-facing, and requiring a manual pfSense rule ensures someone has made a conscious decision to expose it.\nReference: pfSense Documentation\nHAProxy: HTTP Traffic Routing # HAProxy sits between pfSense and the OpenShift router, handling all HTTP/HTTPS traffic. 
Its job is simple: receive connections on ports 80 and 443, and forward them to the OpenShift router pods on the worker nodes.\nThe OpenShift router (based on HAProxy itself, ironically) then uses the Host header to determine which Route resource matches and forwards traffic to the correct application pod.\nThis two-tier proxy setup might seem redundant, but it serves a purpose: the external HAProxy handles the pfSense-to-cluster boundary, while the OpenShift router handles in-cluster routing with all the OpenShift-native features (route annotations, TLS termination, path-based routing).\nFor non-HTTP TCP services, HAProxy can also be configured in mode tcp to forward raw TCP streams. However, with MetalLB in place, this is no longer necessary. TCP services get their own dedicated IP and bypass HAProxy entirely.\nReference: HAProxy Documentation\nOpenShift 4.16 on OKD # The cluster runs OKD (the community distribution of OpenShift) version 4.16, with a single control plane node and two worker nodes. This is a minimal but functional topology for a lab. Production clusters would typically have three control plane nodes for etcd quorum.\nOKD was chosen over vanilla Kubernetes for its opinionated platform features: integrated image registry, built-in monitoring stack, the Route resource for HTTP ingress, and OperatorHub for lifecycle management of cluster add-ons.\nThe Route Resource # OpenShift\u0026rsquo;s Route is the primary mechanism for exposing HTTP services. 
Unlike Kubernetes Ingress, Routes are a first-class OpenShift concept with native TLS termination, wildcard support, and integration with the cluster\u0026rsquo;s built-in router.\nA typical Route looks like:\napiVersion: route.openshift.io/v1 kind: Route metadata: name: my-app annotations: cert-manager.io/issuer-kind: ClusterIssuer cert-manager.io/issuer-name: letsencrypt-prod spec: host: my-app.mydomain.com tls: termination: edge insecureEdgeTerminationPolicy: Redirect to: kind: Service name: my-app port: targetPort: http Reference: OKD Documentation\nMetalLB: Load Balancing for Bare Metal # This is where it gets interesting. In cloud environments, creating a Kubernetes Service of type: LoadBalancer automatically provisions a cloud load balancer with a public IP. On bare metal, that request just sits in Pending forever because there\u0026rsquo;s nothing to fulfil it.\nMetalLB solves this. It\u0026rsquo;s a load balancer implementation for bare metal Kubernetes clusters that assigns real, routable IPs to LoadBalancer services from a configured pool.\nHow It Works # MetalLB runs in L2 mode in this setup (as opposed to BGP mode). In L2 mode, MetalLB responds to ARP requests for the virtual IPs it manages. When a LoadBalancer service is created, MetalLB picks an IP from the pool and one of the speaker pods begins answering ARP queries for that IP. Traffic arrives at the node running that speaker, and kube-proxy routes it to the correct pod.\nInstallation on OpenShift # MetalLB is available as an operator from OperatorHub. After installing the operator, three resources need to be created:\n# 1. MetalLB instance apiVersion: metallb.io/v1beta1 kind: MetalLB metadata: name: metallb namespace: metallb-system --- # 2. IP address pool (must not overlap with DHCP range) apiVersion: metallb.io/v1beta1 kind: IPAddressPool metadata: name: lab-pool namespace: metallb-system spec: addresses: - 10.1.0.200-10.1.0.230 --- # 3. 
L2 advertisement apiVersion: metallb.io/v1beta1 kind: L2Advertisement metadata: name: lab-l2adv namespace: metallb-system spec: ipAddressPools: - lab-pool The IP pool (10.1.0.200-10.1.0.230) was deliberately placed above the DHCP range (10.1.0.10-100) and the static node IPs (10.1.0.109-111) to avoid conflicts. The 31 available addresses should be more than enough for a lab.\nUsing It # Once MetalLB is running, exposing a TCP service is a single manifest:\napiVersion: v1 kind: Service metadata: name: database-service namespace: app-namespace spec: type: LoadBalancer selector: app: my-database ports: - protocol: TCP port: 5432 targetPort: 5432 MetalLB automatically assigns an IP (e.g., 10.1.0.200), and the service is immediately reachable at 10.1.0.200:5432 from anywhere on the internal network. To make it internet-accessible, a single pfSense NAT rule forwards port 5432 from the public IP to 10.1.0.200.\nA Note on the Installation # During the MetalLB installation, the controller and speaker pods can get stuck in ContainerCreating for several reasons. In my case, the root cause turned out to be a stale leader lease in the OpenShift service-ca operator, i.e., the component responsible for generating TLS serving certificates for internal services. Because the lease was stale and had not been renewed, the operator had stopped processing certificate requests cluster-wide.\nThe fix was to delete the stale lease and restart the service-ca pods:\noc delete lease service-ca-controller-lock -n openshift-service-ca oc delete pods -n openshift-service-ca --all After that, the service-ca operator acquired a new lease, generated the missing TLS secrets for MetalLB, and the pods started normally. 
This is worth mentioning because a dead service-ca operator can silently break many things beyond MetalLB (any operator or service that relies on auto-generated serving certificates will fail).\nReference: MetalLB Documentation | MetalLB Operator for OpenShift\ncert-manager: Automated TLS with Let\u0026rsquo;s Encrypt # With HTTP services exposed via Routes, the next challenge is TLS. The OpenShift router has a default wildcard certificate for its internal domain (e.g. *.apps.mylab.lab), but external access uses a different domain (*.apps.mycustomdomain.com). Browsers correctly reject the mismatched certificate.\nThe solution is cert-manager with Let\u0026rsquo;s Encrypt, using Cloudflare DNS-01 challenges for domain validation.\nWhy DNS-01? # Let\u0026rsquo;s Encrypt supports two challenge types:\nHTTP-01: Let\u0026rsquo;s Encrypt makes an HTTP request to your domain to verify ownership. Requires port 80 to be open and routable. DNS-01: Let\u0026rsquo;s Encrypt checks for a specific TXT record in your DNS. No inbound traffic required. DNS-01 was the right choice because it works regardless of the proxy setup, supports wildcard certificates, and integrates cleanly with Cloudflare\u0026rsquo;s API.\nSetup # The cert-manager operator is installed from OperatorHub. After creating the CertManager operand instance, three components are needed:\n1. Cloudflare API Token (stored as a Kubernetes secret):\noc create secret generic cloudflare-api-token \\ -n cert-manager \\ --from-literal=api-token=\u0026lt;YOUR_TOKEN\u0026gt; 2. 
ClusterIssuer: configures Let\u0026rsquo;s Encrypt with Cloudflare as the DNS solver:\napiVersion: cert-manager.io/v1 kind: ClusterIssuer metadata: name: letsencrypt-prod spec: acme: server: https://acme-v02.api.letsencrypt.org/directory email: your-email@example.com privateKeySecretRef: name: letsencrypt-prod-account-key solvers: - dns01: cloudflare: apiTokenSecretRef: name: cloudflare-api-token key: api-token selector: dnsZones: - mycustomdomain.com 3. OpenShift Routes integration: a small controller that watches for annotated Routes and manages certificates for them:\noc apply -f \u0026lt;(helm template openshift-routes -n cert-manager \\ oci://ghcr.io/cert-manager/charts/openshift-routes \\ --set omitHelmLabels=true) With this in place, any Route with the following annotations automatically gets a Let\u0026rsquo;s Encrypt certificate:\nannotations: cert-manager.io/issuer-kind: ClusterIssuer cert-manager.io/issuer-name: letsencrypt-prod The certificate lifecycle (request, validation, issuance, renewal) is fully automated.\nCloudflare SSL/TLS Considerations # One gotcha: Cloudflare\u0026rsquo;s free Universal SSL certificate only covers one level of subdomain depth (*.mycustomdomain.com). Multi-level subdomains like example.apps.mycustomdomain.com are not covered. If you\u0026rsquo;re using Cloudflare proxy (orange cloud), the browser sees Cloudflare\u0026rsquo;s edge certificate first, and if it doesn\u0026rsquo;t match, the connection fails before traffic even reaches your origin.\nThe solution is to use DNS-only mode (grey cloud) for these deep subdomains, letting the browser connect directly to your origin where the valid Let\u0026rsquo;s Encrypt certificate is served. 
Alternatively, Cloudflare\u0026rsquo;s Advanced Certificate Manager covers multi-level subdomains.\nReference: cert-manager Documentation | Let\u0026rsquo;s Encrypt | cert-manager OpenShift Routes\nDNS Architecture # DNS operates at two layers in this setup:\nExternal DNS (Cloudflare) # Cloudflare manages the public mycustomdomain.com domain. A wildcard record (*.mycustomdomain.com) points to the pfSense public IP. For subdomains deeper than one level (like *.apps.mycustomdomain.com), DNS-only mode is used so that Let\u0026rsquo;s Encrypt certificates from the origin are served directly.\nInternal DNS (pfSense) # pfSense\u0026rsquo;s DNS Resolver handles internal name resolution, allowing VPN clients and internal VMs to reach services by name without routing through the public IP.\nThe important thing to remember: DNS only resolves names to IPs, it doesn\u0026rsquo;t know about ports. This is why HTTP multiplexing via the Host header is so valuable (it lets many services share a single IP and port pair). TCP services don\u0026rsquo;t have this luxury and each requires a unique port.\nPutting It All Together # Here\u0026rsquo;s the complete decision flow for exposing a new service:\nHTTP Service # Deploy the application in OpenShift Create a Service (type: ClusterIP) Create a Route with cert-manager annotations cert-manager requests and injects a Let\u0026rsquo;s Encrypt certificate automatically Traffic flows: Internet → pfSense:443 → HAProxy → OCP Router → Pod TCP Service # Deploy the application in OpenShift Create a Service (type: LoadBalancer) MetalLB assigns a VIP from the pool automatically Create a pfSense NAT rule for the desired port → MetalLB VIP Traffic flows: Internet → pfSense:port → MetalLB VIP → Pod The only manual step for TCP services is the pfSense NAT rule, which is an intentional security decision (every internet-facing port should be a deliberate choice).\nLessons Learned # Check your service-ca operator. 
A stale leader lease can silently break certificate generation cluster-wide. Monitor the lease renewal time periodically.\nUnderstand Cloudflare\u0026rsquo;s SSL depth limits. Free Universal SSL covers *.domain.com but not *.subdomain.domain.com. Plan your hostname scheme accordingly or budget for Advanced Certificate Manager.\nMetalLB L2 mode is simple and sufficient for homelabs. BGP mode offers more features but adds complexity that isn\u0026rsquo;t justified in a single-network lab.\nOne public IP is not a limitation, it\u0026rsquo;s an architecture. With proper NAT, reverse proxying, and service mesh tooling, a single IP can serve dozens of services. The constraint forces clean thinking about traffic flow and security boundaries.\nReferences # Proxmox VE Documentation Proxmox OVS Bridge pfSense Documentation OKD / OpenShift Documentation MetalLB Project MetalLB on OpenShift HAProxy Documentation cert-manager Documentation cert-manager OpenShift Routes Integration Let\u0026rsquo;s Encrypt Cloudflare SSL/TLS Modes ","date":"2026-02-08","externalUrl":null,"permalink":"/blog/okd_homelab/","section":"Blog \u0026 Articles","summary":"","title":"OKD Homelab: Single Public IP, Full Cluster","type":"blog"},{"content":"","date":"2026-02-08","externalUrl":null,"permalink":"/tags/openshift/","section":"Tags","summary":"","title":"Openshift","type":"tags"},{"content":"","date":"2026-02-08","externalUrl":null,"permalink":"/tags/proxmox/","section":"Tags","summary":"","title":"Proxmox","type":"tags"},{"content":"I like understanding why systems break, and building them so they don\u0026rsquo;t. That curiosity led me from a PhD in Distributed Systems at the University of Catania to maintaining production Kubernetes clusters.\nThese days I split my time between backend engineering, infrastructure automation, and a growing interest in LLMs and their applications. 
I\u0026rsquo;m currently interested in fine-tuning LLMs for domain-specific tasks and figuring out how to deploy them without melting the hardware budget.\nWhen I\u0026rsquo;m not debugging YAML or staring at Prometheus dashboards, I\u0026rsquo;m probably over-engineering my homelab or convincing myself that this side project will actually get finished.\nSkills # Languages Python, Golang, Java, Shell, Swift, C Infrastructure Docker, Kubernetes, OpenShift, Proxmox, Nginx Automation Ansible, Terraform, Helm, GitLab CI/CD, Operators Observability Prometheus, Grafana, Jaeger, Istio Data PostgreSQL, MySQL, Redis, Kafka Experience # Software Engineer @ Vyxel, Italy Jan 2026 – Now\nProvisioning and managing bare-metal infrastructure to deliver reliable and scalable cloud services. PhD Research Consultant @ Aucta Cognitio SRL, Italy Oct 2022 – Nov 2025\nDesigned cloud-native architectures leveraging Kubernetes for orchestration of distributed workloads. Developed MLOps pipelines using Kubeflow and Apache Airflow for automated ML workflows. Software \u0026amp; AI Engineer @ Sangiorgi SRL, Italy Mar 2025 – Nov 2025\nLed COAT project (EU-funded, NGI Sargasso / Horizon Europe): on-premise LLM inference with Ollama/vLLM, fine-tuned Qwen3 with Unsloth. Visiting PhD Researcher @ Telecom SudParis, Palaiseau, France Jan 2025 – Jul 2025\nBuilt a Kubernetes Event Generator to simulate workloads, node failures, and scheduler configurations. Research on distributed systems scheduling and resource management in cloud-native environments. Teaching Assistant @ University of Catania, Italy Sep 2022 – Feb 2025\nDistributed Systems \u0026amp; Big Data (LM-32): tutoring on distributed computing and cluster technologies. Programming Techniques for Distributed Systems (LM-27): lab sessions on distributed programming. Object Oriented Programming (L-8): lab sessions on Java programming, student exercises and projects. 
Software Engineer @ Sangiorgi SRL, Catania, Italy May 2021 – May 2022\nDeveloped microservice backends (Python, Golang) for IoT and mobile apps (WeeNet, Ruppu, YouSpeed). Managed Kubernetes deployments, GitLab CI/CD pipelines, and infrastructure with Prometheus monitoring. Education # PhD in Computer Engineering @ University of Catania, Italy 2022 – 2025 Thesis: Decision Support System in Quality of Service Application Performance Management.\nProfessional License @ University of Catania, Italy 2022 Ingegnere dell\u0026rsquo;Informazione Sez. A (95/100)\nMaster\u0026rsquo;s Degree in Computer Engineering @ University of Catania, Italy, Grade 110/110 cum laude 2018 – 2022 Thesis: Resource Allocation with Multiple Offloading Options in Cloud-Edge Scenarios.\nBachelor\u0026rsquo;s Degree in Computer Engineering @ University of Catania, Italy 2014 – 2018 Thesis: Sentiment Analysis on Web Comments Using Azure Machine Learning.\nPublications # Gollo, M., Morana, G., Genovese, A., Di Stefano, A. (2025). SLADE: An SLA-Driven AI-Based Decision Engine for QoS Management in the Edge Cloud Continuum. Computer Communications. (under review) Gollo, M., Sangiorgi, A., Morana, G., Di Martino, M., Esposito, F. (2025). Quantifying Privacy Risk in Online Agreements with COAT: An LLM Approach. WETICE 2025, IEEE. Di Stefano, A., Genovese, A., Gollo, M., Morana, G. (2025). Latency-Constrained Overlay Networks for QoS Assurance in the Edge-Cloud Continuum. WETICE 2025, IEEE. Davoli, G., Grasso, C., et al., Gollo, M., Morana, G., et al. (2025). Demonstration of Dynamic Service-Chain Deployment in a Multi-Layer FANET Edge-Computing Architecture. NoF 2025. Di Stefano, A., Gollo, M., Morana, G. (2024). An SLA-driven, AI-based QoS Manager for Edge Cloud Continuum. WETICE 2024, IEEE. Di Stefano, A., Gollo, M., Morana, G. (2024). Improving QoS Management Using Associative Memory and Event-Driven Transaction History. Information, 15(9), 569. 
Conferences \u0026amp; Workshops # Grifin Workshop on AI, Networks \u0026amp; Cybersecurity — Sorbonne University \u0026amp; LINCS, Paris, France (Apr 22–23, 2025) AI, Science \u0026amp; Society Conference (AI Summit) — Institut Polytechnique de Paris, Palaiseau, France (Feb 6–7, 2025) WETICE: International Conference on Enabling Technologies — IEEE, Catania, Italy (Jul 23–25, 2025) WETICE: International Conference on Enabling Technologies — IEEE, Reggio Emilia, Italy (Jun 26–28, 2024) RESTART Plenary Dissemination Workshop — Catania, Italy (2025) VIII Mediterranean School of Complex Networks (PhD School) — Catania, Italy (Jun 25–30, 2023) ","externalUrl":null,"permalink":"/about/","section":"About","summary":"","title":"About","type":"about"},{"content":"","externalUrl":null,"permalink":"/authors/","section":"Authors","summary":"","title":"Authors","type":"authors"},{"content":"","externalUrl":null,"permalink":"/projects/","section":"Projects","summary":"","title":"Projects","type":"projects"},{"content":"","externalUrl":null,"permalink":"/series/","section":"Series","summary":"","title":"Series","type":"series"}]