<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[Samuele Resca]]></title><description><![CDATA[Samuele Resca]]></description><link>https://samueleresca.net/</link><image><url>https://samueleresca.net/favicon.png</url><title>Samuele Resca</title><link>https://samueleresca.net/</link></image><generator>Ghost 5.33</generator><lastBuildDate>Thu, 23 Apr 2026 06:53:08 GMT</lastBuildDate><atom:link href="https://samueleresca.net/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[Implementing Thread State Analysis (TSA)]]></title><description><![CDATA[This post goes through some of the exploration I did around Thread State Analysis (TSA) and proposes a solution to one of the exercises of the System Performance book where the ask is to implement a TSA tool for Linux.]]></description><link>https://samueleresca.net/implementing-thread-state-analysis-methodology/</link><guid isPermaLink="false">673f6f6a827b68149a69ac2d</guid><category><![CDATA[performance]]></category><dc:creator><![CDATA[Samuele Resca]]></dc:creator><pubDate>Fri, 13 Dec 2024 21:09:24 GMT</pubDate><content:encoded><![CDATA[<!--kg-card-begin: markdown--><p>Lately, I have been reading Systems Performance: Enterprise and the Cloud by Brendan Gregg<sup class="footnote-ref"><a href="#fn1" id="fnref1">[1]</a></sup>. Among other things, the book provides a set of methodologies for application performance analysis. One of these is Thread State Analysis (TSA).</p>
<p>This post describes some of the exploration I did around TSA. The methodology has been covered by Gregg multiple times<sup class="footnote-ref"><a href="#fn2" id="fnref2">[2]</a></sup><sup class="footnote-ref"><a href="#fn3" id="fnref3">[3]</a></sup>. This post proposes a solution to one of the book&apos;s exercises which asks to implement a TSA tool for Linux.</p>
<p>The code presented later in the post is available at <a href="https://github.com/samueleresca/tsastat">https://github.com/samueleresca/tsastat</a>.</p>
<h2 id="overview">Overview</h2>
<p>The Thread State Analysis (TSA) method measures the time threads spend in certain states. It then investigates each state, from most to least frequent<sup class="footnote-ref"><a href="#fn2" id="fnref2:1">[2:1]</a></sup>. It can be used as a first step to analyse application performance.</p>
<p>The methodology can be used independently of the type of application you want to investigate. Furthermore, it can be applied as an initial approach to guide your investigation in the right direction.</p>
<p>The methodology proposes a set of thread states<sup class="footnote-ref"><a href="#fn4" id="fnref4">[4]</a></sup> that most operating systems have. For each state, it provides some follow-up actions to get further insights into the state of the system. Below are the states proposed by the methodology:</p>
<ul>
<li><code>Running</code>: the time spent by the thread running on-CPU. If this value is high, investigate the application with profiling tools to determine where the CPU cycles are spent. Note that, since spin locks run on-CPU, they add time in this state.</li>
<li><code>Runnable</code>: the time spent by the thread waiting to be scheduled. A high value in this state usually implies that the application needs more CPU, so review the CPU resources and the limits configured for the application or the system.</li>
<li><code>Swapping</code>: the time spent by the thread in a swapped-out, wait-for-residency state. A long time spent in this state usually points to memory utilization issues.</li>
<li><code>Sleeping</code>: the time spent by the thread waiting for any I/O. A high value here might suggest resource limitations (e.g. disk, network). The next step is to understand which I/O is the bottleneck and to perform off-CPU analysis<sup class="footnote-ref"><a href="#fn5" id="fnref5">[5]</a></sup> to gather further details.</li>
<li><code>Lock</code>: the time spent by the thread waiting on lock contention. A high value here suggests investigating which lock is blocking the thread and why it is blocking; further lock analysis is required.</li>
</ul>
<p>A prerequisite for capturing these states is identifying how they are actually implemented in the operating system. For Linux systems, we can gather the possible states of a thread by looking at <code>linux/sched.h</code>. More on this in a later section.</p>
<h2 id="implementing-tsa-on-linux-with-bpftrace">Implementing TSA on Linux with bpftrace</h2>
<p>As mentioned in the intro, this section proposes an implementation of the Thread State Analysis on Linux using bpftrace. To quote the exercise in the book:</p>
<blockquote>
<p><em>&quot;Develop a tool for Linux called <code>tsastat(8)</code> that prints columns for multiple thread state analysis states, with time spent in each. This can behave similarly to pidstat(1) and produce a rolling output.&quot;</em></p>
</blockquote>
<p>All the code described next is available at <a href="https://github.com/samueleresca/tsastat">https://github.com/samueleresca/tsastat</a>. The scripts require bpftrace to be installed on the system. More detailed instructions are in the README of the repo.</p>
<p>A similar tool for FreeBSD was already implemented by Gregg<sup class="footnote-ref"><a href="#fn6" id="fnref6">[6]</a></sup>.</p>
<h3 id="calculating-on-cpu-time-running-state">Calculating on-CPU time (<code>Running</code> state)</h3>
<p>The on-CPU time is the time spent by the thread on the CPU. We can use the <code>tracepoint:sched:sched_switch</code> tracepoint that is triggered at every scheduler switch event. Each tracepoint comes with some arguments that are passed in when the action is executed. It is possible to list the arguments using the following command:</p>
<pre><code>$ bpftrace -lv &quot;tracepoint:sched:sched_switch&quot;
tracepoint:sched:sched_switch
    char prev_comm[16]
    pid_t prev_pid
    int prev_prio
    long prev_state
    char next_comm[16]
    pid_t next_pid
    int next_prio
</code></pre>
<p>In this case, the <code>prev_</code>-prefixed fields track the information of the task that switched out from the CPU, while the <code>next_</code>-prefixed fields refer to the task that is about to run.</p>
<p>Below is a bpftrace script that gathers the on-CPU time for each thread:</p>
<script src="https://gist.github.com/samueleresca/2ef2907401b636a2e3cf527bf95a41a8.js"></script>
<p>At each <code>sched_switch</code> the script calculates the interval between the current time (<code>nsecs</code>) and the start time of the previous task. In this way, we can keep track of the on-CPU interval. In addition, it sets the start time for the <code>next_pid</code> for the subsequent interval calculations.</p>
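<p>The bookkeeping performed by the script can also be illustrated outside bpftrace. Below is a hypothetical Python sketch that replays a simplified stream of <code>sched_switch</code> events (tuples of timestamp, <code>prev_pid</code>, <code>next_pid</code>) and accumulates the on-CPU time per thread; the map names mirror the ones used in the script:</p>

```python
from collections import defaultdict

def on_cpu_times(switch_events):
    """Replays (timestamp_ns, prev_pid, next_pid) sched_switch events."""
    start_ts = {}              # pid -> timestamp of the last switch-in
    cpu_ns = defaultdict(int)  # pid -> accumulated on-CPU nanoseconds
    for ts, prev_pid, next_pid in switch_events:
        if prev_pid in start_ts:
            # Close the on-CPU interval of the task that switched out.
            cpu_ns[prev_pid] += ts - start_ts.pop(prev_pid)
        # Open the interval for the task that is about to run.
        start_ts[next_pid] = ts
    return dict(cpu_ns)

# pid 1 runs 0-100 and 250-300, pid 2 runs 100-250 (nanoseconds).
events = [(0, 0, 1), (100, 1, 2), (250, 2, 1), (300, 1, 0)]
print(on_cpu_times(events))  # {1: 150, 2: 150}
```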
<h3 id="tracking-runnable-state-waiting-on-queue">Tracking <code>Runnable</code> state (Waiting on Queue)</h3>
<p>To calculate the <code>Runnable</code> state we can introduce two new tracepoints:</p>
<ul>
<li><code>tracepoint:sched:sched_wakeup</code> triggers when the scheduler wakes up a task. It typically occurs when a task transitions from a sleeping state to a runnable state.</li>
<li><code>tracepoint:sched:sched_wakeup_new</code> triggers when a new task transitions for the first time into a runnable state.</li>
</ul>
<p>Below is a snippet to compute the runnable state time:</p>
<script src="https://gist.github.com/samueleresca/9c46655d335fa0ba5ad2a48536ab874c.js"></script>
<p>The implementation listens for the <code>wakeup</code> and <code>wakeup_new</code> tracepoints to track the timestamp of a task moved into a runnable state. Next, it uses the previously introduced <code>tracepoint:sched:sched_switch</code> to calculate the delta between the runnable and running states. This will give us the <code>Runnable</code> time for each thread.</p>
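<p>The same idea can be sketched in Python with a hypothetical, simplified event stream: a wakeup stores the timestamp, and the following switch-in of the same pid closes the run-queue wait:</p>

```python
def runqueue_times(events):
    """Replays (kind, timestamp_ns, pid) events, kind being one of
    'wakeup', 'wakeup_new' or 'switch_in'."""
    wakeup_ts = {}  # pid -> timestamp the task became runnable
    runq_ns = {}    # pid -> accumulated run-queue wait in nanoseconds
    for kind, ts, pid in events:
        if kind in ("wakeup", "wakeup_new"):
            wakeup_ts[pid] = ts
        elif kind == "switch_in" and pid in wakeup_ts:
            runq_ns[pid] = runq_ns.get(pid, 0) + ts - wakeup_ts.pop(pid)
    return runq_ns

events = [("wakeup", 10, 7), ("switch_in", 40, 7),       # pid 7 waited 30 ns
          ("wakeup_new", 50, 8), ("switch_in", 120, 8)]  # pid 8 waited 70 ns
print(runqueue_times(events))  # {7: 30, 8: 70}
```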
<h3 id="tracking-the-sleeping-time-off-cpu">Tracking the <code>Sleeping</code> time (off-CPU)</h3>
<p>Sleeping times indicate the task is off-CPU waiting for an event. The tracking of the sleeping times can be built on top of the previously used tracepoints.</p>
<p>The <code>tracepoint:sched:sched_switch</code> action verifies the state of the previous thread. If the thread that was switched out is not in the <code>TASK_RUNNING</code> state, we can save the timestamp and the state:</p>
<pre><code>tracepoint:sched:sched_switch
{
  ...
  // Store sleep state for previous task
  // in case it is not TASK_RUNNING
  if ($prev_state &gt; 0)
  {
    @sleeping_ts[$prev_pid] = nsecs;
    @sleeping_state[$prev_pid] = $prev_state;
  }
}
</code></pre>
<p>The code above keeps track of the sleeping timestamps and states of each thread. The next step is to record the sleep time every time the thread wakes up. This can be done by extending the <code>sched_wakeup</code> and <code>sched_wakeup_new</code> tracepoints.</p>
<p>Below is a full snippet showing the changes.</p>
<script src="https://gist.github.com/samueleresca/f5efa6e8c21d155c684a1cbb309f0774.js"></script>
<p>The implementation uses the states in <code>linux/sched.h</code><sup class="footnote-ref"><a href="#fn7" id="fnref7">[7]</a></sup> to check the thread&apos;s state saved during the scheduler switch. It computes the delta and saves the results in the map corresponding to the sleeping state. Finally, once the delta is saved, it proceeds by deleting the key.</p>
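<p>The same save-then-close pattern can be sketched in Python (hypothetical, simplified events): the switch-out records both the timestamp and the state, and the wakeup closes the interval under a key that includes the state:</p>

```python
def sleep_times(events):
    """Replays (kind, timestamp_ns, pid, state) events, kind being
    'switch_out' or 'wakeup'; state is the raw prev_state value."""
    sleep_ts, sleep_state = {}, {}
    totals = {}  # (pid, state) -> accumulated sleeping nanoseconds
    for kind, ts, pid, state in events:
        if kind == "switch_out" and state != 0:
            # Not TASK_RUNNING: the task is going to sleep.
            sleep_ts[pid], sleep_state[pid] = ts, state
        elif kind == "wakeup" and pid in sleep_ts:
            key = (pid, sleep_state.pop(pid))
            totals[key] = totals.get(key, 0) + ts - sleep_ts.pop(pid)
    return totals

# pid 3 sleeps in TASK_INTERRUPTIBLE (0x1) from t=100 to t=400.
events = [("switch_out", 100, 3, 0x1), ("wakeup", 400, 3, 0x0)]
print(sleep_times(events))  # {(3, 1): 300}
```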
<p>The sleeping states tracked in the implementation are:</p>
<ul>
<li><code>TASK_INTERRUPTIBLE</code> state indicates the task is sleeping (or waiting). The task will move to the <code>TASK_RUNNING</code> state once some event occurs. The task can be awakened by signals. This state is for tasks that can safely be interrupted without any risk of causing inconsistencies. An example might be if the application calls functions like <code>poll()</code> or <code>select()</code> to wait for input from the keyboard.</li>
<li>The <code>TASK_UNINTERRUPTIBLE</code> state refers to a sleeping task, similarly to <code>TASK_INTERRUPTIBLE</code>, but user-space signals cannot wake it up. An example is a call to a blocking function like <code>read()</code> to fetch data from the disk. In that case, the system puts the task in an uninterruptible state because an interruption could cause data corruption or inconsistency.</li>
<li><code>TASK_RUNNING</code> state indicates the task is currently executing or is ready to execute on a CPU.</li>
<li><code>TASK_STOPPED</code> state means the task has been stopped. This is usually due to a SIGSTOP signal or a similar signal that pauses the task.</li>
</ul>
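<p>For reference, the classification against <code>prev_state</code> can be sketched as a small decoder. The constants below are taken from <code>include/linux/sched.h</code> of recent kernels; the values have moved around across kernel versions, so verify them against your own header:</p>

```python
# Task state bits as defined in include/linux/sched.h (recent kernels).
TASK_INTERRUPTIBLE = 0x0001
TASK_UNINTERRUPTIBLE = 0x0002
TASK_STOPPED = 0x0004  # __TASK_STOPPED in the header

def classify(prev_state):
    """Maps a raw prev_state value to a state name."""
    if prev_state == 0:
        return "TASK_RUNNING"  # still runnable: the task was preempted
    if prev_state & TASK_INTERRUPTIBLE:
        return "TASK_INTERRUPTIBLE"
    if prev_state & TASK_UNINTERRUPTIBLE:
        return "TASK_UNINTERRUPTIBLE"
    if prev_state & TASK_STOPPED:
        return "TASK_STOPPED"
    return "OTHER"

print(classify(0x0))  # TASK_RUNNING
print(classify(0x2))  # TASK_UNINTERRUPTIBLE
```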
<p>The next section covers the implementation of tracking lock times.</p>
<h3 id="tracking-lock-time-locked-state">Tracking lock time (<code>Locked</code> state)</h3>
<p>This section proposes two implementations to calculate the time spent on lock contention. The first listens for the futex syscall<sup class="footnote-ref"><a href="#fn8" id="fnref8">[8]</a></sup> (&quot;Fast Userspace muTEX&quot;). The implementation uses the following tracepoints:</p>
<ul>
<li><code>tracepoint:syscalls:sys_enter_futex</code> triggered on the lock attempt.</li>
<li><code>tracepoint:syscalls:sys_exit_futex</code> triggered when the futex syscall returns (e.g. after the wait completes).</li>
</ul>
<p>Similarly to the previous states, we can keep track of the lock attempts by saving the timestamps in a map and calculating the delta on the lock release.</p>
<p>Below is the implementation:</p>
<script src="https://gist.github.com/samueleresca/91a7793206ded928d13cc0e215aca9a9.js"></script>
<p>When <code>tracepoint:syscalls:sys_enter_futex</code> events fire, the action gets executed in case of <code>FUTEX_WAIT</code> operations (see the predicate declared in the probe)<sup class="footnote-ref"><a href="#fn9" id="fnref9">[9]</a></sup>. The action tracks the thread id and the initial timestamp. When <code>tracepoint:syscalls:sys_exit_futex</code> events fire, the action checks if the initial timestamp for that thread has been saved (see predicate). It then calculates the time delta. Finally, it deletes the thread key from the <code>lock_ts</code> map.</p>
<p>Futexes operate in two phases:</p>
<ul>
<li><em>fast path</em>: threads can lock and unlock resources directly in user space without entering the kernel if there&#x2019;s no contention.</li>
<li><em>slow path</em>: if contention occurs (e.g., a thread needs to wait for a lock to be released), futexes involve the kernel to handle waiting and waking up threads.</li>
</ul>
<p>Note that the tracepoints above capture the slow path. On top of that, futexes are a general synchronization mechanism used in the kernel, so the timing above might be misleading: it might also capture other synchronization operations, not just locks.</p>
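<p>To exercise the script, a small program that deliberately contends on a lock can be used. Below is a hypothetical driver in Python: on Linux, CPython locks are futex-backed, so contended acquisitions go through the <code>FUTEX_WAIT</code> slow path traced above:</p>

```python
import threading
import time

lock = threading.Lock()

def worker():
    # Contended acquisitions force the threads into the futex slow path.
    for _ in range(100):
        with lock:
            time.sleep(0.0001)  # hold the lock briefly to force waiting

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("done")
```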
<p>The second implementation uses the <code>tracepoint:lock:contention_begin</code> and <code>tracepoint:lock:contention_end</code><sup class="footnote-ref"><a href="#fn10" id="fnref10">[10]</a></sup> tracepoints. This has a similar pattern to the previous one:</p>
<script src="https://gist.github.com/samueleresca/c8d4e2cfa8146261bdd87918de834ad6.js"></script>
<p>The contention time by thread is measured at <code>contention_begin</code>. The delta is calculated using the <code>contention_end</code> time when the lock contention ends.</p>
<h3 id="displaying-the-results">Displaying the results</h3>
<p>We can use the <code>END</code> built-in event to display the final results. The action bound to the event has access to all the maps populated previously (see the previous sections):</p>
<script src="https://gist.github.com/samueleresca/9c65ec73ffd67d4250a1e371557a451d.js"></script>
<p>The script first prints the table header. Then it iterates over the maps to print the per-thread time aggregation for each state, converting the times to milliseconds.</p>
<p>By default, bpftrace prints the content of the non-empty maps at exit. To avoid that behaviour, it is possible to specify the <code>print_maps_on_exit=0</code> configuration<sup class="footnote-ref"><a href="#fn11" id="fnref11">[11]</a></sup>.</p>
<p>As a final result, you will get something similar to the pidstat output:</p>
<pre><code>user@ubuntu-vm:~/Projects$ tsastat.bt
Attaching 7 probes...
Tracing scheduler events... Hit Ctrl-C to end.
^C
Thread state analysis:
COMM             PID       CPU   RUNQ    SLP    USL    SUS    LCK    DEA
futex_contentio  32834       2    201    201      0      0    225      0
xdg-desktop-por  2401        7      0  58985      0      0      0      0
futex_contentio  33769       2    209    348      0      0    384      0
CacheThread_Blo  2727        0      0      0      0      0      0      0
futex_contentio  33207       3    352    402      0      0    503      0
futex_contentio  32026       2    138    160      0      0    166      0
kworker/u4:5     31410      19      0      0    110      3      0   3224
gmain            2410        1      0      0      0      0      0      0
ThreadPoolForeg  2744       32      0    746      0      0   7077      0
</code></pre>
<p>For better readability, you can implement rolling output: replace the <code>END</code> built-in event with <code>interval:5s</code> and clear the terminal output each time.</p>
<h2 id="wrap-up">Wrap up</h2>
<p>The post proposes an implementation of Thread State Analysis on Linux using bpftrace, in response to an exercise in the System Performance book<sup class="footnote-ref"><a href="#fn1" id="fnref1:1">[1:1]</a></sup>. The tracepoints used in the implementation might vary based on your OS&apos;s configuration; in general, you can get a list of the tracepoints available on your system by running <code>bpftrace -l &apos;tracepoint:*&apos;</code>. In addition, please verify the script before using it in production environments: it might cause noticeable overhead on your system, as it traces very frequent events (including the scheduler events).</p>
<hr class="footnotes-sep">
<section class="footnotes">
<ol class="footnotes-list">
<li id="fn1" class="footnote-item"><p><a href="https://www.brendangregg.com/blog/2020-07-15/systems-performance-2nd-edition.html">Systems Performance: Enterprise and the Cloud, 2nd Edition</a> <a href="#fnref1" class="footnote-backref">&#x21A9;&#xFE0E;</a> <a href="#fnref1:1" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn2" class="footnote-item"><p>The original TSA method post by Gregg: <a href="https://www.brendangregg.com/tsamethod.html">https://www.brendangregg.com/tsamethod.html</a> <a href="#fnref2" class="footnote-backref">&#x21A9;&#xFE0E;</a> <a href="#fnref2:1" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn3" class="footnote-item"><p><a href="https://www.brendangregg.com/blog/2017-10-28/bsd-performance-analysis-methodologies.html">EuroBSDcon: System Performance Analysis Methodologies</a> <a href="#fnref3" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn4" class="footnote-item"><p>In Linux a task is a runnable entity that can be either a thread, a process with a single thread or a kernel thread. In this post, thread and task are used interchangeably. <a href="#fnref4" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn5" class="footnote-item"><p>The Off-CPU analysis is another methodology <a href="https://www.brendangregg.com/offcpuanalysis.html">introduced by Brendan Gregg</a> <a href="#fnref5" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn6" class="footnote-item"><p>See: <a href="https://github.com/brendangregg/DTrace-tools/blob/master/sched/tstates.d">https://github.com/brendangregg/DTrace-tools/blob/master/sched/tstates.d</a> <a href="#fnref6" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn7" class="footnote-item"><p><a href="https://github.com/torvalds/linux/blob/v6.12/include/linux/sched.h#L99">include/linux/sched.h</a> <a href="#fnref7" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn8" class="footnote-item"><p><a href="https://man7.org/linux/man-pages/man2/futex.2.html">futex manpage</a> <a href="#fnref8" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn9" class="footnote-item"><p><a href="https://github.com/torvalds/linux/blob/v6.12/include/uapi/linux/futex.h#L11">include/uapi/linux/futex</a> <a href="#fnref9" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn10" class="footnote-item"><p>These tracepoints have been introduced in 2022 in the kernel. See: <a href="https://lore.kernel.org/all/20220301010412.431299-2-namhyung@kernel.org/">[PATCH 1/4] locking: Add lock contention tracepoints</a> <a href="#fnref10" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn11" class="footnote-item"><p><a href="https://github.com/bpftrace/bpftrace/blob/master/man/adoc/bpftrace.adoc#print_maps_on_exit">print_maps_on_exit - bpftrace</a> <a href="#fnref11" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
</ol>
</section>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Notes on CVE assessment]]></title><description><![CDATA[This post collects some notes about the lifecycle of vulnerabilities. It also discusses the challenges I faced during the assessment process: from the need to keep the analysis consistent to the limits of the CVSS base score. ]]></description><link>https://samueleresca.net/notes-on-cve-assessment/</link><guid isPermaLink="false">64aafac8827b68149a698f6e</guid><category><![CDATA[security]]></category><dc:creator><![CDATA[Samuele Resca]]></dc:creator><pubDate>Fri, 22 Mar 2024 18:05:32 GMT</pubDate><content:encoded><![CDATA[<!--kg-card-begin: markdown--><p>This post collects some notes about the lifecycle of vulnerabilities. It also discusses the challenges I faced during the assessment process: from the need to keep the analysis consistent to the limits of the CVSS base score. The post interchanges the terms &quot;vulnerability&quot; and &quot;CVE&quot;. A &quot;system&quot; is any solution or service, which is deployed somewhere or consumed by another upstream component.</p>
<h2 id="vulnerabilities-lifecycle">Vulnerabilities lifecycle</h2>
<p>CVE is an acronym for Common Vulnerabilities and Exposures. &quot;A CVE&quot; usually refers to a specific vulnerability affecting a system. For example, if you read &quot;This CVE is affecting X&quot;, it means a vulnerability with a CVE ID is present in component X or its downstream dependencies (e.g. third-party open-source dependencies).</p>
<p>CVE is a system maintained by <a href="https://www.mitre.org/">the MITRE Corporation</a> in collaboration with other organisations.</p>
<p>The lifecycle of a vulnerability usually follows these steps:</p>
<ol>
<li>A person or an organisation (let&apos;s call them a researcher) finds a security flaw in a software product.</li>
<li>The researcher privately discloses the security concern to the owner(s) of the software product (let&apos;s call them the vendor).</li>
<li>The vendor plans a fix by following a pre-defined security policy. It also prepares the disclosure of the vulnerability.</li>
<li>At the start of the disclosure process, the vendor reports the vulnerability to a CVE Program Partner, also known as a CVE Numbering Authority (CNA)<sup class="footnote-ref"><a href="#fn1" id="fnref1">[1]</a></sup>. This process assigns a CVE ID to the vulnerability.</li>
<li>The CNA submits the details of the vulnerability linked to the CVE ID.</li>
<li>Finally, the CVE ID is published.</li>
</ol>
<p>The final result usually looks like something similar to this: <a href="https://www.cve.org/CVERecord?id=CVE-2023-27536">CVE-2023-27536 - cve.org</a>. The CVE details summarize the problem briefly and they display the name of the CNA that published the CVE.</p>
<p>Note that the <a href="https://www.cve.org/">cve.org</a> website does not provide any information on the severity of a CVE. The next section, &quot;Scoring process&quot;, explains how severities are assigned.</p>
<h3 id="vulnerabilities-in-open-source">Vulnerabilities in Open-Source</h3>
<p>The open-source security reporting process varies across projects. The owner of an OSS package may be an individual, a community, or an organization. As a result, the open-source ecosystem does not have a unified vulnerability reporting process. Yet, there are some similarities:</p>
<ul>
<li>Open-source projects usually have a <code>SECURITY.md</code> file. It describes the process for reporting security problems for the project.</li>
<li>Maintainers and small organisations owning open-source projects use the <a href="https://www.cve.org/PartnerInformation/ListofPartners/partner/mitre">MITRE Corporation CNA</a> to create CVE IDs.</li>
<li>As a general rule, vulnerabilities or security concerns should never be disclosed in public (e.g. on the project&apos;s GitHub issues page). Usually, the best approach is to follow the process in the <code>SECURITY.md</code> file; alternatively, reach out to the organization or the maintainer in private.</li>
</ul>
<p>Note that more mature OSS organizations usually have more established security teams and processes in place. <sup class="footnote-ref"><a href="#fn2" id="fnref2">[2]</a></sup></p>
<h3 id="scoring-process">Scoring process</h3>
<p>When a CVE ID is generated, multiple groups and vendors take part in the process of assigning the severity and other details to the vulnerability. The details include:</p>
<ul>
<li>The severity of the vulnerability, usually measured through the CVSS standard.</li>
<li>The proof of exploitation, if any.</li>
<li>All the vulnerable versions of the component.</li>
<li>Any reference to the vulnerability.</li>
<li>Any link to a specific CWE-ID tracked in the <a href="https://cwe.mitre.org/">Common Weakness Enumeration List (CWE)</a> maintained by MITRE.</li>
</ul>
<p>These groups and vendors keep records of the details in their databases. Some of the well-known examples are: the <a href="https://nvd.nist.gov/">National Vulnerability Database (NVD)</a><sup class="footnote-ref"><a href="#fn3" id="fnref3">[3]</a></sup> which is supported by NIST, <a href="https://github.com/advisories">GitHub Advisories Database</a>, or more vendor specific databases such as the <a href="https://ubuntu.com/security/cves">Ubuntu CVEs report</a>.</p>
<p>CVSS is the common scoring system for measuring the severity of a CVE. The most adopted version is CVSS 3.1. A few months ago, CVSS 4.0 <a href="https://samueleresca.net/changes-and-improvements-in-cvss-4-0/">was released</a>, but it will take time for the field to adopt it. Vulnerability databases map the CVSS severity score to the CVEs. This score is also referred to as the &quot;base score&quot;, because it tracks the intrinsic characteristics of the vulnerability, which are constant over time.</p>
<p>The CVSS score of a CVE is linked to a CVSS vector. A CVSS vector gives a compact view of the characteristics that determine a vulnerability&apos;s severity. The vector string has metric names and values separated by forward slashes. For example, the following CVSS vector:</p>
<pre><code>AV:L/AC:H/PR:L/UI:R
</code></pre>
<p>represents a vulnerability with the following metrics: <code>Attack Vector (AV): Local</code>, <code>Attack Complexity (AC): High</code>, <code>Privileges Required (PR): Low</code> and <code>User Interaction (UI): Required</code>.</p>
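<p>As an illustration, expanding the vector string is mechanical. Below is a minimal, hypothetical sketch covering only the four metrics of the example; the abbreviations and value names follow the CVSS 3.1 specification:</p>

```python
# Metric abbreviations and value names from the CVSS 3.1 specification
# (only the four metrics of the example vector are covered).
METRICS = {"AV": "Attack Vector", "AC": "Attack Complexity",
           "PR": "Privileges Required", "UI": "User Interaction"}
VALUES = {
    "AV": {"N": "Network", "A": "Adjacent", "L": "Local", "P": "Physical"},
    "AC": {"L": "Low", "H": "High"},
    "PR": {"N": "None", "L": "Low", "H": "High"},
    "UI": {"N": "None", "R": "Required"},
}

def expand(vector):
    """Expands a CVSS vector string into readable metric names and values."""
    out = {}
    for part in vector.split("/"):
        metric, value = part.split(":")
        out[METRICS[metric]] = VALUES[metric][value]
    return out

print(expand("AV:L/AC:H/PR:L/UI:R"))
# {'Attack Vector': 'Local', 'Attack Complexity': 'High',
#  'Privileges Required': 'Low', 'User Interaction': 'Required'}
```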
<p>The result at the end of the scoring process is what developers and engineers usually exchange when they refer to a CVE: <a href="https://nvd.nist.gov/vuln/detail/CVE-2023-27536">CVE-2023-27536 - nvd.nist.gov</a>.</p>
<h2 id="challenges-in-vulnerabilities-assessment">Challenges in vulnerabilities assessment</h2>
<p>This section summarizes some of the challenges you might face when assessing vulnerabilities and when defining a vulnerability assessment process.</p>
<h3 id="cves-system-is-not-perfect">The CVE system is not perfect</h3>
<p>CVEs and CVSS scoring are not perfect. Some OSS maintainers have expressed their frustration with reported CVEs that are not really security flaws<sup class="footnote-ref"><a href="#fn4" id="fnref4">[4]</a></sup><sup class="footnote-ref"><a href="#fn5" id="fnref5">[5]</a></sup>. This is the main cause of CVE disputes. You should look at why a CVE is disputed: this prevents wasting effort in assessing something that is not a vulnerability.</p>
<h3 id="cvss-base-score-is-not-accurate">CVSS base score is not accurate</h3>
<p>The CVSS base score is the raw score and severity given to a CVE, as found on NVD or other CVE databases. It doesn&apos;t include any environmental or temporal metrics<sup class="footnote-ref"><a href="#fn6" id="fnref6">[6]</a></sup>. The CVSS base score is not enough to measure the true severity of a vulnerability, because it ignores how and where the affected component is deployed or consumed. The new CVSS 4.0 specification clarifies that the base score alone is not an accurate measure of the severity of a vulnerability: it states that the base score (CVSS-B) should be combined with the other metric groups, such as the Threat (CVSS-BT) and Environmental (CVSS-BE) groups.</p>
<p>Populating the Environmental and Temporal metric groups, in addition to the Base score metrics, gives a more precise measure of the vulnerability risk<sup class="footnote-ref"><a href="#fn7" id="fnref7">[7]</a></sup>. This enables organizations to prioritize their mitigation efforts and use resources more efficiently.</p>
<h3 id="keep-analysis-consistent">Keep analysis consistent</h3>
<p>It is important to keep the analysis consistent when evaluating vulnerabilities. This means that similar or identical CVEs should be analyzed in the same way across the different services they impact. Also, the analysis should not depend on who is doing it: the same CVE might be evaluated by different engineers on the project. Inconsistency often indicates the lack of a documented process shared across the team.</p>
<h3 id="vulnerabilities-hot-spots">Vulnerabilities hot spots</h3>
<p>Vulnerability hot spots are the parts and components of the software dependency tree affected by many CVEs. Different factors can lead to hot spots: it could be an old dependency (older dependencies usually accumulate more CVEs), or a base image that installs many packages.</p>
<p>It is important to identify these hot spots. Some parts of a system are also more prone to vulnerabilities; for example, config parsing and I/O subsystems are more sensitive. Sometimes a CVE impacts an entire protocol: for example, <a href="https://nvd.nist.gov/vuln/detail/CVE-2023-44487">CVE-2023-44487</a> affects the HTTP/2 protocol, which implies that all the components handling HTTP/2 will be vulnerable.</p>
<h2 id="analyze-vulnerabilities">Analyze vulnerabilities</h2>
<p>This section contains some tips on how to assess vulnerabilities. It presumes you did everything possible to fix the vulnerabilities in your system (for example, you updated to the latest dependency versions) and that you have a system in place for detecting new vulnerabilities, usually vulnerability-scanning tooling running at the CI level across your repositories.</p>
<h3 id="early-checks">Early checks</h3>
<p>I typically perform some early checks before doing an in-depth analysis. The early checks should be documented and shared between the engineers in the team. There are many reasons for performing early checks:</p>
<ul>
<li>Vulnerability analysis tooling might detect false positives.</li>
<li>Vulnerability data sources are not always accurate.</li>
<li>Vulnerabilities usually affect only a section of a dependency.</li>
</ul>
<p>Below are some questions to ask yourself before analyzing a vulnerability. The goal of the early checks is to save time by excluding the vulnerability from a more in-depth analysis.</p>
<h4 id="1-is-a-false-positive">1. Is it a false positive?</h4>
<p>The tools that find CVEs are not always reliable. They might report false positives<sup class="footnote-ref"><a href="#fn8" id="fnref8">[8]</a></sup>. It is better to rule out this scenario first.</p>
<p>Also, the scan tools can have errors at different levels. Most tools look for packages and binaries within a dependency and compare them with a set of data sources, usually NVD, the GitHub Advisory Database, and others. If the CVE metadata in a database is not correct, the tooling flags a false positive. An example is <code>CVE-2023-39017</code>: it affected the <code>quartz-jobs</code> package but also showed up on the <code>quartz</code> package<sup class="footnote-ref"><a href="#fn9" id="fnref9">[9]</a></sup>.</p>
<h4 id="2-is-the-cve-os-specific">2. Is the CVE OS-specific?</h4>
<p>Some vulnerabilities are effective only when the affected system runs on a certain operating system. If your deployment is not using that OS, then the vulnerability cannot be exploited.</p>
<p>For reference:</p>
<ul>
<li><a href="https://rustsec.org/advisories/RUSTSEC-2023-0001.html">RUSTSEC-2023-0001</a> affecting <code>tokio</code> only when the library is running on Windows.</li>
<li><a href="https://nvd.nist.gov/vuln/detail/CVE-2023-4807">CVE-2023-4807</a> affecting <code>openssl</code> only for Windows 64 platform.</li>
</ul>
<p>It is often not very costly to check if a vulnerability depends on the OS. Doing this early check usually saves time by excluding vulnerabilities affecting an OS not used by your solution.</p>
<h4 id="3-is-the-cve-exploitable-under-any-particular-condition">3. Is the CVE exploitable under any particular condition?</h4>
<p>This point builds on the previous one. A vulnerability may only impact some features of a library. For instance, <a href="https://nvd.nist.gov/vuln/detail/CVE-2023-37276">CVE-2023-37276</a> only affects the HTTP server (i.e. <code>aiohttp.Application</code>) of the <code>aiohttp</code> package. If your system only uses the HTTP client capabilities of the package, it is not actually exposed to the vulnerability.</p>
<h4 id="4-is-the-cve-disputed">4. Is the CVE disputed?</h4>
<p>Vendors or maintainers might dispute vulnerabilities. Disputes can happen for various reasons; a few examples:</p>
<ul>
<li><a href="https://nvd.nist.gov/vuln/detail/CVE-2018-20225">CVE-2018-20225</a> is disputed because the reported behavior is considered a feature of the pip package manager.</li>
<li><a href="https://nvd.nist.gov/vuln/detail/CVE-2023-45322">CVE-2023-45322</a> is not considered critical enough by the vendor to warrant a CVE ID.</li>
<li><a href="https://nvd.nist.gov/vuln/detail/CVE-2023-39017">CVE-2023-39017</a> is another disputed vulnerability. The dispute concerns whether untrusted user input can plausibly reach the code where the injection could occur.</li>
</ul>
<p>A disputed CVE is not always a sign of low risk. However, the person who disputes the CVE is usually the maintainer or owner of the affected component, and they typically provide the background and insights explaining why the CVE is not a security issue. Depending on that, the prioritization of the fix might change.</p>
<h3 id="scoring-analysis">Scoring analysis</h3>
<p>The early checks eliminate the vulnerabilities that are not exploitable or do not affect the product. Then, the scoring analysis assesses how severe the vulnerability is for the system. This can be done by populating the additional metric groups of the CVSS score.<sup class="footnote-ref"><a href="#fn10" id="fnref10">[10]</a></sup></p>
<p>The following are some questions to consider while populating the additional metric groups and assessing the severity.</p>
<h4 id="is-the-affected-system-deployed-behind-network-boundary">Is the affected system deployed behind a network boundary?</h4>
<p>Many vulnerabilities carry the <code>Attack Vector: Network</code> metric. The network attack vector establishes that the attack has to be conducted over the network to be successful. If the affected component sits in a private network or behind a firewall, an attacker would need to be inside the network to exploit the vulnerability. So, if the system is deployed on a private network, the <code>Modified Attack Vector</code> can be overridden to <code>Adj. network</code><sup class="footnote-ref"><a href="#fn11" id="fnref11">[11]</a></sup>. This helps reduce the severity of the vulnerability.</p>
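The effect of such an override can be made concrete with a small sketch of the CVSS 3.1 base formula (scope-unchanged vulnerabilities only; the metric weights come from the FIRST v3.1 specification, and the example vectors are illustrative):

```python
# CVSS 3.1 metric weights (scope unchanged), per the FIRST specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.20}  # Attack Vector
AC = {"L": 0.77, "H": 0.44}                        # Attack Complexity
PR = {"N": 0.85, "L": 0.62, "H": 0.27}             # Privileges Required
UI = {"N": 0.85, "R": 0.62}                        # User Interaction
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}             # Confidentiality/Integrity/Availability impact

def roundup(x):
    # "Round up to one decimal" as defined in CVSS 3.1, Appendix A.
    i = round(x * 100000)
    return i / 100000 if i % 10000 == 0 else (i // 10000 + 1) / 10

def base_score(av, ac, pr, ui, c, i, a):
    iss = 1 - (1 - CIA[c]) * (1 - CIA[i]) * (1 - CIA[a])
    impact = 6.42 * iss
    exploitability = 8.22 * AV[av] * AC[ac] * PR[pr] * UI[ui]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))

# A critical vulnerability reachable over the network...
print(base_score("N", "L", "N", "N", "H", "H", "H"))  # 9.8
# ...re-scored with Modified Attack Vector = Adjacent for a private deployment.
print(base_score("A", "L", "N", "N", "H", "H", "H"))  # 8.8
```

Overriding just the attack vector drops the example score from 9.8 (Critical) to 8.8 (High).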
<h4 id="is-the-affected-system-running-on-containers">Is the affected system running on containers?</h4>
<p>The CVSS metrics might have different implications depending on the infrastructure where the vulnerable component runs, for example when the affected system is deployed in a container orchestration environment (e.g. Kubernetes) <sup class="footnote-ref"><a href="#fn12" id="fnref12">[12]</a></sup>.</p>
<p>A vulnerability with <code>Attack Vector: Local</code> means that the attacker needs local access to the system. In a container orchestration system, that would mean the attacker already has high privileges and can run a local command inside a container. So, the privileges needed to launch an attack increase.</p>
<p>Moreover, a vulnerability that enables privilege escalation on a system deployed in a container orchestration environment would only grant privileges within the container (not the whole host). Therefore, in some cases, the CVSS confidentiality, integrity, and availability metrics can be lowered.</p>
<h4 id="populate-the-temporal-score-metrics">Populate the Temporal Score metrics</h4>
<p>The Temporal score metrics<sup class="footnote-ref"><a href="#fn6" id="fnref6:1">[6:1]</a></sup> measure the current state of exploit techniques. In CVSS 3.1, temporal metrics are defined as:</p>
<ul>
<li><code>Exploit Code Maturity</code> which measures the likelihood of the vulnerability being attacked.</li>
<li><code>Remediation Level</code> which tracks if there is an official fix or a workaround in place.</li>
<li><code>Report Confidence</code> which measures the degree of confidence in the existence of the vulnerability and in the related technical details.</li>
</ul>
<p>During analysis, fill in Temporal Score metrics to reflect the status of the vulnerability at the time of the assessment.</p>
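As a sketch, the temporal adjustment in CVSS 3.1 multiplies the base score by the three temporal metric weights (weights below are from the FIRST v3.1 specification; the example values are illustrative):

```python
def roundup(x):
    # Rounding helper defined in CVSS 3.1, Appendix A.
    i = round(x * 100000)
    return i / 100000 if i % 10000 == 0 else (i // 10000 + 1) / 10

# Temporal metric weights from the CVSS 3.1 specification.
EXPLOIT_CODE_MATURITY = {"X": 1.0, "H": 1.0, "F": 0.97, "P": 0.94, "U": 0.91}
REMEDIATION_LEVEL = {"X": 1.0, "U": 1.0, "W": 0.97, "T": 0.96, "O": 0.95}
REPORT_CONFIDENCE = {"X": 1.0, "C": 1.0, "R": 0.96, "U": 0.92}

def temporal_score(base, e, rl, rc):
    return roundup(base * EXPLOIT_CODE_MATURITY[e]
                        * REMEDIATION_LEVEL[rl]
                        * REPORT_CONFIDENCE[rc])

# A 9.8 base score where only a proof of concept exists (E:P),
# an official fix is available (RL:O), and the report is confirmed (RC:C):
print(temporal_score(9.8, "P", "O", "C"))  # 8.8
```

Filling in the temporal metrics at assessment time lowers the example score from 9.8 to 8.8.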
<h2 id="an-approach-to-analysed-vulnerabilities-tracking">An approach to analysed vulnerabilities tracking</h2>
<p>As discussed, analysis consistency can be challenging, especially when the analysis is performed by multiple engineers. A possible solution is to have a set of assessment questions that the engineer should answer when doing vulnerability analysis. Besides that, it is helpful to record the previously analysed vulnerabilities in a central place. One way to do this is to maintain a file with the known vulnerabilities of the system in the git repository (or repositories). If the system spans multiple repositories, each service will have its own analysis file. The file should contain the following information for each analysed vulnerability:</p>
<ul>
<li><em>CVE-ID (string)</em>: The unique identifier of the vulnerability.</li>
<li><em>Affected component (string)</em>: The component affected by the vulnerability, for example a 3rd-party OSS package or a container image. Note that the version of the affected component should be specified as well (e.g. <code>cryptography:38.0.0</code>, <code>&lt;your_registry&gt;/busybox:1.27</code>).</li>
<li><em>Modified score (numeric)</em>: The modified score derived from the analysis and the modified CVSS vector.</li>
<li><em>Modified CVSS vector (string)</em>: The modified CVSS vector derived from the assessment.</li>
<li><em>Analysis details (text)</em>: A text that briefly explains the reasoning behind the analysis that has been done.</li>
</ul>
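As a sketch, one entry of such a file could look like the following YAML (the file name, field names, version, score, and vector are all illustrative, not a real assessment):

```yaml
# known-vulnerabilities.yaml (all values illustrative)
- cve_id: CVE-2023-37276
  affected_component: aiohttp:3.8.1
  modified_score: 2.3
  modified_cvss_vector: CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:L/I:N/A:N/MAV:A
  analysis_details: >-
    Only the aiohttp HTTP server is affected; this service uses the
    package exclusively as an HTTP client, and it is deployed behind
    a firewall on a private network.
```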
<p>The fields above would be repeated for each assessed vulnerability, and they can be stored in a versioned file in any format: JSON, YAML, Markdown. Documenting the examined vulnerabilities and adding them to the repositories has several benefits:</p>
<ul>
<li>You can keep track of all the analyses in one place, and everything is versioned. An engineer can look at previous analyses and use them as a reference or correct them.</li>
<li>If you use git tags or release branches to deploy the service, you will have an analysis file for each release. Therefore, you can check the status of the vulnerabilities at each release point.</li>
<li>You can track the history of the analysis using git history.</li>
<li>It is easy to add automation to the CI pipeline (e.g.: alerting the internal security team in case a new vulnerability is detected but not analyzed yet).</li>
</ul>
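A minimal sketch of such a CI step, assuming the scanner output and the analysis file have already been parsed into lists of CVE IDs (the function name and the data are hypothetical):

```python
def find_unanalysed(scanner_findings, analysed_cves):
    """Return the CVE IDs reported by the scanner that have no recorded analysis."""
    return sorted(set(scanner_findings) - set(analysed_cves))

# Hypothetical data: IDs reported by the scanner vs. IDs in the analysis file.
findings = ["CVE-2023-37276", "CVE-2023-4807", "CVE-2023-45322"]
analysed = ["CVE-2023-37276", "CVE-2023-45322"]

missing = find_unanalysed(findings, analysed)
if missing:
    # In a real pipeline this could fail the build or alert the security team.
    print("Unanalysed vulnerabilities: " + ", ".join(missing))
```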
<h2 id="vulnerabilities-in-container-images">Vulnerabilities in container images</h2>
<p>Today, a lot of production software runs on container orchestration systems (e.g. Kubernetes). Orchestration systems come with Cloud-native patterns, e.g. sidecar containers and container injection. These patterns can be useful and efficient in some cases, but they also force you to depend on a high number of container images<sup class="footnote-ref"><a href="#fn13" id="fnref13">[13]</a></sup>. This can pose some problems: a container image might ship a lot of packages and dependencies, which are vectors for vulnerabilities.</p>
<p>Minimal container images come in handy here. Make sure you have only the minimum dependencies needed to run your app, and that a <code>nonroot</code> user is correctly pre-configured. Scratch or distroless images are good for these cases. Canonical recently presented an interesting set of tooling inspired by distroless images, called &quot;chisel&quot;<sup class="footnote-ref"><a href="#fn14" id="fnref14">[14]</a></sup>. Chisel produces &quot;chiselled images&quot;: it lets you create images with a subset of Debian packages. It is based on the idea that a package (let&apos;s call it <code>package A</code>) which depends on another package (<code>package B</code>) usually only uses some of <code>package B</code>&apos;s files. So, we can chisel <code>package B</code> to cut the unnecessary content, which reduces the surface for vulnerabilities.</p>
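As an illustration, a minimal multi-stage build keeps the toolchain out of the final image and ships a pre-configured <code>nonroot</code> user (the base images, paths, and the Go example are assumptions, not a prescription):

```dockerfile
# Build stage: compile the application with the full toolchain.
FROM golang:1.21 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /app .

# Final stage: a distroless base image with a pre-configured nonroot user.
# Only the static binary is copied over; no shell, no package manager.
FROM gcr.io/distroless/static-debian12:nonroot
COPY --from=build /app /app
USER nonroot
ENTRYPOINT ["/app"]
```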
<p>Finally, before adding a container image as a dependency of your system, check if the image follows best practices.</p>
<ul>
<li>Is there a <code>nonroot</code> user set up in the image?</li>
<li>Does the image come from a reputable vendor?</li>
<li>Is the image creation process open-sourced?</li>
<li>How often is a new version of the image released?</li>
<li>In case a CVE impacts the image, how does the vendor release a fixed version of the image?</li>
<li>In case a CVE impacts the image, does the vendor fix all the major versions of the image or only the most recent one?</li>
<li>If you are running in a containerized environment, are you using minimal container images?</li>
</ul>
<p>The considerations above might help you reduce the number of vulnerabilities to assess in your container-based system.</p>
<h2 id="prioritize-mitigation-with-epss">Prioritize mitigation with EPSS</h2>
<p>first.org is introducing a new scoring system called EPSS. The definition is:</p>
<blockquote>
<p>The Exploit Prediction Scoring System (EPSS) is a data-driven effort for estimating the likelihood (probability) that a software vulnerability will be exploited in the wild.</p>
</blockquote>
<p>A detailed explanation of EPSS can be found in the paper: <a href="https://arxiv.org/pdf/2302.14172.pdf">Enhancing Vulnerability Prioritization: Data-Driven Exploit Predictions with Community-Driven Insights</a>.</p>
<p>EPSS doesn&apos;t measure the risk of a vulnerability, but the probability that the vulnerability will be exploited. So, by combining the EPSS and CVSS scores, we can see both the chance of a vulnerability being exploited and the impact of the exploit. This can help organizations prioritize their vulnerability management and allocate resources better.</p>
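As a sketch, a simple prioritization could rank findings by combining the two scores (the data and the ranking rule are illustrative; real programs often use EPSS/CVSS thresholds rather than a product):

```python
# Hypothetical findings: (placeholder ID, CVSS base score, EPSS probability).
findings = [
    ("CVE-A", 9.8, 0.02),  # critical impact, rarely exploited
    ("CVE-B", 7.5, 0.90),  # high impact, very likely to be exploited
    ("CVE-C", 5.3, 0.01),  # medium impact, rarely exploited
]

def priority(finding):
    _, cvss, epss = finding
    # Illustrative rule: expected severity = impact x likelihood of exploitation.
    return cvss * epss

for cve, cvss, epss in sorted(findings, key=priority, reverse=True):
    print(f"{cve}: cvss={cvss} epss={epss} priority={cvss * epss:.2f}")
```

Under this rule, the highly exploitable high-severity finding (CVE-B) outranks the rarely exploited critical one (CVE-A).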
<h2 id="final-thoughts">Final thoughts</h2>
<p>This post went through the lifecycle of vulnerabilities and the challenges in assessing them. A few takeaways:</p>
<ul>
<li>There is no doubt that vulnerabilities must be fixed. Try your best to keep all the 3rd-party components consumed by your software up to date.</li>
<li>The ecosystem surrounding CVEs and CVSS works, but is not perfect. Don&apos;t blindly accept the severity of a CVE as accurate.</li>
<li>The CVSS Base score does not reflect the true impact of a vulnerability on your system. To get an accurate severity, you need to assess the CVE in relation to your system and populate the additional CVSS metrics.</li>
<li>Establish a method for identifying, evaluating, and tracking CVEs.</li>
<li>Do whatever you can to keep your container images as slim as possible.</li>
</ul>
<hr class="footnotes-sep">
<section class="footnotes">
<ol class="footnotes-list">
<li id="fn1" class="footnote-item"><p>A partner is a trusted source participating in the CVE program. It could be a company or an organization. A partner could also be a CVE Numbering Authority (CNA) that assigns new CVE IDs. <a href="https://www.cve.org/PartnerInformation/ListofPartners">The list of partners</a> is available on cve.org. <a href="#fnref1" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn2" class="footnote-item"><p>For example, the Apache Software Foundation (ASF) has an established <a href="https://www.apache.org/security/">security policy</a>. That said, new vulnerabilities are usually reported privately to the project&apos;s security mailing list. <a href="#fnref2" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn3" class="footnote-item"><p>At the time of writing, NVD is delaying the analysis of vulnerabilities and addressing challenges in the NVD program. More details available at <a href="https://resilientcyber.substack.com/p/death-knell-of-the-nvd">Death Knell of the NVD? - Resilient Cyber</a> <a href="#fnref3" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn4" class="footnote-item"><p><a href="https://daniel.haxx.se/blog/2023/08/26/cve-2020-19909-is-everything-that-is-wrong-with-cves/">CVE-2020-19909 is everything that is wrong with CVEs | daniel.haxx.se</a> <a href="#fnref4" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn5" class="footnote-item"><p><a href="https://www.postgresql.org/about/news/cve-2020-21469-is-not-a-security-vulnerability-2701/">PostgreSQL: CVE-2020-21469 is not a security vulnerability</a> <a href="#fnref5" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn6" class="footnote-item"><p>Starting from CVSS 4.0 the &quot;Temporal score metrics&quot; have been renamed to &quot;Threat metrics&quot;. <a href="#fnref6" class="footnote-backref">&#x21A9;&#xFE0E;</a> <a href="#fnref6:1" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn7" class="footnote-item"><p><a href="https://www.first.org/cvss/v4-0/cvss-v40-presentation.pdf">Announcing CVSS v4.0 - first.org</a> <a href="#fnref7" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn8" class="footnote-item"><p>Vulnerability detection tools can report false positives/false negatives. Here are two lists: <a href="https://github.com/aquasecurity/trivy/discussions/categories/false-detection">Trivy false positive reports</a> and <a href="https://github.com/anchore/grype/issues?q=label%3Afalse-positive">Grype false positive reports</a> <a href="#fnref8" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn9" class="footnote-item"><p>See the following GitHub issue <a href="https://github.com/quartz-scheduler/quartz/issues/943">quartz-scheduler - #943</a>. The discussion in the issue highlights multiple problems related to the CVE, from the wrong package being associated with the vulnerability to the &quot;indefensible&quot; CVSS 3.1 base score. <a href="#fnref9" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn10" class="footnote-item"><p>There are multiple CVSS tools online for populating the additional metric groups. A few examples: <a href="https://nvd.nist.gov/vuln-metrics/cvss/v3-calculator">NVD - CVSS v3 Calculator</a>, <a href="https://www.first.org/cvss/calculator/3.0">Common Vulnerability Scoring System Version 3.0 Calculator</a>. You can copy and paste the CVSS vector of the vulnerability and start populating the metrics. <a href="#fnref10" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn11" class="footnote-item"><p>See &quot;Table 1: Attack Vector&quot; for a description of the &quot;Adjacent&quot; metric in <a href="https://www.first.org/cvss/v3.1/specification-document#2-1-1-Attack-Vector-AV">CVSS v3.1: Specification Document</a> <a href="#fnref11" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn12" class="footnote-item"><p><a href="https://www.redhat.com/en/blog/containers-vulnerability-risk-assessment">Containers vulnerability risk assessment - Red Hat Blog</a> discusses how the impact of vulnerabilities on containerized environments can differ from traditional operating systems by providing some concrete examples. <a href="#fnref12" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn13" class="footnote-item"><p>Container images are usually fetched from, or built by, 3rd-party providers. Some examples off the top of my head are the Istio or Ingress NGINX controller images. <a href="#fnref13" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn14" class="footnote-item"><p><a href="https://github.com/canonical/chisel">Github - canonical/chisel</a> <a href="#fnref14" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
</ol>
</section>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Changes and improvements in CVSS 4.0]]></title><description><![CDATA[The CVSS Special Interest Group (SIG) recently released the new 4.0 version of CVSS. This post outlines the changes and the improvements in CVSS 4.0. These notes originate from the CVSS 4.0 public preview presentations.]]></description><link>https://samueleresca.net/changes-and-improvements-in-cvss-4-0/</link><guid isPermaLink="false">64afe15a827b68149a699115</guid><category><![CDATA[security]]></category><dc:creator><![CDATA[Samuele Resca]]></dc:creator><pubDate>Thu, 14 Mar 2024 19:57:05 GMT</pubDate><content:encoded><![CDATA[<!--kg-card-begin: markdown--><p>The Common Vulnerability Scoring System (CVSS) is a system for capturing the properties and severity of software vulnerabilities. The <a href="https://first.org">Forum of Incident Response and Security Teams (FIRST)</a> and the CVSS Special Interest Group (SIG) recently released the 4.0 version of CVSS<sup class="footnote-ref"><a href="#fn1" id="fnref1">[1]</a></sup>.</p>
<p>This post outlines the improvements introduced in CVSS 4.0 and what&apos;s changed compared with CVSS 3.1. The post originates from the CVSS 4.0 public preview presentations<sup class="footnote-ref"><a href="#fn2" id="fnref2">[2]</a></sup><sup class="footnote-ref"><a href="#fn3" id="fnref3">[3]</a></sup>.</p>
<h2 id="base-score-alone-is-not-accurate">Base score alone is not accurate</h2>
<p>The CVSS 3.1 Base score is used as a primary input for assessing CVEs. The Base score definition captures all the vulnerability&apos;s characteristics that don&apos;t change over time or across user environments. Most of the CVEs published in the vendor databases come with the base score already set. Downstream consumers often rely only on the base score for vulnerability assessment and prioritization.</p>
<p>CVSS 4.0 emphasizes that the Base score is only one part of the CVSS scoring, introducing the following terms:</p>
<ul>
<li>CVSS-B refers to the CVSS base score alone.</li>
<li>CVSS-BT refers to a CVSS score that provides both the CVSS Base and Threat metrics.</li>
<li>CVSS-BE refers to a CVSS score that provides both CVSS Base and Environmental metrics.</li>
<li>CVSS-BTE refers to a CVSS score that provides CVSS Base, Threat and Environmental metrics.</li>
</ul>
<p>CVSS 4.0 states that using more metrics makes the vulnerability assessment more accurate. <sup class="footnote-ref"><a href="#fn2" id="fnref2:1">[2:1]</a></sup> Moreover, CVSS-B should not be used alone to set the remediation priority of a vulnerability.</p>
<h2 id="temporal-metrics-werent-impacting-the-final-score">Temporal metrics weren&apos;t impacting the final score</h2>
<p>In CVSS 3.1, the <em>Temporal Score metrics</em> had a low impact on the overall score. The <em>Temporal Score metrics</em> are <code>Exploit Code Maturity (E)</code>, <code>Remediation Level (RL)</code>, and <code>Report Confidence (RC)</code>. During the assessment, the <code>Remediation Level (RL)</code> is frequently set as <code>Official fix (O)</code> and the <code>Report Confidence (RC)</code> is frequently set as <code>Confirmed (C)</code>.</p>
<p>CVSS 4.0 introduces the following changes:</p>
<ul>
<li>Rename the <em>Temporal Score metrics</em> to <em>Threat Metrics</em>.</li>
<li>Rename the <code>Exploit Code Maturity</code> to <code>Exploit Maturity</code>.</li>
<li>The <code>Exploit Maturity</code> is now enumerated with <code>Attacked (A)</code>, <code>POC (P)</code>, <code>Unreported (U)</code>.</li>
<li>Retire the <code>Remediation Level (RL)</code> and <code>Report Confidence (RC)</code> metrics.</li>
</ul>
<p>On top of the changes above, the <em>Threat Metrics</em> now have a higher impact on the calculation of the final CVSS-BTE score.</p>
<h2 id="user-interaction-metric-wasnt-granular-enough">User Interaction metric wasn&apos;t granular enough</h2>
<p>The <code>User Interaction (UI)</code> metric in the CVSS 3.1 specification has two possible values: <code>None (N)</code> and <code>Required (R)</code>. These values do not distinguish between voluntary and involuntary interaction.</p>
<p>CVSS 4.0 introduces new metrics values for the <code>User Interaction (UI)</code> metric:</p>
<ul>
<li><code>None (N)</code> : the vulnerable system can be exploited without intervention.</li>
<li><code>Passive (P)</code> : a user needs to interact with the vulnerable system. These interactions would be involuntary.</li>
<li><code>Active (A)</code>: a user must interact with the vulnerable component and the attacker&#x2019;s payload. Or, the user&apos;s actions would subvert protection mechanisms and lead to exploitation.</li>
</ul>
<p>The changes above provide a better representation of the type of interaction needed for exploiting the system.</p>
<h2 id="scope-s-metric-was-ambiguous">Scope (S) metric was ambiguous</h2>
<p>CVSS 3.1 defined a <code>Scope (S)</code> metric. The metric captured the idea of a vulnerable system that, when compromised, could also impact resources beyond its own security scope. The metric was applied inconsistently across vendors.</p>
<p>CVSS 4.0 introduces the concept of a <em>vulnerable system</em>, which refers to the actual system that is vulnerable, and of <em>subsequent systems</em>, referring to the downstream systems that are impacted by the exploitation of the vulnerable system.</p>
<p>The new CVSS 4.0 <code>Impact Metrics</code> are split into two groups, representing the vulnerable and the subsequent system. Each group contains the familiar <code>Confidentiality (C)</code>, <code>Integrity (I)</code>, <code>Availability (A)</code> metrics.</p>
<p>This approach removes the need for the <code>Scope (S)</code> metric.</p>
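For illustration, a CVSS 4.0 base vector spells out the two impact groups explicitly. In the (hypothetical) vector below, <code>VC/VI/VA</code> describe the impact on the vulnerable system and <code>SC/SI/SA</code> the impact on subsequent systems:

```
CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:H/VI:H/VA:H/SC:N/SI:N/SA:N
```

Here the vulnerable system is fully compromised (all <code>H</code>) while subsequent systems are untouched (all <code>N</code>), a distinction the old <code>Scope (S)</code> flag could not express.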
<h2 id="new-supplemental-metrics">New supplemental metrics</h2>
<p>CVSS 4.0 introduces a set of <code>Supplemental Metrics</code>. They give additional information on a vulnerability. Note that <code>Supplemental Metrics</code> do not affect the CVSS score; they store extra information to support organizations&apos; assessments. It is up to each organization to use or ignore this group of metrics during the assessment.</p>
<p>Below is a summary of the supplemental metrics added in CVSS 4.0.</p>
<h3 id="safety-s">Safety (S)</h3>
<p><code>Safety (S)</code> is intended for systems whose fitness of purpose is aligned to safety. Exploiting the vulnerability might have a safety impact on an individual or a system.</p>
<p>The <code>Safety (S)</code> metric can assume the following values:</p>
<ul>
<li><code>Not Defined (X)</code> indicates that the metric is not defined.</li>
<li><code>Negligible (N)</code> means the vulnerability&apos;s consequences meet the IEC 61508 definition of &quot;Negligible&quot;.</li>
<li><code>Present (P)</code> means that the vulnerability&apos;s consequences meet the IEC 61508 consequence definitions. These include &quot;Marginal&quot;, &quot;Critical&quot;, or &quot;Catastrophic&quot;.</li>
</ul>
<h3 id="automatable-au">Automatable (AU)</h3>
<p><code>Automatable (AU)</code> metric answers the following question &quot;Can the attackers automate the exploitation across multiple targets?&quot;. The <code>Automatable (AU)</code> metric can have these self-explanatory values: <code>Not Defined (X)</code>, <code>No (N)</code>, <code>Yes (Y)</code>.</p>
<h3 id="recovery-r">Recovery (R)</h3>
<p>The <code>Recovery (R)</code> metric describes how well a system can recover from exploitation. Recovery should be considered from an availability and performance point of view.</p>
<p>The <code>Recovery (R)</code> metric can assume the following values:</p>
<ul>
<li><code>Not Defined (X)</code>: the metric is not defined.</li>
<li><code>Automatic (A)</code>: the component recovers automatically after an attack.</li>
<li><code>User (U)</code>: The component requires manual intervention by the user to recover after an attack.</li>
<li><code>Irrecoverable (I)</code>: the component is irrecoverable by the user, after an attack.</li>
</ul>
<h3 id="value-density-v">Value Density (V)</h3>
<p>The <code>Value Density (V)</code> metric represents the amount of resources an attacker will gain control of with a single exploitation event.</p>
<p>The <code>Value Density(V)</code> metric can assume the following values:</p>
<ul>
<li><code>Diffuse (D)</code>: the system that contains the vulnerable components has limited resources.</li>
<li><code>Concentrated (C)</code>: the system that contains the vulnerable component is rich in resources.</li>
</ul>
<h3 id="vulnerability-response-effort-re">Vulnerability Response Effort (RE)</h3>
<p>The <code>Vulnerability Response Effort (RE)</code> measures how difficult it is for consumers to provide an initial response to the impact of vulnerabilities for deployed services.</p>
<p>The <code>Vulnerability Response Effort (RE)</code> has the following values:</p>
<ul>
<li><code>Low (L)</code>: the effort required to respond to the vulnerability is low. Examples provided include configuration workarounds.</li>
<li><code>Medium (M)</code>: responding to the vulnerability will need some effort. It could cause minimal service impact. Some examples provided by the CVSS 4.0 specifications include: simple remote updates or a low-touch software upgrade.</li>
<li><code>High (H)</code>: the actions required to respond to a vulnerability are significant and/or difficult. Some examples provided by the CVSS 4.0 specifications include: a highly privileged driver update or updates that require careful analysis.</li>
</ul>
<h3 id="provider-urgency-u">Provider Urgency (U)</h3>
<p>The <code>Provider Urgency (U)</code> metric is meant to capture the provider&apos;s assessment of how urgently the vulnerability should be fixed. It can assume the following values: <code>Not Defined (X)</code>, <code>Red</code>, <code>Amber</code>, <code>Green</code>, <code>Clear</code>, where <code>Red</code> implies the highest urgency and <code>Clear</code> the lowest (no urgency).</p>
<p>CVSS 4.0 also specifies that any provider along the supply chain may set a <code>Provider Urgency</code> rating. For example:</p>
<pre><code>Library Maintainer Urgency -&gt; OS/Distro Maintainer Urgency -&gt; Provider 1 .. 
Provider N  Urgency -&gt; Consumer
</code></pre>
<p>From a <code>Consumer</code> point of view, the <code>Provider N</code> urgency is the most accurate.</p>
<h2 id="new-epss-scoring-system">New EPSS scoring system</h2>
<p>On top of the CVSS 4.0 release, <a href="https://first.org">FIRST</a> recently introduced a new scoring system called EPSS<sup class="footnote-ref"><a href="#fn4" id="fnref4">[4]</a></sup>. The system relies on a model trained on a list of data sources (see <em>Table 1. Description of data sources used in EPSS</em><sup class="footnote-ref"><a href="#fn5" id="fnref5">[5]</a></sup>) to estimate the probability that a vulnerability will be exploited within 30 days following the prediction.</p>
<h2 id="recap">Recap</h2>
<ul>
<li>The Common Vulnerability Scoring System (CVSS) is a system for capturing the properties and severity of software vulnerabilities.</li>
<li>CVSS 4.0 introduces new terms and emphasizes that the Base score is only one part of the CVSS scoring.</li>
<li>The <em>Temporal Score metrics</em> have been renamed to <em>Threat Metrics</em> and have a higher impact on the final score.</li>
<li>The <code>User Interaction (UI)</code> metric has been updated to provide a better representation of the type of interaction needed for exploitation.</li>
<li>The <code>Scope (S)</code> metric has been removed and replaced with the concept of vulnerable and subsequent systems.</li>
<li>CVSS 4.0 introduces a set of Supplemental Metrics that do not affect the CVSS score but provide additional information on a vulnerability.</li>
<li>FIRST also introduced a new scoring system called EPSS, which estimates the probability that a vulnerability would be exploited within 30 days following the prediction.</li>
</ul>
<hr class="footnotes-sep">
<section class="footnotes">
<ol class="footnotes-list">
<li id="fn1" class="footnote-item"><p><a href="https://www.first.org/cvss/v4-0/">CVSS 4.0 specifications</a>. At the time of writing, most vendors have not adopted CVSS 4.0 yet. However, CVSS 4.0 launched in late 2023, and it will take time before it is widely used by vulnerability databases and integrated with vulnerability detection tools. <a href="#fnref1" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn2" class="footnote-item"><p><a href="https://csrc.nist.gov/csrc/media/Presentations/2023/update-on-cvss-4-0/jan-25-2023-ssca-dugal-rich.pdf">CVSS v4.0 NIST presentation</a> <a href="#fnref2" class="footnote-backref">&#x21A9;&#xFE0E;</a> <a href="#fnref2:1" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn3" class="footnote-item"><p><a href="https://www.first.org/cvss/v4-0/cvss-v40-presentation.pdf">Announcing CVSS v4.0</a> <a href="#fnref3" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn4" class="footnote-item"><p><a href="https://www.first.org/epss/">Exploit Prediction Scoring System</a> <a href="#fnref4" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn5" class="footnote-item"><p><a href="https://arxiv.org/abs/2302.14172">Jacobs, Jay, Sasha Romanosky, Octavian Suciu, Ben Edwards, and Armin Sarabi. &quot;Enhancing Vulnerability prioritization: Data-driven exploit predictions with community-driven insights.</a> <a href="#fnref5" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
</ol>
</section>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Memory management optimization techniques]]></title><description><![CDATA[This article builds upon the previous one from this series and delves into the optimizations outlined in the "What a programmer can do" section of the "What every programmer should know about memory" paper by Ulrich Drepper. The code example is implemented in Rust]]></description><link>https://samueleresca.net/memory-management-optimizations-techniques/</link><guid isPermaLink="false">641581d7827b68149a698349</guid><category><![CDATA[performance]]></category><dc:creator><![CDATA[Samuele Resca]]></dc:creator><pubDate>Sat, 20 May 2023 13:42:27 GMT</pubDate><media:content url="https://samueleresca.net/content/images/2023/05/boccadasse-2.png" medium="image"/><content:encoded><![CDATA[<img src="https://samueleresca.net/content/images/2023/05/boccadasse-2.png" alt="Memory management optimization techniques"><p>This article builds upon the <a href="https://samueleresca.net/analysis-of-what-every-programmer-should-know-about-memory">previous one from this series</a> and delves into the optimizations outlined in the &quot;What a programmer can do&quot; section of the &quot;What every programmer should know about memory&quot; paper by Ulrich Drepper. The code examples are implemented in Rust, and the execution is analyzed using perf. </p><div class="kg-card kg-callout-card kg-callout-card-grey"><div class="kg-callout-emoji">&#x1F4A1;</div><div class="kg-callout-text">The code in this post is available at <a href="https://github.com/samueleresca/what-every-programmer-should-know-about-memory">samueleresca/what-every-programmer-should-know-about-memory</a>.</div></div><p></p><!--kg-card-begin: markdown--><h2 id="understanding-perf">Understanding perf</h2>
<p>This post uses <code>perf</code> to explore how the code is executed on the CPU. <code>perf</code> is the official Linux profiler and it is included in the kernel source code, see <a href="https://perf.wiki.kernel.org/index.php/Tutorial">perf Tutorial</a>.</p>
<p>Let&apos;s take a moment to go through some of the main <code>perf</code> events that will be used in this post:</p>
<ul>
<li><code>instructions</code>: tells us how many instructions our code issued.</li>
<li><code>cycles</code> tells us how many CPU cycles our code took. The number of cycles is usually lower than the number of instructions because the CPU can issue many instructions per cycle. The ratio of <code>instructions</code> to <code>cycles</code> is called the <code>IPC</code> (Instructions Per Cycle). The <code>IPC</code> is a good metric for understanding how efficiently the CPU is running your code: the higher the <code>IPC</code>, the better. It is also an indicator of how well the code is vectorized.</li>
<li><code>cache-references</code> represents the number of times the CPU accessed the cache. If the data is not already in the cache, the CPU has to access the main memory.</li>
<li><code>cache-misses</code> represents the number of times the CPU accessed the main memory. As mentioned in the previous post of this series, this is a very expensive operation, and we want to minimize it.</li>
<li><code>task-clock</code> refers to how much CPU time our task took, usually reported in milliseconds. It differs from the total runtime because, if the program uses many CPUs or threads, the <code>task-clock</code> will be the sum of the time spent on each CPU or thread.</li>
<li><code>cs</code> represents the context switches happening during the code execution.</li>
<li><code>page-faults</code> represents the number of page faults happening during the code execution. Usually, there are dedicated events to distinguish between <code>minor-page-faults</code> and <code>major-page-faults</code>.</li>
</ul>
<p>Note that, depending on the CPU model and manufacturer, the events available in <code>perf</code> differ. In some cases, dedicated cache-layer events, such as L1 cache misses / references, are available. In other cases, you will have to use the generic <code>cache-misses</code> / <code>cache-references</code> events.</p>
<h2 id="bypassing-cache">Bypassing cache</h2>
<p>One of the ways of influencing memory proposed by the paper is to bypass CPU caches when data is initialized and not reused immediately. This approach avoids pushing out data that is already cached in favour of data that will not be read soon.</p>
<p>One of the intrinsics provided by processors is for <em>non-temporal write</em> operations. A <em>non-temporal write</em> flags the data as non-temporal so that it is not stored in the CPU cache.</p>
<p>Below is the implementation of a standard vs non-temporal matrix initialization:</p>
<script src="https://gist.github.com/samueleresca/6fb753f9b90f0d3e32af3e687ec19953.js"></script>
<p>The <code>standard_initialize</code> function initializes a matrix in a standard way. The <code>nocache_initialize</code> function takes the <em>non-temporal</em> approach by using the <code>_mm_stream_si32</code> intrinsic function.</p>
<p>The <code>_mm_stream_si32</code> function is defined in the <a href="https://doc.rust-lang.org/beta/core/arch/x86_64/fn._mm_stream_si32.html">std::arch::x86_64::_mm_stream_si32</a> backed by the corresponding <a href="https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#text=_mm_stream_si32&amp;ig_expand=7207">Intel intrinsic</a>.</p>
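<p>As a condensed, hedged sketch of the same idea (the measured implementation lives in the gist above, so the details here may differ), a standard and a <em>non-temporal</em> initialization can be compared as follows; the size <code>N</code> is an arbitrary choice:</p>

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::_mm_stream_si32;

const N: usize = 128; // arbitrary matrix side, small for illustration

// Standard initialization: every store goes through the cache hierarchy,
// potentially evicting data that is still useful.
fn standard_initialize(m: &mut [i32]) {
    for (i, cell) in m.iter_mut().enumerate() {
        *cell = i as i32;
    }
}

// Non-temporal initialization: _mm_stream_si32 hints that the data will not
// be reused soon, so the store bypasses the caches.
#[cfg(target_arch = "x86_64")]
fn nocache_initialize(m: &mut [i32]) {
    for i in 0..m.len() {
        // SAFETY: the pointer is in-bounds for the slice.
        unsafe { _mm_stream_si32(m.as_mut_ptr().add(i), i as i32) };
    }
}

fn main() {
    let mut standard = vec![0i32; N * N];
    standard_initialize(&mut standard);

    #[cfg(target_arch = "x86_64")]
    {
        let mut streamed = vec![0i32; N * N];
        nocache_initialize(&mut streamed);
        // Both produce the same data; only the cache behaviour differs.
        assert_eq!(standard, streamed);
    }
}
```

On non-x86_64 targets the streaming variant is compiled out, since the intrinsic is x86-specific.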
<p>The following analysis uses the <code>perf stat</code> command to understand the impact of non-temporal writes on the process execution. Note that the code is built in optimized mode using the <code>--release</code> flag.</p>
<table>
<thead>
<tr>
<th>Event Name</th>
<th><em>standard_initialize</em></th>
<th><em>nocache_initialize</em></th>
</tr>
</thead>
<tbody>
<tr>
<td><code>l1d.replacement</code></td>
<td>1,052,585</td>
<td>532,639</td>
</tr>
<tr>
<td><code>l2_lines_in.all</code></td>
<td>1,497,033</td>
<td>91,486</td>
</tr>
</tbody>
</table>
<p>The table contains the event count comparison between standard and <em>non-temporal</em> writes. The <code>l1d.replacement</code> event counts the cache lines replaced in the L1 data cache. The <code>l2_lines_in.all</code> event counts the cache lines filling the L2 cache.</p>
<p>Note that the numbers in the tables are the average event counts for 10 executions. A single execution is monitored using the <code>perf stat</code> command.<br>
The events are part of the <a href="https://lore.kernel.org/lkml/tip-b115df076d337a727017538d11d7d46f5bcbff15@git.kernel.org/">Intel Ice Lake V1.00 events definitions</a>.</p>
<p>The analysis above shows how the initialization using the <em>non-temporal</em> approach roughly halves the L1d cache line replacements. On top of that, the number of L2 cache line fills is ~16x lower than with a standard initialization.</p>
<h2 id="cache-optimization-with-sequential-access-and-loop-tiling">Cache optimization with sequential access and loop tiling</h2>
<p>Sequential access is crucial for improving L1d cache performance because the processor prefetches data.</p>
<p>The paper takes as an example a matrix multiplication described as:</p>
<p>$$c_{i,j} \space = \space a_{i,1} \space b_{1,j} + \space a_{i,2} \space b_{2,j} + \space \cdots + \space a_{i,n} \space b_{n,j}$$</p>
<p>Matrix multiplication can be performed as follows:</p>
<script src="https://gist.github.com/samueleresca/f6a3e5e4b0041b397038ac3652f05d13.js"></script>
<p>The matrix multiplication above is an example of an operation that can be optimized by taking advantage of sequential access. The <code>m1</code> is accessed sequentially (by accessing the matrix per row), while the <code>m2</code> is not accessed sequentially (the implementation iterates on the columns first).</p>
<p>It is possible to transpose the <code>m2</code> matrix to access it sequentially. This is done in the <code>optimized</code> function below:</p>
<script src="https://gist.github.com/samueleresca/d8af00ccb0bc0b6bee01098219b0961a.js"></script>
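<p>As an inline, simplified sketch of the two access patterns (the measured versions are in the gists above; the flat row-major layout and the size <code>N</code> are assumptions of this example):</p>

```rust
const N: usize = 64; // arbitrary size for illustration

// Naive multiplication: m2 is walked column by column, i.e. non-sequentially.
fn non_optimized(m1: &[f64], m2: &[f64]) -> Vec<f64> {
    let mut res = vec![0.0; N * N];
    for i in 0..N {
        for j in 0..N {
            for k in 0..N {
                res[i * N + j] += m1[i * N + k] * m2[k * N + j];
            }
        }
    }
    res
}

// Transposing m2 first makes the inner loop walk both operands sequentially.
fn optimized(m1: &[f64], m2: &[f64]) -> Vec<f64> {
    let mut t = vec![0.0; N * N];
    for i in 0..N {
        for j in 0..N {
            t[j * N + i] = m2[i * N + j]; // transpose
        }
    }
    let mut res = vec![0.0; N * N];
    for i in 0..N {
        for j in 0..N {
            for k in 0..N {
                // Both m1 and t are now read along their rows.
                res[i * N + j] += m1[i * N + k] * t[j * N + k];
            }
        }
    }
    res
}

fn main() {
    let m1: Vec<f64> = (0..N * N).map(|x| x as f64).collect();
    let m2: Vec<f64> = (0..N * N).map(|x| (x % 7) as f64).collect();
    // Same result, different memory access pattern.
    assert_eq!(non_optimized(&m1, &m2), optimized(&m1, &m2));
}
```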
<p>Transposing <code>m2</code> before multiplying results in better performance. The table below shows the <code>cycles</code> and <code>instructions</code> perf events resulting from the execution of the <code>non_optimized</code> and <code>optimized</code> functions (again, the code has been compiled in release mode).</p>
<table>
<thead>
<tr>
<th>Event</th>
<th><em>non optimized</em></th>
<th><em>optimized</em></th>
</tr>
</thead>
<tbody>
<tr>
<td><code>cycles</code> (in billion)</td>
<td>13.66</td>
<td><strong>9.47</strong></td>
</tr>
<tr>
<td><code>instructions</code>  (in billion)</td>
<td>26.41</td>
<td><strong>24.97</strong></td>
</tr>
<tr>
<td><code>ins. per cycle</code></td>
<td>1.94</td>
<td><strong>2.64</strong></td>
</tr>
</tbody>
</table>
<p>The <code>ins. per cycle</code> is a metric that shows how many instructions are executed per cycle. The higher the value, the better the performance. The <code>optimized</code> function has higher <code>ins. per cycle</code> value because it is taking advantage of the sequential access for both the <code>m1</code> and <code>m2</code> matrices.</p>
<p>Another way to improve matrix multiplication speed is to increase the usage of L1d cache by implementing techniques such as <em>loop tiling</em>. The following code demonstrates how loop tiling is applied in matrix multiplication:</p>
<script src="https://gist.github.com/samueleresca/899ce8463a45ade58f5d7993a39f338c.js"></script>
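<p>A hedged inline sketch of the tiling idea (the measured implementation is in the gist above; the function name <code>multiply_tiled</code>, the tile size, and the pre-transposed second operand are assumptions of this sketch):</p>

```rust
const N: usize = 64;    // arbitrary matrix side
const TILE: usize = 16; // arbitrary block size; in practice sized to fit L1d

// Loop tiling: operate on TILE x TILE blocks so the working set of the inner
// loops stays resident in the L1d cache. m2_t is assumed already transposed.
fn multiply_tiled(m1: &[f64], m2_t: &[f64]) -> Vec<f64> {
    let mut res = vec![0.0; N * N];
    for ii in (0..N).step_by(TILE) {
        for jj in (0..N).step_by(TILE) {
            for kk in (0..N).step_by(TILE) {
                for i in ii..ii + TILE {
                    for j in jj..jj + TILE {
                        let mut acc = 0.0;
                        for k in kk..kk + TILE {
                            acc += m1[i * N + k] * m2_t[j * N + k];
                        }
                        res[i * N + j] += acc;
                    }
                }
            }
        }
    }
    res
}

fn main() {
    let m1: Vec<f64> = (0..N * N).map(|x| (x % 5) as f64).collect();
    let m2_t: Vec<f64> = (0..N * N).map(|x| (x % 3) as f64).collect();
    let tiled = multiply_tiled(&m1, &m2_t);

    // Reference: plain triple loop over the transposed operand.
    let mut reference = vec![0.0; N * N];
    for i in 0..N {
        for j in 0..N {
            for k in 0..N {
                reference[i * N + j] += m1[i * N + k] * m2_t[j * N + k];
            }
        }
    }
    assert_eq!(tiled, reference);
}
```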
<p>The loop tiling technique enhances cache locality by computing the multiplication in blocks that fit into the L1d cache, which maximizes cache usage and reduces cache misses. The results of the performance tests I ran on the three functions <code>non_optimized</code>, <code>optimized</code>, and <code>optimized_tiled</code> are presented in the table below:</p>
<table>
<thead>
<tr>
<th>Event</th>
<th><em>non optimized</em></th>
<th><em>optimized</em></th>
<th><em>optimized with loop tiling</em></th>
</tr>
</thead>
<tbody>
<tr>
<td><code>cycles</code> (in billion)</td>
<td>13.66</td>
<td>9.47</td>
<td><strong>7.05</strong></td>
</tr>
<tr>
<td><code>instructions</code> (in billion)</td>
<td>26.47</td>
<td>24.97</td>
<td><strong>27.77</strong></td>
</tr>
<tr>
<td><code>ins. per cycle</code></td>
<td>1.94</td>
<td>2.64</td>
<td><strong>3.94</strong></td>
</tr>
<tr>
<td><code>L1d-cache-load-misses</code> (in billion)</td>
<td>1.48</td>
<td>0.13</td>
<td><strong>0.05</strong></td>
</tr>
</tbody>
</table>
<p>As seen in the table, the <code>optimized_tiled</code> function outperforms the other two functions, with a higher <code>ins. per cycle</code> metric and fewer L1d cache misses. Note that additional optimization can probably be achieved using SIMD instructions. Again, that would involve the use of the intrinsics provided by the CPU vendor.</p>
<h2 id="optimizing-using-huge-pages">Optimizing using huge pages</h2>
<p>Another aspect to consider when optimizing an application is the use of <em>huge pages</em>. Huge pages are pages larger than the usual ~4KB page size. The advantage of using huge pages is that each TLB entry covers more memory, so the CPU can access more memory without having to walk the page table in main memory again. There is an awesome <a href="https://www.evanjones.ca/hugepages-are-a-good-idea.html">post by Evan Jones</a> that explains the advantages of using huge pages.<br>
The post refers to a <a href="https://www.usenix.org/conference/osdi21/presentation/hunter">Beyond malloc efficiency to fleet efficiency: a hugepage-aware memory allocator</a> paper by Google. The paper explains how they extended malloc to make it huge page aware and improve their requests-per-second (RPS) by 7.7% while reducing RAM usage by 2.4%.</p>
<p>Evan Jones provides the <a href="https://github.com/evanj/hugepagedemo">evanj/hugepagedemo</a> repository containing a Rust demo showing the advantages of huge pages. Again, the demo uses <code>perf</code> to analyze the performance of the code.</p>
<p>Some other techniques for obtaining huge pages are described in the <a href="https://www.hudsonrivertrading.com/hrtbeat/low-latency-optimization-part-2/">Low Latency Optimization: Using Huge Pages on Linux (Part 2)</a> post by Guillaume Morin.</p>
<h2 id="wrap-up">Wrap-Up</h2>
<p>This post covered some of the techniques that can be used to perform ad-hoc optimizations on the code: bypassing the cache using non-temporal writes, optimizing cache usage with sequential access and loop tiling, and using huge pages to improve TLB performance. These techniques can be applied in most programming languages. It is worth noting that these optimizations are strongly dependent on the CPU architecture. The same applies to the <code>perf</code> events tracked in the post.<br>
Finally, most of the theoretical concepts are explained in the <a href="https://lwn.net/Articles/250967/">What every programmer should know about memory</a> paper. The full code samples are available in the <a href="https://github.com/samueleresca/what-every-programmer-should-know-about-memory">samueleresca/what-every-programmer-should-know-about-memory</a> repository.</p>
<h2 id="references">References</h2>
<p><a href="https://lwn.net/Articles/255364/">What every programmer should know about memory (What a programmer can do) - Ulrich Drepper</a></p>
<p><a href="https://www.evanjones.ca/hugepages-are-a-good-idea.html">Hugepages are a good idea - Evan Jones</a></p>
<p><a href="https://www.usenix.org/conference/osdi21/presentation/hunter">Beyond malloc efficiency to fleet efficiency: a hugepage-aware memory allocator - A.H. Hunter, Chris Kennelly, Paul Turner, Darryl Gove, Tipp Moseley and Parthasarathy Ranganathan</a></p>
<p><a href="https://www.hudsonrivertrading.com/hrtbeat/low-latency-optimization-part-2/">Low Latency Optimization: Using Huge Pages on Linux (Part 2) - Guillaume Morin</a></p>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Analysis of 'What Every Programmer Should Know About Memory']]></title><description><![CDATA[This post highlights the main concepts useful to software engineers, presented in the preparatory part of the "What every programmer should know about memory" paper. The 2nd part of this series provides some Rust examples to explore how to write memory-optimized code]]></description><link>https://samueleresca.net/analysis-of-what-every-programmer-should-know-about-memory/</link><guid isPermaLink="false">63ea0538827b68149a697b2e</guid><category><![CDATA[performance]]></category><dc:creator><![CDATA[Samuele Resca]]></dc:creator><pubDate>Sat, 20 May 2023 13:41:59 GMT</pubDate><media:content url="https://samueleresca.net/content/images/2023/04/boccadasse-2.png" medium="image"/><content:encoded><![CDATA[<!--kg-card-begin: markdown--><img src="https://samueleresca.net/content/images/2023/04/boccadasse-2.png" alt="Analysis of &apos;What Every Programmer Should Know About Memory&apos;"><p>I have recently attended a Linux kernel programming training. One of the references in the memory management section of the training was the paper <a href="https://people.freebsd.org/~lstewart/articles/cpumemory.pdf">What Every Programmer Should Know About Memory</a> by Ulrich Drepper. The paper was published as a series in the <a href="https://lwn.net/">LWN.net</a> newsletter.</p>
<!--kg-card-end: markdown--><!--kg-card-begin: markdown--><p>This post highlights the main concepts useful to software engineers presented in the paper. The 2nd part of this series (see <a href="https://samueleresca.net/memory-management-optimizations-techniques">Memory management optimization techniques</a>) provides some Rust examples to explore how to write memory-optimized code (coming from the &quot;What a programmer can do&quot; section).</p>
<h2 id="this-content-is-from-2007-why-i-should-care">This content is from 2007, why should I care?</h2>
<p>Ulrich Drepper&apos;s paper titled &quot;What Every Programmer Should Know About Memory&quot; was published in 2007.</p>
<p>The paper provides an extensive exploration of memory, covering topics such as management, organization, and optimization. These fundamental concepts continue to apply today, and their understanding is crucial for programmers to write efficient and effective code.</p>
<p>Some of the concepts described in the paper are still widely referred to in the industry. For example, <a href="https://www.hudsonrivertrading.com/hrtbeat/low-latency-optimization-part-1/">Low Latency Optimization: Understanding Huge Pages (Part 1)</a> refers to the paper in the Further reading section. Furthermore, the concepts explored in Drepper&apos;s paper can be used to comprehend and make decisions based on the output of performance tools such as <code>perf</code>.</p>
<p>In summary, despite being over a decade old, Ulrich Drepper&apos;s paper remains highly pertinent to software engineering today. Some of its insights and concepts continue to be applicable, and understanding them is essential for optimal code performance.</p>
<h2 id="commodity-hardware-architectures">Commodity hardware architectures</h2>
<p>The paper starts with a section about the anatomy of commodity hardware. Systems scaling is usually achieved horizontally these days, and commodity hardware is what services and applications run on.</p>
<p>The commodity hardware structure usually involves the following:</p>
<ul>
<li>The <em>northbridge</em> connects the RAM with the CPUs. The northbridge also communicates with the southbridge to reach all other system devices.</li>
<li>The <em>FSB</em> is the bus that connects the CPUs with the northbridge.</li>
<li>The <em>memory controller</em> is the controller associated with each RAM. Different types of RAM correspond to different memory controllers. The memory controller can be contained in the northbridge or it can be a standalone component.</li>
<li>The <em>southbridge</em> (a.k.a. I/O bridge) handles communications with all the system devices (e.g. SATA, USB, PCI-E).</li>
</ul>
<p>The first hardware architecture presented in the paper is the following:</p>
<p><img src="https://samueleresca.net/content/images/2023/05/north_south_bridge_architecture-1.png" alt="Analysis of &apos;What Every Programmer Should Know About Memory&apos;" loading="lazy"></p>
<p>The main bottleneck in the architecture above is the northbridge&apos;s communication with the RAM. In this architecture, the northbridge contains the memory controller for interfacing with the RAM. Therefore, all the data goes through a single memory bus and a single memory controller.</p>
<p>Nowadays, architectures integrate the <em>memory controllers into the CPU</em>, so there is no need for the northbridge at all. Below is the schema describing an architecture without the northbridge:</p>
<p><img src="https://samueleresca.net/content/images/2023/05/integrated_memory_controller_architecture.png" alt="Analysis of &apos;What Every Programmer Should Know About Memory&apos;" loading="lazy"></p>
<p>In this case, the memory bandwidth is proportional to the number of CPUs. This architecture leads to non-uniform memory because the memory is now bound to a CPU (local memory), an organization usually referred to as NUMA: Non-Uniform Memory Architecture (<a href="#numa">see the dedicated section for more details</a>).</p>
<h3 id="optimizing-device-access-with-dma">Optimizing device access with DMA</h3>
<p>In the early architectures, every communication to/from a device had to pass through the CPU. This has been solved with DMA. DMA is an optimization for offloading the CPU: it allows transferring data from devices with minimal CPU involvement, as the CPU only needs to initiate the transfer and can then move on to other tasks.</p>
<h2 id="numa">NUMA</h2>
<p>NUMA stands for Non-Uniform Memory Architecture. The non-uniform part refers to the locality of the memory relative to the CPU.</p>
<p>In a NUMA architecture, a processor can access its local memory faster than non-local memory. As already mentioned, the Non-uniform memory is needed because the northbridge becomes a severe bottleneck since all the memory traffic is routed through it. So, the memory controller and the memory are integrated into the CPU.</p>
<p>NUMA requires that the OS takes into account the distributed nature of the memory:</p>
<ul>
<li>if a process runs on a given processor, the physical RAM assigned to the process address space should come from the local memory of that processor.</li>
<li>the OS should not migrate a process or a thread from one node to another. If migrating the process across nodes is necessary, the selection of the new processor should be restricted to processors that do not have higher access costs to the memory than the current processor.</li>
</ul>
<p>By default, memory in NUMA is not allocated only on the local node, to avoid saturating the local memory of nodes that are running large processes.</p>
<h2 id="cpu-caches">CPU caches</h2>
<p>The memory is not as fast as the CPU. Thus, the need for CPU caches between CPUs and memory. It is feasible to build a faster RAM (see SRAM internals), but it is not cheap. The choice is either for a large amount of DRAM or for a smaller amount of very fast SRAM.</p>
<p>The approach taken on most of the commodity hardware is hybrid. The DRAM is used as the main memory facility. On top of that, there is a small amount of SRAM to cache data used in the near future by the processor.</p>
<h3 id="architecture">Architecture</h3>
<p>The CPU cache architecture consists of multiple cache layers. The reason for having many layers is historical: new layers were added as the speed difference between the caches and the main memory kept increasing.</p>
<p>Below is the schema of a 3-level cache architecture:</p>
<p><img src="https://samueleresca.net/content/images/2023/05/3_level_cpu_architecture.png" alt="Analysis of &apos;What Every Programmer Should Know About Memory&apos;" loading="lazy"></p>
<p>The speed is higher in the cache layers closer to the core. The architecture presents two flavours of <code>L1</code> cache:</p>
<ul>
<li><code>L1d</code> cache is dedicated to the data.</li>
<li><code>L1i</code> cache is dedicated to the instructions.</li>
</ul>
<p>In a multi-processor / multi-core architecture a cache might be shared depending on the level. The paper describes the cache in a multi-processor architecture with the following schema:</p>
<p><img src="https://samueleresca.net/content/images/2023/05/multiprocessor_architecture.png" alt="Analysis of &apos;What Every Programmer Should Know About Memory&apos;" loading="lazy"></p>
<p>The schema represents 2 processors, each with 2 cores, and each core with 2 hardware threads. The <code>L2</code> and <code>L3</code> caches are shared across the cores within a processor, while the threads within a core share the <code>L1</code> caches.</p>
<h3 id="cpu-caches-implementation">CPU caches implementation</h3>
<p>All the data read or written by the CPU is stored in the CPU caches. If the CPU needs a data word, it searches in the caches first. Entries are stored in the cache as lines of several contiguous words for the following reasons:</p>
<ul>
<li>The <em>spatial locality</em>: in a short period, there is a good chance that the same code or data gets reused.</li>
<li>The RAM modules are more effective if they can transport many data words in a row.</li>
</ul>
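<p>Spatial locality is easy to observe in code: traversing a row-major matrix row by row touches consecutive addresses (and therefore consecutive cache lines), while traversing it column by column jumps a whole row ahead on every access. A small illustrative sketch, with arbitrary sizes:</p>

```rust
const ROWS: usize = 512; // arbitrary sizes for illustration
const COLS: usize = 512;

fn main() {
    let m = vec![1u32; ROWS * COLS]; // row-major flat matrix

    // Row-wise traversal: consecutive addresses, friendly to cache lines
    // and to the hardware prefetcher (spatial locality).
    let mut sum_rows = 0u64;
    for r in 0..ROWS {
        for c in 0..COLS {
            sum_rows += m[r * COLS + c] as u64;
        }
    }

    // Column-wise traversal: each access is COLS elements apart, so every
    // access may land on a different cache line.
    let mut sum_cols = 0u64;
    for c in 0..COLS {
        for r in 0..ROWS {
            sum_cols += m[r * COLS + c] as u64;
        }
    }

    // Same result, very different cache behaviour.
    assert_eq!(sum_rows, sum_cols);
}
```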
<p>When the processor needs memory content, the entire cache line is loaded into the <code>L1d</code>. The following steps describe, at a high level, the process for <em>modifying memory</em>:</p>
<ol>
<li>The processor loads a cache line first.</li>
<li>The processor modifies the interested part of the cache line, marking it as &quot;dirty&quot;.</li>
<li>When the processor writes back the changes to the main memory the dirty flag is cleared.</li>
</ol>
<p>To load new data into a cache, it is necessary to make room in it first. The <em>eviction process</em> from <code>L1d</code> pushes the cache line down into <code>L2</code>. In turn, evicting a cache line from <code>L2</code> means pushing the content into <code>L3</code>. Finally, <code>L3</code> pushes content into the main memory.</p>
<h3 id="associativity">Associativity</h3>
<p>The core problem in CPU cache implementations is that many memory locations compete for each place in the cache. The cache associativity plays a role in the performance of the overall system. Specifically, the associativity determines where a particular memory block can be placed when it is loaded into the cache.</p>
<pre><code>     &#x250C;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x252C;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x252C;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2510;
     &#x2502;     TAG      &#x2502;      SET INDEX     &#x2502;   BLOCK   &#x2502;
     &#x2502;              &#x2502;                    &#x2502;   OFFSET  &#x2502;
     &#x2514;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2534;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2534;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2500;&#x2518;
          t bits            s bits           b bits
</code></pre>
<p>The caches are organized in sets divided into cache lines. The addresses referring to cached values are usually composed of the following parts:</p>
<ul>
<li><strong>Tag</strong> represents the tag to store alongside the cached value. The tag contains the address of the target data in the main memory. It is needed because in some cases many memory addresses can end up being stored in the same cache line, see associativity types below.</li>
<li><strong>Set index</strong> identifies the target set.</li>
<li><strong>Block offset</strong> identifies the offset within the cache line. A cache line can store many words; the offset is needed to identify a single word.</li>
</ul>
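<p>The address decomposition above can be sketched in code. The bit widths here are illustrative assumptions: a 64-byte cache line gives 6 block-offset bits, and 64 sets give 6 set-index bits:</p>

```rust
// Illustrative parameters: 64-byte lines (b = 6 bits), 64 sets (s = 6 bits).
const BLOCK_BITS: u32 = 6;
const SET_BITS: u32 = 6;

// Split an address into (tag, set index, block offset).
fn split_address(addr: u64) -> (u64, u64, u64) {
    let offset = addr & ((1 << BLOCK_BITS) - 1);          // low b bits
    let set = (addr >> BLOCK_BITS) & ((1 << SET_BITS) - 1); // next s bits
    let tag = addr >> (BLOCK_BITS + SET_BITS);            // remaining t bits
    (tag, set, offset)
}

fn main() {
    let (tag, set, offset) = split_address(0xdead_beef);
    assert_eq!(offset, 0x2f);             // low 6 bits
    assert_eq!(set, 0x3b);                // next 6 bits
    assert_eq!(tag, 0xdead_beef >> 12);   // everything above offset + set
}
```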
<p>There are 3 types of CPU caches: the <em>direct-mapped</em>, <em>fully-associative</em> and <em>set-associative</em> types. Each of them determines how the content of the cache is stored and fetched.</p>
<h4 id="direct-mapped-cache">Direct-mapped cache</h4>
<p>The cache is organized into many sets with a single cache line per set. Thus, a memory block can only occupy a single cache line.</p>
<p>The <em>write pattern</em> is as simple as placing the memory block in the set index identified by the address. The tag is stored alongside the set. The new data replace the old one already stored.</p>
<p>The <em>search pattern</em> is as simple as retrieving the set corresponding to the set index contained in the address. If the tag (stored alongside the set) corresponds to the tag in the address, the data is retrieved from the cache (cache hit). Otherwise, it is a cache miss.</p>
<p>The advantage of this approach is that it is simple and cheap to implement. The disadvantage is that it leads to a lot of <em>cache misses</em>.</p>
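<p>A toy sketch of the direct-mapped write and search patterns (the number of sets and the line size are arbitrary choices; a real cache stores the data words of the line alongside the tag, which is omitted here):</p>

```rust
const SETS: usize = 16;    // arbitrary number of sets
const BLOCK_BITS: u32 = 6; // 64-byte cache lines

// One cache line per set, holding only (valid, tag).
struct DirectMapped {
    lines: [(bool, u64); SETS],
}

impl DirectMapped {
    fn new() -> Self {
        Self { lines: [(false, 0); SETS] }
    }

    // Returns true on a hit; on a miss, the new tag replaces the old line.
    fn access(&mut self, addr: u64) -> bool {
        let line_addr = addr >> BLOCK_BITS;
        let set = (line_addr as usize) % SETS; // set index bits
        let tag = line_addr / SETS as u64;     // remaining high bits
        let (valid, stored_tag) = self.lines[set];
        if valid && stored_tag == tag {
            return true; // cache hit
        }
        self.lines[set] = (true, tag); // write pattern: replace unconditionally
        false
    }
}

fn main() {
    let mut cache = DirectMapped::new();
    assert!(!cache.access(0x1000)); // cold miss
    assert!(cache.access(0x1000));  // hit: same address
    assert!(cache.access(0x1004));  // hit: same 64-byte line
    // 0x1000 and 0x5000 map to the same set with different tags:
    assert!(!cache.access(0x5000)); // conflict miss, evicts the previous line
    assert!(!cache.access(0x1000)); // the evicted line misses again
}
```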
<h4 id="fully-associative-cache">Fully-associative cache</h4>
<p>A fully-associative cache is organized as a single set with many cache lines.<br>
A memory block can be placed in any of the cache lines. The addresses of a fully-associative cache do not have any <em>Set index</em>, as there is only one set.</p>
<p>The <em>write pattern</em> consists of looking for the first available cache line to place the memory block in. If no space is available, a block in the cache is evicted based on a replacement policy.</p>
<p>The <em>search pattern</em> consists of comparing the <em>Tag</em> bits of the memory address against the tag stored alongside each cache line. The <em>Offset</em> is used to select the byte to return.</p>
<p>The advantage is that the cache is fully utilised and the cache hit rate is maximized. The disadvantages are that the <em>search pattern</em> has to iterate over all the cache lines and that the implementation is expensive.</p>
<p>Because of the performance and hardware costs, this approach is usually adopted for small caches with a few dozen entries.</p>
<h4 id="set-associative">Set-associative</h4>
<p>It is a hybrid between the fully associative and the direct-mapped approach. Each set can contain many cache lines. It is the go-to approach for CPU caches.</p>
<p>A memory block is first mapped to a set and then it can be placed into any cache line of that set.</p>
<p>The <em>write pattern</em> consists of placing the memory block in any of the cache lines in the set defined by the <em>Set index</em>. The tag is stored in the field associated with the cache line. If all the cache lines in that set are occupied, one of its blocks is evicted.</p>
<p>The <em>search pattern</em> identifies the set using the <em>Set index</em>. The <em>Tag</em> bit is compared with the tags of all cache lines of the set.</p>
<p>This approach is a trade-off between the previous two approaches.</p>
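<p>A toy sketch of the set-associative write and search patterns (the sizes and the FIFO replacement policy are arbitrary simplifications; real caches typically use pseudo-LRU and store the data words as well):</p>

```rust
const SETS: usize = 8;     // arbitrary number of sets
const WAYS: usize = 4;     // arbitrary associativity (4-way)
const BLOCK_BITS: u32 = 6; // 64-byte cache lines

// Each set holds up to WAYS tags.
struct SetAssociative {
    sets: Vec<Vec<u64>>,
}

impl SetAssociative {
    fn new() -> Self {
        Self { sets: vec![Vec::new(); SETS] }
    }

    // Returns true on a hit; on a miss, installs the tag, evicting FIFO-style.
    fn access(&mut self, addr: u64) -> bool {
        let line_addr = addr >> BLOCK_BITS;
        let set = (line_addr as usize) % SETS;
        let tag = line_addr / SETS as u64;
        let ways = &mut self.sets[set];
        // Search pattern: compare the tag against every line of the set.
        if ways.contains(&tag) {
            return true;
        }
        // Write pattern: use a free way, otherwise evict the oldest entry.
        if ways.len() == WAYS {
            ways.remove(0);
        }
        ways.push(tag);
        false
    }
}

fn main() {
    let mut cache = SetAssociative::new();
    // Four lines that all map to set 0 can coexist in a 4-way cache...
    for i in 0..4u64 {
        assert!(!cache.access(i * (SETS as u64) * 64)); // cold misses
    }
    for i in 0..4u64 {
        assert!(cache.access(i * (SETS as u64) * 64)); // all hits
    }
    // ...but a fifth conflicting line evicts the oldest one.
    assert!(!cache.access(4 * (SETS as u64) * 64));
    assert!(!cache.access(0)); // line 0 was evicted (FIFO)
}
```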
<h3 id="write-behaviours-in-a-single-processor">Write behaviours in a single processor</h3>
<p>The paper describes 3 cache policy implementations to achieve coherency in a single processor scenario: <em>write-through</em>, <em>write-back</em> and <em>write-combining</em>.</p>
<p>The <em>write-through</em> implementation is the simplest. If the cache line is written to, the processor immediately also writes it to the main memory. This implementation is simple but slow, and it creates a lot of traffic on the FSB.</p>
<p>The <em>write-back</em> implementation consists of marking the cache line dirty just after it is updated. When the cache line is dropped from the cache, the dirty bit instructs the processor to write the data back to the main memory. The <em>write-back</em> approach is more complex, but it provides better performance and is the approach adopted by most processors.</p>
<p>The <em>write-combining</em> implementation combines many write accesses before the cache line is written out. This approach is used to reduce the number of bus transfers, and it is usually employed when a device is involved, such as a GPU.</p>
<h3 id="multi-processor-coherency-and-mesi">Multi-processor coherency and MESI</h3>
<p>The cache behaviours get more complicated in a multi-processor setup. The <em>write-back</em> implementation alone fails to maintain coherency across many processors.</p>
<p>Let&apos;s take as an example a scenario with 2 processors (<code>P1</code>, <code>P2</code>). If <code>P1</code> writes an updated cache line, that cache line won&apos;t be available either in the main memory or in the caches of <code>P2</code>, as they don&apos;t share the same caches. Thus, <code>P2</code> would read out-of-date information.</p>
<p>Providing access from <code>P2</code> to the cache of <code>P1</code> would be impractical for performance reasons. Thus, the go-to approach is to transfer the cached content over to the other processor in case it is needed. The same is also valid for the caches that are not shared on the same processor. The cache line transfer happens when a processor needs a cache line which is dirty in another processor.</p>
<p>The <em>MESI</em> is the coherency protocol that is used to implement <em>write-back</em> caches in a multi-processor scenario.</p>
<p>MESI takes its name from the 4 states a cache line can be in:</p>
<ul>
<li><em>Modified</em>: the cache line is dirty and only present in the current cache.</li>
<li><em>Exclusive</em>: the cache line is only present in the current cache, but it is not modified.</li>
<li><em>Shared</em>: the cache line is stored in many caches but is clean (it matches the main memory).</li>
<li><em>Invalid</em>: the cache line is invalid.</li>
</ul>
<p>The cache lines are initialized as <code>Invalid</code>:</p>
<ul>
<li>when data is loaded into the cache for writing, the cache line is marked as <code>Modified</code>.</li>
<li>when data is loaded for reading, the state depends on the other processors:
<ul>
<li>if another processor has the same data then the line is marked as <code>Shared</code></li>
<li>if no other processor has the same data then the line is <code>Exclusive</code></li>
</ul>
</li>
</ul>
<p>The state of a cache line is subject to 2 main stimuli:</p>
<ul>
<li>a <em>local request</em>, which does not need to notify other processors. For example, it happens when a <code>Modified</code> or <code>Exclusive</code> cache line is read or written locally. In this case, the state does not change.</li>
<li>a <em>bus request</em>, which happens when a processor needs to notify another processor. Some examples:
<ul>
<li>a processor called <code>P2</code> wants to read the cache line from another processor <code>P1</code>. <code>P1</code> has to send the content of the cache line to <code>P2</code> and change the state to <code>Shared</code>. The data is also sent to the memory controller, which stores it in memory.</li>
<li>a processor called <code>P2</code> wants to write a cache line that is in a <code>Shared</code> or <code>Invalid</code> state. This operation triggers a bus request to notify the other processors so that they mark their local copies as <code>Invalid</code>.</li>
</ul>
</li>
</ul>
<p>In general, a write operation can only be performed as a <em>local request</em> if the cache line is in the <code>Modified</code> / <code>Exclusive</code> state. If a write request acts on a cache line in a <code>Shared</code> state, all the other cached copies must be invalidated first. The invalidation happens by notifying the other processors of the change. This is also known as a <em>Request For Ownership (RFO)</em>.</p>
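<p>The state transitions described above can be summarized in a small sketch; reducing the bus traffic to three transition functions is a simplification of the protocol, not a full implementation:</p>

```rust
// The four MESI states of a cache line.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Mesi {
    Modified,
    Exclusive,
    Shared,
    Invalid,
}

// Transition on a write issued by the owning processor. From Modified or
// Exclusive this is a purely local request; from Shared or Invalid an RFO
// bus transaction invalidates all other copies first.
fn local_write(state: Mesi) -> Mesi {
    match state {
        Mesi::Modified | Mesi::Exclusive => Mesi::Modified, // local request only
        Mesi::Shared | Mesi::Invalid => Mesi::Modified,     // after an RFO
    }
}

// Transition observed by a processor when ANOTHER processor reads its line.
fn remote_read(state: Mesi) -> Mesi {
    match state {
        // A dirty line is transferred (and written back); both copies end Shared.
        Mesi::Modified | Mesi::Exclusive | Mesi::Shared => Mesi::Shared,
        Mesi::Invalid => Mesi::Invalid,
    }
}

// Transition observed when another processor performs an RFO on the line.
fn remote_write(_state: Mesi) -> Mesi {
    Mesi::Invalid // the local copy is always invalidated
}

fn main() {
    let mut line = Mesi::Exclusive; // loaded for reading, no other copies
    line = local_write(line);
    assert_eq!(line, Mesi::Modified);
    line = remote_read(line); // another processor reads: write back, share
    assert_eq!(line, Mesi::Shared);
    line = remote_write(line); // another processor writes: invalidate
    assert_eq!(line, Mesi::Invalid);
}
```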
<h2 id="performance-considerations-on-cpu-caches">Performance considerations on CPU caches</h2>
<p><em>Associativity</em> plays a crucial role in reducing cache misses, and thus in the performance of the CPU caches. The paper demonstrates how moving from direct mapping to set associativity can reduce cache misses by 44% in an 8MB L2 cache.</p>
<p>It is worth mentioning that another factor that plays a role in reducing cache misses is the <em>cache size</em>: the larger the cache, the lower the chance of a cache miss.</p>
<p><em>Request for ownership (RFO)</em> operations in the context of MESI can slow things down in multi-processor scenarios. They are expensive coordination operations: the coherency messages need to be sent to <code>N</code> processors and a reply awaited from all of them, so the speed of the exchange is determined by the longest possible time needed to get a reply.</p>
<h2 id="virtual-memory">Virtual Memory</h2>
<p>The paper proceeds by describing the virtual memory system used by the processors. The virtual memory provides the virtual address spaces used by the processors. Some of the advantages of virtual memory:</p>
<ul>
<li>Virtual memory offloads the physical memory by moving in-memory data to disk.</li>
<li>It provides a transparent abstraction to the process: each process sees memory as a single contiguous chunk.</li>
</ul>
<p>The component taking care of the virtual address space is the <em>Memory management unit (MMU)</em>.</p>
<h3 id="address-translation">Address translation</h3>
<p>The core functionality of the Memory management unit (MMU) is the translation from a virtual address to a physical address.</p>
<p>The translation is implemented using the <em>multi-level page table</em> structure. Defined in the paper with the following schema:</p>
<p><img src="https://samueleresca.net/content/images/2023/05/multi_level_page_schema.png" alt="Analysis of &apos;What Every Programmer Should Know About Memory&apos;" loading="lazy"></p>
<p>The virtual address is divided into 5 parts. The <code>Level 4 index</code> points to the <code>Level 4 Entry</code>, which is a reference to the <code>Level 3 Directory</code>. The structure continues until the <code>Level 1 Directory</code>, where the entry contains the reference to the physical page. The physical page address combined with the offset gives the physical address, and the operation of recursively navigating the page table is called <em>page tree walking</em>.</p>
<p>The data structure for building the page table is kept in main memory. Each process has its own page table, and the CPU is notified of every creation or change of a page table. The page tree walking operation is expensive: for example, for a 4-level page tree, the operation needs at least 4 memory accesses. On top of that, the address computation cannot be parallelized, because each level depends on the entry found in the upper-level directory.</p>
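<p>The split of the virtual address can be sketched in Go, assuming the common x86-64 layout of four 9-bit directory indexes plus a 12-bit page offset (the helper name is made up for illustration):</p>

```go
package main

import "fmt"

// splitVirtualAddress sketches the x86-64 4-level layout: a 48-bit
// virtual address is divided into four 9-bit directory indexes plus a
// 12-bit page offset, so a page-tree walk needs one memory access per level.
func splitVirtualAddress(va uint64) (idx [4]uint64, offset uint64) {
	offset = va & 0xFFF // low 12 bits: offset inside the 4 KiB page
	for level := 0; level < 4; level++ {
		// idx[0] is the Level 4 index (highest bits), idx[3] is Level 1.
		idx[level] = (va >> (12 + 9*uint(3-level))) & 0x1FF
	}
	return idx, offset
}

func main() {
	idx, off := splitVirtualAddress(0x7FFF_1234_5678)
	fmt.Println(idx, off)
}
```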
<p>Thus, it is necessary to cache the computed mappings between virtual and physical addresses. The cached values are stored in the <em>Translation Look-Aside Buffer (TLB)</em>.</p>
<p>The <em>Translation Look-Aside Buffer (TLB)</em> is usually built with a set-associative or a fully associative approach (depending on the purpose of the architecture). If the tag has a match in the cache, the physical address is computed by adding the offset specified in the virtual address. In case of a TLB cache miss, the CPU needs to execute the <em>page tree walking</em> process and store the result in the TLB cache.</p>
<p>Note that the TLB follows the same structure as the other CPU caches, so it can be built in multiple layers: L1TLB, L2TLB. Furthermore, there can be multiple TLBs for different purposes: an instruction TLB and a data TLB.</p>
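<p>A toy model of the TLB lookup path might look as follows. The <code>TLB</code> type and its <code>walk</code> function are illustrative assumptions, not a real MMU: the walk here is a made-up mapping that just stands in for the expensive multi-level page-table traversal:</p>

```go
package main

import "fmt"

// A toy TLB: maps virtual page numbers to physical page numbers.
// On a miss, the (expensive) page-tree walk runs and the result is cached.
type TLB struct {
	entries map[uint64]uint64
	walks   int // how many page-tree walks were needed
}

// walk stands in for the multi-level page-table walk; here it is a
// hypothetical mapping, just for illustration.
func (t *TLB) walk(vpn uint64) uint64 {
	t.walks++
	return vpn + 0x1000 // made-up physical page number
}

// Translate returns the physical address for a virtual address,
// consulting the TLB first (pages are 4 KiB, so the offset is 12 bits).
func (t *TLB) Translate(va uint64) uint64 {
	vpn, offset := va>>12, va&0xFFF
	ppn, hit := t.entries[vpn]
	if !hit {
		ppn = t.walk(vpn)
		t.entries[vpn] = ppn
	}
	return ppn<<12 | offset
}

func main() {
	tlb := &TLB{entries: map[uint64]uint64{}}
	tlb.Translate(0x2000) // miss: triggers a walk
	tlb.Translate(0x2ABC) // same page: TLB hit
	fmt.Println(tlb.walks) // only one walk for the two translations
}
```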
<h2 id="performance-considerations-on-tlb">Performance considerations on TLB</h2>
<p>The TLB size is limited, like the other CPU caches. The performance of the TLB can be influenced by changing the size of the pages.</p>
<p>Larger pages reduce the number of address translations needed, meaning that fewer TLB entries are needed. Furthermore, in case of a TLB miss, the page tree walk will be faster as the number of levels will be smaller.</p>
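<p>The effect on the number of TLB entries is easy to quantify. As a quick sketch (the <code>pagesNeeded</code> helper is hypothetical), mapping a 1 GiB buffer with 2 MiB huge pages needs 512 times fewer entries than with 4 KiB pages:</p>

```go
package main

import "fmt"

// pagesNeeded returns how many pages (and therefore how many TLB entries)
// are needed to map a buffer of the given size, rounding up.
func pagesNeeded(bufferSize, pageSize uint64) uint64 {
	return (bufferSize + pageSize - 1) / pageSize
}

func main() {
	const GiB = 1 << 30
	fmt.Println(pagesNeeded(GiB, 4<<10)) // 4 KiB pages: 262144 entries
	fmt.Println(pagesNeeded(GiB, 2<<20)) // 2 MiB huge pages: 512 entries
}
```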
<p>The downside of this approach is that large pages must be contiguous in physical memory. This results in more fragmented memory and an increase in wasted memory.</p>
<p>Huge pages are beneficial in scenarios where performance matters and resources are not a concern, for example, database servers or latency-sensitive systems. This series of posts provides a detailed overview of the use of huge pages:</p>
<p><a href="https://www.hudsonrivertrading.com/hrtbeat/low-latency-optimization-part-1/">Low Latency Optimization: Using Huge Pages on Linux (Part 1)</a></p>
<p><a href="https://www.hudsonrivertrading.com/hrtbeat/low-latency-optimization-part-2/">Low Latency Optimization: Using Huge Pages on Linux (Part 2)</a></p>
<h2 id="wrap-up">Wrap-Up</h2>
<p>This post analyzes some of the topics discussed in the paper <a href="https://people.freebsd.org/~lstewart/articles/cpumemory.pdf">What Every Programmer Should Know About Memory by Ulrich Drepper</a>. CPU caches, virtual memory, the TLB, and the anatomy of the hardware all play a role in optimizing the performance of a system.</p>
<p>The next post (<a href="https://samueleresca.net/memory-management-optimizations-techniques">Memory management optimization techniques</a>) of this series is more hands-on: it explores the concepts introduced in the &quot;What programmers can do&quot; section of Ulrich&apos;s paper.</p>
<h2 id="references">References</h2>
<p><a href="https://people.freebsd.org/~lstewart/articles/cpumemory.pdf">What Every Programmer Should Know About Memory - Ulrich Drepper</a></p>
<p><a href="https://www.hudsonrivertrading.com/hrtbeat/low-latency-optimization-part-1/">Low Latency Optimization: Using Huge Pages on Linux (Part 1)</a></p>
<p><a href="https://www.hudsonrivertrading.com/hrtbeat/low-latency-optimization-part-2/">Low Latency Optimization: Using Huge Pages on Linux (Part 2)</a></p>
<p><a href="https://www.brendangregg.com/perf.html">perf tool</a></p>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Techniques for fuzz testing]]></title><description><![CDATA[Fuzz testing is a broad topic with many approaches and strategies.

This post summarizes some techniques for fuzz testing and the learnings I have made. It also goes through some fuzz tests running on some cloud-native foundation projects, such as etcd.]]></description><link>https://samueleresca.net/techniques-for-fuzz-testing/</link><guid isPermaLink="false">63d9a4dc2bc5da0fb16ac193</guid><category><![CDATA[security]]></category><category><![CDATA[Distributed Systems]]></category><category><![CDATA[fuzzing]]></category><category><![CDATA[etcd]]></category><category><![CDATA[fuzz]]></category><dc:creator><![CDATA[Samuele Resca]]></dc:creator><pubDate>Mon, 05 Dec 2022 09:31:42 GMT</pubDate><media:content url="https://samueleresca.net/content/images/2022/11/greenwich_uni.jpg" medium="image"/><content:encoded><![CDATA[<div class="kg-card kg-callout-card kg-callout-card-blue"><div class="kg-callout-emoji">&#x1F4A1;</div><div class="kg-callout-text">This post refers to the following <a href="https://github.com/dotnet/designs/pull/273">.NET design proposal</a>.</div></div><div class="kg-card kg-callout-card kg-callout-card-blue"><div class="kg-callout-emoji">&#x1F4A1;</div><div class="kg-callout-text">This post refers to the following pull request <a href="https://github.com/etcd-io/etcd/pull/14561">etcd-io/etcd/pull/14561</a>.</div></div><img src="https://samueleresca.net/content/images/2022/11/greenwich_uni.jpg" alt="Techniques for fuzz testing"><p></p><!--kg-card-begin: markdown--><p>Recently, I have been working on a fuzzer for testing the resilience of a parser. The fuzzer detected some crashes in a parser running on top of well-established production code. That made me realize how critical this testing technique is for the security of a system.</p>
<p>Fuzz testing is a broad topic with many approaches and strategies.</p>
<p>This post summarizes some techniques for fuzz testing and what I have learned. It also goes through some fuzz tests running in Cloud Native Computing Foundation (CNCF) projects, such as <a href="https://github.com/etcd-io/etcd">etcd</a>.</p>
<p>The index below summarizes the post sections:</p>
<ul>
<li><a href="#anatomy-of-fuzzing">Anatomy of fuzzing</a></li>
<li><a href="#black-box-fuzzing">Black-box fuzzing</a></li>
<li><a href="#coverage-guided-fuzzing">Coverage-guided fuzzing</a></li>
<li><a href="#blackbox-fuzzing-vs-coverage-guided-fuzzing">Blackbox fuzzing vs Coverage-guided fuzzing</a></li>
<li><a href="#fuzzing-in-a-distributed-systems-world">Fuzzing in a distributed systems world</a></li>
<li><a href="#use-case-etcd-fuzzing">[Use case] etcd fuzzing</a></li>
<li><a href="#use-case-net-fuzzing-design-proposal">[Use case] .NET fuzzing design proposal</a></li>
<li><a href="#final-thoughts-democratize-fuzz-testing">Democratize fuzz testing</a></li>
</ul>
<h2 id="anatomy-of-fuzzing">Anatomy of fuzzing</h2>
<p>Let&apos;s start simple.</p>
<p>The most basic example of a fuzzer might be a function that generates a random value. The generated value is then used as an input for the system under test:</p>
<pre><code class="language-python">while True:
    # Generation of the fuzzing value
    random_value = generate_random()
    # Implementation under test
    under_test_func(random_value)
</code></pre>
<p>The snippet above generates a random value, and <code>under_test_func</code> is executed with it for an indefinite number of iterations. Depending on the resilience of the <code>under_test_func</code> function, a certain input might make the snippet crash.</p>
<p>This is a very basic example of fuzzing. Usually, a fuzzer implementation is more complicated than that.</p>
<p>Before proceeding, let&apos;s clarify some terminology by describing a typical fuzzer architecture:</p>
<p><img src="https://samueleresca.net/content/images/2022/11/fuzz_schema.jpg" alt="Techniques for fuzz testing" loading="lazy"></p>
<ul>
<li>The <em>seed</em> is a hyperparameter of the fuzzing process. It is the initial input defined by the user that is used as a starting point to generate the fuzzing inputs.</li>
<li>The <em>fuzz engine</em> finds interesting inputs to send to a fuzz target. Usually, a fuzzing engine includes a mutator actor.</li>
<li>The <em>mutator</em> generates the <em>corpus</em> by manipulating the <em>seed</em>. There are different techniques used by mutators.</li>
<li>The <em>corpus</em> is a set of inputs that are used by the fuzz target.</li>
<li>The <em>fuzz input</em> is a single input that is passed to the fuzz target.</li>
<li>The <em>fuzz target</em> is usually a function that, given a <em>fuzz input</em>, makes a call to the <em>system under test</em>.</li>
<li>The <em>system under test</em> is the target of our fuzzing process.</li>
</ul>
<p>In case the <em>system under test</em> crashes, a new crash report file is generated.</p>
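<p>The components above can be sketched in a few lines of Go. The mutator here is deliberately naive and deterministic, and the crashing target is made up, both purely for illustration:</p>

```go
package main

import "fmt"

// Crash is a crash report: the input that made the target panic.
type Crash struct{ Input []byte }

// fuzz sketches the loop from the schema above: a mutator derives new
// inputs from the seed, each input is fed to the fuzz target, and panics
// are caught and turned into crash reports.
func fuzz(seed []byte, target func([]byte), rounds int) []Crash {
	var crashes []Crash
	input := append([]byte(nil), seed...)
	for i := 0; i < rounds; i++ {
		// A deliberately naive mutator: bump one byte per round.
		input[i%len(input)]++
		run := append([]byte(nil), input...) // snapshot for the report
		func() {
			defer func() {
				if r := recover(); r != nil {
					crashes = append(crashes, Crash{Input: run})
				}
			}()
			target(run)
		}()
	}
	return crashes
}

func main() {
	// Hypothetical system under test that crashes on a specific byte.
	target := func(in []byte) {
		if in[0] == 'c' {
			panic("boom")
		}
	}
	crashes := fuzz([]byte("aa"), target, 16)
	// The mutator eventually hits the crashing input.
	fmt.Println(len(crashes) > 0)
}
```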
<h3 id="fuzz-target-example">Fuzz target example</h3>
<p>The schema above defines a <em>fuzz target</em>. Concretely, fuzz targets usually have an implementation like the one below:</p>
<script src="https://gist.github.com/samueleresca/aa9b6cd9b7e68f09edf6a9ce2d4454c7.js"></script>
<p>The above code is the target function coming from the LLVM toolchain. The <code>LLVMFuzzerTestOneInput</code> defines the fuzz target function. The fuzz target receives the <em>corpus</em> generated by the <em>mutator</em>.<br>
Note that every fuzz engine defines its own syntax; in general, the function accepts the <em>corpus</em> as a parameter.</p>
<h2 id="black-box-fuzzing">Black-box fuzzing</h2>
<p>In Black-box fuzzing, the <em>fuzz engine</em> does not know anything about the system under test implementation. An example of black-box fuzzing might be a <em>fuzz target</em> that calls an application running on a container exposing an HTTP API.</p>
<p>Below is the schema that describes the scenario:</p>
<p><img src="https://samueleresca.net/content/images/2022/11/blackbox_fuzzing_schema.jpg" alt="Techniques for fuzz testing" loading="lazy"></p>
<p>The fuzz target calls the system under test through a client:</p>
<script src="https://gist.github.com/samueleresca/dbb5828aa38dd9d2de6308822c3194b7.js"></script>
<p>The fuzz target and the system under test are running in total isolation. The only information that they share is the network and the input that is sent from one of the <em>fuzz target</em> to the actual <em>system under test</em>.</p>
<p>Keep in mind that fuzzing across the network will slow things down. Furthermore, it is hard to detect crashes in the <em>system under test</em> because it runs as a separate instance, outside the fuzzing process. See the <a href="#fuzzing-in-a-distributed-systems-world">Fuzzing in a distributed systems world</a> section for more details and solutions.</p>
<h2 id="coverage-guided-fuzzing">Coverage-guided fuzzing</h2>
<p>Black-box fuzzing approaches the system under test blindly: it does not know anything about the internals of the system under test. Thus, the fuzz input is not tuned to hit every code path. Coverage-guided fuzzing instruments the system under test to gain awareness of the code paths that the fuzz input is covering. More in general, the coverage-guided fuzzing process has two key steps<a href="#references">[1]</a>:</p>
<ul>
<li>Instrument the system under test to inject tracking instructions at compile time. The purpose is to collect metrics on the code paths covered.</li>
<li>Keep or discard fuzz inputs depending on the coverage metrics collected.</li>
</ul>
<h3 id="coverage-guided-fuzzing-in-depth">Coverage-guided fuzzing in depth</h3>
<p>The following section describes how AFL++, a widespread fuzzing framework, implements coverage-guided fuzzing<a href="#references">[2]</a>.</p>
<p>Given the following snippet of code:</p>
<pre><code class="language-python">x = read_num()
y = read_num()

if x &gt; y:
    print x + &quot;is greater than&quot; + y
    return
if x == y:
    print x + &quot;is equal to&quot; + y
    return
print x + &quot;is less than&quot; + y
</code></pre>
<p>We can assign an edge for each branching in the code:</p>
<pre><code class="language-python">0: (A) x = read_num()
1: (A) y = read_num()
2: (A) if x &gt; y:
3: (B)   print x + &quot;is greater than&quot; + y
4: (B)   return
5: (A) if x == y:
6: (C)    print x + &quot;is equal to&quot; + y
7: (C)    return
8: (D) print x + &quot;is less than&quot; + y
9: (E) end program
</code></pre>
<p>The flow of the code above can be summarized with the following control flow graph, where each branch is an edge of the graph:</p>
<pre><code>
        &#x250C;&#x2500;&#x2500;&#x2500;&#x2510;
   &#x250C;&#x2500;&#x2500;&#x2500;&#x2500;&#x2524; A &#x251C;&#x2500;&#x2500;&#x2500;&#x2500;&#x2510;
   &#x2502;    &#x2514;&#x2500;&#x252C;&#x2500;&#x2518;    &#x2502;
   &#x2502;      &#x2502;      &#x2502;
 &#x250C;&#x2500;V&#x2500;&#x2510;  &#x250C;&#x2500;V&#x2500;&#x2510;  &#x250C;&#x2500;V&#x2500;&#x2510;
 &#x2502; B &#x2502;  &#x2502; C &#x2502;  &#x2502; D &#x2502;
 &#x2514;&#x2500;&#x252C;&#x2500;&#x2518;  &#x2514;&#x2500;&#x252C;&#x2500;&#x2518;  &#x2514;&#x2500;&#x252C;&#x2500;&#x2518;
   &#x2502;      &#x2502;      &#x2502;
   &#x2502;    &#x250C;&#x2500;V&#x2500;&#x2510;    &#x2502;
   &#x2514;&#x2500;&#x2500;&#x2500;&#x2500;&gt; E &lt;&#x2500;&#x2500;&#x2500;&#x2500;&#x2518;
        &#x2514;&#x2500;&#x2500;&#x2500;&#x2518;
</code></pre>
<p>AFL++ uses the control-flow graphs in the instrumentation phase by injecting the following pseudo-code in each edge:</p>
<pre><code class="language-pseudo">cur_location = &lt;COMPILE_TIME_RANDOM&gt;;
shared_mem[cur_location ^ prev_location]++;
prev_location = cur_location &gt;&gt; 1;
</code></pre>
<p>The <code>cur_location</code> property identifies the current path within the code using a randomly generated value. The <code>shared_mem</code> map counts the number of times a certain path <code>(source, dest)</code> (given by the XOR <code>cur_location ^ prev_location</code>) has been hit. On top of that, the <code>prev_location</code> value is overridden in each iteration with the right bitshift of the <code>cur_location</code> variable. In this way, the injected code keeps track of the ordering: <code>A -&gt; B</code> is different from <code>B -&gt; A</code>.</p>
<p>In this way, AFL++ knows which branch has been hit by a specific input and whether it is worth continuing down that path.</p>
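<p>The pseudo-code above translates almost directly to Go. The block ids and the <code>Coverage</code> type below are illustrative; real AFL++ injects this logic at compile time and uses a fixed-size shared-memory array rather than a map:</p>

```go
package main

import "fmt"

// Coverage sketches AFL++'s edge-counting scheme: every basic block gets
// a compile-time random id, and each transition (prev -> cur) bumps a
// counter at index cur ^ (prev >> 1), so A->B and B->A land in
// different slots.
type Coverage struct {
	sharedMem map[uint32]int
	prev      uint32
}

// Visit records entering the basic block identified by cur.
func (c *Coverage) Visit(cur uint32) {
	c.sharedMem[cur^c.prev]++
	c.prev = cur >> 1
}

func main() {
	// Hypothetical block ids (in AFL++ they are compile-time randoms).
	const A, B = 0x10, 0x20
	cov := Coverage{sharedMem: map[uint32]int{}}
	cov.Visit(A)
	cov.Visit(B) // edge A->B
	cov.Visit(A) // edge B->A
	// A->B and B->A occupy distinct slots thanks to the shifted prev.
	fmt.Println(len(cov.sharedMem))
}
```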
<h3 id="new-behaviours-detection">New behaviours detection</h3>
<p>As mentioned above, the coverage algorithm keeps track of two parameters:</p>
<ul>
<li>The tuples representing paths in the code</li>
<li>The number of times that path has been hit (<code>hit_count</code>)</li>
</ul>
<p>This information is used to trigger extra processing. When an execution contains a new tuple, the input of that execution is marked as interesting. The hit counts for the execution path are bucketed into the following ranges:</p>
<pre><code>1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+ // Ranges of hit count.
</code></pre>
<p>If the algorithm detects a variation between hit count buckets then the input is marked as interesting. This helps to detect the variations in the execution of the instrumented code.</p>
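<p>The bucketing logic can be sketched as follows (the <code>bucket</code> helper illustrates the idea, it is not AFL&apos;s actual code):</p>

```go
package main

import "fmt"

// bucket maps a hit count to one of the coarse buckets
// (1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+); an input is only marked
// interesting when a path's count crosses a bucket boundary.
func bucket(hits int) int {
	switch {
	case hits <= 3:
		return hits
	case hits <= 7:
		return 4
	case hits <= 15:
		return 8
	case hits <= 31:
		return 16
	case hits <= 127:
		return 32
	default:
		return 128
	}
}

func main() {
	fmt.Println(bucket(5) == bucket(6)) // same bucket: not interesting
	fmt.Println(bucket(3) == bucket(4)) // boundary crossed: interesting
}
```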
<p>Even so, it is important to mention that coverage-guided fuzzing does not come for free. Instrumenting the system under test means adding an overhead at runtime. On some occasions, this overhead is not sustainable.</p>
<p>There has been some effort to optimize coverage-guided fuzzing recently. Most of the research focuses on better ways of instrumenting the code<a href="#references">[8]</a> and on more meaningful coverage metrics<a href="#references">[9]</a>.</p>
<h2 id="blackbox-fuzzing-vs-coverage-guided-fuzzing">Blackbox fuzzing vs Coverage-guided fuzzing</h2>
<p>We described two fuzzing techniques: black-box and coverage-guided fuzzing. This section gives some guidelines on which type of fuzzing we should use and when.</p>
<p>In general, coverage-guided fuzzing provides better coverage and is more effective. The downside is you need to instrument your target first. This might work well in self-contained libraries, but it might be hard to set up in more complex codebases.</p>
<p>The advice is to examine the system under test and understand the complexity of the codebase:</p>
<ul>
<li>For simple and independent codebases, such as libraries, I would focus on coverage-guided fuzzing.</li>
<li>For complex projects that are hard to decompose, black-box fuzzing is the quicker approach, and it can still spot weaknesses in the system under test. If you are spending a lot of time figuring out how to instrument your code to run fuzzing, you should probably start with a black-box approach.</li>
</ul>
<h2 id="fuzzing-in-a-distributed-systems-world">Fuzzing in a distributed systems world</h2>
<p>In the distributed systems era, most services talk over the network. Fuzzing a network service is not easy for three main reasons:</p>
<ul>
<li>In a stateful world, the server can be in various states. On top of that, the exchange usually needs to fit a certain transmission protocol.</li>
<li>The network slows down the fuzzing.</li>
<li>It is hard to detect a crash on a system running on a different process.</li>
</ul>
<p>The general approach for fuzzing over the network is to change the source code to read from <code>stdin</code> instead of relying on the network. <a href="#references">[4]</a></p>
<p>For example, <a href="https://github.com/AFLplusplus/AFLplusplus/tree/stable/utils/socket_fuzzing">AFLplusplus/utils/socket_fuzzing</a> is a utility embedded in AFL++ that provides a way to emulate a network socket by sending input through stdin.</p>
<p>If changing the source code is not viable, keep in mind that fuzzing over the network is still an option.</p>
<p><a href="https://github.com/aflnet/aflnet">aflnet/aflnet</a>: AFLNet provides a suite for testing network protocols. It uses a <code>pcap</code> dump of a real client-server exchange to generate valid sequences of requests while it keeps monitoring the state of the server. The input sequence is then mutated in a way that does not interfere with the rules of the protocol.</p>
<p><a href="https://github.com/microsoft/restler-fuzzer">microsoft/RESTler</a> provides a way to test a remote, distributed black box <a href="#references">[6]</a>. Even if it is not possible to instrument the code of a distributed black box, RESTler takes a hybrid approach between black-box and coverage-guided fuzzing by measuring the fuzzing effectiveness through a response error type metric. Every time the distributed black box returns an HTTP error type (HTTP status codes from <code>4xx</code> to <code>5xx</code>), it is considered a crash, and the coverage metrics are triggered.</p>
<p>RESTler is HTTP REST-based. What can we do when the system under test uses a protocol other than HTTP? Structure-aware fuzzing comes in handy when you want to compose a message following some rules.</p>
<h3 id="structure-aware-fuzzing">Structure-aware fuzzing</h3>
<p>Structure-aware fuzzing <a href="#references">[7]</a> is a technique that stands on top of a normal fuzzing process. It can be combined with black-box or coverage-guided fuzzing. Structure-aware fuzzing mutates the corpus following a defined structure.</p>
<p>One of the most common tools that implement structure-aware fuzzing is <a href="https://github.com/google/libprotobuf-mutator">google/libprotobuf-mutator</a>.</p>
<p>This library runs on top of other fuzz engines and mutates the inputs following a protobuf definition. For example, the following <code>proto</code> file defines the structure of a message:</p>
<script src="https://gist.github.com/samueleresca/92698cd5f4efc9252c8102a479c33b46.js"></script>
<p>The <code>Msg</code> above can be reused as input for a <em>fuzz target</em> as shown below:</p>
<script src="https://gist.github.com/samueleresca/75db3e31839eef9913543691c85f77c9.js"></script>
<p>This approach allows defining fuzz inputs based on certain rules, excluding all the cases where the corpus provided by the fuzz engine does not comply with a particular syntax or protocol.</p>
<p>The next section focuses on a concrete example of fuzzing on the <a href="https://github.com/etcd-io/etcd">etcd-io/etcd</a> OSS project.</p>
<h2 id="use-case-etcd-fuzzing">[Use case] etcd fuzzing</h2>
<p>I recently contributed to etcd fuzzing to solve the following issue: <a href="https://github.com/etcd-io/etcd/issues/13617">etcd/issues/13617</a>.<br>
The etcd process for receiving requests has two main steps:</p>
<ol>
<li>The incoming request goes through a validation step (<code>validator</code> step)</li>
<li>The incoming validated request is applied to the server (<code>apply</code> step)</li>
</ol>
<p><em>The goal was to ensure that API validators reject all requests that would cause panic.</em></p>
<p>In practice, below is the approach that has been taken:</p>
<ol>
<li>Generate an input request with fuzzing</li>
<li>Send the request to the validator<br>
2.1 If the validator rejects the request, the test passes &#x2705;<br>
2.2 If the validator does not reject the request, proceed with step 3 &#x2B07;&#xFE0F;</li>
<li>Send the request to the apply function<br>
3.1 If the apply function executes without panicking, the test passes &#x2705;<br>
3.2 If the apply function panics, the test fails &#x1F480;</li>
</ol>
<p>The following PR implements the approach described above.</p>
<p><a href="https://github.com/etcd-io/etcd/pull/14561">Ensure that input validation between API and Apply is consistent #14561</a>.</p>
<p>The implementation uses the new fuzzing toolchain released with Go 1.18; see <a href="https://go.dev/security/fuzz/">go.dev/security/fuzz</a>. Let&apos;s start by taking a look at one of the fuzz tests in the PR:</p>
<script src="https://gist.github.com/samueleresca/7984e9b877fc249bc2d2ceca1803dae8.js"></script>
<p>The above implementation shows a fuzz test case for an etcd <code>RangeRequest</code>. The test uses the fuzz capabilities provided by Golang.</p>
<ul>
<li><code>FuzzTxnRangeRequest</code> defines the test case.</li>
<li><code>f.Add</code> operation adds the seed corpus to the fuzz target.</li>
<li><code>f.Fuzz</code> defines the fuzz target. It takes as arguments the fuzzed parameters, mutated based on the seed corpus.</li>
</ul>
<p>The fuzz test proceeds by calling the <code>checkRangeRequest</code> function, which implements the validation for the request. In case the validation returns an error, the rest of the test is skipped using <code>t.Skip</code>. In case the request is valid, the test proceeds by calling the apply method (the <code>txn.Txn</code> function). The fuzz test fails if the <code>txn.Txn</code> function panics; otherwise, the test is successful.</p>
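<p>The validate-then-apply invariant can be sketched with stand-in functions. The <code>RangeRequest</code>, <code>checkRequest</code>, and <code>apply</code> below are hypothetical simplifications, not etcd&apos;s real implementations:</p>

```go
package main

import "fmt"

// Hypothetical request and validator/apply pair, standing in for etcd's
// checkRangeRequest and Txn. The invariant under test: any request the
// validator accepts must not make apply panic.
type RangeRequest struct{ Key, RangeEnd []byte }

func checkRequest(r RangeRequest) error {
	if len(r.Key) == 0 {
		return fmt.Errorf("key is not provided")
	}
	return nil
}

func apply(r RangeRequest) {
	_ = r.Key[0] // would panic on an empty key that slipped past validation
}

// invariantHolds mirrors the fuzz target's body: reject -> pass,
// accept and no panic -> pass, accept and panic -> fail.
func invariantHolds(r RangeRequest) (ok bool) {
	if err := checkRequest(r); err != nil {
		return true // validator rejected the request: nothing to apply
	}
	defer func() {
		if recover() != nil {
			ok = false // a validated request panicked in apply: a bug
		}
	}()
	apply(r)
	return true
}

func main() {
	fmt.Println(invariantHolds(RangeRequest{}))                 // rejected: passes
	fmt.Println(invariantHolds(RangeRequest{Key: []byte("k")})) // applied without panic: passes
}
```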
<p>The PR also adds a <code>fuzzing.sh</code> script that is used to execute the fuzz tests. The script is then embedded in CI using a GitHub Actions definition. Below is the core command defined in the <code>fuzzing.sh</code> script:</p>
<pre><code class="language-bash">go test -fuzz &quot;${target}&quot; -fuzztime &quot;${fuzz_time}&quot;
</code></pre>
<p>The command executes a specific <code>target</code> (e.g.: <code>FuzzTxnRangeRequest</code>) with a specific <code>fuzztime</code>.</p>
<h2 id="use-case-net-fuzzing-design-proposal">[Use case] .NET fuzzing design proposal</h2>
<p>Another open-source contribution I made recently around fuzzing is the following design proposal for the .NET toolchain <a href="https://github.com/dotnet/designs/pull/273">#273 - Add proposal for out-of-the-box fuzzing</a>. The proposal was highly inspired by the introduction of fuzzing within the Go toolchain. The main reasons that triggered this proposal are:</p>
<ul>
<li>The lack of a uniform and easy way to perform fuzz testing in the .NET ecosystem</li>
<li>Increasing awareness of the fuzzing practice among .NET consumers, and therefore increasing the overall security of .NET applications and libraries</li>
</ul>
<p>The proposal got rejected, but with some good news: the .NET team is already exploring some possible paths forward, and some internal prototyping effort has already been made.</p>
<h2 id="final-thoughts-democratize-fuzz-testing">Final thoughts: Democratize fuzz testing</h2>
<p>A few months back, the Cloud Native Computing Foundation (CNCF) went through some effort in the fuzzing space:</p>
<p><a href="https://www.cncf.io/blog/2022/06/28/improving-security-by-fuzzing-the-cncf-landscape/">Improving Security by Fuzzing the CNCF landscape</a></p>
<p>The CNCF post goes through a quick introduction to fuzzing and the purposes of this technique. Furthermore, the post shows a list of fuzz tests implemented within CNCF projects, such as Kubernetes, etcd, and Envoy.</p>
<p>The post&apos;s security reports highlight the various crashes that have been found in the OSS projects of the CNCF foundation, surfacing the need for these security practices in OSS software.</p>
<p>The .NET fuzzing design proposal shown before is an attempt to democratize fuzz testing in the .NET ecosystem. The easier it is to access this security practice, the higher the chance that fuzzing is adopted in the development process, and thus that the shipped software meets higher security and resilience standards.</p>
<p>Go 1.18 took a similar approach by including the fuzzing capabilities <a href="https://go.dev/security/fuzz/">within the language toolchain</a>. The Go native fuzzing proposal has brought up some interesting discussions available at <a href="https://github.com/golang/go/issues/44551">golang/go/issues/44551</a>.</p>
<p>LLVM took a similar approach a long time ago by including libFuzzer in the toolchain.</p>
<p>Democratizing security practices such as fuzzing within the OSS ecosystem is becoming crucial. Especially with the increasing involvement of OSS projects as part of enterprise solutions.</p>
<h2 id="references">References</h2>
<p>[1]<a href="https://arxiv.org/abs/2203.06910">Qian, R., Zhang, Q., Fang, C., &amp; Guo, L. (2022). Investigating Coverage Guided Fuzzing with Mutation Testing. arXiv preprint arXiv:2203.06910.</a><br>
[2]<a href="https://lcamtuf.coredump.cx/afl/technical_details.txt">Technical &quot;whitepaper&quot; for afl-fuzz</a><br>
[3]<a href="https://github.com/google/fuzzing">google/fuzzing</a><br>
[5]<a href="https://github.com/AFLplusplus/AFLplusplus/blob/stable/docs/best_practices.md">AFLplusplus best_practices</a><br>
[6]<a href="https://patricegodefroid.github.io/public_psfiles/fse2020.pdf">Patrice Godefroid, Bo-Yuan Huang, and Marina Polishchuk. 2020. Intelligent REST API data fuzzing.</a><br>
[7]<a href="https://www.youtube.com/watch?v=S8JvzWDnjc0">Going Beyond Coverage-Guided Fuzzing with Structured Fuzzing - Black Hat</a><br>
[8]<a href="https://ieeexplore.ieee.org/document/8835316">S. Nagy and M. Hicks, &quot;Full-Speed Fuzzing: Reducing Fuzzing Overhead through Coverage-Guided Tracing,&quot; 2019 IEEE Symposium on Security and Privacy (SP), 2019, pp. 787-802, doi: 10.1109/SP.2019.00069.</a><br>
[9]<a href="https://ieeexplore.ieee.org/document/9230360">L. Simon and A. Verma, &quot;Improving Fuzzing through Controlled Compilation,&quot; 2020 IEEE European Symposium on Security and Privacy (EuroS&amp;P), 2020, pp. 34-52, doi: 10.1109/EuroSP48549.2020.00011.</a></p>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[A practical approach to read write quorum systems [Part 2]]]></title><description><![CDATA[I published the post "A practical approach to read-write quorum systems" a few months ago. The post refers to the paper: Read-Write Quorum Systems Made Practical, and it goes through a concrete implementation of the tool called "Quoracle". I have decided to rewrite the tool in Golang to explore ]]></description><link>https://samueleresca.net/a-practical-approach-to-read-write-quorum-systems-part-2/</link><guid isPermaLink="false">63d9a4dc2bc5da0fb16ac191</guid><category><![CDATA[Distributed Systems]]></category><dc:creator><![CDATA[Samuele Resca]]></dc:creator><pubDate>Tue, 28 Dec 2021 15:42:21 GMT</pubDate><media:content url="https://samueleresca.net/content/images/2021/12/dublin-background-1.webp" medium="image"/><content:encoded><![CDATA[<div class="kg-card kg-callout-card kg-callout-card-blue"><div class="kg-callout-emoji">&#x2139;&#xFE0F;</div><div class="kg-callout-text">The post is a continuation of <a href="https://samueleresca.net/practical-approach-to-read-write-quorum-systems/">A practical approach to read-write quorum systems</a>.</div></div><img src="https://samueleresca.net/content/images/2021/12/dublin-background-1.webp" alt="A practical approach to read write quorum systems [Part 2]"><p></p><div class="kg-card kg-callout-card kg-callout-card-grey"><div class="kg-callout-emoji">&#x1F4A1;</div><div class="kg-callout-text">The code is available at <a href="https://github.com/samueleresca/quoracle-go">https://github.com/samueleresca/quoracle-go</a>.</div></div><p></p><!--kg-card-begin: markdown--><p>I published the post <a href="https://samueleresca.net/practical-approach-to-read-write-quorum-systems/">&quot;A practical approach to read-write quorum systems&quot;</a> a few months ago. 
The post refers to the paper <a href="https://mwhittaker.github.io/publications/quoracle.pdf">Read-Write Quorum Systems Made Practical - Michael Whittaker, Aleksey Charapko, Joseph M. Hellerstein, Heidi Howard, Ion Stoica</a>. It illustrates the implementation of &quot;Quoracle&quot;. Quoracle provides the optimal quorums with respect to either the <code>load</code>, <code>network</code> or <code>latency</code>.</p>
<p>I have decided to rewrite the tool in Golang to explore the ecosystem and the tooling of the language. This article goes through the Golang implementation of the <a href="https://github.com/mwhittaker/quoracle">original Python library</a>.</p>
<h2 id="quorum-expressions-definition">Quorum expressions definition</h2>
<p>First of all, let&apos;s discuss the implementation of the expressions. The original paper uses expressions and nodes to describe quorums:</p>
<pre><code>a, b, c = Node(&quot;a&quot;), Node(&quot;b&quot;), Node(&quot;c&quot;)
majority = QuorumSystem(reads=a*b + b*c + a*c)
</code></pre>
<p>The example above builds the majority quorums using the following pairs: <code>[a,b], [b,c], [a,c]</code>. Python can represent the expression above by overloading the <code>*</code> and the <code>+</code> operations. The <a href="https://github.com/mwhittaker/quoracle/blob/ee7400a8a992b3f9f8c17822b5bab15d537e7b46/quoracle/expr.py#L31">original quoracle library</a> uses the operations overload approach.</p>
<p>Go advocates simplicity, and <a href="https://golang.org/doc/faq#overloading">it does not embrace operator overloading</a>. So, it is necessary to proceed with a different approach.</p>
<p>In Golang, these methods describe the operations:</p>
<script src="https://gist.github.com/samueleresca/01c87c37464546766c718e786b322e93.js"></script>
<p>The <code>ExprOperator</code> interface defines the operations between two logical expressions. A logical expression between nodes represents many quorums.</p>
<p>Thus, it is possible to describe quorum as follows:</p>
<pre><code>a, b, c :=
NewNode(&quot;a&quot;), NewNode(&quot;b&quot;), NewNode(&quot;c&quot;)

// (a * b) + (b * c) + (a * c)
majority := NewQuorumSystemWithReads(a.Multiply(b).Add(b.Multiply(c)).Add(a.Multiply(c)))
</code></pre>
<p>The <code>ExprOperator</code> interface provides the same functionalities as the original library. In the example above, the pairs: <code>[a,b], [b,c], [a,c]</code> are majority quorums. The next section goes through the Golang definition of a <code>QuorumSystem</code> and how to use it.</p>
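<p>To make the method-chaining approach concrete, below is a minimal, self-contained sketch of an expression tree built through <code>Multiply</code> and <code>Add</code>. The types and the <code>String</code> rendering are hypothetical, not the library&apos;s actual implementation:</p>

```go
package main

import "fmt"

// expr is a tiny expression tree: a leaf node, or an AND ("*") / OR ("+")
// composition of two sub-expressions. Hypothetical types, not the library's API.
type expr struct {
	op    string // "node", "and", or "or"
	name  string
	parts []expr
}

func node(name string) expr { return expr{op: "node", name: name} }

// Multiply composes two expressions with logical AND (the paper's '*').
func (e expr) Multiply(o expr) expr { return expr{op: "and", parts: []expr{e, o}} }

// Add composes two expressions with logical OR (the paper's '+').
func (e expr) Add(o expr) expr { return expr{op: "or", parts: []expr{e, o}} }

func (e expr) String() string {
	switch e.op {
	case "node":
		return e.name
	case "and":
		return "(" + e.parts[0].String() + " * " + e.parts[1].String() + ")"
	default:
		return "(" + e.parts[0].String() + " + " + e.parts[1].String() + ")"
	}
}

func main() {
	a, b, c := node("a"), node("b"), node("c")
	majority := a.Multiply(b).Add(b.Multiply(c)).Add(a.Multiply(c))
	fmt.Println(majority) // (((a * b) + (b * c)) + (a * c))
}
```

<p>Printing the expression shows the same structure that operator overloading produces in Python, with the nesting made explicit by the chained calls.</p>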
<h2 id="quorum-system-overview-definition">Quorum system overview definition</h2>
<p>Now that we know how to define an <code>Expr</code> of nodes, we can declare a read-write quorum system. The below implementation describes the <code>QuorumSystem</code> struct used in the library.</p>
<script src="https://gist.github.com/samueleresca/621ac3cfc41482447edb8d004ac71dc1.js"></script>
<p>The <code>reads</code> and <code>writes</code> fields represent the quorums. The <code>nameToNode</code> field keeps track of the different nodes in the quorum system. The <code>QuorumSystem</code> struct has the methods for calculating the quorum system&apos;s capacity, latency, load, and network load.</p>
<p>The <code>StrategyOptions</code> parameter struct represents the configurations for the strategy optimisation:</p>
<script src="https://gist.github.com/samueleresca/840a180bc9ab4735dfb20fb274654b10.js"></script>
<p>The <code>Optimize</code> property points to the optimisation target. The <code>LoadLimit</code>, <code>NetworkLimit</code>, and <code>LatencyLimit</code> fields define optional limits on the load, the network, and the latency. The <code>ReadFraction</code> and the <code>WriteFraction</code> determine the workload distribution of the read and write operations.<br>
The <code>F</code> field represents the resilience of the quorum: a quorum <em>r</em> is <em>f</em>-resilient for some integer <em>f</em> if, despite removing any <em>f</em> nodes from <em>r</em>, <em>r</em> is still a read/write quorum<a href="#references">[1]</a>.</p>
<p>The library defines the initialization functions for a new <code>QuorumSystem</code>:</p>
<script src="https://gist.github.com/samueleresca/d8d55cb34f10c8c143c0c9fccb1e0c54.js"></script>
<p>The above code omits some method implementations for brevity. If the caller provides only the read or only the write quorums, the constructor computes the logical dual of the given quorums to derive the missing ones and initialises a new <code>QuorumSystem</code> struct. If the caller provides both read and write quorums, the constructor checks the validity of the quorums and returns a new <code>QuorumSystem</code> struct with the corresponding quorums.</p>
<p>The following section shows how to translate the optimal strategy problem into a linear programming problem.</p>
<h2 id="optimal-strategy-problem-definition">Optimal strategy problem definition</h2>
<p>The original Python implementation of quoracle uses the PuLP library and <a href="https://github.com/coin-or">coin-or</a>. The <a href="https://samueleresca.net/practical-approach-to-read-write-quorum-systems/">previous blog post</a> looked at how to use PuLP for optimisation problems in a Python runtime.</p>
<p>The Golang implementation uses a library called <a href="https://github.com/lanl/clp">lanl/clp</a>, which also relies on <a href="https://github.com/coin-or">coin-or</a> to solve linear programming optimisation problems.</p>
<p>The codebase defines a helper struct to build a linear programming problem:</p>
<script src="https://gist.github.com/samueleresca/68189f57c43c32e0a74c089c3d737bd2.js"></script>
<p>A <code>lpDefinition</code> struct contains the variables needed to describe a linear programming problem.<br>
Let&apos;s suppose that we have three six-faced dice, and no two dice are allowed to show the same value. The goal is to make the difference between the 1st and 2nd-largest dice smaller than the difference between the 2nd and the 3rd. The following <code>lpDefinition</code> represents the problem:</p>
<script src="https://gist.github.com/samueleresca/01ee4261c772f1330bc27a469c62b1f6.js"></script>
<p>The <code>lpDefinition</code> above stores a value in the <code>Vars</code> array for each die. The <code>Constraints</code> matrix represents the die range constraint, from 1 to 6. The <code>Objective</code> matrix contains the two goals of the problem:</p>
<ol>
<li>Each die must be different from the others (lines 10 and 11);</li>
<li>The difference between the 1st and 2nd-largest dice must be smaller than the one between the 2nd and the 3rd.</li>
</ol>
<p>The example above shows how to use the helper struct in the codebase to build a minimisation problem. Next, we will take a detailed look at the implementation for optimising the metrics. Depending on the optimisation target, the problem definition builds a different <code>lpDefinition</code> struct.</p>
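<p>For illustration, the dice problem could be assembled into an <code>lpDefinition</code>-like container as follows. The field shapes and coefficient rows below are assumptions made for this sketch, not the exact layout used by the codebase:</p>

```go
package main

import "fmt"

// lpDefinition sketches a container for an LP problem: one entry in Vars per
// variable, a [lower, upper] bound per variable, and rows of coefficients.
// Field shapes are assumptions, loosely following the post's description.
type lpDefinition struct {
	Vars        []float64
	Constraints [][2]float64
	Objectives  [][]float64
}

func newDiceProblem() lpDefinition {
	var def lpDefinition
	for i := 0; i < 3; i++ {
		def.Vars = append(def.Vars, 1.0)                            // one variable per die
		def.Constraints = append(def.Constraints, [2]float64{1, 6}) // each die in 1..6
	}
	// d1 - d2 >= 1 and d2 - d3 >= 1 keep the dice distinct (and ordered);
	// minimising (d1 - d2) - (d2 - d3) = d1 - 2*d2 + d3 drives the first gap
	// below the second one.
	def.Objectives = append(def.Objectives, []float64{1, -1, 0})
	def.Objectives = append(def.Objectives, []float64{0, 1, -1})
	def.Objectives = append(def.Objectives, []float64{1, -2, 1})
	return def
}

func main() {
	def := newDiceProblem()
	fmt.Println(len(def.Vars), len(def.Constraints), len(def.Objectives)) // 3 3 3
}
```
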
<h3 id="load-optimization-definition">Load optimization definition</h3>
<p>Let&apos;s start by describing how the load optimization problem is implemented. To recap, the formula for the load defined in the paper<a href="#references">[1]</a> is:</p>
<p>$$ \frac{f_r}{cap_R(x)} \sum_{{r \in R | x \in r }} p_r +  \frac{1 - f_r}{cap_W(x)}  \sum_{{w \in W | x \in w }}  p_w \leq L_{f_r} $$</p>
<p>The following snippet of code implements the above formula, and it builds the LP problem:</p>
<script src="https://gist.github.com/samueleresca/7f1ee46ca07f34d196e38cff614a8735.js"></script>
<p>The snippet omits some code for brevity. The <code>buildLoadDef</code> local function encapsulates the logic for building the load optimisation problem.<br>
The function initialises the <code>Vars</code> and the <code>Constraints</code> from the <code>lpVariable</code>s. Next, it adds the <code>l</code> variable and its constraint, representing the load \(L_{f_r}\) for the specific read fraction. It then builds the load formula for every <code>lpVariable</code> in the problem. For each <code>Node</code> in the quorum system, it applies the following expression in the case of a read quorum:</p>
<pre><code>tmp[v.Index] += fr * v.Value / float64(*qs.GetNodeByName(n.Name).ReadCapacity)
</code></pre>
<p>otherwise, in the case of a write quorum, it proceeds by using:</p>
<pre><code>tmp[v.Index] += (1 - fr) * v.Value / float64(*qs.GetNodeByName(n.Name).WriteCapacity)
</code></pre>
<p>The <code>ReadCapacity</code> and the <code>WriteCapacity</code> are configurable for each <code>Node</code>.<br>
The code needs to maintain the same order in the <code>Vars</code> and the <code>Objectives</code> arrays. Thus, each <code>lpVariable</code> uses an <code>Index</code> to refer to the exact position of each element in the arrays.</p>
<h3 id="network-load-optimization-definition">Network load optimization definition</h3>
<p>This section describes the implementation of the network load. Let&apos;s start by refreshing the formula<a href="#references">[1]</a>:</p>
<p>$$ f_r ( \sum_{r \in R} p_r  \cdot |r|) + (1 - f_r) ( \sum_{w \in W} p_w  \cdot |w|) $$</p>
<p>\(|r|\) and \(|w|\) are the sizes of the read and write quorum sets. The library builds the network load minimization problem as follows:</p>
<script src="https://gist.github.com/samueleresca/5f404cd9d90d449a6f80ecaafa42fb35.js"></script>
<p>The above code initialises a new <code>Vars</code> and the <code>Constraints</code> fields for each quorum. Then, it applies the network load formula by multiplying the length of the quorum with <code>fr</code>. Also, the implementation adds a row in the <code>Objectives</code> matrix in case we specify a network limit.</p>
<h3 id="latency-optimization-definition">Latency optimization definition</h3>
<p>The last optimization target is the latency. The formula described in the paper defines the latency as:</p>
<p>$$ f_r ( \sum_{r \in R} p_r \cdot latency(r)) + (1 - f_r) ( \sum_{w \in W} p_w \cdot latency(w)) $$</p>
<p>Below is the optimization definition of the latency:</p>
<script src="https://gist.github.com/samueleresca/3a6a9f0a077ff699bf7afd996db28cd0.js"></script>
<p>The implementation creates a new <code>lpDefinition</code>, populating the <code>Vars</code> and the <code>Constraints</code>. Then, it retrieves the latency for each of the <code>readQuorumVars</code> and <code>writeQuorumVars</code>. The latency of a quorum is the shortest time required to form a quorum after contacting the nodes in that quorum. The code calculates <code>l</code> using the <code>readQuorumLatency</code> and the <code>writeQuorumLatency</code> methods.</p>
<p>Next, it continues by applying the formula of the latency for the read quorums:</p>
<pre><code>obj[v.Index] = fr * v.Value * float64(l)
</code></pre>
<p>the code takes the same approach for the write quorums using the opposite workload:</p>
<pre><code>obj[v.Index] = (1 - fr) * v.Value * float64(l)
</code></pre>
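<p>Note that both the network-load and the latency formulas are expectations of a per-quorum weight under the strategy. A small sketch (hypothetical types and made-up numbers) that evaluates both from the same generic helper:</p>

```go
package main

import "fmt"

// quorum pairs a per-quorum weight source with the probability σ assigns to it.
type quorum struct {
	size    int     // |q|, used by the network-load formula
	latency float64 // latency(q) in seconds, used by the latency formula
	p       float64 // probability the strategy assigns to q
}

// expectation computes fr * Σ p_r·weight(r) + (1-fr) * Σ p_w·weight(w),
// the common shape of the network-load and latency definitions.
func expectation(fr float64, reads, writes []quorum, weight func(quorum) float64) float64 {
	sum := 0.0
	for _, r := range reads {
		sum += fr * r.p * weight(r)
	}
	for _, w := range writes {
		sum += (1 - fr) * w.p * weight(w)
	}
	return sum
}

func main() {
	reads := []quorum{{size: 2, latency: 1, p: 0.5}, {size: 2, latency: 2, p: 0.5}}
	writes := []quorum{{size: 3, latency: 2, p: 1.0}}
	fr := 0.8
	net := expectation(fr, reads, writes, func(q quorum) float64 { return float64(q.size) })
	lat := expectation(fr, reads, writes, func(q quorum) float64 { return q.latency })
	fmt.Println(net, lat)
}
```
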
<p>The following section describes how to translate the optimisation result into a new <code>Strategy</code>. Also, it shows how to execute the LP optimisation using the definitions seen in this section.</p>
<h2 id="strategy-initialisation">Strategy initialisation</h2>
<p>The previous section described how to build the problem definition. Now we can proceed by executing the optimisation. The snippet of code below describes the optimisation execution and the initialisation of the strategy:</p>
<script src="https://gist.github.com/samueleresca/213f057f4d2588565adeef22a1a7ad12.js"></script>
<p>As a first step, the code initialises a <code>NewSimplex</code>. The simplex algorithm is a popular linear programming algorithm. We want to minimise the objective of our problem, so the code sets the optimisation direction to <code>clp.Minimise</code>.</p>
<p>The code proceeds by creating a new <code>lpDefinition</code> based on the optimisation definitions seen in the previous section. For example, if the optimisation target is <code>Network</code>, the code calls the <code>buildNetworkDef</code> function.</p>
<p>On top of the optimisation target, the code needs to add another objective: the total sum of the read and write probabilities must be 1. The <code>getTotalProbabilityObjectives</code> method takes care of that: it returns a new objective array constraining the read and write probabilities to sum to <code>1</code>.</p>
<p>The code executes the optimisation and checks that the resulting <code>status</code> is optimal. If the operation succeeds, the code gets back the optimal solution. Then, for each quorum, it initialises a new <code>SigmaRecord</code> with the quorum and its probability of being selected.</p>
<p>Finally, it creates a new <code>Strategy</code> with the <code>SigmaR</code> (array of <code>SigmaRecord</code> for the read quorums) and the <code>SigmaW</code> (array of <code>SigmaRecord</code> for the write quorums).</p>
<p>Let&apos;s refresh the definition of strategy as mentioned in the paper:</p>
<p>$$ \sigma = (\sigma_R, \sigma_W) $$</p>
<p>\(\sigma_R\) and \(\sigma_W\) are probability distributions: \(\sigma_R(r)\) and \(\sigma_W(w)\) are respectively the probabilities that the strategy chooses a <em>read quorum</em> \(r\) and a <em>write quorum</em> \(w\).</p>
<p>quoracle-go represents a <code>Strategy</code> in a similar way using the following structs:</p>
<script src="https://gist.github.com/samueleresca/2a4a962ca28de13e7d31ff3ea8434ce7.js"></script>
<p>The <code>SigmaR</code> and <code>SigmaW</code> fields correspond to the strategy&apos;s probabilities of choosing a specific read or write quorum. Also, the <code>Strategy</code> struct maintains a hashmap storing each node and its likelihood of being selected.</p>
<p>The <code>Strategy</code> struct exposes some methods that retrieve some metrics and information. Below is the list of methods provided with the <code>Strategy</code> struct:</p>
<script src="https://gist.github.com/samueleresca/ee56fbfb9c839e85cb6e744ffb326b91.js"></script>
<p>The code above omits the implementation of the functions for brevity. The <code>GetReadQuorum</code> and <code>GetWriteQuorum</code> methods use a probability distribution to return the quorums. The <code>Load</code>, <code>Capacity</code>, <code>NetworkLoad</code> and <code>Latency</code> methods return the respective metrics for a given read or write workload. The <code>NodeLoad</code>, <code>NodeUtilization</code> and <code>NodeThroughput</code> methods target a specific <code>Node</code> in the quorum system. The implementations use the probability of a node being selected to calculate the respective node metric.</p>
<h2 id="searching-for-the-optimal-quorum-strategy">Searching for the optimal quorum strategy</h2>
<p>We have seen how the codebase leverages linear programming to find the optimal strategy.</p>
<p>Let&apos;s reiterate one of the primary purposes of quoracle. Given the node&apos;s details, an optimisation target and workload distribution, it returns the optimised strategy.</p>
<p>The <code>Search</code> method implements the rule mentioned above. It calculates the optimal strategy by trying all the combinations of quorums. Whenever the optimal strategy for a given valid quorum returns a better target metric, the <code>Search</code> function saves that.</p>
<p>Below is the code implementation of the <code>Search</code> function:</p>
<script src="https://gist.github.com/samueleresca/2d1c84c9ea40ec98b1bfd94e441c650d.js"></script>
<p>The <code>dupFreeExprs</code> and the <code>doSearch</code> functions encapsulate the core logic.<br>
The <code>dupFreeExprs</code> returns all the possible combinations of quorums composed using a list of nodes.<br>
The <code>doSearch</code> uses the <code>dupFreeExprs</code> outcome to initialise a new quorum system and find an optimal strategy. The implementation keeps track of the <code>Strategy</code> and <code>QuorumSystem</code> with the most optimised metric.</p>
<p>The search process is time-bound: when the operation reaches a specified timeout, the search stops. The timeout prevents long-hanging search processes.</p>
<h2 id="wrap-up">Wrap up</h2>
<p>This post went through the Golang port of quoracle, describing the main components implemented in the codebase. The code is available at <a href="https://github.com/samueleresca/quoracle-go">samueleresca/quoracle-go</a>.<br>
The project had two primary purposes. First, to put into practice the core concepts described in the paper<a href="#references">[1]</a>. Secondly, to explore the Golang ecosystem and tooling.<br>
Some of the concepts might seem very theoretical, but it is essential to know the basics. Quorums are the foundation of distributed systems topics such as replication and consensus.</p>
<h2 id="references">References</h2>
<p>[1]<a href="https://mwhittaker.github.io/publications/quoracle.pdf">Read-Write Quorum Systems Made Practical - Michael Whittaker, Aleksey Charapko, Joseph M. Hellerstein, Heidi Howard, Ion Stoica</a><br>
[2]<a href="https://github.com/lanl/clp">lanl/clp provides a Go interface to the COIN-OR Linear Programming (CLP) library</a><br>
[3]<a href="https://github.com/mwhittaker/quoracle">github.com/mwhittaker/quoracle</a></p>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[A practical approach to read-write quorum systems]]></title><description><![CDATA[Quorum systems allow consistency of replicated data; every time a group of servers needs to agree on something, a quorum is involved in the decisions. An example could be the leaderless databases, such as Dynamo. Read-write quorums define two configurable values, R and W. ]]></description><link>https://samueleresca.net/practical-approach-to-read-write-quorum-systems/</link><guid isPermaLink="false">63d9a4dc2bc5da0fb16ac190</guid><category><![CDATA[Distributed Systems]]></category><category><![CDATA[Quorum Systems]]></category><dc:creator><![CDATA[Samuele Resca]]></dc:creator><pubDate>Mon, 14 Jun 2021 11:51:41 GMT</pubDate><media:content url="https://samueleresca.net/content/images/2021/06/background_quorum_systems.jpg" medium="image"/><content:encoded><![CDATA[<div class="kg-card kg-callout-card kg-callout-card-grey"><div class="kg-callout-emoji">&#x2139;&#xFE0F;</div><div class="kg-callout-text">The 2nd part of this post is available: <a href="https://samueleresca.net/a-practical-approach-to-read-write-quorum-systems-part-2/">[Part 2] A practical approach to read-write quorum systems</a>.</div></div><img src="https://samueleresca.net/content/images/2021/06/background_quorum_systems.jpg" alt="A practical approach to read-write quorum systems"><p></p><!--kg-card-begin: markdown--><p>Quorum systems allow consistency of replicated data; every time a group of servers needs to agree on something, a quorum is involved in the decisions.</p>
<p>For example, leaderless databases such as Dynamo (not to be confused with DynamoDB) use a consistency protocol based on quorums<a href="#references">[1]</a>. Furthermore, most consensus algorithms, such as Zab, Raft, and Paxos, are based on quorums<a href="#references">[2]</a>.</p>
<p>Read-write quorums define two configurable parameters, <code>R</code> and <code>W</code>.<br>
<code>R</code> is the minimum number of nodes that must participate in a <em>read operation</em>, and <code>W</code> the minimum number of nodes that must participate in a <em>write operation</em>.</p>
<p>A few weeks ago, I came across this tweet:</p>
<!--kg-card-end: markdown--><!--kg-card-begin: html--><center><blockquote class="twitter-tweet" data-theme="dark"><p lang="en" dir="ltr">Majority quorums are pervasive in strongly consistent distributed systems yet there are many alternatives with better scalability and performance. Quoracle is a new open-source tool to find an optimal quorum system for a given distributed architecture. <a href="https://t.co/d4UpFNdQuf">https://t.co/d4UpFNdQuf</a> <a href="https://t.co/V2SaXYmkoP">pic.twitter.com/V2SaXYmkoP</a></p>&#x2014; Heidi Howard (@heidiann360) <a href="https://twitter.com/heidiann360/status/1381962220100222976?ref_src=twsrc%5Etfw">April 13, 2021</a></blockquote> </center> <script defer src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
<br><!--kg-card-end: html--><!--kg-card-begin: markdown--><p>The tweet refers to the following publication:</p>
<p><a href="https://mwhittaker.github.io/publications/quoracle.pdf">Read-Write Quorum Systems Made Practical - Michael Whittaker, Aleksey Charapko, Joseph M. Hellerstein, Heidi Howard, Ion Stoica</a></p>
<p>The paper reviews some concepts of the quorum systems, and it presents a concrete tool named &quot;Quoracle&quot; that explores the trade-offs of the read-write quorum systems.</p>
<p>Quoracle provides an alternative to the majority quorum systems that are widely adopted in distributed systems. The majority quorum can be defined as</p>
<p>$$ \frac{n}{2} + 1 \space where \space n = n. \space of \space nodes $$</p>
<p>In the case of a read-write quorum system, the majority is represented in a similar way:</p>
<p>$$ r = w = \frac{n}{2} + 1 $$</p>
<p>where \(r\) and \(w\) are the read and write quorums.</p>
<p>Below there are some notes I took while I was reading the paper and a detailed look at the proposed implementation of Quoracle. The topics discussed are listed below:</p>
<ul>
<li><a href="#quorum-system-definitions">Quorum system definitions</a></li>
<li><a href="#quoracle-overview">Quoracle overview</a></li>
<li><a href="#optimal-strategy-problem-implementation">Optimal strategy problem implementation</a></li>
<li><a href="#strategy-representation">Strategy representation</a></li>
<li><a href="#handling-quorum-failures">Handling quorum failures</a></li>
<li><a href="#optimal-strategy-search">Optimal strategy search</a></li>
</ul>
<h2 id="quorum-system-definitions">Quorum system definitions</h2>
<p>The paper resumes the definitions and the characteristics of the read-write quorum systems. These concepts are helpful to understand the theory of quorum systems and the implementation of Quoracle.</p>
<p>Given a list of nodes represented as follow:</p>
<p>$$ X = {{x_1,..,x_n}} $$</p>
<p>A <em>read-write quorum system</em> is defined as \(Q = (R, W)\) where \(R\) represents the read quorums, and \(W\) represents the write quorums. \(R\) and \(W\) are sets of subsets of the list of nodes defined in \(X\).</p>
<p>An additional constraint asserts that every read quorum and every write quorum (\(r\) and \(w\)) must intersect. This is represented with the following definition:</p>
<p>$$ \forall r \in R, \space \forall w \in W: \space r \cap w \neq \emptyset $$</p>
<p>It is possible to illustrate a quorum system with a grid as follows:</p>
<p><img src="https://samueleresca.net/content/images/2021/04/Screenshot-2021-04-23-at-18.09.51.png" alt="A practical approach to read-write quorum systems" loading="lazy"></p>
<p>The grid above describes a \( Q_{2x3} \) quorum system defined as \( Q_{2x3} = (abc + def, ad + be + cf)\).</p>
<p>Note that multiplication represents the nodes that are part of the same set while the addition separates the groups. This approach will be helpful to define read and write sets in Quoracle.</p>
<h3 id="fault-tolerance">Fault tolerance</h3>
<p>The paper describes the <em>read fault tolerance</em> of a quorum system as the largest number of nodes, called \(f\), that can fail while a read quorum is still available. The same definition applies to the <em>write fault tolerance</em>.</p>
<p>The <em>fault tolerance</em> of a quorum system is the minimum between the <em>read fault tolerance</em> and the <em>write fault tolerance</em>. For example, the \(Q_{2x3}\) quorum system defined above has a <em>read fault tolerance</em> of 1: any single node failure still leaves one read quorum (one row) available, but failing one node in each row makes both read quorums unavailable. On the other hand, the <em>write fault tolerance</em> is 2: any two node failures leave at least one write quorum (one column) intact, while three failures can take down all of them.</p>
<p>In this case, the <em>fault tolerance</em> of the quorum system is equal to 1.</p>
<h3 id="strategy">Strategy</h3>
<p>The paper describes the concept of <em>strategy</em> of a read-write quorum system: the strategy of a quorum system, represented in the paper as \(\sigma\), decides which quorum to contact.</p>
<p>The strategy of a read-write quorum system is represented as:</p>
<p>$$ \sigma = (\sigma_R, \sigma_W) $$</p>
<p>where \(\sigma_R\) and \(\sigma_W\) are probability distributions: \(\sigma_R(r)\) and \(\sigma_W(w)\) are respectively the probabilities that the strategy chooses a <em>read quorum</em> \(r\) and a <em>write quorum</em> \(w\).</p>
<h3 id="load-capacity-and-the-read-fraction">Load, capacity and the read fraction</h3>
<p>The <em>read load of a node</em> \(x\), represented as<br>
\(load_{\sigma_R}(x)\), is the probability that \(\sigma_R\) chooses a read quorum \(r\) that contains the node \(x\). The same definition applies to \(load_{\sigma_W}(x)\).</p>
<p>For calculating the <em>total load</em> of a node \(x\), we need to introduce another parameter: the <em>read fraction</em>.</p>
<p>The <em>read fraction</em>, represented as \(f_r\), is the percentage of read vs. write operations in a workload. For example, a workload with 50% read operations and 50% write operations has a read fraction \( f_r = 0.5 \).</p>
<p>Therefore, the <em>total load of x</em> can be represented by the following definition:<br>
$$ load(x) = f_rload_{\sigma_R}(x) + (1 - f_r)load_{\sigma_W}(x) $$</p>
<p>The <em>load of the quorum system</em> is the load of the optimal strategy, defined as the strategy that brings the lowest load.</p>
<p>The capacity can be represented as the inverse of the load:</p>
<p>$$ C = \frac{1}{L} \hspace{0.5cm} where \hspace{0.5cm} L = load $$</p>
<p>The higher the load on the quorums, the lower the capacity of the quorum system.</p>
<p>In conclusion, these are some of the definitions illustrated in the paper. This section gives a good overview of the possible parameters that can influence the performance and the efficiency of read-write quorum systems. The following excerpt is more practical, and it shows the application of these definitions on the implementation of Quoracle.</p>
<h2 id="quoracle-overview">Quoracle overview</h2>
<p>The concepts introduced in the previous section help understand the implementation of the quoracle library described in the paper, available on GitHub: <a href="https://github.com/mwhittaker/quoracle">mwhittaker/quoracle</a>.</p>
<p>Note that the quoracle lib is distributed on PyPI, and it can be installed in a project using the following command:</p>
<pre><code>pip install quoracle
</code></pre>
<p>The paper provides an initial example similar to this:</p>
<script src="https://gist.github.com/samueleresca/8aa60efe4bb69683ff62aed5d5787224.js"></script>
<p>The snippet above declares nodes using the <code>Node</code> class and the corresponding quorum system defined by the <code>QuorumSystem</code> class. The <code>Node</code> class is used to describe the nodes that are part of the \(X\) set:</p>
<script src="https://gist.github.com/samueleresca/96c54636ecfca06e5bb551d8c03f6c8e.js"></script>
<p>Each <code>Node</code> instance has a <code>name</code>, a <code>read_capacity</code>, a <code>write_capacity</code>, and a <code>latency</code> attribute.</p>
<p>The <code>read_capacity</code>, <code>write_capacity</code> and <code>latency</code> attributes represent heterogeneous nodes.</p>
<p>When a read/write capacity is specified for a node, it is used to normalise the node load during the linear optimisation; we will go through the detailed process later in this post.</p>
<p>By default, the <code>latency</code> of a Node is 1s. The attribute is used in the linear optimisation process for calculating the latency of the quorum systems.</p>
<p>The multiplication(<code>*</code>) and sum operation(<code>+</code>) are defined by the <code>Expr</code> class that is extended by the <code>Node</code> class:</p>
<script src="https://gist.github.com/samueleresca/cfab5d1f200148a1e292a595f5779595.js"></script>
<p>Each <code>Node</code> instance extends the <code>Expr</code> type, this approach allows to compose expressions of nodes, such as the following quorum definition:</p>
<pre><code class="language-python">&#xA0; a, b, c = Node(&apos;a&apos;), Node(&apos;b&apos;), Node(&apos;c&apos;)
&#xA0; q = QuorumSystem(reads=a * b + b * c + a * c)
</code></pre>
<p>Another key component is the <code>QuourmSystem</code> class: it encapsulates the read and write quorums of the system, and it defines some methods for calculating the <em>optimal strategy</em>: the <em>load</em>, the <em>latency</em>, and the <em>network load</em>.</p>
<p>The snippet below defines the <code>QuorumSystem</code> class implemented by the library and the signature of the key methods exposed by the class:</p>
<script src="https://gist.github.com/samueleresca/bb6fd0669e860c780379420979375a08.js"></script>
<p>The snippet above defines four methods for the <code>QuorumSystem</code> class: <code>load</code>, <code>capacity</code>, <code>network_load</code>, and <code>latency</code>. We will have a detailed look at their implementation later in this section. All these methods have a similar signature: they require an <code>optimize</code> parameter representing the objective of the optimisation, either the <code>read_fraction</code> or the <code>write_fraction</code> parameter (already described <a href="#quorum-system-definitions">in the theory section</a>), and some additional constraints used by the optimisation of the quorum strategy, such as the <code>load_limit</code>, <code>network_limit</code>, and <code>latency_limit</code> parameters.</p>
<p>As you may have noticed, the initialisation of the quorum system only provides the following read quorums:</p>
<pre><code>ab + bc + ac
</code></pre>
<p>In this case, the library will derive the write quorums by applying a <code>dual</code> operation to the given quorum. For example, applying the <code>dual</code> operation to the <code>ab + bc + ac</code> quorum yields:</p>
<pre><code>(a + b) * (b + c) * (a + c)
</code></pre>
<p>Once the <code>QuorumSystem</code> is initialised, it is possible to calculate the <code>load</code>, the <code>latency</code>, and the <code>network_load</code> of the system.</p>
<h2 id="optimal-strategy-problem-implementation">Optimal strategy problem implementation</h2>
<p>This section digs into the optimal strategy implementation. The <code>load</code>, <code>capacity</code>, <code>network_load</code> and <code>latency</code> methods in the <code>QuorumSystem</code> class use the concept of <em>optimal strategy</em>, referred to in the paper as \( \sigma^{\ast} = (\sigma_R^{\ast}, \sigma_W^{\ast}) \).</p>
<p>A strategy can be optimal in respect of the load, the network, or the latency. The objective of the optimisation can be configured using the <code>optimize</code> parameter.</p>
<p>The internal implementation turns the optimisation into a linear programming problem, and it relies on <a href="https://coin-or.github.io/pulp/">PuLP</a> to find the optimal strategy.</p>
<p>In the following sections, we discover the different optimisation objectives provided by Quoracle.</p>
<h3 id="load-optimization">Load optimization</h3>
<p>If <code>optimize == LOAD</code>, then the objective of the linear optimisation is the load. The implementation of the load optimization works as follows:</p>
<script src="https://gist.github.com/samueleresca/800044104b218afda46951e804be23fd.js"></script>
<p>The implementation minimizes the load using the following definition:</p>
<p>$$ \frac{f_r}{cap_R(x)} \sum_{{r \in R | x \in r }} p_r + &#xA0;\frac{1 - f_r}{cap_W(x)} &#xA0;\sum_{{w \in W | x \in w }} &#xA0;p_w \leq L_{f_r} $$</p>
<p>For each <code>read_fraction</code> (\(f_r\)) in the <code>QuorumSystem</code>, the <code>load</code> method proceeds by computing the load by calling <code>fr_load</code> method.</p>
<p>As a first step, the <code>fr_load</code> method defines the \(L_{f_r}\) variable, representing the load for that read fraction.<br>
It continues by adding the read quorum part of the definition above:</p>
<pre><code>x_load += fr * sum(vs) / self.node(x).read_capacity
</code></pre>
<p>the same for the write quorum part:</p>
<pre><code>x_load += (1 - fr) * sum(vs) / self.node(x).write_capacity
</code></pre>
<p>As a final step, it proceeds by finalising the constraint using the instruction:</p>
<pre><code>problem += (x_load &lt;= l, f&apos;{x}{fr}&apos;)
</code></pre>
<h3 id="network-load-optimization">Network load optimization</h3>
<p>The following expression defines the network load.</p>
<p>$$ f_r ( \sum_{r \in R} p_r &#xA0;\cdot |r|) + (1 - f_r) ( \sum_{w \in W} p_w &#xA0;\cdot |w|) $$</p>
<p>Where \(|r|\) and \(|w|\) are the length of the read and write quorums sets.</p>
<p>In concrete, the library implements the following code:</p>
<script src="https://gist.github.com/samueleresca/f1a43ab8acbc43b6f263a1aed2feaccb.js"></script>
<p>If <code>optimize == NETWORK</code>, quoracle will use the network function above as the target objective for the optimal strategy. The implementation follows the network load definition mentioned above: for each read and write quorum, it multiplies the quorum length by the corresponding <code>read_quorum_vars</code> and <code>write_quorum_vars</code> entries and sums the values together.</p>
<p>If \(f_r\) is a distribution, the network load is computed as a weighted average of all the network loads derived by every single value in the distribution.</p>
<h3 id="latency-optimization">Latency optimization</h3>
<p>The latency is another important aspect of a quorum system. In the real world, nodes are heterogeneous, and the latency depends on the network conditions and the status of the node that we are contacting.</p>
<p>The paper defines the latency as:</p>
<p>$$ f_r ( \sum_{r \in R} p_r \cdot latency(r)) + (1 - f_r) ( \sum_{w \in W} p_w \cdot latency(w)) $$</p>
<p>In practice, when <code>optimize == LATENCY</code>, Quoracle uses the following approach to calculate the latency:</p>
<script src="https://gist.github.com/samueleresca/20dbee03b347035bda18d383e180a668.js"></script>
<p>The <code>Node</code> class has a configurable <code>latency</code>. The implementation builds the latency minimisation problem by multiplying the latency in seconds assigned to each node with the read and write quorums. If \(f_r\) represents a distribution, then the implementation uses the weighted average of the values in the distribution to get the final result.</p>
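<p>The latency objective mirrors the network one. As an illustration only (quoracle's actual quorum-latency computation is more involved), the sketch below assumes a quorum's latency is the latency of its slowest member:</p>

```python
# Illustrative sketch: expected latency of a fixed strategy, under the
# simplifying assumption latency(q) = slowest node in the quorum q.
def expected_latency(fr, read_strategy, write_strategy, node_latency):
    def quorum_latency(q):
        return max(node_latency[n] for n in q)

    read_part = sum(p * quorum_latency(q) for q, p in read_strategy.items())
    write_part = sum(p * quorum_latency(q) for q, p in write_strategy.items())
    return fr * read_part + (1 - fr) * write_part

latencies = {'a': 1.0, 'b': 2.0, 'c': 4.0}  # seconds, per node
sigma_r = {frozenset('ab'): 1.0}
sigma_w = {frozenset('bc'): 1.0}
```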
<h3 id="additional-constraints">Additional constraints</h3>
<p>As we saw previously, the target of linear optimisation depends on the <code>optimize</code> parameter. Therefore, the resulting <em>optimal strategy</em> will be optimal regarding the <em>load</em>, the <em>network load</em>, or the <em>latency</em>.</p>
<p>In addition, it is possible to specify some additional constraints for the optimal strategy: these constraints are limits on the <em>load</em>, the <em>network</em>, or the <em>latency</em>.</p>
<p>Note that it is not possible to optimise for load and, at the same time, specify a load limit for the strategy; the same restriction applies to the network and the latency.</p>
<p>The additional constraints use the same definitions of load, network load, and latency we saw previously. The following example shows a final problem that defines an optimal load strategy with some constraints on the network and the latency.</p>
<p>Let&apos;s take for example the following optimization query:</p>
<pre><code class="language-python">q = QuorumSystem(reads=a * b + b * c + a * c)

q.load(optimize=LOAD, read_fraction=1, network_limit=3, latency_limit=datetime.timedelta(seconds=2))
</code></pre>
<p>The <code>QuorumSystem</code> is defined by some quorums composed of three nodes: a, b, c. We want to get the optimal load (<code>optimize==LOAD</code>), with a <code>read_fraction</code> of 1 (a 100% read workload), a <code>network_limit==3</code>, and a latency limit of 2 seconds (by default, every <code>Node</code> has a latency of 1s if we do not specify anything). The resulting optimisation problem that finds the optimal strategy looks like this:</p>
<pre><code>optimal_strategy:
    MINIMIZE
        1.0*l_1.0 + 0.0
    SUBJECT TO
        valid_read_strategy: r0 + r1 + r2 = 1
        valid_write_strategy: w0 + w1 + w2 + w3 + w4 + w5 + w6 + w7 = 1
        a1.0: - l_1.0 + r0 + r2 &lt;= 0
        b1.0: - l_1.0 + r0 + r1 &lt;= 0
        c1.0: - l_1.0 + r1 + r2 &lt;= 0
        network_limit: 2 r0 + 2 r1 + 2 r2 &lt;= 3
        latency_limit: r0 + r1 + r2 &lt;= 2
</code></pre>
<p>As you can see, the optimal strategy problem tries to minimise the load subject to some constraints:</p>
<ul>
<li>valid read/write strategies: the sum of all the read/write probabilities is 1;</li>
<li>the <code>a1.0</code>, <code>b1.0</code>, <code>c1.0</code> constraints represent the load optimization. In detail, the <code>a1.0</code> constraint, which represents the load for node <code>a</code>, imposes <code>r0 + r2 &lt;= l_1.0</code>;</li>
<li>the <code>network_limit</code> must be less than three as we specified in the <code>load</code> method call above;</li>
<li>the <code>latency_limit</code> must be less than 2s as we specified in the <code>load</code> method call above;</li>
</ul>
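<p>We can verify by hand, outside any LP solver, that the uniform read strategy \(r_0 = r_1 = r_2 = 1/3\) is feasible for the printed problem and achieves a load of \(2/3\):</p>

```python
# Manual feasibility check of the printed problem for the uniform strategy.
r = [1 / 3, 1 / 3, 1 / 3]
l = 2 / 3  # candidate value of the load variable l_1.0

eps = 1e-9
assert abs(sum(r) - 1) < eps   # valid_read_strategy
assert r[0] + r[2] <= l + eps  # a1.0: load of node a
assert r[0] + r[1] <= l + eps  # b1.0: load of node b
assert r[1] + r[2] <= l + eps  # c1.0: load of node c
assert 2 * sum(r) <= 3 + eps   # network_limit
assert sum(r) <= 2 + eps       # latency_limit
```

<p>No strategy can do better here: each node appears in two of the three quorums, so some node must carry at least 2/3 of the read probability.</p>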
<p>The reference to the <code>optimal_strategy</code> problem is then optimised, and the resulting values, encapsulated in a <code>pulp.LpVariable</code> instance, are passed to a new <code>Strategy</code> type.</p>
<h2 id="strategy-representation">Strategy representation</h2>
<p>Quoracle uses a <code>Strategy</code> class to represent the concept of strategy (<code>&#x3C3;</code>) for the quorum system. The following snippet of code describes the implementation and the constructor of a <code>Strategy[T]</code> type:</p>
<script src="https://gist.github.com/samueleresca/e50f62f55c8bc5dc40cd695bc9378b64.js"></script>
<p>The class extends a <code>Generic[T]</code> type, and it wraps a reference to a <code>QuorumSystem[T]</code> type, a <code>sigma_r</code>, and a <code>sigma_w</code> attribute; the last two properties represent the definition of strategy we saw in the first section:</p>
<p>$$ \sigma = (\sigma_R, \sigma_W) \newline$$</p>
<p>The <code>Strategy</code> constructor initializes two dictionaries: the <code>x_read_probability</code> and <code>x_write_probability</code>. They provide a map between node x and its probability of getting selected as part of a quorum.</p>
<p>The <code>Strategy</code> class exposes the methods for retrieving the read and write quorums and the nodes that are part of the quorum systems:</p>
<script src="https://gist.github.com/samueleresca/d1c5e7bcffc42769e0e48c40ecb4825a.js"></script>
<p>The <code>get_read_quorum</code> and the <code>get_write_quorum</code> methods pick the read/write set depending on the probability of getting selected.</p>
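<p>Conceptually, this weighted selection can be sketched with Python's <code>random.choices</code> (an illustration of the idea, not the library's actual code):</p>

```python
import random

# Illustrative sketch of probability-weighted quorum selection: sigma maps
# each quorum to its selection probability.
def pick_quorum(sigma, rng=random):
    quorums = list(sigma.keys())
    weights = list(sigma.values())  # selection probabilities
    return rng.choices(quorums, weights=weights, k=1)[0]

# With a single quorum at probability 1, the choice is deterministic.
only = pick_quorum({frozenset('ab'): 1.0})
```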
<p>Finally, the <code>Strategy</code> class implements the methods for calculating key characteristics we discussed previously, such as: the <code>load</code>, the <code>capacity</code>, the <code>network_latency</code>, and the <code>latency</code>:</p>
<script src="https://gist.github.com/samueleresca/de9af4dda1315677b51668c196991c1c.js"></script>
<p>The implementation works similarly to the constraint definitions of the optimal strategy problem we discussed in the previous sections. Again, it is possible to pass the <code>read_fraction</code> and the <code>write_fraction</code> distributions to represent heterogeneous workloads.</p>
<h2 id="handling-quorum-failures">Handling quorum failures</h2>
<p>In distributed systems, failures are frequent. The two main approaches adopted in quorum systems are:</p>
<ul>
<li>contact every node in the quorum system in case some of them are failing;</li>
<li>contact the minimum amount of nodes and eventually retry if a node fails (retry means losing a significant amount of time);</li>
</ul>
<p>The paper defines a dynamic value representing a quorum system&apos;s resilience, called <em>f-resilience</em>.</p>
<p>\(f\) represents the number of failures that are tolerated by a strategy.</p>
<p>Quoracle computes the read and write quorums that are <em>f-resilient</em> using dynamic programming. The logic is defined in the <code>_f_resilient_quorums</code> method:</p>
<script src="https://gist.github.com/samueleresca/7f9df308498a1c5cf3c3762f837602b6.js"></script>
<p>Given an \(f\) coefficient, a set of nodes, and an expression (<code>Expr</code>) indicating the read and write quorums, the method returns the sets of <em>f-resilient</em> quorums by iterating over all the possible combinations of nodes that can form a valid f-resilient quorum. Note that it is impossible to build resilient quorums in some scenarios, for example, when <code>f &gt; len(xs)</code>; in those cases, the method returns an empty set.</p>
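<p>The definition can be sketched with a few lines of brute-force Python (an illustration of f-resilience only, not quoracle's DP-based implementation; <code>is_quorum</code> is a hypothetical stand-in for evaluating the <code>Expr</code>):</p>

```python
from itertools import combinations

# A node set xs is f-resilient if, after removing any f of its nodes,
# the remaining nodes still contain a quorum.
def is_f_resilient(xs, f, is_quorum):
    if f >= len(xs):
        return False  # cannot survive f failures with <= f nodes
    return all(is_quorum(set(xs) - set(dead)) for dead in combinations(xs, f))

def f_resilient_quorums(f, nodes, is_quorum):
    return {frozenset(cand)
            for size in range(1, len(nodes) + 1)
            for cand in combinations(sorted(nodes), size)
            if is_f_resilient(cand, f, is_quorum)}

# Majority quorums over {a, b, c}: any two nodes form a quorum.
def majority(q):
    return len(q) >= 2

resilient = f_resilient_quorums(1, {'a', 'b', 'c'}, majority)
```

<p>With majority quorums over three nodes, only the full set {a, b, c} tolerates one failure, which illustrates the capacity cost of resilience mentioned below.</p>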
<p>The concept of resilience, as described in the paper, comes at the cost of capacity.</p>
<h2 id="optimal-strategy-search">Optimal strategy search</h2>
<p>Quoracle provides a way to search for an optimal quorum system depending on a given set of parameters:</p>
<pre><code>def search(nodes: List[Node[T]],
           read_fraction: Optional[Distribution] = None,
           write_fraction: Optional[Distribution] = None,
           optimize: str = LOAD,
           resilience: int = 0,
           load_limit: Optional[float] = None,
           network_limit: Optional[float] = None,
           latency_limit: Optional[datetime.timedelta] = None,
           f: int = 0,
           timeout: datetime.timedelta = datetime.timedelta(seconds=0)) \
           -&gt; Tuple[QuorumSystem[T], Strategy[T]]:
    # ...
</code></pre>
<p>The <code>nodes</code> parameter represents the nodes you have in the quorum system. As already discussed, the <code>read_fraction</code> could be a single value or a <code>Distribution</code> of percentages. Finally, the <code>optimize</code> parameter defines the objective of the optimisation.</p>
<p>The <code>load_limit</code>, <code>network_limit</code>, <code>latency_limit</code> are the additional constraints already explained in the <a href="#optimal-strategy-problem-implementation">Optimal strategy problem implementation</a> section.</p>
<p>The underlying implementation of the <code>search</code> function uses the types we already described, such as the <code>Strategy</code> and the <code>QuorumSystem</code> classes:</p>
<script src="https://gist.github.com/samueleresca/fba9d47adbb206794d47aec8fe79accb.js"></script>
<p>Given a set of <code>nodes</code>, the implementation calculates the list of all the de-duplicated expressions that can be composed with that set of nodes using the <code>_dup_free_exprs</code> function.</p>
<p>The implementation proceeds by calling the <code>do_search</code> function; it initialises a new instance of a <code>QuorumSystem</code> for each previously computed expression.</p>
<p>Once the <code>QuorumSystem</code> is initialised, it proceeds by calling the <code>strategy</code> method to get the <em>optimal strategy</em> for the current expression. If the target metric for that strategy is better than the one of the previous best strategy, it is promoted to the new optimal strategy. Otherwise, the algorithm continues.</p>
<p>The <code>do_search</code> function applies a linear programming optimisation for each expression derived from <code>_dup_free_exprs</code>. However, the computation is time-consuming: the <code>search</code> function accepts a <code>timeout</code> parameter limiting the search&apos;s computation time. The implementation checks the timeout at every iteration and stops the computation once it has been exceeded.</p>
<h2 id="final-thoughts">Final thoughts</h2>
<p>This post went through the concepts of quorum systems illustrated in the <a href="https://mwhittaker.github.io/publications/quoracle.pdf">Read-Write Quorum Systems Made Practical - Michael Whittaker, Aleksey Charapko, Joseph M. Hellerstein, Heidi Howard, Ion Stoica</a> paper, and it gave a detailed look at the Quoracle implementation. Quoracle is an excellent foundation to explore the trade-offs of read-write quorum systems. A closer look at the Quoracle implementation helps to understand the critical aspects of implementing quorums in distributed systems.</p>
<p>In the next couple of months, I will implement a Golang port of Quoracle for learning purposes. The port is in progress at the following: <a href="https://github.com/samueleresca/quoracle-go">samueleresca/quoracle-go</a>. I will probably write a more detailed post that gives a closer look at the Golang implementation.</p>
<p>Below there are the references used to write this post.</p>
<h2 id="references">References</h2>
<p>[1]<a href="https://sites.cs.ucsb.edu/~agrawal/fall2009/dynamo.pdf">Dynamo: Amazon&#x2019;s Highly Available Key-value Store</a></p>
<p>[2]<a href="https://www.usenix.org/legacy/event/atc10/tech/full_papers/Hunt.pdf">ZooKeeper: Wait-free coordination for Internet-scale systems</a></p>
<p>[3]<a href="https://martinfowler.com/articles/patterns-of-distributed-systems/quorum.html">Quorums - Martin Fowler</a></p>
<p>[4]<a href="https://mwhittaker.github.io/publications/quoracle.pdf">Read-Write Quorum Systems Made Practical - Michael Whittaker, Aleksey Charapko, Joseph M. Hellerstein, Heidi Howard, Ion Stoica</a></p>
<!--kg-card-end: markdown--><p></p><p></p><p></p><p></p><p></p>]]></content:encoded></item><item><title><![CDATA[Detecting node failures and the Phi accrual failure detector]]></title><description><![CDATA[Partial failure is an aspect of distributed systems; the asynchronous nature of the processes and the network infrastructure makes fault detection a complex topic. 
Failure detectors usually provide a way to identify and handle failures]]></description><link>https://samueleresca.net/detecting-node-failures-and-the-phi-accrual-failure-detector/</link><guid isPermaLink="false">63d9a4dc2bc5da0fb16ac18e</guid><category><![CDATA[Distributed Systems]]></category><category><![CDATA[Fault tolerance]]></category><category><![CDATA[Failure detectors]]></category><dc:creator><![CDATA[Samuele Resca]]></dc:creator><pubDate>Wed, 07 Apr 2021 16:02:01 GMT</pubDate><media:content url="https://samueleresca.net/content/images/2021/03/Screenshot-2021-03-23-at-10.33.02.jpg" medium="image"/><content:encoded><![CDATA[<!--kg-card-begin: markdown--><img src="https://samueleresca.net/content/images/2021/03/Screenshot-2021-03-23-at-10.33.02.jpg" alt="Detecting node failures and the Phi accrual failure detector"><p>Partial failure is an aspect of distributed systems; the asynchronous nature of the processes and the network infrastructure makes fault detection a complex topic.<br>
Failure detectors usually provide a way to identify and handle failures using a constant timeout: after a certain threshold, the failure detector declares the node offline.</p>
<p>This post is the result of some readings and notes related to an alternative approach for failure detection called &quot;&#x3C6; accrual failure detector&quot;.</p>
<p>Most of the concepts are from the following paper:</p>
<p><a href="https://ieeexplore.ieee.org/document/1353004">The &#x3C6; Accrual Failure Detector - Naohiro Hayashibara, Xavier D&#xE9;fago, Rami Yared and Takuya Katayama</a></p>
<p>A common way to detect node failures in asynchronous distributed systems is to use a heartbeat signal: let&apos;s suppose that we have a <code>p</code> process that pings a <code>q</code> monitor process. If <code>q</code> doesn&apos;t receive a heartbeat request from <code>p</code> after a certain delay (<code>&#x394;t</code>), then <code>q</code> can declare <code>p</code> failed.</p>
<p>How long should the timeout be before declaring <code>p</code> offline?</p>
<p>Since we have a binary signal, it is hard to distinguish an offline process from a slow one; we can end up with two different cases:</p>
<ul>
<li>A <em>short timeout</em> means that we can potentially declare the node offline even if it is not (<em>aggressive approach</em>), but we will detect offline nodes faster;</li>
<li>A <em>long timeout</em> reduces the risk of wrongly declaring nodes offline with the cons that the detection will be slower;</li>
</ul>
<p>Therefore, the timeout value is usually configured with an experimental approach and adjusted manually.</p>
<p>In addition to process slowness, some additional variables play an important role in a heartbeat timeframe: the <em>unbounded network delays</em>. Different parts of the network infrastructure can slow down the communication between two nodes, such as the TCP retry mechanism or congestion in a network switch.</p>
<p>An accrual failure detector calculates the variability of the response times based on a sample window, and it provides a dynamic way to identify a dependency failure.</p>
<h2 id="failuredetectorarchitecture">Failure detector architecture</h2>
<p>The paper describes the failure detectors with the following components:</p>
<ul>
<li>the <em>Monitoring</em> component receives the heartbeat from the network;</li>
<li>the <em>Interpretation</em> component defines the criteria that establish if a node should be considered available or not;</li>
<li>the <em>Action</em> component implements the decisions of what to do in case the node is not available based on the outcome of the <em>interpretation</em> component;</li>
</ul>
<p><img src="https://samueleresca.net/content/images/2021/03/accrualfailuredetector_arch-1.svg" alt="Detecting node failures and the Phi accrual failure detector" loading="lazy"></p>
<p>The image shows the different parts of a failure detector system.</p>
<p>The left schema represents a standard failure detector: the <em>monitoring</em> and the <em>interpretation</em> phases are implemented within the failure detection system. In this case, the failure detector returns a <em>suspicion</em> flag represented by a boolean value. The application logic does not perform any additional analysis on the value; it proceeds by executing an action depending on the <em>suspicion</em> result.</p>
<p>The schema on the right describes the anatomy of an accrual failure detector. The difference is that the accrual failure detector layer returns a <em>level of suspicion</em>, described by a dynamic numeric value. In this case, both the <em>interpretation</em> and the <em>action</em> are delegated to the application layer. This approach gives the application the freedom to perform different actions depending on the <em>suspicion level</em> and eventually prioritize the job reallocation using the level returned by the accrual failure detector.</p>
<h2 id="implementation">&#x3C6; implementation</h2>
<p>The paper proposes an implementation of the accrual failure detector called &quot;The <code>&#x3C6;</code> accrual failure detector&quot;.</p>
<p>The <code>&#x3C6;</code> is a value that is continuously adjusted depending on the current network conditions.</p>
<p>The <em>heartbeat signals</em> arrive from the network, and each heartbeat interval is stored in a <em>sample window collection</em> which has a fixed size. The <em>sample window collection</em> is used to estimate the distribution of the signals.</p>
<p>The <code>&#x3C6;</code> value is defined as follows:</p>
<p><img src="https://samueleresca.net/content/images/2021/03/phi_definition.png" alt="Detecting node failures and the Phi accrual failure detector" loading="lazy"></p>
<p><code>Tlast</code> is the time at which the failure detector received the most recent heartbeat. <code>tnow</code> is the current timestamp. <code>Plater</code> is the probability that a heartbeat will arrive more than <code>tnow - Tlast</code> after the previous one.</p>
<p>Since we are storing all the incoming intervals (<code>tnow - Tlast</code>) in the <em>sample window collection</em>, <code>Plater</code> can be derived from the cumulative distribution function estimated on those samples.</p>
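<p>Using the exact normal CDF, the definition can be sketched as follows (an illustration of the formula only; as discussed later, the actual implementations approximate the CDF with a logistic function):</p>

```python
import math

# Illustration of phi = -log10(P_later(t_now - t_last)), assuming the
# heartbeat intervals are normally distributed with the window's
# mean and std_dev.
def phi(t_now, t_last, mean, std_dev):
    time_diff = t_now - t_last
    # normal CDF evaluated via the error function
    cdf = 0.5 * (1.0 + math.erf((time_diff - mean) / (std_dev * math.sqrt(2.0))))
    p_later = 1.0 - cdf  # probability a heartbeat arrives after time_diff
    return -math.log10(p_later)
```

<p>When <code>time_diff</code> equals the mean interval, half of the heartbeats are expected to arrive later, so <code>&#x3C6; = -log10(0.5) &#x2248; 0.3</code>; <code>&#x3C6;</code> grows quickly as the silence exceeds the mean.</p>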
<h2 id="pythonimplementation">Python implementation</h2>
<p>The <code>&#x3C6;</code> accrual failure detector concepts described in the paper are already implemented in <a href="https://github.com/akka/akka/blob/master/akka-remote/src/main/scala/akka/remote/PhiAccrualFailureDetector.scala">akka/akka</a>, and a slightly modified version is implemented <a href="http://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf">in Cassandra</a>.</p>
<p>In this section, I&apos;m describing a Python port of the <a href="https://github.com/akka/akka/blob/master/akka-remote/src/main/scala/akka/remote/PhiAccrualFailureDetector.scala">akka implementation</a>. The source is available at the following repo: <a href="https://github.com/samueleresca/phi-accrual-failure-detector">samueleresca/phi-accrual-failure-detector</a>.<br>
The code implements a <code>&#x3C6;</code> accrual failure detector class called <code>PhiAccrualFailureDetector</code>.</p>
<p>Note that the implementation focuses on a single instance of the failure detector. Therefore, it ignores some components that are needed for handling multiple instances, such as a failure detector registry (see: <a href="https://github.com/akka/akka/blob/master/akka-remote/src/main/scala/akka/remote/DefaultFailureDetectorRegistry.scala">akka.remote.DefaultFailureDetectorRegistry</a>).</p>
<h3 id="samplingwindow">Sampling window</h3>
<p>The implementation of the <em>sampling window collection</em> is in the <code>HeartbeatHistory</code> class:</p>
<script src="https://gist.github.com/samueleresca/585485d7340009b9eea8c94e2cb699e6.js"></script>
<p>The <code>HeartbeatHistory</code> class defines a list of <code>intervals</code> and a <code>max_sample_size</code> attribute, which indicates the sampling window size. The class implements some methods that retrieve the <code>mean</code>, the <code>std_dev</code>, and the <code>variance</code> of the distribution.</p>
<p><code>HeartbeatHistory</code> overrides the sum operation by explicitly declaring the <code>__add__</code> definition. The method allows adding a new interval to the <code>intervals</code> collection.<br>
If the size of <code>intervals</code> exceeds the <code>max_sample_size</code> defined for the collection, the implementation removes the oldest value in the list (see the <code>drop_oldest</code> method). This guarantees the fixed size of the collection.</p>
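<p>The same fixed-size behaviour can be obtained in plain Python with a <code>collections.deque</code>, which evicts the oldest element automatically (a simplified sketch with assumed names, not the port's actual class):</p>

```python
from collections import deque
import statistics

# Simplified sampling-window sketch: a bounded deque keeps only the most
# recent max_sample_size intervals.
class SamplingWindow:
    def __init__(self, max_sample_size):
        self.intervals = deque(maxlen=max_sample_size)

    def add(self, interval_ms):
        self.intervals.append(interval_ms)  # oldest value dropped when full

    def mean(self):
        return statistics.mean(self.intervals)

    def std_dev(self):
        return statistics.pstdev(self.intervals)

window = SamplingWindow(max_sample_size=3)
for interval in (1000, 1100, 900, 1000):
    window.add(interval)  # the first interval (1000) gets evicted
```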
<p>The <code>_HearbeatHistory</code> class is encapsulated by a <code>_State</code> class:</p>
<script src="https://gist.github.com/samueleresca/e4d156162b7feb6f9625e1230e10d422.js"></script>
<p>The <code>_State</code> class represents the accrual failure detector state, and it stores the heartbeat history and the latest heartbeat timestamp (<code>Tlast</code>). The class instance will be wrapped into an <code>AtomicReference</code> to guarantee the thread-safety.</p>
<h3 id="heartbeatmethod">Heartbeat method</h3>
<p>The <code>heartbeat</code> method handles the incoming signal from the network. The implementation is described here:</p>
<script src="https://gist.github.com/samueleresca/b0da0754e3bb86c12c822c0c80a8a3f4.js"></script>
<p>The code above omits some components of the class and focuses only on the <code>heartbeat</code> method.</p>
<p>The implementation uses the <code>get_time()</code> function to retrieve the current time in <code>ms</code>. The <code>get_time()</code> function also makes it easy to mock the current time in the testing phase.</p>
<p>In case the current state is not defined, it initializes the state with the first heartbeat, represented by the <code>first_heartbeat_estimate_ms</code> attribute.</p>
<p>It proceeds by calculating the <code>tnow - Tlast</code> interval and stores it in the heartbeat history state.</p>
<p>The <code>state</code> is wrapped into an <code>AtomicReference</code> type to handle the cases of multiple threads trying to access the attribute concurrently. The implementation calls the <code>compare_and_set</code> method for comparing the old state with the expected one. If the states do not match, the method retries recursively by calling itself.</p>
<h3 id="calculatingthephivalue">Calculating the Phi value</h3>
<p>The <code>phi</code> method calculates and returns the actual value of <code>&#x3C6;</code> computed using the <code>HeartbeatHistory</code> instance encapsulated in the <code>state</code> attribute:</p>
<script src="https://gist.github.com/samueleresca/59747f7ddc43f677cd5832dc1e54a22e.js"></script>
<p>The core of the implementation is defined in the <code>self.calc_phi</code> function. Given the <code>time_diff</code>, the <code>mean</code>, and the <code>std_dev</code> of the distribution, the function computes the logistic approximation of the cumulative normal distribution (for the details, see &quot;A logistic approximation to the cumulative normal distribution&quot; at the bottom of the post). Once the <code>phi</code> value is calculated, it is returned to the caller.</p>
<p>The <code>self.calc_phi</code> function wraps the <code>math.exp</code> operation with a <code>try-except</code> block. If the operation exceeds the floating-point range and raises an <code>OverflowError</code> exception, it assigns a <code>float(&apos;inf&apos;)</code> value to <code>e</code>. In case the argument of the <code>math.exp</code> operation is a very large negative value, the result will be rounded to <code>0</code>.</p>
<p>Another aspect to notice is the different calculation made depending on the <code>timeDiff &gt; mean</code> condition. This is because of a floating-point precision-loss concern that is well-described in the original Akka issue: <a href="https://github.com/akka/akka/issues/1821">akka/issues/1821</a>.</p>
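<p>Re-sketched outside the class, the computation looks roughly like this (a hedged sketch of the approach, not the port's exact code; the constants come from the logistic approximation paper referenced at the bottom of the post):</p>

```python
import math

# Standalone sketch of the logistic approximation of the normal CDF,
# with the overflow guard described above.
def calc_phi(time_diff, mean, std_dev):
    y = (time_diff - mean) / std_dev
    try:
        e = math.exp(-y * (1.5976 + 0.070566 * y * y))
    except OverflowError:
        e = float('inf')  # very negative y: phi collapses towards 0
    if time_diff > mean:
        # avoid precision loss when e/(1+e) is close to 0
        return -math.log10(e / (1.0 + e))
    return -math.log10(1.0 - 1.0 / (1.0 + e))
```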
<h2 id="usageexample">Usage example</h2>
<p>The following code describes a simple usage of the <code>PhiAccrualFailureDetector</code> class. It mocks the timings by overriding the <code>_get_time()</code> method defined in the class:</p>
<script src="https://gist.github.com/samueleresca/6fc431368aaf73260df8404fbe24c479.js"></script>
<p>The test above defines an <code>Iterable</code> of mocked times as follows:</p>
<pre><code>t0 = 0
t1 = 1000
t2 = 1100
t3 = 1200
t4 = 5200
t5 = 8200
</code></pre>
<p>And it executes a sequence of <code>heartbeat</code> methods, which will populate the <code>heartbeat_history</code> state of the accrual failure detector instance. When the test code calls the <code>is_available</code> method for the first time, the value of <code>&#x3C6;</code> is:</p>
<p><code>&#x3C6;: 0.025714293568000528</code></p>
<p>This is less than the <code>threshold</code> we defined in the class initialization; therefore, <code>is_available</code> will return <code>True</code>. When we call the <code>is_available</code> method for the second time (after skipping the <code>5200</code> time), we have a <code>&#x394;t = t5 - t3 = 8200 - 1200 = 7000</code>, which will lead to the following value of <code>&#x3C6;</code> :</p>
<p><code>&#x3C6;: 109.21058212993705</code></p>
<p>Therefore, the second <code>is_available</code> call returns <code>False</code>, since the <code>&#x3C6;</code> value is greater than the <code>threshold=3</code> we defined in the initialization.</p>
<h2 id="conclusion">Conclusion</h2>
<p>This post goes through some concepts of the <code>&#x3D5;</code> Accrual failure detector paper, and it describes a concrete Python implementation available at the following link: <a href="https://github.com/samueleresca/phi-accrual-failure-detector">phi-accrual-failure-detector</a>. The code uses a fixed threshold (<code>phi_value &lt; threshold</code>) to decide if a node/process is available or not. Still, the resulting <code>&#x3C6;</code> value is dynamic, and an implementation could eventually assign different levels of availability depending on the resulting <code>&#x3C6;</code> value.</p>
<p>Below there are the references I used to write this post.</p>
<h2 id="references">References</h2>
<p><a href="https://ieeexplore.ieee.org/document/1353004">The &#x3D5; Accrual Failure Detector - Naohiro Hayashibara, Xavier D&#xE9;fago, Rami Yared and Takuya Katayama</a></p>
<p><a href="https://doc.akka.io/docs/akka/current/typed/failure-detector.html">Phi Accrual Failure Detector - Akka documentation</a></p>
<p><a href="https://github.com/akka/akka/blob/master/akka-remote/src/main/scala/akka/remote/PhiAccrualFailureDetector.scala">akka/akka source code</a></p>
<p><a href="https://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf">Cassandra - A Decentralized Structured Storage System</a></p>
<p><a href="https://core.ac.uk/display/41787448">A logistic approximation to the cumulative normal distribution</a></p>
<p><a href="https://medium.com/@arpitbhayani/phi-%CF%86-accrual-failure-detection-79c21ce53a7a">Phi &#x3C6; Accrual Failure Detection - @arpitbhayani</a></p>
<p><a href="https://unsplash.com/@diesektion">Cover photo by @diesektion</a></p>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Large-Scale Data Quality Verification in .NET PT.1]]></title><description><![CDATA[The quality testing of large data-sets plays an essential role in the reliability of data-intensive applications. The business decisions of companies rely on machine learning models; for this reason, data quality has gained a lot of importance. ]]></description><link>https://samueleresca.net/large-data-quality-verification/</link><guid isPermaLink="false">63d9a4dc2bc5da0fb16ac18d</guid><category><![CDATA[.NET Core]]></category><category><![CDATA[dotnetcore]]></category><category><![CDATA[dotnet]]></category><dc:creator><![CDATA[Samuele Resca]]></dc:creator><pubDate>Wed, 02 Sep 2020 12:04:32 GMT</pubDate><media:content url="https://samueleresca.net/content/images/2020/08/Screenshot-2020-08-27-at-23.23.52.jpg" medium="image"/><content:encoded><![CDATA[<!--kg-card-begin: markdown--><img src="https://samueleresca.net/content/images/2020/08/Screenshot-2020-08-27-at-23.23.52.jpg" alt="Large-Scale Data Quality Verification in .NET PT.1"><p>The quality testing of large data-sets plays an essential role in the reliability of data-intensive applications. The business decisions of companies rely on machine learning models and data analysis; for this reason, data quality has gained a lot of importance. A few months ago, the <a href="https://github.com/awslabs/deequ">awslabs/deequ</a> library caught my attention.</p>
<p>The library helps to define unit tests for data, and it uses Apache Spark to support large data-sets. I started to dig into the implementation, and I&apos;m working on porting the library into the .NET ecosystem: <a href="https://github.com/samueleresca/deequ.NET">samueleresca/deequ.NET</a>.</p>
<h2 id="whydataqualityisimportant">Why is data quality important?</h2>
<p>One thing that I noticed when I jumped into the machine learning world is that ordinary software engineering practices are not enough to guarantee the stability of the codebase. One of the main reasons is well-described in the <a href="https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf">Hidden Technical Debt in Machine Learning Systems</a> paper.</p>
<p>In traditional software projects, the established tools and techniques for code analysis, unit testing, integration testing, and monitoring address the common pitfalls deriving from code dependency debt. Although these tools and techniques are still valid on a machine learning project, they are not enough. In a machine learning project, the ecosystem of components and technologies is broader:</p>
<p><img src="https://samueleresca.net/content/images/2020/08/Screenshot-2020-08-01-at-15.13.57.jpg" alt="Large-Scale Data Quality Verification in .NET PT.1" loading="lazy"></p>
<p>The machine learning code is a minimum part of the whole project. A lot of components are dedicated to the pre-processing/preparation/validation phases, such as the feature extraction part, the data collection, and the data verification. One of the main assertions made by the research paper mentioned above is that the <em>data dependencies cost more than code dependencies</em>. Therefore, the versioning of the upstream data-sets and the quality testing data needs a considerable effort, and it plays an essential role in the reliability of the machine learning project.</p>
<h2 id="implementationdetails">Implementation details</h2>
<p><a href="https://pdfs.semanticscholar.org/3606/af912be399c789c40c1a1a5f16bd9bb84b6e.pdf">The automating large-scale data quality verification research</a> that inspired the deequ library describes the common pitfalls behind the data quality verification and provides a pattern for testing large-scale data-sets. It highlights three data quality dimensions: the <em>completeness</em>, the <em>consistency</em>, and the <em>accuracy</em> of the data.</p>
<p>The <em>completeness</em> represents the degree to which an entity can have all the values needed to describe a real-world object. For example, in the case of relational storage, it is the presence or not of null values.</p>
<p>The <em>consistency</em> refers to the semantic rules of data. More in detail, to all the rules that are related to a data type, a numeric interval, or a categorical column. The <em>consistency</em> dimension also describes the rules that involve multiple columns. For example, if the category value of a record is <code>t-shirt</code>, then the size could be in the range <code>{S, M, L}</code>.</p>
<p>On the other side, the <em>accuracy</em> focuses on the syntactic correctness of the data based on the definition domain. For example, a <code>color</code> field should not have a value of <code>XL</code>. Deequ uses these dimensions as the main reference to understand the data quality of a data-set.</p>
<p>The next sections go through the main components that the original deequ library uses, and show the corresponding implementation in the <a href="https://github.com/samueleresca/deequ.NET">deequ.NET</a> library.</p>
<h3 id="checkandconstraintdeclaration">Check and constraint declaration</h3>
<p>The library uses a declarative syntax for defining the list of <em>checks</em> and the related <em>constraints</em> that are used to assert the data quality of a data-set. Every constraint is identified by a <em>type</em> that describes the purpose, and a set of <em>arguments</em>:</p>
<p><img src="https://samueleresca.net/content/images/2020/08/Screenshot-2020-08-02-at-17.06.19.jpg" alt="Large-Scale Data Quality Verification in .NET PT.1" loading="lazy"></p>
<p>The declarative approach of the library asserts the quality of the data-set in the following way:</p>
<script src="https://gist.github.com/samueleresca/7a5fe4a609db092b2aee881c7bc0fbd0.js"></script>
<p>The <code>VerificationSuite</code> class exposes the API needed to load the data-set (<code>OnData</code>) and to declare the list of checks (<code>AddCheck</code>).<br>
Every check specifies a description, the list of constraints, and a <code>CheckLevel</code>, which defines the severity of the check.<br>
Once we have declared a list of <code>Check</code> instances, we can proceed by calling the <code>Run</code> method, which lazily executes the operations on the data and returns a <code>VerificationResult</code> instance.</p>
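<p>To make the flow above concrete, a minimal check declaration might look like the following sketch. It assumes a <a href="https://github.com/dotnet/spark">dotnet/spark</a> session; the specific constraint methods (<code>IsComplete</code>, <code>IsContainedIn</code>) are used here only as examples of the declarative API:</p>
<pre><code class="language-csharp">// Sketch: load a data-set and run two example constraints against it
SparkSession session = SparkSession.Builder().GetOrCreate();
DataFrame data = session.Read().Json(&quot;products.json&quot;);

VerificationResult result = new VerificationSuite()
    .OnData(data)
    .AddCheck(
        new Check(CheckLevel.Error, &quot;integrity checks&quot;)
            .IsComplete(&quot;id&quot;)                                  // no null values allowed
            .IsContainedIn(&quot;size&quot;, new[] { &quot;S&quot;, &quot;M&quot;, &quot;L&quot; }))   // categorical range
    .Run();
</code></pre>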
<h3 id="verificationoutput">Verification output</h3>
<p>As mentioned above, the verification output is represented by the <code>VerificationResult</code> type. Concretely, this is the core C# definition of <code>VerificationResult</code>:</p>
<script src="https://gist.github.com/samueleresca/0299c298a3699d5f163b06ad8135bd72.js"></script>
<p>The code above introduces the concept of the <code>CheckResult</code> type. The <code>CheckResult</code> class describes the result derived from a check, and it has the following implementation:</p>
<script src="https://gist.github.com/samueleresca/9c95fed3e05f8f61aab32c711a9c37f0.js"></script>
<p>For each executed <code>Check</code>, there is an associated <code>CheckResult</code> that contains the <code>Status</code> of the check and a list of <code>ConstraintResults</code> bound with that check. Therefore, once the <code>VerificationSuite</code> has been executed, it is possible to access the actual results of the checks:</p>
<script src="https://gist.github.com/samueleresca/489245d38336f3d6aef7085630a3d2c5.js"></script>
<p>The <code>Status</code> field represents the overall status of the <code>VerificationResult</code>. In case of failure, it is possible to iterate over every single <code>CheckResult</code> instance and extract the list of <code>ConstraintResults</code>. Furthermore, we can print out a message for every failing constraint together with the actual reason for the failure.</p>
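<p>Given a <code>VerificationResult</code> such as the one above, a failure report could be printed with a loop similar to the following sketch (the exact member names, such as <code>CheckResults</code>, <code>ConstraintStatus</code>, and <code>Message</code>, are assumptions based on the descriptions in this post):</p>
<pre><code class="language-csharp">if (result.Status != CheckStatus.Success)
{
    foreach (CheckResult checkResult in result.CheckResults.Values)
    {
        foreach (ConstraintResult constraintResult in checkResult.ConstraintResults)
        {
            if (constraintResult.Status == ConstraintStatus.Failure)
            {
                // Print the failing constraint and the reason for the failure
                Console.WriteLine($&quot;{constraintResult.Constraint}: {constraintResult.Message}&quot;);
            }
        }
    }
}
</code></pre>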
<p>At the foundation of each constraint execution, there is an analyzer that interfaces with the Apache Spark APIs. In the <a href="https://github.com/samueleresca/deequ.NET">deequ.NET</a> implementation, the Spark APIs are provided by the <a href="https://github.com/dotnet/spark">dotnet/spark</a> library. In the following section, we will see how the analyzer classes are abstracted from the rest of the layers of the library.</p>
<h3 id="analyzers">Analyzers</h3>
<p>Analyzers are the foundation of deequ. They implement the operators that compute the metrics used by the constraint instances. For each metric, the library has multiple analyzer implementations that rely on the Apache Spark operators. Therefore, all the logic for communicating with Spark is encapsulated in the analyzers layer.<br>
More in detail, the library uses the following interface to define a generic analyzer:</p>
<script src="https://gist.github.com/samueleresca/921ce516e39ebae2a6a0f6a469b907cf.js"></script>
<p>The interface declares a set of operations that are part of each analyzer&apos;s lifecycle:</p>
<ul>
<li><code>ComputeStateFrom</code> executes the computation of the state based on the <code>DataFrame</code>;</li>
<li><code>ComputeMetricFrom</code> computes and returns the <code>IMetric</code> depending on the state you are passing in;</li>
<li><code>Preconditions</code> returns a set of assertions that must be satisfied by the schema of the <code>DataFrame</code>;</li>
<li><code>Calculate</code> runs the preconditions, calculates, and returns an <code>IMetric</code> instance with the result of the computation. In addition, it optionally accepts an <code>IStateLoader</code> and an <code>IStatePersister</code> interface that can be used to load/persist the state into storage.</li>
</ul>
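<p>Put together, a simplified sketch of the interface could look as follows. The generic constraint, the exact parameter types, and the use of <code>StructType</code> for the schema assertions are assumptions based on the lifecycle described above, not the literal library definition:</p>
<pre><code class="language-csharp">public interface IAnalyzer&lt;out M&gt; where M : IMetric
{
    // Computes the state from the given DataFrame
    Option&lt;IState&gt; ComputeStateFrom(DataFrame dataFrame);

    // Converts a previously computed state into a metric
    M ComputeMetricFrom(Option&lt;IState&gt; state);

    // Assertions that the schema of the DataFrame must satisfy
    IEnumerable&lt;Action&lt;StructType&gt;&gt; Preconditions();

    // Runs the preconditions and calculates the metric, optionally
    // loading/persisting the state from/to storage
    M Calculate(DataFrame data, IStateLoader stateLoader = null,
        IStatePersister statePersister = null);
}
</code></pre>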
<p>Every analyzer implements the <code>IAnalyzer</code> interface to provide the core functionalities needed to run the operations in a distributed manner using the underlying Spark implementation. In addition to the <code>IAnalyzer</code>, the library also defines three additional interfaces: <code>IGroupingAnalyzer</code>, <code>IScanShareableAnalyzer</code>, and <code>IFilterableAnalyzer</code>.</p>
<p>The <code>IScanShareableAnalyzer</code> interface identifies an analyzer that runs a set of aggregation functions over the data and that can share scans over the data. The <code>IScanShareableAnalyzer</code> enriches the analyzer with the <code>AggregationFunctions</code> method, used to retrieve the list of the aggregation functions, and the <code>FromAggregationResult</code> method, used to return the state calculated from the execution of the aggregation functions.</p>
<p>The <code>IGroupingAnalyzer</code> interface identifies the analyzers that group the data by a specific set of columns. It defines the <code>GroupingColumns</code> method, which retrieves the list of grouping columns.</p>
<p>The <code>IFilterableAnalyzer</code> describes an analyzer that implements a filter condition on the fields, and it enriches each implementation with the <code>FilterCondition</code> method.</p>
<p>Let&apos;s continue with an example: the implementation of the <code>MaxLength</code> analyzer. As the name suggests, the purpose of this analyzer is to compute the maximum length of a string column in the data-set:</p>
<script src="https://gist.github.com/samueleresca/ae0624d30ace040e3cdda41970fa154a.js"></script>
<p>The class defines two properties: the <code>string Column</code> and the <code>Option&lt;string&gt; Where</code> condition of the analyzer. The <code>Where</code> condition is returned as the value of the <code>FilterCondition</code> method.<br>
The <code>AggregationFunctions</code> method calculates the <code>Length</code> of the field specified by the <code>Column</code> attribute, and it applies the <code>Max</code> function to the length of the specified <code>Column</code>. The Spark API exposes both the <code>Length</code> and the <code>Max</code> functions used in the <code>AggregationFunctions</code> method.<br>
Also, the class implements the <code>AdditionalPreconditions</code> method, which checks that the <code>Column</code> property of the class is present in the data-set and that the field is of type string. Finally, the analyzer instance will then be executed by the <code>ComputeStateFrom</code> method implemented in the <code>ScanShareableAnalyzer</code> parent class:</p>
<script src="https://gist.github.com/samueleresca/794619e6822c9d599c95e49f9de83e82.js"></script>
<p>The <code>IState</code> resulting from the execution of the above method is then eventually combined with the previous states persisted in memory and converted into a resulting <code>IMetric</code> instance by the <a href="https://github.com/samueleresca/deequ.net/blob/5343a4b362260117af33a3fa7cab5ddc34f8062d/src/deequ/Analyzers/Analyzer.cs#L64">Analyzer.CalculateMetric</a> method.</p>
<h3 id="incrementalcomputationofmetrics">Incremental computation of metrics</h3>
<p>In a real-world scenario, ETLs usually import batches of data, and the data-sets continuously grow in size with new data. Therefore, it is essential to support situations where the resulting metrics of the analyzers can be stored and calculated using an incremental approach. The research paper that inspired deequ describes the incremental computation of the metrics in the following way:</p>
<p><img src="https://samueleresca.net/content/images/2020/08/Screenshot-2020-08-19-at-20.38.48.png" alt="Large-Scale Data Quality Verification in .NET PT.1" loading="lazy"></p>
<p>On the left, you have the batch computation that is repeated every time the input data-set grows (&#x394;D). This approach needs access to the previous data-sets, and it results in a greater computational effort.<br>
On the right side, the data-set growth (&#x394;D) is combined with the state (<em>S</em>) of the previous computation. Therefore, the system does not need to re-scan the previous data-sets every time a new batch of data is processed.</p>
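<p>The key idea is that each metric is backed by a mergeable state. The following sketch is not part of deequ; it only illustrates the concept, showing how an average can be maintained as a <code>(sum, count)</code> state that is merged with the state of a new batch instead of re-reading the previous data:</p>
<pre><code class="language-csharp">// Hypothetical mergeable state for a mean metric
public class MeanState
{
    public double Sum { get; }
    public long Count { get; }

    public MeanState(double sum, long count) =&gt; (Sum, Count) = (sum, count);

    // Combining two states does not require access to the original rows
    public MeanState Merge(MeanState other) =&gt;
        new MeanState(Sum + other.Sum, Count + other.Count);

    public double Metric =&gt; Count == 0 ? 0 : Sum / Count;
}

// Usage: merge the state of the initial data-set D with the state of a new batch &#x394;D
// var mean = new MeanState(10, 4).Merge(new MeanState(6, 2)).Metric; // (10 + 6) / (4 + 2)
</code></pre>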
<p>The incremental computation method we described is achievable using the APIs exposed by deequ.</p>
<p>The following example demonstrates how to implement the incremental computation using the following sample:</p>
<table>
<thead>
<tr>
<th style="text-align:center">id</th>
<th style="text-align:center">manufacturerName</th>
<th style="text-align:center">countryCode</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:center">1</td>
<td style="text-align:center">ManufacturerA</td>
<td style="text-align:center">DE</td>
</tr>
<tr>
<td style="text-align:center">2</td>
<td style="text-align:center">ManufacturerB</td>
<td style="text-align:center">DE</td>
</tr>
<tr>
<td style="text-align:center">3</td>
<td style="text-align:center">ManufacturerC</td>
<td style="text-align:center">DE</td>
</tr>
<tr>
<td style="text-align:center">4</td>
<td style="text-align:center">ManufacturerD</td>
<td style="text-align:center">US</td>
</tr>
<tr>
<td style="text-align:center">5</td>
<td style="text-align:center">ManufacturerE</td>
<td style="text-align:center">US</td>
</tr>
<tr>
<td style="text-align:center">6</td>
<td style="text-align:center">ManufacturerG</td>
<td style="text-align:center">CN</td>
</tr>
</tbody>
</table>
<p>and the snippet of code defined here:</p>
<script src="https://gist.github.com/samueleresca/14edcc54c6d5766e339a4e29e44f7817.js"></script>
<p>The <code>LoadData</code> method loads data with the schema defined in the table above into three different data-sets, using the <code>countryCode</code> as a partition key. Also, the code defines a new check using the following constraint methods: <code>IsComplete</code>, <code>ContainsURL</code>, <code>IsContainedIn</code>. The resulting analyzers (obtained by calling the <code>RequiredAnalyzers()</code> method) are then passed into a new instance of the <code>Analysis</code> class.<br>
The code also defines 3 different <code>InMemoryStateProvider</code> instances, and it executes the <code>AnalysisRunner.Run</code> method for each country code (<code>DE</code>, <code>US</code>, <code>CN</code>) by passing the corresponding <code>InMemoryStateProvider</code>.</p>
<p>The mechanism of aggregated states (<code>AnalysisRunner.RunOnAggregatedStates</code> method) provides a way to merge the 3 in-memory states: <code>dataSetDE</code>, <code>dataSetUS</code> and <code>dataSetCN</code> into a unique table of metrics. It is important to notice that the operation does not trigger any re-computation of the data sample.</p>
<p>Once we have a unique table of metrics, it is also possible to increment only one partition of the data. For example, let&apos;s assume that the <code>US</code> partition changes and the data increases: the system only recomputes the state of the changed partition to update the metrics for the whole table:</p>
<script src="https://gist.github.com/samueleresca/799104fb8acef5b65143093f1b662118.js"></script>
<p>It is essential to notice that the schema of the data must be the same for every data-set state you need to aggregate. This approach results in a lighter computational effort when you have to refresh the metrics of a single partition of your data-set.</p>
<h2 id="handlethescalafunctionalapproach">Handle the Scala functional approach</h2>
<p>The official <a href="https://github.com/awslabs/deequ">awslabs/deequ</a> implementation is written in Scala, which is also the official language of Apache Spark. The strong object-oriented nature of C# adds more difficulties in replicating some of the concepts used by the original Scala <a href="https://github.com/awslabs/deequ">deequ</a> library. An example is the widespread use of the <code>Try</code> and <code>Option</code> monads. Fortunately, it is not the first time that someone ports a Scala library to C#/.NET: <a href="https://github.com/akkadotnet/akka.net">Akka.NET</a> (a port of Akka) has a <a href="https://github.com/akkadotnet/akka.net/wiki/Scala-to-C%23-Conversion-Guide">handy guide</a> that gives some conversion suggestions for doing that. The Akka.NET repository also provides some implementation utilities, such as the <code>Try&lt;T&gt;</code> and <code>Option&lt;T&gt;</code> monads for C#, which are also used by the deequ.NET code.</p>
<h2 id="finalthoughts">Final thoughts</h2>
<p>This post described the initial work that I did to port the deequ library into the .NET ecosystem. We have seen an introduction to some of the components that are part of the architecture of the library, such as the <em>checks</em> part, the <em>constraint API</em>, the <em>analyzers layer</em>, and the <em>batch vs. incremental computation</em> approach.</p>
<p><img src="https://samueleresca.net/content/images/2020/08/Screenshot-2020-08-25-at-22.36.05-2.jpg" alt="Large-Scale Data Quality Verification in .NET PT.1" loading="lazy"></p>
<p>I&apos;m going to cover the rest of the core topics of the library in a future post, such as the <em>metrics history</em>, the <em>anomaly detectors</em>, and the <em>deployment part</em>.</p>
<p>In the meantime, this is the repository where you can find the actual library implementation, <a href="https://github.com/samueleresca/deequ.net">samueleresca/deequ.NET</a>, and this is the original <a href="https://github.com/awslabs/deequ">awslabs/deequ</a> library.</p>
<h3 id="references">References</h3>
<p><a href="http://www.vldb.org/pvldb/vol11/p1781-schelter.pdf">Automating large-scale data quality verification.</a></p>
<p><a href="https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf">Hidden Technical Debt in Machine Learning Systems.</a></p>
<p><a href="https://github.com/akkadotnet/akka.net/wiki/Scala-to-C%23-Conversion-Guide">akka.net - Scala to C# Conversion Guide.</a></p>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Notes on threading]]></title><description><![CDATA[ This post shares some personal notes related to multithreading in C#. ]]></description><link>https://samueleresca.net/multithreading-in-c-notes/</link><guid isPermaLink="false">63d9a4dc2bc5da0fb16ac18c</guid><category><![CDATA[.NET Core]]></category><category><![CDATA[C#]]></category><category><![CDATA[threading]]></category><category><![CDATA[ASP.NET]]></category><category><![CDATA[ASP.NET Core]]></category><dc:creator><![CDATA[Samuele Resca]]></dc:creator><pubDate>Thu, 07 May 2020 13:02:25 GMT</pubDate><media:content url="https://samueleresca.net/content/images/2020/08/Screenshot-2020-08-25-at-11.45.03.png" medium="image"/><content:encoded><![CDATA[<img src="https://samueleresca.net/content/images/2020/08/Screenshot-2020-08-25-at-11.45.03.png" alt="Notes on threading"><p>This post shares some personal notes related to multithreading in C#. Although it describes several concepts of multithreading in C#, some of the topics can be similar in other typed languages. The following notes are the results of readings, interview prep that I had in the past years. The post implements the concepts on a concrete <code>LRUCache</code> example. The code related to the post is <a href="https://github.com/samueleresca/Blog.LRUCacheThreading">available on GitHub</a>.</p><!--kg-card-begin: markdown--><p>Below the topics described in this post:</p>
<ul>
<li><a href="#lowlevelapithreadinitialization">Low-level API - thread initialization</a></li>
<li><a href="#threadpoolthreadinitialization">Thread Pool - thread initialization</a></li>
<li><a href="#threadpoolfromataskperspective">Thread pool from a Task perspective</a>
<ul>
<li><a href="#starvationissue">Starvation issue</a></li>
</ul>
</li>
<li><a href="#threadsafetyandsynchronization">Thread safety and synchronization</a>
<ul>
<li><a href="#lrucacheandlockingconstructs">LRUCache and locking constructs</a></li>
</ul>
</li>
<li><a href="#exampleofsignalingbetweenthreads">Example of signaling between threads</a></li>
</ul>
<!--kg-card-end: markdown--><!--kg-card-begin: markdown--><h2 id="lowlevelapithreadinitialization">Low-level API - thread initialization</h2>
<!--kg-card-end: markdown--><!--kg-card-begin: markdown--><p>There are different ways of initializing threads. .NET provides a different level of abstractions for the thread initializations. First of all, an intuitive approach is to declare a <code>new Thread</code> instance in your code:</p>
<pre><code class="language-csharp">using System;
using System.Threading;

namespace Blog.LRUCacheThreadSafe
{
    class Program
    {
        static void Main(string[] args)
        {
            new Thread(myMethod).Start();
        }

        static void myMethod() =&gt; Console.WriteLine(&quot;Running myMethod&quot;);
    }
}
</code></pre>
<p>The <code>Thread</code> class exposes properties to check the state of the instance of the thread. At the same time, it is possible to alter the state of the thread by calling the methods exposed by the instance, e.g., the <code>Start()</code> method starts the execution of the thread. It is also possible to wait for the completion of a thread by using the <code>Join()</code> method exposed by the <code>Thread</code> instance:</p>
<pre><code class="language-csharp">using System;
using System.Threading;

namespace Blog.LRUCacheThreadSafe
{
    class Program
    {
        static void Main(string[] args)
        {
            Thread thread = new Thread(myMethod);
            thread.Start();
            thread.Join();
            Console.WriteLine(&quot;myMethod execution ended&quot;);
        }

        static void myMethod() =&gt; Console.WriteLine(&quot;Running myMethod&quot;);
    }
}
</code></pre>
<p>The <code>Join()</code> method provides a way to wait until the execution of the <code>thread</code> instance is completed. After that, the execution proceeds on the main thread, in this case, in the <code>Program.Main</code> method. Every time we use the <code>Join()</code> method, the calling thread is blocked.</p>
<p>One important fact to notice is that the <code>new Thread</code> initialization does not use the thread pool; we can verify that by checking the <code>IsThreadPoolThread</code> property exposed by the <code>thread</code> object.</p>
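<p>For example, the following snippet shows the difference between a manually created thread and a thread pool thread:</p>
<pre><code class="language-csharp">using System;
using System.Threading;

class Program
{
    static void Main()
    {
        Thread thread = new Thread(() =&gt;
            Console.WriteLine($&quot;Pool thread: {Thread.CurrentThread.IsThreadPoolThread}&quot;)); // prints False
        thread.Start();
        thread.Join();

        ThreadPool.QueueUserWorkItem(_ =&gt;
            Console.WriteLine($&quot;Pool thread: {Thread.CurrentThread.IsThreadPoolThread}&quot;)); // prints True
        Thread.Sleep(100); // give the pool thread time to run before the process exits
    }
}
</code></pre>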
<p>Because of the lack of abstraction over the low-level nature of the <code>Thread</code> class, Microsoft introduced multiple abstractions over this class. Most of them rely on the thread pool to improve the efficiency of the code. Furthermore, it is strongly suggested to use the higher-level APIs to achieve a parallel workflow in .NET.<br>
As we will see later, the <code>Thread</code> class is still useful to cover some of the core concepts related to synchronization and messaging between threads.</p>
<!--kg-card-end: markdown--><!--kg-card-begin: markdown--><h2 id="threadpoolthreadinitialization">Thread Pool - thread initialization</h2>
<!--kg-card-end: markdown--><!--kg-card-begin: markdown--><p>The creation of a new <code>Thread</code> comes with a considerable overhead in terms of time and resources. For this reason, Microsoft built an abstraction that delegates the management of the OS threads to the <em>Thread Pool</em> implemented in the CLR. One of the <em>Thread Pool</em>&apos;s main features is to optimize the creation and destruction of threads by re-using them. Therefore, it is more suitable for short-running tasks.<br>
The <em>Thread Pool</em> way of working can be described in the following way:</p>
<p><img src="https://samueleresca.net/content/images/2020/03/Screenshot-2020-03-15-at-19.31.06.png" alt="Notes on threading" loading="lazy"></p>
<p>There is a queue of tasks that are processed by the N threads present in the <em>Thread Pool</em>. The queuing operation is implemented using the following method:</p>
<p><code>ThreadPool.QueueUserWorkItem(state =&gt; { ... });</code></p>
<p>More in detail, the queuing process uses the <em>thread injection and retirement algorithm</em> described in <a href="http://aviadezra.blogspot.com/2009/06/net-clr-thread-pool-work.html">this post</a>, which proceeds with the following workflow:</p>
<p><img src="https://samueleresca.net/content/images/2020/03/Screenshot-2020-03-15-at-21.45.46.png" alt="Notes on threading" loading="lazy"></p>
<p>When a new task is queued and there are no available threads, the Thread Pool verifies whether the running threads have reached the maximum number of threads; if so, it waits until a running thread completes. Otherwise, it checks if the number of running threads is less than the minimum, and in that case, it creates a new thread immediately.<br>
Furthermore, if the number of running threads has already reached the allowed minimum, the creation of new threads is throttled.</p>
<p>The thread pool can be configured with a maximum / minimum number of threads using the following approach:</p>
<p><code>ThreadPool.SetMaxThreads(int workerThreads, int completionPortThreads);</code><br>
<code>ThreadPool.SetMinThreads(int workerThreads, int completionPortThreads);</code></p>
<p>To understand these methods, it is also essential to distinguish between two different types of threads:</p>
<ul>
<li>the <code>workerThreads</code> parameter sets the number of <em>worker threads</em> which refers to all the CPU-bound threads used by the application logic;</li>
<li>the <code>completionPortThreads</code> (not used in UNIX-like OS) parameter sets the amount of <em>I/O threads</em> which refers to all the I/O-bound operations that use resources other than CPUs;</li>
</ul>
<p>These two types were unified with the introduction of the <code>System.Threading.Tasks</code> namespace.</p>
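<p>For example, the current limits can be inspected and adjusted as follows:</p>
<pre><code class="language-csharp">using System;
using System.Threading;

class Program
{
    static void Main()
    {
        ThreadPool.GetMinThreads(out int minWorker, out int minIo);
        ThreadPool.GetMaxThreads(out int maxWorker, out int maxIo);
        Console.WriteLine($&quot;Min: {minWorker}/{minIo} - Max: {maxWorker}/{maxIo}&quot;);

        // Raising the minimum allows the pool to inject threads without delay
        // up to the new value (use with care: it can hide design issues)
        ThreadPool.SetMinThreads(minWorker * 2, minIo);
    }
}
</code></pre>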
<h2 id="threadpoolfromataskperspective">Thread pool from a Task perspective</h2>
<p>Starting from CLR 4, the Thread Pool&apos;s way of queuing tasks from the <code>System.Threading.Tasks</code> perspective has been optimized to improve the handling of multiple nested asynchronous tasks. CLR 4 introduced a double-queue architecture:</p>
<ul>
<li>the <em>global queue</em>, used by the main thread to queue tasks in FIFO order;</li>
<li>the <em>local queues</em>, one for each worker thread, used to queue the work in the context of a specific task; a local queue works like a stack (LIFO order);</li>
</ul>
<p><img src="https://samueleresca.net/content/images/2020/03/Screenshot-2020-03-17-at-21.13.13.png" alt="Notes on threading" loading="lazy"></p>
<p>The above image (1) describes the process of queuing two tasks: <code>Task 1</code> and <code>Task 2</code>. <code>Task 2</code> has a nested task, called <code>Task 3</code>. <code>Task 1</code> and <code>Task 2</code> are queued in the <em>global queue</em> and picked up respectively by the <code>Worker Thread 1</code> and the <code>Worker Thread 2</code>, while <code>Task 3</code> is queued in the local queue of the <code>Worker Thread 1</code> (2).<br>
Now two things can happen:</p>
<ul>
<li>if the <code>Worker Thread 1</code> ends processing <code>Task 1</code> before the other worker threads are free, it is going to pick up <code>Task 3</code> from its <em>local queue</em>;</li>
<li>if the <code>Worker Thread 2</code> ends processing <code>Task 2</code> before the other workers, it is going to look into its <em>local queue</em> for another task; if the local queue is empty, it will proceed by checking the global queue; and finally, if the global queue is also empty, it is going to pick up <code>Task 3</code> from the <em>local queue</em> of the <code>Worker Thread 1</code>. This is defined as <em>work-stealing</em>.</li>
</ul>
<p>Recently, the corefx repository of .NET Core implemented the possibility to queue work items directly to the local queues through the <code>preferLocal</code> parameter of the <code>ThreadPool.QueueUserWorkItem</code> method: <a href="https://github.com/dotnet/corefx/pull/24396">dotnet/corefx/pull/24396</a>.</p>
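<p>A sketch of the overload introduced by that pull request looks like this:</p>
<pre><code class="language-csharp">// Queues the callback to the current thread&apos;s local queue, when possible,
// instead of the global queue
ThreadPool.QueueUserWorkItem(
    state =&gt; Console.WriteLine(state), &quot;hello&quot;, preferLocal: true);
</code></pre>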
<h3 id="starvationissue">Starvation issue</h3>
<p>The <em>Thread pool starvation</em> issue became more frequent with the increasing need for high-scale async services. Its main symptom is a service that is not able to handle the increasing number of incoming requests while the CPUs are still not fully utilized.</p>
<p>This symptom can point your attention to 3rd-party component calls. Furthermore, it is also possible to see a method call that takes a long time to complete its execution.</p>
<p>However, sometimes the stall can be caused by the lack of threads that can run the next step in servicing the request, so it merely stalls, waiting for the availability of a thread. This time is registered as execution time for the asynchronous method; therefore, it seems that the operation randomly takes longer than it should.</p>
<p>There is the following <a href="https://labs.criteo.com/2018/10/net-threadpool-starvation-and-how-queuing-makes-it-worse/">blog post by Criteo Labs</a> that explains the issue in detail. Let&apos;s take the example they describe in the post:</p>
<script src="https://gist.github.com/samueleresca/273f0660746de41c67d8d1b068896d9a.js"></script>
<p>The above implementation uses the <code>System.Threading.Task</code> APIs to simulate a thread starvation issue through some blocking calls. Mainly, this set of rules describes the enqueuing behaviors:</p>
<ul>
<li>
<p>Every time you execute <code>Task.Factory.StartNew</code> from the main thread, the task will be added to the <em>global queue</em>;</p>
</li>
<li>
<p>The <code>ThreadPool.QueueUserWorkItem</code> method adds the execution to the <em>global queue</em> unless you specify the <code>preferLocal</code> parameter, see <a href="https://github.com/dotnet/corefx/pull/24396">#24396</a>;</p>
</li>
<li>
<p>The <code>Task.Yield</code> instruction adds the <code>Task</code> to the <em>global queue</em>, unless a non-default task scheduler is in use;</p>
</li>
<li>
<p>Every time you execute <code>Task.Factory.StartNew</code> from inside a worker thread, it will add the task to <em>local queue</em>;</p>
</li>
<li>
<p>In all other cases, the <code>Task</code> is scheduled on the <em>local queue</em>;</p>
</li>
</ul>
<p>In the code above, the main thread spawns a new <code>Process</code> method execution every <code>200 msec</code> in the global queue. The <code>Process</code> method triggers another task in the local queue using the <code>Task.Run</code> method, and it waits for the completion of the execution. When a thread pool thread looks for work, it follows this check sequence, as mentioned previously:</p>
<ul>
<li>check if there are items in its <em>local queue</em>;</li>
<li>check if there are items in the <em>global queue</em>;</li>
<li>check the <em>local queues</em> of other threads (work-stealing);</li>
</ul>
<p>The <code>Producer</code> method has already queued another set of tasks in the global queue, and because the rate at which tasks are queued into the global queue is higher than the rate at which the thread pool creates new threads, there is a saturation of tasks.<br>
You can see it by executing the application and checking the output, which in most cases is blocked:</p>
<pre><code>Ended - 18:38:20
Ended - 18:38:20
Ended - 18:38:20
Ended - 18:38:21
Ended - 18:38:21
Ended - 18:38:21
Ended - 18:38:22
Ended - 18:38:22
</code></pre>
<p>Furthermore, we can see that the number of threads of the process increases during the execution time without stabilizing, while the CPU usage stays low:</p>
<p><img src="https://samueleresca.net/content/images/2020/03/Screenshot-2020-03-21-at-19.09.07.png" alt="Notes on threading" loading="lazy"></p>
<p>In this case, the <code>Process</code> method is queued into the global queue at a regular interval. The <code>Main</code> method queues tasks more quickly than the thread pool can create new threads; therefore, the application reaches a condition of starvation. Because the bottleneck derives from the global queue, a common thought could be to de-prioritize it, see <a href="https://github.com/dotnet/runtime/issues/28295">#28295</a>, but the work-stealing between local queues comes with a cost: in some cases, it may cause performance degradation in the system. Asynchronous operations are becoming more common these days; it is essential to pay attention to this kind of issue and to avoid mixing asynchronous with synchronous stacks. Moving from a synchronous stack to async can be dangerous, see the <a href="https://devblogs.microsoft.com/devopsservice/?p=17665">post-mortem of the Azure DevOps Services outages in October 2018</a>.</p>
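<p>As a rule of thumb, blocking on asynchronous work keeps a pool thread busy while it waits, whereas awaiting releases the thread back to the pool. A minimal sketch:</p>
<pre><code class="language-csharp">using System;
using System.Threading.Tasks;

class Program
{
    static async Task&lt;string&gt; FetchAsync()
    {
        await Task.Delay(100); // simulates an I/O-bound call
        return &quot;done&quot;;
    }

    static async Task Main()
    {
        // string blocked = FetchAsync().Result; // sync-over-async: blocks a pool thread
        Console.WriteLine(await FetchAsync());   // releases the thread while waiting
    }
}
</code></pre>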
<!--kg-card-end: markdown--><!--kg-card-begin: markdown--><h2 id="threadsafetyandsynchronization">Thread safety and synchronization</h2>
<p>A snippet of code is <em>thread-safe</em> when, called by multiple threads, it does not cause race conditions. Race conditions mainly happen when multiple threads share the same resources. It is crucial to highlight that:</p>
<ul>
<li>every variable stored in the <em>stack</em> is thread-safe, since each stack belongs to a single thread;</li>
<li>everything else, stored in the <em>heap</em>, can potentially be shared by multiple threads;</li>
</ul>
<p>Furthermore, we can take some simple examples as a reference. Everything declared as a local variable is in general thread-safe:</p>
<script src="https://gist.github.com/samueleresca/44b3e47dc12bdec391e1d63488fd869b.js"></script>
<p>In this case, the variable has a primitive type, and it is locally declared. Therefore, it is thread-safe out of the box because it is stored in the stack.</p>
<p>Let&apos;s proceed with the following example that declares a reference type inside the method:</p>
<script src="https://gist.github.com/samueleresca/c390db286074976357c684d22a7ea0e0.js"></script>
<p>The pointer stored in the stack refers to an instance of the object stored in the heap memory. Since the object is never returned and it is only used by the <code>Initializer</code> and the <code>Handler</code> methods, you can declare it thread-safe. Let&apos;s consider the situation where the <code>Handler</code> assigns the object to an attribute of the class, as follows:</p>
<script src="https://gist.github.com/samueleresca/2f068c5ba284291785915ee641ce4d5e.js"></script>
<p>The attributes of a class are stored by default along with the object in the heap. Therefore, the above <code>Handler(string value)</code> method is not thread-safe; if it is used by multiple threads, it can lead to a <em>race condition</em>:</p>
<pre><code class="language-csharp">ThreadUnsafeClass threadUnsafeClass = new ThreadUnsafeClass();
new Thread(() =&gt; threadUnsafeClass.Handler(&quot;value1&quot;)).Start();
new Thread(() =&gt; threadUnsafeClass.Handler(&quot;value2&quot;)).Start();
</code></pre>
<p>On the opposite side, you can use techniques such as immutability in order to avoid race conditions:</p>
<pre><code class="language-csharp">new Thread(() =&gt; new ThreadUnsafeClass().Handler(&quot;value1&quot;)).Start();
new Thread(() =&gt; new ThreadUnsafeClass().Handler(&quot;value2&quot;)).Start();
</code></pre>
<p>In the above snippet, each <code>Thread</code> deals with a separate instance of the <code>ThreadUnsafeClass</code> type, so the two threads do not share any data.<br>
Another way to avoid race conditions is to use the <em>locking constructs</em> provided by the framework.</p>
<h3 id="lrucacheandlockingconstructs">LRUCache and locking constructs</h3>
<p>The locking constructs are used to coordinate the usage of shared resources. To describe the locking constructs provided by .NET Core, we are going to use the following example:</p>
<script src="https://gist.github.com/samueleresca/351a1fc6e15485467ffcefb3dd0ecafa.js"></script>
<p>The above code describes a common <em>LRU cache</em> implementation. The code defines a <code>Dictionary</code> called <code>_records</code>, which contains the <code>id -&gt; value</code> mapping of each cache record. The <code>_freq</code> attribute stores the order in which records were last accessed by referring to the keys of the records. The <code>LRUCache&lt;T&gt;</code> type defines two methods, <code>Get</code> and <code>Set</code>, and a <code>_capacity</code> attribute that represents the capacity of the LRU cache. In this example, we deliberately ignore the concurrent collections provided by the CLR.<br>
The following tests verify the behaviors implemented in the <code>LRUCache&lt;T&gt;</code>:</p>
<script src="https://gist.github.com/samueleresca/90887e796221f5a641486bfc493eb04a.js"></script>
<p>The tests lock in some of the behaviors of the <code>LRUCache</code> by exercising the following actions:</p>
<ul>
<li>the cache removes the least-accessed records when the capacity is reached;</li>
<li>the cache correctly stores integer values;</li>
<li>the cache prioritization changes when a record is read;</li>
</ul>
<p>Let&apos;s suppose that we want to access the collection from multiple threads, as follows:</p>
<script src="https://gist.github.com/samueleresca/348970b6c6ebfdbc8150d6ba58fb08ba.js"></script>
<p>The code mentioned above executes a set of <code>Task</code>s on the same <code>LRUCache</code> instance; the execution results in the following exception:</p>
<pre><code>Unhandled exception. System.InvalidOperationException: Operations that change non-concurrent collections must have exclusive access. A concurrent update was performed on this collection and corrupted its state. The collection&apos;s state is no longer correct.
   at System.Collections.Generic.Dictionary`2.FindEntry(TKey key)
   at System.Collections.Generic.Dictionary`2.get_Item(TKey key)
   at Blog.LRUCacheThreadSafe.LRUCache`1.Set(Int32 key, T val) in /Users/samuele.resca/Projects/Blog.LRUCacheThreadSafe/Blog.LRUCacheThreadSafe/LRUCache.cs:line 52
   at Blog.LRUCacheThreadSafe.Program.&lt;&gt;c__DisplayClass0_0.&lt;Main&gt;b__0() in /Users/samuele.resca/Projects/Blog.LRUCacheThreadSafe/Blog.LRUCacheThreadSafe/Program.cs:line 13
   at System.Threading.Tasks.Task.InnerInvoke()
   at System.Threading.Tasks.Task.&lt;&gt;c.&lt;.cctor&gt;b__274_0(Object obj)
</code></pre>
<p>Each <code>Task</code> runs the <code>Set</code> operation on the same <code>LRUCache</code> instance, which results in a race condition between the tasks. To avoid this kind of exception, we can implement a <em>locking process</em> using the constructs available in .NET. The <code>lock</code> statement guarantees that a single thread at a time can access the code block it encloses. It relies on the <code>Monitor.Enter</code> and <code>Monitor.Exit</code> constructs, and it requires a reference-type instance that works as a <em>synchronization object</em>. Therefore, we can wrap our <code>Get</code> and <code>Set</code> methods in the following way:</p>
<script src="https://gist.github.com/samueleresca/1180078ea7fdb57920d423b6a93fc5cd.js"></script>
<p>Now the <code>LRUCache&lt;T&gt;</code> class defines a <code>_locker</code> object that is used to perform the locking operation. Any reference type can be used as a synchronization object; in the case above, we use a plain <code>object</code>. The <code>lock</code> constructs wrap the implementation of the <code>Get</code> and <code>Set</code> methods to guarantee that only one thread accesses them at a time.</p>
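<p>As a minimal sketch of the same pattern (the <code>SafeStore</code> class below is hypothetical and much simpler than the article&apos;s <code>LRUCache&lt;T&gt;</code>), every public member takes the lock on the same private object before touching the shared dictionary:</p>

```csharp
using System.Collections.Generic;

// Minimal sketch: a private synchronization object guards every access
// to the shared dictionary, so concurrent Set calls cannot corrupt it.
public class SafeStore
{
    private readonly object _locker = new object();
    private readonly Dictionary<int, int> _records = new Dictionary<int, int>();

    public void Set(int key, int value)
    {
        lock (_locker)        // Monitor.Enter/Exit under the hood:
        {                     // only one thread at a time mutates _records
            _records[key] = value;
        }
    }

    public int Count
    {
        get { lock (_locker) { return _records.Count; } }
    }
}
```

<p>With this wrapping, many tasks can call <code>Set</code> concurrently without corrupting the dictionary&apos;s internal state.</p>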
<p>An alternative is to use the <code>Semaphore</code> / <code>SemaphoreSlim</code> types. They implement a count-based restriction that allows concurrent access to at most the number of threads defined in the constructor, queuing any further threads. The <code>Semaphore</code> constructor accepts a name to use in case you need a cross-process lock. The <code>SemaphoreSlim</code> type was introduced more recently: it is optimized for parallel programming and exposes asynchronous methods, but it does not support cross-process usage. The following implementation describes the use of <code>SemaphoreSlim</code>:</p>
<script src="https://gist.github.com/samueleresca/1eaafc152aee1e3415a1c141b53c9ecb.js"></script>
<p>The initialization of the <code>_sem</code> attribute defines the number of threads that can access the resource concurrently: in this case, we allow a maximum of 1 thread per operation. Furthermore, we use a <code>try {} finally {}</code> block to manage the locking/unlocking of the resource:</p>
<ul>
<li><code>_sem.Wait()</code> blocks the thread until the semaphore count allows it to enter;</li>
<li><code>_sem.Release()</code> exits the semaphore and releases the lock;</li>
</ul>
<p>It is important to notice that the <code>try {} finally {}</code> block is used to ensure that the code always releases the semaphore. Once we have entered the semaphore using <code>_sem.Wait()</code>, we must always call <code>_sem.Release()</code>; otherwise, the application will get stuck waiting for the semaphore to be released, e.g.:</p>
<script src="https://gist.github.com/samueleresca/3a1bfd50b4f75a9c5be4d15b52ed2666.js"></script>
<p>In the case described above, the <code>_sem.Release()</code> method is only called in the main execution branch of the application. Therefore, the application gets stuck whenever the condition at line 7 (<code>!_records.ContainsKey(key)</code>) is satisfied, because execution never exits the semaphore.</p>
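<p>A minimal sketch of the correct pattern (the <code>GuardedResource</code> class below is hypothetical): the <code>Release()</code> call lives in the <code>finally</code> block, so every exit path, including the early return, leaves the semaphore:</p>

```csharp
using System.Collections.Generic;
using System.Threading;

// Sketch: Release() in finally guarantees the semaphore is exited on
// every branch, including the early-return "key not found" path.
public class GuardedResource
{
    private readonly SemaphoreSlim _sem = new SemaphoreSlim(1, 1);
    private readonly Dictionary<int, string> _records = new Dictionary<int, string>();

    public string Get(int key)
    {
        _sem.Wait();
        try
        {
            if (!_records.ContainsKey(key))
                return null;              // safe: finally still releases
            return _records[key];
        }
        finally
        {
            _sem.Release();               // guaranteed on every exit path
        }
    }

    public void Set(int key, string value)
    {
        _sem.Wait();
        try { _records[key] = value; }
        finally { _sem.Release(); }
    }
}
```

<p>Even after a miss on a missing key, subsequent calls still enter the semaphore instead of deadlocking.</p>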
<p>Another synchronization construct that enables more granular control over threads is the <code>ReaderWriterLockSlim</code>. This type provides a way of differentiating locking depending on the type of access performed on the resource. In most cases, instances are thread-safe for read operations but not for concurrent updates. The <code>ReaderWriterLockSlim</code> provides the <code>EnterReadLock</code> / <code>ExitReadLock</code> methods, which allow other concurrent read operations, and the <code>EnterWriteLock</code> / <code>ExitWriteLock</code> methods, which exclude both read and write operations on a specific resource. We can use it in the following way:</p>
<script src="https://gist.github.com/samueleresca/1d661d15acbf1aab48b8943ff6acfc79.js"></script>
<p>The code mentioned above describes the implementation of the <code>ReaderWriterLockSlim</code> into the <code>Get</code> method of the <code>LRUCache</code> type. The lock is applied in the following way:</p>
<ol>
<li>Apply the read lock using the <code>EnterReadLock</code> method;</li>
<li>Check if the <code>_records</code> member contains the key;</li>
<li>Exit from the read lock using <code>ExitReadLock</code>;</li>
<li>Apply the write lock using the <code>EnterWriteLock</code> method;</li>
<li>Update the frequency table by moving the element to the last position;</li>
<li>Exit from the write lock using <code>ExitWriteLock</code>;</li>
<li>Finally, return the <code>CacheValue</code> by re-entering the read lock;</li>
</ol>
<p>As you can see, there are a couple of steps where a transition happens between the <em>read lock</em> and the <em>write lock</em>. For this reason, the <code>ReaderWriterLockSlim</code> introduces a lock type that can be activated using the <code>EnterUpgradeableReadLock</code> method. The upgradeable lock starts as a normal read lock, and it can be upgraded to a write lock, e.g.:</p>
<script src="https://gist.github.com/samueleresca/77d618fca0f6e5466455916386af81ed.js"></script>
<p>The code above uses <code>EnterUpgradeableReadLock</code> to acquire the read lock in the <code>Get</code> method, and it upgrades the lock to write mode only when a mutation is needed:</p>
<ol>
<li>Acquire the upgradeable read lock using the <code>EnterUpgradeableReadLock</code> method;</li>
<li>Check if the <code>_records</code> member contains the key;</li>
<li>Apply the write lock using the <code>EnterWriteLock</code> method;</li>
<li>Update the frequency table by moving the element to the last position;</li>
<li>Exit from the write lock using <code>ExitWriteLock</code>;</li>
<li>Finally, return the <code>CacheValue</code> and exit from the upgradeable read lock;</li>
</ol>
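<p>A hedged sketch of the upgradeable-lock pattern described above (the <code>UpgradeableCache</code> shape below is hypothetical and much simplified compared to the article&apos;s <code>LRUCache&lt;T&gt;</code>):</p>

```csharp
using System.Collections.Generic;
using System.Threading;

// Sketch: the upgradeable read lock coexists with plain read locks and
// is upgraded to a write lock only for the brief frequency-table mutation.
public class UpgradeableCache
{
    private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();
    private readonly Dictionary<int, string> _records = new Dictionary<int, string>();
    private readonly List<int> _freq = new List<int>();

    public string Get(int key)
    {
        _lock.EnterUpgradeableReadLock();       // read lock that can be upgraded
        try
        {
            if (!_records.ContainsKey(key)) return null;

            _lock.EnterWriteLock();             // upgrade only for the mutation
            try
            {
                _freq.Remove(key);              // move key to the most-recent slot
                _freq.Add(key);
            }
            finally { _lock.ExitWriteLock(); }  // back to upgradeable read mode

            return _records[key];
        }
        finally { _lock.ExitUpgradeableReadLock(); }
    }

    public void Set(int key, string value)
    {
        _lock.EnterWriteLock();
        try
        {
            _records[key] = value;
            if (!_freq.Contains(key)) _freq.Add(key);
        }
        finally { _lock.ExitWriteLock(); }
    }
}
```

<p>Note that only one thread at a time may hold the upgradeable lock, but while it stays in read mode, other threads can still hold plain read locks.</p>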
<p>The implementation above scales the lock restriction up or down depending on the type of access the next instruction is about to perform.</p>
<!--kg-card-end: markdown--><!--kg-card-begin: markdown--><h2 id="exampleofsignalingbetweenthreads">Example of signaling between threads</h2>
<p>The following example implements some multithreading tests on the <code>LRUCache&lt;T&gt;</code> implementation. The test uses the <code>ManualResetEventSlim</code> class to coordinate multiple concurrent operations on the <code>LRUCache</code> type. The <code>ManualResetEventSlim</code> can be used as a signaling event between threads to block and unblock their execution.</p>
<p>The following code describes the implementation of the test:</p>
<script src="https://gist.github.com/samueleresca/9129081c532c7ad4c97d9c8e1f8d137f.js"></script>
<p>The above test class declares a <code>should_supports_operations_from_multiple_threads</code> test method. The method performs the following operations:</p>
<ol>
<li>Split the test into two phases by declaring the <code>setPhase</code> and <code>getPhase</code> using the <code>ManualResetEventSlim</code> instance type;</li>
<li>Initializes an array of type <code>Thread</code>. For each thread in the array, it performs a <code>Set</code> and a <code>Get</code> operation on the <code>LRUCache&lt;T&gt;</code> type, and it uses the <code>ManualResetEventSlim</code> instances for blocking the threads;</li>
<li>Once the array of <code>Thread</code> is declared, the code calls the <code>setPhase.Set()</code> method to trigger the <code>_cache.Set</code> operations. The same approach is taken for the <code>_cache.Get</code> operations;</li>
</ol>
<p>Furthermore, each thread proceeds as follows:</p>
<p><img src="https://samueleresca.net/content/images/2020/05/Screenshot-2020-05-02-at-14.48.43.png" alt="Notes on threading" loading="lazy"></p>
<p>The threads wait until the <code>setPhase.Set</code> method is called by the main thread; at this point, every thread runs the <code>_cache.Set</code> operation on the <code>LRUCache&lt;T&gt;</code> instance. Each executed set is counted by incrementing a <code>progressCounter</code>. The main thread proceeds by calling the <code>getPhase.Set</code> method once the <code>progressCounter</code> reaches the number of initialized threads.<br>
This way, <code>nThreads</code> threads execute the <code>_cache.Set</code> operation at the same time; after that, they are blocked until the last set operation has been performed. Finally, the <code>_cache.Get</code> method is called to read back all the values.</p>
<p>As expected, the implementation executes all the <code>_cache.Set</code> operations before the <code>_cache.Get</code> execution.</p>
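<p>The coordination mechanism can be reduced to the following sketch (a simplified, single-gate version with hypothetical names, not the article&apos;s full test):</p>

```csharp
using System.Threading;

// Sketch: worker threads block on a ManualResetEventSlim gate until the
// main thread opens it, then all proceed and report their progress.
public static class TwoPhase
{
    public static int Run(int nThreads)
    {
        var setPhase = new ManualResetEventSlim(false);
        int progressCounter = 0;
        var threads = new Thread[nThreads];

        for (int i = 0; i < nThreads; i++)
        {
            threads[i] = new Thread(() =>
            {
                setPhase.Wait();                        // block until signaled
                Interlocked.Increment(ref progressCounter);
            });
            threads[i].Start();
        }

        setPhase.Set();                                 // release all workers at once
        foreach (var t in threads) t.Join();
        return progressCounter;                         // equals nThreads
    }
}
```

<p>The article&apos;s test adds a second gate (<code>getPhase</code>) on top of this pattern to separate the set phase from the get phase.</p>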
<p>Depending on which <code>LRUCache&lt;T&gt;</code> implementation we initialize, the test either fails or passes.<br>
Because the <a href="https://github.com/samueleresca/Blog.LRUCacheThreading/blob/master/Blog.LRUCacheThreading/LRUCache.cs">LRUCache</a> type doesn&apos;t support multithreading, we receive a <code>System.InvalidOperationException</code>. If we use the <a href="https://github.com/samueleresca/Blog.LRUCacheThreading/blob/master/Blog.LRUCacheThreading/LRUCacheReaderWriterLock.cs">LRUCacheReaderWriterLock</a> type, we receive the following outcome from the <code>_outputConsole</code> helper:</p>
<script src="https://gist.github.com/samueleresca/dc01da1417ef81707f2bd0e84b02922e.js"></script>
<p>Another important aspect to consider is that we are incrementing the <code>progressCounter</code> in the following way:</p>
<pre><code>Interlocked.Increment(ref progressCounter);
</code></pre>
<p>Because the counter is declared outside the scope of the threads and is used by multiple threads, we use <code>Interlocked.Increment</code> to force an atomic operation. <code>Interlocked.Increment</code> takes a reference to the variable to increment atomically.</p>
<p>Although reads and writes on <code>int</code> can be considered atomic:</p>
<blockquote>
<p>CLI shall guarantee that read and write access to properly aligned memory<br>
locations no larger than the native word size (the size of type native int) is atomic when all the write accesses to a location are the same size.<br>
<a href="http://www.ecma-international.org/publications/standards/Ecma-335.htm">ECMA-335 standard</a></p>
</blockquote>
<p>the increment operation (a read followed by a write) is not atomic; for this reason, the <code>Interlocked</code> class can be helpful in these cases.</p>
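<p>A small, self-contained demonstration of the idea (the <code>AtomicIncrement</code> helper is hypothetical): two tasks hammer the same counter, and <code>Interlocked.Increment</code> keeps the total exact, whereas a plain <code>counter++</code> could lose updates under contention:</p>

```csharp
using System.Threading;
using System.Threading.Tasks;

// Sketch: Interlocked.Increment turns the read-modify-write into a single
// atomic operation, so no increments are lost between competing tasks.
public static class AtomicIncrement
{
    public static int Count(int iterations)
    {
        int counter = 0;
        var t1 = Task.Run(() =>
        {
            for (int i = 0; i < iterations; i++)
                Interlocked.Increment(ref counter);   // atomic; counter++ would race
        });
        var t2 = Task.Run(() =>
        {
            for (int i = 0; i < iterations; i++)
                Interlocked.Increment(ref counter);
        });
        Task.WaitAll(t1, t2);
        return counter;   // always 2 * iterations
    }
}
```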
<h2 id="finalthoughts">Final thoughts</h2>
<p>This post provides some of the notions around threading in C#. It describes topics like <em>starvation</em>, <em>event handling</em>, and <em>blocking synchronization</em>, and it applies these concepts to an <code>LRUCache</code> implementation example. You can find the code in the <a href="https://github.com/samueleresca/Blog.LRUCacheThreading/">following repository</a>. The information provided in this post is the foundation of some topics that are NOT covered, like <em>Dataflow</em>, <em>Channels</em>, <em>Blocking collections</em>, and, more in general, the Task Parallel Library provided by .NET.</p>
<p>Below are some references related to this post:</p>
<ul>
<li><em><a href="https://github.com/dotnet/corefx/pull/24396">#24396 - Expose/test ThreadPool.QueueUserWorkItem(..., bool preferLocal)</a></em></li>
<li><em><a href="https://devblogs.microsoft.com/devopsservice/?p=17665">Postmortem: Azure DevOps Service Outages in October 2018</a></em></li>
<li><em><a href="https://labs.criteo.com/2018/10/net-threadpool-starvation-and-how-queuing-makes-it-worse/">.NET Threadpool starvation, and how queuing makes it worse</a></em></li>
<li><em><a href="http://aviadezra.blogspot.com/2009/06/net-clr-thread-pool-work.html">.NET CLR Thread Pool Internals</a></em></li>
<li><em><a href="http://www.ecma-international.org/publications/standards/Ecma-335.htm">ECMA-335 standard</a></em></li>
</ul>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Getting started with Apache Spark using .NET Core]]></title><description><![CDATA[ The following article covers what I have learned about Apache Spark core architecture.]]></description><link>https://samueleresca.net/getting-started-with-apache-spark-using-net-core/</link><guid isPermaLink="false">63d9a4dc2bc5da0fb16ac18b</guid><category><![CDATA[.NET Core]]></category><category><![CDATA[Apache Spark]]></category><dc:creator><![CDATA[Samuele Resca]]></dc:creator><pubDate>Mon, 30 Sep 2019 06:44:51 GMT</pubDate><media:content url="https://samueleresca.net/content/images/wordpress/2019/09/EB6kFM_WwAAT6yF.jpeg" medium="image"/><content:encoded><![CDATA[<!--kg-card-begin: html-->
<img src="https://samueleresca.net/content/images/wordpress/2019/09/EB6kFM_WwAAT6yF.jpeg" alt="Getting started with Apache Spark using .NET Core"><p>For a few months now, I have been focusing my attention on Data / Big Data technologies, both for work and for personal reasons. </p>



<p>The following post covers what I have learned about Apache Spark Core and its architecture. Above all, it introduces the <a href="https://dotnet.microsoft.com/apps/data/spark">.NET Core library for Apache Spark</a>, which aims to bring the Apache Spark tools into the .NET ecosystem.  I wrote about .NET in the data space a few months ago: <a aria-label="https://samueleresca.net/2019/04/data-analysis-using-f-and-jupyter-notebook/ (opens in a new tab)" href="https://samueleresca.net/2019/04/data-analysis-using-f-and-jupyter-notebook/">Data analysis using F# and Jupyter Notebook</a>.</p>



<p>This post covers the following topics:</p>



<ul><li>Intro to Apache Spark;</li><li>Apache Spark architecture: RDDs, DataFrames, and the driver-workers way of working;</li><li>Querying with Spark SQL using the .NET Core library;</li><li>Considerations around the use of .NET with Apache Spark;</li></ul>



<h2>Intro to Apache Spark</h2>



<p>Apache Spark is a framework for processing data in a distributed way. Since 2014, it has been considered the successor of Hadoop for handling and manipulating Big Data. Spark&#x2019;s success can be attributed not only to its performance but also to the rich and ever-growing ecosystem that supports and contributes to the evolution of this technology. Moreover, the Apache Spark APIs are readable, testable and easy to understand.</p>



<p>Nowadays, Apache Spark and its libraries provide an ecosystem that can support a team or a company in data analysis, streaming analysis, machine learning and finally graph processing.</p>



<p>Apache Spark provides a wide set of modules:</p>



<ul><li><a href="https://github.com/apache/spark/tree/master/sql">Spark SQL</a> for data analysis over relational data;</li><li><a href="https://spark.apache.org/docs/latest/streaming-programming-guide.html#overview">Spark Streaming</a> for the streaming of data;</li><li><a href="https://spark.apache.org/mllib/">MLlib</a> for the machine learning;</li><li><a href="https://spark.apache.org/graphx/">GraphX</a> for distributed graph processing;</li></ul>



<p>All these modules and libraries stand on top of the Apache Spark Core API.</p>



<h2>Introduction to Spark for .NET Core</h2>



<p>.NET Core is the multi-purpose, open-source and cross-platform framework built by Microsoft.</p>



<p>Microsoft is investing a lot in the .NET Core ecosystem, and the .NET team is bringing .NET technologies into the data world. Last year, Microsoft released a machine learning framework for .NET Core, <a href="https://github.com/dotnet/machinelearning">available on GitHub</a>, and recently they shipped the APIs for Apache Spark, also <a href="https://github.com/dotnet/spark">available on GitHub</a>. </p>



<p>For this reason, I directed my attention on Apache Spark and its structure.</p>



<h3>Setup the project</h3>



<p>Let&#x2019;s see how to manage Apache Spark using .NET Core framework.</p>



<p>The example described in this post uses the following code available <a href="https://github.com/samueleresca/blog.apachesparkgettingstarted">on GitHub</a> and the Seattle Cultural Space Inventory dataset available on <a href="https://www.kaggle.com/city-of-seattle/seattle-cultural-space-inventory">Kaggle</a>. Moreover, the project is a simple console template created by using the following .NET Core command:</p>



<p><code>dotnet new console -n blog.apachesparkgettingstarted</code></p>



<p>The command mentioned above creates a new .NET Core console application project. In addition, we should also add the Apache Spark APIs, by executing the following CLI command in the root of the project:</p>



<p><code>dotnet add package Microsoft.Spark</code></p>



<p>The instruction adds the Apache Spark for .NET to the current project.</p>



<h2>Apache Spark core architecture</h2>



<p>As said earlier, Apache Spark Core APIs are the foundation of the additional modules and features provided in the framework. The following section will cover some of the base concepts of the Spark architecture. </p>



<p>First of all, let&#x2019;s take an overview of the Apache Spark <em>driver-workers</em> concept; the following schema describes the mechanics behind Spark:</p>



<figure class="wp-block-image"><img src="https://samueleresca.net/content/images/wordpress/2019/09/Screenshot-2019-09-05-at-22.07.27.png?fit=720%2C242&amp;ssl=1" alt="Getting started with Apache Spark using .NET Core" class="wp-image-4168" srcset="https://samueleresca.net/content/images/wordpress/2019/09/Screenshot-2019-09-05-at-22.07.27.png 2194w, https://samueleresca.net/content/images/wordpress/2019/09/Screenshot-2019-09-05-at-22.07.27-320x107.png 320w, https://samueleresca.net/content/images/wordpress/2019/09/Screenshot-2019-09-05-at-22.07.27-768x258.png 768w, https://samueleresca.net/content/images/wordpress/2019/09/Screenshot-2019-09-05-at-22.07.27-960x322.png 960w" sizes="(max-width: 2194px) 100vw, 2194px"><figcaption>Driver &#x2013; Worker mechanics</figcaption></figure>



<p>Every Spark application is usually composed of a <em>Driver</em> and a set of <em>Workers</em>. The <em>Driver</em> is the coordinator entity of the system, and it distributes chunks of work across the different <em>Workers</em>. From Apache Spark 2.0.0 onwards, the <code>SparkSession</code> type provides the single entry point to interact with the underlying Spark functionality. The <code>SparkSession</code> type has the following responsibilities:</p>



<ul><li>Create the tasks to assign to the different workers;</li><li>Locate and load the data used by the application;</li><li>Handle eventual failures;</li></ul>



<p>This is an example of a C# code that initializes a <code>SparkSession</code>:</p>



<script src="https://gist.github.com/samueleresca/fbeaa67912332e6b4a2c33eff4878e24.js"></script>




<p>The code mentioned above gets or creates a <code>SparkSession</code> instance with the name &#x201C;My application&#x201D;. The retrieved <code>SparkSession</code> instance provides all the necessary APIs.</p>



<h3>RDDs, DataFrames core concept</h3>



<p>Apache Spark supports developers with a great set of APIs and collection objects. The following section describes the two main sets of APIs provided by Spark: <em>RDDs</em> and <em>DataFrames</em>.</p>



<p>The <em>RDD</em> is the foundation of the <em>DataFrame</em> collection. RDD stands for <em>Resilient Distributed Dataset</em>, and it is the basic abstraction in Spark. An <em>RDD</em> represents an immutable, partitioned collection of elements that can be operated on in parallel. RDDs are designed to be resilient and distributed: they distribute the work across multiple nodes and partitions, and they can handle failure by re-calculating the partition that failed. </p>



<p><em>DataFrames</em> are built on top of RDDs. A <em>DataFrame</em> organizes the collection into named columns, which provides a higher level of abstraction.</p>



<p>The subsequent example shows the definition of a <em>DataFrame</em>:</p>



<script src="https://gist.github.com/samueleresca/358776472f364a125798130983334843.js"></script>




<p>The code mentioned above reads data from the <a href="https://www.kaggle.com/city-of-seattle/seattle-cultural-space-inventory">Seattle Cultural Space Inventory dataset</a>. The snippet loads the data into the <em>DataFrame</em> by automatically inferring the columns. After that, the <em>DataFrame</em> collection provides a rich set of APIs which can be used to apply operations on the data, for example:</p>



<script src="https://gist.github.com/samueleresca/4d5e629879a2d4318cced785a9336f76.js"></script>




<p>The preceding snippet combines the <code>Select</code> and <code>Filter</code> operations by selecting the <em>Name</em> and the <em>Phone</em> columns from the Dataframe, and by filtering out all the rows without the <em>Phone</em> column populated. Both the <code>Select</code> and <code>Filter</code> methods are part of the <em>transformation</em> APIs of  Apache Spark. </p>



<h4>Transformation and Action APIs</h4>



<p>We can group the <em>RDD</em> and <em>DataFrame</em> APIs into two categories: </p>



<ul><li><em>Transformation</em> APIs are all the functions that manipulate the data, such as <code>Select</code>, <code>Filter</code>, <code>GroupBy</code>, and <code>Map</code>. In general, every function that returns another <em>RDD</em> or <em>DataFrame</em> is part of the <em>transformation</em> APIs;</li><li><em>Action</em> APIs are all the methods that perform an action on the data. They usually return a void result;</li></ul>



<p>Moreover, we should notice that all the <em>transformation</em> APIs use a <em>lazy-evaluation</em> approach. Hence, the transformations on the data are not immediately executed. In other words, the Apache Spark engine stores the set of operations in a <em>DAG (directed acyclic graph)</em> and evaluates them lazily.</p>
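<p>Although this is only a rough analogy and not Spark itself, LINQ&#x2019;s deferred execution in C# follows the same idea: building the query describes the transformations, and nothing runs until a terminal operation enumerates the result (the <code>LazyPipeline</code> helper below is a hypothetical illustration):</p>

```csharp
using System;
using System.Linq;

// Rough analogy (not Spark itself): LINQ queries are also lazily evaluated.
// Where/Select only describe the pipeline; work happens on enumeration.
public static class LazyPipeline
{
    public static int Run()
    {
        var evaluations = 0;
        var numbers = Enumerable.Range(1, 10);

        // Building the query executes nothing yet (like Spark transformations).
        var query = numbers
            .Where(n => { evaluations++; return n % 2 == 0; })
            .Select(n => n * n);

        if (evaluations != 0) throw new Exception("query ran eagerly?");

        // Enumerating the query (like a Spark action) triggers the pipeline.
        var total = query.Sum();
        return total;   // 4 + 16 + 36 + 64 + 100 = 220
    }
}
```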



<figure class="wp-block-image"><img src="https://samueleresca.net/content/images/wordpress/2019/09/Screenshot-2019-09-08-at-21.47.43-960x513.png" alt="Getting started with Apache Spark using .NET Core" class="wp-image-4186" srcset="https://samueleresca.net/content/images/wordpress/2019/09/Screenshot-2019-09-08-at-21.47.43-960x513.png 960w, https://samueleresca.net/content/images/wordpress/2019/09/Screenshot-2019-09-08-at-21.47.43-320x171.png 320w, https://samueleresca.net/content/images/wordpress/2019/09/Screenshot-2019-09-08-at-21.47.43-768x410.png 768w, https://samueleresca.net/content/images/wordpress/2019/09/Screenshot-2019-09-08-at-21.47.43.png 1506w" sizes="(max-width: 960px) 100vw, 960px"></figure>



<p>The schema mentioned above describes the distributed data transformation approach used by Apache Spark. Let&#x2019;s suppose that our code executes an action on the <em>Dataframe</em> collection, such as the <code>Show()</code> method:</p>



<script src="https://gist.github.com/samueleresca/1d736fea4e221adb48d60bdd1ea90a8c.js"></script>




<p>As you can see from the code mentioned above, the <code>Show()</code> doesn&#x2019;t return anything, which means that, unlike the <code>Select()</code> and <code>Filter()</code> methods, it is an action that effectively triggers the execution of the <em>DataFrame</em> evaluation.</p>



<p>The <code>Show()</code> method simply displays rows of the <code>DataFrame</code> data in tabular form.  </p>



<h3>Apache Spark execution and tooling</h3>



<p>Let&#x2019;s proceed by executing the snippet of code we implemented in the previous section. Apache Spark provides an out-of-the-box tool called <code>spark-submit</code> to submit and execute the code. </p>



<p>Besides, it is possible to submit our .NET Core code to Apache Spark using the following command:</p>



<p><code>spark-submit --class org.apache.spark.deploy.dotnet.DotnetRunner  --master local microsoft-spark-2.4.x-0.4.0.jar dotnet &lt;compiled_dll_filename&gt;</code></p>



<p>The above command uses the <code>spark-submit</code> tool to send and execute the dll provided as a parameter:</p>



<script src="https://gist.github.com/samueleresca/79226b09571ce1cc8ba0b337c28fe6dd.js"></script>




<p>As you can see, the result of the execution displays the first 20 rows with a populated phone number.</p>



<h2>Spark.NET under the hood</h2>



<p>You may have noticed that every time you add the <code>Microsoft.Spark</code> package, it also brings the following jar packages: <code>microsoft-spark-2.4.x-0.4.0.jar</code> and <code>microsoft-spark-2.3.x-0.4.0.jar</code>. Both of them are used to communicate with the native Scala APIs of Apache Spark. Furthermore, if we take a closer look at the source code, <a href="https://github.com/dotnet/spark/tree/master/src/scala">available on GitHub</a>, we can understand the purpose of these two packages:</p>



<script src="https://gist.github.com/samueleresca/fbaa26cb5f7aa8d68361e8bacf86dde7.js"></script>




<p>The <code>DotnetBackend</code> Scala class behaves as an interpreter between the .NET Core APIs and the native Scala APIs of Apache Spark. In more detail, this approach is also taken by other non-native languages that use Spark, such as the R language.</p>



<p>The .NET team and community are already pushing for the native adoption of   .NET binding as part of Apache Spark: <a href="https://issues.apache.org/jira/browse/SPARK-27006">https://issues.apache.org/jira/browse/SPARK-27006</a>. Moreover, it seems that they are making <a href="https://issues.apache.org/jira/browse/SPARK-27006?focusedCommentId=16792939&amp;page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16792939">some progress</a>:</p>



<blockquote class="wp-block-quote"><p>Thank you for reading this far and we look forward to seeing you at the SF Spark Summit in April where we will be presenting our early progress on enabling .NET bindings for Apache Spark.&#xA0;</p></blockquote>



<h2>Final thoughts</h2>



<p>This article gives a quick introduction to Apache Spark and a quick getting started with Apache Spark using .NET Core. Microsoft is investing a lot in .NET Core and, most importantly, in open source. Languages like Python, Scala, and R are definitely more established in the data world; however, the Spark library for .NET and the ML.NET framework are further signs that Microsoft is investing a lot to bring .NET into the data world. </p>



<p>Cover image by <a href="https://twitter.com/tubemapper">Tube mapper</a>.</p>
<!--kg-card-end: html-->]]></content:encoded></item><item><title><![CDATA[Web assembly and Blazor: state of the art]]></title><description><![CDATA[ I had a first look to Blazor and, more in general, to the Web assembly technologies in 2017. The same year, I've written about this topic in the following blog post: Web assembly in .NET Core. After two years, Blazor is]]></description><link>https://samueleresca.net/web-assembly-and-blazor-state-of-the-art/</link><guid isPermaLink="false">63d9a4dc2bc5da0fb16ac18a</guid><category><![CDATA[.NET]]></category><category><![CDATA[.NET Core]]></category><category><![CDATA[dotnet]]></category><category><![CDATA[dotnetcore]]></category><category><![CDATA[WebAssembly]]></category><dc:creator><![CDATA[Samuele Resca]]></dc:creator><pubDate>Tue, 11 Jun 2019 07:05:18 GMT</pubDate><media:content url="https://samueleresca.net/content/images/wordpress/2019/06/Screenshot-2019-06-09-at-21.03.10.png" medium="image"/><content:encoded><![CDATA[<!--kg-card-begin: html-->
<img src="https://samueleresca.net/content/images/wordpress/2019/06/Screenshot-2019-06-09-at-21.03.10.png" alt="Web assembly and Blazor: state of the art"><p>I had a first look at Blazor and, more in general, at the Web assembly technologies in 2017. That same year, I wrote about this topic in the following blog post: <a href="https://samueleresca.net/2017/08/web-assembly-in-net/">Web assembly in .NET Core</a>. After two years, Blazor is nearing its first official release: it is no longer experimental, and it is becoming part of the .NET ecosystem. The following article gives some quick updates on the Blazor framework.</p>



<h2>How does Blazor work?</h2>



<p>First of all, let&#x2019;s have a look at what&#x2019;s behind Blazor and how it works using the new Web assembly. The following schema shows the foundations of Blazor:</p>



<figure class="wp-block-image"><img src="https://samueleresca.net/content/images/wordpress/2019/05/Screenshot-2019-05-27-at-22.47.24.png?fit=720%2C476&amp;ssl=1" alt="Web assembly and Blazor: state of the art" class="wp-image-4128" srcset="https://samueleresca.net/content/images/wordpress/2019/05/Screenshot-2019-05-27-at-22.47.24.png 1722w, https://samueleresca.net/content/images/wordpress/2019/05/Screenshot-2019-05-27-at-22.47.24-320x211.png 320w, https://samueleresca.net/content/images/wordpress/2019/05/Screenshot-2019-05-27-at-22.47.24-768x508.png 768w, https://samueleresca.net/content/images/wordpress/2019/05/Screenshot-2019-05-27-at-22.47.24-960x634.png 960w" sizes="(max-width: 1722px) 100vw, 1722px"></figure>



<p><a href="https://webassembly.org/">Web assembly</a> stands at the base of the pyramid: it defines a standard binary format which allows running bytecode in the browser.</p>



<p>Web assembly is a standard that is not tied to the .NET ecosystem, but it was the first step toward bringing .NET into client-side development.</p>



<p>The other core actor behind Blazor is the Mono framework. Mono is one of the .NET runtimes maintained by Microsoft and the community. It is designed for portability, and it has been compiled to web assembly starting with the following PR: <a href="https://github.com/mono/mono/pull/5924">https://github.com/mono/mono/pull/5924</a> </p>



<p>Finally, at the top of the pyramid there is Blazor. Blazor is the UI framework that defines the startup process of the UI, and it also implements the infrastructure that allows components to communicate with each other. Starting from .NET Core 3.0, Blazor will be shipped as part of the framework.</p>



<h2>Overview of a Blazor app</h2>



<p>It is possible to create a new Blazor project from the template using the following instructions:</p>



<pre class="wp-block-preformatted">dotnet new -i Microsoft.AspNetCore.Blazor.Templates::3.0.0-preview5-19227-01</pre>



<pre class="wp-block-preformatted">dotnet new blazor -n &lt;web_app_name&gt;</pre>






<p>The first command installs the Blazor template pack using version 3.0.0-<code>preview5-19227-01</code> of .NET Core. The second command creates a new base project named <code>web_app_name</code> in the current folder. </p>



<p>The resulting project and file system will be similar to this:</p>



<script src="https://gist.github.com/samueleresca/1e021f2f5c734878422d09635c2a9e9c.js"></script>




<p>There are some key parts to notice in the project structure. First of all, the <code>Program</code> and the <code>Startup</code> classes: the first one has the following implementation:</p>



<script src="https://gist.github.com/samueleresca/3724a775c04badc1d568101bbc1e7e4b.js"></script>




<p>As you can see, the above-mentioned snippet uses the <code>BlazorWebAssemblyHost</code> class to initialize a new host using the <code>Startup</code> class. This approach works in a very similar manner to the one used in ASP.NET Core applications, but instead of returning an <code>IWebHost</code> type it returns an implementation of the <code>IWebAssemblyHostBuilder</code> interface.</p>



<p>The builder lives in the <code>Microsoft.AspNetCore.Blazor.Hosting</code> namespace and resolves the <code>Startup</code> class using the <a href="https://github.com/aspnet/AspNetCore/blob/cb21edc5009b254388fc6da93d212f679087c326/src/Components/Blazor/Blazor/src/Hosting/WebAssemblyHostBuilderExtensions.cs#L33">following code</a>.</p>



<p>Let&#x2019;s proceed by having a look at the Startup class which is decidedly simpler compared with the <code>Startup</code> class of an ASP.NET Core application:</p>



<div class="wp-block-embed__wrapper">
<script src="https://gist.github.com/samueleresca/ff49af4002c0555a366f1f552ac29016.js"></script>
</div>



<p>The <code>Configure</code> method resolves an instance of the <code>IComponentsApplicationBuilder</code> interface, and it invokes the <code>AddComponent</code> method in order to initialize the <code>App</code> component.</p>



<p>The <code>AddComponent</code> method accepts a generic type which represents the main component, and a DOM selector which corresponds to the tag used in the <code>index.html</code> page to render the component.</p>
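<p>As an illustration, a minimal <code>index.html</code> from the default template might look like the following sketch (the <code>app</code> element name and the script path are the template defaults, assumed here rather than taken from the article):</p>

```html
<!-- The <app> element matches the "app" DOM selector passed to AddComponent -->
<body>
    <app>Loading...</app>
    <!-- blazor.webassembly.js bootstraps the runtime and the application -->
    <script src="_framework/blazor.webassembly.js"></script>
</body>
```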



<h3>Component-centric structure</h3>



<p>Blazor, just like any common UI framework, has a component-centric structure. Components are all the UI elements that compose the pages. Components can be nested and reused in other parts of the UI.</p>



<p>Every file with the <code>.razor</code> extension is a component. Components render HTML elements but can also contain UI logic and event handling. For example, let&#x2019;s have a look at the <code>FetchData.razor</code> file:</p>



<script src="https://gist.github.com/samueleresca/378c5e4873b30ce3dce7a6c28ab52b60.js"></script>




<p>The component fetches some weather forecast data using an AJAX request, and it renders the data in the form of a table. As a first step, the component uses the <code>@inject</code> directive to inject an HTTP client. Secondly, it declares the HTML elements to render in the page, e.g. the table which contains the forecast data. Finally, it declares the UI logic:</p>



<script src="https://gist.github.com/samueleresca/7b2f6284262d9be9b846f8af4885ae16.js"></script>




<p>The code mentioned above defines a <code>WeatherForecast</code> type and an array which will contain the fetched information. It then declares an <code>override async Task OnInitAsync()</code> function that uses the <code>HttpClient</code> injected in the component to perform an HTTP call for the data. The <code>OnInitAsync</code> function is one of the built-in lifecycle methods implemented by default in the base class of the component.</p>



<h4>Built-in lifecycle methods</h4>



<p>The following table describes the lifecycle methods which are part of <a href="https://github.com/aspnet/AspNetCore/blob/a36a57df656e7a4bb39901beca5068bc5acc83b3/src/Components/Components/src/ComponentBase.cs">ComponentBase.cs</a> and can be overridden by derived classes: </p>



<table class="wp-block-table is-style-stripes"><tbody><tr><td>Lifecycle methods</td><td>Description</td></tr><tr><td><code>OnInit/OnInitAsync</code></td><td>These methods execute code at the initialization step of the component.</td></tr><tr><td><code>OnParametersSet/OnParametersSetAsync</code></td><td>These methods are called when a component has received parameters from its parent caller and the values have been assigned to properties. They are executed every time the component is rendered.</td></tr><tr><td><code>OnAfterRender/OnAfterRenderAsync</code></td><td>These methods are called after a component has finished rendering. The element and component references are populated at this point.</td></tr><tr><td><code>SetParameters</code></td><td>This method can run custom code that interprets the incoming parameter values in any way required.</td></tr></tbody></table>
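<p>As a sketch, overriding one of these methods inside a component&#x2019;s <code>@functions</code> block could look like the following (the <code>forecasts</code> field and the data URL are illustrative, based on the default template rather than the article):</p>

```csharp
@functions {
    WeatherForecast[] forecasts;

    // Runs once, when the component is initialized
    protected override async Task OnInitAsync()
    {
        forecasts = await Http.GetJsonAsync<WeatherForecast[]>("sample-data/weather.json");
    }
}
```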






<h3>Routing</h3>



<p>Another essential aspect to notice from the above-described component is the <code>@page &quot;/fetchdata&quot;</code> directive. This directive is part of the routing mechanism of Blazor. Using the same approach as the routing of ASP.NET Core, it is also possible to add custom parameters to the <code>@page</code> value: something like <code>@page &quot;/fetchdata/{day}&quot;</code>.</p>
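<p>A minimal sketch of such a parameterized route might look like this (the <code>Day</code> parameter is hypothetical; route values are bound to component parameters by name, and the exact binding rules changed across the preview releases):</p>

```razor
@page "/fetchdata/{day}"

<h1>Forecast for @Day</h1>

@functions {
    // Bound from the {day} route value by name
    [Parameter] string Day { get; set; }
}
```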



<h2>Client-side vs Server-side Hosting model</h2>



<p>Blazor provides two different hosting models: the <em>client-side</em> one and the <em>server-side</em> one.</p>



<p>The <em>client-side</em> hosting model downloads all the .NET dependencies on the client, therefore it doesn&#x2019;t have any server-side dependency. It provides full web assembly support and also supports offline scenarios. It is possible to create a client-side Blazor app using the following command:</p>



<pre class="wp-block-preformatted">dotnet new blazor -n &lt;webapp_name&gt;</pre>






<p>The <em>server-side</em> hosting model is more lightweight in terms of resources downloaded to the client. It uses SignalR and web socket technologies to create a communication channel between the client and the server: the code runs on the server, and the client sends a message for each operation. It also supports old browsers, but it doesn&#x2019;t offer offline support. It is possible to create a server-side Blazor app using the following command:</p>



<pre class="wp-block-preformatted">dotnet new blazorserverside -n &lt;webapp_name&gt;</pre>






<p>The main concrete difference between the client-side and server-side hosting models resides in the <code>Program.Main</code> method. The following is the snippet related to a client-side app:</p>



<script src="https://gist.github.com/samueleresca/5620911b98c6e30fbb2c38e8fe1bb2d1.js"></script>




<p>This one is related to a server-side app:</p>



<script src="https://gist.github.com/samueleresca/946f39b18bcb6ec5ffc93c2a8e8e8484.js"></script>




<p>As you can see, the first one returns a reference to an <code>IWebAssemblyHost</code> instance, while the second one returns an <code>IHostBuilder</code> instance.</p>



<p>Plus, in the case of a server-side app, the <code>Startup</code> class also registers a service in the <code>IServiceProvider</code> collection using <code>services.AddServerSideBlazor()</code>:</p>



<script src="https://gist.github.com/samueleresca/674d7d93f31b44fe72a2d4b90f12e079.js"></script>




<p>The two hosting models behave differently at run time. In the case of the client-side approach, we can see the following network behavior:</p>



<figure class="wp-block-image"><img src="https://samueleresca.net/content/images/wordpress/2019/05/Screenshot-2019-05-26-at-14.07.52.png?fit=720%2C409&amp;ssl=1" alt="Web assembly and Blazor: state of the art" class="wp-image-4113" srcset="https://samueleresca.net/content/images/wordpress/2019/05/Screenshot-2019-05-26-at-14.07.52.png 2096w, https://samueleresca.net/content/images/wordpress/2019/05/Screenshot-2019-05-26-at-14.07.52-320x182.png 320w, https://samueleresca.net/content/images/wordpress/2019/05/Screenshot-2019-05-26-at-14.07.52-768x436.png 768w, https://samueleresca.net/content/images/wordpress/2019/05/Screenshot-2019-05-26-at-14.07.52-960x545.png 960w" sizes="(max-width: 2096px) 100vw, 2096px"><figcaption>Client-side network behavior</figcaption></figure>



<p>The client-side app downloads the <code>blazor.webassembly.js</code> file, the <code>mono.wasm</code> file (the Mono framework compiled for web assembly), and all the .NET DLLs used by the application: <code>System.dll</code>, <code>System.Core.dll</code>, <code>System.Net.Http.dll</code>, etc.</p>



<p>On the other side, the server-side app uses a web-socket approach, therefore the payload downloaded with the page is minimal:</p>



<figure class="wp-block-image"><img src="https://samueleresca.net/content/images/wordpress/2019/05/Screenshot-2019-05-26-at-14.09.18.png?fit=720%2C194&amp;ssl=1" alt="Web assembly and Blazor: state of the art" class="wp-image-4116" srcset="https://samueleresca.net/content/images/wordpress/2019/05/Screenshot-2019-05-26-at-14.09.18.png 2096w, https://samueleresca.net/content/images/wordpress/2019/05/Screenshot-2019-05-26-at-14.09.18-320x86.png 320w, https://samueleresca.net/content/images/wordpress/2019/05/Screenshot-2019-05-26-at-14.09.18-768x207.png 768w, https://samueleresca.net/content/images/wordpress/2019/05/Screenshot-2019-05-26-at-14.09.18-960x259.png 960w" sizes="(max-width: 2096px) 100vw, 2096px"><figcaption>Server-side network behavior<br></figcaption></figure>



<p>Each interaction with the page triggers a new message in the web socket channel:</p>



<figure class="wp-block-image"><img src="https://samueleresca.net/content/images/wordpress/2019/05/Screenshot-2019-05-26-at-14.09.31.png?fit=720%2C236&amp;ssl=1" alt="Web assembly and Blazor: state of the art" class="wp-image-4117" srcset="https://samueleresca.net/content/images/wordpress/2019/05/Screenshot-2019-05-26-at-14.09.31.png 2094w, https://samueleresca.net/content/images/wordpress/2019/05/Screenshot-2019-05-26-at-14.09.31-320x105.png 320w, https://samueleresca.net/content/images/wordpress/2019/05/Screenshot-2019-05-26-at-14.09.31-768x251.png 768w, https://samueleresca.net/content/images/wordpress/2019/05/Screenshot-2019-05-26-at-14.09.31-960x314.png 960w" sizes="(max-width: 2094px) 100vw, 2094px"><figcaption>Web-socket channel messages</figcaption></figure>



<h2>Final thoughts</h2>



<p>Since 2017, Blazor has been steadily becoming a first-class citizen of the .NET ecosystem. Both the Microsoft .NET team and the community are investing a lot of time in this project. You can find 3rd-party libraries and other material about Blazor here: <a href="https://github.com/AdrienTorris/awesome-blazor#libraries--extensions">https://github.com/AdrienTorris/awesome-blazor#libraries&#x2013;extensions</a>. </p>



<p><em>Cover image by </em><a href="http://www.corradozeni.it/"><em>Corrado Zeni</em></a></p>
<!--kg-card-end: html-->]]></content:encoded></item><item><title><![CDATA[Data analysis using F# and Jupyter notebook]]></title><description><![CDATA[In the last hackathon at @justeattech, I've played a lot around machine learning using ML.NET and .NET Core. Furthermore, the idea that a .NET dev can implement machine learning without switching language is cool.]]></description><link>https://samueleresca.net/data-analysis-using-f-and-jupyter-notebook/</link><guid isPermaLink="false">63d9a4dc2bc5da0fb16ac189</guid><category><![CDATA[.NET]]></category><category><![CDATA[.NET Core]]></category><category><![CDATA[FSharp]]></category><category><![CDATA[ML.NET]]></category><dc:creator><![CDATA[Samuele Resca]]></dc:creator><pubDate>Tue, 23 Apr 2019 07:27:13 GMT</pubDate><media:content url="https://samueleresca.net/content/images/wordpress/2019/04/image.jpg" medium="image"/><content:encoded><![CDATA[<!--kg-card-begin: html-->
<img src="https://samueleresca.net/content/images/wordpress/2019/04/image.jpg" alt="Data analysis using F# and Jupyter notebook"><p>During the last hackathon at <a href="https://medium.com/just-eat-tech">@justeattech</a>, I played a lot with machine learning using <a href="https://dotnet.microsoft.com/apps/machinelearning-ai/ml-dotnet">ML.NET</a> and .NET Core. The idea that a .NET developer is able to implement machine learning without switching language is cool. <a href="https://dotnet.microsoft.com/apps/machinelearning-ai/ml-dotnet">ML.NET</a> still has a lot of room for improvement, but it could be a powerful framework for machine learning. </p>



<figure style="margin: 0 auto;width: 80%;" class="wp-block-embed-twitter aligncenter wp-block-embed is-type-rich is-provider-twitter"><div class="wp-block-embed__wrapper">
<blockquote class="twitter-tweet" data-width="550" data-dnt="true"><p lang="en" dir="ltr">I&apos;ve played a lot around <a href="https://twitter.com/MLdotnet?ref_src=twsrc%5Etfw">@MLdotnet</a>, during the current hackathon in <a href="https://twitter.com/justeat_tech?ref_src=twsrc%5Etfw">@justeat_tech</a>. The idea that a .NET dev can implement machine learning without switching language is cool, but there is a lot of space of improvement, such as <a href="https://t.co/SbjpU2PdkV">https://t.co/SbjpU2PdkV</a><a href="https://twitter.com/hashtag/MachineLearning?src=hash&amp;ref_src=twsrc%5Etfw">#MachineLearning</a> <a href="https://twitter.com/hashtag/MLNET?src=hash&amp;ref_src=twsrc%5Etfw">#MLNET</a></p>&#x2014; Samuele Resca (@samueleresca) <a href="https://twitter.com/samueleresca/status/1106542236165132289?ref_src=twsrc%5Etfw">March 15, 2019</a></blockquote><script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
    </div>
    </figure>


<p>The following post focuses on some general knowledge around data gathering and data analysis. Furthermore, it explains some basic tools to perform data analysis using F# and Jupyter notebook.</p>



<h2>The importance of the data foundations</h2>



<p>Gathering the data set is the most critical step in machine learning. Data is the foundation of the whole process, and it is the first step of the ML workflow. </p>



<p>Therefore, it is crucial to understand the data that we are going to use to train a machine learning model. For that reason, it is important to prototype and explore the data before starting.</p>



<h2>Lyrics data analysis</h2>



<p>The purpose of the following example is to give some basic notions about the data analysis process. As a software engineer mainly focused on .NET Core, I will use technologies from the .NET ecosystem: the example uses F# as the primary language and some related libraries to handle data. The example is also available at the following GitHub repository: <a href="https://github.com/samueleresca/LyricsClassifier">https://github.com/samueleresca/LyricsClassifier</a></p>



<p>It is essential to consider that most of the concepts in the following steps are independent of the language or the libraries we use. Moreover, almost all languages and development frameworks come with open-source tools for machine learning and data analysis. Here is a complete list of machine learning libraries and frameworks: <a href="https://github.com/josephmisiti/awesome-machine-learning">https://github.com/josephmisiti/awesome-machine-learning</a></p>



<p>More specifically, the example will use the following libraries:</p>



<ul><li><code>XPlot.Plotly</code>: XPlot is a cross-platform data visualization library that supports creating charts using Google Charts and Plotly. The library provides a composable domain-specific language for building charts and specifying their properties;</li><li><code>MathNet.Numerics</code>: Math.NET Numerics aims to provide methods and algorithms for numerical computations in science, engineering, and everyday use. Covered topics include special functions, linear algebra, probability models, random numbers, interpolation, integration, regression, optimization problems and more;</li><li><code>FSharp.Data</code>: the F# Data library implements everything you need to access data in your F# applications and scripts. It contains F# type providers for working with structured file formats (CSV, HTML, JSON, and XML) and for accessing the WorldBank data. It also includes helpers for parsing CSV, HTML and JSON files and for sending HTTP requests;</li><li><code>ML.NET</code>:  ML.NET is a machine learning framework built for .NET developers;</li></ul>



<h3>Data schema</h3>



<p>The example will use a dataset of lyrics available on <a href="https://www.kaggle.com/gyani95/380000-lyrics-from-metrolyrics">Kaggle</a>. The data set contains a list of songs of different genres and from several artists. The data has a straightforward schema, which can be represented using the following F# type:</p>



<script src="https://gist.github.com/samueleresca/81f0962bded837d06fb057f04b2a5c73.js"></script>




<p>The <code>Song</code> field refers to the title of the song, the <code>Artist</code> field contains the artist name, the <code>Year</code> field is the release year, the <code>Genre</code> field contains the genre of the song, and finally the <code>Lyrics</code> field contains the lyrics of the song.</p>
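<p>As a reference, the schema described above can be sketched as an F# record (the field types are assumptions based on the description, not taken from the repository):</p>

```fsharp
// Sketch of the dataset schema described above
type LyricInput = {
    Song   : string   // title of the song
    Artist : string   // artist name
    Year   : int      // release year
    Genre  : string   // genre of the song
    Lyrics : string   // lyrics text
}
```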



<p>Finally, let&#x2019;s take a look at a preview of the input data:</p>



<script src="https://gist.github.com/samueleresca/24839b600764d648098be12da258b96b.js"></script>







<h3>A first look at the data</h3>






<p>We should always keep in mind that data is the primary and critical input for all the subsequent steps. Just like some software engineering design processes use the structure of the data to build the domain model of the system, we should start from our data to get a global view of its content.</p>



<p>Let&#x2019;s start by analyzing the lyrics dataset to find possible correlations. For this purpose, the example uses <a href="https://jupyter.org">Jupyter notebook</a>, a useful tool which allows you to create and share documents that contain live code, equations, visualizations, and narrative text. You can find the source code of Jupyter notebook on GitHub: <a href="https://github.com/jupyter/notebook">https://github.com/jupyter/notebook</a>. By default, Jupyter notebook supports Python as the primary language. For this example, we can enable F# support by using the IfSharp library: <a href="https://github.com/fsprojects/IfSharp">https://github.com/fsprojects/IfSharp</a>.</p>



<p>As a first step, we can start Jupyter and create a new notebook in our preferred folder. Then, we can proceed by importing the F# libraries described above in the first cell of the notebook:</p>



<script src="https://gist.github.com/samueleresca/680a4afff910c03048b51f0df4b21134.js"></script>




<p>The snippet uses the Paket package manager to load the libraries used in the notebook. After that, we can proceed by opening the namespaces used by the notebook and defining the input type which reflects the structure of the dataset:</p>



<script src="https://gist.github.com/samueleresca/8dd653ede26d06a72a084eeeb7d34073.js"></script>




<p>Once we have defined the <code>LyricInput</code> type, we can proceed by reading the <code>lyrics.csv</code> file and cleaning up our dataset:</p>



<script src="https://gist.github.com/samueleresca/531b636c2ada582d2218372575eb6fe7.js"></script>




<p>The following snippet uses the <code>FSharp.Data</code> library to load the CSV file, and it performs some filtering and data cleaning on our lyrics:</p>



<ol><li>It removes all the samples with empty lyrics;</li><li>It removes all the samples whose lyrics equal <code>[Instrumental]</code>;</li><li>Finally, it maps the rows to the <code>LyricInput</code> type defined above;</li></ol>
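<p>The three cleaning steps above could be sketched with <code>FSharp.Data</code> roughly as follows (the CSV column names are assumptions based on the schema described earlier):</p>

```fsharp
open FSharp.Data

let lyrics =
    CsvFile.Load("lyrics.csv").Rows
    // 1. drop samples with empty lyrics
    |> Seq.filter (fun row -> row.GetColumn "lyrics" <> "")
    // 2. drop instrumental tracks
    |> Seq.filter (fun row -> row.GetColumn "lyrics" <> "[Instrumental]")
    // 3. map the remaining rows to the LyricInput type
    |> Seq.map (fun row ->
        { Song = row.GetColumn "song"
          Artist = row.GetColumn "artist"
          Year = int (row.GetColumn "year")
          Genre = row.GetColumn "genre"
          Lyrics = row.GetColumn "lyrics" })
```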



<p>Let&#x2019;s proceed by making a quick analysis of the key features of the dataset to see the possible correlations. The following code renders two <code>Chart.Pie</code> charts related to the <code>Genre</code> and <code>Year</code> features:</p>



<script src="https://gist.github.com/samueleresca/8cf7e73c909a59427c262f3604198502.js"></script>




<p>The above snippet uses the <code>XPlot.Plotly</code> library to render the following charts:</p>



<figure class="wp-block-image is-resized"><img src="https://samueleresca.net/content/images/wordpress/2019/04/Screenshot-2019-04-19-at-22.59.28.png?fit=720%2C500&amp;ssl=1" alt="Data analysis using F# and Jupyter notebook" class="wp-image-4026" width="580" height="401" srcset="https://samueleresca.net/content/images/wordpress/2019/04/Screenshot-2019-04-19-at-22.59.28.png 1644w, https://samueleresca.net/content/images/wordpress/2019/04/Screenshot-2019-04-19-at-22.59.28-320x222.png 320w, https://samueleresca.net/content/images/wordpress/2019/04/Screenshot-2019-04-19-at-22.59.28-960x666.png 960w" sizes="(max-width: 580px) 100vw, 580px"></figure>



<p>The above charts describe the lyrics by genre. In the same way, we can group the songs by the <code>Year</code> field in order to understand the distribution over time:</p>



<figure class="wp-block-image"><img src="https://samueleresca.net/content/images/wordpress/2019/04/Screenshot-2019-04-19-at-23.00.09.png?fit=720%2C469&amp;ssl=1" alt="Data analysis using F# and Jupyter notebook" class="wp-image-4027" srcset="https://samueleresca.net/content/images/wordpress/2019/04/Screenshot-2019-04-19-at-23.00.09.png 1620w, https://samueleresca.net/content/images/wordpress/2019/04/Screenshot-2019-04-19-at-23.00.09-320x208.png 320w, https://samueleresca.net/content/images/wordpress/2019/04/Screenshot-2019-04-19-at-23.00.09-768x500.png 768w, https://samueleresca.net/content/images/wordpress/2019/04/Screenshot-2019-04-19-at-23.00.09-960x625.png 960w" sizes="(max-width: 1620px) 100vw, 1620px"></figure>






<h3>Feature and data engineering using ML.NET </h3>



<p>Let&#x2019;s continue by focusing on the <code>Lyrics</code> field and visualizing the frequency of the words used in the lyrics, both at the global level and by genre. First of all, we should start a tokenization process, which runs using the following snippet of code:</p>



<script src="https://gist.github.com/samueleresca/74bf581877a5971ad5aded7a4a2e4bc1.js"></script>




<p>The code snippet defines a list of <code>stopwords</code> and a list of <code>symbols</code>. These variables are used by the <code>tokenizeLyrics</code> function, which returns the list of words related to a lyric.</p>



<p>Besides, the <code>tokenizeLyrics</code> function uses the text transformation methods provided by ML.NET. In more detail, it creates a new <code>MLContext</code> object, provided by the <code>Microsoft.ML</code> namespace. Next, it runs the <code>mlContext.Data.LoadFromEnumerable</code> method to load the lyrics sequence into the <code>mlContext</code>. Finally, the function calls some utilities provided by the <code>mlContext.Transforms.Text</code> catalog:</p>


<ul>
<li><code>FeaturizeText(&quot;FeaturizedLyrics&quot;, &quot;Lyrics&quot;)</code> transforms the input text column, in this case the <code>Lyrics</code> field, into a featurized float array that represents counts of n-grams and char-grams;</li>
<li><code>NormalizeText(&quot;NormalizedLyrics&quot;, &quot;Lyrics&quot;)</code> normalizes the incoming text of the input column by changing case and removing diacritical marks, punctuation marks and/or numbers, and outputs the new text in the output column;</li>
<li><code>TokenizeWords(&quot;TokenizedLyric&quot;, &quot;NormalizedLyrics&quot;, symbols)</code> tokenizes the incoming text in the input column using the separators provided as input, and assigns the resulting tokens to the output column;</li>
<li><code>RemoveStopWords(&quot;LyricsWithNoCustomStopWords&quot;, &quot;TokenizedLyric&quot;, stopwords)</code> removes the given list of stopwords from the incoming token streams and outputs the filtered token streams in the output column;</li>
<li><code>RemoveDefaultStopWords(&quot;LyricsWithNoStopWords&quot;, &quot;TokenizedLyric&quot;)</code> behaves in the same way as <code>RemoveStopWords</code>, except that it uses a default list of stopwords, which is also available for several languages;</li>
</ul>


<p>It is also important to notice that in these calls the column on the right is the input column, and the one on the left receives the output. Furthermore, it is also possible to use the <code>.Append</code> method to compose a pipeline of multiple transforms, each of which contributes its own output column.</p>
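<p>The composition described above can be sketched like this (a partial pipeline using the column names from the list above; <code>lyrics</code> and <code>symbols</code> are assumed to be defined earlier, and the exact ML.NET method names may differ slightly between versions):</p>

```fsharp
open Microsoft.ML

let mlContext = MLContext()
let dataView = mlContext.Data.LoadFromEnumerable lyrics

// Each transform reads its input column (right) and adds an output column (left);
// Append chains the transforms into a single pipeline.
let pipeline =
    (mlContext.Transforms.Text.NormalizeText("NormalizedLyrics", "Lyrics"))
        .Append(mlContext.Transforms.Text.TokenizeWords("TokenizedLyric", "NormalizedLyrics", symbols))
        .Append(mlContext.Transforms.Text.RemoveDefaultStopWords("LyricsWithNoStopWords", "TokenizedLyric"))

// Fit the pipeline and materialize the transformed data
let transformed = pipeline.Fit(dataView).Transform(dataView)
```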



<p>Finally, the last step of the <code>tokenizeLyrics</code> function is to transform the data and put all the tokenized words together using the following instructions:</p>



<script src="https://gist.github.com/samueleresca/a97e368bf931a2499ccaaf200bc90ff6.js"></script>




<p>After that, it is possible to call the <code>tokenizeLyrics</code> function as follows:</p>



<script src="https://gist.github.com/samueleresca/eab2a771476cbaa7bfab5d4ab1ed4523.js"></script>




<p>The resulting chart shows the top 20 most frequent words present in all the lyrics:</p>



<figure class="wp-block-image"><img src="https://samueleresca.net/content/images/wordpress/2019/04/Screenshot-2019-04-20-at-13.36.35.png?fit=720%2C371&amp;ssl=1" alt="Data analysis using F# and Jupyter notebook" class="wp-image-4042" srcset="https://samueleresca.net/content/images/wordpress/2019/04/Screenshot-2019-04-20-at-13.36.35.png 1694w, https://samueleresca.net/content/images/wordpress/2019/04/Screenshot-2019-04-20-at-13.36.35-320x165.png 320w, https://samueleresca.net/content/images/wordpress/2019/04/Screenshot-2019-04-20-at-13.36.35-768x395.png 768w, https://samueleresca.net/content/images/wordpress/2019/04/Screenshot-2019-04-20-at-13.36.35-960x494.png 960w" sizes="(max-width: 1694px) 100vw, 1694px"></figure>



<p>Furthermore, it is also possible to check the top 20 most frequent words by <code>Genre</code> field using the following snippet:</p>



<script src="https://gist.github.com/samueleresca/fe60dc6b205133a22a8c9aca392f654f.js"></script>




<p>For example, in that case, the resulting chart shows the most frequent words in Hip-Hop lyrics:</p>



<figure class="wp-block-image"><img src="https://samueleresca.net/content/images/wordpress/2019/04/Screenshot-2019-04-20-at-13.41.31.png?fit=720%2C353&amp;ssl=1" alt="Data analysis using F# and Jupyter notebook" class="wp-image-4044" srcset="https://samueleresca.net/content/images/wordpress/2019/04/Screenshot-2019-04-20-at-13.41.31.png 1870w, https://samueleresca.net/content/images/wordpress/2019/04/Screenshot-2019-04-20-at-13.41.31-320x157.png 320w, https://samueleresca.net/content/images/wordpress/2019/04/Screenshot-2019-04-20-at-13.41.31-768x376.png 768w, https://samueleresca.net/content/images/wordpress/2019/04/Screenshot-2019-04-20-at-13.41.31-960x470.png 960w" sizes="(max-width: 1870px) 100vw, 1870px"></figure>



<h2>Final thoughts</h2>



<p>This post provided some general knowledge around data analysis using Jupyter notebook and F#. It showed how Jupyter notebook can be used to quickly prototype and understand data models. Moreover, ML.NET provides the tools to perform feature engineering on our data and prepare the data model. In the next post, we will see how to train a model that predicts the genre from the song lyrics. The above example is available on GitHub at the following URL: <a href="https://github.com/samueleresca/LyricsClassifier">https://github.com/samueleresca/LyricsClassifier</a></p>



<h3>References:</h3>



<p><a href="https://www.kaggle.com/corizzi/lyrics-genre-analysis-machine-learning">https://www.kaggle.com/corizzi/lyrics-genre-analysis-machine-learning</a></p>



<p><a href="https://medium.com/luteceo-software-chemistry/statistical-analysis-using-f-and-jupyter-notebooks-2e2f31ee4cc1">https://medium.com/luteceo-software-chemistry/statistical-analysis-using-f-and-jupyter-notebooks-2e2f31ee4cc1</a></p>






<p><em>Cover image by </em><a href="https://www.benschneiderphoto.com/"><em>&#xA0;Benjamin Benschneider</em></a></p>
<!--kg-card-end: html-->]]></content:encoded></item><item><title><![CDATA[Test .NET Core AWS Lambda]]></title><description><![CDATA[ The following post shows some techniques about test .NET Core AWS Lambda, more in specific, it focuses on testing AWS Lambda]]></description><link>https://samueleresca.net/testing-net-core-aws-lambda/</link><guid isPermaLink="false">63d9a4dc2bc5da0fb16ac188</guid><dc:creator><![CDATA[Samuele Resca]]></dc:creator><pubDate>Wed, 20 Feb 2019 21:27:36 GMT</pubDate><media:content url="https://samueleresca.net/content/images/wordpress/2019/02/cutty-sark-7-national-maritime-museum.jpg" medium="image"/><content:encoded><![CDATA[<!--kg-card-begin: html--><img src="https://samueleresca.net/content/images/wordpress/2019/02/cutty-sark-7-national-maritime-museum.jpg" alt="Test .NET Core AWS Lambda"><p>The following post shows some techniques about test .NET Core AWS Lambda, more in specific, it focuses on testing .NET Core AWS Lambda using the <a href="https://github.com/aws/aws-lambda-dotnet/tree/master/Tools/LambdaTestTool">LambdaTestTool</a>. I&#x2019;ve already spoken about serverless, .NET Core, and AWS Lambda in the following article: <a href="https://samueleresca.net/2018/12/fast-growing-architectures-with-serverless-and-net-core/">Fast growing architectures with serverless and .NET Core</a>. This post will be focused more on the testing side.</p>
<h2>The testing issues in a serverless computing</h2>
<p>As said in the <a href="https://samueleresca.net/2018/12/fast-growing-architectures-with-serverless-and-net-core/">previous post about serverless</a>, it is hard to test serverless applications. Especially when their complexity grows, it is hard to detect issues before running them in the cloud. Furthermore, from a development perspective, it is hard to deal with problems if we need to re-deploy our services every time to verify the result.</p>
<h2>Existing testing tools</h2>
<p>Some existing testing tools help us deal with serverless systems. Speaking about AWS, tools like <a href="https://github.com/localstack/localstack">https://github.com/localstack/localstack</a> provide a way to emulate the AWS stack on your machine, either by installing it or by running a Docker image. The drawback of these frameworks is that they consume a lot of resources, and they are not always suitable for integration testing.</p>
<h2>Testing lambdas using LambdaTestTool</h2>
<p>The <a href="https://github.com/aws/aws-lambda-dotnet/tree/master/Tools/LambdaTestTool">LambdaTestTool</a> is a utility produced by the AWS .NET team which provides a useful and lightweight approach to Lambda testing. Furthermore, it provides a way to debug your lambda locally by triggering input events and attaching the debugger of your preferred IDE or code editor.</p>
<p>Recently, the AWS .NET Team also released a new version of the&#xA0;<a href="https://github.com/aws/aws-lambda-dotnet/tree/master/Tools/LambdaTestTool">LambdaTestTool</a>&#xA0;with the following PR:</p>
<p><a href="https://github.com/aws/aws-lambda-dotnet/pull/364">https://github.com/aws/aws-lambda-dotnet/pull/364</a></p>
<p>which also enables debugging for lambdas that use&#xA0;the <a href="https://serverless.com">serverless framework</a>.</p>
<h2>Installation</h2>
<p>The&#xA0;<a href="https://github.com/aws/aws-lambda-dotnet/tree/master/Tools/LambdaTestTool">LambdaTestTool</a> is implemented as a dotnet tool (I&#x2019;ve already written about dotnet tools in the following post: <a href="https://samueleresca.net/2019/02/artless-http-server-using-net-core/">Artless HTTP server using .NET Core</a>). It is cross-platform and can be installed using the following command:</p>
<p><code>dotnet tool install -g Amazon.Lambda.TestTool-2.1</code></p>
<p>Moreover,&#xA0;it is possible to update the tool to the latest version using the following command:</p>
<p><code>dotnet tool update -g Amazon.Lambda.TestTool-2.1</code></p>
<p>The above-mentioned commands install the dotnet tool on your local machine. Afterwards, it is possible to use it to run the lambda locally by configuring your IDE or code editor properly.</p>
<h2>Debug the lambda</h2>
<p>The&#xA0;<a href="https://github.com/aws/aws-lambda-dotnet/tree/master/Tools/LambdaTestTool">LambdaTestTool</a> runs an ASP.NET instance which serves the following interface:</p>
<p><img class="aligncenter size-large wp-image-3953" src="https://samueleresca.net/content/images/wordpress/2019/02/Screenshot-2019-02-19-at-22.22.40-960x761.png" alt="Test .NET Core AWS Lambda" width="720" height="571" srcset="https://samueleresca.net/content/images/wordpress/2019/02/Screenshot-2019-02-19-at-22.22.40-960x761.png 960w, https://samueleresca.net/content/images/wordpress/2019/02/Screenshot-2019-02-19-at-22.22.40-320x254.png 320w, https://samueleresca.net/content/images/wordpress/2019/02/Screenshot-2019-02-19-at-22.22.40-768x609.png 768w, https://samueleresca.net/content/images/wordpress/2019/02/Screenshot-2019-02-19-at-22.22.40.png 2004w" sizes="(max-width: 720px) 100vw, 720px"></p>
<p>The UI provides a way to select the <em>config file</em> of the lambda, the <em>function</em> in the project to execute, the <em>credentials</em>, the <em>AWS region</em>, and the message to send as the input of the function.</p>
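<p>The <em>config file</em> mentioned above is typically the <code>aws-lambda-tools-defaults.json</code> file at the root of the lambda project. As a rough sketch (the values below, in particular the handler string, are illustrative assumptions, not taken from a real project), it looks like this:</p>
<pre><code>{
  "profile": "default",
  "region": "eu-west-1",
  "configuration": "Release",
  "framework": "netcoreapp2.1",
  "function-runtime": "dotnetcore2.1",
  "function-handler": "MyLambda::MyLambda.Function::FunctionHandler"
}</code></pre>
<p>The <code>function-handler</code> field is what the tool uses to locate the function to invoke, so it is worth double-checking it matches the assembly, namespace, class, and method of your lambda.</p>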
<p>It is possible to configure the&#xA0;<a href="https://github.com/aws/aws-lambda-dotnet/tree/master/Tools/LambdaTestTool">LambdaTestTool</a>&#xA0;in multiple editors, such as Visual Studio, Visual Studio Code, Rider, and Visual Studio for Mac. The next sections show the configurations for Visual Studio Code and Rider.</p>
<h3>Visual Studio Code</h3>
<p>In Visual Studio Code, the tool can be configured by adding the following section to the <code>launch.json</code> file:</p>
<script src="https://gist.github.com/samueleresca/0e86f841e93ab3041499c8c816054b7a.js"></script>
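<p>As a sketch of what such a <code>launch.json</code> section might contain (the <code>program</code> path below is an assumption based on the default dotnet tool install location; adjust it to your machine and tool version):</p>
<pre><code>{
  "name": ".NET Core Lambda Test Tool",
  "type": "coreclr",
  "request": "launch",
  "program": "&lt;home-directory&gt;/.dotnet/tools/dotnet-lambda-test-tool-2.1",
  "args": [],
  "cwd": "${workspaceFolder}",
  "env": {}
}</code></pre>
<p>Setting <code>cwd</code> to the lambda project folder lets the tool pick up the project&#x2019;s config file automatically.</p>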
<h3>Rider</h3>
<p>In the case of the Rider IDE, it is possible to use the tool by creating a new run configuration:</p>
<p><img class="aligncenter  wp-image-3945" src="https://samueleresca.net/content/images/wordpress/2019/02/Screenshot-2019-02-18-at-08.41.08-960x610.png" alt="Test .NET Core AWS Lambda" width="464" height="295" srcset="https://samueleresca.net/content/images/wordpress/2019/02/Screenshot-2019-02-18-at-08.41.08-960x610.png 960w, https://samueleresca.net/content/images/wordpress/2019/02/Screenshot-2019-02-18-at-08.41.08-320x203.png 320w, https://samueleresca.net/content/images/wordpress/2019/02/Screenshot-2019-02-18-at-08.41.08-768x488.png 768w" sizes="(max-width: 464px) 100vw, 464px"></p>
<p>The <em>Exe path</em> field refers to the path to the <code>dotnet tool</code>:</p>
<p><code>&lt;home-directory&gt;/.dotnet/tools/.store/amazon.lambda.testtool-2.1/&lt;nuget-version&gt;/amazon.lambda.testtool-2.1/&lt;nuget-version&gt;/tools/netcoreapp2.1/any/Amazon.Lambda.TestTool.dll</code></p>
<p>The <em>working directory</em> field is the root folder of the lambda project.</p>
<h2>Summary</h2>
<p>It is possible to get more information about the <em>LambdaTestTool</em> on GitHub:&#xA0;<a href="https://github.com/aws/aws-lambda-dotnet/tree/master/Tools/LambdaTestTool">https://github.com/aws/aws-lambda-dotnet/tree/master/Tools/LambdaTestTool</a>. The tool is still in a preview phase, but it is a very useful and fast way to verify and test a .NET Core AWS Lambda. What&#x2019;s still missing is a quick way to integrate the package and run it directly from code, for example as part of a unit testing process.</p>
<p>Cover image by&#xA0;<a href="https://www.artfund.org/">The Cutty Sark &#x2013; &#xA9; National Maritime Museum</a></p>
<!--kg-card-end: html-->]]></content:encoded></item></channel></rss>