<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[Prémices Kamasuwa]]></title><description><![CDATA[Full-stack Software Engineer.]]></description><link>https://nkpremices.com/</link><image><url>https://nkpremices.com/favicon.png</url><title>Prémices Kamasuwa</title><link>https://nkpremices.com/</link></image><generator>Ghost 4.48</generator><lastBuildDate>Wed, 18 Mar 2026 09:20:30 GMT</lastBuildDate><atom:link href="https://nkpremices.com/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[Building a Cron Job Scheduler Using Redis and Node.js]]></title><description><![CDATA[Discover how to build an advanced event-driven scheduler using Redis’ key expiration feature for real-time task management.]]></description><link>https://nkpremices.com/building-a-cron-job-scheduler-using-redis-and-node-js/</link><guid isPermaLink="false">66d2fa91fe48690568456873</guid><category><![CDATA[2024]]></category><category><![CDATA[Blog]]></category><category><![CDATA[Tech]]></category><category><![CDATA[SystemDesign]]></category><category><![CDATA[Typescript]]></category><category><![CDATA[Javascript]]></category><category><![CDATA[Redis]]></category><category><![CDATA[NodeJS]]></category><category><![CDATA[Scheduling]]></category><category><![CDATA[Cron]]></category><category><![CDATA[TaskQueue]]></category><dc:creator><![CDATA[Prémices N. 
Kamasuwa]]></dc:creator><pubDate>Sat, 31 Aug 2024 12:54:51 GMT</pubDate><media:content url="https://nkpremices.com/content/images/2024/08/Screenshot-2024-08-31-at-15.45.59.png" medium="image"/><content:encoded><![CDATA[<img src="https://nkpremices.com/content/images/2024/08/Screenshot-2024-08-31-at-15.45.59.png" alt="Building a Cron Job Scheduler Using Redis and Node.js"><p>Authors: &#xA0;<a href="https://github.com/truestbyheart">Daniel Charles Mwangila</a> and <a href="https://nkpremices.com/about">Pr&#xE9;mices Kamasuwa </a></p><!--kg-card-begin: html--><div style="display: flex; flex-direction: row; gap: 1.5rem;">
  <div class="github-card" data-github="truestbyheart" data-width="400" data-height="151" data-theme="default"></div>
<script src="//cdn.jsdelivr.net/github-cards/latest/widget.js"></script>
<div class="github-card" data-github="nkpremices" data-width="400" data-height="151" data-theme="default"></div>
</div><!--kg-card-end: html--><h2 id="introduction">Introduction</h2><p>In this article, we&apos;ll explore how to build a robust and flexible scheduling system by combining the strengths of Redis and Node.js. This approach not only replicates the functionality of traditional cron jobs but also enhances it by leveraging Redis to manage task scheduling through expiration events. Redis, known for its high-speed in-memory data storage, becomes an optimal solution for real-time applications that demand precise and efficient task execution.</p><p>Throughout this guide, we&apos;ll walk through the development of an event-driven scheduling system that utilizes Redis&apos; key expiration feature to trigger tasks at predefined intervals. By the end, you will have a deeper understanding of how Redis can be harnessed for more complex real-time scenarios, enabling you to build systems that go beyond basic task scheduling.</p><h2 id="prerequisites"><strong>Prerequisites</strong></h2><p>To follow along with this tutorial, ensure you have the following:</p><ul><li><strong>Node.js</strong> is installed on your local machine. If not, follow <a href="https://nodejs.org/en/">this guide to install Node.js</a>.</li><li><strong>Redis</strong> is installed locally or accessible via a remote server. You can find installation instructions <a href="https://redis.io/download">here</a>.</li><li>Familiarity with <strong>TypeScript</strong> and <strong>Node.js</strong>. 
If you need to brush up, consider reviewing <a href="https://www.typescriptlang.org/docs/">the TypeScript documentation</a> and <a href="https://nodejs.org/en/docs/">Node.js documentation</a>.</li><li>A code editor, such as <a href="https://code.visualstudio.com/download">Visual Studio Code</a>.</li><li>Basic knowledge of Redis operations, particularly how to set and get keys and configure keyspace notifications.</li></ul><h2 id="step-1-%E2%80%94-setting-up-your-nodejs-application"><strong>Step 1 &#x2014; Setting Up Your Node.js Application</strong></h2><p>Let&apos;s start by setting up the Node.js environment for our project.</p><h3 id="11-create-a-new-project-directory"><strong>1.1. Create a New Project Directory</strong></h3><p>First, create a directory for your project and navigate into it:</p><!--kg-card-begin: markdown--><pre><code class="language-shell">mkdir redis-scheduler
cd redis-scheduler
</code></pre>
<!--kg-card-end: markdown--><h3 id="12-initialize-a-new-nodejs-project"><strong>1.2. Initialize a New Node.js Project</strong></h3><p>Initialize a new Node.js project using the following command:</p><!--kg-card-begin: markdown--><pre><code class="language-bash">npm init -y
</code></pre>
<!--kg-card-end: markdown--><p>This command creates a <code>package.json</code> file with default settings.</p><h3 id="13-install-necessary-nodejs-packages"><strong>1.3. Install Necessary Node.js Packages</strong></h3><p>Install the required packages for this tutorial:</p><!--kg-card-begin: markdown--><pre><code class="language-bash">npm install redis dotenv date-fns
</code></pre>
<!--kg-card-end: markdown--><p>These packages include:</p><ul><li><strong>redis</strong>: For interacting with Redis.</li><li><strong>dotenv</strong>: To manage environment variables.</li><li><strong>date-fns</strong>: For date manipulation.</li></ul><p>Next, install the development dependencies:</p><!--kg-card-begin: markdown--><pre><code class="language-bash">npm install @types/node ts-node typescript
</code></pre>
<!--kg-card-end: markdown--><p>These packages will help you work with TypeScript and Node.js more effectively.</p><h3 id="14-initialize-a-typescript-project"><strong>1.4. Initialize a TypeScript Project</strong></h3><p>Initialize a TypeScript configuration file:</p><!--kg-card-begin: markdown--><pre><code class="language-bash">tsc --init
</code></pre>
<!--kg-card-end: markdown--><p>This command creates a <code>tsconfig.json</code> file in your project root, enabling TypeScript support.</p><h3 id="15-project-structure"><strong>1.5. Project Structure</strong></h3><p>To organize your code effectively, set up the following directory structure:</p><pre><code class="language-bash">redis-scheduler/
&#x2502;
&#x251C;&#x2500;&#x2500; src/
&#x2502;   &#x251C;&#x2500;&#x2500; interfaces/
&#x2502;   &#x2502;   &#x2514;&#x2500;&#x2500; Task.ts
&#x2502;   &#x251C;&#x2500;&#x2500; utils/
&#x2502;   &#x2502;   &#x2514;&#x2500;&#x2500; redisClient.ts
&#x2502;   &#x251C;&#x2500;&#x2500; handlers/
&#x2502;   &#x2502;   &#x251C;&#x2500;&#x2500; classHandler.ts
&#x2502;   &#x2502;   &#x2514;&#x2500;&#x2500; functionHandler.ts
&#x2502;   &#x251C;&#x2500;&#x2500; scheduler.ts
&#x2502;   &#x2514;&#x2500;&#x2500; listener.ts
&#x2502;
&#x251C;&#x2500;&#x2500; .env
&#x251C;&#x2500;&#x2500; package.json
&#x251C;&#x2500;&#x2500; tsconfig.json
&#x2514;&#x2500;&#x2500; README.md</code></pre><p>This structure helps keep your project organized, making it easier to maintain and extend in the future.</p><h2 id="step-2-%E2%80%94-implementing-the-task-interface"><strong>Step 2 &#x2014; Implementing the Task Interface</strong></h2><p>Create an interface to define the structure of a task. This interface ensures that every task follows a consistent format, which is crucial for the scheduler to function correctly.</p><p>Create a new file <code>src/interfaces/Task.ts</code>:</p><!--kg-card-begin: markdown--><pre><code class="language-javascript">export interface ExecutionPath {
    file_path: string;   // The path to the file containing the task&apos;s function
    class_name: string;  // The class name where the function resides
    function_name: string; // The specific function to execute
}

export interface Task {
    task_id: string;     // Unique identifier for the task
    title: string;       // Human-readable name for the task
    interval: string;    // The interval at which the task should run (e.g., &apos;5m&apos;, &apos;1h&apos;)
    last_run?: Date;     // The last time the task was executed
    next_run: Date;      // The next scheduled run time
    execution_path: ExecutionPath; // Details on where and what to execute
}
</code></pre>
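To make the shape concrete, here is a hypothetical task object that satisfies this interface (the handler file and class names below are placeholders for illustration, not files the tutorial has created yet; the interfaces are repeated so the snippet is self-contained):

```typescript
interface ExecutionPath {
    file_path: string;
    class_name: string;
    function_name: string;
}

interface Task {
    task_id: string;
    title: string;
    interval: string;
    last_run?: Date;
    next_run: Date;
    execution_path: ExecutionPath;
}

const exampleTask: Task = {
    task_id: "3f2c8e1a-1111-2222-3333-444455556666", // placeholder UUID
    title: "Example Task",
    interval: "5m", // run every 5 minutes
    next_run: new Date(),
    execution_path: {
        file_path: "exampleHandler",  // placeholder file in src/handlers
        class_name: "ExampleHandler", // placeholder class
        function_name: "execute",     // placeholder method
    },
};

// Tasks are stored in Redis as JSON strings, so they must survive a round trip
const serialized = JSON.stringify(exampleTask);
const revived = JSON.parse(serialized);
console.log(revived.interval); // -> "5m"
```

Note that `Date` fields serialize to ISO strings, so `next_run` comes back from Redis as a string; this is harmless here because the scheduler recomputes run times from the `interval` string.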
<!--kg-card-end: markdown--><h3 id="explanation-of-task-interface-fields"><strong>Explanation of Task Interface Fields:</strong></h3><ul><li><strong>task_id</strong>: A unique identifier for the task, ensuring that each task can be tracked separately.</li><li><strong>title</strong>: A descriptive name for the task. Useful for logging and debugging.</li><li><strong>interval</strong>: Specifies how frequently the task should run. This could be in seconds (<code>s</code>), minutes (<code>m</code>), hours (<code>h</code>), days (<code>d</code>), or weeks (<code>w</code>).</li><li><strong>last_run</strong>: An optional field that records the last execution time, useful for monitoring.</li><li><strong>next_run</strong>: The next time the task is scheduled to run.</li><li><strong>execution_path</strong>: Specifies where to find the function to execute, including file path, class, and function names.</li></ul><h2 id="step-3-%E2%80%94-implementing-the-redis-client"><strong>Step 3 &#x2014; Implementing the Redis Client</strong></h2><p>To interact with Redis, we need to set up a client. This client will be used to set and get tasks and listen for expiration events.</p><p>Create a new file <code>src/utils/redisClient.ts</code>:</p><!--kg-card-begin: markdown--><pre><code class="language-javascript">import { createClient } from &quot;redis&quot;;
import dotenv from &quot;dotenv&quot;;
dotenv.config();

// The Redis connection URI is read from the REDIS_URI environment variable
export const redisClient = async () =&gt; {
    const client = createClient({ url: process.env.REDIS_URI })
        .on(&apos;error&apos;, (err) =&gt; console.error(`Redis Client Error: ${err}`))
        .on(&apos;connect&apos;, () =&gt; console.info(&apos;Connected to Redis&apos;))
        .on(&apos;ready&apos;, () =&gt; console.info(&apos;Redis is ready&apos;));

    await client.connect();
    return client;
};

// Function to retrieve data from Redis
export const getFromRedis = async (key: string): Promise&lt;{ [key: string]: any } | null&gt; =&gt; {
    const client = await redisClient();
    try {
        const value = await client.get(key);
        if (value) return JSON.parse(value);
        return null;
    } catch (error) {
        console.error(`Error getting key: ${key} from Redis: ${error}`);
        return null;
    } finally {
        await client.disconnect();
    }
}

// Function to store data in Redis with an optional expiration time
export const setToRedis = async (key: string, value: string, expireIn?: number): Promise&lt;void&gt; =&gt; {
    const client = await redisClient();
    try {
        await client.set(key, value);
        if (expireIn) {
            await client.expire(key, expireIn);
        }
    } catch (error) {
        console.error(`Error setting key: ${key} to Redis: ${error}`);
    } finally {
        await client.disconnect();
    }
}
</code></pre>
<!--kg-card-end: markdown--><h3 id="explanation"><strong>Explanation:</strong></h3><ul><li><strong>redisClient</strong>: Initializes a connection to Redis and manages connection events.</li><li><strong>getFromRedis</strong>: Fetches a value from Redis by key and parses it from JSON.</li><li><strong>setToRedis</strong>: Sets a value in Redis with an optional expiration time, which is crucial for scheduling tasks to run at specific intervals.</li></ul><h2 id="step-4-%E2%80%94-implementing-the-scheduler-class"><strong>Step 4 &#x2014; Implementing the Scheduler Class</strong></h2><p>The scheduler class will handle scheduling logic, including calculating the next run time and executing tasks.</p><p>Create a new file <code>src/scheduler.ts</code>:</p><!--kg-card-begin: markdown--><pre><code class="language-javascript">import { addSeconds, addMinutes, addHours, addDays, addWeeks } from &apos;date-fns&apos;;
import path from &apos;path&apos;;
import { setToRedis, getFromRedis } from &apos;./utils/redisClient&apos;;
import { Task } from &apos;./interfaces/Task&apos;;

export class Scheduler {

    // Calculate the next run time based on the interval string
    private calculateNextRun(interval: string): { next_run: Date; interval: number } {
        const now = new Date();
        const match = interval.match(/^(\d+)(s|m|h|d|w)$/);

        if (!match) {
            throw new Error(&apos;Invalid interval format&apos;);
        }

        const value = parseInt(match[1], 10);
        const unit = match[2];
        let next_run;
        const seconds_per_unit = { s: 1, m: 60, h: 3600, d: 86400, w: 604800 };

        switch (unit) {
            case &apos;s&apos;: next_run = addSeconds(now, value); break;
            case &apos;m&apos;: next_run = addMinutes(now, value); break;
            case &apos;h&apos;: next_run = addHours(now, value); break;
            case &apos;d&apos;: next_run = addDays(now, value); break;
            case &apos;w&apos;: next_run = addWeeks(now, value); break;
            default: throw new Error(&apos;Unsupported time unit&apos;);
        }

        return {
            next_run: next_run,
            interval: value * seconds_per_unit[unit],
        };
    }

    // Update the next run time in Redis
    private async updateNextRun(task: Task) {
        const { next_run, interval } = this.calculateNextRun(task.interval);
        task.last_run = task.next_run;
        task.next_run = next_run;
        // Keep the persistent task record current, then re-arm the expiring shadow key
        await setToRedis(task.task_id, JSON.stringify(task));
        await setToRedis(`SCHEDS:${task.task_id}`, JSON.stringify(task), interval);
    }

    // Execute a function dynamically based on its path
    private async executeFunctionFromPath(task: Task) {
        const { file_path, class_name, function_name } = task.execution_path;
        try {
            const modulePath = path.resolve(&apos;./src/handlers&apos;, `${file_path}`);
            console.info(`Importing module from path: ${modulePath}`);
            const module = await import(modulePath);

            // Get the class and create an instance
            const instance = new module[class_name]();
            if (typeof instance[function_name] !== &apos;function&apos;) {
                throw new Error(`Function ${function_name} not found in class ${class_name}`);
            }

            // Execute the function
            console.info(&apos;Executing scheduled task function...&apos;);
            await instance[function_name]();
            console.info(&apos;Function executed successfully&apos;);
        } catch (error) {
            console.error(&apos;Error executing function from path:&apos;, error);
        }
    }

    // Main function to run the scheduler
    async runScheduler(scheds_key: string) {
        const task_id = scheds_key.split(&apos;:&apos;)[1];
        // The shadow key has just expired and is gone from Redis, so the task
        // must be read from its persistent key (the bare task id)
        const task = await getFromRedis(task_id) as Task;

        if (!task) {
            console.error(`Task with id: ${task_id} not found`);
        } else {
            await this.executeFunctionFromPath(task);
            await this.updateNextRun(task);
        }
    }
}

export const scheduler = new Scheduler();
</code></pre>
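The interval-to-seconds conversion is the part of `calculateNextRun` that feeds Redis' `expire`. Here is the same parsing rule as a standalone sketch (a hypothetical helper, written without date-fns so it can run in isolation):

```typescript
const SECONDS_PER_UNIT: { [unit: string]: number } = {
    s: 1, m: 60, h: 3600, d: 86400, w: 604800,
};

// Parse strings like "5m" or "2h" into a TTL in seconds
function intervalToSeconds(interval: string): number {
    const match = interval.match(/^(\d+)(s|m|h|d|w)$/);
    if (!match) {
        throw new Error("Invalid interval format");
    }
    return parseInt(match[1], 10) * SECONDS_PER_UNIT[match[2]];
}

console.log(intervalToSeconds("5m")); // 300
console.log(intervalToSeconds("2h")); // 7200
```

This TTL is what makes the shadow key expire at the right moment, which in turn drives the whole event loop.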
<!--kg-card-end: markdown--><h3 id="explanation-1"><strong>Explanation:</strong></h3><ul><li><strong>calculateNextRun</strong>: Calculates the next run time based on the given interval (e.g., &quot;5m&quot; for 5 minutes). It also calculates the interval in seconds for Redis&#x2019; <code>expire</code> function.</li><li><strong>updateNextRun</strong>: Updates the task&apos;s next run time in Redis to ensure it triggers again at the correct interval.</li><li><strong>executeFunctionFromPath</strong>: Dynamically loads and executes a function from a specified file and class path, providing flexibility in what tasks can be scheduled.</li><li><strong>runScheduler</strong>: The main function that orchestrates the retrieval and execution of scheduled tasks, then updates their next run time.</li></ul><h2 id="step-5-%E2%80%94-implementing-a-sample-task-handler"><strong>Step 5 &#x2014; Implementing a Sample Task Handler</strong></h2><p>For demonstration purposes, let&apos;s create a simple handler that outputs &quot;Hello from the handler&quot; to the console after a 5-second delay.</p><p>Create a new file <code>src/handlers/classHandler.ts</code>:</p><!--kg-card-begin: markdown--><pre><code class="language-javascript">export class HelloHandler {
    // Simulate a long-running task: wait 5 seconds, then log a message
    async execute(): Promise&lt;void&gt; {
        await new Promise((resolve) =&gt; setTimeout(resolve, 5000));
        console.log(&apos;Hello from the handler&apos;);
    }
}
</code></pre>
<!--kg-card-end: markdown--><h3 id="explanation-2"><strong>Explanation:</strong></h3><ul><li><strong>HelloHandler</strong>: A class with an <code>execute</code> method that simulates a delayed task by waiting 5 seconds before logging a message to the console.</li></ul><h2 id="step-6-%E2%80%94-implementing-the-redis-listener"><strong>Step 6 &#x2014; Implementing the Redis Listener</strong></h2><p>The Redis listener monitors key expiration events and triggers the scheduler when a key expires.</p><p>Create a new file <code>src/listener.ts</code>:</p><!--kg-card-begin: markdown--><pre><code class="language-javascript">import { redisClient } from &apos;./utils/redisClient&apos;;
import { scheduler } from &apos;./scheduler&apos;;

(async () =&gt; {
    const client = await redisClient();

    // Enable keyspace notifications for expired-key events
    await client.configSet(&apos;notify-keyspace-events&apos;, &apos;Ex&apos;);

    const sub = client.duplicate();
    await sub.connect();
    const expired_subKey = &apos;__keyevent@0__:expired&apos;;

    // Listen for expired-key events in database 0
    await sub.pSubscribe(expired_subKey, async (key) =&gt; {
        console.info(`[i] Key expired: ${key}`);
        await scheduler.runScheduler(key);
    });

    console.info(&apos;Redis listener set up and waiting for events...&apos;);
})();
</code></pre>
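If you prefer not to change server configuration from application code (the `configSet` call above requires the client to be allowed to run `CONFIG`), the same setting can be applied with redis-cli or persisted in `redis.conf`. In the `Ex` flag, `E` enables key-event channel notifications and `x` enables expired-key events, which is exactly what the `__keyevent@0__:expired` channel carries:

```shell
# Enable expired-key event notifications (equivalent to the configSet call in listener.ts)
redis-cli config set notify-keyspace-events Ex

# Confirm the setting took effect
redis-cli config get notify-keyspace-events
```

Note that keyspace notifications are off by default and are not persisted across restarts unless set in `redis.conf`.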
<!--kg-card-end: markdown--><h3 id="explanation-3"><strong>Explanation:</strong></h3><ul><li><strong>Redis Keyspace Notifications</strong>: Configures Redis to notify when keys expire. This is essential for our scheduling logic as Redis uses the <code>EXPIRE</code> command to manage key expiration.</li><li><strong>Listener Setup</strong>: Subscribes to expiration events and triggers the <code>runScheduler</code> function, which handles executing the scheduled task.</li></ul><h2 id="step-7-%E2%80%94-running-and-testing-the-scheduler"><strong>Step 7 &#x2014; Running and Testing the Scheduler</strong></h2><h3 id="71-configure-environment-variables"><strong>7.1. Configure Environment Variables</strong></h3><p>Ensure you have your Redis connection properly configured. Create a <code>.env</code> file in the project root if you haven&apos;t done so already, and add your Redis URI:</p><pre><code>REDIS_URI=redis://localhost:6379

</code></pre><p>Make sure to replace <code>localhost:6379</code> with the appropriate host and port if your Redis instance is hosted elsewhere.</p><h3 id="72-schedule-a-task-using-typescript-code"><strong>7.2. Schedule a Task Using TypeScript Code</strong></h3><p>Instead of using the Redis CLI, we will use a TypeScript script to add a task to Redis. This approach provides more flexibility and demonstrates how to automate task scheduling programmatically.</p><h3 id="step-by-step-guide-to-scheduling-a-task-via-typescript"><strong>Step-by-Step Guide to Scheduling a Task via TypeScript:</strong></h3><ol><li><strong>Create a TypeScript Script to Add a Task</strong>:</li></ol><p>Create a new file named <code>src/scheduleTask.ts</code> to automate the process of scheduling tasks in Redis:</p><!--kg-card-begin: markdown--><pre><code class="language-javascript">import { v4 as uuidv4 } from &apos;uuid&apos;;
import { setToRedis } from &apos;./utils/redisClient&apos;;
import { Task } from &apos;./interfaces/Task&apos;;

// Function to schedule a new task
const scheduleTask = async () =&gt; {
    // Generate a unique UUID for the task
    const taskId = uuidv4();

    // Create a new task object
    const newTask: Task = {
        task_id: taskId,
        title: &apos;Hello World Task&apos;,
        interval: &apos;5s&apos;,  // Adjust as needed for your testing
        next_run: new Date(),
        execution_path: {
            file_path: &apos;classHandler&apos;,  // This corresponds to src/handlers/classHandler.ts
            class_name: &apos;HelloHandler&apos;,
            function_name: &apos;execute&apos;,
        }
    };

    // Convert the task object to a JSON string
    const taskJSON = JSON.stringify(newTask);

    try {
        // Store the task in Redis without expiration
        await setToRedis(taskId, taskJSON);

        // Set a shadow key with expiration
        const intervalInSeconds = 5; // Example interval in seconds; modify as needed
        await setToRedis(`SCHEDS:${taskId}`, taskId, intervalInSeconds);

        console.info(`Task scheduled successfully with ID: ${taskId}`);
    } catch (error) {
        console.error(&apos;Error scheduling task:&apos;, error);
    }
};

// Execute the function to schedule the task
scheduleTask();
</code></pre>
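Before starting the listener, you can verify what the script wrote to Redis directly (this assumes a local Redis instance; replace `<uuid>` with the task ID the script logged):

```shell
# The persistent task record, stored as JSON under the bare UUID
redis-cli get "<uuid>"

# Remaining time-to-live of the shadow key; should count down from 5
redis-cli ttl "SCHEDS:<uuid>"
```

A `ttl` of `-2` means the shadow key has already expired, which is the event the listener reacts to.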
<!--kg-card-end: markdown--><h3 id="explanation-of-the-script"><strong>Explanation of the Script:</strong></h3><ul><li><strong>UUID Generation</strong>: We use the <code>uuid</code> package to generate a unique identifier (<code>task_id</code>) for each task, ensuring no two tasks have the same ID.</li><li><strong>Task Object Creation</strong>: A <code>Task</code> object is created with all the necessary details, including the interval and execution path.</li><li><strong>Storing the Task in Redis</strong>:<ul><li>The task itself is stored under its bare UUID without an expiration time, so it remains available as the persistent record.</li><li>A shadow key, <code>SCHEDS:&lt;uuid&gt;</code>, is created with an expiration time corresponding to the task&apos;s interval. This shadow key will expire and trigger the listener.</li></ul></li><li><strong>Expiration Handling</strong>: The <code>intervalInSeconds</code> specifies when the shadow key expires, causing Redis to emit an event that our listener can capture.</li></ul><ol start="2"><li><strong>Install the <code>uuid</code> Package</strong>:</li></ol><p>If you haven&apos;t already installed the <code>uuid</code> package, you can add it to your project by running:</p><pre><code class="language-bash">npm install uuid

</code></pre><p>Also, install the TypeScript types for <code>uuid</code>:</p><pre><code class="language-bash">npm install @types/uuid --save-dev

</code></pre><ol start="3"><li><strong>Run the TypeScript Script</strong>:</li></ol><p>Execute the script to add the task to Redis:</p><pre><code class="language-bash">ts-node src/scheduleTask.ts

</code></pre><h3 id="73-start-the-redis-listener"><strong>7.3. Start the Redis Listener</strong></h3><p>With the task now scheduled, start the Redis listener in a separate terminal window to monitor for expiration events:</p><pre><code class="language-bash">ts-node src/listener.ts

</code></pre><h3 id="74-observe-the-output"><strong>7.4. Observe the Output</strong></h3><p>After the interval specified in the shadow key (<code>SCHEDS:&lt;uuid&gt;</code>) expires, you should see the task execution output:</p><pre><code>[i] Key expired: SCHEDS:&lt;uuid&gt;
Importing module from path: &lt;path-to-classHandler&gt;
Hello from the handler

</code></pre><p>This output confirms that the Redis listener detected the key expiration and successfully executed the task associated with that key.</p><h3 id="conclusion"><strong>Conclusion</strong></h3><p>Using a TypeScript script to add tasks to Redis programmatically gives you greater flexibility and control over task scheduling. This method demonstrates how to integrate Redis-based scheduling into your applications, leveraging UUIDs for unique task identification and shadow keys for triggering task execution. You can now expand on this foundation to build more complex scheduling systems or integrate it into a broader application framework.</p>]]></content:encoded></item><item><title><![CDATA[Ensuring Data Integrity in Real-time Synchronization: A Phoenix LiveView Tale]]></title><description><![CDATA[Tackling duplicate webhook calls in #Phoenix with a Deduplicator! Learn how we ensure data integrity and system reliability with Elixir's GenServer and ETS for efficient deduplication. Dive into my journey from challenge to solution. #ElixirLang #WebDevelopment]]></description><link>https://nkpremices.com/eliminating-duplicate-requests-in-phoenix-a-journey-to-idempotent-webhooks/</link><guid isPermaLink="false">65cfa1effe48690568456672</guid><category><![CDATA[Blog]]></category><category><![CDATA[2024]]></category><category><![CDATA[Distributed Systems]]></category><category><![CDATA[Elixir]]></category><category><![CDATA[Phoenix]]></category><category><![CDATA[GenServer]]></category><category><![CDATA[SystemDesign]]></category><dc:creator><![CDATA[Prémices N. 
Kamasuwa]]></dc:creator><pubDate>Fri, 16 Feb 2024 19:27:43 GMT</pubDate><media:content url="https://nkpremices.com/content/images/2024/02/4f5b9baf-2153-41e9-ad61-bdd6b5a14d09.webp" medium="image"/><content:encoded><![CDATA[<img src="https://nkpremices.com/content/images/2024/02/4f5b9baf-2153-41e9-ad61-bdd6b5a14d09.webp" alt="Ensuring Data Integrity in Real-time Synchronization: A Phoenix LiveView Tale"><p>In the dynamic landscape of web applications, particularly those dealing with real-time data synchronization between systems, the challenge of handling duplicate requests is not merely theoretical. </p><p>This blog post delves into a complex issue I encountered while working on a Phoenix application designed for bidirectional data synchronization between two online enterprise platforms. My journey through identifying and solving the problem of duplicate webhook calls illustrates the importance of idempotent operations in maintaining data integrity and system reliability.</p><h2 id="the-challenge-of-synchronization">The Challenge of Synchronization</h2><p>The Phoenix-based OTP application was designed to facilitate seamless real-time data synchronization between two systems. Leveraging Elixir&apos;s GenServers for asynchronous data processing, I encountered an unexpected hurdle: <strong>duplicate webhook calls</strong>, threatening the central database&apos;s integrity by risking duplicate records.</p><h3 id="encountering-duplication">Encountering Duplication</h3><p>The discovery of webhook events firing multiple times for the same data underscored a significant threat to data consistency. My initial architecture, though efficient, lacked a robust mechanism to prevent the processing of duplicate requests.</p><h2 id="crafting-the-solution">Crafting the Solution</h2><p>The solution required a blend of creativity and technical expertise. 
My goal was to implement a deduplication mechanism that could reliably identify duplicate requests, ensuring that each unique piece of data was processed exactly once. So I created a module that I will be calling <code>Deduplicator</code> moving forward, just for the sake of reference.</p><h3 id="the-genesis-of-the-deduplicator-module">The Genesis of the Deduplicator module</h3><p><code>Deduplicator</code> emerged from the necessity to intercept and evaluate incoming webhook calls before proceeding with any data manipulation. The module&apos;s design was centered around generating unique identifiers for each request based on the request&apos;s payload. By serializing the <code>entity</code> part of the request and hashing it, I could create a distinctive fingerprint for each operation.</p><h4 id="tutorial-implementing-the-deduplicator-module">Tutorial: Implementing the Deduplicator module</h4><p>Here&apos;s how I brought <code>Deduplicator</code> to life, step by step:</p><ol><li><strong>Unique Identifier Generation</strong>: </li></ol><p>For each incoming request, serialize the <code>entity</code> payload into a JSON string and generate a SHA-256 hash. This hash serves as a unique identifier, encapsulating the essence of the request.<br>Below is a code snippet for the unique identifier generator function:</p><!--kg-card-begin: markdown--><pre><code class="language-elixir">defp generate_unique_id(entity) do
    encoded_entity =
      entity
      |&gt; Jason.encode!()

    :crypto.hash(:sha256, encoded_entity)
    |&gt; Base.encode16()
end
</code></pre>
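The useful property here is determinism: the same decoded payload always collapses to the same fingerprint. The sketch below illustrates this inline (for illustration only; in the module the logic lives inside the private `generate_unique_id/1`):

```elixir
# Deterministic fingerprinting: the same term in yields the same hash out
entity = %{"id" => 42, "name" => "Order"}

id1 = :crypto.hash(:sha256, Jason.encode!(entity)) |> Base.encode16()
id2 = :crypto.hash(:sha256, Jason.encode!(entity)) |> Base.encode16()

id1 == id2  # => true: identical payloads produce one identifier
```

Because the hash is computed over Jason's serialization of the decoded map rather than over the raw request body, two deliveries of the same payload produce the same identifier even if the raw JSON differs in whitespace.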
<!--kg-card-end: markdown--><p>2. <strong>Implementation of the <code>Deduplicator</code> Module</strong>:</p><p>This module, leveraging Elixir&apos;s GenServer and ETS (Erlang Term Storage), is designed to ensure idempotent operations, preventing duplicate data processing. Here&apos;s a deeper dive into its implementation and integration within our Phoenix application. </p><p><strong><u>The GenServer Foundation</u></strong></p><p>The <code>Deduplicator</code> module begins its life as a GenServer, a cornerstone of Elixir applications for maintaining state and executing background work asynchronously. Using GenServer allows <code>Deduplicator</code> to run continuously in the background, monitoring for duplicate requests.</p><!--kg-card-begin: markdown--><pre><code class="language-elixir">defmodule Deduplicator do
  use GenServer

  def start_link(_opts) do
    GenServer.start_link(__MODULE__, %{}, name: __MODULE__)
  end
end
</code></pre>
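For the GenServer to run continuously it has to be started somewhere, typically under the application's supervision tree. A minimal sketch, assuming a standard Phoenix application module (`MyApp` and `MyAppWeb.Endpoint` are placeholder names):

```elixir
# In MyApp.Application (placeholder app name), add Deduplicator to the children:
children = [
  MyAppWeb.Endpoint,
  Deduplicator
]

Supervisor.start_link(children, strategy: :one_for_one)
```

`use GenServer` generates a default `child_spec/1`, so listing the module name is enough for the supervisor to call `start_link/1`.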
<!--kg-card-end: markdown--><p><strong><u>The <code>init/1</code> function</u></strong></p><p>It serves as the foundational setup for the <code>Deduplicator</code> module. Upon the GenServer&apos;s initialization, this function is called to perform essential setup tasks crucial for the module&apos;s operation.</p><!--kg-card-begin: markdown--><pre><code class="language-elixir">def init(_) do
  :ets.new(:dedup_table, [:set, :public, :named_table])
  {:ok, %{}}
end
</code></pre>
<!--kg-card-end: markdown--><p><em>The line <code>:ets.new(:dedup_table, [:set, :public, :named_table])</code> is instrumental in establishing an Erlang Term Storage (ETS) table named <code>:dedup_table</code>. This table is configured with a few options:</em></p><ul><li><em><code>:set</code>: This option ensures that the table behaves as a set, meaning each entry is unique based on its key. This is crucial for our deduplication logic, as it allows us to store each request&apos;s unique identifier without duplicates.</em></li><li><em><code>:public</code>: This option makes the table accessible to all processes, enabling different parts of the application to query or update the deduplication status of requests.</em></li><li><em><code>:named_table</code>: This allows the table to be referenced by its name, <code>:dedup_table</code>, facilitating easier access throughout the application.</em></li></ul><p><strong><u>Marking Requests as Processed</u></strong></p><p>When a request is processed, its unique identifier is stored in the ETS table along with the current system time. This marks the request as processed, preventing future duplications.</p><!--kg-card-begin: markdown--><pre><code class="language-elixir">def mark_as_processed(unique_id) do
  :ets.insert(:dedup_table, {unique_id, :erlang.system_time()})
end
</code></pre>
<!--kg-card-end: markdown--><p>The <code>mark_as_processed/1</code> function is a pivotal part of the <code>Deduplicator</code> module, encapsulating the mechanism that records the processing of requests to prevent duplicate handling. This function demonstrates an effective use of Elixir&apos;s Erlang Term Storage (ETS) to maintain the idempotency of operations within our application.</p><p>At the heart of this function lies the <code>:ets.insert/2</code> call, which adds a new record into the <code>:dedup_table</code> ETS table. Each record is a tuple consisting of two elements: the <code>unique_id</code> of the request and the current system time captured by <code>:erlang.system_time()</code>.</p><ul><li><em><strong>Unique Identifier</strong>: The <code>unique_id</code> serves as the key for the record. It is a hash derived from the request&apos;s payload, ensuring that each request can be uniquely identified based on its content. This uniqueness is crucial for detecting and preventing duplicate processing of the same request.</em></li><li><em><strong>Timestamp</strong>: The inclusion of the current system time as the second element of the tuple serves a dual purpose. First, it timestamps when the request was processed, providing traceability. Second, it facilitates the cleanup process, allowing the system to determine which records are old and should be removed based on their age.</em></li></ul><h4 id="checking-for-duplicates"><u>Checking for Duplicates</u></h4><p>Before processing any request, <code>Deduplicator</code> checks the ETS table to see if the request&apos;s unique identifier already exists, indicating it has been processed.</p><!--kg-card-begin: markdown--><pre><code class="language-elixir">def already_processed?(unique_id) do
  case :ets.lookup(:dedup_table, unique_id) do
    [{^unique_id, _timestamp}] -&gt; true
    _ -&gt; false
  end
end
</code></pre>
<!--kg-card-end: markdown--><p>The <code>already_processed?/1</code> function is a critical component of the Deduplicator module, serving as the gatekeeper in the deduplication strategy. This function scrutinizes requests to determine if they have been processed before, thus preventing redundant operations on the same data. </p><p><em>Here&apos;s a closer look at its implementation and significance:</em></p><ul><li><em><strong>ETS Lookup</strong>: The function begins with an <code>:ets.lookup/2</code> call, querying the <code>:dedup_table</code> ETS table for a record matching the provided <code>unique_id</code>. This <code>unique_id</code> is a hash derived from the request&apos;s payload, ensuring each request can be uniquely identified.</em></li><li><em><strong>Match Found</strong>: If the lookup returns a tuple matching the <code>unique_id</code>, the function interprets this as the request having been processed before. The presence of this record in the table indicates that the specific data payload associated with this <code>unique_id</code> has already been handled, signaling the function to return <code>true</code>.</em></li><li><em><strong>No Match Found</strong>: Conversely, if no matching record is found in the ETS table, the function concludes that the request has not been processed previously and returns <code>false</code>. This outcome indicates that it is safe to proceed with processing the request, as there is no risk of duplicating effort or data.</em></li></ul><h4 id="cleaning-up-ets-to-save-up-on-memory-usage"><u>Cleaning up ETS to reduce memory usage</u></h4><p>In any application that relies on in-memory storage for rapid data access and manipulation, managing memory usage efficiently is paramount. This is particularly true for our Deduplicator module, which utilizes Erlang Term Storage (ETS) to keep track of processed requests and prevent duplicates. 
However, without proper management, the memory consumed by the ETS table could grow indefinitely, potentially degrading system performance over time. To address this concern, we&apos;ve implemented a cleanup mechanism designed to periodically remove old entries from the ETS table, thereby conserving memory and maintaining optimal performance.</p><h4 id="implementing-periodic-cleanup">Implementing Periodic Cleanup</h4><p>The cleanup process is orchestrated through two primary functions: <code>schedule_cleanup/0</code> and <code>handle_info/2</code>. Here&apos;s how they work together to ensure the ETS table remains efficient and does not grow unbounded:</p><!--kg-card-begin: markdown--><pre><code class="language-elixir">defp schedule_cleanup do
  Process.send_after(self(), :cleanup, @cleanup_interval)
end
</code></pre>
<!--kg-card-end: markdown--><ul><li><em><strong>Scheduling Cleanup</strong>: The <code>schedule_cleanup/0</code> function leverages <code>Process.send_after/3</code> to schedule a message (<code>:cleanup</code>) to be sent to the GenServer itself after a predefined interval (<code>@cleanup_interval</code>). This periodic messaging acts as a trigger for the cleanup operation, ensuring that the process is automatically repeated at regular intervals.</em></li></ul><h4 id="handling-the-cleanup-process">Handling the Cleanup Process</h4><p>When the GenServer receives the <code>:cleanup</code> message, it triggers the <code>handle_info/2</code> function, which is responsible for the actual cleanup logic:</p><!--kg-card-begin: markdown--><pre><code class="language-elixir">def handle_info(:cleanup, state) do
  current_time = :erlang.system_time(:millisecond)

  :ets.tab2list(:dedup_table)
  |&gt; Enum.each(fn
    {id, timestamp} when current_time - timestamp &gt; @ttl -&gt;
      :ets.delete(:dedup_table, id)

    _ -&gt;
      :noop
  end)

  schedule_cleanup()
  {:noreply, state}
end
</code></pre>
<!--kg-card-end: markdown--><ul><li><em><strong>Executing Cleanup</strong>: Upon receiving the <code>:cleanup</code> message, this function retrieves all entries from the <code>:dedup_table</code> ETS table and iterates over them. Each entry is assessed to determine if its timestamp (indicating when it was added to the table) is older than the allowed time-to-live (<code>@ttl</code>). If an entry is found to be older, it is removed from the table, freeing up the memory it consumed.</em></li><li><em><strong>Recurrence of Cleanup</strong>: After performing the cleanup, the function calls <code>schedule_cleanup/0</code> again to ensure that the cleanup operation continues to run at regular intervals, thus maintaining the ongoing efficiency of the ETS table.</em></li></ul><h4 id="integrating-cleanup-with-initialization">Integrating Cleanup with Initialization</h4><p>To kickstart the cleanup process when the <code>Deduplicator</code> GenServer is initialized, we include the <code>schedule_cleanup/0</code> call within the <code>init/1</code> function:</p><!--kg-card-begin: markdown--><pre><code class="language-elixir">def init(_) do
  :ets.new(:dedup_table, [:set, :public, :named_table])
  schedule_cleanup()
  {:ok, %{}}
end
</code></pre>
<!--kg-card-end: markdown--><ul><li><em><strong>Ensuring Immediate Effectiveness</strong>: By invoking <code>schedule_cleanup/0</code> during initialization, we ensure that the cleanup mechanism is active right from the start, preventing the ETS table from ever becoming a memory concern.</em></li></ul><p>Now, the final version of the <code>Deduplicator</code> module:</p><!--kg-card-begin: markdown--><pre><code class="language-elixir">defmodule Deduplicator do
  use GenServer

  @cleanup_interval :timer.minutes(1)
  @ttl :timer.hours(1)

  def start_link(_opts) do
    GenServer.start_link(__MODULE__, %{}, name: __MODULE__)
  end

  def init(_) do
    :ets.new(:dedup_table, [:set, :public, :named_table])
    schedule_cleanup()
    {:ok, %{}}
  end

  def mark_as_processed(unique_id) do
    :ets.insert(:dedup_table, {unique_id, :erlang.system_time(:millisecond)})
  end

  def already_processed?(unique_id) do
    case :ets.lookup(:dedup_table, unique_id) do
      [{^unique_id, _timestamp}] -&gt; true
      _ -&gt; false
    end
  end

  defp schedule_cleanup do
    Process.send_after(self(), :cleanup, @cleanup_interval)
  end

  def handle_info(:cleanup, state) do
    current_time = :erlang.system_time(:millisecond)

    :ets.tab2list(:dedup_table)
    |&gt; Enum.each(fn
      {id, timestamp} when current_time - timestamp &gt; @ttl -&gt;
        :ets.delete(:dedup_table, id)

      _ -&gt;
        :noop
    end)

    schedule_cleanup()
    {:noreply, state}
  end
  
  def generate_unique_id(entity) do
    encoded_entity =
      entity
      |&gt; Jason.encode!()

    :crypto.hash(:sha256, encoded_entity)
    |&gt; Base.encode16()
  end
end
</code></pre>
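Before wiring `Deduplicator` into an application, its core idea can be sanity-checked in isolation. The sketch below is an illustrative stand-alone script, not part of the module: it replays the hash-then-lookup-then-insert flow using only built-ins, substituting `:erlang.term_to_binary/1` for `Jason.encode!/1` so it runs without third-party dependencies, and the example map is an assumed payload.

```elixir
# Illustrative stand-alone dedup flow (assumed payload; term_to_binary replaces Jason).
:ets.new(:dedup_table, [:set, :public, :named_table])

encoded = :erlang.term_to_binary(%{user: 1, action: "pay"})
unique_id = Base.encode16(:crypto.hash(:sha256, encoded))

# First sighting: the table has no entry for this id, so mark it as processed.
[] = :ets.lookup(:dedup_table, unique_id)
:ets.insert(:dedup_table, {unique_id, :erlang.system_time(:millisecond)})

# Second sighting: the lookup now matches, flagging the request as a duplicate.
[{^unique_id, _timestamp}] = :ets.lookup(:dedup_table, unique_id)
```

The bare pattern matches double as assertions here: if either lookup behaved unexpectedly, the script would raise a `MatchError`.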
<!--kg-card-end: markdown--><h2 id="utilizing-the-solution-in-a-phoenix-liveview-application">Utilizing the Solution in a Phoenix LiveView application</h2><h3 id="integrating-mydeduplicator-with-a-supervisor">Integrating Deduplicator with a Supervisor</h3><p>To ensure <code>Deduplicator</code>&apos;s resilience and reliability, it&apos;s integrated into my application&apos;s supervision tree. This guarantees that <code>Deduplicator</code> is automatically restarted in case of failures, maintaining the application&apos;s robustness.</p><!--kg-card-begin: markdown--><pre><code class="language-elixir">defmodule Deduplicator.Supervisor do
  use Supervisor

  def start_link(arg) do
    Supervisor.start_link(__MODULE__, arg, name: __MODULE__)
  end

  @impl true
  def init(_arg) do
    children = [
      {Deduplicator, []}
    ]

    # Define the restart strategy
    opts = [strategy: :one_for_one, name: DeduplicatorSupervisor]
    Supervisor.init(children, opts)
  end
end
</code></pre>
<!--kg-card-end: markdown--><p><em>This Supervisor oversees <code>Deduplicator</code>, utilizing the <code>:one_for_one</code> strategy, which specifies that if the GenServer crashes, it will be the only process to be restarted.</em></p><h3 id="adding-deduplicator-to-the-phoenix-applications-supervision-tree">Adding <code>Deduplicator</code> to the Phoenix Application&apos;s Supervision Tree</h3><p>Integrating <code>Deduplicator</code> into the application&apos;s main supervision tree ensures it&apos;s started at launch, ready to deduplicate requests from the get-go. This is achieved by modifying the application&apos;s root supervisor to include <code>Deduplicator</code>&apos;s Supervisor as a child.</p><h3 id="utilizing-deduplicator-in-the-controller">Utilizing <code>Deduplicator</code> in the Controller</h3><p>With <code>Deduplicator</code> operational, we modify our Phoenix controller to leverage it for handling potential duplicate requests. Before processing any data, we check if it has already been processed, ensuring idempotency.</p><!--kg-card-begin: markdown--><pre><code class="language-elixir">def handle_duplicates(%{assigns: _assigns} = conn, %{} = params) do
  entity = params[&quot;entity&quot;]

  unique_id = Deduplicator.generate_unique_id(entity)

  if Deduplicator.already_processed?(unique_id) do
    respond(conn)
  else
    Deduplicator.mark_as_processed(unique_id)

    process_entity(entity)
    respond(conn)
  end
end
</code></pre>
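For completeness on the supervision-tree step mentioned above: attaching `Deduplicator.Supervisor` to the application's root supervisor is what actually starts it at launch. A minimal sketch of a hypothetical `lib/my_app/application.ex` (the `MyApp` names are assumptions, not taken from the article):

```elixir
defmodule MyApp.Application do
  use Application

  @impl true
  def start(_type, _args) do
    children = [
      # The Phoenix endpoint and other children would normally sit alongside it.
      Deduplicator.Supervisor
    ]

    # :one_for_one - if Deduplicator.Supervisor crashes, only it is restarted.
    opts = [strategy: :one_for_one, name: MyApp.RootSupervisor]
    Supervisor.start_link(children, opts)
  end
end
```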
<!--kg-card-end: markdown--><h2 id="lessons-learned-and-concluding-thoughts">Lessons Learned and Concluding Thoughts</h2><p>This journey illuminated the critical role of idempotency in ensuring data integrity across distributed systems. The development of <code>Deduplicator</code> not only solved my immediate challenge but also enriched my architectural approach, emphasizing resilience and reliability.</p><p>As I move forward, the insights gained from this experience will inform my future architectures, emphasizing the power of Elixir and Phoenix in building robust, fault-tolerant applications. For fellow engineers navigating similar challenges, I hope this account serves as both a guide and an inspiration.</p><p>Ciao &#x1F44B;</p>]]></content:encoded></item><item><title><![CDATA[Stop right there and think a bit!!]]></title><description><![CDATA[As software developers, it's easy to get caught up in following the latest trends and trying to do too much at once. We see articles and tweets from people who claim to have "made it," and we feel pressured to keep up with the latest technologies and approaches. ]]></description><link>https://nkpremices.com/stop-right-there-and-think-a-bit/</link><guid isPermaLink="false">617638fa8d3d0705e2977a3b</guid><category><![CDATA[Inspiration]]></category><category><![CDATA[2022]]></category><category><![CDATA[Blog]]></category><dc:creator><![CDATA[Prémices N. Kamasuwa]]></dc:creator><pubDate>Sun, 01 Jan 2023 23:28:02 GMT</pubDate><media:content url="https://nkpremices.com/content/images/2023/01/im-going-to-stop-you-right-there-stop.gif" medium="image"/><content:encoded><![CDATA[<img src="https://nkpremices.com/content/images/2023/01/im-going-to-stop-you-right-there-stop.gif" alt="Stop right there and think a bit!!"><p>As software developers, it&apos;s easy to get caught up in following the latest trends and trying to do too much at once. 
We see articles and tweets from people who claim to have &quot;made it,&quot; and we feel pressured to keep up with the latest technologies and approaches. </p><p>But at a certain point, it&apos;s important to stop and think about what we really want from our careers. Instead of trying to do everything that is popular or in demand, it&apos;s essential for software developers to find their own path and focus on what interests them. In this blog post, we&apos;ll discuss the importance of finding your own path as a software developer and offer some tips for doing so.</p><p><strong>Problems with following trends </strong></p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://nkpremices.com/content/images/2023/01/TrendsWord.jpg" class="kg-image" alt="Stop right there and think a bit!!" loading="lazy" width="600" height="400" srcset="https://nkpremices.com/content/images/2023/01/TrendsWord.jpg 600w"><figcaption>trends image</figcaption></figure><p>Every day, there are new articles, blogs, and social media posts proclaiming the next big thing in our industry. It&apos;s natural to want to stay up-to-date and be a part of the excitement. But the problem with trying to follow every trend and do everything that is popular is that it can be overwhelming and ultimately unfulfilling.</p><p>By trying to do too much at once, we risk spreading ourselves too thin and not being able to fully commit to any one thing. We may find ourselves constantly jumping from one trend to the next, never really mastering anything. This can lead to burnout and a lack of direction in our careers.</p><p>Furthermore, following trends can lead us to ignore our own interests and strengths. We may end up working on things that are popular, but not necessarily what we are passionate about. 
This can lead to boredom and a sense of disconnection from our work.</p><p>So instead of trying to do everything that is popular or in demand, it&apos;s important for us software developers to take a step back and think about what truly interests us and what we want to achieve in our careers. By focusing on our own passions and strengths, we can build fulfilling and successful careers that are truly our own.</p><p><strong>The importance of finding your own path</strong></p><figure class="kg-card kg-image-card"><img src="https://nkpremices.com/content/images/2023/01/image-1.png" class="kg-image" alt="Stop right there and think a bit!!" loading="lazy" width="700" height="467" srcset="https://nkpremices.com/content/images/size/w600/2023/01/image-1.png 600w, https://nkpremices.com/content/images/2023/01/image-1.png 700w"></figure><p>It&apos;s very important to find our own path and focus on what interests us. This can help us build fulfilling and successful careers that are truly our own. By understanding our own passions and strengths, we can choose projects and technologies that align with these interests and allow us to make a real impact.</p><p>Finding your own path also allows you to stand out in the industry. Instead of following the same trends as everyone else, you can showcase your unique perspective and skills. This can make you a valuable asset to any team or organization, and increase your chances of success in the long run.</p><p>Furthermore, focusing on what interests you can help you stay motivated and engaged in your work. 
When you are working on something that you are truly passionate about, it&apos;s easier to put in the extra effort and dedication that is needed to succeed.</p><p>So instead of trying to do everything that is popular or in demand, take the time to think about what truly interests you as a software developer.</p><p><strong>Conclusion</strong></p><p>It&apos;s important for us software developers to stop and think about what we want to achieve and focus on what truly interests us. </p><p>Here are some tips for finding your own path as a software developer:</p><ol><li><strong>Network with others in the industry</strong>: Talk to other software developers and learn about their experiences and insights. This can help you get a sense of what different career paths are available and what might be a good fit for you.</li><li><strong>Seek out mentors</strong>: Find someone who has been in the industry for a while and is willing to mentor you. They can provide valuable guidance and advice as you navigate your career.</li><li><strong>Try out different technologies and approaches</strong>: Don&apos;t be afraid to experiment and try out new things. This can help you discover what you are truly interested in and what you are good at.</li><li><strong>Take on side projects</strong>: Use side projects to explore your interests and try out new technologies. This can be a great way to get hands-on experience and find out what you really enjoy doing.</li><li><strong>Keep learning</strong>: Stay up-to-date with the latest technologies and trends, but be selective about what you choose to learn. Focus on things that align with your interests and career goals.</li></ol><p>By following these tips, you can start to find your own path as a software developer and focus on what truly interests you. 
This can lead to a fulfilling and successful career that is truly your own.</p>]]></content:encoded></item><item><title><![CDATA[Understanding Distributed Systems, my comments]]></title><description><![CDATA[As a software engineer, I've always been fascinated by the complexity and power of distributed systems. These systems, which operate across multiple devices and locations, are at the heart of many of the technologies we rely on every day, from...]]></description><link>https://nkpremices.com/understanding-distributed-systems-my-comments/</link><guid isPermaLink="false">62ce83991932300594a74f9a</guid><category><![CDATA[2022]]></category><category><![CDATA[Blog]]></category><category><![CDATA[Book]]></category><category><![CDATA[Distributed Systems]]></category><dc:creator><![CDATA[Prémices N. Kamasuwa]]></dc:creator><pubDate>Tue, 27 Dec 2022 01:14:00 GMT</pubDate><media:content url="https://nkpremices.com/content/images/2023/01/shutterstock_2057691140.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://nkpremices.com/content/images/2023/01/shutterstock_2057691140.jpg" alt="Understanding Distributed Systems, my comments"><p>As a software engineer, I&apos;ve always been fascinated by the complexity and power of distributed systems. These systems, which operate across multiple devices and locations, are at the heart of many of the technologies we rely on every day, from the internet and cloud computing to social networks and online marketplaces. Recently, I decided to dive deeper into the subject by reading the book &apos;<strong>Understanding Distributed Systems</strong>&apos; by Roberto Vitillo. In this post, I&apos;ll share my thoughts on the book and how it has helped me better understand the fundamental concepts and principles of distributed systems. 
I&apos;ll also provide some examples of how these systems are used in the real world and offer some tips for those interested in working with distributed systems.</p><hr><p><strong>The book</strong> is generally well-regarded as a comprehensive and accessible introduction to the field of distributed systems. It covers a wide range of topics in detail and includes numerous examples and illustrations to help readers understand the concepts and ideas presented. Many readers have found the book to be a useful resource for learning about distributed systems and for gaining a deeper understanding of the fundamental principles that underlie these systems. Some reviewers have noted that the book can be dense and technical at times, and may be more suitable for readers with some background in computer science or a related field. </p><hr><p>One of the things I appreciated most about this book was the clear and thorough explanation of key concepts and terminology. The book covers a wide range of topics, including<em><strong> distributed algorithms, network communication protocols, fault tolerance, and security</strong></em>. Each concept is introduced in a way that is easy to understand, and the book provides numerous examples and illustrations to help readers grasp the material.</p><p>As I read the book, I found myself jotting down notes and diagrams to help me visualize the different components and processes involved in distributed systems. The book does a great job of breaking down complex ideas into bite-sized chunks and explaining them in a way that is accessible to readers with a wide range of backgrounds. Whether you are new to the field of distributed systems or have some experience under your belt, you&apos;ll find plenty of valuable information and insights in this book.</p><p>One of the things I found most interesting as I read &apos;<strong>Understanding Distributed Systems</strong>&apos; was learning about the different types of distributed systems that exist. 
The book distinguishes between three main types: <em><strong>peer-to-peer systems, client-server systems, </strong>and<strong> cloud computing systems</strong></em>.</p><p><strong>Peer-to-peer (P2P) systems are decentralized networks in which each device acts as both a client and a server</strong>. These systems are often used for file sharing and other forms of data exchange. Examples of P2P systems include BitTorrent and Napster.</p><p><strong>Client-server systems, on the other hand, consist of a central server that manages data and resources</strong>, and a set of clients that request and receive information from the server. These systems are commonly used for web-based applications, where the server handles the logic and data storage, and the clients are web browsers that display the information to users.</p><p><strong>Cloud computing systems are large-scale distributed systems that provide on-demand access to a shared pool of computing resources</strong>, such as servers, storage, and networking. These systems are often used for storing and processing big data, and for running complex algorithms and applications. Examples of cloud computing systems include Amazon Web Services and Microsoft Azure.</p><p>Understanding the differences between these types of distributed systems is important for designing and implementing effective and efficient systems. The book does a great job of explaining the key characteristics and trade-offs of each type, and provides examples of when each might be most appropriate.</p><hr><p>As I worked my way through the book, I found myself tempted to apply what I was learning to a proof-of-concept side project: <strong>a clone of the popular file-sharing service Dropbox, built using Rust, Go, Elixir, and TypeScript</strong>. 
Building a distributed system like this can be a complex and challenging endeavor, as it requires you to consider a wide range of factors, including scalability, reliability, performance, and security.</p><p>One of the things that I found most useful as I tackled this project was the book&apos;s discussion of the common challenges that arise when designing and implementing distributed systems. The book covers topics such as <strong>concurrent access to shared data, networking and communication protocols, and fault tolerance</strong>, and provides examples of how these challenges can be addressed in real-world systems.</p><p>I also appreciated the book&apos;s emphasis on the importance of testing and monitoring distributed systems. As I built my Dropbox clone, I made sure to include a suite of unit and integration tests, as well as monitoring tools to help me identify and resolve issues as they arose. Working with a variety of languages allowed me to gain experience with different tools and approaches, and helped me to better understand the trade-offs and benefits of each. Although I have to say, it wasn&apos;t an easy thing to do.</p><p>Overall, trying to build my own distributed system helped me to gain a deeper understanding of the concepts and principles covered in the book. It was a challenging but rewarding experience and one that I would recommend to anyone interested in working with distributed systems.</p><p><strong>Conclusion</strong><br><br>In conclusion, the book <strong>Understanding Distributed Systems</strong> has been a valuable resource for me as I delve into the world of distributed systems. 
It has helped me to gain a deeper understanding of the fundamental concepts and principles that underlie these systems and has provided me with a wealth of practical insights and examples to draw upon.</p><p>One of the ways I&apos;ve put this knowledge into practice is by building a proof-of-concept clone of the popular file-sharing service Dropbox using Rust, Go, Elixir, and TypeScript. This project has been a challenging but rewarding experience and has given me a firsthand appreciation of the complexity and power of distributed systems.</p><p>Though my Dropbox clone is still a work in progress, I&apos;m excited to see what 2023 holds. With the knowledge and skills I&apos;ve gained from reading this book and working on this project, I&apos;m looking forward to continuing to explore the field of distributed systems and perhaps even writing clear documentation of how far I am in building my own system.<br><br>___</p><p>I hope this post has provided you with a helpful overview of the book &apos;<strong>Understanding Distributed Systems</strong>&apos; and has given you some insight into the world of distributed systems. Whether you are just starting out in this field or are a seasoned pro, I think you&apos;ll find this book to be a valuable resource for understanding and working with these complex and powerful systems.</p><p>Ciao &#x1F44B;&#x1F3FE;<br></p>]]></content:encoded></item><item><title><![CDATA[Creating a cronjob micro-service using Elixir]]></title><description><![CDATA[If you&apos;ve ever needed to automate a task or ensure that an important job gets done on schedule, you&apos;ve probably used a cronjob. 
Simply put, a cronjob is a tool that allows you to schedule tasks to run automatically at a predetermined time or...]]></description><link>https://nkpremices.com/create-a-cronjob-micro-service-using-elixir/</link><guid isPermaLink="false">62bde1958d3d0705e2977ed3</guid><category><![CDATA[2022]]></category><category><![CDATA[Blog]]></category><category><![CDATA[Elixir]]></category><category><![CDATA[Phoenix Framework]]></category><category><![CDATA[Tech]]></category><dc:creator><![CDATA[Prémices N. Kamasuwa]]></dc:creator><pubDate>Tue, 20 Dec 2022 00:45:00 GMT</pubDate><media:content url="https://nkpremices.com/content/images/2023/01/what-you-should-know-about-elixir-part1.png" medium="image"/><content:encoded><![CDATA[<img src="https://nkpremices.com/content/images/2023/01/what-you-should-know-about-elixir-part1.png" alt="Creating a cronjob micro-service using Elixir"><p>If you&apos;ve ever needed to automate a task or ensure that an important job gets done on schedule, you&apos;ve probably used a cronjob. Simply put, a cronjob is a tool that allows you to schedule tasks to run automatically at a predetermined time or interval. Whether it&apos;s sending out a daily report or backing up a database, cronjobs are a convenient way to automate repetitive or time-consuming tasks.</p><p>Recently, I was working on a side project where I wanted to explore the benefits and trade-offs of using multiple languages in a system. I decided to create a cronjob micro-service using <strong>Elixir</strong>, <em>a functional programming language that is well-suited for building scalable and fault-tolerant systems</em>.</p><p>In this blog post, I&apos;ll document my experience building the service, and share some of the benefits and challenges I encountered along the way. 
If you&apos;re interested in using Elixir to create your own cronjob service, or if you&apos;re just curious about what&apos;s involved, I hope you&apos;ll find this post helpful.</p><p><strong>Why Elixir for a Cronjob Service?</strong></p><p>When it came to choosing a language for my cronjob service, I knew I wanted something that was fast, scalable, and fault-tolerant. Elixir checked all of those boxes and then some.</p><p>For those unfamiliar with Elixir, it is a functional programming language that runs on the Erlang virtual machine. One of the key benefits of Elixir is its support for concurrency, which allows it to easily handle multiple tasks concurrently. This makes it well-suited for building scalable and fault-tolerant systems.</p><p>In addition to its good performance and concurrency support, I was also drawn to Elixir&apos;s functional nature. Elixir encourages a functional programming style, which can make it easier to reason about code and write tests. All of these factors made Elixir an appealing choice for my cronjob service.</p><p><strong>Using Phoenix for the Web Interface</strong></p><p>To build the web interface for my cronjob service, I decided to use the Phoenix framework. Phoenix is a popular web framework for Elixir that makes it easy to build scalable and reliable web applications. It offers a variety of features that made it a good fit for my cronjob service, including support for web sockets, channels, and live view.</p><p>One of the key benefits of Phoenix is its use of the actor model for concurrency. In Phoenix, each web request is handled by its own Elixir process, which makes it easy to scale the application by adding more processes. This makes Phoenix well-suited for building a cronjob service, which may need to handle a large number of concurrent tasks.</p><p><strong>Overall Architecture</strong></p><p>The overall architecture of my cronjob service is designed to be scalable and fault-tolerant. 
It is composed of multiple Elixir processes that communicate with each other using the actor model. Each process is responsible for a specific task, such as scheduling a job to run or executing a job.</p><p>To ensure that the service can recover from failures, I used Elixir&apos;s built-in process supervision to monitor the health of the service. If any process fails, the supervisor will restart it, ensuring that the service stays up and running.</p><p><strong>The Role of the Database</strong></p><p>In my cronjob service, I used a database to store information about the tasks that are scheduled to run and the status of those tasks. This made it easy to track the progress of the service and ensure that tasks were being run as expected.</p><p>Integrating a database with an Elixir-based service can sometimes be a challenge, but I found that Elixir&apos;s <strong>Ecto</strong> library made it relatively straightforward. Ecto is a database library for Elixir that provides a simple interface for querying and updating a database.</p><p><strong>Setting up the Development Environment</strong></p><p>Before I could start building my cronjob service, I needed to set up a development environment. This involved installing Elixir, the Phoenix framework, and any other dependencies that were needed.</p><p>If you&apos;re new to Elixir, the first step is to install the Elixir runtime and build tools. You can find instructions for installing Elixir on the Elixir website. Once Elixir is installed, you&apos;ll also need to install the Phoenix framework. You can do this by running the following command:</p><!--kg-card-begin: markdown--><pre><code class="language-elixir">mix archive.install hex phx_new
</code></pre>
<!--kg-card-end: markdown--><p>Next, you&apos;ll need to set up a database for your cronjob service. I chose to use <strong>PostgreSQL</strong>, but you could use any database that is supported by Elixir&apos;s <strong>Ecto library</strong>. Once you have a database set up, you&apos;ll need to configure your development environment to use it. This typically involves creating a database and a user, and then updating your Phoenix configuration to use the correct database credentials.</p><p><strong>Defining the Tasks</strong></p><p>Once I had my development environment set up, I was ready to start defining the tasks that my cronjob service would run. I used a cron-style scheduling function, <code>cron/4</code>, to specify the schedule for each task (Elixir itself has no built-in cron scheduler; this kind of function is provided by a scheduling library such as Quantum). For example, if I wanted to run a task every hour, I would use the following code:</p><!--kg-card-begin: markdown--><pre><code class="language-elixir">cron(&quot;0 * * * *&quot;, MyApp.TaskScheduler, :run_task, [])
</code></pre>
<!--kg-card-end: markdown--><p>In this example, the <code>cron/4</code> function takes four arguments: a cron expression, a module, a function, and a list of arguments. The cron expression specifies the schedule for the task, and the module and function specify the code that should be run when the task is triggered.</p><p><strong>Implementing the Tasks</strong></p><p>Once I had defined the tasks that my cronjob service would run, I needed to implement the code that would actually perform the work. This involved writing Elixir functions that would be called by the <code>cron/4</code> function when the tasks were triggered.</p><p>For example, let&apos;s say I had defined a task to send a daily report by email. The implementation of this task might look something like this:</p><!--kg-card-begin: markdown--><pre><code class="language-elixir">defmodule MyApp.TaskScheduler do
  def run_task do
    # Generate the report
    report = generate_report()

    # Send the report by email
    send_email(report)
  end

  defp generate_report do
    # code to generate the report goes here
  end

  defp send_email(report) do
    # code to send the email goes here
  end
end
</code></pre>
<!--kg-card-end: markdown--><p>In this example, the <code>run_task/0</code> function is the entry point for the task. It calls the <code>generate_report/0</code> and <code>send_email/1</code> functions to perform the work of generating and sending the report.</p><p><strong>Configuring the Cronjob Service</strong></p><p>Once I had implemented the tasks that my cronjob service would run, I needed to configure the service to run on a predetermined schedule and to give users a web interface for viewing and managing the scheduled tasks. To do this, I used the Phoenix framework to set up routes and controllers for the service. For example, I might create a route like this:</p><!--kg-card-begin: markdown--><pre><code class="language-elixir">scope &quot;/tasks&quot;, MyApp do
  pipe_through :api
  resources &quot;/&quot;, TaskController
end
</code></pre>
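<p>The route above maps requests to a controller. A minimal sketch of such a controller, with an illustrative action body (the module name and JSON shape are assumptions, not from the original service):</p>

```elixir
defmodule MyApp.TaskController do
  use Phoenix.Controller

  # GET /tasks: list the scheduled tasks as JSON
  def index(conn, _params) do
    # Fetching the tasks from the database via Ecto would go here
    json(conn, %{tasks: []})
  end
end
```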
<!--kg-card-end: markdown--><p>This route would allow users to access the <code>tasks</code> resource at the <code>/tasks</code> URL. I could then create a <code>TaskController</code> to handle requests to this resource.</p><p>To make it easy for users to view and manage the tasks that were scheduled to run, I used Phoenix&apos;s LiveView feature. LiveView allows you to build real-time, interactive interfaces with minimal coding. For example, I might create a live view like this:</p><!--kg-card-begin: markdown--><pre><code class="language-elixir">defmodule MyApp.TaskLive do
  use Phoenix.LiveView
  
  def mount(_params, _session, socket) do
    tasks = fetch_tasks()
    {:ok, assign(socket, tasks: tasks)}
  end
  
  def render(assigns) do
    # code to render the live view goes here
  end
  
  def handle_event(&quot;add_task&quot;, %{&quot;name&quot; =&gt; name, &quot;schedule&quot; =&gt; schedule}, socket) do
    # code to handle the &quot;add_task&quot; event goes here
  end
  
  defp fetch_tasks do
    # code to fetch the tasks from the database goes here
  end
end
</code></pre>
<!--kg-card-end: markdown--><p>In this example, the <code>mount/3</code> function is called when the live view is first rendered. It fetches the tasks from the database and assigns them to the <code>tasks</code> variable. The <code>render/1</code> function is then called to render the live view, and the <code>handle_event/3</code> function is called to handle events that are sent from the client (such as an &quot;add_task&quot; event).</p><p>Using live view made it easy for me to create a real-time, interactive interface for my cronjob service. Users could view and manage the tasks that were scheduled to run, and they could see the changes in real time as they were made.</p><p><strong>External communication with the service</strong></p><p>Here is an example of how I used a message queue (in this case, RabbitMQ) to communicate with the cronjob service:</p><p>First, I needed to set up a RabbitMQ server and install the <code>amqp</code> library, which is an Elixir client library for RabbitMQ. You can do this by adding the following dependencies to your <code>mix.exs</code> file:</p><!--kg-card-begin: markdown--><pre><code class="language-elixir">defp deps do
  [
    {:amqp, &quot;~&gt; 3.0&quot;}
  ]
end
</code></pre>
<!--kg-card-end: markdown--><p>Next, I needed to create a connection to the RabbitMQ server and set up a channel for sending and receiving messages. We can do this in our application&apos;s startup code:</p><!--kg-card-begin: markdown--><pre><code class="language-elixir">def start(_type, _args) do
  # Connect to the RabbitMQ server
  {:ok, conn} = AMQP.Connection.open(...)

  # Open a channel
  {:ok, chan} = AMQP.Channel.open(conn)

  # Set up queues and exchanges
  {:ok, _} = AMQP.Queue.declare(chan, &quot;tasks&quot;, durable: true)
  :ok = AMQP.Exchange.direct(chan, &quot;tasks&quot;, durable: true)
  :ok = AMQP.Queue.bind(chan, &quot;tasks&quot;, &quot;tasks&quot;, routing_key: &quot;&quot;)

  # Start the task scheduler process
  {:ok, _scheduler} = TaskScheduler.start_link(chan)

  # Start the Phoenix endpoint
  {:ok, _pid} = MyAppWeb.Endpoint.start_link()
end
</code></pre>
<!--kg-card-end: markdown--><p>With the connection and channel set up, we can start using RabbitMQ to send and receive messages. The <code>amqp</code> library delivers consumed messages to the consumer process&apos;s mailbox, so the <code>TaskScheduler</code> GenServer handles them in <code>handle_info/2</code>:</p><!--kg-card-begin: markdown--><pre><code class="language-elixir">defmodule TaskScheduler do
  use GenServer

  def start_link(chan) do
    GenServer.start_link(__MODULE__, chan, name: __MODULE__)
  end

  def init(chan) do
    # Declare an exclusive queue and bind it to the &quot;tasks&quot; exchange
    {:ok, %{queue: queue}} = AMQP.Queue.declare(chan, &quot;&quot;, exclusive: true)
    :ok = AMQP.Queue.bind(chan, queue, &quot;tasks&quot;, routing_key: &quot;&quot;)

    # Register this process as a consumer; deliveries arrive as messages
    {:ok, _consumer_tag} = AMQP.Basic.consume(chan, queue)

    # Keep the channel as the process state
    {:ok, chan}
  end

  # RabbitMQ confirms the consumer registration
  def handle_info({:basic_consume_ok, _meta}, chan), do: {:noreply, chan}

  # Each delivery is parsed and turned into a scheduled task
  def handle_info({:basic_deliver, payload, %{delivery_tag: tag}}, chan) do
    {:ok, %{&quot;name&quot; =&gt; name, &quot;schedule&quot; =&gt; schedule}} = Jason.decode(payload)
    schedule_task(name, schedule)

    # Acknowledge the message
    AMQP.Basic.ack(chan, tag)
    {:noreply, chan}
  end

  def schedule_task(_name, _schedule) do
    # Code to schedule the task goes here
    :ok
  end
end
</code></pre>
<!--kg-card-end: markdown--><p>This <code>TaskScheduler</code> process uses RabbitMQ to set up a consumer that listens for messages on the &quot;tasks&quot; exchange. When a message is received, it parses the message and calls the <code>schedule_task/2</code> function to schedule the task.</p><p>To send a message to the cronjob service, you can use the <code>AMQP.Basic.publish/4</code> function. For example:</p><!--kg-card-begin: markdown--><pre><code class="language-elixir">def create_task(name, schedule) do
  # Connect to the RabbitMQ server
  {:ok, conn} = AMQP.Connection.open(...)

  # Open a channel
  {:ok, chan} = AMQP.Channel.open(conn)

  # Encode the message (encode!/1 raises on failure and returns the JSON binary)
  message = Jason.encode!(%{&quot;name&quot; =&gt; name, &quot;schedule&quot; =&gt; schedule})

  # Publish the message to the &quot;tasks&quot; exchange
  :ok = AMQP.Basic.publish(chan, &quot;tasks&quot;, &quot;&quot;, message)

  # Close the channel and connection
  AMQP.Channel.close(chan)
  AMQP.Connection.close(conn)
end
</code></pre>
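<p>With <code>create_task/2</code> in place, any part of the application (or any other service that can reach RabbitMQ) can enqueue a job with a single call. The task name and cron expression below are purely illustrative:</p>

```elixir
# Schedule a (hypothetical) daily report for 08:00 every morning
create_task("daily_report", "0 8 * * *")
```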
<!--kg-card-end: markdown--><p><strong>Conclusion</strong><br><br>Creating a cronjob micro-service using Elixir was a fun and interesting project. I enjoyed the process of setting up a development environment, defining and implementing tasks, and configuring the service to run on a predetermined schedule.</p><p>One of the biggest challenges I faced was getting used to the syntax and concepts of Elixir, which was new to me. However, once I got the hang of it, I found that Elixir was a powerful and expressive language that made it easy to build the cronjob service.</p><p>If you&apos;re interested in creating your own cronjob service using Elixir, I recommend checking out the following resources:</p><ul><li>The Elixir documentation: <a href="https://elixir-lang.org/docs/stable/elixir/">https://elixir-lang.org/docs/stable/elixir/</a></li><li>The Phoenix framework documentation: <a href="https://hexdocs.pm/phoenix/index.html">https://hexdocs.pm/phoenix/index.html</a></li><li>The Ecto library documentation: <a href="https://hexdocs.pm/ecto/index.html">https://hexdocs.pm/ecto/index.html</a></li></ul><p>I hope this blog post has been helpful and gives you an idea of what&apos;s involved in creating a cronjob service using Elixir. If you have any questions or comments, I&apos;d love to hear from you!</p><p></p><p><em>If you find anything wrong, or anything that needs correction, please feel free to leave a comment and let me know and I will make sure to check it out and address it.</em><br><br>Ciao &#x1F44B;&#x1F3FE;</p><hr><p>References <br><br>1. <a href="https://github.com/quantum-elixir/quantum-core">https://github.com/quantum-elixir/quantum-core</a></p><p>2. <a href="https://www.phoenixframework.org/">https://www.phoenixframework.org/</a></p><p>3. <a href="https://blog.kalvad.com/write-your-own-cron-with-with-elixir/">https://blog.kalvad.com/write-your-own-cron-with-with-elixir/</a></p><p>4. 
<a href="https://wrgoldstein.github.io/2017/02/20/phoenix-rabbitmq.html">https://wrgoldstein.github.io/2017/02/20/phoenix-rabbitmq.html</a></p>]]></content:encoded></item><item><title><![CDATA[Serving an unsupported third-party middleware to the NestJs dependency injection layer]]></title><description><![CDATA[Have you ever found yourself in the middle of a project and realized that the tool you need to use is not supported by your framework of choice? That's exactly what happened to me recently when I was working on a NestJS application and wanted to integrate Cloudinary for image hosting.]]></description><link>https://nkpremices.com/serving-an-unsupported-third-party-middleware-to-the-nestjs-dependency-injection-layer/</link><guid isPermaLink="false">62ce80eb1932300594a74f76</guid><category><![CDATA[2022]]></category><category><![CDATA[NestJs]]></category><category><![CDATA[Typescript]]></category><category><![CDATA[Blog]]></category><dc:creator><![CDATA[Prémices N. Kamasuwa]]></dc:creator><pubDate>Sat, 17 Dec 2022 23:57:00 GMT</pubDate><media:content url="https://nkpremices.com/content/images/2023/01/nESTjs2.JPG" medium="image"/><content:encoded><![CDATA[<img src="https://nkpremices.com/content/images/2023/01/nESTjs2.JPG" alt="Serving an unsupported third-party middleware to the NestJs dependency injection layer"><p>Have you ever found yourself in the middle of a project and realized that the tool you need to use is not supported by your framework of choice? That&apos;s exactly what happened to me recently when I was working on a <strong>NestJS</strong> application and wanted to integrate <strong>Cloudinary</strong> for image hosting.</p><p><strong>NestJS</strong> is a powerful framework for building server-side applications with Node.js, and it offers a variety of built-in modules and middleware for common tasks such as logging, validation, and routing. 
However, when it comes to working with third-party services, there may be times when you need to add a custom solution to the mix.</p><p>In this blog post, I&apos;ll share my experience of serving an unsupported third-party middleware (<strong>Cloudinary</strong>) to the <strong>NestJS</strong> dependency injection layer. I&apos;ll explain the options I considered and the approach I ultimately took to solve this problem. I hope that by reading this post, you&apos;ll be able to apply these techniques to your own NestJS projects and take your skills to the next level.</p><p><strong>Dependency injection</strong></p><p><strong>&quot;Dependency injection&quot;</strong> is a design pattern that helps to decouple parts of a system and make it more flexible and easier to test. In <strong>NestJS</strong>, the dependency injection system is based on the inversion of control (IoC) principle, which means that the framework is responsible for creating and supplying the dependencies required by a module or component.</p><p>To use dependency injection in <strong>NestJS</strong>, you first need to define a provider, which is a class or a function that returns an object or a value. This provider can then be injected into a module, controller, or service using the <code>@Injectable()</code> decorator.</p><p>For example, let&apos;s say you have a <code>LoggerService</code> that you want to use in multiple places throughout your application. You can define the <code>LoggerService</code> as a provider and then inject it wherever it is needed by using the <code>@Inject()</code> decorator:</p><!--kg-card-begin: markdown--><pre><code class="language-javascript">@Injectable()
export class LoggerService {
  log(message: string) {
    console.log(message);
  }
}

@Controller()
export class SomeController {
  constructor(@Inject(LoggerService) private logger: LoggerService) {}

  @Get()
  doSomething() {
    this.logger.log(&apos;Doing something...&apos;);
  }
}
</code></pre>
<!--kg-card-end: markdown--><p><strong>Adding third-party middleware to NestJS</strong></p><p>In <strong>NestJS</strong>, adding third-party middleware to your application is typically a straightforward process. First, you need to install the npm package for the middleware you want to use. Then, you can apply the middleware to specific routes or to the entire application by configuring the <code>MiddlewareConsumer</code> in a module&apos;s <code>configure()</code> method.</p><p>For example, let&apos;s say you want to add the <code>cors</code> middleware to your <strong>NestJS</strong> application to enable cross-origin resource sharing (CORS). You can install the <code>cors</code> package using npm or yarn:</p><!--kg-card-begin: markdown--><pre><code class="language-bash">npm install cors
</code></pre>
<!--kg-card-end: markdown--><p>Then, you can apply the <code>cors</code> middleware to the routes of a specific controller by configuring the <code>MiddlewareConsumer</code> in that controller&apos;s module:</p><!--kg-card-begin: markdown--><pre><code class="language-typescript">import { MiddlewareConsumer, Module, NestModule } from &apos;@nestjs/common&apos;;
import * as cors from &apos;cors&apos;;
import { SomeController } from &apos;./some.controller&apos;;

@Module({
  controllers: [SomeController],
})
export class SomeModule implements NestModule {
  configure(consumer: MiddlewareConsumer): void {
    consumer.apply(cors()).forRoutes(SomeController);
  }
}
</code></pre>
<!--kg-card-end: markdown--><p>Or, you can apply the <code>cors</code> middleware to the entire application by applying it to every route (<code>&apos;*&apos;</code>) in the root module:</p><!--kg-card-begin: markdown--><pre><code class="language-typescript">import { MiddlewareConsumer, Module, NestModule } from &apos;@nestjs/common&apos;;
import * as cors from &apos;cors&apos;;

@Module({})
export class AppModule implements NestModule {
  configure(consumer: MiddlewareConsumer): void {
    consumer.apply(cors()).forRoutes(&apos;*&apos;);
  }
}
</code></pre>
<!--kg-card-end: markdown--><p>This is the normal process for adding third-party middleware to NestJS, but what do you do when the middleware you want to use is not officially supported by the framework? That&apos;s the topic of the next section, where we&apos;ll discuss the options for serving unsupported third-party middleware to the NestJS dependency injection layer.</p><p><strong>Serving unsupported third-party middleware</strong></p><p>As mentioned earlier, I recently ran into the challenge of integrating <strong>Cloudinary</strong> in a <strong>NestJS</strong> codebase. Cloudinary is a popular cloud-based image hosting and manipulation service, but it is not officially supported by <strong>NestJS</strong>. This meant that I had to find a way to serve the Cloudinary middleware to the <strong>NestJS</strong> dependency injection layer.</p><p>After researching different options, I decided to create a custom provider that wrapped the Cloudinary middleware and made it available for injection. This approach involved creating a class or function that returned the middleware as an object or a function and then decorating it with the <code>@Injectable()</code> decorator. Here is an example of the <code>CloudinaryMiddleware</code> provider I created:</p><!--kg-card-begin: markdown--><pre><code class="language-typescript">import { Injectable } from &apos;@nestjs/common&apos;;
import * as cloudinary from &apos;cloudinary&apos;;

@Injectable()
export class CloudinaryMiddleware {
  getMiddleware() {
    return cloudinary.v2.uploader.upload;
  }
}
</code></pre>
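<p>One step the snippet above leaves implicit: a provider can only be injected if it is registered in a module. A minimal sketch, assuming a dedicated <code>CloudinaryModule</code> (the module name is illustrative):</p>

```typescript
import { Module } from '@nestjs/common';
import { CloudinaryMiddleware } from './cloudinary.middleware';

@Module({
  providers: [CloudinaryMiddleware],
  // Export it so other modules that import CloudinaryModule can inject it
  exports: [CloudinaryMiddleware],
})
export class CloudinaryModule {}
```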
<!--kg-card-end: markdown--><p>The <code>cloudinary.v2.uploader.upload</code> function is the main method for uploading images to Cloudinary. By wrapping it in an injectable provider, I was able to inject it into any controller and call it from a route handler in my NestJS application:</p><!--kg-card-begin: markdown--><pre><code class="language-typescript">import { Controller, Get } from &apos;@nestjs/common&apos;;
import { CloudinaryMiddleware } from &apos;./cloudinary.middleware&apos;;

@Controller()
export class SomeController {
  constructor(private readonly cloudinaryMiddleware: CloudinaryMiddleware) {}

  @Get()
  doSomething() {
    // getMiddleware() returns cloudinary.v2.uploader.upload
    const upload = this.cloudinaryMiddleware.getMiddleware();
    // call upload(...) with a file path or stream here
  }
}
</code></pre>
<!--kg-card-end: markdown--><p>The <code>CloudinaryMiddleware</code> provider is just a wrapper around the <code>cloudinary.v2.uploader.upload</code> function, which is the main method for uploading images to Cloudinary.</p><p>To actually use this middleware to upload files, you would need to do the following:</p><ol><li>Install the Cloudinary npm package: <code>npm install cloudinary</code></li><li>Set up your Cloudinary account and obtain your API key, API secret, and cloud name. You can find these details in the Cloudinary dashboard.</li><li>Configure the Cloudinary npm package with your API key, API secret, and cloud name:</li></ol><!--kg-card-begin: markdown--><pre><code class="language-javascript">import * as cloudinary from &apos;cloudinary&apos;;

cloudinary.v2.config({
  api_key: &apos;YOUR_API_KEY&apos;,
  api_secret: &apos;YOUR_API_SECRET&apos;,
  cloud_name: &apos;YOUR_CLOUD_NAME&apos;
});
</code></pre>
<!--kg-card-end: markdown--><p>4. Inject the <code>CloudinaryMiddleware</code> provider into a controller and call the upload function it exposes from within a route handler to upload an image to Cloudinary.</p><p>Here is an example of how you might use the <code>CloudinaryMiddleware</code> provider to upload an image from a NestJS route:</p><!--kg-card-begin: markdown--><pre><code class="language-typescript">import { Controller, Get } from &apos;@nestjs/common&apos;;
import { CloudinaryMiddleware } from &apos;./cloudinary.middleware&apos;;

@Controller()
export class SomeController {
  constructor(private readonly cloudinaryMiddleware: CloudinaryMiddleware) {}

  @Get()
  async doSomething() {
    const imagePath = &apos;/path/to/image.jpg&apos;;
    const result = await this.cloudinaryMiddleware.getMiddleware()(imagePath);
    console.log(result); // logs the uploaded image details
  }
}
</code></pre>
<!--kg-card-end: markdown--><p><strong>Conclusion</strong></p><p>In this blog post, we looked at the problem of serving an unsupported third-party middleware (<strong>Cloudinary</strong>) to the <strong>NestJS</strong> dependency injection layer. We discussed two options for solving this problem: creating a custom provider, and applying the middleware directly to the underlying HTTP server.</p><p>Using a custom provider, we were able to wrap the Cloudinary uploader and make it available for injection into a module or controller. This allowed us to call the Cloudinary upload function from a specific route or controller in our NestJS application.</p><p>Alternatively, we could have retrieved the underlying Express instance (for example through Nest&apos;s <code>HttpAdapterHost</code>) and registered the middleware globally with <code>app.use()</code>. This approach may be useful if you want to apply the middleware to the entire application.</p><p>I hope that by reading this blog post, you&apos;ve gained a better understanding of how to serve unsupported third-party middleware to the NestJS dependency injection layer. 
Whether you&apos;re working with Cloudinary or another service, these techniques can help you extend the capabilities of your NestJS applications and take your skills to the next level.</p><hr><p><strong>References</strong>:</p><ul><li>NestJS documentation: <a href="https://docs.nestjs.com/">https://docs.nestjs.com/</a></li><li>NestJS middleware tutorial: <a href="https://docs.nestjs.com/middleware">https://docs.nestjs.com/middleware</a></li><li>NestJS dependency injection documentation: <a href="https://docs.nestjs.com/fundamentals/dependency-injection">https://docs.nestjs.com/fundamentals/dependency-injection</a></li><li>Cloudinary documentation: <a href="https://cloudinary.com/documentation">https://cloudinary.com/documentation</a></li><li>Cloudinary npm package: <a href="https://www.npmjs.com/package/cloudinary">https://www.npmjs.com/package/cloudinary</a></li></ul>]]></content:encoded></item><item><title><![CDATA[Brainstorming Ideas for Exposing a Postgres 9.6 Server for Remote Access on a Custom Domain with Nginx Reverse Proxy (Ubuntu 18.04)]]></title><description><![CDATA[Exposing a local Postgres server for remote access on a custom domain using Nginx as a reverse proxy can be a useful configuration for a variety of scenarios, such ...]]></description><link>https://nkpremices.com/setting-up-nginx-as-revers-proxy-for-postgres-ubuntu/</link><guid isPermaLink="false">61dacda18d3d0705e2977e92</guid><category><![CDATA[2022]]></category><category><![CDATA[Blog]]></category><category><![CDATA[Databases]]></category><category><![CDATA[Postgres]]></category><category><![CDATA[Ubuntu]]></category><category><![CDATA[Nginx]]></category><dc:creator><![CDATA[Prémices N. 
Kamasuwa]]></dc:creator><pubDate>Fri, 16 Dec 2022 23:01:00 GMT</pubDate><media:content url="https://nkpremices.com/content/images/2023/01/databaseadministration.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://nkpremices.com/content/images/2023/01/databaseadministration.jpg" alt="Brainstorming Ideas for Exposing a Postgres 9.6 Server for Remote Access on a Custom Domain with Nginx Reverse Proxy (Ubuntu 18.04)"><p>Exposing a local Postgres server for remote access on a custom domain using Nginx as a reverse proxy can be a useful configuration for a variety of scenarios, such as hosting a database for a web application or enabling remote access for database administration tasks. In this blog post, we&apos;ll explore some ideas for setting up this configuration on a server running Ubuntu 18.04.</p><p>Throughout this post, we&apos;ll brainstorm different approaches to setting up a Postgres server for remote access on a custom domain with Nginx reverse proxy, considering factors such as security, performance, and maintenance. Whether you&apos;re a seasoned Postgres administrator or just getting started, we hope that this post will provide some useful insights and ideas for setting up this configuration.</p><h4 id="setting-up-nginx-as-a-reverse-proxy">Setting up Nginx as a reverse proxy</h4><p>To set up Nginx as a reverse proxy for a Postgres server, you will need to perform the following steps:</p><ol><li>Install Nginx on your server: To install Nginx on a server running Ubuntu 18.04, you can use the following command:</li></ol><!--kg-card-begin: markdown--><pre><code class="language-bash">sudo apt-get update
sudo apt-get install nginx
</code></pre>
<!--kg-card-end: markdown--><p>2. Configure Nginx as a reverse proxy: Postgres speaks its own binary protocol over TCP rather than HTTP, so the proxying has to happen in Nginx&apos;s <code>stream</code> module instead of an <code>http</code> <code>server</code> block (depending on your Nginx build, the stream module may need to be installed or loaded as a dynamic module). Create a configuration file, for example <code>postgres.conf</code>, with the following contents and include it from the top-level <code>stream</code> context of <code>nginx.conf</code>:</p><!--kg-card-begin: markdown--><pre><code class="language-bash">stream {
    server {
        listen 5432;
        proxy_pass 127.0.0.1:5432;
    }
}
</code></pre>
<!--kg-card-end: markdown--><p>This configuration will cause Nginx to listen for incoming TCP connections on port 5432 and forward them to the Postgres server on <code>127.0.0.1:5432</code>. If Nginx runs on the same machine as Postgres, keep Postgres listening on <code>localhost</code> only and bind Nginx to the server&apos;s public IP address (for example <code>listen 1.2.3.4:5432;</code>) so the two do not conflict. Because the <code>stream</code> module forwards raw TCP and never sees a hostname, using a custom domain such as <code>example.com</code> is simply a matter of pointing a DNS record at the server&apos;s IP address.</p><p>3. Restart Nginx: After saving the configuration file, you will need to restart Nginx to apply the changes. You can do this by running the following command:</p><!--kg-card-begin: markdown--><pre><code class="language-bash">sudo systemctl restart nginx
</code></pre>
<!--kg-card-end: markdown--><p>Test the configuration: To test the configuration, you can use a tool such as <code>psql</code> to connect to the Postgres server through Nginx. For example:</p><!--kg-card-begin: markdown--><pre><code class="language-bash">psql -h example.com -U postgres
</code></pre>
<!--kg-card-end: markdown--><p>If the configuration is working correctly, you should receive a response from the Postgres server.</p><p><strong>Allowing remote connections to the Postgres server</strong></p><p>By default, Postgres is configured to only listen for connections from localhost. To allow remote connections to the Postgres server, you will need to perform the following steps:</p><ol><li>Edit the <code>postgresql.conf</code> file: Open the <code>postgresql.conf</code> file in a text editor and locate the <code>listen_addresses</code> parameter. Set this parameter to <code>&apos;*&apos;</code> to allow Postgres to listen for connections from any host.</li><li>Edit the <code>pg_hba.conf</code> file: Open the <code>pg_hba.conf</code> file in a text editor and add a line to allow connections from the IP address of the server where Nginx is running. For example:</li></ol><p><code>host &#xA0; &#xA0;all &#xA0; &#xA0; &#xA0; &#xA0; &#xA0; &#xA0; all &#xA0; &#xA0; &#xA0; &#xA0; &#xA0; &#xA0; 1.2.3.4/32 &#xA0; &#xA0; &#xA0; &#xA0; &#xA0; &#xA0;md5</code></p><p>Replace <code>1.2.3.4</code> with the actual IP address of the server where Nginx is running.</p><p>3. Restart the Postgres server: After making these changes, you will need to restart the Postgres server to apply the changes. You can do this by running the following command:</p><p><code>sudo service postgresql restart</code></p><p>4. Test the configuration: To test the configuration, you can use a tool such as <code>psql</code> to connect to the Postgres server from a remote location. For example:</p><p><code>psql -h example.com -U postgres</code></p><p>If the configuration is working correctly, you should be able to connect to the Postgres server from a remote location.<br><br><strong>Configuring SSL/TLS for HTTPS</strong></p><p>To secure the connection between the client and the Postgres server using HTTPS, you will need to obtain an SSL/TLS certificate and configure Nginx to use it. 
There are two main options for obtaining a certificate:</p><ol><li>Obtain a certificate from a trusted certificate authority (CA): One option is to obtain a certificate from a trusted CA such as Let&apos;s Encrypt or DigiCert. These CAs offer free and low-cost certificates that are widely recognized as trusted by web browsers and other clients. To obtain a certificate from a CA, you will need to follow the CA&apos;s specific instructions for generating and installing a certificate.</li><li>Use a self-signed certificate: Another option is to generate a self-signed certificate for testing or development purposes. While self-signed certificates are not trusted by web browsers and other clients by default, they can be useful for testing or prototyping. To generate a self-signed certificate, you can use the <code>openssl</code> tool. For example:</li></ol><!--kg-card-begin: markdown--><pre><code class="language-bash">openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout example.key -out example.crt
</code></pre>
<!--kg-card-end: markdown--><p>This command will generate a self-signed certificate and a private key, which you can use to configure Nginx.</p><p>To configure Nginx to use an SSL/TLS certificate, you will need to modify the Nginx configuration file that you created earlier. Specifically, you will need to add the following lines to the <code>server</code> block:</p><!--kg-card-begin: markdown--><pre><code>listen 443 ssl;
    ssl_certificate /path/to/example.crt;
    ssl_certificate_key /path/to/example.key;
</code></pre>
<!--kg-card-end: markdown--><p>Make sure to replace <code>/path/to/example.crt</code> and <code>/path/to/example.key</code> with the actual paths to the certificate and private key files.</p><p>After making these changes, you will need to restart Nginx to apply the changes. You can do this by running the following command:</p><p><code>sudo systemctl restart nginx</code></p><p>To test the configuration, you can use a tool such as <code>curl</code> to send a request to the Postgres server over HTTPS. For example:</p><p><code>curl --insecure <a href="https://example.com">https://example.com</a></code></p><p>If the configuration is working correctly, you should receive a response from the Postgres server.</p><h4 id="testing-the-setup">Testing the setup</h4><p>To test the setup and ensure that everything is working as expected, you can use a tool such as <code>psql</code> to connect to the Postgres server from a remote location.</p><p>To connect to the Postgres server using <code>psql</code>, you will need to specify the hostname of the server (e.g., <code>example.com</code>) and the username of a Postgres user that has the necessary privileges to connect to the server.</p><p>For example:</p><!--kg-card-begin: markdown--><pre><code>psql -h example.com -U postgres
</code></pre>
<!--kg-card-end: markdown--><p>If the setup is working correctly, you should be able to connect to the Postgres server and perform database tasks such as creating tables and inserting data.</p><p>You can also use other tools such as <code>pgadmin</code> or a web-based administration tool to connect to the Postgres server and perform database tasks.</p><h4 id="conclusion">Conclusion</h4><p>In this blog post, we explored some ideas for setting up a Postgres server for remote access on a custom domain with Nginx as a reverse proxy. We considered factors such as security, performance, and maintenance, and brainstormed different approaches to configuring Nginx and Postgres to allow remote connections.</p><p>Please note that these ideas are for brainstorming purposes only and have not been tested. If you have tried any of these approaches and encountered any issues, or if you have any suggestions for improving the setup, please leave a comment below. We&apos;d love to hear from you!</p><p>Whether you are a seasoned Postgres administrator or just getting started, we hope that you found this post helpful and informative. If you have any further questions or need more assistance with this topic, please don&apos;t hesitate to reach out. 
We&apos;d be happy to help!</p><hr><p><strong>References</strong></p><ol><li><a href="https://www.postgresql.org/docs/">https://www.postgresql.org/docs/</a></li><li><a href="https://nginx.org/en/docs/">https://nginx.org/en/docs/</a></li><li><a href="https://letsencrypt.org/">https://letsencrypt.org/</a></li><li><a href="https://www.openssl.org/docs/">https://www.openssl.org/docs/</a></li><li><a href="https://www.postgresql.org/docs/current/app-psql.html">https://www.postgresql.org/docs/current/app-psql.html</a></li><li><a href="https://wiki.postgresql.org/">https://wiki.postgresql.org/</a></li><li><a href="https://www.pgadmin.org/docs/">https://www.pgadmin.org/docs/</a></li></ol>]]></content:encoded></item><item><title><![CDATA[Why Software Developers should make time for side projects]]></title><description><![CDATA[Why Software Developers should make time for side projects]]></description><link>https://nkpremices.com/why-software-developers-should-make-time-for-side-projects/</link><guid isPermaLink="false">63a5887644126d0512887fa8</guid><category><![CDATA[Inspiration]]></category><category><![CDATA[2022]]></category><category><![CDATA[Blog]]></category><category><![CDATA[Tech]]></category><dc:creator><![CDATA[Prémices N. Kamasuwa]]></dc:creator><pubDate>Fri, 02 Dec 2022 13:55:00 GMT</pubDate><media:content url="https://nkpremices.com/content/images/2022/12/cq5dam.web.3840.3840.jpeg" medium="image"/><content:encoded><![CDATA[<img src="https://nkpremices.com/content/images/2022/12/cq5dam.web.3840.3840.jpeg" alt="Why Software Developers should make time for side projects"><p>As a software developer, it&apos;s easy to get caught up in the daily grind of working on client projects or tasks assigned by your employer. 
While it&apos;s important to focus on your job and deliver quality work, it&apos;s also essential to make time for side projects - those little side ventures that allow you to explore new technologies, improve your skills, and work on something that is purely for your own enjoyment.</p><p>Many successful developers have credited their side projects as a crucial factor in their growth and career advancement. In this blog post, we&apos;ll explore the benefits of side projects and why it&apos;s important to never underestimate their potential. So let&apos;s get started!</p><p><strong>The benefits of side projects</strong></p><p>Side projects are a crucial aspect of the software development world, and they should never be underestimated.</p><p>There are countless benefits to pursuing side projects as a software developer. For one, side projects are a great way to learn new technologies and expand your knowledge set. Whether you want to try out a new programming language, explore a new framework, or experiment with a new library, side projects provide the perfect opportunity to do so. Not only will you have the freedom to try out new technologies and approaches, but you will also have the chance to apply your learning in a real-world context.</p><p>Another benefit of side projects is that they can help you keep your skills up to date. In the fast-paced world of software development, it is essential to stay current with the latest trends and best practices. Side projects provide the perfect opportunity to do so, as they allow you to experiment with new technologies and approaches that you might not have the chance to explore in your day-to-day work.</p><p>In addition to learning and improving your skills, side projects are also a great way to work on something you are passionate about. Whether you are interested in building a new product, solving a specific problem, or just exploring your own interests, side projects provide the perfect opportunity to do so. 
By pursuing a side project that you are passionate about, you can find personal fulfillment and enjoyment in your work, which can be especially rewarding in times when your day job might feel monotonous or unfulfilling.</p><p>Finally, side projects can also be a great way to make your resume stand out. By demonstrating your ability to take initiative and complete projects on your own, you can show potential employers that you are a self-starter who is capable of handling complex tasks and delivering results.</p><p><strong>My personal story </strong><br><br>As a software developer, I know firsthand the value of side projects. In 2019, I started learning NestJs as a way to expand my knowledge and have fun. As a result, in 2020 I was able to use this knowledge to get a new position as a backend engineer at Data Systems in Kigali, Rwanda, where I built the backend of a huge edTech platform that was planned to be used by multiple schools in the country as a management tool.</p><p>One reason I enjoyed working with NestJs was that, at my previous company (Andela), many of my colleagues hated working with the front-end framework Angular. Personally, I have a drive to tackle the things that others avoid. So, I decided to learn Angular and even built an application to help manage mentorship internally at Andela. This side project helped me gain attention from my manager, who recommended me to Data Systems.</p><p>In 2020, a group of friends and I came together to build a side project that could potentially help us manage our community, share job posts, and connect with each other. This project ended up being a key factor in my next interview, with DEJ Technology GmbH. During the interview, I talked about the project and did some technical tests, and I was offered the job.</p><p>While working at DEJ, my friends and I created an application that manages hospitals using NextJs and NestJs. 
We decided to add a blockchain layer to the project as a way to stand out in the market if we decided to sell the product one day. This project allowed me the flexibility to implement all the ideas I had for a project, so I built microservices in Elixir, Typescript, Python Django, and Golang just for fun. Little did I know that this would help me get my current job, where I work on multiple products in various languages.</p><p>This year, I started building a clone of Dropbox using multiple languages as a proof of concept for distributed systems. I want to invest my future in this area of the industry, and I know that this project will help me get my next job.<br><br><strong>Conclusion</strong><br><br>Whether you&apos;re looking to learn a new tech stack, expand your knowledge set, keep your skills up to date, work on something you&apos;re passionate about, or even earn some passive income, side projects can be a valuable asset. </p><p>To get started with your own side project, consider the following tips:</p><ul><li>Find a project idea that aligns with your interests and goals.</li><li>Set clear goals and a timeline for your project.</li><li>Break your project into manageable chunks to make it more achievable.</li><li>Find resources and support to help you along the way.</li><li>Don&apos;t be afraid to pivot or adjust your project as you go.</li><li>Most importantly, have fun and enjoy the process!</li></ul>]]></content:encoded></item><item><title><![CDATA[Django: Testing, Factories, and Data Seeding (Pytest, Mixer)]]></title><description><![CDATA[If you're a Django developer, you know how important it is to have a solid suite of tests to ensure that your code is working as expected. But what do you do when you inherit a huge codebase that has zero tests? 
That was the situation I found myself in earlier this year (2022)...]]></description><link>https://nkpremices.com/graphene-testing-factories-and-data-seeding-in-django-pytest-mixer/</link><guid isPermaLink="false">62bde1718d3d0705e2977ece</guid><category><![CDATA[2022]]></category><category><![CDATA[Blog]]></category><category><![CDATA[Python]]></category><category><![CDATA[Testing]]></category><dc:creator><![CDATA[Prémices N. Kamasuwa]]></dc:creator><pubDate>Thu, 01 Dec 2022 22:17:00 GMT</pubDate><media:content url="https://nkpremices.com/content/images/2023/01/1_hCepk89R2z0Jp3OyWdvJ8Q.jpeg" medium="image"/><content:encoded><![CDATA[<img src="https://nkpremices.com/content/images/2023/01/1_hCepk89R2z0Jp3OyWdvJ8Q.jpeg" alt="Django: Testing, Factories, and Data Seeding (Pytest, Mixer)"><p>If you&apos;re a Django developer, you know how important it is to have a solid suite of tests to ensure that your code is working as expected. But what do you do when you inherit a huge codebase that has zero tests? That was the situation I found myself in at some point in 2022. </p><p>Coming from a project that was built using Elixir/Phoenix Live view and React, I was familiar with the concept of factories and the benefits they provide for test data management. I decided to bring this approach over to my Django project and was pleased with the results. </p><p>In this blog post, I&apos;ll explain how I set up Pytest for my Django project, defined factories using the Mixer library, and used them in my tests to create test data and run assertions. I&apos;ll also cover some advanced techniques for using factories, such as seeding the database. If you&apos;re looking for a more efficient and organized way to manage test data in your Django projects, read on to learn more about using factories with Pytest and Mixer.</p><ol><li><strong>Setting up Pytest</strong></li></ol><p>Before we can start using factories in our Django tests, we need to set up Pytest as our testing framework. 
Pytest is a powerful, feature-rich testing tool that is well-suited for testing Django applications. It&apos;s easy to install and has a number of plugins and features that can make testing Django projects a breeze.</p><p>To get started with Pytest in a Django project, you&apos;ll need to install the pytest and pytest-django packages. You can do this using pip:</p><!--kg-card-begin: markdown--><pre><code class="language-bash">pip install pytest pytest-django
</code></pre>
<!--kg-card-end: markdown--><p>If your team uses flake8 and black, you might need some extra configuration, because pytest can also run flake8 and black checks (via the <code>pytest-flake8</code> and <code>pytest-black</code> plugins) to catch linting issues. To do that, first install the tools and plugins with the following command:</p><!--kg-card-begin: markdown--><pre><code class="language-bash">pip install flake8 black pytest-flake8 pytest-black
</code></pre>
<!--kg-card-end: markdown--><p>Then create a file named <code>pytest.ini</code>. Here is an example:</p><!--kg-card-begin: markdown--><pre><code class="language-ini">[pytest]
filterwarnings =
    error
    ignore::UserWarning
    ignore:function ham\(\) is deprecated:DeprecationWarning
DJANGO_SETTINGS_MODULE = server.settings
flake8-max-line-length = 120
</code></pre>
<!--kg-card-end: markdown--><p>This file provides a few configs for <code>pytest</code>: the <code>filterwarnings</code> setting controls which warnings are escalated or ignored (please read the official docs linked under references to understand it in depth), <code>DJANGO_SETTINGS_MODULE</code> points to the <code>settings.py</code> file that Django uses for global configs (in our case a file under a directory called <code>server</code>), and <code>flake8-max-line-length</code> sets the maximum line length for flake8.</p><p>In general, this should be enough. You can simply run <code>pytest</code> from the command line to execute your tests.<br><br>By default, Pytest will discover and run all tests within the <code>tests</code> directory and its subdirectories. If you want to run a specific test or group of tests, you can use the <code>-k</code> flag to specify a test function or method name:<br></p><!--kg-card-begin: markdown--><pre><code class="language-bash">pytest -k test_function_name
</code></pre>
<!--kg-card-end: markdown--><p>2. <strong>Creating factories</strong></p><p>Now that we have Pytest set up in our Django project, let&apos;s look at how we can use factories to create test data using Pytest fixtures. A fixture is a function that returns test data and can be used in multiple tests. It allows us to easily create realistic, customizable test data without having to manually set up complex data structures or write repetitive test setup code.</p><p>To use factories as Pytest fixtures in a Django project, we can use the Mixer library. Mixer is a powerful and easy-to-use library that allows us to define factories for Django models, forms, views, and other objects. It also has a number of features for customizing factory data and creating relationships between factories.</p><p>To install Mixer, you can use pip:</p><!--kg-card-begin: markdown--><pre><code class="language-bash">pip install mixer
</code></pre>
<!--kg-card-end: markdown--><p>Once Mixer is installed, we can start defining factories for our Django models. Here is an example of a factory for a <code>Person</code> model with a <code>first_name</code> and <code>last_name</code> field:<br></p><!--kg-card-begin: markdown--><pre><code class="language-python">from mixer.backend.django import mixer

def person_factory(**kwargs):
    return mixer.blend(
        &apos;server.app.models.Person&apos;,
        first_name=mixer.sequence(lambda n: f&apos;first_name_{n}&apos;),
        last_name=mixer.sequence(lambda n: f&apos;last_name_{n}&apos;),
        **kwargs,
    )
</code></pre>
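<p>To make the behaviour of <code>mixer.sequence</code> concrete, here is a minimal, self-contained stand-in. This is purely illustrative (the real helper is a special value consumed internally by <code>blend</code>); it only demonstrates the counting idea behind it:</p>

```python
import itertools

# Illustrative stand-in for mixer.sequence: wraps a lambda around an
# incrementing counter so every call produces the next unique value.
def sequence(func):
    counter = itertools.count()
    return lambda: func(next(counter))

first_name = sequence(lambda n: f"first_name_{n}")
print(first_name())  # first_name_0
print(first_name())  # first_name_1
```

This is why the assertions later in the post can rely on deterministic values like <code>first_name_0</code>.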
<!--kg-card-end: markdown--><p>This factory uses Mixer&apos;s <code>blend</code> function to create a new <code>Person</code> object with default values for the <code>first_name</code> and <code>last_name</code> fields. The <code>mixer.sequence</code> function generates unique values for these fields using a given lambda function. We can also pass additional keyword arguments to the factory function to override the default values for any field.</p><p>We can use this factory as a Pytest fixture by decorating it with the <code>@pytest.fixture</code> decorator:</p><!--kg-card-begin: markdown--><pre><code class="language-python">import pytest

@pytest.fixture
def person(db):
    return person_factory()
</code></pre>
<!--kg-card-end: markdown--><p>This fixture returns a new <code>Person</code> object created by the <code>person_factory</code> function. The <code>db</code> fixture provided by Pytest-Django is used to ensure that the <code>Person</code> object is saved to the database. We can then use this fixture in our Pytest tests like this:</p><!--kg-card-begin: markdown--><pre><code class="language-python">def test_person_model(person):
    assert person.first_name == &apos;first_name_0&apos;
    assert person.last_name == &apos;last_name_0&apos;
</code></pre>
<!--kg-card-end: markdown--><p>This test uses the <code>person</code> fixture to get a <code>Person</code> object and then runs assertions on its fields.</p><p>By using Mixer factories as Pytest fixtures, we can easily create test data for our tests and reuse it across multiple tests. In the next section, we&apos;ll look at more advanced techniques for using factories in tests.<br><br><strong>3</strong>. <strong>Advanced techniques</strong></p><p>In the previous sections, we looked at how to set up Pytest in a Django project and how to use Mixer factories to create test data. In this section, we&apos;ll explore some advanced techniques for using factories in tests, including how to refactor our factory code to support multiple keyword arguments and how to create custom Pytest fixtures for creating test data.</p><p>First, let&apos;s refactor our <code>Person</code> factory to support multiple keyword arguments. This will allow us to easily override the default values for any field in our <code>Person</code> model:</p><!--kg-card-begin: markdown--><pre><code class="language-python">from mixer.backend.django import mixer

def person_factory(**kwargs):
    defaults = {
        &apos;first_name&apos;: mixer.sequence(lambda n: f&apos;first_name_{n}&apos;),
        &apos;last_name&apos;: mixer.sequence(lambda n: f&apos;last_name_{n}&apos;),
    }
    defaults.update(kwargs)
    return mixer.blend(&apos;server.app.models.Person&apos;, **defaults)
</code></pre>
<!--kg-card-end: markdown--><p>This updated version of the <code>person_factory</code> function defines default values for the <code>first_name</code> and <code>last_name</code> fields and then updates them with any keyword arguments passed to the factory. This allows us to easily create a <code>Person</code> object with custom values for any field by calling the factory with keyword arguments like this:</p><!--kg-card-begin: markdown--><pre><code class="language-python">person = person_factory(first_name=&apos;John&apos;, last_name=&apos;Doe&apos;)
</code></pre>
<!--kg-card-end: markdown--><p>Next, let&apos;s create a custom Pytest fixture for creating test data using our <code>person_factory</code>. Let&apos;s call it <code>insert</code>. This fixture will allow us to easily create multiple <code>Person</code> objects with different values for any field by calling it with keyword arguments.</p><p>First, we need a container for all factories; here is how you can create one:</p><!--kg-card-begin: markdown--><pre><code class="language-python">model_factories = []
</code></pre>
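<p>One simple way to populate this container is a small registration decorator. This is a sketch: the name <code>register_factory</code> and the dict-based stand-in factory below are illustrative, not part of the original code (in the real project the registered function would be the mixer-based <code>person_factory</code> from above):</p>

```python
# Registry of factory functions; the insert fixture looks factories up
# here by name, so each one must be named <model>_factory.
model_factories = []

def register_factory(factory):
    """Append a factory function to the registry and return it unchanged."""
    model_factories.append(factory)
    return factory

# Dict-based stand-in for the mixer-based person_factory defined earlier
@register_factory
def person_factory(**kwargs):
    return {"first_name": "first_name_0", "last_name": "last_name_0", **kwargs}
```

With this in place, any module that defines a factory only has to decorate it, and the <code>insert</code> fixture below can find it by name.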
<!--kg-card-end: markdown--><p>Here is an example of the <code>insert</code> fixture:</p><!--kg-card-begin: markdown--><pre><code class="language-python">import pytest

@pytest.fixture
def insert(db):
    def _insert(model_name, count, persist=True, **kwargs):
        model_factory = next(f for f in model_factories if f.__name__ == f&quot;{model_name.split(&apos;.&apos;)[-1].lower()}_factory&quot;)
        objects = [model_factory(**kwargs) for _ in range(count)]
        if persist:
            for obj in objects:
                obj.save()
        return objects
    return _insert
</code></pre>
<!--kg-card-end: markdown--><p>This fixture takes three arguments:</p><ul><li><code>model_name</code>: The name of the model as a string.</li><li><code>count</code>: The number of objects to return in a list.</li><li><code>persist</code>: A boolean indicating whether the objects should be saved to the database (defaults to <code>True</code>).</li></ul><p>It also supports additional keyword arguments that will be passed to the model factory.</p><p>Here is an example of how we can use the <code>insert</code> fixture to create multiple <code>Person</code> objects with different values for the <code>first_name</code> field:</p><!--kg-card-begin: markdown--><pre><code class="language-python">def test_person_model(insert, db):
    persons = insert(
        &apos;app.Person&apos;,
        count=3,
        first_name=mixer.sequence(lambda n: f&apos;first_name_{n}&apos;),
    )
    assert persons[0].first_name == &apos;first_name_0&apos;
    assert persons[1].first_name == &apos;first_name_1&apos;
    assert persons[2].first_name == &apos;first_name_2&apos;
</code></pre>
<!--kg-card-end: markdown--><p>This test uses the <code>insert</code> fixture to create three <code>Person</code> objects with unique <code>first_name</code> values and then runs assertions on the <code>first_name</code> fields of the objects. The <code>db</code> fixture provided by Pytest-Django is used to ensure that the <code>Person</code> objects are saved to the database.<br><br>And that is how you can easily seed dynamic data directly in tests using pytest and mixer.<br><br><strong>4. Bonus</strong></p><p>As developers, we often join projects where we are not provided with a test database dump to use for local development. In these situations, it can be useful to create a Django management command that generates dummy data for use in local development.</p><p>To create a management command for generating dummy data, we can reuse our factory functions together with the Faker library. Faker is a library that generates fake data, such as names, addresses, and phone numbers, for use in test environments. To install Faker, you can use <code>pip</code>:</p><!--kg-card-begin: markdown--><pre><code class="language-bash">$ pip install Faker
</code></pre>
<!--kg-card-end: markdown--><p>Here is an example of a Django management command called <code>db_seed</code> that generates dummy data using our factory functions. Note that Pytest fixtures cannot be called outside of a test run, so instead of asking Pytest for a fixture we look the factory up in the <code>model_factories</code> container directly (adjust the import path to wherever you defined it):</p><!--kg-card-begin: markdown--><pre><code class="language-python">import logging

import faker
from django.core.management.base import BaseCommand

# model_factories is the container of factory functions defined earlier
from server.app.factories import model_factories

class Command(BaseCommand):
    help = &apos;Seeds the database with fake data&apos;

    def add_arguments(self, parser):
        parser.add_argument(&apos;--quantity&apos;, type=int, default=1, help=&apos;Number of items to create&apos;)
        parser.add_argument(&apos;--model&apos;, type=str, required=True, help=&apos;Model to use for creating data&apos;)
        parser.add_argument(&apos;--attributes&apos;, type=str, help=&apos;Model attributes&apos;)

    def handle(self, *args, **options):
        fake = faker.Faker()  # available for handcrafted fake values if needed
        quantity = options[&apos;quantity&apos;]
        model_name = options[&apos;model&apos;].split(&apos;.&apos;)[-1].lower()

        # Turn the raw --attributes string into a dict of keyword arguments
        attributes = {}
        if options[&apos;attributes&apos;]:
            for pair in options[&apos;attributes&apos;].split(&apos;,&apos;):
                key, _, value = pair.strip().partition(&apos;=&apos;)
                attributes[key] = value.strip(&apos;\&apos;&quot;&apos;)

        # Look up the factory function by name in the container
        model_factory = next(
            f for f in model_factories if f.__name__ == f&apos;{model_name}_factory&apos;
        )

        for _ in range(quantity):
            # mixer.blend already persists each object to the database
            model_factory(**attributes)
        logging.info(f&apos;Successfully seeded {quantity} {model_name} objects&apos;)
</code></pre>
<!--kg-card-end: markdown--><p>This management command takes three arguments:</p><ul><li><code>quantity</code>: The number of items to create</li><li><code>model</code>: The name of the model to use for creating data. This argument is required.</li><li><code>attributes</code>: A string of Python-like keyword arguments to pass to the model factory.</li></ul><p>It uses the factory functions to create <code>quantity</code> number of dummy objects for the specified model, using the specified attributes. The objects are then saved to the database.</p><p>To use this management command, you would run the following command from the command line:</p><!--kg-card-begin: markdown--><pre><code class="language-bash">$ python manage.py db_seed --quantity=5 --model=app.Person --attributes=&apos;first_name=&quot;John&quot;, last_name=&quot;Doe&quot;&apos;
</code></pre>
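<p>Note that the <code>--attributes</code> value arrives as a single string, so the command has to turn it into a keyword-argument dict before calling the factory. A minimal sketch of such a parser (the function name <code>parse_attributes</code> is mine, not part of Django):</p>

```python
def parse_attributes(raw):
    """Turn 'first_name="John", last_name="Doe"' into a kwargs dict."""
    attributes = {}
    for pair in raw.split(","):
        key, _, value = pair.strip().partition("=")
        if key:
            # Drop surrounding quotes so values can be quoted on the CLI
            attributes[key] = value.strip("\"'")
    return attributes

print(parse_attributes('first_name="John", last_name="Doe"'))
# {'first_name': 'John', 'last_name': 'Doe'}
```

All parsed values stay strings; for typed fields (integers, dates) you would convert them before handing them to the factory.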
<!--kg-card-end: markdown--><p>This command would create five <code>Person</code> objects with the <code>first_name</code> of &quot;John&quot; and the <code>last_name</code> of &quot;Doe&quot;.</p><p>The <code>db_seed</code> management command uses the built-in Python logging module to log a message when the seeding is complete. You can configure the logging module to output log messages to different places, such as a file or the console, by setting up a logging configuration in your Django settings.</p><p>Furthermore, for safety, you can wrap all database operations in a transaction (for example with Django&apos;s <code>transaction.atomic</code>). That way you preserve the integrity of the data and protect the CLI from leaving partial writes behind after unexpected errors.</p><p><strong>5. Conclusion</strong><br><br>We have explored how to use Pytest, Mixer, and Faker to create factories for generating dummy data in Django tests. We have covered how to create a simple model factory, how to use Pytest fixtures to create test data, and how to create a Django management command for generating dummy data for local development.</p><p>Using these techniques, you can easily create test data for your Django applications, making it easier to write and run tests that rely on data being present in the database. This can help you ensure that your code is working correctly and reduce the risk of regressions as you make changes to your codebase.</p><p>I hope you have found this blog post helpful, and that you have learned some useful techniques for testing Django applications. If you have any questions or suggestions, please feel free to leave a comment below.</p><hr><p><strong>References</strong>:</p><!--kg-card-begin: markdown--><ol>
<li><a href="https://pypi.org/project/Faker/">https://pypi.org/project/Faker/</a></li>
<li><a href="https://pypi.org/project/mixer/">https://pypi.org/project/mixer/</a></li>
<li><a href="https://pypi.org/project/pytest/">https://pypi.org/project/pytest/</a></li>
<li><a href="https://stackoverflow.com/questions/67932110/mocking-model-user-using-mixer-throughs-error">https://stackoverflow.com/questions/67932110/mocking-model-user-using-mixer-throughs-error</a></li>
</ol>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Uploading files to Digital Ocean Spaces, NestJs]]></title><description><![CDATA[Uploading files to Digital Ocean Spaces, NestJs]]></description><link>https://nkpremices.com/uploading-files-to-digital-ocean-spaces-nestjs/</link><guid isPermaLink="false">61a0c2ca8d3d0705e2977bba</guid><category><![CDATA[Blog]]></category><category><![CDATA[Tech]]></category><category><![CDATA[Typescript]]></category><category><![CDATA[NestJs]]></category><category><![CDATA[DigitalOcean]]></category><dc:creator><![CDATA[Prémices N. Kamasuwa]]></dc:creator><pubDate>Mon, 01 Nov 2021 12:15:00 GMT</pubDate><media:content url="https://nkpremices.com/content/images/2021/11/spaces.png" medium="image"/><content:encoded><![CDATA[<img src="https://nkpremices.com/content/images/2021/11/spaces.png" alt="Uploading files to Digital Ocean Spaces, NestJs"><p>If you&apos;ve ever used the AWS SDK to upload files to S3, you know how convenient it can be. However, not everyone has the option to leverage AWS services, and that&apos;s where DigitalOcean&apos;s Spaces comes in. Spaces is an excellent alternative that offers similar functionality and is just as easy to use.</p><p>While uploading files to Spaces using a Node.js server is well-documented, resources covering the process with a NestJS server are harder to come by. In this article, I&apos;ll walk you through how to seamlessly upload files to DigitalOcean Spaces using the AWS SDK within a NestJS application.</p><p>Before we dive in, let&apos;s clarify a few assumptions: I&apos;m assuming you&apos;re already familiar with NestJS, have experience with DigitalOcean Spaces, and have set up a Spaces instance with the necessary API keys. Additionally, you should have a working NestJS project ready to go.</p><p>With those basics covered, let&apos;s move on to creating a simple controller that accepts a file from a form-data request body and uploads it to DigitalOcean Spaces. 
This should be a straightforward exercise and a great way to get comfortable with the process. Let&apos;s get started!</p><h3 id="1-creating-a-service-to-handle-file-uploads">1. Creating a Service to Handle File Uploads</h3><p>One of <strong>NestJS</strong>&apos;s standout features is its strong emphasis on the <strong>Dependency Injection (DI)</strong> design pattern. <strong>DI</strong> simplifies managing dependencies in a TypeScript codebase by resolving them automatically based on their types.</p><p>For our file upload functionality, we&apos;ll create a custom service. Unlike built-in services that NestJS can automatically resolve, custom services require a bit more setup. We&apos;ll need to create a <a href="https://docs.nestjs.com/fundamentals/custom-providers#custom-providers">custom provider</a> &#xA0;to ensure our service is correctly instantiated and injected where needed.</p><p>Before we dive into the code, let&apos;s start by installing the AWS SDK. You can do so with one of the following commands:</p><!--kg-card-begin: markdown--><p>in case you are using npm:</p>
<pre><code class="language-bash"># npm
npm install aws-sdk
</code></pre>
<pre><code class="language-bash"># yarn
yarn add aws-sdk
</code></pre>
<!--kg-card-end: markdown--><p>In our codebase, under <code>src</code>, let&apos;s create a directory called <code><strong>SpacesModule</strong></code> containing a directory called <strong><code>SpacesService</code></strong> with two files, <code>index.ts</code> and <code>doSpacesService.ts</code>. <code>index.ts</code> will contain the provider, and <code>doSpacesService.ts</code> will be the actual service.</p><!--kg-card-begin: markdown--><pre><code class="language-typescript">// index.ts
import * as AWS from &apos;aws-sdk&apos;;
import { Provider } from &apos;@nestjs/common&apos;;

// Unique identifier of the service in the dependency injection layer
export const DoSpacesServiceLib = &apos;lib:do-spaces-service&apos;;

// Creation of the value that the provider will always be returning.
// An actual AWS.S3 instance
const spacesEndpoint = new AWS.Endpoint(&apos;fra1.digitaloceanspaces.com&apos;);

const S3 = new AWS.S3({
  endpoint: spacesEndpoint.href,
  credentials: new AWS.Credentials({
    accessKeyId: &apos;&lt;put-your-digital-ocean-spaces-key-here&gt;&apos;,
    secretAccessKey: &apos;&lt;put-your-digital-ocean-spaces-secret-here&gt;&apos;,
  }),
});

// Now comes the provider
export const DoSpacesServicerovider: Provider&lt;AWS.S3&gt; = {
  provide: DoSpacesServiceLib,
  useValue: S3,
};

// This is just a simple interface that represents an uploaded file object 
export interface UploadedMulterFileI {
  fieldname: string;
  originalname: string;
  encoding: string;
  mimetype: string;
  buffer: Buffer;
  size: number;
}

</code></pre>
<!--kg-card-end: markdown--><p>Now, let&apos;s create the service with a method called <code>uploadFile</code>:</p><!--kg-card-begin: markdown--><pre><code class="language-typescript">// doSpacesService.ts
import { Inject, Injectable } from &apos;@nestjs/common&apos;;
import * as AWS from &apos;aws-sdk&apos;;
import { DoSpacesServiceLib, UploadedMulterFileI } from &apos;./index&apos;;

// Typical nestJs service
@Injectable()
export class DoSpacesService {
  constructor(@Inject(DoSpacesServiceLib) private readonly s3: AWS.S3) {}

  async uploadFile(file: UploadedMulterFileI) {
    // Precaution to avoid having 2 files with the same name
    const fileName = `${Date.now()}-${file.originalname}`;

    // Return a promise that resolves only when the file upload is complete
    return new Promise((resolve, reject) =&gt; {
      this.s3.putObject(
        {
          Bucket: &apos;&lt;put-here-the-name-of-your-spaces-bucket&gt;&apos;,
          Key: fileName,
          Body: file.buffer,
          ACL: &apos;public-read&apos;,
        },
        (error: AWS.AWSError) =&gt; {
          if (!error) {
            resolve(`&lt;put-here-the-public-link-to-your-spaces-instance&gt;/${fileName}`);
          } else {
            reject(
              new Error(
                `DoSpacesService_ERROR: ${error.message || &apos;Something went wrong&apos;}`,
              ),
            );
          }
        },
      );
    });
  }
}

</code></pre>
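<p>The core trick in <code>uploadFile</code> is wrapping the callback-based <code>putObject</code> call in a Promise so callers can simply <code>await</code> it. Stripped of the AWS specifics, the pattern looks like this (<code>putObjectMock</code> is an illustrative stand-in, not part of the AWS SDK):</p>

```typescript
type NodeCallback = (error: Error | null) => void;

// Stand-in for s3.putObject: a Node-style API that reports success or
// failure through a callback instead of returning a Promise.
function putObjectMock(params: { Key: string }, callback: NodeCallback): void {
  callback(params.Key ? null : new Error("missing key"));
}

// Wrap the callback API in a Promise, resolving with the public URL on
// success and rejecting with a descriptive error otherwise.
function upload(fileName: string, baseUrl: string): Promise<string> {
  return new Promise((resolve, reject) => {
    putObjectMock({ Key: fileName }, (error) => {
      if (!error) {
        resolve(`${baseUrl}/${fileName}`);
      } else {
        reject(new Error(`DoSpacesService_ERROR: ${error.message}`));
      }
    });
  });
}
```

The service above follows exactly this shape, with the real <code>this.s3.putObject</code> in place of the mock.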
<!--kg-card-end: markdown--><h3 id="2-as-the-last-pieces-of-the-puzzle-lets-create-a-module-to-wrap-everything-and-the-controller">2. As the last pieces of the puzzle, let&apos;s create a module to wrap everything, and the controller</h3><p>Under the <code>SpacesModule</code> directory, let&apos;s create two files, <code>spaces.module.ts</code> and <code>spaces.controller.ts</code>. At this point, our <code>SpacesModule</code> directory looks like this:</p><!--kg-card-begin: markdown--><pre><code>src
    |
    |SpacesModule
                |
                |-SpacesService
                |             |
                |             | doSpacesService.ts
                |             | index.ts
                | spaces.controller.ts
                | spaces.module.ts
                

</code></pre>
<!--kg-card-end: markdown--><p>In <code>spaces.controller.ts</code>, let&apos;s have the following:</p><!--kg-card-begin: markdown--><pre><code class="language-typescript">import {
  Controller,
  UploadedFile,
  UseInterceptors,
  Post
} from &apos;@nestjs/common&apos;;
import { FileInterceptor } from &apos;@nestjs/platform-express&apos;;
import { DoSpacesService } from &apos;./SpacesService/doSpacesService&apos;;
import { UploadedMulterFileI } from &apos;./SpacesService&apos;;

// just a typical nestJs controller
@Controller(&apos;/api/v1/do&apos;)
export class SpacesController {
  constructor(
    private readonly doSpacesService: DoSpacesService,
  ) {}

  @UseInterceptors(FileInterceptor(&apos;file&apos;))
  @Post(&apos;spaces&apos;)
  async uploadFile(@UploadedFile() file: UploadedMulterFileI) {
    const url = await this.doSpacesService.uploadFile(file);

    return {
      url,
    };
  }
}

</code></pre>
<!--kg-card-end: markdown--><p>In <code>spaces.module.ts</code>, let&apos;s have the following:</p><!--kg-card-begin: markdown--><pre><code class="language-typescript">import { Module } from &apos;@nestjs/common&apos;;
import { SpacesController } from &apos;./spaces.controller&apos;;
import { DoSpacesService } from &apos;./SpacesService/doSpacesService&apos;;
import { DoSpacesServicerovider } from &apos;./SpacesService&apos;;

@Module({
  imports: [],
  controllers: [SpacesController],
  // provide both the service and the custom provider
  providers: [DoSpacesServicerovider, DoSpacesService],
})
export class SpacesModule {}


</code></pre>
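<p>To wire everything up before testing, the new module has to be registered in the root module. A minimal sketch, assuming the default <code>app.module.ts</code> generated by the Nest CLI (adjust the import path to your layout):</p>

```typescript
// app.module.ts (sketch; assumes SpacesModule lives under src/SpacesModule)
import { Module } from "@nestjs/common";

import { SpacesModule } from "./SpacesModule/spaces.module";

@Module({
  // Registering SpacesModule makes its controller routes available
  imports: [SpacesModule],
})
export class AppModule {}
```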
<!--kg-card-end: markdown--><p>We&apos;ve covered all the steps necessary to handle file uploads to Digital Ocean Spaces using a NestJS server and the AWS SDK. All that&apos;s left to do now is to add the module we created to the main app module and then send a POST request to the <code>/api/v1/do/spaces</code> endpoint with the file attached as a form field named <code>file</code>. If everything is set up correctly, you should receive a URL back in the response, and you can check the file on Digital Ocean Spaces to confirm that the upload was successful.</p><p><strong>References</strong></p><ol><li>NestJs Custom Providers: <a href="https://docs.nestjs.com/fundamentals/custom-providers#custom-providers">https://docs.nestjs.com/fundamentals/custom-providers#custom-providers</a></li><li>AWS NPM SDK: <a href="https://www.npmjs.com/package/aws-sdk">https://www.npmjs.com/package/aws-sdk</a></li><li>Digital Ocean Spaces: <a href="https://www.digitalocean.com/products/spaces/">https://www.digitalocean.com/products/spaces/</a></li></ol>]]></content:encoded></item><item><title><![CDATA[Boilerplate of A Desktop App With Electron &amp; React/Typescript (For busy developers)]]></title><description><![CDATA[Boilerplate for A Desktop App With Electron &amp; React/Typescript ]]></description><link>https://nkpremices.com/creating-a-boilerplate-eletron-react-ts/</link><guid isPermaLink="false">61060334eb869117ff36e851</guid><category><![CDATA[React]]></category><category><![CDATA[Typescript]]></category><category><![CDATA[Javascript]]></category><category><![CDATA[Electron]]></category><category><![CDATA[Blog]]></category><dc:creator><![CDATA[Ghost]]></dc:creator><pubDate>Tue, 12 Oct 2021 02:13:00 GMT</pubDate><media:content url="https://nkpremices.com/content/images/2021/10/Screen-Shot-2021-10-13-at-00.55.21.png" medium="image"/><content:encoded><![CDATA[<img src="https://nkpremices.com/content/images/2021/10/Screen-Shot-2021-10-13-at-00.55.21.png" alt="Boilerplate of A Desktop App With
Electron &amp; React/Typescript (For busy developers)"><p><strong>Hey busy developers!</strong> Looking for a quick and straightforward way to get started with Electron, React, and TypeScript? You&apos;re in the right place! In this article, I&apos;ll guide you through creating a boilerplate for a desktop app using these technologies.</p><p>Now, you might be thinking: <em>&quot;But I&apos;ve never worked with Electron before, and I don&apos;t have time to read the documentation!&quot;</em> Don&apos;t worry&#x2014;that&apos;s exactly why this tutorial exists. I&apos;ve designed it to be as simple and streamlined as possible, so you can have your app up and running in no time.</p><p>It&apos;s important to note that this isn&apos;t the only way to set up an Electron app. However, it&apos;s one of the fastest methods, perfect for those short on time who just want to get things up and running quickly.</p><p>So, let&apos;s get started!</p><p>Before diving in, it&apos;s essential to understand the basic structure of an Electron app. Essentially, it&apos;s a web application embedded within a Chromium/Electron/Node.js framework. 
To help illustrate this, check out the image below, which shows the build process of an Electron app:</p><figure class="kg-card kg-image-card"><img src="https://nkpremices.com/content/images/2021/10/Screen-Shot-2021-10-13-at-02.31.37.png" class="kg-image" alt="Boilerplate of A Desktop App With Electron &amp; React/Typescript (For busy developers)" loading="lazy" width="1282" height="235" srcset="https://nkpremices.com/content/images/size/w600/2021/10/Screen-Shot-2021-10-13-at-02.31.37.png 600w, https://nkpremices.com/content/images/size/w1000/2021/10/Screen-Shot-2021-10-13-at-02.31.37.png 1000w, https://nkpremices.com/content/images/2021/10/Screen-Shot-2021-10-13-at-02.31.37.png 1282w" sizes="(min-width: 720px) 720px"></figure><p>Let&apos;s start by creating a basic react app using create-react-app with the command</p><!--kg-card-begin: markdown--><p><code>npx create-react-app ./ --template typescript</code></p>
<!--kg-card-end: markdown--><p>This will create an empty react app in the current directory.</p><!--kg-card-begin: markdown--><p>Now let&apos;s install electron with the command</p>
<pre><code class="language-javascript">npm install electron
</code></pre>
<p>and then create a directory called electron with the following files:</p>
<ul>
<li>index.ts</li>
<li>preload.ts</li>
<li>tsconfig.json</li>
</ul>
<!--kg-card-end: markdown--><ol><li><strong>index.ts</strong></li></ol><p>The index.ts file is the entry point for Electron. It contains all the necessary instructions to start a basic Electron browser window and load the index.html file from the build folder. This file, in turn, loads all the bundles created by the build script of React-scripts. The content of the index.ts file includes inline comments that provide explanations for each step of the process. <br><br>Here is the code:</p><!--kg-card-begin: markdown--><pre><code class="language-javascript">import { app, BrowserWindow } from &apos;electron&apos;;
import * as path from &apos;path&apos;;

let mainWindow: Electron.BrowserWindow | null;

function createWindow() {
  // Create the browser window.
  mainWindow = new BrowserWindow({
    webPreferences: {
      preload: path.join(__dirname, &apos;preload.js&apos;),
    },
  });

  // and load the index.html of the app.
  mainWindow.loadFile(path.join(__dirname, &apos;index.html&apos;));

  // Open the DevTools.
  // mainWindow.webContents.openDevTools();

  // Emitted when the window is closed.
  mainWindow.on(&apos;closed&apos;, () =&gt; {
    // Dereference the window object, usually you would store windows
    // in an array if your app supports multi windows, this is the time
    // when you should delete the corresponding element.
    mainWindow = null;
  });
  mainWindow.maximize();
}

// This method will be called when Electron has finished
// initialization and is ready to create browser windows.
// Some APIs can only be used after this event occurs.
app.on(&apos;ready&apos;, () =&gt; {
  createWindow();
});

// Quit when all windows are closed.
app.on(&apos;window-all-closed&apos;, () =&gt; {
  // On OS X it is common for applications and their menu bar
  // to stay active until the user quits explicitly with Cmd + Q
  if (process.platform !== &apos;darwin&apos;) {
    app.quit();
  }
});

app.on(&apos;activate&apos;, () =&gt; {
  // On OS X it&apos;s common to re-create a window in the app when the
  // dock icon is clicked and there are no other windows open.
  if (mainWindow === null) {
    createWindow();
  }
});

</code></pre>
<!--kg-card-end: markdown--><p><strong>2. preload.ts</strong></p><p>You can use <code>preload.ts</code>, which <strong>will be loaded before other</strong> scripts run on the main page. This script always has access to the Electron APIs, the Node.js APIs (and also the browser APIs), no matter whether node integration is turned on or off.</p><p>Since the app that we are building is pretty basic, it will not have much content. </p><!--kg-card-begin: markdown--><pre><code class="language-javascript">// All of the Node.js APIs are available in the preload process
// it has the same sandbox as a Chrome extension
window.addEventListener(&apos;DOMContentLoaded&apos;, () =&gt; {});

export {};
</code></pre>
<!--kg-card-end: markdown--><p><strong>3. tsconfig.json</strong></p><p>If you&apos;ve worked with TypeScript before, you&apos;re probably familiar with the tsconfig.json file. For those who aren&apos;t, tsconfig.json is a configuration file that specifies the root-level files and compiler options needed to compile a TypeScript project. The presence of this file in a directory marks that directory as the root of a TypeScript project. You can find more information about tsconfig.json and its various properties in the TypeScript documentation, or <a href="https://dzone.com/articles/what-is-the-tsconfigjson-configuration-file#:~:text=Like (2)-,The tsconfig.,is the TypeScript project root.&amp;text=json %2C it&apos;s various properties%2C and how to extend it.">here</a>.</p><p>In our case, the content should look like the following:</p><!--kg-card-begin: markdown--><pre><code class="language-json">{
  &quot;compilerOptions&quot;: {
    &quot;module&quot;: &quot;commonjs&quot;,
    &quot;noImplicitAny&quot;: true,
    &quot;sourceMap&quot;: true,
    &quot;outDir&quot;: &quot;../build&quot;,
    &quot;baseUrl&quot;: &quot;.&quot;,
    &quot;paths&quot;: {
      &quot;*&quot;: [&quot;node_modules/*&quot;]
    }
  },
  &quot;include&quot;: [
    &quot;**/*&quot;
  ]
}
</code></pre>
<!--kg-card-end: markdown--><p>Now the last thing that we have to do is configure the scripts in <code>package.json</code> &#xA0;exactly as the image at the beginning of this post describes it.</p><!--kg-card-begin: markdown--><pre><code class="language-json">   {
      &quot;scripts&quot;: {
         &quot;build:web&quot;: &quot;PUBLIC_URL=./ react-scripts build&quot;,
         &quot;build:desktop&quot;: &quot;tsc -p electron/tsconfig.json&quot;,
         &quot;start:desktop&quot;: &quot;npm run build:web &amp;&amp; npm run build:desktop &amp;&amp; electron ./build/index.js&quot;
     }
   }
</code></pre>
<!--kg-card-end: markdown--><p>Now that we have everything set up, we can run the command <code>npm run start:desktop</code> and see the app running in an Electron frame.</p><p> And that&apos;s it, you have created a desktop app with Electron &amp; React/Typescript.</p><h2 id="bonus">Bonus </h2><ol><li><strong>Creating executables for all platforms (Windows, Linux, macOS) with electron-builder</strong></li></ol><p>There are multiple packages for creating executables for an Electron app. There are lots of tutorials online on how to use them, but I found electron-builder quite easy to set up and understand. Install it first with <code>npm install --save-dev electron-builder</code>.</p><p>Let&apos;s start by adding the configuration to the package.json file. It&apos;s a <code>build</code> property added to the root of the JSON file. &#xA0;</p><!--kg-card-begin: markdown--><pre><code class="language-json">  &quot;build&quot;: {
    &quot;extraMetadata&quot;: {
      &quot;homepage&quot;: &quot;./&quot;,
      &quot;main&quot;: &quot;build/index.js&quot;
    },
    &quot;productName&quot;: &quot;my-app-name&quot;,
    &quot;appId&quot;: &quot;my-app-id-or-version&quot;,
    &quot;files&quot;: [
      &quot;build/**/*&quot;,
      &quot;node_modules/**/*&quot;
    ],
    &quot;mac&quot;: {
      &quot;category&quot;: &quot;public.app-category.productivity&quot;,
      &quot;target&quot;: [
        &quot;dmg&quot;,
        &quot;zip&quot;
      ],
      &quot;icon&quot;: &quot;src/Assets/img/unix.icns&quot;
    },
    &quot;linux&quot;: {
      &quot;maintainer&quot;: &quot;john@doe.com&quot;,
      &quot;target&quot;: [
        &quot;tar.gz&quot;,
        &quot;deb&quot;
      ],
      &quot;icon&quot;: &quot;src/Assets/img/unix.icns&quot;
    },
    &quot;win&quot;: {
      &quot;target&quot;: [
        &quot;zip&quot;,
        &quot;dir&quot;
      ],
      &quot;icon&quot;: &quot;src/Assets/img/windows.ico&quot;
    }
  }
</code></pre>
<!--kg-card-end: markdown--><p>Now, let&apos;s add scripts to trigger the creation of the files for all operating systems</p><!--kg-card-begin: markdown--><pre><code class="language-json">    &quot;prebuild:package&quot;: &quot;npm run build:web &amp;&amp; npm run build:desktop&quot;,
    &quot;build:package:windows&quot;: &quot;npm run prebuild:package &amp;&amp; electron-builder --win&quot;,
    &quot;build:package:linux&quot;: &quot;npm run prebuild:package &amp;&amp; electron-builder --linux&quot;,
    &quot;build:package:mac&quot;: &quot;npm run prebuild:package &amp;&amp; electron-builder --mac&quot;,
    &quot;build:package:all&quot;: &quot;npm run prebuild:package &amp;&amp; electron-builder --win --linux --mac&quot;
 
</code></pre>
<!--kg-card-end: markdown--><p>And there you have it. The names of the scripts speak for themselves, so no further explanation is needed.</p><p><strong>2. Automatic updates of the electron app in production</strong></p><p>Automatic updates of an Electron app in production can be achieved in many ways, but the most popular one is doing it via an &quot;update server&quot;. That topic deserves a whole blog post of its own, which I will publish in the future if I have time.</p><p><strong>References:</strong></p><ol><li><a href="https://dev.to/achuthhadnoor/getting-started-with-electron-typescript-react-and-webpack-3ik4">https://dev.to/achuthhadnoor/getting-started-with-electron-typescript-react-and-webpack-3ik4</a></li><li><a href="https://awsm.page/electron/how-to-use-preload-script-in-electron/">https://awsm.page/electron/how-to-use-preload-script-in-electron/</a></li><li><a href="https://www.electronjs.org/docs/latest/tutorial/updates">https://www.electronjs.org/docs/latest/tutorial/updates</a></li></ol>]]></content:encoded></item><item><title><![CDATA[Dynamically Set Angular Environment Variables in Docker]]></title><description><![CDATA[Dynamically Set Angular Environment Variables in Docker]]></description><link>https://nkpremices.com/dynamically-set-angular-env-variables-in-docker/</link><guid isPermaLink="false">61060334eb869117ff36e85d</guid><category><![CDATA[Blog]]></category><category><![CDATA[Tech]]></category><category><![CDATA[Angular]]></category><category><![CDATA[Javascript]]></category><category><![CDATA[Typescript]]></category><category><![CDATA[Docker]]></category><dc:creator><![CDATA[Ghost]]></dc:creator><pubDate>Tue, 12 Oct 2021 02:13:00 GMT</pubDate><media:content url="https://nkpremices.com/content/images/2021/10/Screen-Shot-2021-10-13-at-00.25.32.png" medium="image"/><content:encoded><![CDATA[<img src="https://nkpremices.com/content/images/2021/10/Screen-Shot-2021-10-13-at-00.25.32.png" alt="Dynamically Set Angular Environment 
Variables in Docker"><p>When using Angular, environment variables are baked into the application bundles at build time, which means they are not meant to be changed afterward. </p><p>In a recent project, I encountered a challenge when trying to manage multiple environment settings in an Angular app running in a Docker container. </p><p>I will try here to solve the problem in a step-by-step, self-explanatory way.</p><h2 id="problem-description">Problem description</h2><p>The application is a single-page app with a .Net Core backend API. One example of a variable that has to vary from one environment to another is the backend URL. &#xA0;</p><!--kg-card-begin: markdown--><p>Here is what the <code>environment.ts</code> file looks like:</p>
<pre><code class="language-javascript">export const environment = {
  production: false,
  backendBaseUrl: &apos;http://localhost:5151&apos;
};
</code></pre>
<!--kg-card-end: markdown--><p>As you might have guessed, <code>environment.backendBaseUrl</code> is the part that will be changing.</p><!--kg-card-begin: markdown--><ul>
<li>Locally we could have <code>http://localhost:5151</code></li>
<li>On staging we could have <code>staging.some-app-domain.com</code></li>
<li>And on production: <code>some-app-domain.com</code></li>
</ul>
<!--kg-card-end: markdown--><p>According to the Twelve-Factor App, <a href="https://12factor.net/config">configuration should be stored in the environment</a>. By default, this is not possible with Angular&apos;s built-in environment variables, so we need to feed them from an external source.</p><h2 id="solution-feeding-angular-env-variables-from-an-external-source">Solution: Feeding Angular Env Variables from an external source</h2><p>Let&apos;s consider the following:</p><ol><li>After a successful build, the default <code>environment.ts</code> gets compiled into a nearly un-editable JavaScript bundle. One way around this is to externalize the configuration from the compiled app bundle files.</li><li>The content of the &#xA0;<code>/assets</code> directory never gets changed. It just gets copied into the build directory.</li></ol><p>In a Docker environment, we can take advantage of behavior number 2 and create a volume on the assets directory. In that scenario, we just add an &quot;extra environment file&quot; that the main environment files of the Angular app can read from. Whenever we change the config file in the <code>/assets</code> &#xA0;directory (through the volume), the whole application adapts. That way, we have flexibility across environments.</p><p>Furthermore, we can make it a little better by adding a Docker command that fills in placeholder values in that file every time the container starts. (This does not even require a volume.)</p><p>Okay &#x1F913;, no more talking, let&apos;s make that happen.</p><p>Let&apos;s create a new &#xA0;<code>env.js</code> &#xA0; file in the &#xA0;<code>/assets</code> directory with the following content.</p><!--kg-card-begin: markdown--><pre><code class="language-javascript">(function (window) {
  window[&apos;env&apos;] = window[&apos;env&apos;] || {};

  // Environment variables
  window[&apos;env&apos;][&apos;backendBaseUrl&apos;] = &apos;https://dot-net-backend.com/api/v1&apos;;
})(this);
</code></pre>
<!--kg-card-end: markdown--><p>The JavaScript function that we just created defines our future environment variables. It won&apos;t be compiled or bundled but simply copied to the <code>/dist</code> directory, since it is part of the <code>/assets</code> folder, &#xA0;and it can be edited in clear text later.</p><p>Now, let&apos;s run the function at application startup by loading it from the <code>index.html</code> file:</p><!--kg-card-begin: markdown--><pre><code class="language-html">&lt;!DOCTYPE html&gt;
&lt;html lang=&quot;en&quot;&gt;
  &lt;head&gt;
    &lt;!-- ... --&gt;

    &lt;!-- Load environment variables --&gt;
    &lt;script src=&quot;assets/env.js&quot;&gt;&lt;/script&gt;
  &lt;/head&gt;
  &lt;body&gt;
    &lt;app-root&gt;&lt;/app-root&gt;
  &lt;/body&gt;
&lt;/html&gt;
</code></pre>
<!--kg-card-end: markdown--><p>Now let&apos;s feed the env variables from the new &#xA0;<code>env.js</code> &#xA0;into the normal Angular &#xA0;<code>environment.*.ts</code> &#xA0;files.</p><!--kg-card-begin: markdown--><pre><code class="language-javascript">export const environment = {
  production: false,
  backendBaseUrl:
    window[&apos;env&apos;][&apos;backendBaseUrl&apos;] || &apos;http://localhost:5151/api/v1&apos;
};
</code></pre>
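The fallback logic above is the whole trick: prefer whatever `env.js` put on `window['env']` at startup, and only fall back to the compile-time default. A minimal framework-free sketch of that pattern (the names `EnvMap` and `readEnv` are illustrative, not part of the project):

```typescript
// Sketch of the fallback pattern environment.ts relies on: prefer the
// value injected at runtime by assets/env.js, else a compile-time default.
type EnvMap = Record<string, string | undefined>;

function readEnv(env: EnvMap, key: string, fallback: string): string {
  return env[key] || fallback;
}

// What env.js would have put on window['env'] at startup:
const runtimeEnv: EnvMap = { backendBaseUrl: 'https://dot-net-backend.com/api/v1' };

const backendBaseUrl = readEnv(runtimeEnv, 'backendBaseUrl', 'http://localhost:5151/api/v1');
console.log(backendBaseUrl);
```

Because the lookup happens when the bundle evaluates `environment.ts`, the same compiled bundle picks up different values per environment without a rebuild.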
<!--kg-card-end: markdown--><p>Let&apos;s now create a template file for our environment variables. </p><p>Let&apos;s create an &#xA0;<code>env.sample.js</code> &#xA0;in the <code>/assets</code> directory:</p><!--kg-card-begin: markdown--><pre><code class="language-javascript">(function (window) {
  window[&apos;env&apos;] = window[&apos;env&apos;] || {};

  // Environment variables
  window[&apos;env&apos;][&apos;backendBaseUrl&apos;] = &apos;${BACKEND_BASE_URL}&apos;;
})(this);
</code></pre>
<!--kg-card-end: markdown--><p>The <code>${PLACEHOLDER}</code> variables can now be overwritten when the Docker container starts. We are going to use the &#xA0;<code>envsubst</code> &#xA0;shell command for that.</p><p>What the command basically does is copy the content of <code>env.sample.js</code> into &#xA0;<code>env.js</code>, replacing the <code>${PLACEHOLDER}</code> values with the corresponding variables from the standard environment.</p><p>Here is the code:</p><!--kg-card-begin: markdown--><pre><code class="language-docker"># Dockerfile to build and serve the Angular application


###############
### STAGE 1: Build app
###############
FROM node:14-alpine as build

WORKDIR /usr/local/app
# Add the source code to app
COPY ./ /usr/local/app/
# Install all the dependencies
RUN npm install
# Generate the build of the application
RUN npm run build

###############
### STAGE 2: Serve app with nginx ###
###############
FROM nginx:1.19.3-alpine
COPY  --from=build /usr/local/app/dist /usr/share/nginx/html

# Expose port 80
EXPOSE 80

# When the container starts, replace the env.js with values from environment variables
CMD [&quot;/bin/sh&quot;,  &quot;-c&quot;,  &quot;envsubst &lt; /usr/share/nginx/html/assets/env.sample.js &gt; /usr/share/nginx/html/assets/env.js &amp;&amp; exec nginx -g &apos;daemon off;&apos;&quot;]
</code></pre>
<!--kg-card-end: markdown--><p>There we have everything set up and ready to fly!!!!</p><p>I&apos;ve come to like &#xA0; <code>docker-compose</code> because of the way it helps manage multiple services while also easily controlling the environment. I would suggest that we use it.</p><p>Let&apos;s create a &#xA0;<code>docker-compose.yml</code> file that will build our image, start its container, and make it listen on port 4200. </p><!--kg-card-begin: markdown--><pre><code class="language-yml">version: &apos;3.5&apos;

services:
  web:
    env_file: &quot;.env&quot;
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - &apos;4200:80&apos;
</code></pre>
<!--kg-card-end: markdown--><p>With the <code>.env</code> file having the following content:</p><!--kg-card-begin: markdown--><pre><code>BACKEND_BASE_URL=http://the-backend-url.com/api/v1
</code></pre>
<!--kg-card-end: markdown--><p>Now if you just run &#xA0;<code>docker-compose up --build</code> &#xA0;at the root of the project, you will find the app on port 4200 with the right environment variable that you set in the .env file.</p><p>Thanks for taking the time to read.</p><p>Ciao &#x1F44B;&#x1F3FD;</p><p><strong>References:</strong></p><ol><li><a href="https://medium.com/@wkrzywiec/build-and-run-angular-application-in-a-docker-container-b65dbbc50be8">https://medium.com/@wkrzywiec/build-and-run-angular-application-in-a-docker-container-b65dbbc50be8</a></li><li><a href="https://www.youtube.com/watch?v=2nqkIDNkVfY">https://www.youtube.com/watch?v=2nqkIDNkVfY</a></li><li><a href="https://skofgar.ch/dev/2020/08/how-to-quickly-replace-environment-variables-in-a-file/">https://skofgar.ch/dev/2020/08/how-to-quickly-replace-environment-variables-in-a-file/</a></li><li><a href="https://angular.io/guide/build">https://angular.io/guide/build</a></li></ol>]]></content:encoded></item><item><title><![CDATA[Investigating the TMDB movie dataset, part 2]]></title><description><![CDATA[Investigating the TMDB movie dataset, part 2]]></description><link>https://nkpremices.com/investigating-the-tmdb-movie-dataset-part-2/</link><guid isPermaLink="false">61089a2073780805b945cb8c</guid><category><![CDATA[Python]]></category><category><![CDATA[Data Analysis]]></category><category><![CDATA[Blog]]></category><dc:creator><![CDATA[Prémices N. Kamasuwa]]></dc:creator><pubDate>Tue, 03 Aug 2021 01:25:16 GMT</pubDate><media:content url="https://nkpremices.com/content/images/2021/08/mih10uhu1464fx1kr0by-2.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://nkpremices.com/content/images/2021/08/mih10uhu1464fx1kr0by-2.jpg" alt="Investigating the TMDB movie dataset, part 2"><p>This blog post is the second part of a series. I recommend reading the first part to fully understand the context of this one. 
In this post, we&apos;ll focus on data cleaning, using the results from the first part as our foundation.</p><h1 id="data-cleaning">Data Cleaning</h1><h5 id="step-1-remove-columns-with-excessive-null-values">Step 1. Remove Columns with Excessive Null Values.</h5><!--kg-card-begin: markdown--><p><code>df.head(1)</code></p>
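The actual column removal is not shown in this excerpt; based on part 1 (where the tagline and keywords columns were singled out for excessive nulls), it presumably looked something like this toy sketch:

```python
import pandas as pd

# Toy frame standing in for the TMDb data
df = pd.DataFrame({'id': [1, 2], 'tagline': ['a', 'b'], 'keywords': ['x', 'y']})

# Presumed drop of the columns flagged in part 1 for excessive null values
df.drop(['tagline', 'keywords'], axis=1, inplace=True)
print(list(df.columns))
```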
<!--kg-card-end: markdown--><figure class="kg-card kg-image-card"><img src="https://res.cloudinary.com/premices/image/upload/v1583942632/Screen_Shot_2020-03-11_at_18.03.05_czezk0.png" class="kg-image" alt="Investigating the TMDB movie dataset, part 2" loading="lazy"></figure><h5 id="step-2-remove-duplicated-data">Step 2. Remove duplicated data</h5><blockquote><code>df.drop_duplicates(inplace=True)</code></blockquote><h5 id="step-3-eliminate-rows-with-null-values-in-essential-columns">Step 3. Eliminate Rows with Null Values in Essential Columns</h5><blockquote><code>df.dropna(subset = [&apos;cast&apos;, &apos;director&apos;, &apos;genres&apos;], how=&apos;any&apos;, inplace=True)</code></blockquote><p>Let&apos;s check if there are still null values</p><blockquote><code>df.isnull().sum()</code></blockquote><figure class="kg-card kg-image-card"><img src="https://res.cloudinary.com/premices/image/upload/v1583940342/Screen_Shot_2020-03-11_at_17.19.32_ei0xld.png" class="kg-image" alt="Investigating the TMDB movie dataset, part 2" loading="lazy"></figure><h5 id="step-4-replace-zero-values-with-null-values-in-the-budget-and-revenue-column">Step 4. Replace zero values with null values in the budget and revenue columns.</h5><!--kg-card-begin: markdown--><p><code>df[&apos;budget&apos;] = df[&apos;budget&apos;].replace(0, np.NaN)</code><br><code>df[&apos;revenue&apos;] = df[&apos;revenue&apos;].replace(0, np.NaN)</code><br><code>df.info()</code></p>
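On a toy frame, the zero-to-null replacement from Step 4 behaves like this (a self-contained sketch, not the course notebook):

```python
import numpy as np
import pandas as pd

# Tiny stand-in frame to illustrate the zero-to-NaN replacement
df = pd.DataFrame({'budget': [0, 150000], 'revenue': [0, 900000]})

# Zeros become NaN so they no longer distort describe() statistics
df['budget'] = df['budget'].replace(0, np.nan)
df['revenue'] = df['revenue'].replace(0, np.nan)
print(df['budget'].isnull().sum())
```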
<!--kg-card-end: markdown--><figure class="kg-card kg-image-card"><img src="https://res.cloudinary.com/premices/image/upload/v1583940343/Screen_Shot_2020-03-11_at_17.19.42_jinvxt.png" class="kg-image" alt="Investigating the TMDB movie dataset, part 2" loading="lazy"></figure><h5 id="step-5-drop-the-runtime-column">Step 5. Drop rows with zero runtime.</h5><!--kg-card-begin: markdown--><p><code>df.query(&apos;runtime == 0&apos;)</code></p>
<!--kg-card-end: markdown--><figure class="kg-card kg-image-card"><img src="https://res.cloudinary.com/premices/image/upload/v1583940342/Screen_Shot_2020-03-11_at_17.19.57_wqlokv.png" class="kg-image" alt="Investigating the TMDB movie dataset, part 2" loading="lazy"></figure><blockquote><code>df.info()</code></blockquote><figure class="kg-card kg-image-card"><img src="https://res.cloudinary.com/premices/image/upload/v1583940343/Screen_Shot_2020-03-11_at_17.20.04_tfcs5d.png" class="kg-image" alt="Investigating the TMDB movie dataset, part 2" loading="lazy"></figure><blockquote><code>df.describe()</code></blockquote><figure class="kg-card kg-image-card"><img src="https://res.cloudinary.com/premices/image/upload/v1583943336/Screen_Shot_2020-03-11_at_18.15.18_vstnox.png" class="kg-image" alt="Investigating the TMDB movie dataset, part 2" loading="lazy"></figure><p>From the table above, we can see that replacing the zeros with null values made the budget and revenue distributions look healthier. We can also see that the minimum values now make more sense.</p><hr><p>This is the end of the second part. If you enjoyed the read, stay tuned: I will post the third part soon.</p><p>Thank you for reading.</p>]]></content:encoded></item><item><title><![CDATA[Investigating the TMDB movie dataset]]></title><description><![CDATA[Investigating the TMDB movie dataset]]></description><link>https://nkpremices.com/investigating-the-tmdb-movie-dataset/</link><guid isPermaLink="false">610894093455141cbf9ee82b</guid><category><![CDATA[Python]]></category><category><![CDATA[Data Analysis]]></category><category><![CDATA[Blog]]></category><dc:creator><![CDATA[Prémices N. 
Kamasuwa]]></dc:creator><pubDate>Tue, 03 Aug 2021 01:19:50 GMT</pubDate><media:content url="https://nkpremices.com/content/images/2021/08/mih10uhu1464fx1kr0by-1.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://nkpremices.com/content/images/2021/08/mih10uhu1464fx1kr0by-1.jpg" alt="Investigating the TMDB movie&#xA0;dataset"><p>I participated in the <a href="https://medium.com/r/?url=https%3A%2F%2Fwww.udacity.com%2Fcourse%2Fdata-analyst-nanodegree--nd002">Data analyst nanodegree program from Udacity</a> where I worked on a number of projects. In the coming weeks, I will be writing blog posts to share my experiences and insights from these projects.</p><blockquote><em><code>Note:</code> </em>This blog post is the first part of a series where I analyze a dataset. The goal is to demonstrate how straightforward data analysis can be.</blockquote><h1 id="introduction">Introduction</h1><p>Are you curious about what makes a movie successful? In this series of blog posts, we&apos;ll use data from The Movie Database (TMDb) to explore the factors that contribute to a film&apos;s popularity, ratings, and revenue. Our dataset includes information on over 10,000 movies, covering aspects like budget, cast, director, keywords, runtime, genres, production companies, release date, and more.</p><p>In this first post, we&apos;ll take a closer look at the TMDb movie data and introduce some of the questions we&apos;ll be addressing in the coming weeks, such as:</p><ul><li>How has movie popularity changed over the years?</li><li>How does revenue vary across different ratings and popularity levels?</li><li>What characteristics are associated with high-popularity movies?</li><li>How many movies are released each year?</li><li>What are the keyword trends by generation?</li></ul><p>Using tools like Numpy, Pandas, and Matplotlib, we&apos;ll dive into the data to uncover valuable insights. 
But before we begin, let&apos;s introduce the dataset and discuss its contents.</p><p><strong>LET&apos;S GO!!</strong></p><p>First, let&apos;s import the necessary packages.</p><!--kg-card-begin: markdown--><pre><code class="language-python">import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from collections import Counter

%matplotlib inline
</code></pre>
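The snippets that follow operate on a DataFrame named `df`. The loading step is not shown in the post; it presumably used `pd.read_csv` on the TMDb export (the `tmdb-movies.csv` filename is an assumption). A self-contained sketch, with a tiny inline sample standing in for the real file:

```python
import io

import pandas as pd

# In the notebook this would be: df = pd.read_csv('tmdb-movies.csv')
# (filename assumed). An inline sample keeps this sketch runnable.
sample = io.StringIO(
    "id,budget,revenue,runtime,release_year\n"
    "1,0,100000,120,1999\n"
    "2,2000000,0,0,2005\n"
)
df = pd.read_csv(sample)
df.info()
```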
<!--kg-card-end: markdown--><h1 id="data-wrangling">Data Wrangling</h1><h5 id="general-properties">General Properties</h5><p>Let&apos;s look at the general info of the dataset</p><!--kg-card-begin: markdown--><p><code>df.info()</code></p>
<!--kg-card-end: markdown--><figure class="kg-card kg-image-card"><img src="https://res.cloudinary.com/premices/image/upload/v1583940342/Screen_Shot_2020-03-11_at_17.17.33_uuhhm8.png" class="kg-image" alt="Investigating the TMDB movie&#xA0;dataset" loading="lazy"></figure><p>The TMDB movie data includes 10866 entries and 21 columns, with data types including integers, floats, and strings. A significant number of columns have null values, as indicated by the number of entries per column. In the next step, we will examine the exact number of null records per column.</p><blockquote><code>list(df.isnull().sum().items())</code></blockquote><figure class="kg-card kg-image-card"><img src="https://res.cloudinary.com/premices/image/upload/v1583940342/Screen_Shot_2020-03-11_at_17.18.13_b4rymb.png" class="kg-image" alt="Investigating the TMDB movie&#xA0;dataset" loading="lazy"></figure><p>After examining the null values in the TMDB movie data, we found that several columns contain null records, including cast, homepage, director, tagline, keywords, overview, genres, and production companies. In particular, the homepage, tagline, keywords, and production_companies columns have a large number of null records. In order to move forward with our analysis, we decided to remove the tagline and keywords columns, which had a high number of null values. <br><br>Next, we will try to gather more descriptive information from the dataset.</p><blockquote><code>df.describe()</code></blockquote><figure class="kg-card kg-image-card"><img src="https://res.cloudinary.com/premices/image/upload/v1583940342/Screen_Shot_2020-03-11_at_17.18.30_zzuejp.png" class="kg-image" alt="Investigating the TMDB movie&#xA0;dataset" loading="lazy"></figure><p>After examining the popularity column in the TMDB movie data, we observed some outliers that appear to be valid data points. 
Therefore, we decided to retain the original data rather than remove these outliers.</p><p>We also noticed that the budget, revenue, and runtime columns contain many zero values. Initially, we considered the possibility that these movies were not released, but upon examining the release_year column, we found that the minimum value (1996) is a valid year and that there were no null values. This suggests that these movies were indeed released, but may have missing data for budget, revenue, and runtime. In order to determine the cause of these zero values, we will closely examine these records and try to gather more information about them.</p><!--kg-card-begin: markdown--><p><code>df_budget_zero.head(3)</code></p>
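The helper frames `df_budget_zero` and `df_revenue_zero` are not defined in this excerpt; they were presumably built by filtering on zero values, along these lines (toy frame for illustration):

```python
import pandas as pd

# Toy frame; in the notebook these filters would run on the full TMDb df
df = pd.DataFrame({'budget': [0, 5000], 'revenue': [100, 0]})

# Presumed construction of the helper frames inspected below
df_budget_zero = df.query('budget == 0')
df_revenue_zero = df.query('revenue == 0')
print(len(df_budget_zero), len(df_revenue_zero))
```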
<!--kg-card-end: markdown--><figure class="kg-card kg-image-card"><img src="https://res.cloudinary.com/premices/image/upload/v1583940342/Screen_Shot_2020-03-11_at_17.18.49_nqxyhy.png" class="kg-image" alt="Investigating the TMDB movie&#xA0;dataset" loading="lazy"></figure><p>Then for the revenue</p><!--kg-card-begin: markdown--><p><code>df_revenue_zero.head(3)</code></p>
<!--kg-card-end: markdown--><figure class="kg-card kg-image-card"><img src="https://res.cloudinary.com/premices/image/upload/v1583940342/Screen_Shot_2020-03-11_at_17.18.58_j3sjwd.png" class="kg-image" alt="Investigating the TMDB movie&#xA0;dataset" loading="lazy"></figure><p>After investigating the records with zero values for budget and revenue, we found that these values were likely missing data rather than indicating that the movies were not released. However, we also discovered that some of these records had other inconsistencies or missing data that could potentially affect the results of our analysis. As a result, we decided to drop these records rather than impute the missing values or set them to zero.</p><p>Next, we will check the number of null values in the dataset to determine whether we should drop or impute these values as well.</p><p>First for the budget zero values</p><!--kg-card-begin: markdown--><p><code>df_budget_0count.head(2)</code></p>
<!--kg-card-end: markdown--><figure class="kg-card kg-image-card"><img src="https://res.cloudinary.com/premices/image/upload/v1583941289/Screen_Shot_2020-03-11_at_17.40.40_tus7bz.png" class="kg-image" alt="Investigating the TMDB movie&#xA0;dataset" loading="lazy"></figure><p>As the results suggest, there are far more zero values than non-zero values. Dropping them would distort the results, so it is better to set them to null instead.</p><p>Then for the revenue zero values</p><!--kg-card-begin: markdown--><p><code>df_revenue_0count.head(2)</code></p>
<!--kg-card-end: markdown--><figure class="kg-card kg-image-card"><img src="https://res.cloudinary.com/premices/image/upload/v1583941289/Screen_Shot_2020-03-11_at_17.40.48_kb0egw.png" class="kg-image" alt="Investigating the TMDB movie&#xA0;dataset" loading="lazy"></figure><p>Same situation: set them to null.</p><p>Finally for the runtime</p><figure class="kg-card kg-image-card"><img src="https://res.cloudinary.com/premices/image/upload/v1583941289/Screen_Shot_2020-03-11_at_17.40.58_udrblh.png" class="kg-image" alt="Investigating the TMDB movie&#xA0;dataset" loading="lazy"></figure><p>The number of zeroes is negligible, so those rows can simply be dropped.</p><h1 id="summary">Summary</h1><p>In this first part of the series, we focused on preparing the TMDB movie data for analysis. We removed some columns that had a lot of null values or were not necessary for answering our research questions, and we also dropped duplicated data. We then removed null values in certain columns and replaced zero values with null values in the budget and revenue columns. Finally, we dropped any rows with a runtime of zero.</p><hr><p>Thank you for reading! In the next part of the series, we will continue with data cleaning and begin to explore the data in more depth. Stay tuned!</p><p>Ciao &#x1F44B;&#x1F3FE;</p>]]></content:encoded></item></channel></rss>