My Blog

  • How to Speed Up Data Analysis with Junk Dimensions

    In the realm of data warehousing and analytics, the concept of junk dimensions might seem counterintuitive at first glance. Contrary to their name, junk dimensions play a pivotal role in optimising data performance and simplifying complex data structures.

  • Fillna() The Kimball Way – Handle Incorrect/Missing Dimension Values

    The Kimball methodology emphasises the importance of dimensional modelling in designing data warehouses. One key challenge in this methodology involves dealing with missing or incorrect dimension values within fact tables. The fillna() method, provided by Python data manipulation libraries such as pandas, can be leveraged to handle such scenarios efficiently.

  • Should Your Data Be 100% Accurate?

    When it comes to data warehousing and analytics engineering, the pursuit of data accuracy is a never-ending mission. It holds the promise of giving us valuable insights, helping us make informed decisions, and increasing operational efficiency. However, this pursuit comes with its fair share of challenges. Professionals in the field often find themselves faced with…

  • Is Your Data Lying To You?

    In our contemporary data landscape, dashboards have emerged as pivotal instruments in guiding decision-making across many sectors. These visual representations of data offer profound insights capable of shaping strategies, fostering innovation, and streamlining operations. However, the presumption of data objectivity often masks the biases ingrained within these dashboards, potentially distorting our perceptions and influencing decisions.…

  • The Godfather of Visualisations and His Favourite Charts

    Edward Tufte, a pioneering figure in the realm of data visualisation, is renowned for his advocacy of simplicity in presenting complex information. His vision transcends traditional boundaries, emphasising clarity, precision, and elegance in visualisations across diverse data types. Let’s delve into Tufte’s influential ideas on simplicity and their application in various visualisation formats.

  • A Beginner’s Guide to Handling Missing Values in Python

    Dealing with missing data is an essential part of data analysis and machine learning. In real-world datasets, missing values are commonplace and can hinder the accuracy and reliability of your analysis. Fortunately, Python provides various tools and libraries to effectively handle missing data, allowing you to clean, pre-process, and analyse datasets with ease.
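
    As a rough illustration of what handling missing values can look like, here is a minimal sketch that uses only the Python standard library. The sample records, the column names, and the fill_missing helper are all hypothetical and not taken from the post itself; it assumes None marks a missing value, fills a numeric column with the mean of the observed values, and fills a text column with an "Unknown" placeholder.

```python
from statistics import mean

# Hypothetical sample records; None marks a missing value.
rows = [
    {"customer": "Alice", "region": "North", "amount": 120.0},
    {"customer": "Bob", "region": None, "amount": None},
    {"customer": "Cara", "region": "South", "amount": 80.0},
]


def fill_missing(records, column, default):
    """Return copies of the records with None in `column` replaced by `default`."""
    return [
        {**record, column: default if record[column] is None else record[column]}
        for record in records
    ]


# Numeric column: fill gaps with the mean of the values that are present.
observed_amounts = [r["amount"] for r in rows if r["amount"] is not None]
cleaned = fill_missing(rows, "amount", mean(observed_amounts))

# Text column: fill gaps with an explicit placeholder instead of leaving None.
cleaned = fill_missing(cleaned, "region", "Unknown")

for record in cleaned:
    print(record)
```

    The helper returns new dictionaries rather than editing the originals, so the raw data stays available for comparison. Libraries such as pandas offer the same kind of operation through methods like fillna(); mean imputation is only one of several possible strategies and may not suit every dataset.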