30. 05. 2022.

How to get 100% cache hit rate by using Change Data Capture & Redis

In this blog post I’ll explain how to get a 100% cache hit rate by using CDC (Change Data Capture) technology and a Redis cache. There are multiple benefits of having a caching layer in front of a back-end database system. By fetching data from the cache instead of the back end, we actually free up valuable database resources for […]

29. 12. 2021.

Apache Ignite – distributed In-memory SQL database

Apache Ignite is one of the very few in-memory, SQL-compliant distributed databases/data grids among open-source projects. It’s often called “Redis done right” or “Redis on steroids”, because Redis looks primitive and limited when compared with Apache Ignite. Ignite offers great flexibility and a lot of features that can easily fit many use cases. Instead […]

23. 12. 2021.

YugabyteDB – distributed SQL database for a new age

Recently I got a chance to try YugabyteDB, one of the new-age databases that try to tackle new requirements such as scalability, resilience, high availability, cloud/hybrid readiness, and new architecture styles based on microservices. Although Yugabyte is a relatively young company, it attracts a lot of attention, not only from architects/developers/admins, but also from […]

29. 11. 2021.

Complex near real-time transformations in data pipelines

For many years, the daily ETL batch job was the dominant way to perform data transformations before loading data into a data warehouse. These days the requirements are quite different, starting with the most important one: new data has to be available for AI/ML and analysis in near real time. Moreover, classical DWH databases are […]

01. 04. 2021.

Trino (ex. Presto) – troubleshooting distributed transactions among various data sources

In this post I’ll demonstrate one of the many use cases of Presto technology that you might have overlooked: how to troubleshoot distributed transactions, which are very common these days as a result of complex microservices architectures. In the following SELECT statement I’ll combine three different data sources (Oracle, Postgres, and Kafka) by using good old […]

17. 03. 2021.

Trino (ex. Presto) – high performance distributed query engine

In this article I’ll share some of my experiences with Trino (ex. Presto), a high-performance distributed query engine. First, some background on the Presto project. A couple of members of the Facebook infrastructure team created Presto to address problems they had with a 300-petabyte Hadoop data warehouse. The main goal of the […]