29. 12. 2021.

Apache Ignite – distributed In-memory SQL database

Apache Ignite is one of the very few in-memory, SQL-compliant distributed databases/data grids among open-source projects. It’s often called “Redis done right” or “Redis on steroids”, because Redis looks primitive and limited when compared with Apache Ignite. Ignite offers great flexibility and a lot of features that can easily fit many use cases. Instead […]

23. 12. 2021.

YugabyteDB – distributed SQL database for a new age

Recently I got a chance to try YugabyteDB, one of the new-age databases that try to tackle new requirements such as scalability, resilience, high availability, cloud/hybrid readiness and new architecture styles based on microservices. Although Yugabyte is a relatively young company, it attracts a lot of attention, not only from architects/developers/admins, but also from […]

06. 12. 2021.

How to create a real time machine learning pipeline with StreamSets Transformer

Artificial intelligence (AI), with its subset machine learning (ML), is probably one of the hottest topics in the IT industry today. Many companies are struggling, with more or less success, to implement AI algorithms in data pipelines to make smarter decisions. First of all, AI is a wide topic which requires knowledge of math, statistics, […]

29. 11. 2021.

Complex near real-time transformations in data pipelines

For many years, the daily ETL batch job was the dominant way to perform data transformations before loading into a data warehouse. These days requirements are quite different, starting with the most important one: new data has to be available for AI/ML and analysis in near real time. Moreover, classical DWH databases are […]

22. 10. 2021.

When a visual tool for monitoring appears to lie

A few days ago I was asked to take a look at two queries that show up among the top queries in the Oracle SQL Developer instance viewer. I’ve extracted two statements that are relevant for this case. The SELECT statements for both records are almost identical. From the SELECT statement it […]

18. 08. 2021.

Functional monitoring of Microservices architecture by using Apache Superset

Many of you who have started to develop modern apps using the microservices approach have already learned that development tools, debuggers, performance monitoring and tracing lag behind the desired architecture. The situation is even worse when it comes to functional monitoring, where your goal is to find out what is going on with your system from […]

05. 06. 2021.

Missing columns in PrestoSQL

One of the first issues you hit when starting to use the PrestoSQL distributed query engine is missing columns of certain data types, especially numeric and all variants of date. This issue is usually caused by missing precision at the data source, which is not only one of the most common, but also one of the […]

04. 06. 2021.

Hibernate FetchType Eager performance issue

Quite recently I had an interesting case related to the quality of the code generated by Hibernate, in which I developed code that runs 25 thousand times faster than the code generated by the framework. It’s well known that when you decide to use a framework to speed up the code development process, you silently agree […]

04. 05. 2021.

Tuning Connection pool in modern Microservice architecture

A connection pool has always been a great way to ensure low latency when establishing a connection with a database, while at the same time keeping the number of open sessions under control. It’s one of the best ways to balance speed with resource consumption. With a connection pool in place, a connection is already established and ready […]

01. 04. 2021.

Trino (ex. Presto) – troubleshooting distributed transactions among various data sources

In this post I’ll demonstrate one of the many use cases of Presto technology that you might have overlooked: how to troubleshoot distributed transactions, which are very common these days as a result of complex microservices architectures. In the following SELECT statement I’ll combine three different data sources (Oracle, Postgres and Kafka) by using good old […]