Solution space

In the previous article, we talked about the need for a design for a power grid management system. Let's try to create one and offer some guidelines on how it could be implemented.

Architecture

Logical design

I'm thinking we should aim for a classic data ingestion and processing architecture. It includes the following components (see the sketch after the list):

Data Ingestion:

  • External interface of the system.

  • Handles real-time data streams from power grid components and feeds them into a message queue.

  • Scalable to handle a high volume of incoming data and to fulfill the high-availability requirement.

Queue:

  • Decouples data ingestion from data processing, so that a slow consumer doesn't create back-pressure on ingestion, and makes it possible to scale ingestion independently.

Processing:

  • Reads from the message queue and performs real-time data processing (aggregation) for monitoring and control.

  • Another responsibility is batch processing for analytics and reporting.

  • Persists data into a data store.

Real-time & Historical Data:

  • Data store(s) to handle operational data and time series data.

Application Services:

  • Core business logic and analytics behind the system's functionality.

  • Connected to the persistent data store(s).

Presentation:

  • Internal UI for system users.

  • Dashboards for monitoring and operations, plus business reports.

  • Exposes the functions and data of the application services.
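
To make the shape of this pipeline concrete, here is a minimal sketch of these components as Java interfaces. All names and signatures are illustrative assumptions, not a final design.

```java
// Illustrative only: all names and signatures are assumptions, not a final design.

// A single reading coming from a grid component.
record Measurement(String deviceId, double value, java.time.Instant at) {}

// Data Ingestion: external interface of the system, feeds readings into the queue.
interface IngestionAdapter {
    void start();
    void stop();
}

// Queue: decouples ingestion from processing.
interface MeasurementQueue {
    void publish(Measurement m);
}

// Processing: consumes the queue, aggregates, and persists.
interface MeasurementProcessor {
    void onMeasurement(Measurement m);
}

// Real-time & historical data stores behind the application services.
interface RealTimeStore {
    void putLatest(Measurement m);
}

interface HistoricalStore {
    void append(Measurement m);
}
```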

Deployment diagram

What would this system look like in reality? Typically, application or business logic is delivered in Docker images. However, the system relies on functionality that is usually provided at runtime and not implemented by the development team. In our case, these dependencies are the message queue and the data stores. These components are managed by the cloud provider and are not considered the responsibility of the application.

Alright, so we have Docker images. We can run these within a Kubernetes cluster to gain some useful extra features, such as:

  • Orchestration, scheduling

  • Load balancing, service discovery

  • Scaling, self-healing

  • Rolling updates and rollbacks

  • Configuration management

That would be a typical deployment today: a few services in Docker within a Kubernetes cluster.
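
A small aside on the self-healing point: Kubernetes can only restart unhealthy containers if the application reports its own health, typically through liveness and readiness probes. A minimal liveness endpoint using only the JDK's built-in HTTP server (the /healthz path and port 8080 are assumptions) might look like this:

```java
import com.sun.net.httpserver.HttpServer;
import java.net.InetSocketAddress;

// Minimal liveness endpoint for Kubernetes probes; path and port are assumptions.
public final class HealthServer {
    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/healthz", exchange -> {
            byte[] ok = "OK".getBytes();
            exchange.sendResponseHeaders(200, ok.length);
            try (var body = exchange.getResponseBody()) {
                body.write(ok);
            }
        });
        server.start();
    }
}
```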

Do we really need a distributed system for this application? Perhaps a modular monolith would be enough. We need high availability, so we should have more than one instance of this application, preferably in multiple availability zones. However, this doesn't necessarily mean that we have to implement each component as a standalone service.

Let's explore the possible technologies we could use to implement the components, and maybe we can gain a better understanding of the required deployment.

Implementation

Data Ingestion:

  • Multi-threaded adapter that collects data from the grid and pumps it into the queue (a sketch follows below).

  • The adapter must implement modern grid equipment protocols like IEC 61850.
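
Here's a minimal sketch of that adapter, assuming Kafka as the queue: a fixed thread pool polls grid devices and forwards each reading to a Kafka topic. The GridClient type, the topic name, and the pool size are illustrative assumptions; a real adapter would wrap an IEC 61850 client library rather than the placeholder shown here.

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import java.util.List;
import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public final class IngestionAdapter {

    // Hypothetical wrapper around an IEC 61850 client library.
    interface GridClient {
        String deviceId();
        String readMeasurement();   // e.g. a JSON-encoded reading; blocks until data arrives
    }

    private final KafkaProducer<String, String> producer;
    private final ExecutorService pool = Executors.newFixedThreadPool(8);

    IngestionAdapter(Properties kafkaProps) {
        this.producer = new KafkaProducer<>(kafkaProps);
    }

    void start(List<GridClient> clients) {
        for (GridClient client : clients) {
            pool.submit(() -> {
                while (!Thread.currentThread().isInterrupted()) {
                    // Key by device id so all readings of a device land in one
                    // partition, preserving per-device ordering downstream.
                    producer.send(new ProducerRecord<>(
                            "grid-measurements", client.deviceId(), client.readMeasurement()));
                }
            });
        }
    }
}
```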

Queue:

  • Kafka would be an obvious choice here; RabbitMQ is another option.
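
If Kafka wins, the topic layout is where the decoupling pays off: the partition count caps how far processing can scale out, and the replication factor backs the availability requirement. A sketch of provisioning such a topic with Kafka's AdminClient (topic name and sizing numbers are assumptions):

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;
import java.util.List;
import java.util.Properties;

public final class TopicSetup {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            // 12 partitions allow parallel consumers; replication factor 3 for availability.
            NewTopic topic = new NewTopic("grid-measurements", 12, (short) 3);
            admin.createTopics(List.of(topic)).all().get();
        }
    }
}
```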

Processing:

  • Apache Flink or Kafka Streams.
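
Kafka Streams would keep the whole pipeline in one ecosystem, so here's a sketch in that flavor: a topology that tracks the maximum reading per device over one-minute windows. The topic names, and the assumption that readings arrive as numeric strings keyed by device id, are illustrative.

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.TimeWindows;
import java.time.Duration;
import java.util.Properties;

public final class GridAggregation {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "grid-aggregation");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("grid-measurements", Consumed.with(Serdes.String(), Serdes.String()))
               // Group by device id (the record key) and window into one-minute buckets.
               .groupByKey()
               .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(1)))
               // Track the maximum reading per device per window.
               .aggregate(
                       () -> Double.NEGATIVE_INFINITY,
                       (deviceId, value, max) -> Math.max(max, Double.parseDouble(value)),
                       Materialized.with(Serdes.String(), Serdes.Double()))
               .toStream()
               // Stand-in for writing to the real-time store or an output topic.
               .foreach((window, max) ->
                       System.out.printf("%s max=%.2f%n", window.key(), max));

        new KafkaStreams(builder.build(), props).start();
    }
}
```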

Data Stores:

  • An in-memory database for real-time data, such as Redis.

  • A traditional data store such as PostgreSQL for reporting and analytics, or possibly a time-series database.
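
A minimal sketch of the two write paths, assuming Jedis as the Redis client and plain JDBC for PostgreSQL; the key scheme and the measurements table are made up for illustration:

```java
import redis.clients.jedis.JedisPooled;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.Timestamp;
import java.time.Instant;

public final class MeasurementWriter {
    private final JedisPooled redis = new JedisPooled("localhost", 6379);

    // Real-time path: keep only the latest reading per device for dashboards.
    void writeLatest(String deviceId, double value) {
        redis.set("latest:" + deviceId, Double.toString(value));
    }

    // Historical path: append every reading for reporting and analytics.
    void writeHistory(Connection pg, String deviceId, double value, Instant at) throws Exception {
        try (PreparedStatement stmt = pg.prepareStatement(
                "INSERT INTO measurements (device_id, value, measured_at) VALUES (?, ?, ?)")) {
            stmt.setString(1, deviceId);
            stmt.setDouble(2, value);
            stmt.setTimestamp(3, Timestamp.from(at));
            stmt.executeUpdate();
        }
    }
}
```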

Application Services:

  • Java or a similar battle-tested, enterprise-grade language.
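
With Java, the application services would most likely be fronted by a thin web layer, for example with Spring Boot (an assumption, not a given). In the sketch below, GridStatusService is a hypothetical application service:

```java
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

// Hypothetical application service behind the controller.
interface GridStatusService {
    double latestReading(String deviceId);
}

// Thin REST facade over the application services; endpoint paths are assumptions.
@RestController
class GridStatusController {

    private final GridStatusService service;

    GridStatusController(GridStatusService service) {
        this.service = service;
    }

    // Latest known reading for a device, served from the real-time store.
    @GetMapping("/devices/{deviceId}/latest")
    double latest(@PathVariable String deviceId) {
        return service.latestReading(deviceId);
    }
}
```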

Presentation:

  • Trendy JavaScript framework with TypeScript support (e.g. Vue.js) or Grafana for visualization.

Looking at this stack, I see no issues with starting to implement this application as a modular monolith since:

  • One team could deliver this software. If we have two or more teams, then we might want to consider having two or more autonomous services in our solution design.

  • It can be written in one language for one runtime platform.

  • High availability can be achieved by deploying multiple instances of the modular monolith.

My only concern is that we might need to scale the Data Ingestion and Processing components separately from the rest of the system. We can address this by planning ahead and accepting that we might have to extract these two components into their own services later, if production usage shows it's necessary. We have to define clear boundaries around those components to make that job easier in the future.
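
One cheap way to define those boundaries inside a modular monolith is to give each module a single public interface and keep its implementation package-private. Extracting the module into a service later then means re-implementing one interface instead of untangling call sites. A hypothetical sketch, with made-up package names:

```java
// Package layout of the modular monolith (illustrative):
//   com.example.grid.ingestion   - Data Ingestion module
//   com.example.grid.processing  - Processing module
//   com.example.grid.services    - Application Services module

package com.example.grid.ingestion;

// The only public type of the ingestion module: the rest of the
// monolith may depend on this interface and nothing else.
public interface Ingestion {
    void start();
    void stop();
}

// Package-private implementation; invisible outside the module, so it can
// later be moved into a standalone service without breaking callers.
final class Iec61850Ingestion implements Ingestion {
    @Override public void start() { /* connect to grid devices */ }
    @Override public void stop()  { /* shut down connections */ }
}
```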

Therefore, using a modular monolith seems like a reasonable choice.

Before we jump into the implementation part, we should continue with a prototype of the application. That will help the business further clarify the requirements and understand the problem domain in more detail.

Let's create the prototype in the next article of this series.