The real-time mobile subscribers monitoring platform — Broscorp

Problem statement

People using mobile phones in densely populated regions place large and unpredictable loads on the telecom infrastructure. Heavy data and voice usage in an area whose equipment cannot handle it causes voice breaks, slow internet speeds, or even outages. Detecting such problems quickly can dramatically improve service quality in these areas, which results in higher client satisfaction.

Purpose of the project

Collect real-time data from mobile cells in order to:

  1. Track Quality of Service in real-time.
  2. Analyze usage patterns.
  3. Perform retrospective analysis.

Our client (a mobile operator in Asia) has millions of subscribers but did not have a reliable and scalable solution for tracking the problems described above. Data volumes of this size are impossible to handle in an old-fashioned way, and the market also pushes operators to react faster than their competitors. So the data has to be processed and visualized before the operator can react appropriately.

Project description

The project consists of two main parts:

  1. Data processing (ETL)
  2. Data visualization

First, let’s take a look at data processing. It has to be reliable, fast, and scalable, and its results have to be stored in analytical storage for further analysis. We chose Spark as the processing engine and HP Vertica as the analytical storage. All the required transformation and filtering happened in Spark jobs written in Scala. The resulting pipeline is Kafka -> Spark -> Vertica.
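The shape of such a filter-and-transform stage can be sketched in a few lines. The production jobs were Spark jobs written in Scala; the example below is only a plain Python illustration of the logic, not the project's code. The record fields (`subscriber_id`, `cell_id`, `timestamp`, `bytes_used`), the JSON message format, and the in-memory `sink` list standing in for the warehouse are all assumptions made for the example.

```python
import json

# Hypothetical schema: the source does not describe the real record layout.
REQUIRED_FIELDS = ("subscriber_id", "cell_id", "timestamp", "bytes_used")


def parse(messages):
    """Decode raw messages, skipping anything that is not a JSON object."""
    for raw in messages:
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue
        if isinstance(record, dict):
            yield record


def keep_valid(records):
    """Filter step: drop records with missing fields or implausible values."""
    for r in records:
        if not all(field in r for field in REQUIRED_FIELDS):
            continue
        if not isinstance(r["timestamp"], int):
            continue
        if not isinstance(r["bytes_used"], (int, float)) or r["bytes_used"] < 0:
            continue
        yield r


def transform(records):
    """Transform step: normalise types and units for the analytical store."""
    for r in records:
        yield {
            "subscriber_id": str(r["subscriber_id"]),
            "cell_id": str(r["cell_id"]),
            "timestamp": r["timestamp"],
            "megabytes": r["bytes_used"] / 1_000_000,
        }


def run_pipeline(messages, sink):
    """Run parse -> filter -> transform and append the rows to `sink`.

    `sink` is a plain list standing in for the warehouse table.
    Returns the number of rows loaded.
    """
    rows = list(transform(keep_valid(parse(messages))))
    sink.extend(rows)
    return len(rows)
```

Because each stage is a generator, malformed or incomplete messages are dropped early and never reach the load step, which mirrors the role the filtering plays in the pipeline described above.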

Once the data is in place and ready to be consumed, it’s time to talk about data visualization.

Each mobile phone is connected to several mobile cells, so we received coordinates that located the client to within 50 metres at any time. We were not interested in a single client, though, but in how many clients were in each 50 × 50 m area. So we broke the map up into 50 × 50 m squares and calculated how many users we had in each. Each square represents a group of users who make calls, use mobile internet, and send SMS. As a result of this process, the map became covered with thousands of small squares. Each level of quality has its own colour, so by watching the map you can immediately see areas with poor quality.

To make the visualization interactive, we implemented a playback feature that imitates real-time updates. You can see how your network deals with the load over the day and in different city areas. Moreover, you can go back in time, select a time range in the past, and replay it. You can also drill down to see the average mobile internet speed, the voice quality, and how long a particular 50 × 50 m square was overcrowded, and use this to detect patterns.
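The gridding step can be illustrated with a short sketch. It is not the project's implementation: it assumes positions are already projected to metres on a flat plane (`x_m`, `y_m`), and the colour names and speed thresholds in `QUALITY_LEVELS` are invented for the example, since the source does not say how quality levels were defined.

```python
import math
from collections import defaultdict

CELL_SIZE_M = 50  # side of one grid square, as described in the text

# Hypothetical thresholds: minimum average speed (Mbit/s) for each colour,
# ordered from best to worst.
QUALITY_LEVELS = [(10.0, "green"), (3.0, "yellow"), (0.0, "red")]


def cell_of(x_m, y_m):
    """Map a position in metres to the index of its 50 x 50 m square."""
    return (math.floor(x_m / CELL_SIZE_M), math.floor(y_m / CELL_SIZE_M))


def colour_for(avg_speed):
    """Return the colour of the first quality level the speed reaches."""
    for threshold, colour in QUALITY_LEVELS:
        if avg_speed >= threshold:
            return colour
    return QUALITY_LEVELS[-1][1]


def summarise(samples):
    """Aggregate samples per grid square.

    `samples` is an iterable of (subscriber_id, x_m, y_m, speed_mbps).
    Returns {cell: {"users": distinct subscribers, "avg_speed": ..., "colour": ...}}.
    """
    users = defaultdict(set)
    speeds = defaultdict(list)
    for subscriber_id, x_m, y_m, speed in samples:
        cell = cell_of(x_m, y_m)
        users[cell].add(subscriber_id)
        speeds[cell].append(speed)

    grid = {}
    for cell, values in speeds.items():
        avg = sum(values) / len(values)
        grid[cell] = {
            "users": len(users[cell]),
            "avg_speed": avg,
            "colour": colour_for(avg),
        }
    return grid
```

Using `math.floor` rather than integer truncation keeps squares the same size on both sides of the origin, so a point at x = -1 m falls into square -1 instead of being merged into square 0.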

Technical overview

The combination of technologies was selected to keep the lag behind real time to at most 5 minutes.

A Kafka cluster acts as a queue for raw data coming from mobile cells.
A Spark cluster acts as the real-time processing engine: it loads data from Kafka, transforms and pre-aggregates it, and then loads the result into HP Vertica.
An HP Vertica cluster acts as the data warehouse that stores the data.
The Cyclops web application acts as the UI and visualizes the data using a custom integration with the Google Maps API.
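As a rough illustration of what pre-aggregation before the warehouse can look like, the sketch below groups events into fixed time buckets per cell and then reads a past time range back in order, which is the kind of query a replay in the UI would need. It is plain Python rather than Spark code, and the 60-second bucket size, the event tuple layout, and the function names are assumptions made for the example; the source does not describe the actual aggregation keys.

```python
from collections import defaultdict

BUCKET_SECONDS = 60  # hypothetical bucket size, not taken from the source


def bucket_start(timestamp_s):
    """Round a Unix timestamp down to the start of its bucket."""
    return timestamp_s - timestamp_s % BUCKET_SECONDS


def pre_aggregate(events):
    """Collapse raw events into one row per (bucket, cell).

    `events` is an iterable of (timestamp_s, cell_id, speed_mbps).
    Returns {(bucket_start, cell_id): {"events": n, "avg_speed": mean}}.
    """
    totals = defaultdict(lambda: [0, 0.0])  # [event count, sum of speeds]
    for timestamp_s, cell_id, speed in events:
        acc = totals[(bucket_start(timestamp_s), cell_id)]
        acc[0] += 1
        acc[1] += speed
    return {
        key: {"events": count, "avg_speed": total / count}
        for key, (count, total) in totals.items()
    }


def playback(aggregates, start_ts, end_ts):
    """Yield (bucket, {cell_id: stats}) frames with start_ts <= bucket < end_ts,
    in chronological order."""
    frames = defaultdict(dict)
    for (bucket, cell_id), stats in aggregates.items():
        if start_ts <= bucket < end_ts:
            frames[bucket][cell_id] = stats
    for bucket in sorted(frames):
        yield bucket, frames[bucket]
```

Storing one row per bucket and cell instead of every raw event keeps the warehouse tables small, and a replay only has to read the buckets inside the selected range.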

Result and value

In the end, we created a fully functional system that fulfils all requirements. If the mobile operator starts receiving data at a higher rate than it does now, the solution can be scaled by adding more similar hardware, without redevelopment. Data storage can also be scaled easily, because HP Vertica can handle up to 100 TB. The user interface passed internal testing and displayed all the relevant information without noticeable lag. After verification, the solution was successfully launched on the client’s hardware, and it has been supplying the client with relevant, reliable information ever since.