Grid computing on map-reduce architecture

Mario Miosga successfully defends Bachelor thesis

Seville, June 2015. Mario Miosga successfully defended his Bachelor thesis at the Universidad de Sevilla. Mario is a student at HTWG Konstanz who came to Seville under the ERASMUS agreement. At the Universidad de Sevilla he implemented a map-reduce system on a comparatively small hardware infrastructure (Intel Galileo) and, for comparison, used the high-performance computing cluster in Seville. The thesis was co-supervised by Prof. Dr. Juan Antonio Ortega and Prof. Dr. Ralf Seepold. The two laboratories have a long and successful research cooperation, and this thesis was carried out within that framework. We congratulate Mario on his excellent work and thank our colleagues in Seville very much for their support.


(Mario Miosga during the presentation at the Universidad de Sevilla)
Title of the thesis: User centralized data analysis based on a grid computing on map-reduce architecture

When it comes to personal data, people are still very reserved. They are especially careful with data concerning their health. With today's technology, however, it should be possible to collect this kind of information and have it analyzed remotely by professionals. This is one of the use-case scenarios I addressed while developing a system that makes this possible. The hardware I used consisted of four Intel Galileo Gen 2 development boards and a regular server. The goal was to develop a system that enables the collected data of a specific user to be analyzed remotely. To this end, the Intel Galileo boards are the machines on which each user stores only their own data. The server is the junction that interacts with each individual board as well as with all boards at once. To reach this goal, I decided to use the Apache Hadoop project, which makes it possible to build a cluster of multiple nodes. Although the hardware of the boards was not even close to the recommended requirements for a Hadoop cluster, the results I achieved are really surprising. To allow a better analysis of the performance, and to have a baseline against which to compare the Intel Galileo boards, I set up the same system on regular servers.
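For readers unfamiliar with the map-reduce pattern mentioned above, the following is a minimal sketch in plain Python. It is not the thesis code and does not use Hadoop: it only illustrates how a central server could run the same job against one board or against all boards at once. The board names, user ids, and heart-rate readings are invented for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample data: each board holds the readings of exactly one user.
BOARDS = {
    "board-1": [("alice", 62), ("alice", 70)],
    "board-2": [("bob", 80), ("bob", 76)],
}


def map_phase(records):
    """Emit (key, value) pairs; here the key is the user id."""
    for user_id, value in records:
        yield user_id, value


def shuffle(pairs):
    """Group all values by key, as a framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all values of one key into a single result (here: the mean)."""
    return key, mean(values)


def run_job(boards, only_board=None):
    """Run the job on a single board, or on all boards if none is given."""
    selected = boards if only_board is None else {only_board: boards[only_board]}
    pairs = [pair for records in selected.values() for pair in map_phase(records)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(run_job(BOARDS))                        # all boards at once
    print(run_job(BOARDS, only_board="board-2"))  # one specific board
```

In a real Hadoop cluster the map and reduce steps would run on the nodes that hold the data, and the framework would handle the grouping step; the sketch keeps everything in one process to show only the data flow.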

