DIST Department of information sciences and technologies

Location: FAMNIT-1 MP2 at 16:00

Lecturer: Branko Kavšek, UP FAMNIT, IJS

Title: MapReduce and Hadoop: Parallel, scalable processing of massive data

What is big data? How do we store it? How do we process it? We will discuss the MapReduce paradigm and its open-source implementation Hadoop, which over the last decade has become the most popular and most common way of working with big data. The first part of the seminar will focus on the three phases of MapReduce ("Map", "Shuffle" and "Reduce") to better understand the theory behind the paradigm. The second part will be devoted to typical use cases: we will work through a couple of big data examples in Hadoop and look at how to use it on real-life problems.
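
As a rough illustration of the three phases named above, here is a minimal single-machine sketch of a word count in plain Python. It does not use Hadoop or any Hadoop API; the function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample sentences are made up for this example, and a real cluster would spread each phase across many machines.

```python
from collections import defaultdict


def map_phase(documents):
    """Map: emit a (word, 1) pair for every word in every document."""
    for doc in documents:
        for word in doc.lower().split():
            yield (word, 1)


def shuffle_phase(pairs):
    """Shuffle: group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: combine each key's list of values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents):
    """Chain the three phases: Map, then Shuffle, then Reduce."""
    return reduce_phase(shuffle_phase(map_phase(documents)))


if __name__ == "__main__":
    # Hypothetical input, standing in for a large collection of text files.
    docs = ["big data is big", "data is everywhere"]
    print(word_count(docs))
```

Because the Map step treats each document independently and the Reduce step treats each key independently, both can in principle run in parallel; only the Shuffle step needs to move data between workers.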

