Book review: Data-Intensive Text Processing with MapReduce

As part of my studies not directly connected to my job, over the last month I finished this interesting book, which gives you a good overview of MapReduce.

Jimmy Lin and Chris Dyer nailed this one: the book is really clear and leaves room for further, maybe more practical, studies. It starts with the definition of MapReduce, from the algorithm to the execution framework, and then analyzes each part of the algorithm and its variants in some frameworks, like Hadoop¹.
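To give an idea of what the model looks like, here is a minimal sketch of the classic word count in Python. It is not taken from the book (which uses pseudo-code) and it has nothing to do with Hadoop's API: `map_fn`, `reduce_fn` and `run_mapreduce` are names I made up, and the "framework" is just an in-memory loop that groups values by key between the two phases.

```python
from collections import defaultdict


def map_fn(doc_id, text):
    # Mapper: emit (word, 1) for every word in the document.
    for word in text.lower().split():
        yield word, 1


def reduce_fn(word, counts):
    # Reducer: sum all the partial counts collected for one word.
    yield word, sum(counts)


def run_mapreduce(inputs, mapper, reducer):
    # Toy, single-process stand-in for the execution framework (an assumption
    # for illustration): run the mappers, group values by key, run the reducers.
    groups = defaultdict(list)
    for key, value in inputs:
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)

    output = {}
    for key in sorted(groups):
        for out_key, out_value in reducer(key, groups[key]):
            output[out_key] = out_value
    return output


docs = [("d1", "a rose is a rose"), ("d2", "is it a rose")]
counts = run_mapreduce(docs, map_fn, reduce_fn)
print(counts)
```

A real framework would distribute the map and reduce calls across machines and handle the grouping step over the network; the sketch only shows the shape of the two functions you have to write.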

After studying the components of MapReduce you will take a practical look at possible implementations², from graph algorithms to PageRank. Note that the book is full of references, so if you get into it, you're gonna find yourself stuck in a nerdy loop :)

The EM chapter (dealing with expectation-maximization algorithms) was pretty difficult for me, as it has been a long time since I took math that seriously, but I was still able to follow all the theory explained there.

Something that I really appreciated was the closing remark stating that MapReduce, just like every technology, is not the right pick for every problem: think about stateful large-scale data-processing algorithms.

I strongly recommend reading this kind of book: the authors' approach is very engineering-minded and some of the examples they give, like stupid backoff, are pearls for your working experience.
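If you have never met stupid backoff, here is a rough single-machine sketch of the scoring rule in Python. It is my own illustration, not the book's pseudo-code: `count_ngrams` and `stupid_backoff` are made-up names, the corpus is a toy sentence, and the 0.4 backoff factor is the value commonly quoted for this technique. The idea is to score a word by its relative frequency after the given context and, when that n-gram was never seen, to retry with a shorter context while multiplying by a fixed penalty.

```python
from collections import Counter

ALPHA = 0.4  # commonly quoted backoff factor; treat it as a tunable assumption


def count_ngrams(tokens, max_order):
    # Count every n-gram of order 1..max_order, keyed by a tuple of words.
    counts = Counter()
    for n in range(1, max_order + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def stupid_backoff(word, context, counts, total_tokens, alpha=ALPHA):
    # Returns a score, not a normalized probability.
    context = tuple(context)
    penalty = 1.0
    while context:
        ngram = context + (word,)
        if counts[ngram] > 0:
            # Seen n-gram: relative frequency of the word after this context.
            return penalty * counts[ngram] / counts[context]
        # Unseen n-gram: drop the oldest context word and pay the penalty.
        context = context[1:]
        penalty *= alpha
    # No context left: fall back to the unigram relative frequency.
    return penalty * counts[(word,)] / total_tokens


tokens = "the cat sat on the mat".split()
counts = count_ngrams(tokens, max_order=3)
seen = stupid_backoff("cat", ["the"], counts, len(tokens))
unseen = stupid_backoff("mat", ["cat"], counts, len(tokens))
print(seen, unseen)
```

The nice part is that the scores only need raw counts, which is why the technique pairs so well with counting n-grams at scale.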

  1. But please remember, this book is not any kind of Hadoop guide
  2. Which are code-agnostic, as everything is written in a pretty clear pseudo-code
