Yahoo's S4 stream computing platform

The internet had a huge boom during the late ’90s, and all the major players needed something web-scale in order to conquer the ’net.

Google, after its launch, suddenly became the king of the web, destroying the competitors year by year.

Far from being a simple search engine, it evolved to give us incredible services like Gmail and Docs: web-ready, web-scale services able to endure massive traffic and data exchange.

Furthermore, the search engine was accumulating an extraordinary amount of records and it probably needed a brand new algorithm to index and process its contents.

It was 2004, and Google came out with MapReduce, a new way of thinking about how to distribute the workload.

Beyond MapReduce

MapReduce rocks, everybody thought.

Then its limits started to show: MR is solid and ass-kicking when you need to process tons of permanent data, and when you need to run huge batch operations nobody hesitates to use it.

But there is a scenario where MR probably isn’t the best choice for your needs: processing huge streams of data.
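To make the difference concrete, here is a minimal sketch in Python. It is neither MapReduce nor S4, just an illustration of the two styles: the batch function needs the whole dataset before it can answer, while the streaming one refines its answer every time a new value arrives. The names `batch_average` and `RunningAverage` are made up for this example.

```python
def batch_average(values):
    """Batch style: the full dataset must exist before we can compute anything."""
    values = list(values)
    return sum(values) / len(values)


class RunningAverage:
    """Streaming style: the result is updated one event at a time."""

    def __init__(self):
        self.n = 0        # number of values seen so far
        self.mean = 0.0   # average of the values seen so far

    def update(self, x):
        # Incremental mean: no need to keep the previous values around.
        self.n += 1
        self.mean += (x - self.mean) / self.n
        return self.mean


if __name__ == "__main__":
    print(batch_average([2, 4, 6]))

    stream = RunningAverage()
    for value in [2, 4, 6]:
        print(stream.update(value))
```

With an endless stream there is no point at which the batch version could even start, whereas the running version always has an answer ready.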

Enter S4

S4 – I really don’t know the reason behind this name – is a framework developed by Yahoo for processing continuous streams of data.

Its entire architecture is, obviously, designed to be event-driven: computational units, known as processing elements (PEs), process and possibly re-dispatch the events that the framework delivers to them.
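The idea of processing elements that consume events and optionally re-dispatch new ones can be sketched in a few lines of Python. This is a toy model and not the S4 API: `Event`, `SplitterPE`, `CounterPE`, `Dispatcher` and `count_words` are all hypothetical names invented for this example, and everything runs in a single process instead of a cluster. Creating one PE instance per stream and key is also an assumption of the sketch.

```python
from collections import deque
from dataclasses import dataclass
from typing import Any


@dataclass
class Event:
    stream: str    # name of the stream the event belongs to
    key: str       # value used to route the event to a PE instance
    payload: Any


class SplitterPE:
    """Receives a line of text and re-dispatches one event per word."""

    def process(self, event):
        return [Event("words", word.lower(), 1) for word in event.payload.split()]


class CounterPE:
    """Keeps a running count for a single key; emits nothing further."""

    def __init__(self):
        self.count = 0

    def process(self, event):
        self.count += event.payload
        return []


class Dispatcher:
    """Toy event loop: routes each event to the PE owning its (stream, key)."""

    def __init__(self, factories):
        self.factories = factories   # stream name -> PE class (hypothetical wiring)
        self.instances = {}          # (stream, key) -> PE instance
        self.queue = deque()

    def emit(self, event):
        self.queue.append(event)

    def run(self):
        while self.queue:
            event = self.queue.popleft()
            slot = (event.stream, event.key)
            if slot not in self.instances:
                # Lazily create one PE instance per distinct key.
                self.instances[slot] = self.factories[event.stream]()
            # Whatever the PE returns is fed back into the loop.
            self.queue.extend(self.instances[slot].process(event))


def count_words(lines):
    dispatcher = Dispatcher({"lines": SplitterPE, "words": CounterPE})
    for line in lines:
        dispatcher.emit(Event("lines", "all", line))
    dispatcher.run()
    return {
        key: pe.count
        for (stream, key), pe in dispatcher.instances.items()
        if stream == "words"
    }


if __name__ == "__main__":
    print(count_words(["to be or not to be"]))
```

Each PE only knows about the events it receives, which is what makes it plausible to spread PEs across many machines: the dispatcher (the framework, in the real thing) is the only piece that needs to know where an event should go.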

It already supports client adapters and a high-level API, so you should be able to integrate with the platform in any language you like; unfortunately, the docs have a Perl example :)

