Google’s MapReduce Programming Model-Revisited

 

Google's MapReduce programming model serves for processing large data sets in a massively parallel manner. We deliver the first rigorous description of the model including its advancement as Google's domain-specific language Sawzall. To this end, we reverse-engineer the seminal papers on MapReduce and Sawzall, and we capture our findings as an executable specification. We also identify and resolve some obscurities in the informal presentation given in the seminal papers. We use typed functional programming (specifically Haskell) as a tool for design recovery and executable specification. Our development comprises three components: (i) the basic program skeleton that underlies MapReduce computations; (ii) the opportunities for parallelism in executing MapReduce computations; (iii) the fundamental characteristics of Sawzall's aggregators as an advancement of the MapReduce approach. Our development does not formalize the more implementational aspects of an actual, distributed execution of MapReduce computations.

Keywords: Data processing; Parallel programming; Distributed programming; Software design; Executable specification; Typed functional programming; MapReduce; Sawzall; Map; Reduce; List homomorphism; Haskell
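
To make component (i) of the abstract concrete, here is a minimal sequential sketch of a MapReduce-style program skeleton: a per-record map step, a grouping of intermediate values by key, and a per-key reduce step. It is written in Python for illustration and is not the paper's Haskell specification. The names `map_reduce`, `count_words`, and `sum_counts` are hypothetical, and the sketch deliberately ignores parallel and distributed execution.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Optional, Tuple, TypeVar

K1 = TypeVar("K1")  # input key type
V1 = TypeVar("V1")  # input value type
K2 = TypeVar("K2")  # intermediate key type
V2 = TypeVar("V2")  # intermediate value type
V3 = TypeVar("V3")  # output value type


def map_reduce(
    mapper: Callable[[K1, V1], Iterable[Tuple[K2, V2]]],
    reducer: Callable[[K2, List[V2]], Optional[V3]],
    data: Dict[K1, V1],
) -> Dict[K2, V3]:
    """Sequential sketch of the skeleton (hypothetical helper, not the paper's code)."""
    # Map phase: each input pair yields zero or more intermediate pairs.
    intermediate: List[Tuple[K2, V2]] = []
    for k1, v1 in data.items():
        intermediate.extend(mapper(k1, v1))

    # Group phase: collect all intermediate values that share a key.
    groups: Dict[K2, List[V2]] = defaultdict(list)
    for k2, v2 in intermediate:
        groups[k2].append(v2)

    # Reduce phase: one reducer call per key; None means "no output for this key".
    result: Dict[K2, V3] = {}
    for k2, values in groups.items():
        reduced = reducer(k2, values)
        if reduced is not None:
            result[k2] = reduced
    return result


# Word counting: each word occurrence is mapped to a count of 1, then summed per word.
def count_words(_doc_name: str, text: str) -> List[Tuple[str, int]]:
    return [(word, 1) for word in text.split()]


def sum_counts(_word: str, counts: List[int]) -> int:
    return sum(counts)


docs = {"doc1": "fold the fold", "doc2": "appreciate the unfold"}
word_counts = map_reduce(count_words, sum_counts, docs)
```

With the two sample documents above, `word_counts` maps each distinct word to its total number of occurrences across both documents. Because the three phases are kept separate, the sketch also hints at where the map and reduce steps could be run in parallel, though it does not attempt to do so.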

http://portal.acm.org/citation.cfm?id=1290812

Google’s MapReduce programming model — Revisited

Ralf Lämmel

Data Programmability Team, Microsoft Corp., Redmond, WA, USA

Received 9 February 2006;
revised 10 July 2007;
accepted 10 July 2007.
Available online 18 July 2007.


Corresponding address: Universität Koblenz-Landau, Institut für Informatik B 128, Universitätsstrasse 1, D-56070 Koblenz, Germany.

http://www.sciencedirect.com/science/article/pii/S0167642307001281

[PDF]

Google's MapReduce Programming Model — Revisited

File format: PDF/Adobe Acrobat
Author: R Lämmel; cited by 111
citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.104...

Author: Ralf Lämmel, Data Programmability Team, Microsoft Corp., Redmond, WA, USA
Published in: Science of Computer Programming (journal)
Volume 68, Issue 3, October 2007
Elsevier North-Holland, Inc., Amsterdam, The Netherlands
doi: 10.1016/j.scico.2007.07.001

