Google’s MapReduce Programming Model-Revisted
Google's MapReduce programming model serves for processing large data sets in a massively parallel manner. We deliver the first rigorous description of the model including its advancement as Google's domain-specific language Sawzall. To this end, we reverse-engineer the seminal papers on MapReduce and Sawzall, and we capture our findings as an executable specification. We also identify and resolve some obscurities in the informal presentation given in the seminal papers. We use typed functional programming (specifically Haskell) as a tool for design recovery and executable specification. Our development comprises three components: (i) the basic program skeleton that underlies MapReduce computations; (ii) the opportunities for parallelism in executing MapReduce computations; (iii) the fundamental characteristics of Sawzall's aggregators as an advancement of the MapReduce approach. Our development does not formalize the more implementational aspects of an actual, distributed execution of MapReduce computations.
Keywords:Data processing; Parallel programming; Distributed programming; Software design; Executable specification; Typed functional programming; MapReduce; Sawzall; Map; Reduce; List homomorphism; Haskell
http://portal.acm.org/citation.cfm?id=1290812
Google’s MapReduce programming model — Revisited
Ralf Lämmel,a,
aData Programmability Team, Microsoft Corp., Redmond, WA, USA
Received 9 February 2006;
revised 10 July 2007;
accepted 10 July 2007.
Available online 18 July 2007.
Abstract
Google’s MapReduce programming model serves for processing large data sets in a massively parallel manner. We deliver the first rigorous description of the model including its advancement as Google’s domain-specific language Sawzall. To this end, we reverse-engineer the seminal papers on MapReduce and Sawzall, and we capture our findings as an executable specification. We also identify and resolve some obscurities in the informal presentation given in the seminal papers. We use typed functional programming (specifically Haskell) as a tool for design recovery and executable specification. Our development comprises three components: (i) the basic program skeleton that underlies MapReduce computations; (ii) the opportunities for parallelism in executing MapReduce computations; (iii) the fundamental characteristics of Sawzall’s aggregators as an advancement of the MapReduce approach. Our development does not formalize the more implementational aspects of an actual, distributed execution of MapReduce computations.
Keywords:Data processing; Parallel programming; Distributed programming; Software design; Executable specification; Typed functional programming; MapReduce; Sawzall; Map; Reduce; List homomorphism; Haskell
Corresponding address: Universität Koblenz-Landau, Institut für Informatik B 128, Universitätsstrasse 1, D-56070 Koblenz, Germany.
http://www.sciencedirect.com/science/article/pii/S0167642307001281
[PDF]
-[]
文件格式:PDF/Adobe Acrobat -
快速查看作者:R Lämmel-
被引用次数:111-
相关文章Google's MapReduce Programming Model — Revisited. ∗. Ralf Lämmel. Data Programmability Team. Microsoft Corp. Redmond, WA, USA. Abstract
...citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.104...
Published in: |
·Journal |
Science of Computer Programmingarchive
|
Volume 68 Issue 3, October, 2007 Elsevier North-Holland, Inc.Amsterdam, The Netherlands, The Netherlands tableofcontentsdoi>10.1016/j.scico.2007.07.001
|
|
分享到:
相关推荐
赠送jar包:hadoop-mapreduce-client-jobclient-2.6.5.jar; 赠送原API文档:hadoop-mapreduce-client-jobclient-2.6.5-javadoc.jar; 赠送源代码:hadoop-mapreduce-client-jobclient-2.6.5-sources.jar; 赠送...
Hadoop 3.x(MapReduce)----【MapReduce 概述】---- 代码 Hadoop 3.x(MapReduce)----【MapReduce 概述】---- 代码 Hadoop 3.x(MapReduce)----【MapReduce 概述】---- 代码 Hadoop 3.x(MapReduce)----...
hadoop-mapreduce-examples-2.7.1.jar
赠送jar包:hadoop-mapreduce-client-jobclient-2.6.5.jar; 赠送原API文档:hadoop-mapreduce-client-jobclient-2.6.5-javadoc.jar; 赠送源代码:hadoop-mapreduce-client-jobclient-2.6.5-sources.jar; 赠送...
赠送jar包:hadoop-mapreduce-client-app-2.6.5.jar; 赠送原API文档:hadoop-mapreduce-client-app-2.6.5-javadoc.jar; 赠送源代码:hadoop-mapreduce-client-app-2.6.5-sources.jar; 赠送Maven依赖信息文件:...
赠送jar包:hadoop-mapreduce-client-app-2.7.3.jar; 赠送原API文档:hadoop-mapreduce-client-app-2.7.3-javadoc.jar; 赠送源代码:hadoop-mapreduce-client-app-2.7.3-sources.jar; 赠送Maven依赖信息文件:...
赠送jar包:hadoop-mapreduce-client-core-2.7.3.jar; 赠送原API文档:hadoop-mapreduce-client-core-2.7.3-javadoc.jar; 赠送源代码:hadoop-mapreduce-client-core-2.7.3-sources.jar; 赠送Maven依赖信息文件:...
赠送jar包:hadoop-mapreduce-client-core-2.5.1.jar; 赠送原API文档:hadoop-mapreduce-client-core-2.5.1-javadoc.jar; 赠送源代码:hadoop-mapreduce-client-core-2.5.1-sources.jar; 赠送Maven依赖信息文件:...
赠送jar包:hadoop-mapreduce-client-core-2.6.5.jar 赠送原API文档:hadoop-mapreduce-client-core-2.6.5-javadoc.jar 赠送源代码:hadoop-mapreduce-client-core-2.6.5-sources.jar 包含翻译后的API文档:...
Hadoop 3.x(MapReduce)----【Hadoop 序列化】---- 代码 Hadoop 3.x(MapReduce)----【Hadoop 序列化】---- 代码 Hadoop 3.x(MapReduce)----【Hadoop 序列化】---- 代码 Hadoop 3.x(MapReduce)----【Hadoop ...
hadoop-mapreduce-examples-2.6.5.jar 官方案例源码
赠送jar包:hadoop-mapreduce-client-app-2.6.5.jar; 赠送原API文档:hadoop-mapreduce-client-app-2.6.5-javadoc.jar; 赠送源代码:hadoop-mapreduce-client-app-2.6.5-sources.jar; 赠送Maven依赖信息文件:...
Hadoop实现了一个分布式文件系统(Hadoop Distributed File System),简称HDFS。HDFS有高容错性的特点,并且设计用来部署在低廉的(low-cost)硬件上;而且它提供高吞吐量(high throughput)来访问应用程序的数据...
hadoop-mapreduce-examples-2.0.0-alpha.jar
赠送jar包:hadoop-mapreduce-client-common-2.7.3.jar; 赠送原API文档:hadoop-mapreduce-client-common-2.7.3-javadoc.jar; 赠送源代码:hadoop-mapreduce-client-common-2.7.3-sources.jar; 赠送Maven依赖信息...
赠送jar包:hadoop-mapreduce-client-jobclient-2.5.1.jar; 赠送原API文档:hadoop-mapreduce-client-jobclient-2.5.1-javadoc.jar; 赠送源代码:hadoop-mapreduce-client-jobclient-2.5.1-sources.jar; 赠送...
赠送jar包:hadoop-mapreduce-client-jobclient-2.7.3.jar; 赠送原API文档:hadoop-mapreduce-client-jobclient-2.7.3-javadoc.jar; 赠送源代码:hadoop-mapreduce-client-jobclient-2.7.3-sources.jar; 赠送...
赠送jar包:hadoop-mapreduce-client-jobclient-2.5.1.jar; 赠送原API文档:hadoop-mapreduce-client-jobclient-2.5.1-javadoc.jar; 赠送源代码:hadoop-mapreduce-client-jobclient-2.5.1-sources.jar; 赠送...
赠送jar包:hadoop-mapreduce-client-shuffle-2.5.1.jar; 赠送原API文档:hadoop-mapreduce-client-shuffle-2.5.1-javadoc.jar; 赠送源代码:hadoop-mapreduce-client-shuffle-2.5.1-sources.jar; 赠送Maven依赖...
赠送jar包:hadoop-mapreduce-client-common-2.6.5.jar; 赠送原API文档:hadoop-mapreduce-client-common-2.6.5-javadoc.jar; 赠送源代码:hadoop-mapreduce-client-common-2.6.5-sources.jar; 赠送Maven依赖信息...