The traditional Hadoop MapReduce framework offers a simple programming model for large-scale parallel and distributed data processing. However, the model lacks the expressiveness needed for semantic-oriented processing of large data. This paper presents a tree-oriented approach that adds expressiveness to the traditional Hadoop MapReduce framework. The new tree-based MapReduce structure supports group-based processing, level-based processing, and traversal-order-based processing. Used stand-alone or nested, these processing constructs provide the expressivity required for semantic-oriented large data processing. This is accomplished while preserving the fundamental benefit of the traditional MapReduce framework: fault-tolerant processing.
- Hadoop MapReduce
- Parallel trees