Result: ALOJA: A benchmarking and predictive platform for big data performance analysis

Title:
ALOJA: A benchmarking and predictive platform for big data performance analysis
Contributors:
Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors, Barcelona Supercomputing Center, Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
Publisher Information:
Springer
Publication Year:
2016
Collection:
Universitat Politècnica de Catalunya, BarcelonaTech: UPCommons - Global access to UPC knowledge
Document Type:
Conference object
File Description:
14 p.; application/pdf
Language:
English
Relation:
http://link.springer.com/chapter/10.1007/978-3-319-49748-8_4; info:eu-repo/grantAgreement/MINECO/6PN/TIN2012-34557; info:eu-repo/grantAgreement/MINECO/1PE/TIN2012-34557; http://hdl.handle.net/2117/100159
DOI:
10.1007/978-3-319-49748-8_4
Rights:
Open Access
Accession Number:
edsbas.9C777693
Database:
BASE

Further Information

The main goals of the ALOJA research project from BSC-MSR are to explore and automate the characterization of the cost-effectiveness of Big Data deployments. The development of the project over its first year has resulted in an open source benchmarking platform, an online public repository of results with over 42,000 Hadoop job runs, and web-based analytic tools to gather insights about system cost-performance. This article describes the evolution of the project's focus and research lines from over a year of continuously benchmarking Hadoop under different configuration and deployment options, presents results, and discusses the motivation, both technical and market-based, for these changes. During this time, ALOJA's target has evolved from low-level profiling of the Hadoop runtime, through extensive benchmarking and evaluation of a large body of results via aggregation, to currently leveraging Predictive Analytics (PA) techniques. Modeling benchmark executions allows us to estimate the results of new or untested configurations or hardware set-ups automatically, using learning techniques on past observations, which saves benchmarking time and costs. ; This work is partially supported by the BSC-Microsoft Research Centre, the Spanish Ministry of Education (TIN2012-34557), the MINECO Severo Ochoa Research program (SEV-2011-0067) and the Generalitat de Catalunya (2014-SGR-1051). ; Peer Reviewed ; Postprint (author's final draft)