Online Katalog der GeoSphere Austria (Standort Neulinggasse)

Home

Login

Katalogkarte GBA

Katalogkarte ISBD

Suche präzisieren

Drucken

Download RIS

Zum vorherigen Treffer

Datensatz 1 von 1

Zum nächsten Treffer


Titel	STELLA: A domain-specific embedded language for stencil codes on structured grids
VerfasserIn	Tobias Gysi, Oliver Fuhrer, Carlos Osuna, Benjamin Cumming, Thomas Schulthess
Konferenz	EGU General Assembly 2014
Medientyp	Artikel
Sprache	Englisch
Digitales Dokument	PDF
Erschienen	In: GRA - Volume 16 (2014)
Datensatznummer	250093587
Publikation (Nr.)	EGU/EGU2014-8464.pdf



Zusammenfassung
Adapting regional weather and climate models (RCMs) for hybrid many-core computing architectures is a formidable challenge. Achieving high performance on different supercomputing architectures while retaining a single source code are often perceived as contradicting goals. Typically, the numerical algorithms employed are tightly inter-twined with hardware dependent implementation choices and optimizations such as for example data-structures and loop order. While Fortran is currently the de-facto standard for programming RCMs, no single such standard for porting such models to graphics processing units (GPUs) has yet emerged. The approaches used can be grouped into three main categories: compiler directives (OpenACC, PGI compiler directives), custom programming languages (CUDA, OpenCL) and domain-specific libraries or languages. STELLA (STencil Loop LAnguage) is a domain-specific embedded language (DSEL) built using generic programming in C++ which is targeted at stencil codes on structured grids. It allows a high-level specification of the algorithm while separating hardware dependent implementation details into back-ends. Currently, a back-end for multi-core CPUs using the OpenMP programming model and a back-end for NVIDIA GPUs using the CUDA programming mode has been developed. We will present the domain-specific language and its features such as software managed caching. With the example of an implementation of the dynamical core of a RCM (COSMO) we will compare performance with respect to the original Fortran implementation both on both CPUs and GPUs. Finally, we will discuss advantages and disadvantages of our approach as compared to other approaches such as source-to-source translators.