2007
Anand, Christopher Kumar; Kahl, Wolfram
SQRL Report No. 43 “A Domain-Specific Language for the Generation of Optimized SIMD-Parallel Assembly Code” Technical Report
2007.
Tags: code generation for SIMD-parallelism, domain-specific languages, high-performance floating-point function evaluation, special functions
@techreport{Anand2007,
title = {SQRL Report No. 43 “A Domain-Specific Language for the Generation of Optimized SIMD-Parallel Assembly Code”},
author = {Christopher Kumar Anand and Wolfram Kahl},
url = {http://www.cas.mcmaster.ca/sqrl/papers/SQRLreport43.pdf},
year = {2007},
date = {2007-05-15},
abstract = {We present a domain-specific language embedded into Haskell that allows mathematicians to formulate novel high-performance SIMD-parallel algorithms for the evaluation of special functions.
Developing such functions involves explorations both of mathematical properties of the functions which lead to effective (rational) polynomial approximations, and of specific properties of the binary representation of floating-point numbers. Our framework includes support for estimating the effectiveness of different approximation schemes in Maple. Once a scheme is chosen, the Maple-generated component is integrated into the code generation setup. Numerical experimentation can then be performed interactively, with support functions for running standard tests and tabulating results. Once a satisfactory formulation is achieved, a code graph representation of the algorithm can be passed to other components which produce C function bodies, or to a state-of-the-art scheduler which produces optimal or near-optimal schedules, currently targeting the “Cell Broadband Engine” processor.
Encapsulating a considerable amount of knowledge about specific “tricks” in DSL constructs allows us to produce algorithm specifications that are precise, readable, and compile to optimal-quality assembly code, while formulations of the equivalent algorithms in C would be almost impossible to understand and maintain.},
keywords = {code generation for SIMD-parallelism, domain-specific languages, high-performance floating-point function evaluation, special functions},
pubstate = {published},
tppubtype = {techreport}
}
We present a domain-specific language embedded into Haskell that allows mathematicians to formulate novel high-performance SIMD-parallel algorithms for the evaluation of special functions.
Developing such functions involves explorations both of mathematical properties of the functions which lead to effective (rational) polynomial approximations, and of specific properties of the binary representation of floating-point numbers. Our framework includes support for estimating the effectiveness of different approximation schemes in Maple. Once a scheme is chosen, the Maple-generated component is integrated into the code generation setup. Numerical experimentation can then be performed interactively, with support functions for running standard tests and tabulating results. Once a satisfactory formulation is achieved, a code graph representation of the algorithm can be passed to other components which produce C function bodies, or to a state-of-the-art scheduler which produces optimal or near-optimal schedules, currently targeting the “Cell Broadband Engine” processor.
Encapsulating a considerable amount of knowledge about specific “tricks” in DSL constructs allows us to produce algorithm specifications that are precise, readable, and compile to optimal-quality assembly code, while formulations of the equivalent algorithms in C would be almost impossible to understand and maintain.
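The general idea of describing a polynomial approximation once, as an expression graph, and then both evaluating it interactively and emitting C from it can be illustrated with a small sketch. This is not the report's DSL: the report embeds its language in Haskell, while Python is used here only for illustration, and every name below (`Var`, `Const`, `Add`, `Mul`, `horner`, `evaluate`, `emit_c`) is hypothetical. The coefficients are an arbitrary truncated series, not an approximation taken from the report, and the sketch does no SIMD scheduling.

```python
from dataclasses import dataclass
from typing import Dict, Sequence, Union


# Hypothetical expression-graph nodes; none of these names come from the report.
@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    value: float


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Mul:
    left: "Expr"
    right: "Expr"


Expr = Union[Var, Const, Add, Mul]


def horner(coeffs: Sequence[float], x: Expr) -> Expr:
    """Build the graph for c0 + c1*x + ... + cn*x**n using Horner's scheme."""
    acc: Expr = Const(coeffs[-1])
    for c in reversed(coeffs[:-1]):
        acc = Add(Const(c), Mul(acc, x))
    return acc


def evaluate(expr: Expr, env: Dict[str, float]) -> float:
    """Interpret the graph numerically, e.g. for interactive experiments."""
    if isinstance(expr, Var):
        return env[expr.name]
    if isinstance(expr, Const):
        return expr.value
    if isinstance(expr, Add):
        return evaluate(expr.left, env) + evaluate(expr.right, env)
    if isinstance(expr, Mul):
        return evaluate(expr.left, env) * evaluate(expr.right, env)
    raise TypeError(f"unknown node: {expr!r}")


def emit_c(expr: Expr) -> str:
    """Render the same graph as a fully parenthesized C expression."""
    if isinstance(expr, Var):
        return expr.name
    if isinstance(expr, Const):
        return repr(expr.value)
    if isinstance(expr, Add):
        return f"({emit_c(expr.left)} + {emit_c(expr.right)})"
    if isinstance(expr, Mul):
        return f"({emit_c(expr.left)} * {emit_c(expr.right)})"
    raise TypeError(f"unknown node: {expr!r}")


# Arbitrary example: 1 + x + 0.5*x**2.
x = Var("x")
poly = horner([1.0, 1.0, 0.5], x)
print(emit_c(poly))                 # (1.0 + ((1.0 + (0.5 * x)) * x))
print(evaluate(poly, {"x": 2.0}))   # 5.0
```

Keeping the algorithm as data is what lets one description serve several back ends; in this sketch, an interpreter and a C printer share the same graph.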