Haskell vs. F# vs. Scala
A High-level Language Features and Parallelism Support Comparison

        Prabhat Totoo, Pantazis Deligiannis, Hans-Wolfgang Loidl

                              Dependable Systems Group
                    School of Mathematical and Computer Sciences
                              Heriot-Watt University, UK
                           FHPC'12, Copenhagen, Denmark

                              15 September 2012


Goal

      comparison of parallel programming support in the 3 languages
      - performance
      - programmability

      we look at:
              language features and high-level abstractions for parallelism
              experimental results from n-body problem on multi-cores
              discussion about performance, programmability and pragmatic aspects

 Totoo, Deligiannis, Loidl (HWU)   Haskell vs. F# vs. Scala         15-Sep-2012   1 / 29

Motivation

      promises of functional languages for parallel programming
              referential transparency due to the absence of side effects
              high-level abstractions through HOFs, function composition
              skeletons encapsulating common parallel patterns
      “specify what instead of how to compute something”
      SPJ: “The future of parallel is declarative”

Recent trends in language design

      mainstream languages integrating functional features in their design
              e.g. generics (Java), lambda expressions (C#, C++), delegates (C#)
              improves expressiveness in the language by providing functional
              constructs
      multi-paradigm languages - make functional programming more approachable
              F#
              - Microsoft .NET platform
              - functional-oriented; with imperative and OO constructs
              Scala
              - JVM
              - emphasis on OO but combined with powerful functional features
      library support for parallel patterns (skeletons)
              e.g. TPL

Summary table

                               Table: Summary of language features

                         Key Features                                 Parallelism Support
               Haskell   pure functional, lazy evaluation,            par and pseq
                         static/inferred typing                       Evaluation strategies
               F#        functional, imperative, object oriented,     Async Workflows
                         strict evaluation, static/inferred typing,   TPL
                         .NET interoperability                        PLINQ
               Scala     functional, imperative, object oriented,     Parallel Collections
                         strict evaluation, strong/inferred typing,   Actors
                         Java interoperability


Haskell

        pure functional language, generally functions have no side-effects
        lazy by default
        typing: static, strong, type inference
        advanced type system supporting ADTs, typeclasses, type polymorphism
        monads used to:
              chain computations (no default eval order)
              separate pure from side-effecting computations (IO monad)
        main implementation: GHC
        - highly-optimised compiler and RTS

Haskell - parallel support

      includes support for semi-explicit parallelism through the Glasgow
      parallel Haskell (GpH) extensions
               GpH provides a single primitive for parallelism: par
                       par :: a -> b -> b

               to annotate expressions that can usefully be evaluated in parallel
               (potential, NOT mandatory, parallelism)
               and a second primitive to enforce sequential ordering, which is needed
               to arrange parallel computations: pseq
                       pseq :: a -> b -> b
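
A minimal sketch of how the two primitives combine, assuming a GHC installation. It imports par and pseq from GHC.Conc in base (they are more commonly imported from Control.Parallel in the parallel package); fib and parSum are hypothetical names used only for illustration.

```haskell
-- Spark one operand with par, force the other with pseq, then combine.
import GHC.Conc (par, pseq)

-- Naive Fibonacci, used only as a stand-in for real work.
fib :: Int -> Integer
fib n
  | n < 2     = toInteger n
  | otherwise = fib (n - 1) + fib (n - 2)

-- `a` is sparked for possible parallel evaluation, `b` is evaluated on the
-- current thread, and only then are the two results added. The value is the
-- same whether or not the spark is ever picked up by another core.
parSum :: Int -> Int -> Integer
parSum x y = a `par` (b `pseq` (a + b))
  where
    a = fib x
    b = fib y

main :: IO ()
main = print (parSum 20 21)
```

To see actual parallel execution the program would need to be compiled with -threaded and run with +RTS -N; without these the spark is simply evaluated sequentially.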

Haskell - parallel support

        GpH e.g.
                  f = s1 + s2                         -- sequential
                  f = s1 `par` s2 `pseq` (s1 + s2)    -- parallel

        Evaluation strategies
              abstractions built on top of the basic primitives
              separate coordination aspects from main computation
        parallel map using strategies
                  parMap strat f xs = map f xs `using` parList strat

              parList: sparks each of the elements of a list using the given strategy
              strat: e.g. rdeepseq, a strategy that fully evaluates its argument
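
The parallel map above can be tried out as a small program. This is a sketch that assumes the parallel package is installed; the function is named parMap' here because Control.Parallel.Strategies already exports its own parMap, and sumTo is a hypothetical workload.

```haskell
import Control.Parallel.Strategies (Strategy, parList, rdeepseq, using)

-- The computation (map f xs) is written as usual; the coordination
-- (spark every element, evaluate each with strat) is attached with `using`.
parMap' :: Strategy b -> (a -> b) -> [a] -> [b]
parMap' strat f xs = map f xs `using` parList strat

-- Hypothetical per-element work.
sumTo :: Int -> Int
sumTo n = sum [1 .. n]

main :: IO ()
main = print (parMap' rdeepseq sumTo [10, 100, 1000])
```

Removing the `using` clause gives back the sequential program unchanged, which is the separation of coordination from computation that the slide refers to.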

Haskell - parallel support

      Other abstractions for parallelism:
              Par monad
              - more explicit approach
              - uses IVars, put and get for communication
              - abstracts some common functions e.g. parallel map
              DPH
              - exploits data parallelism, both regular and nested-data
              Eden
              - uses process abstraction, similar to λ-abstraction
              - for distributed-memory but also good performance on shared-memory
              machines [3]

     [3] Prabhat Totoo, Hans-Wolfgang Loidl. Parallel Haskell implementations of the
     n-body problem.

F#

       functional-oriented language
       SML-like language with OO extensions
       implemented on top of .NET, interoperable with languages such as C#
       can make use of arbitrary .NET libraries from F#
       strict by default, but has lazy values as well
       advanced type system: discriminated union (=ADT), object types
       (for .NET interoperability)
       value vs. variable

F# - parallel support

      uses the .NET Parallel Extensions library
      - high-level constructs to write/execute parallel programs
      Task Parallel Library (TPL)
      - hides low-level thread creation, management, scheduling details
              Tasks
              - main construct for task parallelism
              Task: provides high-level abstraction compared to working directly
              with threads; does not return a value
              Task<TResult>: represents an operation that calculates a value of
              type TResult eventually, i.e. a future
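
For readers coming from the Haskell side, the idea of a future can be sketched with a thread and an MVar from base. This is an analogy only and not how the TPL is implemented; Future, future and await are hypothetical names.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, readMVar)
import Control.Exception (evaluate)

-- A future is a handle on a result that becomes available later.
newtype Future a = Future (MVar a)

-- Start evaluating a value (to weak head normal form) on a new
-- lightweight thread and return immediately with a handle.
future :: a -> IO (Future a)
future x = do
  box <- newEmptyMVar
  _ <- forkIO (evaluate x >>= putMVar box)
  return (Future box)

-- Block until the value is ready, in the spirit of Task<TResult>.Result.
await :: Future a -> IO a
await (Future box) = readMVar box

main :: IO ()
main = do
  fa <- future (sum [1 .. 1000000 :: Int])
  fb <- future (product [1 .. 20 :: Integer])
  a <- await fa
  b <- await fb
  print (a, b)
```

The sketch ignores error handling: if the computation throws, await blocks forever, whereas a real task library propagates the exception to the caller.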

F# - parallel support

      Task Parallel Library (TPL)
              Parallel class - for data parallelism
              - basic loop parallelisation using Parallel.For and
              Parallel.ForEach
              PLINQ - declarative model for data parallelism
              - uses tasks internally
      Async Workflows
      - use the async {...} construct; doesn't block the calling thread
      - provide basic parallelisation
      - intended mainly for operations involving I/O e.g. run multiple
      downloads in parallel


Scala

        designed to be a “better Java”...
        multiparadigm language; integrating OO and functional model
        evaluation: strict; typing: static, strong and inferred
        strong interoperability with Java (targeted for JVM)
        still closely tied to the concept of objects
        language expressiveness is extended by functional features

Scala - parallel support

      Parallel Collections Framework
               implicit (data) parallelism
               use par on a sequential collection to invoke the parallel implementation
               subsequent operations are parallel
               use seq to turn the collection back to sequential
                       xs.map(x => f(x))            // sequential

                       xs.par.map(x => f(x)).seq    // parallel

               built on top of Fork/Join framework
               thread pool implementation schedules tasks among available processors

Scala - parallel support

      Actors
               message-passing model; no shared state
               similar concurrency model to Erlang
               lightweight processes communicate by exchanging asynchronous
               messages
               messages go into mailboxes; processed using pattern matching
                       a ! msg                              // send message

                       receive {                            // process mailbox
                         case msg_pattern_1 => action1
                         case msg_pattern_2 => action2
                         case msg_pattern_3 => action3
                       }
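
The mailbox model can be mimicked in Haskell with a channel and a lightweight thread from base. This is a rough analogy and not the Scala actors library; Msg, accumulator and sumViaActor are hypothetical names.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Messages the hypothetical accumulator actor understands.
data Msg = Add Int | Stop

-- The actor loop: take one message from the mailbox, pattern match on it,
-- and either continue with updated state or finish with the total.
accumulator :: Chan Msg -> Int -> IO Int
accumulator mailbox total = do
  msg <- readChan mailbox
  case msg of
    Add n -> accumulator mailbox (total + n)
    Stop  -> return total

-- Spawn the actor, send it the numbers asynchronously, then wait for
-- the final total to come back.
sumViaActor :: [Int] -> IO Int
sumViaActor ns = do
  mailbox <- newChan
  done <- newEmptyMVar
  _ <- forkIO (accumulator mailbox 0 >>= putMVar done)
  mapM_ (writeChan mailbox . Add) ns   -- sends do not wait for the receiver
  writeChan mailbox Stop
  takeMVar done

main :: IO ()
main = sumViaActor [1 .. 10] >>= print
```

No state is shared between sender and receiver; the only interaction is through messages placed in the mailbox, which is the property the actor model relies on.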

N-body problem

      problem of predicting and simulating the motion of a system of N
      bodies that interact with each other gravitationally
      simulation proceeds over a number of time steps
      in each step
              calculate acceleration of each body wrt the others,
              then update position and velocity
      solving methods:
              all-pairs: direct body-to-body comparison, not feasible for large N
              Barnes-Hut algorithm: efficient approximation method, more advanced

                                   consists of 2 phases:
                                     - tree construction
                                     - force calculation (most compute-intensive)
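
One time step of the all-pairs method can be sketched as follows. This is an illustrative version and not the authors' implementation: the type and function names, the unit gravitational constant, the softening term and the simple Euler-style update are all assumptions.

```haskell
import Data.List (foldl')

data Vec = Vec !Double !Double !Double deriving (Show, Eq)

data Body = Body
  { mass :: !Double
  , pos  :: !Vec
  , vel  :: !Vec
  } deriving (Show, Eq)

vadd, vsub :: Vec -> Vec -> Vec
vadd (Vec a b c) (Vec x y z) = Vec (a + x) (b + y) (c + z)
vsub (Vec a b c) (Vec x y z) = Vec (a - x) (b - y) (c - z)

scale :: Double -> Vec -> Vec
scale k (Vec x y z) = Vec (k * x) (k * y) (k * z)

-- Assumed constants: G = 1 and a small softening term that keeps the
-- denominator non-zero (a body's pull on itself then comes out as zero).
gravity, eps2 :: Double
gravity = 1
eps2 = 1e-9

-- Acceleration of body b due to every body in the system (O(N) per body,
-- hence O(N^2) per step).
accel :: [Body] -> Body -> Vec
accel bs b = foldl' vadd (Vec 0 0 0) (map pull bs)
  where
    pull o =
      let d@(Vec x y z) = pos o `vsub` pos b
          r2 = x * x + y * y + z * z + eps2
      in scale (gravity * mass o / (r2 * sqrt r2)) d

-- One time step: update velocity from the acceleration, then position.
step :: Double -> [Body] -> [Body]
step dt bs = map update bs
  where
    update b =
      let v' = vel b `vadd` scale dt (accel bs b)
          p' = pos b `vadd` scale dt v'
      in b { vel = v', pos = p' }

-- Two equal bodies at rest, two units apart on the x axis.
twoBodies :: [Body]
twoBodies =
  [ Body 1 (Vec (-1) 0 0) (Vec 0 0 0)
  , Body 1 (Vec 1 0 0) (Vec 0 0 0)
  ]

main :: IO ()
main = mapM_ print (step 0.01 twoBodies)
```

The top-level map in step is independent per body, which is the place where the parallel versions discussed later introduce parallelism.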

Sequential Implementation and optimisations

      Approach: start off with an efficient seq implementation, preserving
      opportunities for parallelism
      generic optimisations:
              deforestation
              - eliminating multiple traversals of data structures
              tail-call elimination
              compiler optimisations

Haskell Optimisations

              eliminate stack overflow by making functions tail recursive
              add strictness annotations where required
              use strict fields and UNPACK pragma in datatype definitions
              shortcut fusion e.g. foldr/build
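
The first three optimisations can be illustrated with a small sketch. The Vec3 type and sumVecs function are hypothetical and not taken from the authors' code.

```haskell
{-# LANGUAGE BangPatterns #-}

-- Strict, unpacked fields: the three Doubles are stored directly in the
-- constructor rather than as pointers to possibly unevaluated thunks.
data Vec3 = Vec3 {-# UNPACK #-} !Double
                 {-# UNPACK #-} !Double
                 {-# UNPACK #-} !Double
  deriving (Show, Eq)

-- Tail-recursive sum with strict accumulators (bang patterns), so no
-- chain of suspended additions builds up while the list is traversed.
sumVecs :: [Vec3] -> Vec3
sumVecs = go 0 0 0
  where
    go !x !y !z []                  = Vec3 x y z
    go !x !y !z (Vec3 a b c : rest) = go (x + a) (y + b) (z + c) rest

main :: IO ()
main = print (sumVecs [Vec3 (fromIntegral i) 1 0 | i <- [1 .. 100000 :: Int]])
```

With lazy accumulators the same loop would build one thunk per list element before anything is added, which is the usual source of the stack overflows mentioned above.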

        Introduce parallelism at the top-level map function
        Core parallel code for Haskell
                 chunksize = (length bs) `quot` (numCapabilities * 4)
                 newbs = map f bs `using` parListChunk chunksize rdeepseq

        Parallel tuning
              chunking to control granularity: not too many small tasks, not too few
              large tasks
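
The core parallel code can be wrapped into a self-contained function as sketched below, assuming the parallel package is installed. parMapChunked is a hypothetical name, and the max 1 guard is an addition for inputs shorter than four times the number of capabilities, where the quotient would otherwise be zero.

```haskell
import Control.DeepSeq (NFData)
import Control.Parallel.Strategies (parListChunk, rdeepseq, using)
import GHC.Conc (numCapabilities)

-- Map f over the list, evaluating the results in parallel in chunks.
-- Roughly four chunks per capability are created, so that there are enough
-- tasks for load balancing without each task being too small.
parMapChunked :: NFData b => (a -> b) -> [a] -> [b]
parMapChunked f bs = map f bs `using` parListChunk chunksize rdeepseq
  where
    chunksize = max 1 (length bs `quot` (numCapabilities * 4))

main :: IO ()
main = print (sum (parMapChunked (\x -> x * x) [1 .. 1000 :: Int]))
```

numCapabilities reflects the -N value given to the runtime at start-up, so the chunk size adapts to the number of cores the program is run on.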

F# (1)
          translate Haskell code into F#
          keeping the code functional, so mostly syntax changes
          some general optimisations apply
                 + inlining

    open System.Linq
    open System.Threading.Tasks

    (* Async Workflows *)
    let pmap_async f xs =
        seq { for x in xs -> async { return f x } }
        |> Async.Parallel
        |> Async.RunSynchronously
        |> Seq.toList

    (* TPL: explicit task creation. All tasks are started first and the
       results are collected afterwards, so that waiting on one task does
       not delay starting the next. *)
    let pmap_tpl_tasks f (xs : list<_>) =
        xs
        |> List.map (fun x -> Task<_>.Factory.StartNew(fun () -> f x))
        |> List.map (fun t -> t.Result)

    (* PLINQ - uses TPL internally; output order is not guaranteed
       unless AsOrdered() is added after AsParallel(). *)
    let pmap_plinq f (xs : list<_>) =
        xs.AsParallel().Select(fun x -> f x) |> Seq.toList

F# (2)

          imperative style

    open System.Threading.Tasks

    (* Parallel.For *)
    let pmap_tpl_parfor f (xs : array<_>) =
        let new_xs = Array.zeroCreate xs.Length
        Parallel.For(0, xs.Length, (fun i ->
            new_xs.[i] <- f (xs.[i]))) |> ignore
        new_xs

          Parallel tuning
                  maximum degree of parallelism
                        - specifies max number of concurrently executing tasks
                  chunking/partitioning
                        - to control the granularity of tasks


Scala

          translate Haskell and F# implementations into Scala
          keeping the code as functional as possible
          optimisations
                 - removal of unnecessary object initialisations

    // Parallel Collections.
    bodies.par.map((b: Body) => new Body(b.mass, updatePos(b),
        updateVel(b))).seq

    // Parallel map using Futures (scala.actors.Futures, the API of that era).
    import scala.actors.Futures

    def pmap[T](f: T => T)(xs: List[T]): List[T] = {
      val tasks = xs.map((x: T) => Futures.future { f(x) })  // start all futures
      tasks.map(future => future.apply())                    // then wait for each
    }

Experimental setup

      Platforms and language implementations:

                                   Linux (64-bit)               Windows (32-bit)
                                   2.33GHz (8 cores)            2.80GHz (4 cores with HT)
                   Haskell         GHC 7.4.1                    GHC 7.4.1
                   F#              F# 2.0 / Mono 2.11.1         F# 2.0 / .NET 4.0
                   Scala           Scala 2.10 / JVM 1.7         Scala 2.10 / JVM 1.7

      Input size: 80,000 bodies, 1 iteration

Sequential Runtimes

                                   before         after
          Haskell (Linux)          479.94s        25.28s
          F# (Win)                  28.43s        21.12s
          Scala (Linux)             55.44s        39.04s

                                   Linux          Windows
          Haskell                   25.28s         17.64s
          F#                       118.12s         21.12s
          Scala                     39.04s         66.65s

Parallel Runtimes
                    cores      Haskell (EvalStrat)      F# (PLINQ)      Scala (Actors)
                     seq             25.28                118.12            39.04
                      1              26.38                196.14            40.01
                      2              14.48                120.78            22.34
                      4               7.41                 80.91            14.88
                      8               4.50                 70.67            13.26

                            Table: Runtimes in seconds, Linux (8 cores)

                    cores      Haskell (EvalStrat)      F# (PLINQ)      Scala (Actors)
                     seq             17.64                 21.12            66.65
                      1              18.05                 21.39            67.24
                      2               9.41                 17.32            58.66
                      4               6.80                 10.56            33.84
                    8 (HT)            4.77                  8.64            25.28

                            Table: Runtimes in seconds, Windows (4 cores)
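
Speedups can be derived from these tables by dividing the sequential runtime by the parallel one. The sketch below does this for the 8-core Linux column; it assumes speedup is measured against the seq row (the slides do not say which baseline the speedup plots use), and the function names are hypothetical.

```haskell
-- Speedup on p cores is T_seq / T_p; efficiency is speedup / p.
speedup :: Double -> Double -> Double
speedup tSeq tPar = tSeq / tPar

efficiency :: Int -> Double -> Double -> Double
efficiency cores tSeq tPar = speedup tSeq tPar / fromIntegral cores

-- (label, sequential runtime, 8-core runtime), copied from the Linux table.
linux8 :: [(String, Double, Double)]
linux8 =
  [ ("Haskell (EvalStrat)", 25.28, 4.50)
  , ("F# (PLINQ)", 118.12, 70.67)
  , ("Scala (Actors)", 39.04, 13.26)
  ]

main :: IO ()
main = mapM_ report linux8
  where
    report (name, tSeq, tPar) =
      putStrLn (name ++ ": speedup " ++ show (speedup tSeq tPar)
                ++ ", efficiency " ++ show (efficiency 8 tSeq tPar))
```

On these figures the 8-core Linux speedups come out at roughly 5.6 for Haskell, 1.7 for F# on Mono and 2.9 for Scala, each relative to its own sequential runtime.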

Speedups

      [Figure: speedups on Linux (8 cores)]
      [Figure: speedups on Windows (4 cores)]

Observations

      Performance
              Haskell outperforms F# and Scala on both platforms
              - only possible after many optimisations
              poor performance - F#/Mono (vs. F#/.NET runtime)
              poor performance - Scala on Windows
      Programmability
              Haskell - implication of laziness - hard to estimate how much work is
              involved
              Scala - verbose, unlike other functional languages
              in general, initial parallelism easy to specify
              chunking required to work for Haskell, but PLINQ and ParColl are
              more implicit
              F# and Scala retain much control in the implementation
              - not easy to tune parallel program


Observations

      Pragmatics
              Haskell
              - good tool support e.g. for space and time profiling
              - ThreadScope: visualisation tool to see work distribution
              F#
              - Profiling and Analysis tools available in VS2011 Beta Pro -
              Concurrency Visualizer
              Scala
              - benefits from free tools available for JVM


Conclusions

      employ skeleton-based, semi-explicit and explicit approaches to
      parallelism
      (near) best runtimes using highest-level of abstraction
      start with an optimised sequential program
      but use impure features only after parallelisation
      Haskell provides least intrusive mechanism for parallelisation
      F# provides the preferred mechanism for data-parallelism with the
      SQL-like PLINQ
      extra programming effort using actor-based code in Scala not justified

Future Work: Eden on Clusters [6]

      [Figure: Eden Iteration Skeletons: Naive NBody with 15000 bodies, 10 iterations.
      Time (s), 4 to 512, against number of processors, 1 to 128, for
      localLoop/allToAllIter and loopControl/allToAllRD, with a linear speedup line.]

     [6] Mischa Dieterle, Thomas Horstmeyer, Jost Berthold and Rita Loogen: Iterating
     Skeletons - Structured Parallelism by Composition, IFL'12
Thank you for listening!

    Full paper and sources: fhpc12.html
    Email: {pt114, pd85, H.W.Loidl}
