Haskell vs. F# vs. Scala
A High-level Language Features and Parallelism Support Comparison [1]


        Prabhat Totoo, Pantazis Deligiannis, Hans-Wolfgang Loidl

                              Dependable Systems Group
                    School of Mathematical and Computer Sciences
                              Heriot-Watt University, UK
                                     FHPC’12
                               Copenhagen, Denmark


                              15 September 2012




    [1] URL: http://www.macs.hw.ac.uk/~dsg/gph/papers/abstracts/fhpc12.html
Goal

      comparison of parallel programming support in the 3 languages
      - performance
      - programmability




      we look at:
              language features and high-level abstractions for parallelism
              experimental results from n-body problem on multi-cores
              discussion about performance, programmability and pragmatic aspects



 Totoo, Deligiannis, Loidl (HWU)   Haskell vs. F# vs. Scala         15-Sep-2012   1 / 29
Motivation




      promises of functional languages for parallel programming
              referential transparency due to the absence of side effects
              high-level abstractions through HOFs, function composition
              skeletons encapsulating common parallel patterns
      "specify what instead of how to compute something"
      SPJ: "The future of parallel is declarative"




Recent trends in language design

      mainstream languages integrating functional features in their design
              e.g. generics (Java), lambda expressions (C#, C++), delegates (C#)
              improves the expressiveness of the language by providing functional
              constructs
      Multi-paradigm languages - make functional programming more
      approachable
              F#
              - Microsoft .NET platform
              - functional-oriented; with imperative and OO constructs
              Scala
              - JVM
              - emphasis on OO, combined with powerful functional features
      library support for parallel patterns (skeletons)
              e.g. TPL


THE LANGUAGES
Summary table


                               Table: Summary of language features

                         Key Features                                 Parallelism Support
               Haskell   pure functional, lazy evaluation,            par and pseq
                         static/inferred typing                       Evaluation strategies
               F#        functional, imperative, object oriented,     Async Workflows
                         strict evaluation, static/inferred typing,   TPL
                         .NET interoperability                        PLINQ
               Scala     functional, imperative, object oriented,     Parallel Collections
                         strict evaluation, strong/inferred typing,   Actors
                         Java interoperability




Haskell [2]


        pure functional language: functions generally have no side effects
        lazy by default
        typing: static, strong, type inference
        advanced type system supporting ADTs, typeclasses, type
        polymorphism
        monads used to:
              chain computations (no default evaluation order)
              separate pure from side-effecting computations (IO monad)
        main implementation: GHC
        - highly-optimised compiler and RTS



   [2] http://haskell.org/
Haskell - parallel support


      includes support for semi-explicit parallelism through the Glasgow
      parallel Haskell (GpH) extensions
      GpH
               provides a single primitive for parallelism: par
                              par :: a -> b -> b

               to annotate an expression that can usefully be evaluated in parallel
               (potential, NOT mandatory, parallelism)
               and a second primitive to enforce sequential ordering, which is needed
               to arrange parallel computations: pseq
                              pseq :: a -> b -> b




Haskell - parallel support


        GpH e.g.
                  f = s1 + s2                          -- sequential
                  f = s1 `par` s2 `pseq` (s1 + s2)     -- parallel
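A minimal, self-contained sketch of the par/pseq pattern above. The function fib and the names sumSeq and sumPar are illustrative assumptions, not taken from the talk; par and pseq come from Control.Parallel in the parallel package.

```haskell
import Control.Parallel (par, pseq)

-- Naive Fibonacci, used only as a stand-in for an expensive pure function.
fib :: Int -> Integer
fib n
  | n < 2     = toInteger n
  | otherwise = fib (n - 1) + fib (n - 2)

-- Sequential version: both summands are evaluated by the same thread.
sumSeq :: Int -> Int -> Integer
sumSeq a b = fib a + fib b

-- Parallel version: spark s1, evaluate s2 on the current thread,
-- then combine the two results.
sumPar :: Int -> Int -> Integer
sumPar a b = s1 `par` (s2 `pseq` (s1 + s2))
  where
    s1 = fib a
    s2 = fib b

main :: IO ()
main = print (sumPar 25 26 == sumSeq 25 26)  -- prints True
```

The spark is only a hint: the program gives the same result on one core, and actually runs in parallel only when compiled with -threaded and started with +RTS -N.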

        Evaluation strategies
              abstractions built on top of the basic primitives
              separates coordination aspects from main computation
        parallel map using strategies
                  parMap strat f xs = map f xs `using` parList strat

              parList: sparks each element of a list using the given strategy
              strat: e.g. rdeepseq, a strategy that fully evaluates its argument
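The strategy-based parallel map can be tried out with a sketch like the following. It assumes the parallel package; parMapWith is named differently from the library's own parMap on purpose, and sumDivisors is a hypothetical workload chosen only to give each element some work.

```haskell
import Control.Parallel.Strategies (Strategy, parList, rdeepseq, using)

-- Parallel map in the style shown above: the computation (map f xs) is
-- kept separate from the coordination (parList strat).
parMapWith :: Strategy b -> (a -> b) -> [a] -> [b]
parMapWith strat f xs = map f xs `using` parList strat

-- Hypothetical workload: sum of all divisors of n.
sumDivisors :: Int -> Int
sumDivisors n = sum [d | d <- [1 .. n], n `mod` d == 0]

main :: IO ()
main = do
  let inputs = [1 .. 2000]
  -- The strategy changes how the list is evaluated, not its value.
  print (parMapWith rdeepseq sumDivisors inputs == map sumDivisors inputs)
```

Removing the `using` clause gives back the sequential program, which is the separation of coordination from computation that evaluation strategies aim for.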




Haskell - parallel support


      Other abstractions for parallelism:
              Par monad
              - more explicit approach
              - uses IVars, put and get for communication
              - abstracts some common functions e.g. parallel map
              DPH
              - exploits data parallelism, both regular and nested-data
              Eden
              - uses process abstraction, similar to λ-abstraction
              - for distributed-memory, but also good performance on shared-memory
              machines [3]
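For the Par monad, a small sketch of the IVar-based style might look as follows. It assumes the monad-par package (Control.Monad.Par); parPair and work are hypothetical names used only for illustration.

```haskell
import Control.Monad.Par (fork, get, new, put, runPar)

-- Hypothetical workload: sum of all divisors of n.
work :: Int -> Int
work n = sum [d | d <- [1 .. n], n `mod` d == 0]

-- Two-way fork/join written with explicit IVars: each forked computation
-- writes its result with put, and the parent reads both with get.
parPair :: Int -> Int -> (Int, Int)
parPair a b = runPar $ do
  ia <- new
  ib <- new
  fork (put ia (work a))
  fork (put ib (work b))
  ra <- get ia
  rb <- get ib
  return (ra, rb)

main :: IO ()
main = print (parPair 1000 2000 == (work 1000, work 2000))  -- prints True
```

Compared with par and pseq, the dataflow is spelled out explicitly: get blocks until the corresponding put has happened, and runPar keeps the whole computation pure from the outside.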



   [3] Prabhat Totoo, Hans-Wolfgang Loidl. Parallel Haskell implementations of the n-body problem.
F# [4]


       functional-oriented language
       SML-like language with OO extensions
       implemented on top of .NET, interoperable with languages such as
       C#
       can make use of arbitrary .NET libraries from F#
       strict by default, but has lazy values as well
       advanced type system: discriminated union (=ADT), object types
       (for .NET interoperability)
       value vs. variable



  [4] http://research.microsoft.com/fsharp/
F# - parallel support


      uses the .NET Parallel Extensions library
      - high-level constructs to write/execute parallel programs
      Task Parallel Library (TPL)
      - hides low-level thread creation, management, scheduling details
              Tasks
              - main construct for task parallelism
              Task: provides high-level abstraction compared to working directly
              with threads; does not return a value
              Task<TResult>: represents an operation that calculates a value of
              type TResult eventually i.e. a future




F# - parallel support


      Task Parallel Library (TPL)
              Parallel Class - for data parallelism
              - basic loop parallelisation using Parallel.For and
              Parallel.ForEach
              PLINQ - declarative model for data parallelism
              - uses tasks internally
      Async Workflows
      - use the async {...} construct; do not block the calling thread;
      provide basic parallelisation
      - intended mainly for operations involving I/O e.g. running multiple
      downloads in parallel




Scala [5]



        designed to be a “better Java”...
        multiparadigm language; integrating OO and functional model
        evaluation: strict; typing: static, strong and inferred
        strong interoperability with Java (targets the JVM)
        still closely tied to the concept of objects
        language expressiveness is extended by functional features




   [5] http://www.scala-lang.org/
Scala - parallel support


      Parallel Collections Framework
               implicit (data) parallelism
               use par on sequential collection to invoke parallel implementation
               subsequent operations are parallel
               use seq to turn collection back to sequential
                              xs.map(x => f(x))            // sequential

                              xs.par.map(x => f(x)).seq    // parallel

               built on top of Fork/Join framework
               thread pool implementation schedules tasks among available processors




Scala - parallel support

      Actors
               message-passing model; no shared state
               similar concurrency model to Erlang
               lightweight processes communicate by exchanging asynchronous
               messages
               messages go into mailboxes; processed using pattern matching
                             a ! msg                        // send message

                             receive {                      // process mailbox
                               case msg_pattern_1 => action_1
                               case msg_pattern_2 => action_2
                               case msg_pattern_3 => action_3
                             }




IMPLEMENTATION
N-body problem

      problem of predicting and simulating the motion of a system of N
      bodies that interact with each other gravitationally
      simulation proceeds over a number of time steps
      in each step
              calculate acceleration of each body wrt the others,
              then update position and velocity
      solving methods:
              all-pairs: direct body-to-body comparison, not feasible for large N
              Barnes-Hut algorithm: efficient approximation method, more advanced
              consists of 2 phases:
              - tree construction
              - force calculation (most compute-intensive)
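The per-step structure described above (compute accelerations, then update velocity and position) can be sketched for the all-pairs method as follows. This is not the authors' implementation: the types Vec and Body, the softening term eps, the value of gConst and the velocity-then-position update order are all assumptions made for illustration.

```haskell
-- Illustrative 3D vector and body types with strict fields.
data Vec = Vec !Double !Double !Double deriving (Eq, Show)

data Body = Body
  { mass :: !Double
  , pos  :: !Vec
  , vel  :: !Vec
  } deriving (Eq, Show)

vadd, vsub :: Vec -> Vec -> Vec
vadd (Vec a b c) (Vec x y z) = Vec (a + x) (b + y) (c + z)
vsub (Vec a b c) (Vec x y z) = Vec (a - x) (b - y) (c - z)

scale :: Double -> Vec -> Vec
scale k (Vec x y z) = Vec (k * x) (k * y) (k * z)

norm2 :: Vec -> Double
norm2 (Vec x y z) = x * x + y * y + z * z

-- Gravitational constant and softening term (assumed values).
gConst, eps :: Double
gConst = 6.674e-11
eps    = 1.0e-9

-- Acceleration of body b due to every body in the system (all-pairs).
-- The softening term keeps the divisor non-zero, so the contribution of
-- b to itself is the zero vector and needs no special case.
accel :: [Body] -> Body -> Vec
accel bs b = foldr vadd (Vec 0 0 0) (map contrib bs)
  where
    contrib o =
      let d  = pos o `vsub` pos b
          r2 = norm2 d + eps
      in scale (gConst * mass o / (r2 * sqrt r2)) d

-- One time step of length dt: update velocity first, then position.
step :: Double -> [Body] -> [Body]
step dt bs = map update bs
  where
    update b =
      let v' = vel b `vadd` scale dt (accel bs b)
          p' = pos b `vadd` scale dt v'
      in b { vel = v', pos = p' }

main :: IO ()
main = mapM_ print (step 1.0 bodies)
  where
    bodies =
      [ Body 1.0e10 (Vec (-1) 0 0) (Vec 0 0 0)
      , Body 1.0e10 (Vec 1 0 0) (Vec 0 0 0)
      ]
```

Each call to accel traverses all N bodies, so one step costs O(N^2), which is why the all-pairs method does not scale to large N. The top-level map in step is the natural place to introduce parallelism.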


Sequential Implementation and optimisations



      Approach: start off with an efficient seq implementation, preserving
      opportunities for parallelism
      generic optimisations:
              deforestation
              - eliminating multiple traversals of data structures
              tail-call elimination
              compiler optimisations




Haskell
        Optimisations
              eliminate stack overflows by making functions tail-recursive
              add strictness annotations where required
              use strict fields and UNPACK pragma in datatype definitions
              shortcut fusion e.g. foldr/build
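The first three optimisations in the list can be illustrated with a short sketch. The type Vec3 and the functions addV, sumVecs and sumTo are hypothetical examples, not code from the talk.

```haskell
import Data.List (foldl')

-- Strict fields with UNPACK: the three Doubles are stored directly in the
-- constructor instead of as pointers to possibly unevaluated thunks.
data Vec3 = Vec3 {-# UNPACK #-} !Double
                 {-# UNPACK #-} !Double
                 {-# UNPACK #-} !Double
  deriving (Eq, Show)

addV :: Vec3 -> Vec3 -> Vec3
addV (Vec3 a b c) (Vec3 x y z) = Vec3 (a + x) (b + y) (c + z)

-- Strict left fold: the accumulator is forced at every step, so no chain
-- of suspended additions builds up on long lists.
sumVecs :: [Vec3] -> Vec3
sumVecs = foldl' addV (Vec3 0 0 0)

-- Explicit tail recursion with a strictness annotation via seq.
sumTo :: Int -> Int
sumTo n = go 0 1
  where
    go acc i
      | i > n     = acc
      | otherwise = let acc' = acc + i in acc' `seq` go acc' (i + 1)

main :: IO ()
main = do
  print (sumVecs [Vec3 1 2 3, Vec3 4 5 6])  -- prints Vec3 5.0 7.0 9.0
  print (sumTo 100 == 5050)                 -- prints True
```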

        Introduce parallelism at the top-level map function
        Core parallel code for Haskell
                 chunksize = (length bs) `quot` (numCapabilities * 4)
                 newbs = map f bs `using` parListChunk chunksize rdeepseq


        Parallel tuning
              chunking to control granularity: neither too many small tasks nor too few
              large tasks
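A runnable variant of the chunked parallel map is sketched below, assuming the parallel package. The name chunkedMap, the workload sumDivisors and the `max 1` guard for short lists are additions for illustration; the factor of four chunks per capability follows the core parallel code shown on this slide.

```haskell
import Control.Parallel.Strategies (parListChunk, rdeepseq, using)
import GHC.Conc (numCapabilities)

-- Parallel map over chunks of the input list. Each chunk becomes one
-- spark, so the chunk size controls task granularity.
chunkedMap :: (Int -> Int) -> [Int] -> [Int]
chunkedMap f bs = map f bs `using` parListChunk chunksize rdeepseq
  where
    -- Roughly four chunks per capability; the guard (an assumption added
    -- here) avoids a chunk size of zero for short lists.
    chunksize = max 1 (length bs `quot` (numCapabilities * 4))

-- Hypothetical workload: sum of all divisors of n.
sumDivisors :: Int -> Int
sumDivisors n = sum [d | d <- [1 .. n], n `mod` d == 0]

main :: IO ()
main = do
  let inputs = [1 .. 4000]
  print (chunkedMap sumDivisors inputs == map sumDivisors inputs)
```

Using more chunks than capabilities leaves the runtime some room for load balancing when chunks take different amounts of time.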


F# (1)
          translate Haskell code into F#
          keeping the code functional, so mostly syntax changes
          some general optimisations apply
                 + inlining

    (* Async Workflows *)
    let pmap_async f xs =
        seq { for x in xs -> async { return f x } }
        |> Async.Parallel
        |> Async.RunSynchronously
        |> Seq.toList

    (* TPL *)
    (* Explicit tasks creation. *)
    let pmap_tpl_tasks f (xs: list<_>) =
        List.map (fun x ->
            Task<_>.Factory.StartNew(fun () -> f x).Result
        ) xs

    (* PLINQ - uses TPL internally *)
    let pmap_plinq f (xs: list<_>) =
        xs.AsParallel().Select(fun x -> f x) |> Seq.toList


F# (2)

          imperative style
    (* Parallel.For *)
    let pmap_tpl_parfor f (xs: array<_>) =
        let new_xs = Array.zeroCreate xs.Length
        Parallel.For(0, xs.Length, (fun i ->
            new_xs.[i] <- f (xs.[i]))) |> ignore
        new_xs


          Parallel tuning
                  maximum degree of parallelism
                        - specifies max number of concurrently executing tasks
                  chunking/partitioning
                        - to control the granularity of tasks




Scala

          translate Haskell and F# implementations into Scala
          keeping the code as functional as possible
          optimisations
                 - removal of unnecessary object initialisations

    // Parallel Collections.
    bodies.par.map((b: Body) => new Body(b.mass, updatePos(b),
        updateVel(b))).seq

    // Parallel map using Futures.
    def pmap[T](f: T => T)(xs: List[T]): List[T] = {
      val tasks = xs.map((x: T) => Futures.future { f(x) })
      tasks.map(future => future.apply())
    }




RESULTS
Experimental setup


      Platforms and language implementations:


                                   Linux (64-bit)               Windows (32-bit)
                                   2.33GHz (8 cores)            2.80GHz (4 cores with HT)
                   Haskell         GHC 7.4.1                    GHC 7.4.1
                   F#              F# 2.0 / Mono 2.11.1         F# 2.0 / .NET 4.0
                   Scala           Scala 2.10 / JVM 1.7         Scala 2.10 / JVM 1.7


      Input size: 80,000 bodies, 1 iteration




Sequential Runtimes



                                   before        after
          Haskell (Linux)          479.94s       25.28s
          F# (Win)                  28.43s       21.12s
          Scala (Linux)             55.44s       39.04s


                                   Linux         Windows
          Haskell                  25.28s        17.64s
          F#                       118.12s       21.12s
          Scala                    39.04s        66.65s




Parallel Runtimes
                             Haskell (EvalStrat)       F# (PLINQ)       Scala (Actors)
                     seq            25.28                  118.12           39.04
                      1             26.38                  196.14           40.01
                      2             14.48                  120.78           22.34
                      4             7.41                    80.91           14.88
                      8             4.50                    70.67           13.26

                                       Table: Linux (8 cores)



                               Haskell (EvalStrat)       F# (PLINQ)      Scala (Actors)
                    seq              17.64                    21.12          66.65
                     1               18.05                    21.39          67.24
                     2                9.41                    17.32          58.66
                     4                6.80                    10.56          33.84
                  8 (HT)             4.77                     8.64           25.28

                                     Table: Windows (4 cores)
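The speedups implied by the Linux table can be recomputed with a small sketch. Two assumptions are made here: speedup is taken as sequential runtime divided by parallel runtime, and the unlabelled first column is read as the number of cores. The runtimes themselves are copied from the table.

```haskell
-- Sequential runtimes in seconds on Linux: (Haskell, F#, Scala).
linuxSeq :: (Double, Double, Double)
linuxSeq = (25.28, 118.12, 39.04)

-- Parallel runtimes in seconds on Linux: (cores, Haskell, F#, Scala).
linuxPar :: [(Int, Double, Double, Double)]
linuxPar =
  [ (1, 26.38, 196.14, 40.01)
  , (2, 14.48, 120.78, 22.34)
  , (4,  7.41,  80.91, 14.88)
  , (8,  4.50,  70.67, 13.26)
  ]

-- Speedup relative to the sequential runtime: T_seq / T_n.
speedups :: (Double, Double, Double)
         -> [(Int, Double, Double, Double)]
         -> [(Int, Double, Double, Double)]
speedups (h0, f0, s0) rows =
  [ (n, h0 / h, f0 / f, s0 / s) | (n, h, f, s) <- rows ]

main :: IO ()
main = mapM_ print (speedups linuxSeq linuxPar)
```

The same function applies to the Windows table by substituting its rows.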


Speedups
Linux (8 cores)




Speedups
Windows (4 cores)




Observations

      Performance
              Haskell outperforms F# and Scala on both platforms
              - only possible after many optimisations
              poor performance - F#/Mono (vs. F#/.NET runtime)
              poor performance - Scala on Windows
      Programmability
              Haskell - implication of laziness - hard to estimate how much work is
              involved
              Scala - verbose, unlike other functional languages
              in general, initial parallelism easy to specify
              chunking is required for Haskell, whereas PLINQ and Parallel Collections
              are more implicit
              the F# and Scala implementations retain much of the control
              - parallel programs are not easy to tune



Observations



      Pragmatics
              Haskell
              - good tool support e.g. for space and time profiling
              - ThreadScope: visualisation tool to see work distribution
              F#
              - profiling and analysis tools available in VS2011 Beta Pro, e.g.
              Concurrency Visualizer
              Scala
              - benefits from free tools available for JVM




Conclusions


      we employ skeleton-based, semi-explicit and explicit approaches to
      parallelism
      (near) best runtimes are achieved using the highest level of abstraction
      start with an optimised sequential program
      but use impure features only after parallelisation
      Haskell provides the least intrusive mechanism for parallelisation
      F# provides the preferred mechanism for data parallelism with the
      SQL-like PLINQ
      the extra programming effort of actor-based code in Scala is not justified




Future Work: Eden on Clusters [6]
      [Figure: Eden Iteration Skeletons: Naive NBody with 15000 bodies, 10 iterations.
      Time (s) against Processors (1 to 128); series: localLoop/allToAllIter,
      loopControl/allToAllRD, linear speedup]

   [6] Mischa Dieterle, Thomas Horstmeyer, Jost Berthold and Rita Loogen: Iterating Skeletons - Structured Parallelism by Composition, IFL’12
Thank you for listening!




    Full paper and sources:
    http://www.macs.hw.ac.uk/~dsg/gph/papers/abstracts/fhpc12.html
    Email:
    {pt114, pd85, H.W.Loidl}@hw.ac.uk

More Related Content

Haskell vs. F# vs. Scala

  • 1. Haskell vs. F# vs. Scala A High-level Language Features and Parallelism Support Comparison1 Prabhat Totoo, Pantazis Deligiannis, Hans-Wolfgang Loidl Denpendable Systems Group School of Mathematical and Computer Sciences Heriot-Watt University, UK FHPC’12 Copenhagen, Denmark 15 September 2012 1 URL:http://www.macs.hw.ac.uk/~dsg/gph/papers/abstracts/fhpc12.html
  • 2. Goal comparison of parallel programming support in the 3 languages - performance - programmability we look at: language features and high-level abstractions for parallelism experimental results from n-body problem on multi-cores discussion about performance, programmability and pragmatic aspects Totoo, Deligiannis, Loidl (HWU) Haskell vs. F# vs. Scala 15-Sep-2012 1 / 29
  • 3. Motivation promises of functional languages for parallel programming referential transparency due to the absence of side effects high-level abstractions through HOFs, function composition skeletons encapsulating common parallel patterns ”specify what instead of how to compute something ” SPJ: ”The future of parallel is declarative ” Totoo, Deligiannis, Loidl (HWU) Haskell vs. F# vs. Scala 15-Sep-2012 2 / 29
  • 4. Recent trends in language design mainstream languages integrating functional features in their design e.g. Generics (Java), lambda expression (C#, C++), delegates (C#) improves expressiveness in the language by providing functional constructs Multi-paradigm languages - make functional programming more approachable F# - Microsoft .NET platform - functional-oriented; with imperative and OO constructs Scala - JVM - emphasize on OO but combined with powerful functional features library support for parallel patterns (skeletons) e.g. TPL Totoo, Deligiannis, Loidl (HWU) Haskell vs. F# vs. Scala 15-Sep-2012 3 / 29
  • 6. Summary table Table: Summary of language features Key Features Parallelism Support Haskell pure functional, lazy evaluation, par and pseq static/inferred typing Evaluation strategies F# functional, imperative, object oriented, Async Workflows strict evaluation, static/inferred typing, TPL .NET interoperability PLINQ Scala functional, imperative, object oriented, Parallel Collections strict evaluation, strong/inferred typing, Actors Java interoperability Totoo, Deligiannis, Loidl (HWU) Haskell vs. F# vs. Scala 15-Sep-2012 4 / 29
  • 7. Haskell2 pure functional language, generally functions have no side-effects lazy by default typing: static, strong, type inference advanced type system supporting ADTs, typeclasses, type polymorphism monads used to: chain computation (no default eval order) IO monad separates pure from side-effecting computations main implementation: GHC - highly-optimised compiler and RTS 2 http://haskell.org/ Totoo, Deligiannis, Loidl (HWU) Haskell vs. F# vs. Scala 15-Sep-2012 5 / 29
  • 8. Haskell - parallel support includes support for semi-explicit parallelism through the Glasgow parallel Haskell (GpH) extensions GpH provides a single primitive for parallelism: par 1 p a r : : a −> b −> b to annotate expression that can usefully be evaluated in parallel (potential NOT mandatory parallelism) and a second primitive to enforce sequential ordering which is needed to arrange parallel computations: pseq 1 p s e q : : a −> b −> b Totoo, Deligiannis, Loidl (HWU) Haskell vs. F# vs. Scala 15-Sep-2012 6 / 29
  • 9. Haskell - parallel support GpH e.g. 1 f = s1 + s2 −− s e q u e n t i a l 2 f = s 1 ‘ par ‘ s 2 ‘ pseq ‘ ( s 1 + s 2 ) −− p a r a l l e l Evaluation strategies abstractions built on top of the basic primitives separates coordination aspects from main computation parallel map using strategies 1 parMap s t r a t f x s = map f x s ‘ u s i n g ‘ p a r L i s t s t r a t parList Spark each of the elements of a list using the given strategy strat e.g. rdeepseq Strategy that fully evaluates its argument Totoo, Deligiannis, Loidl (HWU) Haskell vs. F# vs. Scala 15-Sep-2012 7 / 29
  • 10. Haskell - parallel support Other abstractions for parallelism: Par monad - more explicit approach - uses IVars, put and get for communication - abstract some common functions e.g. parallel map DPH - exploits data parallelism, both regular and nested-data Eden - uses process abstraction, similar to λ-abstraction - for distributed-memory but also good performance on shared-memory machines3 3 Prabhat Totoo, Hans-Wolfgang Loidl. Parallel Haskell implementations of the n-body problem. Totoo, Deligiannis, Loidl (HWU) Haskell vs. F# vs. Scala 15-Sep-2012 8 / 29
  • 11. F#4 functional-oriented language SML-like language with OO extensions implemented on top of .NET, interoperable with languages such as C# can make use of arbitrary .NET libraries from F# strict by default, but has lazy values as well advanced type system: discriminated union (=ADT), object types (for .NET interoperability) value vs. variable 4 http://research.microsoft.com/fsharp/ Totoo, Deligiannis, Loidl (HWU) Haskell vs. F# vs. Scala 15-Sep-2012 9 / 29
  • 12. F# - parallel support uses the .NET Parallel Extensions library - high-level constructs to write/execute parallel programs Tasks Parallel Library (TPL) - hide low-level thread creation, management, scheduling details Tasks - main construct for task parallelism Task: provides high-level abstraction compared to working directly with threads; does not return a value Task<TResult>: represents an operation that calculates a value of type TResult eventually i.e. a future Totoo, Deligiannis, Loidl (HWU) Haskell vs. F# vs. Scala 15-Sep-2012 10 / 29
  • 13. F# - parallel support Tasks Parallel Library (TPL) Parallel Class - for data parallelism - basic loop parallelisation using Parallel.For and Parallel.ForEach PLINQ - declarative model for data parallelism - uses tasks internally Async Workflows - use the async {...} keyword; doesn’t block calling thread - provide basic parallelisation - intended mainly for operations involving I/O e.g. run multiple downloads in parallel Totoo, Deligiannis, Loidl (HWU) Haskell vs. F# vs. Scala 15-Sep-2012 11 / 29
  • 14. Scala5 designed to be a “better Java”... multiparadigm language; integrating OO and functional model evaluation: strict; typing: static, strong and inferred strong interoperability with Java (targeted for JVM) still very tied with the concept of objects language expressiveness is extended by functional features 5 http://www.scala-lang.org/ Totoo, Deligiannis, Loidl (HWU) Haskell vs. F# vs. Scala 15-Sep-2012 12 / 29
  • 15. Scala - parallel support Parallel Collections Framework implicit (data) parallelism use par on sequential collection to invoke parallel implementation subsequent operations are parallel use seq to turn collection back to sequential 1 x s . map ( x => f x ) // s e q u e n t i a l 2 3 x s . p a r . map ( x => f x ) . s e q // p a r a l l e l built on top of Fork/Join framework thread pool implementation schedules tasks among available processors Totoo, Deligiannis, Loidl (HWU) Haskell vs. F# vs. Scala 15-Sep-2012 13 / 29
  • 16. Scala - parallel support Actors message-passing model; no shared state similar concurrency model to Erlang lightweight processes communicates by exchanging asynchronous messages messages go into mailboxes; processed using pattern matching 1 a ! msg // s e n d message 2 3 receive { // p r o c e s s m a i l b o x 4 case msg pattern 1 => a c t i o n 1 5 case msg pattern 2 => a c t i o n 2 6 case msg pattern 3 => a c t i o n 3 7 } Totoo, Deligiannis, Loidl (HWU) Haskell vs. F# vs. Scala 15-Sep-2012 14 / 29
  • 18. N-body problem problem of predicting and simulating the motion of a system of N bodies that interact with each other gravitationally simulation proceeds over a number of time steps in each step calculate acceleration of each body wrt the others, then update position and velocity solving methods: all-pairs: direct body-to-body comparison, not feasible for large N Barnes-Hut algorithm: efficient approximation method, more advanced consists of 2 phases: - tree construction - force calculation (most compute-intensive) Totoo, Deligiannis, Loidl (HWU) Haskell vs. F# vs. Scala 15-Sep-2012 15 / 29
  • 19. Sequential Implementation and optimisations Approach: start off with an efficient seq implementation, preserving opportunities for parallelism generic optimisations: deforestation - eliminating multiple traversal of data structures tail-call elimination compiler optimisations Totoo, Deligiannis, Loidl (HWU) Haskell vs. F# vs. Scala 15-Sep-2012 16 / 29
  • 20. Haskell Optimisations eliminates stack overflow by making functions tail recursive add strictness annotations where required use strict fields and UNPACK pragma in datatype definitions shortcut fusion e.g. foldr/build Introduce parallelism at the top-level map function Core parallel code for Haskell 1 c h u n k s i z e = ( l e n g t h b s ) ‘ quot ‘ ( n u m C a p a b i l i t i e s ∗ 4 ) 2 n e w b s = map f b s ‘ u s i n g ‘ p a r L i s t C h u n k c h u n k s i z e rdeepseq Parallel tuning chunking to control granularity, not too many small tasks; not too few large tasks Totoo, Deligiannis, Loidl (HWU) Haskell vs. F# vs. Scala 15-Sep-2012 17 / 29
  • 21. F# (1) translate Haskell code into F# keeping the code functional, so mostly syntax changes some general optimisations apply + inlining 1 ( ∗ TPL ∗ ) 1 ( ∗ Async W o r k f l o w s ∗ ) 2 2 l e t pmap async f xs = 3 (∗ E x p l i c i t t a s k s c r e a t i o n . ∗) 3 s e q { f o r x i n x s −> a s y n c { 4 l e t p m a p t p l t a s k s f ( x s : l i s t < >) return f x } } = 4 |> Async . P a r a l l e l 5 L i s t . map ( f u n x −> 5 |> Async . R u n S y n c h r o n o u s l y 6 Task< >. F a c t o r y . S t a r t N e w ( f u n 6 |> Seq . t o L i s t ( ) −> f x ) . R e s u l t 7 ) xs 1 ( ∗ PLINQ − u s e s TPL i n t e r n a l l y ∗ ) 2 l e t p m a p p l i n q f ( x s : l i s t < >) = 3 x s . A s P a r a l l e l ( ) . S e l e c t ( f u n x −> f x ) |> Seq . t o L i s t Totoo, Deligiannis, Loidl (HWU) Haskell vs. F# vs. Scala 15-Sep-2012 18 / 29
  • 22. F# (2) imperative style 1 (∗ P a r a l l e l . For ∗) 2 3 l e t p m a p t p l p a r f o r f ( x s : a r r a y < >) = 4 l e t new xs = Array . z e r o C r e a t e xs . Length 5 P a r a l l e l . F o r ( 0 , x s . Length , ( f u n i −> 6 n e w x s . [ i ] <− f ( x s . [ i ] ) ) ) |> i g n o r e 7 new xs Parallel tuning maximum degree of parallelism - specifies max number of concurrently executing tasks chunking/partitioning - to control the granularity of tasks Totoo, Deligiannis, Loidl (HWU) Haskell vs. F# vs. Scala 15-Sep-2012 19 / 29
  • 23. Scala
      translate Haskell and F# implementations into Scala
      keeping the code as functional as possible
      optimisations - removal of unnecessary object initialisations
          // Parallel Collections.
          bodies.par.map((b: Body) => new Body(b.mass, updatePos(b), updateVel(b))).seq

          // Parallel map using Futures (scala.actors.Futures in Scala 2.10).
          def pmap[T](f: T => T)(xs: List[T]): List[T] = {
            val tasks = xs.map((x: T) => Futures.future { f(x) })
            tasks.map(future => future.apply())
          }
  • 25. Experimental setup
      Platforms and language implementations:
                   Linux (64-bit)          Windows (32-bit)
          CPU      2.33GHz (8 cores)       2.80GHz (4 cores with HT)
          Haskell  GHC 7.4.1               GHC 7.4.1
          F#       F# 2.0 / Mono 2.11.1    F# 2.0 / .NET 4.0
          Scala    Scala 2.10 / JVM 1.7    Scala 2.10 / JVM 1.7
      Input size: 80,000 bodies, 1 iteration
  • 29. Sequential Runtimes
      Before and after optimisation (seconds):
                           before    after
          Haskell (Linux)  479.94    25.28
          F# (Win)          28.43    21.12
          Scala (Linux)     55.44    39.04
      Optimised runtimes by platform (seconds):
                   Linux     Windows
          Haskell   25.28     17.64
          F#       118.12     21.12
          Scala     39.04     66.65
  • 30. Parallel Runtimes
      Table: Linux (8 cores), runtimes in seconds
          cores   Haskell (EvalStrat)   F# (PLINQ)   Scala (Actors)
          seq     25.28                 118.12       39.04
          1       26.38                 196.14       40.01
          2       14.48                 120.78       22.34
          4        7.41                  80.91       14.88
          8        4.50                  70.67       13.26
      Table: Windows (4 cores), runtimes in seconds
          cores   Haskell (EvalStrat)   F# (PLINQ)   Scala (Actors)
          seq     17.64                 21.12        66.65
          1       18.05                 21.39        67.24
          2        9.41                 17.32        58.66
          4        6.80                 10.56        33.84
          8 (HT)   4.77                  8.64        25.28
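As a worked reading of the Linux table, speedups can be derived from the listed runtimes. The sketch below assumes speedup is taken relative to the sequential ("seq") runtime, which the slides do not state explicitly; under that assumption Haskell on 8 cores gives 25.28 / 4.50, roughly 5.6. The names `linuxRuns`, `speedup` and `report` are illustrative only.

```haskell
import Text.Printf (printf)

-- Runtimes in seconds, copied from the Linux (8 cores) table.
-- Each entry: (label, sequential runtime, [(cores, parallel runtime)]).
linuxRuns :: [(String, Double, [(Int, Double)])]
linuxRuns =
  [ ("Haskell (EvalStrat)", 25.28, [(1, 26.38), (2, 14.48), (4, 7.41), (8, 4.50)])
  , ("F# (PLINQ)", 118.12, [(1, 196.14), (2, 120.78), (4, 80.91), (8, 70.67)])
  , ("Scala (Actors)", 39.04, [(1, 40.01), (2, 22.34), (4, 14.88), (8, 13.26)])
  ]

-- Assumed definition: sequential runtime divided by parallel runtime.
speedup :: Double -> Double -> Double
speedup tSeq tPar = tSeq / tPar

-- Print one line per core count for a single language.
report :: (String, Double, [(Int, Double)]) -> IO ()
report (name, tSeq, runs) = do
  putStrLn name
  mapM_ (\(n, t) -> printf "  %d cores: %.2f\n" n (speedup tSeq t)) runs

main :: IO ()
main = mapM_ report linuxRuns
```

The Windows figures can be added as a second list in the same shape.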
  • 31. Speedups: Linux (8 cores) [figure]
  • 32. Speedups: Windows (4 cores) [figure]
  • 33. Observations
      Performance
          Haskell outperforms F# and Scala on both platforms
          - only possible after many optimisations
          poor performance - F#/Mono (vs. F#/.NET runtime)
          poor performance - Scala on Windows
      Programmability
          Haskell - implication of laziness - hard to estimate how much work is involved
          Scala - verbose, unlike other functional languages
          in general, initial parallelism is easy to specify
          chunking is required for Haskell to work well, whereas PLINQ and parallel collections (ParColl) are more implicit
          F# and Scala retain much control in the implementation - not easy to tune the parallel program
  • 34. Observations
      Pragmatics
          Haskell - good tool support, e.g. for space and time profiling
          - threadscope: visualisation tool to see work distribution
          F# - Profiling and Analysis tools available in VS2011 Beta Pro
          - Concurrency Visualizer
          Scala - benefits from free tools available for JVM
  • 35. Conclusions
      employ skeleton-based, semi-explicit and explicit approaches to parallelism
      (near) best runtimes using highest-level of abstraction
      start with an optimised sequential program but use impure features only after parallelisation
      Haskell provides least intrusive mechanism for parallelisation
      F# provides the preferred mechanism for data-parallelism with the SQL-like PLINQ
      extra programming effort using actor-based code in Scala not justified
  • 36. Future Work: Eden on Clusters [6]
      [figure: Eden Iteration Skeletons, naive NBody with 15000 bodies, 10 iterations; time (s) against number of processors (1 to 128); series: localLoop/allToAllIter, loopControl/allToAllRD, linear speedup]
      [6] Mischa Dieterle, Thomas Horstmeyer, Jost Berthold and Rita Loogen: Iterating Skeletons - Structured Parallelism by Composition, IFL'12
  • 37. Thank you for listening!
      Full paper and sources: http://www.macs.hw.ac.uk/~dsg/gph/papers/abstracts/fhpc12.html
      Email: {pt114, pd85, H.W.Loidl}@hw.ac.uk