Papers by Bruno da Silva Barbosa
This report presents a study of LU decomposition with partial pivoting in MATLAB. We start from two provided MATLAB codes, based on BLAS2 and BLAS3 operations, and implement partial pivoting in both. The first one, BLAS2LU.m, factorizes a matrix with m rows and n columns, where m ≥ n. The second one, BLAS3LU.m, performs a block LU factorization and calls BLAS2LU to factorize the individual blocks. Both codes are initially without pivoting. The main goal of this work is to modify the original codes and implement partial pivoting in BLAS2LU.m and BLAS3LU.m. Our partial pivoting implementations are called BLAS2LUPP and BLAS3LUPP, respectively. In the experimental component of this work, we test both codes with randomly generated matrices of different dimensions. For both solutions, we compute the numerical error using the permutation matrix and carry out a speedup analysis to draw some conclusions.
This is an academic work developed at the University of Minho.
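As a rough illustration of the partial pivoting idea in the LU abstract above, the following is a minimal sketch in Python rather than MATLAB. It is not the report's BLAS2LUPP or BLAS3LUPP code: the function names `lu_partial_pivoting` and `factorization_error` are hypothetical, the sketch is unblocked, and it assumes a square matrix, whereas the report also covers m ≥ n. The error is measured as the largest entry of PA − LU, which is one plausible reading of "numerical error using the permutation matrix".

```python
import random


def lu_partial_pivoting(a):
    """Return (perm, lower, upper) such that P*A = L*U for a square matrix.

    perm is a list of row indices: row i of P*A is row perm[i] of A.
    """
    n = len(a)
    u = [row[:] for row in a]              # working copy, becomes U
    l = [[0.0] * n for _ in range(n)]      # unit lower triangular factor
    perm = list(range(n))
    for k in range(n):
        # Partial pivoting: choose the largest |entry| in column k, rows k..n-1.
        p = max(range(k, n), key=lambda i: abs(u[i][k]))
        if u[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            u[k], u[p] = u[p], u[k]
            l[k], l[p] = l[p], l[k]        # swap already computed multipliers
            perm[k], perm[p] = perm[p], perm[k]
        l[k][k] = 1.0
        for i in range(k + 1, n):
            l[i][k] = u[i][k] / u[k][k]
            u[i][k] = 0.0                  # eliminated entry
            for j in range(k + 1, n):
                u[i][j] -= l[i][k] * u[k][j]
    return perm, l, u


def factorization_error(a, perm, l, u):
    """Largest absolute entry of P*A - L*U."""
    n = len(a)
    err = 0.0
    for i in range(n):
        for j in range(n):
            lu_ij = sum(l[i][k] * u[k][j] for k in range(n))
            err = max(err, abs(a[perm[i]][j] - lu_ij))
    return err


# Usage: factorize a random 6 x 6 matrix and report the error.
rng = random.Random(0)
a = [[rng.uniform(-1.0, 1.0) for _ in range(6)] for _ in range(6)]
perm, l, u = lu_partial_pivoting(a)
print("max |PA - LU| =", factorization_error(a, perm, l, u))
```

Because the pivot is the largest entry in its column, every multiplier stored in L has magnitude at most 1, which is what keeps error growth under control in practice.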
This report presents a study of a Monte Carlo algorithm applied to the Travelling Salesman Problem (TSP), exploring the Simulated Annealing (SA) meta-heuristic. Given a discrete space of cities, the algorithm searches for the shortest route that starts at one of the towns, passes once through every one of the others, and returns to the first one. The main goal is to explore the possibility of having a zero cost solution with n cities and p processors running in parallel. To perform this analysis we use a TSP algorithm in MATLAB.
This is an academic work developed at the University of Minho.
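The simulated annealing approach named in the TSP abstract above can be sketched as follows. This is a sequential Python illustration, not the report's MATLAB code and not its parallel scheme. The function names, the segment-reversal move, the geometric cooling schedule, and all parameter values (`t_start`, `t_end`, `cooling`) are assumptions chosen for the example.

```python
import math
import random


def tour_length(tour, cities):
    """Length of the closed route that visits cities in the given order."""
    n = len(tour)
    return sum(
        math.dist(cities[tour[i]], cities[tour[(i + 1) % n]]) for i in range(n)
    )


def simulated_annealing(cities, t_start=1.0, t_end=1e-3, cooling=0.995, seed=0):
    """Return (best_tour, best_cost) found by a basic annealing loop."""
    rng = random.Random(seed)
    n = len(cities)
    tour = list(range(n))
    cost = tour_length(tour, cities)
    best_tour, best_cost = tour[:], cost
    t = t_start
    while t > t_end:
        # Candidate move: reverse a randomly chosen segment of the tour.
        i, j = sorted(rng.sample(range(n), 2))
        candidate = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
        cand_cost = tour_length(candidate, cities)
        delta = cand_cost - cost
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            tour, cost = candidate, cand_cost
            if cost < best_cost:
                best_tour, best_cost = tour[:], cost
        t *= cooling  # geometric cooling (assumed schedule)
    return best_tour, best_cost


# Usage: ten random cities in the unit square.
rng = random.Random(42)
cities = [(rng.random(), rng.random()) for _ in range(10)]
best_tour, best_cost = simulated_annealing(cities)
print(best_tour, round(best_cost, 4))
```

Accepting some worse tours at high temperature is what lets the search leave local minima. Being a heuristic, the loop returns a good route, not a guaranteed optimum.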
This report presents a study of computational improvements to the DD3IMP software package through High Performance Computing. DD3IMP is a software package for numerical simulation based on the Finite Element Method (FEM). The program simulates the forming processes of sheet metals and elastoplasticity through deep drawing using FEM. The performance of this software depends directly on the performance of the linear equation system solver it uses. In the current version of DD3IMP, the main solver is the Direct Sparse Solver (DSS) from the Intel® Math Kernel Library (MKL). This is an optimized solver which has shown the best performance for solving the linear system of equations in DD3IMP on machines based on Intel processors. The entire program is written in the Fortran programming language, with about 500 routines and 60k lines of code, and is already parallelized in the shared memory paradigm using OpenMP directives. In this work we explore the DD3IMP program, using profiling tools to detect where the program is most computationally expensive and to explore possibilities for improving its performance. The program is analysed on SeARCH cluster nodes based on Intel® Xeon® processors with the Ivy Bridge microarchitecture, and on a team laptop based on an Intel® Core™ i7 processor with the Haswell microarchitecture.
I. The package DD3IMP (Deep-Drawing 3D Implicit FE Solver)
The program DD3IMP is a software package for forming and elastoplasticity simulation of sheet metals through deep drawing using finite element methods. The program was developed and implemented in Fortran 95 and has more than 500 routines and 60k lines of code. The part of DD3IMP which performs the most work is related to solving a linear equation system multiple times. Solving a linear equation system can be a computationally intensive task.
Since DD3IMP solves this kind of linear equation system multiple times, its solution can become a bottleneck for performance scalability. As we will see in the next sections, since the most computationally heavy regions of this software correspond to solving a linear equation system, the global performance is directly affected by the solver it uses. The equation system involved in DD3IMP has the matrix-vector form Ax = b, where A is a non-symmetric sparse matrix (symmetric in structure but non-symmetric in values) stored in CSR format that represents the mesh structure, x is the displacement vector and b is the vector of external forces. The current solver implemented in DD3IMP is DSS (Direct Sparse Solver) from Intel's Math Kernel Library (MKL). The previous one was a solver based on the conjugate gradient method, namely conjugate gradient squared (CGS) combined with an ILU preconditioner, which was replaced by DSS for performance reasons. However, both solvers are currently available in the DD3IMP package and the user can select which one to use. The main difference between the two solvers is that CGS is an iterative method and DSS is a direct method. Iterative methods are commonly known for being computationally efficient and converging quickly. However, previous studies of this software revealed that DSS was the faster solver on machines based on Intel processors in the OpenMP implementation of DD3IMP. This optimized library is particularly efficient for large problems. The scalability of DSS allows the program to scale almost linearly, as we will see.
II. Starting Point and Case Studies
The starting point is a sequential version of DD3IMP and a parallel version with OpenMP directives. As we will see in the profiling section, in the current parallelized version more than 97% of the DD3IMP execution time is spent running in parallel. We also have three different case studies.
Since DD3IMP is a finite element method package, the program uses numerical techniques for finding approximate solutions to boundary value problems for differential equations.
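To put the parallel share quoted above in perspective, the following Python sketch evaluates Amdahl's law, a standard idealized bound that is not taken from the report itself. It assumes that "more than 97% of execution time running in parallel" can be read as a parallel fraction of 0.97, and it ignores synchronization, memory bandwidth and load imbalance. The function name `amdahl_speedup` and the worker counts are hypothetical.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Ideal speedup on `workers` cores when `parallel_fraction` of the
    single-core run time can be parallelized perfectly (Amdahl's law)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Usage: assumed parallel fraction of 0.97, a few illustrative core counts.
for workers in (1, 2, 4, 8, 16, 32):
    print(workers, round(amdahl_speedup(0.97, workers), 2))
```

Under this model the speedup can never exceed 1 / 0.03, roughly 33, however many cores are used, so the remaining serial 3% eventually dominates.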