The need to gain confidence in binary programs without access to their source code has pushed efforts toward directly analyzing executable programs. However, low-level programs lack high-level structures (such as types and the control-flow graph), preventing the straightforward application of source-code analysis techniques. In particular, conditional jumps rely on low-level flag predicates, whereas they often encode high-level "natural" conditions on program variables. Most static analyzers are unable to infer any interesting information from these low-level conditions, leading to serious precision loss compared with source-level analysis. In this paper, we propose template-based recovery, an automatic approach for retrieving high-level predicates from their low-level flag versions. Notably, the technique is sound, efficient and platform-independent, and it achieves a very high recovery ratio. This method allows more precise analyses and helps to understand the machine encoding of conditionals rather than relying on error-prone human interpretation or (syntactic) pattern-based reasoning.
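To make the idea of recovering a high-level predicate from a flag predicate concrete, here is a minimal sketch. It is not the procedure of the paper: it assumes 8-bit operands, x86-style flag semantics for a `cmp a, b` instruction, and replaces any solver-based reasoning with exhaustive enumeration. The names `cmp_flags`, `TEMPLATES` and `recover` are hypothetical and introduced only for illustration.

```python
from itertools import product

BITS = 8  # assumption: small word size so that exhaustive checking is cheap
MASK = (1 << BITS) - 1
SIGN = 1 << (BITS - 1)


def to_signed(x):
    """Interpret a BITS-wide unsigned value as a two's-complement integer."""
    return x - (1 << BITS) if x & SIGN else x


def cmp_flags(a, b):
    """Flags produced by an x86-style `cmp a, b` (computes a - b)."""
    res = (a - b) & MASK
    return {
        "ZF": res == 0,
        "SF": bool(res & SIGN),
        "CF": a < b,  # unsigned borrow
        # signed overflow: a and b differ in sign, and the result's sign differs from a's
        "OF": bool((a ^ b) & (a ^ res) & SIGN),
    }


# Hypothetical candidate templates over the two compared operands.
TEMPLATES = {
    "a == b": lambda a, b: a == b,
    "a != b": lambda a, b: a != b,
    "a <u b": lambda a, b: a < b,
    "a <=u b": lambda a, b: a <= b,
    "a <s b": lambda a, b: to_signed(a) < to_signed(b),
    "a <=s b": lambda a, b: to_signed(a) <= to_signed(b),
    "a >s b": lambda a, b: to_signed(a) > to_signed(b),
}


def recover(flag_pred):
    """Return the templates that agree with flag_pred on every operand pair."""
    pairs = list(product(range(1 << BITS), repeat=2))
    return [
        name
        for name, template in TEMPLATES.items()
        if all(flag_pred(cmp_flags(a, b)) == template(a, b) for a, b in pairs)
    ]


if __name__ == "__main__":
    # Condition tested by a "jump if less" (signed) after cmp: SF != OF
    print(recover(lambda f: f["SF"] != f["OF"]))
    # Condition tested by a "jump if below or equal" (unsigned): CF or ZF
    print(recover(lambda f: f["CF"] or f["ZF"]))
```

Because the check compares the flag predicate and each template on every operand pair, a template is reported only when the two are equivalent at this word size; a real tool would need an argument that scales to 32- or 64-bit operands.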
This article presents the open-source BINSEC platform for (formal) binary-level code analysis. The platform is based on an extension of the DBA Intermediate Representation, and it is composed of three main modules: a front-end including several syntactic disassembly algorithms and heavy simplification of the resulting IR, a simulator supporting the recent low-level region-based memory model, and a generic static analysis module.
Arrays are ubiquitous in the context of software verification. However, effective reasoning over arrays is still rare in CP, as local reasoning is dramatically ill-conditioned for constraints over arrays. In this paper, we propose an approach combining both global symbolic reasoning and local filtering in order to solve constraint systems involving arrays (with accesses, updates and size constraints) and finite-domain constraints over their elements and indexes. Our approach, named fdcc, is based on a combination of a congruence closure algorithm for the standard theory of arrays and a CP solver over finite domains. The tricky part of the work lies in the bi-directional communication mechanism between the two solvers. We identify the significant information to share, and design ways to control the communication overhead. Experiments on random instances show that fdcc solves more formulas than any portfolio combination of the two solvers taken in isolation, while the overhead is kept reasonable.
2009 International Conference on Software Testing Verification and Validation, 2009
Recent advances in path-based test (data) generation open the way to the systematic testing of large-scale programs. However, these technologies still suffer from two major bottlenecks: efficient constraint solving and the path explosion phenomenon. We focus in this paper on the second issue and propose three complementary heuristics geared toward lowering path explosion. All these heuristics are both easy to implement and lightweight, and each one deals with a distinct source of path explosion. We provide theoretical and experimental evidence that they can achieve a significant reduction in the search space.
Refinement-based CFG Reconstruction from Executables. Sébastien Bardin, Philippe Herrmann, and Franck Védrine. CEA, LIST, Gif-sur-Yvette CEDEX, 91191 ... The only safe techniques are those by Reps et al. [4, 5], based mainly on stride intervals propagation, and by ...
Tools and Algorithms for the Construction and Analysis of Systems, 2004
We compute reachability sets of counter automata. Even if the reachability set is not necessarily recursive, we use symbolic representation and acceleration to increase convergence. For functions defined by translations over a polyhedral domain, we give a new acceleration algorithm which is polynomial in the size of the function and exponential in its dimension, while the more generic algorithm is exponential in both the size of the function and its dimension. This algorithm has been implemented in the tool Fast. We apply it to a complex industrial protocol, the TTP membership algorithm. This protocol has been widely studied. For the first time, the protocol is automatically proved to be correct for 1 fault and N stations, and using abstraction we also prove the correctness for 2 faults and N stations.
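The intuition behind accelerating a translation over a polyhedral (hence convex) domain can be sketched as follows. This is only an illustration on concrete points, not the symbolic algorithm implemented in Fast: it assumes a single transition x := x + d guarded by a conjunction of linear inequalities, and the function names are hypothetical. Since the guard is convex, the k-th iterate is defined exactly when the starting point and the point reached after k - 1 steps both satisfy the guard, so it can be computed in closed form without visiting the intermediate points.

```python
def in_guard(x, guard):
    """guard: list of (coeffs, bound) pairs, each meaning sum(c_i * x_i) <= bound."""
    return all(
        sum(c * xi for c, xi in zip(coeffs, x)) <= bound
        for coeffs, bound in guard
    )


def translate(x, d, k=1):
    """Return x + k * d, componentwise."""
    return tuple(xi + k * di for xi, di in zip(x, d))


def iterate(x0, d, guard, max_steps):
    """Naive exploration: fire the transition one step at a time."""
    seen = [x0]
    x = x0
    for _ in range(max_steps):
        if not in_guard(x, guard):
            break
        x = translate(x, d)
        seen.append(x)
    return seen


def accelerated(x0, d, guard, k):
    """k-th iterate in closed form, or None if the transition cannot fire k times.

    Relies on convexity of the guard: if x0 and x0 + (k-1)*d satisfy it,
    so does every intermediate point on the segment between them.
    """
    if k == 0:
        return x0
    if in_guard(x0, guard) and in_guard(translate(x0, d, k - 1), guard):
        return translate(x0, d, k)
    return None


if __name__ == "__main__":
    # Guard: x <= 10 and y >= 0 (written as -y <= 0); translation d = (2, -1).
    guard = [((1, 0), 10), ((0, -1), 0)]
    print(iterate((0, 3), (2, -1), guard, max_steps=100))
    print(accelerated((0, 3), (2, -1), guard, 4))
```

On this example the step-by-step exploration and the closed form agree on every iterate that exists, which is the property an acceleration procedure exploits at the level of whole sets of configurations.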
Automated Technology for Verification and Analysis, 2004
Symbolic representations and acceleration algorithms are emerging methods to extend model-checking to infinite state space systems. However, until now, there is no general theory of acceleration, and designing acceleration algorithms for new data types is a complex task. On the other hand, protocols rarely manipulate new data types, rather new combinations of well-studied data types. For this reason, in this
Fast is a tool designed for the analysis of counter systems, i.e. automata extended with unbounded integer variables. Although the reachability set is not recursive in general, Fast implements several innovative techniques such as acceleration and circuit selection to solve this problem in practice. In its latest version, the tool is built upon an open architecture: the Presburger library is manipulated through a clear and convenient interface, so that any Presburger arithmetic package can be plugged into the tool. We provide four implementations of the interface using Lash, Mona, Omega and a new shared automata package with computation cache. Finally, new features are available, such as different acceleration algorithms.
Automated Technology for Verification and Analysis, 2005
Symbolic model checking provides partially effective verification procedures that can handle systems with an infinite state space. So-called "acceleration techniques" enhance the convergence of fixpoint computations by computing the transitive closure of some transitions. In this paper we develop a new framework for symbolic model checking with accelerations. We also propose and analyze new symbolic algorithms using accelerations to compute reachability sets.
Fast is a tool for the analysis of infinite systems. This paper describes the underlying theory and the architecture choices that have been made in the tool design. The user must provide a model to analyse, the property to check and a computation policy. Several such policies are provided as standard in the package; others can be added by the user. Fast capabilities are compared with those of other tools. A range of case studies from the literature has been investigated.
We aim at checking safety properties on systems with pointers, which are naturally infinite-state systems. In this paper, we introduce Symbolic Memory States (SMS), a new symbolic representation well suited to the verification of systems with pointers. We show that SMS enjoys all the good properties needed to check safety properties, such as closure under union, canonicity of the representation and decidable inclusion. We also introduce pointer automata, a model for programs using dynamic allocation of memory. We define the properties we want to check in this model and we give undecidability results. The verification part is still work in progress.
International Journal on Software Tools for Technology Transfer, 2008
Fast is a tool for the analysis of systems manipulating unbounded integer variables. We check safety properties by computing the reachability set of the system under study. Even if this reachability set is not necessarily recursive, we use innovative techniques, namely symbolic representation, acceleration and circuit selection, to increase convergence. Fast has proved to perform very well on case studies. This paper describes the tool, from the underlying theory to the architecture choices. Finally, Fast capabilities are compared with those of other tools. A range of case studies from the literature is investigated.
Automatic test data generation (ATG) is a major topic in software engineering. In this paper, we seek to bridge the gap between the coverage criteria supported by symbolic ATG tools and the most advanced coverage criteria found in the literature. We define a new testing criterion, label coverage, and prove it to be both expressive and amenable to efficient automation. We propose several innovative techniques resulting in an effective black-box support for label coverage, while a direct approach induces an exponential blow-up of the search space. Initial experiments show that ATG for label coverage can be achieved at a reasonable cost and that our optimisations yield very significant savings.
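As a rough illustration of what a label-based criterion looks like, the sketch below assumes that a label is a named predicate attached to a program location, and that a test covers a label when execution reaches that location with the predicate true. The abstract does not spell out the definition, so this reading, along with the names `LabelTracker`, `abs_diff` and `run_suite`, is an assumption made for the example; the instrumentation is done by hand rather than by an ATG tool.

```python
class LabelTracker:
    """Records which labels have been covered by at least one test."""

    def __init__(self, names):
        self.names = set(names)
        self.covered = set()

    def label(self, name, predicate):
        # A label is covered when its location is reached and its predicate holds.
        if predicate:
            self.covered.add(name)

    def ratio(self):
        return len(self.covered) / len(self.names)


def abs_diff(a, b, tracker):
    """Function under test, instrumented with three labels at its entry."""
    tracker.label("then-branch", a >= b)
    tracker.label("else-branch", a < b)
    tracker.label("boundary", a == b)
    return a - b if a >= b else b - a


def run_suite(tests):
    """Run every (a, b) input and return the tracker holding the covered labels."""
    tracker = LabelTracker(["then-branch", "else-branch", "boundary"])
    for a, b in tests:
        abs_diff(a, b, tracker)
    return tracker


if __name__ == "__main__":
    print(sorted(run_suite([(5, 3)]).covered))
    print(run_suite([(5, 3), (1, 4), (2, 2)]).ratio())
```

The first two labels mimic branch coverage and the third adds a boundary objective, which hints at how different criteria can be expressed in one uniform form.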
Testing is the primary approach for detecting software defects. A major challenge faced by testers lies in crafting efficient test suites, able to detect a maximum number of bugs with manageable effort. To do so, they rely on coverage criteria, which define precise test objectives to be covered. However, many common criteria specify a significant number of objectives that turn out to be infeasible or redundant in practice, such as covering dead code or semantically equal mutants. Such objectives are well known to be harmful to the design of test suites, impacting both the efficiency and precision of testers' effort. This work introduces a sound and scalable formal technique able to prune out a significant part of the infeasible and redundant objectives produced by a large panel of white-box criteria. In a nutshell, we reduce this challenging problem to proving the validity of logical assertions in the code under test. This technique is implemented in a tool that relies on weakest-precondition calculus and SMT solving for proving the assertions. The tool is built on top of the Frama-C verification platform, which we carefully tune for our specific scalability needs. The experiments reveal that the tool can prune out up to 27% of test objectives in a program and scale to applications of 200K lines of code.
Papers by S. Bardin