Computing Information Flow Using Symbolic Model Checking

Rohit Chadha

University of Missouri

Umang Mathur

UIUC

Stefan Schwoon

LSV, ENS Cachan

October 10, 2015

FSTTCS 2014

Outline

Introduction
Preliminaries
Summary Calculation
Information Leakage
Symbolic Algorithms
Moped-QLeak
Related Work
Conclusions and Future Work

Information Leakage

Information about the secret inputs that can be inferred from publicly observable outputs

Less leakage is desirable - enables comparison across programs

No leakage: outputs are independent of inputs

Full leakage: unique input for a given output

char* path = getenv("PATH");
...
/* fprintf, not sprintf: the first argument is the stderr stream */
fprintf(stderr, "cannot find exe on path %s\n", path);

try {
    ...
} catch (Exception e) {
    e.printStackTrace();
}

Need to quantify leakage

Measuring Leakage


def example1(input):
    output = input % 8
    return output


def example2(input):
    output = input % 32
    return output

Are the two functions above equally desirable in terms of information leakage?

No! example1 leaks less information than example2
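One way to see the difference is to count how many candidate inputs remain consistent with each observed output. The sketch below assumes an 8-bit secret (the width is an illustrative choice, not something the slides fix) and re-declares the two functions so that it runs on its own.

```python
from collections import Counter


def example1(value):
    return value % 8


def example2(value):
    return value % 32


def candidates_per_output(program, input_bits):
    """Count how many inputs map to each observable output."""
    return Counter(program(v) for v in range(2 ** input_bits))


INPUT_BITS = 8  # assumed secret width, chosen only for illustration

classes1 = candidates_per_output(example1, INPUT_BITS)
classes2 = candidates_per_output(example2, INPUT_BITS)

# Number of distinct outputs, and inputs still possible after seeing one output.
print(len(classes1), max(classes1.values()))  # 8 32
print(len(classes2), max(classes2.values()))  # 32 8
```

After observing the output of example1, an attacker is still left with 32 equally likely inputs; after example2 only 8 remain, so example2 reveals more about the secret.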

Dining Cryptographers

Cryptographers A, B and C dine out together

Payment: made either by the NSA or by one of the cryptographers

Determine whether the NSA paid, without revealing information about the cryptographers

Dining Cryptographers : Protocol

2 Stage Protocol:

Every two cryptographers establish a shared 1-bit secret

Each cryptographer publicly announces a bit:

XOR of shared bits, if the cryptographer did not pay
¬ (XOR of shared bits), otherwise

XOR(Announcement_A, Announcement_B, Announcement_C) = 0 iff the NSA paid for the dinner
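The protocol is small enough to check exhaustively. The following is a minimal sketch, assuming three cryptographers named A, B and C and at most one payer; the function names are hypothetical. It verifies the XOR property for every choice of shared bits, and also checks that the announced bits are distributed identically no matter which cryptographer paid.

```python
from collections import Counter
from itertools import product

PAIRS = [frozenset("AB"), frozenset("BC"), frozenset("AC")]


def announcements(shared, payer):
    """Return the bits announced by A, B and C.

    shared maps each pair to its 1-bit secret; payer is "A", "B", "C",
    or None when the NSA paid.
    """
    bits = []
    for name in "ABC":
        bit = 0
        for pair, secret in shared.items():
            if name in pair:
                bit ^= secret  # XOR of the two shared bits this cryptographer holds
        if name == payer:
            bit ^= 1  # the payer announces the negation
        bits.append(bit)
    return tuple(bits)


def nsa_paid(announced):
    a, b, c = announced
    return (a ^ b ^ c) == 0


def observable_distribution(payer):
    """Count announcement triples over all 8 choices of shared bits."""
    return Counter(
        announcements(dict(zip(PAIRS, secrets)), payer)
        for secrets in product([0, 1], repeat=3)
    )


# Correctness: the XOR of the announcements is 0 exactly when the NSA paid.
for payer in (None, "A", "B", "C"):
    for announced in observable_distribution(payer):
        assert nsa_paid(announced) == (payer is None)

# Anonymity: the public announcements look the same whoever paid.
assert (observable_distribution("A")
        == observable_distribution("B")
        == observable_distribution("C"))
```

Each shared bit enters exactly two announcements and therefore cancels in the overall XOR, which leaves only the payer's negation.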

Measuring Leakage : Metrics

Min-entropy : Vulnerability of the secret inputs to being guessed correctly in a single attempt

Shannon entropy : Expected number of guesses required to correctly guess secret input

\text{ME}_\text{U}(P) = \log \sum\limits_{o \in O} \max\limits_{s \in S} \mu(\mathcal{S} = s | \mathcal{O} = o)

\text{SE}_\text{U}(P) = \log{\lvert S \rvert} - \frac{1}{\lvert S \rvert}\sum\limits_{o \in O} \lvert P^{-1}(o) \rvert \text{ } \log \lvert P^{-1}(o) \rvert
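The Shannon-entropy formula can be evaluated directly for a small deterministic program by enumerating its inputs. This is only an explicit-state sketch under stated assumptions: inputs are uniformly distributed, logarithms are taken base 2, and the helper name `shannon_leakage_uniform` is hypothetical.

```python
import math
from collections import Counter


def shannon_leakage_uniform(program, input_bits):
    """Evaluate SE_U(P) = log|S| - (1/|S|) * sum_o |P^-1(o)| * log|P^-1(o)|."""
    num_inputs = 2 ** input_bits
    # |P^-1(o)| for every output o that actually occurs
    preimage_sizes = Counter(program(s) for s in range(num_inputs))
    remaining = sum(n * math.log2(n) for n in preimage_sizes.values()) / num_inputs
    return math.log2(num_inputs) - remaining


print(shannon_leakage_uniform(lambda s: 0, 8))      # constant output: 0.0 bits
print(shannon_leakage_uniform(lambda s: s % 8, 8))  # low 3 bits revealed: 3.0 bits
print(shannon_leakage_uniform(lambda s: s, 8))      # identity: 8.0 bits
```

Enumerating all inputs takes time exponential in the number of secret bits, which is the cost a symbolic representation is meant to avoid.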

Boolean Programs

Global variables \mathcal{G} : Input and output
Local variables : Internal calculations
Program statements : transform global and local variables
For Program P, F_P : 2^{\mathcal{G}} \mapsto 2^{\mathcal{G}} \cup \{\bot\}
F_P(\bar{g_o}) = \bot iff P does not terminate on \bar{g_o}
Summary - Joint probability distribution μ, when extended to probabilistic framework
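As an illustration of the summary function F_P, the sketch below tabulates it for a hypothetical two-variable Boolean program, using `None` to stand for ⊥. The program, its control points and the helper names are assumptions made for this example. Because the program is deterministic and finite-state, it fails to terminate exactly when a state repeats.

```python
from itertools import product

BOTTOM = None  # stands for the symbol ⊥ (non-termination)


def step(pc, x, y):
    """One step of a hypothetical program over globals x, y:
         0: while x: skip
         1: y = not y
         2: halt
    Returns the next state, or None once the program has halted.
    """
    if pc == 0:
        return (0, x, y) if x else (1, x, y)
    if pc == 1:
        return (2, x, not y)
    return None


def summary(x, y):
    """F_P on one valuation: final globals, or BOTTOM if a state repeats."""
    state = (0, x, y)
    seen = set()
    while state not in seen:
        seen.add(state)
        following = step(*state)
        if following is None:
            return state[1:]  # halted: report the global valuation
        state = following
    return BOTTOM  # revisited a state, so the run loops forever


F_P = {g: summary(*g) for g in product([False, True], repeat=2)}
for valuation, result in F_P.items():
    print(valuation, "->", result)
```

Here every valuation with x true maps to ⊥, while the others terminate with y flipped.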

Algebraic Decision Diagrams

Essentially BDDs with possibly many terminals

Formally,

Set of variables \mathcal{V}
Algebraic set M (we have M = [0,1]; M = {0,1} gives BDDs)
ADD : 2^{\mathcal{V}} \mapsto M

Efficient reduced representations, like ROBDDs

[Figure: an ADD and its reduced form]
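To make the idea of a reduced ADD concrete, here is a toy construction that applies the two usual reduction rules: identical subgraphs are shared, and a test whose two branches coincide is dropped. It builds the diagram by exhaustive Shannon expansion, which is exponential and only suitable for illustration; the tuple-based node encoding and the function names are assumptions of this sketch, not the data structures of any particular decision-diagram package.

```python
def build_add(func, variables):
    """Build a reduced ADD for func (assignment dict -> value) in the given order."""
    unique = {}  # hash-consing table so that equal nodes are shared

    def make(key):
        return unique.setdefault(key, key)

    def expand(index, assignment):
        if index == len(variables):
            return make(("leaf", func(assignment)))
        var = variables[index]
        low = expand(index + 1, {**assignment, var: 0})
        high = expand(index + 1, {**assignment, var: 1})
        if low is high:  # both branches agree: the test on var is redundant
            return low
        return make(("node", var, low, high))

    return expand(0, {}), unique


def evaluate(node, assignment):
    """Follow one path from the root to a terminal value."""
    while node[0] == "node":
        _, var, low, high = node
        node = high if assignment[var] else low
    return node[1]


# A function into M = [0, 1] that does not depend on y.
root, nodes = build_add(lambda a: 0.5 if a["x"] else 0.0, ["x", "y"])
print(len(nodes))                        # 3 nodes instead of 7 in the full tree
print(evaluate(root, {"x": 1, "y": 0}))  # 0.5
```

With terminals restricted to {0, 1} the same construction yields a reduced ordered BDD.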

Computing Program Summary

Program statement
Can be represented efficiently as MTBDDs
Compose statements
Arrive at a fixed point (Summary μ)

Stmt1 : x = ¬x

l \rightarrow \mu_l
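Composition of statement transformers can be sketched with explicit tables in place of MTBDDs. In the sketch below each statement over a single Boolean variable x is a map from (pre-state, post-state) pairs to probabilities, and sequential composition sums over the intermediate state. The `FLIP` statement and the `compose` helper are hypothetical, and only straight-line composition is shown; loops would additionally need the fixed-point iteration mentioned above.

```python
from collections import defaultdict


def compose(first, second):
    """Sequential composition: sum over intermediate states of p * q."""
    result = defaultdict(float)
    for (pre, mid), p in first.items():
        for (mid2, post), q in second.items():
            if mid == mid2:
                result[(pre, post)] += p * q
    return dict(result)


# Stmt1 : x = ¬x, as a deterministic transformer
NEGATE = {(0, 1): 1.0, (1, 0): 1.0}

# Hypothetical probabilistic statement: x is set by a fair coin flip
FLIP = {(0, 0): 0.5, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 0.5}

print(compose(NEGATE, NEGATE))  # negating twice is the identity
print(compose(NEGATE, FLIP))    # the coin flip forgets the earlier negation
```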

Calculating Entropy Leakage

Program P with secret inputs S and public outputs O
Global variables \mathcal{G} : S \cup O
Initialize O to 0
Reset S to 0 at the end
Summary T_P : 2^{\mathcal{G}} \times 2^{\mathcal{G}'} \mapsto [0,1]

Min-Entropy : Symbolic Algorithm

\text{ME}_\text{U}(P) = \log \sum\limits_{o \in O} \max\limits_{s \in S} \mu(\mathcal{S} = s | \mathcal{O} = o)

Shannon-Entropy : Symbolic Algorithm

\text{SE}_\text{U}(P) = \log{\lvert S \rvert} - \frac{1}{\lvert S \rvert}\sum\limits_{o \in O} \lvert P^{-1}(o) \rvert \text{ } \log \lvert P^{-1}(o) \rvert
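For a deterministic program, the preimage sizes in the Shannon formula can be read off a summary by summing out the secret variables. The sketch below is an explicit-table stand-in for that step and assumes a 0/1-valued summary stored as a dictionary keyed by (secret, output) pairs; it is not the tool's actual algorithm, and the function name and the parity example are hypothetical.

```python
import math
from collections import defaultdict


def shannon_leakage_from_summary(summary, num_secrets):
    """Sum the summary over secrets to get |P^-1(o)|, then apply SE_U."""
    preimage_size = defaultdict(int)
    for (secret, output), weight in summary.items():
        preimage_size[output] += weight  # abstraction of the secret variables by summation
    remaining = sum(n * math.log2(n) for n in preimage_size.values() if n > 0)
    return math.log2(num_secrets) - remaining / num_secrets


# Hypothetical program: 3-bit secret, the output reveals only its parity.
summary = {(s, s % 2): 1 for s in range(8)}
leak = shannon_leakage_from_summary(summary, 8)
print(leak)  # 1.0 bit
```

In a symbolic setting the loop over table entries would correspond to a single abstraction operation on the decision diagram rather than an enumeration of secrets.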

Moped-QLeak

Extends tool Moped
Source - C, C++
Input language Remopla - arrays, integers, structs, etc.
Additional pchoice construct for probabilistic statements

Moped-QLeak

Modifications/Optimizations made:

Algebraic Operations
Variable Orderings

Salient features:

Handles large number of bits (30 bits)
Time taken in milliseconds
Consistently outperforms sqifc (Malacaria et al.)

Download : http://bengal.missouri.edu/~chadhar/mql/

Related Work

(Köpf et al.) : iteratively refine equivalence classes (deterministic only)
(Klebanov et al.) : program to SMT formula, count outputs (deterministic, straight line programs)
(Parket et al.) : explicit state model checking
(Biondi et al.) : forward symbolic execution; use explicit channel matrix for entropy calculations

Comparison

Comparative Analysis of Leakage Tools on Scalable Case Studies, Biondi et al. (SPIN 2015)
Comparison across 3 tools
- Moped-QLeak
- QUAIL
- LeakWatch
Real life case studies:
- energy consumption data in smart grid network
- voters’ voting preferences with different types of voting protocols
Moped-QLeak beats the other two in speed.

Conclusions

Symbolic algorithms for measuring information leakage
Can be integrated into any BDD-based reachability analysis tool
Summary calculation is the overhead - BDD size (algebraic operations) and variable orderings

Future Work

Support recursive programs : ProPed
- Moped: Recursion and symbolic program verification but no probability
- PRISM: Symbolic program analysis and probability but no recursion
- PReMo: Recursion and probability but explicit state model checking
Other symbolic verification approaches: CEGAR

ProPed = Moped \cup PRISM \cup PReMo

Thank You !
