scalib.modeling.LDAClassifier#

class scalib.modeling.LDAClassifier(nc, p)[source]#

Models the leakage \(\mathbf{l}\) with \(n_s\) dimensions using the linear discriminant analysis classifier (LDA) with integrated dimensionality reduction.

Deprecated since version 0.6.1: Use LdaAcc instead.

Based on the training data, linear discriminant analysis builds a linear dimensionality reduction to \(p\) dimensions that maximizes class separation. A multivariate Gaussian template is then fitted for each class in the reduced space (with a single covariance matrix shared by all classes) to estimate the leakage likelihood [1].
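
As a rough illustration of the dimensionality reduction step, the following NumPy sketch computes a textbook LDA projection from the within-class and between-class scatter matrices. It is not SCALib's internal algorithm; the function name `lda_projection` and the synthetic data are assumptions made for this example only.

```python
import numpy as np

def lda_projection(traces, labels, nc, p):
    """Return a (p, ns) projection matrix from labelled traces (textbook LDA)."""
    traces = np.asarray(traces, dtype=np.float64)
    ns = traces.shape[1]
    mean_all = traces.mean(axis=0)
    s_w = np.zeros((ns, ns))  # within-class scatter
    s_b = np.zeros((ns, ns))  # between-class scatter
    for c in range(nc):
        t_c = traces[labels == c]
        if t_c.shape[0] == 0:
            continue  # class not observed in the training set
        mu_c = t_c.mean(axis=0)
        centered = t_c - mu_c
        s_w += centered.T @ centered
        d = (mu_c - mean_all)[:, None]
        s_b += t_c.shape[0] * (d @ d.T)
    # Directions maximizing class separation are the leading
    # eigenvectors of S_W^{-1} S_B.
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(s_w, s_b))
    order = np.argsort(eigvals.real)[::-1][:p]
    return eigvecs.real[:, order].T

# Synthetic data: only the first two samples depend on the class.
rng = np.random.default_rng(0)
labels = rng.integers(0, 4, size=2000)
traces = rng.normal(size=(2000, 6))
traces[:, 0] += labels
traces[:, 1] += 3 * (labels % 2)

W = lda_projection(traces, labels, nc=4, p=2)
print(W.shape)
```

In this toy setup both rows of `W` put most of their weight on the two class-dependent samples, which is the behavior expected from a projection that maximizes class separation.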

Let \(\mathbf{W}\) be the dimensionality reduction matrix of size (\(p\), \(n_s\)). The likelihood is

\[\mathsf{\hat{f}}(\mathbf{l} | x) = \frac{1}{\sqrt{(2\pi)^{p} \cdot |\mathbf{\Sigma} |}} \cdot \exp\left(-\frac{1}{2} (\mathbf{W} \cdot \mathbf{l} - \mathbf{\mu}_x)' \mathbf{\Sigma}^{-1} ( \mathbf{W} \cdot \mathbf{l}-\mathbf{\mu}_x)\right)\]

where \(\mathbf{\mu}_x\) is the mean of the leakage for class \(x\) in the projected space (\(\mu_x = \mathbb{E}(\mathbf{W}\mathbf{l}_x)\), where \(\mathbf{l}_x\) denotes the leakage traces of class \(x\)) and \(\mathbf{\Sigma}\) is the covariance of the projected leakage, shared by all classes (\(\mathbf{\Sigma} = \mathbb{Cov}(\mathbf{W}\mathbf{l}_x - \mathbf{\mu}_x)\)).
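
To make the template evaluation concrete, here is a minimal NumPy sketch that computes the logarithm of a multivariate Gaussian likelihood in the projected space for every class at once. The function `log_likelihood` and the arrays `W`, `mus` (projected class means, shape (nc, p)) and `sigma` are hypothetical inputs chosen for illustration; they are not attributes exposed by LDAClassifier.

```python
import numpy as np

def log_likelihood(l, W, mus, sigma):
    """Return log f(l | x) for every class x, as an array of shape (nc,)."""
    p = W.shape[0]
    # Projected trace minus each projected class mean, shape (nc, p).
    diff = W @ l - mus
    # Rows are Sigma^{-1} (W l - mu_x), computed without an explicit inverse.
    solved = np.linalg.solve(sigma, diff.T).T
    # Squared Mahalanobis distance to each class mean.
    maha = np.einsum("ij,ij->i", diff, solved)
    _, logdet = np.linalg.slogdet(sigma)
    return -0.5 * (maha + p * np.log(2 * np.pi) + logdet)

# Toy model: ns = 3 samples projected to p = 2 dimensions, nc = 2 classes.
W = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
mus = np.array([[0.0, 0.0],
                [1.0, 1.0]])
sigma = np.eye(2)

ll = log_likelihood(np.array([0.0, 0.0, 5.0]), W, mus, sigma)
print(ll)
```

Working in the log domain avoids underflow when \(p\) is large or a trace lies far from a class mean.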

LDAClassifier provides the probability of each class with predict_proba(), obtained from Bayes’ law with a uniform prior over the classes:

\[\hat{\mathsf{pr}}(x|\mathbf{l}) = \frac{\hat{\mathsf{f}}(\mathbf{l}|x)} {\sum_{x^*=0}^{n_c-1} \hat{\mathsf{f}}(\mathbf{l}|x^*)}.\]
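
The normalization above can be sketched in a few lines of NumPy. The helper `posterior` is a name assumed for this example, not part of SCALib; it takes per-class log-likelihoods and subtracts the maximum before exponentiating so that very small likelihoods do not underflow to zero.

```python
import numpy as np

def posterior(log_f):
    """Turn per-class log-likelihoods into probabilities (uniform prior)."""
    log_f = np.asarray(log_f, dtype=np.float64)
    # Shifting by the maximum leaves the ratio unchanged and avoids underflow.
    shifted = log_f - log_f.max(axis=-1, keepdims=True)
    weights = np.exp(shifted)
    return weights / weights.sum(axis=-1, keepdims=True)

# Log-likelihoods this small would all underflow if exponentiated directly.
probs = posterior([-1000.0, -1001.0, -1003.0])
print(probs)
```

The same function works on a batch of shape (n, nc), normalizing each row independently.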

Example

>>> from scalib.modeling import LDAClassifier
>>> import numpy as np
>>> # 5000 traces of length 10, with values between 0 and 255
>>> traces = np.random.randint(0,256,(5000,10),dtype=np.int16)
>>> # classes between 0 and 15
>>> x = np.random.randint(0,16,5000,dtype=np.uint16)
>>> lda = LDAClassifier(16,3)
>>> lda.fit_u(traces, x)
>>> lda.solve()
>>> # predict classes for new traces
>>> nt = np.random.randint(0,256,(20,10),dtype=np.int16)
>>> predicted_proba = lda.predict_proba(nt)

Notes

This should behave similarly to scikit-learn’s LDA, with better performance and numerical properties (at the cost of flexibility).

References

Parameters:
  • nc (int) – Number of possible classes (e.g., 256 for 8-bit target). nc must be smaller than \(2^{16}\).

  • p (int) – Number of dimensions in the linear subspace.

Methods

fit_u(traces, x[, gemm_mode])

Update statistical model estimates with fresh data.

get_mus()

Return the matrix of class means.

get_sb()

Return \(S_{B}\) matrix (between-class scatter).

get_sw()

Return \(S_{W}\) matrix (within-class scatter).

predict_proba(traces)

Computes the probability for each of the classes for the traces.

solve([done])

Estimates the PDF parameters, that is, the projection matrix \(\mathbf{W}\), the means \(\mathbf{\mu}_x\) and the covariance \(\mathbf{\Sigma}\).

fit_u(traces, x, gemm_mode=None)[source]#

Update statistical model estimates with fresh data.

Parameters:
  • traces (numpy.typing.NDArray[numpy.int16]) – Array that contains the traces. The array must be of shape (n, ns) and its type must be int16.

  • x (numpy.typing.NDArray[numpy.uint16]) – Labels for each trace. Must be of shape (n,) and of type uint16.

  • gemm_mode (int | None) – Deprecated, kept for API compatibility.
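
Because the expected dtypes are strict, acquisition data stored in another format has to be converted before it is passed in. The helper below is a hypothetical sketch (`prepare_inputs` is not a SCALib function) that rounds traces to int16, casts labels to uint16, and checks the shapes described above. The check that labels lie in [0, nc) is an assumption based on the class description.

```python
import numpy as np

def prepare_inputs(raw_traces, raw_labels, nc):
    """Convert raw data to the (n, ns) int16 / (n,) uint16 arrays described above."""
    raw_traces = np.asarray(raw_traces)
    raw_labels = np.asarray(raw_labels)
    info = np.iinfo(np.int16)
    if raw_traces.min() < info.min or raw_traces.max() > info.max:
        raise ValueError("trace values do not fit in int16; rescale them first")
    if raw_labels.min() < 0 or raw_labels.max() >= nc:
        # Assumption: labels are class indices in [0, nc).
        raise ValueError("labels must lie in [0, nc)")
    # Round to the nearest integer before casting, so floats are not truncated.
    traces = np.rint(raw_traces).astype(np.int16)
    x = raw_labels.astype(np.uint16)
    if traces.ndim != 2 or x.shape != (traces.shape[0],):
        raise ValueError("expected traces of shape (n, ns) and labels of shape (n,)")
    return traces, x

traces, x = prepare_inputs([[1.2, 300.7], [-5.0, 2.0]], [0, 3], nc=4)
print(traces.dtype, x.dtype, traces.shape)
```

Rejecting out-of-range values explicitly is a deliberate choice here: a plain cast would silently wrap them around.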

solve(done=False)[source]#

Estimates the PDF parameters, that is, the projection matrix \(\mathbf{W}\), the means \(\mathbf{\mu}_x\) and the covariance \(\mathbf{\Sigma}\).

Parameters:

done (bool) – True if the object will not be further updated (clears some internal state, saving memory).

Notes

Once this has been called, predictions can be performed.

predict_proba(traces)[source]#

Computes the probability for each of the classes for the traces.

Parameters:

traces (numpy.typing.NDArray[numpy.int16]) – Array that contains the traces. The array must be of shape (n, ns).

Returns:

Probabilities. Shape (n, nc).

Return type:

array_like, f64

get_sw()[source]#

Return \(S_{W}\) matrix (within-class scatter).

get_sb()[source]#

Return \(S_{B}\) matrix (between-class scatter).

get_mus()[source]#

Return the matrix of class means. Shape: (nc, ns).