scalib.modeling.LDAClassifier
- class scalib.modeling.LDAClassifier(nc, p)
Models the leakage \(\mathbf{l}\) with \(n_s\) dimensions using the linear discriminant analysis classifier (LDA) with integrated dimensionality reduction.
Deprecated since version 0.6.1: Use LdaAcc instead.
Based on the training data, linear discriminant analysis builds a linear dimensionality reduction to \(p\) dimensions that maximizes class separation. Then, a multivariate Gaussian template is fitted for each class (using the same covariance matrix for all classes) in the reduced-dimensionality space to predict the leakage likelihood [1].
Let \(\mathbf{W}\) be the dimensionality reduction matrix of size (\(p\), \(n_s\)). The likelihood is
\[\hat{\mathsf{f}}(\mathbf{l} | x) = \frac{1}{\sqrt{(2\pi)^{p} \cdot |\mathbf{\Sigma}|}} \cdot \exp\left(-\frac{1}{2} (\mathbf{W} \cdot \mathbf{l} - \mathbf{\mu}_x)' \, \mathbf{\Sigma}^{-1} \, (\mathbf{W} \cdot \mathbf{l} - \mathbf{\mu}_x)\right)\]
where \(\mathbf{\mu}_x\) is the mean of the leakage for class \(x\) in the projected space (\(\mathbf{\mu}_x = \mathbb{E}(\mathbf{W}\mathbf{l}_x)\), where \(\mathbf{l}_x\) denotes the leakage traces of class \(x\)) and \(\mathbf{\Sigma}\) is its covariance (\(\mathbf{\Sigma} = \mathbb{Cov}(\mathbf{W}\mathbf{l}_x - \mathbf{\mu}_x)\)).
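As an illustration, the density of a multivariate Gaussian template in the projected space can be evaluated in the log domain, which avoids underflow when \(p\) grows. This is a minimal NumPy sketch, not SCALib's implementation; the function name `lda_log_likelihood` and the toy values are assumptions made for the example.

```python
import numpy as np


def lda_log_likelihood(l, W, mu_x, sigma):
    """Log of a Gaussian template likelihood f(l | x) in the projected space.

    l:     leakage trace, shape (ns,)
    W:     projection matrix, shape (p, ns)
    mu_x:  projected mean of class x, shape (p,)
    sigma: covariance in the projected space, shape (p, p)
    """
    p = W.shape[0]
    d = W @ l - mu_x                      # centered projected trace, shape (p,)
    maha = d @ np.linalg.solve(sigma, d)  # d' Sigma^{-1} d without an explicit inverse
    _, logdet = np.linalg.slogdet(sigma)  # log |Sigma|
    return -0.5 * (p * np.log(2 * np.pi) + logdet + maha)


# Toy values chosen only for illustration (hypothetical, not SCALib output).
W = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
l = np.array([1.0, 2.0, 3.0])
mu_x = np.array([1.0, 2.0])
sigma = np.eye(2)
log_f = lda_log_likelihood(l, W, mu_x, sigma)
```

Solving the linear system instead of inverting \(\mathbf{\Sigma}\) is a common choice for numerical stability.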
LDAClassifier provides the probability of each class with predict_proba() using Bayes’ law (with a uniform prior over the classes) such that
\[\hat{\mathsf{pr}}(x|\mathbf{l}) = \frac{\hat{\mathsf{f}}(\mathbf{l}|x)} {\sum_{x^*=0}^{n_c-1} \hat{\mathsf{f}}(\mathbf{l}|x^*)}.\]
Example
>>> from scalib.modeling import LDAClassifier
>>> import numpy as np
>>> # 5000 traces of length 10, with values between 0 and 255
>>> traces = np.random.randint(0,256,(5000,10),dtype=np.int16)
>>> # classes between 0 and 15
>>> x = np.random.randint(0,16,5000,dtype=np.uint16)
>>> lda = LDAClassifier(16,3)
>>> lda.fit_u(traces, x)
>>> lda.solve()
>>> # predict classes for new traces
>>> nt = np.random.randint(0,256,(20,10),dtype=np.int16)
>>> predicted_proba = lda.predict_proba(nt)
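The normalization in Bayes' law can be sketched independently of SCALib. The snippet below turns per-class log-likelihoods into class probabilities using the usual max-shift trick, so that very small likelihoods do not underflow. The helper name `posterior_from_log_likelihoods` and the input values are hypothetical, and how predict_proba() performs this step internally is not stated here.

```python
import math


def posterior_from_log_likelihoods(log_f):
    """Turn per-class log-likelihoods log f(l | x) into probabilities pr(x | l).

    Assumes a uniform prior over the classes.
    """
    m = max(log_f)                              # shift to avoid underflow in exp
    weights = [math.exp(v - m) for v in log_f]  # proportional to f(l | x)
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical log-likelihoods for 3 classes.
probs = posterior_from_log_likelihoods([-1000.0, -1001.0, -1002.0])
```

Without the shift, `math.exp(-1000.0)` evaluates to 0.0 and the division would fail, which is why the log-domain form is generally preferred.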
Notes
This should behave similarly to scikit-learn’s LDA, but with better performance and numerical properties (at the cost of flexibility).
References
- Parameters:
nc (int) – Number of possible classes (e.g., 256 for an 8-bit target). nc must be smaller than \(2^{16}\).
p (int) – Number of dimensions in the linear subspace.
Methods
fit_u(traces, x[, gemm_mode]) – Update statistical model estimates with fresh data.
get_mus() – Return means matrix (class means).
get_sb() – Return \(S_{B}\) matrix (between-class scatter).
get_sw() – Return \(S_{W}\) matrix (within-class scatter).
predict_proba(traces) – Computes the probability of each class for the given traces.
solve([done]) – Estimates the PDF parameters, that is, the projection matrix \(\mathbf{W}\), the means \(\mathbf{\mu}_x\), and the covariance \(\mathbf{\Sigma}\).
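For reference, the class means and the between-class and within-class scatter matrices have standard textbook definitions, sketched below with NumPy. This is not SCALib's code: the function name `scatter_matrices` is hypothetical, and the scaling used by get_sb() and get_sw() (for example, whether the matrices are divided by the number of traces) is an assumption that may differ.

```python
import numpy as np


def scatter_matrices(traces, x, nc):
    """Textbook class means, between-class and within-class scatter.

    traces: shape (n, ns); x: shape (n,) with labels in [0, nc).
    """
    traces = traces.astype(np.float64)
    ns = traces.shape[1]
    mu = traces.mean(axis=0)               # global mean
    mus = np.zeros((nc, ns))
    sb = np.zeros((ns, ns))
    sw = np.zeros((ns, ns))
    for c in range(nc):
        tc = traces[x == c]
        if tc.shape[0] == 0:
            continue                       # class absent from the data
        mus[c] = tc.mean(axis=0)
        d = (mus[c] - mu)[:, None]
        sb += tc.shape[0] * (d @ d.T)      # weighted by the class count
        centered = tc - mus[c]
        sw += centered.T @ centered        # unnormalized within-class scatter
    return mus, sb, sw


rng = np.random.default_rng(0)
traces = rng.integers(0, 256, (500, 4), dtype=np.int16)
x = rng.integers(0, 3, 500, dtype=np.uint16)
mus, sb, sw = scatter_matrices(traces, x, 3)
```

With these unnormalized definitions, the sum of the two matrices equals the total scatter of the data around the global mean, which is a convenient sanity check.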
- fit_u(traces, x, gemm_mode=None)
Update statistical model estimates with fresh data.
- Parameters:
traces (numpy.typing.NDArray[numpy.int16]) – Array that contains the traces. The array must be of dimension (n, ns) and its type must be int16.
x (numpy.typing.NDArray[numpy.uint16]) – Labels for each trace. Must be of shape (n,) and must be uint16.
gemm_mode (int | None) – Deprecated, kept for API compatibility.
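Because fit_u() expects specific shapes and dtypes, it can help to validate each batch before passing it on. The helper below is a hypothetical sketch (`prepare_batch` is not part of SCALib) that checks the documented constraints and casts the arrays. It assumes integer-valued samples, and the contiguity it enforces is a conservative choice rather than a documented requirement.

```python
import numpy as np


def prepare_batch(traces, x, nc):
    """Hypothetical helper: validate and cast one batch for fit_u()."""
    traces = np.asarray(traces)
    x = np.asarray(x)
    if traces.ndim != 2:
        raise ValueError("traces must have shape (n, ns)")
    if x.shape != (traces.shape[0],):
        raise ValueError("x must have shape (n,)")
    info = np.iinfo(np.int16)
    if traces.min() < info.min or traces.max() > info.max:
        raise ValueError("trace samples do not fit in int16")
    if x.min() < 0 or x.max() >= nc:
        raise ValueError("labels must lie in [0, nc)")
    # Cast to the documented dtypes (contiguity is an extra precaution).
    return (np.ascontiguousarray(traces, dtype=np.int16),
            np.ascontiguousarray(x, dtype=np.uint16))


raw_traces = np.random.randint(0, 256, (100, 10))  # default integer dtype
raw_labels = np.random.randint(0, 16, 100)
traces, x = prepare_batch(raw_traces, raw_labels, nc=16)
# With SCALib installed, each batch could then be passed to lda.fit_u(traces, x).
```

Since fit_u() updates the estimates with fresh data, the same helper could be applied to every batch of a larger acquisition campaign.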
- solve(done=False)
Estimates the PDF parameters, that is, the projection matrix \(\mathbf{W}\), the means \(\mathbf{\mu}_x\), and the covariance \(\mathbf{\Sigma}\).
- Parameters:
done (bool) – True if the object will not be further updated (clears some internal state, saving memory).
Notes
Once this has been called, predictions can be performed.
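To give an idea of what estimating the projection involves, textbook LDA obtains \(\mathbf{W}\) from the generalized eigenproblem \(S_B \mathbf{v} = \lambda S_W \mathbf{v}\), keeping the \(p\) leading eigenvectors. The NumPy sketch below follows that textbook formulation. The solver actually used by solve() is not described here, so the function `lda_projection` and its normalization are assumptions.

```python
import numpy as np


def lda_projection(sb, sw, p):
    """Textbook LDA: the p directions maximizing between/within scatter.

    Returns W with shape (p, ns). Assumes sw is symmetric positive definite.
    """
    # Whiten with the Cholesky factor of sw (sw = L L').
    L = np.linalg.cholesky(sw)
    Linv = np.linalg.inv(L)
    m = Linv @ sb @ Linv.T            # symmetric, same eigenvalues as sw^{-1} sb
    evals, evecs = np.linalg.eigh(m)  # eigenvalues in ascending order
    top = evecs[:, ::-1][:, :p]       # p leading eigenvectors
    return (Linv.T @ top).T           # map back to the original coordinates


# Small hand-written matrices, for illustration only.
sw = np.array([[2.0, 0.5],
               [0.5, 1.0]])
sb = np.array([[4.0, 1.0],
               [1.0, 0.5]])
W = lda_projection(sb, sw, p=1)
```

With this normalization the within-class scatter becomes the identity in the projected space, so a single shared covariance for all classes is a natural fit for the Gaussian templates.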