From 871dc28f247156ba79c41cfc17665d84d2a481e3 Mon Sep 17 00:00:00 2001 From: Aaron Kaiser Date: Wed, 24 May 2023 15:35:28 +0200 Subject: [PATCH] Overhauled EdDSA section --- thesis/sections/eddsa.tex | 143 ++++++++++++++++++-------------------- 1 file changed, 66 insertions(+), 77 deletions(-) diff --git a/thesis/sections/eddsa.tex b/thesis/sections/eddsa.tex index ed15e0c..6542ecf 100644 --- a/thesis/sections/eddsa.tex +++ b/thesis/sections/eddsa.tex @@ -1,72 +1,15 @@ \section{EdDSA Signatures} \label{sec:eddsa} -% TODO: Referenz zum ersten Paper 2011 oder lieber zum journal paper 2012? +This section takes a closer look at the differences between the existing EdDSA specifications and points out the differences between the standards and the original Schnorr signature scheme. -This section takes a closer look at the existing specifications of the EdDSA signature scheme and specifies a version which will be analyzed in this thesis. +As mentioned above, there are two papers by Bernstein et al. that define the EdDSA signature scheme \cite{CHES:BDLSY11} \cite{EPRINT:BJLSY15}. The 2015 paper \cite{EPRINT:BJLSY15} describes a more generic version of the EdDSA signature scheme than the original publication \cite{CHES:BDLSY11}. According to \cite{EPRINT:BJLSY15}, the EdDSA signature scheme is defined by 11 parameters, as shown in table \ref{tab:parameter}. The paper also describes two variants of EdDSA. One is called PureEdDSA and the other is called HashEdDSA. HashEdDSA is a prehashing variant of the PureEdDSA signature scheme. This means that, in HashEdDSA, the message is hashed by a hash function before it is signed or verified. Both variants can be described by the definition of the EdDSA signature scheme by using a different prehash function. In PureEdDSA the prehash function is simply the identity function. Another important variation in the EdDSA standard is the decoding of the signature. 
\cite{EPRINT:BJLSY15} describes two variations on how signatures can be decoded during verification. Both variations are described further in this section, as they have a major impact on the security of the EdDSA signature scheme. -This work will take a closer look at the \cma security of the EdDSA signature scheme. EdDSA was introduced as the Ed25519 signature scheme using the twisted Edwards curve Edwards25519, which is birationally equivalent to the Weierstrass curve Curve25519 \cite{JCEng:BDLSY12}. Later in 2015 the paper "EdDSA for more Curves" by Bernstein et al. introduces a more general version of EdDSA \cite{EPRINT:BJLSY15}. The paper also introduces a variant of EdDSA using prehashing. The RFC 8032 "Edwards-Curve Digital Signature Algorithm (EdDSA)" from 2017 specifies a version of EdDSA with the inclusion of an additional input parameter \textit{context} for the \sign and \verify procedure \cite{josefsson_edwards-curve_2017}. This version was also included into the FIPS 186-5 "Digital Signature Standard (DSS)" standard \cite{moody_digital_2023}. +There also exist two major standards for the EdDSA signature scheme. The first is RFC 8032, which was published by the IETF in 2017 \cite{josefsson_edwards-curve_2017}. In addition to publishing concrete parameterizations for the Ed25519 and Ed448 signature schemes, it also includes a variant of the EdDSA signature scheme that includes a context. The context is a separate string that can be used to separate the use of EdDSA between different protocols. As argued below, the inclusion of this context does not affect the security of the signature scheme and can be modeled as being part of the message. -In the prehashing variant of EdDSA the signature is calculated on the hash value of the message. The message is used twice during the generation of the signature. Thus the message needs to be buffered or transmitted twice during the generation of the signature. 
Therefore the prehashing variant offers an performance advantage on memory and bandwidth constraint devices. The context is an additional input parameter which has to be equal during generation and verification of the signature and is used to bind the signature to a given context. +The 2023 FIPS 186-5 standard \cite{moody_digital_2023} also includes the EdDSA signature scheme as specified in the RFC 8032. -In the following, when speaking from the EdDSA signature scheme, the original variant, without prehashing and context, is meant. It is argued that the context can be modeled as being part of the message. Regarding the prehashing a standard proof for prehashing variants of UF-CMA secure signature schemes can be used. In the case of EdDSA the prehash function $H'(\inp)$ is the identity function. Figure \ref{fig:eddsa} defines the EdDSA signature scheme -%TODO standard proof for prehashing referenzieren - -\subsection{EdDSA Parameter} - -The generic version of EdDSA from the "EdDSA for more Curves" paper, the RFC 8032 and the FIPS 186-5 standard is parameterized by the following 11 parameters \cite{EPRINT:BJLSY15} \cite{josefsson_edwards-curve_2017} \cite{moody_digital_2023}. - -The list of the parameters can be found in table \ref{tab:parameter}. - -\subsection{Encoding of Group Elements} - -The encoding function encodes points on the twisted Edwards cuve into b-bit bitstring and vice versa. It is assumed to be unambiguous, with each point on the twisted Edwards curve having exactly one bitstring representing that point and invalid bitstring being rejected during decoding of the point. This way by decoding a b-bit bitstring into an cuve point implicitly ensures that the decoded point is a valid point on the specified twisted Edwards curve. The requirement for this property comes from the specified encoding function in \cite{EPRINT:BJLSY15}. - -\subsection{Message Space} - -The message space $\messagespace$ is defined as a bitstring of arbitrary length. 
To make the proof also apply to the EdDSA variant with context the context can be modeled to be part of the message. - -Looking at the RFC and the FIPS standard the context is passed into a "dom" function which concatenates the context with some additional data. The resulting data is then passed as additional data to each hash function call during the generation and verification of the signature. Since the proofs are conducted in the random oracle model the position of the data in the hash function call and the actual content of the message and the context are not relevant for the output of the random oracle call. Unless the reduction explicitly uses the content of message, which they do not in this case. Therefore, the context can be modeled as being part of the message. - -\subsection{Signature} - -The signature is a defined as a $2b$ bitstrig of the encoded curve points $\groupelement{R}$ concatenated with the $b$-bit little endian encoding of the scalar $S$. - -$S$ being defined as the $b$-bit little endian encoding poses a problem. It might be possible that the decoded $S$ is larger than the order $L$ of the generator. The original paper \cite{EPRINT:BJLSY15} proposes two variants to handle decoded $S$ values, which are larger than $L$. The first approach is to replace $S$ with $S \pmod L$ and continue the verification of the signature. This will be called the lax parsing. The other approach is to reject all $S$ values larger than $L$ and fail the verification of the signature in that case. Parsing the integer like this will be called strict parsing. - -The later proofs show that those two approaches lead to different security properties of the signature. Using strict parsing results in the SUF-CMA security while using lax parsing "only" ensures EUF-CMA security. - -The RFC as well as the FIPS standard both require strict parsing. 
- -\subsection{Differences from Schnorr Signatures} - -As already pointed out in \cite{SP:BCJZ21} there are some minor differences from traditional Schnorr signature which prevent already existing proofs of the Schnorr signature scheme to be applied to EdDSA. This section points out the differences of the EdDSA signature scheme from traditional Schnorr signature scheme. - -\subsubsection{Group Structure} - -Unlike the standard Schnorr signature scheme, which was defined over a prime order group, the EdDSA signature scheme is defined over a prime order subgroup of a twisted Edwards curve. - -This might pose additional challenges since working with group elements outside the prime order subgroup might have some unintended side effects. In the proofs using the algebraic group model, where this might become relevant, it is argued that the additional group structure from the twisted Edwards curves do not pose an additional treat to the schema. - -\subsubsection{Private Key Clamping} - -Instead of choosing the secret scalar uniformly at random, as done in most other schemes, the secret scalar is generated by hashing a random bitstring, fixing some bits of the hash result to a specific value and then interpreting $n$ bits of the result as the little endian representation of an integer. - -To be more precise from the lower $b$ bit of the $2b$ bit the lowest $c$ bit are set to 0, where $c$ is the cofactor of the twisted Edwards cureve, and the $n$th bit is set to 1. Then the first $n$ bits are interpreted as the secret scalar $s$. - -This is strictly less secure, in the sense of the discrete logarithm problem, then choosing the secret scalar uniformly at random. It also makes proofs in the multi-user setting more challenging, since rerandomization of a public key is not easily possible and therefore the multi-user security of EdDSA can not easily be reduced onto the single-user security of EdDSA. 
- -To overcome this challenge specific variants of the discrete logarithm problem and the one-more discrete logarithm problem are introduced, which take the specific key generation into account. The hardness of those problems are then examined in the generic group model. - -Choosing the secret scalar like this is supposed to help make implementation constant time and to prevent the leakage of bits through side channel attacks. - -\subsubsection{Key Prefixing} - -The EdDSA signatur scheme also includes the public key as an additional input to the hash function, when generating the challenge. This change does not reduce the security of the signature scheme and mainly revolves around the multi-user security of the signature scheme. - -\subsubsection{Deterministic Nonce Generation} - -The commitment is chosen as the result of a hash function instead of uniformly at random on every signature generation. This makes the signature generation deterministic. Since the hash function is modeled as a random oracle the deterministic generation of the commitment does not pose an additional security risk, since it can be replaced with a random function, as shown in \ref{sec:eddsa'_proof}. +The EdDSA signature scheme is depicted in figure \ref{fig:eddsa}. % TODO: Ist das ok hier einfach zu kopieren? \begin{center} @@ -91,8 +34,6 @@ The commitment is chosen as the result of a hash function instead of uniformly a \end{table} \end{center} - - \begin{figure}[H] \hrule \begin{multicols}{3} @@ -127,14 +68,63 @@ The commitment is chosen as the result of a hash function instead of uniformly a \label{fig:eddsa} \end{figure} +\subsection{Encoding of Group Elements} + +The encoding function encodes points on the twisted Edwards curve into a b-bit bitstring and vice versa. It is assumed that when a b-bit bitstring is decoded, the resulting point is either a valid point on the twisted Edwards curve or the decoding will fail. 
In this way, decoding a b-bit bitstring into a curve point implicitly ensures that the decoded point is a valid point on the specified twisted Edwards curve. The encoding function does not ensure that each point has exactly one bitstring representation. This means that there may be multiple bitstrings mapping to the same curve point during decoding. The effect of this is included in the analysis. + +\subsection{Message Space} + +The message space $\messagespace$ is defined as the set of bitstrings of arbitrary length. To make the proof applicable to the EdDSA variant with context, the context can be modeled as part of the message. + +Looking at the RFC and FIPS standards, the context is passed to a "dom" function which concatenates the context with some additional data. The resulting data is then passed as additional data to each hash function call during signature generation and verification. Since the proofs are performed in the random oracle model, the position of the data in the hash function call, the actual content of the message, and the context are not relevant to the output of the random oracle call, unless the reduction explicitly uses the content of the message, which it does not in this case. Therefore, the context can be modeled as part of the message. +\subsection{Signature} + +The signature is defined as a $2b$-bit bitstring consisting of the encoded curve point $\groupelement{R}$ concatenated with the $b$-bit little-endian encoding of the scalar $S$. + +The fact that $S$ is defined as a $b$-bit little-endian encoding poses a problem. It is possible that the decoded $S$ is larger than the order $L$ of the generator. The original paper \cite{EPRINT:BJLSY15} proposes two ways to handle decoded $S$ values that are larger than $L$. The first approach is to replace $S$ with $S \pmod L$ and continue verifying the signature. This is called lax parsing. The other approach is to reject all $S$ values greater than $L$ and fail the signature verification in that case. 
Parsing the integer in this way is called strict parsing. + +The later proofs show that these two approaches lead to different security properties of the signature. Using strict parsing results in SUF-CMA security, while using lax parsing "only" ensures EUF-CMA security. + +Both the RFC and FIPS standards require strict parsing. + +\subsection{Differences from Schnorr Signatures} + +As pointed out in \cite{SP:BCJZ21}, there are some minor differences from the traditional Schnorr signature that prevent existing proofs of the Schnorr signature scheme from being applied to EdDSA. This section points out the differences between the EdDSA signature scheme and the traditional Schnorr signature scheme. + +\subsubsection{Group Structure} + +Unlike the standard Schnorr signature scheme, which is defined over a prime order group, the EdDSA signature scheme is defined over a prime order subgroup of a twisted Edwards curve. + +This may pose additional challenges, since working with group elements outside the prime order subgroup may have some unintended side effects. In the proofs using the algebraic group model, where this might become relevant, it is argued that the additional group structure of the twisted Edwards curves does not pose an additional threat to the scheme. + +\subsubsection{Private Key Clamping} + +Instead of choosing the secret scalar uniformly at random, as done in most other schemes, the secret scalar is generated by hashing a random bitstring, fixing some bits of the hash result to a specific value and then interpreting $n$ bits of the result as the little-endian representation of an integer. + +To be more precise, of the lower $b$ bits of the $2b$-bit hash value, the lowest $c$ bits are set to 0, where $2^c$ is the cofactor of the twisted Edwards curve, and the $n$th bit is set to 1. Then the first $n$ bits are interpreted as the secret scalar $s$. 
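+
+As a hedged illustration only, assume the Ed25519 instantiation of \cite{EPRINT:BJLSY15} with $b = 256$, $n = 254$ and $c = 3$ (these concrete values are an assumption of this example and are not fixed by this section). Writing $h_0, \dots, h_{2b-1}$ for the bits of the hash value, clamping clears $h_0, h_1, h_2$, sets bit $254$ and gives
+\[
+	s = 2^{254} + \sum_{i=3}^{253} 2^i h_i .
+\]
+Under these assumptions $s$ is always a multiple of $2^c = 8$ and lies in the interval $[2^{254}, 2^{255})$, so only $251$ bits of $s$ are actually random.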
+ +This is strictly less secure, in the sense of the discrete logarithm problem, than choosing the secret scalar uniformly at random. It also makes proofs in the multi-user setting more challenging, since rerandomization of a public key is not easily possible and therefore the multi-user security of EdDSA cannot be easily reduced onto the single-user security of EdDSA. + +To overcome this challenge, specific variants of the discrete logarithm problem and the one-more discrete logarithm problem are introduced that take into account the specific key generation. The hardness of these problems is then studied in the generic group model. + +Such a choice of the secret scalar should help to make the implementation constant time and to prevent the leakage of bits through side-channel attacks. + +\subsubsection{Key Prefixing} + +The EdDSA signature scheme also includes the public key as an additional input to the hash function when generating the challenge. This change does not reduce the security of the signature scheme and is mainly related to the multi-user security of the signature scheme. Whether key prefixing actually improves multi-user security is much debated \cite{EPRINT:Bernstein15} \cite{C:KilMasPan16}. + +\subsubsection{Deterministic Nonce Generation} + +The commitment is chosen as the result of a hash function instead of being chosen at random each time a signature is generated. This makes signature generation deterministic. Since the hash function is modeled as a random oracle, the deterministic generation of the commitment does not pose any additional security risk, since it can be replaced by a random function, as shown in \ref{sec:eddsa'_proof}. 
+ \subsection{Replacing Hash Function Calls} \label{sec:eddsa'_proof} -To make working with the random oracle easier in the following proofs a variant of the EdDSA signature scheme is introduced which has some calls to the random oracle replaced by directly sampling a value uniformly at random or using a random function. After that it will be shown that the advantage winning the \cma game in both versions of the signature scheme is roughly the same. +To make it easier to work with the random oracle in the following proofs, a variant of the EdDSA signature scheme is introduced in which some calls to the random oracle are replaced by directly sampling a value uniformly at random or by using a random function. It is then shown that the advantage of winning the \cma game is roughly the same in both versions of the signature scheme. \paragraph{\underline{Introducing EdDSA'}} -The EdDSA' signature scheme is depicted in figure \ref{fig:eddsa'}. The difference to the original EdDSA signature scheme is that the value $h$ is sampled uniformly at random from $\{0,1\}^{2b}$ and $r'$ is the result of a call to random function instead of the hash function. +The EdDSA' signature scheme is shown in figure \ref{fig:eddsa'}. The difference from the original EdDSA signature scheme is that the value $h$ is sampled uniformly at random from $\{0,1\}^{2b}$, and $r'$ is the result of a call to a random function instead of the hash function. \begin{figure} \hrule @@ -171,13 +161,12 @@ The EdDSA' signature scheme is depicted in figure \ref{fig:eddsa'}. The differen \label{theorem:adveddsa'} Let $\adversary{A}$ be and adversary against SUF-CMA security of the EdDSA signature scheme. Then - %TODO: richtigre Richtung? \[ \advantage{\text{EdDSA'},\adversary{A}}{\cma}(\secparamter) \leq \advantage{\text{EdDSA},\adversary{A}}{\cma}(\secparamter) + \frac{2 (\hashqueries + 1)}{2^b}. 
\] \end{theorem} \paragraph{\underline{Proof Overview}} -The different games used in the proof are depicted in figure \ref{fig:eddsa'games}. The proof uses the random oracle model. The main idea that the values $h$ and $r_i$ look uniformly random to the adversary if he never queries the random oracle with $k$ or a value starting with $h_b | ... | h_{2b-1}$. Therefor those calls to the random oracle can be replaced with the sampling of truly random values. +The different games used in the proof are depicted in figure \ref{fig:eddsa'games}. The proof uses the random oracle model. The main idea is that the values $h$ and $r_i$ look uniformly random to the adversary if it never queries the hash function with $k$ or a value starting with $h_b | ... | h_{2b-1}$. Since those values are unknown to the adversary, it can only guess them, which is unlikely to succeed due to their high entropy. For this reason, these calls to the hash function can be replaced by sampling truly random values. \paragraph{\underline{Formal Proof}} @@ -259,33 +248,33 @@ The different games used in the proof are depicted in figure \ref{fig:eddsa'game \end{figure} \begin{proof} - \item \paragraph{\underline{$G_0:$}} Let $G_0$ be defined in figure \ref{fig:eddsa'games} by excluding all boxes expect the black ones and $G_0$ be $\cma$. By definition, + \item The proof is conducted by gradually changing the game $G_0$, which is the SUF-CMA game for EdDSA, to $G_4$, which is the SUF-CMA game for EdDSA'. At each step it is argued that the change can be detected with at most negligible probability. + + \item \paragraph{\underline{$G_0:$}} Let $G_0$ be defined in figure \ref{fig:eddsa'games} by excluding all boxes except the black one. Clearly $G_0$ is the $\cma$ game for EdDSA. By definition, \[ \advantage{\text{EdDSA},\adversary{A}}{\cma}(\secparamter) = \Pr[\cma_{\text{EdDSA}}^{\adversary{A}} \Rightarrow 1] = \Pr[G_0^{\adversary{A}} \Rightarrow 1]. 
\] - \item \paragraph{\underline{$G_1:$}} Let $G_1$ be defined by additionally including all blue boxes and excluding the black boxes. This change inlines calls to the random oracle and introduces to if conditions in the random oracle which are setting a bad flag if the condition is triggert. Since the behavior of the game does not change the changes are conceptual and the probability of winning the game is not affected. Hence, + \item \paragraph{\underline{$G_1:$}} Let $G_1$ be defined by additionally including all blue boxes and excluding the black boxes. This change inlines the hash function calls and introduces two if conditions in the random oracle that set a bad flag if the abort condition is true. The inlining of the hash function calls ensures that the challenger does not trigger the abort conditions itself. Since the behavior of the game does not change, the changes are conceptual and the probability of winning the game is not affected. Hence, \[ \Pr[G_0^{\adversary{A}} \Rightarrow 1] = \Pr[G_1^{\adversary{A}} \Rightarrow 1]. \] - \item \paragraph{\underline{$G_2:$}} $G_2$ now introduces the abort condition in the red box. The game aborts if the flag $bad_1$ is set. For each individual query the $bad_1$ flag is set with a probability at most $\frac{1}{2^b}$. The flag is set if the message equals $k$. $k$ is a value chosen uniformly at random from $\{0,1\}^b$ and is hidden from the adversary. Therefor the adversary can can only guess this value. By the union bound over all hash queries $\hashqueries$ plus the one hash, which is performed by the challenger during signature verification, we obtain $\Pr[bad_1] \leq \frac{\hashqueries + 1}{2^b}$. Since $G_1$ and $G_2$ are identical-until-bad games regarding the $bad_1$ flag, we have + \item \paragraph{\underline{$G_2:$}} $G_2$ now introduces the abort instruction in the red box. The game is aborted if the flag $bad_1$ is set. 
This abort instruction ensures that the adversary will not be able to get the hash value for the secret key $k$. For each individual query, the $bad_1$ flag is set with a probability at most $\frac{1}{2^b}$. The flag is set if the input to the hash function is equal to $k$. $k$ is a value chosen uniformly at random from $\{0,1\}^b$ and is hidden from the adversary. Therefore, the adversary can only guess this value. By the union bound over all hash queries $\hashqueries$ plus the one, which is performed by the challenger during signature verification, we obtain $\Pr[bad_1] \leq \frac{\hashqueries + 1}{2^b}$. Since $G_1$ and $G_2$ are identical-until-bad games with respect to the $bad_1$ flag, we have \[ |\Pr[G_1^{\adversary{A}} \Rightarrow 1] - \Pr[G_2^{\adversary{A}} \Rightarrow 1]| \leq \Pr[bad_1] \leq \frac{\hashqueries + 1}{2^b}. \] - \item \paragraph{\underline{$G_3:$}} $G_3$ now also introduces the abort condition in the green box. This game also aborts if a message is queried which starts with $h_b | ... | h_{2b-1}$. For each individual query the $bad_2$ flag is set with a probability at most $\frac{1}{2^b}$. The value $h$ is the result of a random oracle call with $k$ as input. Since the adversary is not able to query the random oracle with input $k$, due to the abort condition introduced ion $G_2$, the adversary has no information on $h$. Therefor the adversary can only guess the value of $h$. By the union bound over all hash queries $\hashqueries$ plus the one hash, which is performed by the challenger during signature verification, we obtain $\Pr[bad_2] \leq \frac{\hashqueries + 1}{2^b}$. Since $G_2$ and $G_3$ are identical-until-bad games regarding the $bad_2$ flag, we have + \item \paragraph{\underline{$G_3:$}} $G_3$ now also introduces the abort instruction in the green box. This game also aborts if a message is queried that starts with $h_b | ... | h_{2b-1}$. 
This abort instruction ensures that the adversary cannot obtain the discrete logarithm of the commitments by querying the hash function. For each individual query, the $bad_2$ flag is set with a probability at most $\frac{1}{2^b}$. The value $h$ is the result of a random oracle call with $k$ as input. Since the adversary is unable to query the random oracle with input $k$ due to the abort condition introduced in $G_2$, the adversary has no information about $h$. Therefore, the adversary can only guess the value of $h$. By the union bound over all hash queries $\hashqueries$ plus the one hash, which is performed by the challenger during signature verification, we obtain $\Pr[bad_2] \leq \frac{\hashqueries + 1}{2^b}$. Since $G_2$ and $G_3$ are identical-until-bad games with respect to the $bad_2$ flag, we have \[ |\Pr[G_2^{\adversary{A}} \Rightarrow 1] - \Pr[G_3^{\adversary{A}} \Rightarrow 1]| \leq \Pr[bad_2] \leq \frac{\hashqueries + 1}{2^b}. \] - %TODO: Signatur von RF genauer beschreiben? - \item \paragraph{\underline{$G_4:$}} $G_4$ replaces the blue boxes in the main game and the \Osign oracle with the orange boxes. This change is only conceptual since the adversary is not able to query the random oracle with the inputs used for those calls and due to the nature of the random oracle model the adversary has no information on those values. Therefore, an adversary can not differentiate between the values being the result of the hash function or chosen uniformly at random. Hence, + \item \paragraph{\underline{$G_4:$}} $G_4$ replaces the blue boxes in the main game and the \Osign oracle with the orange boxes. With this change, the hash value $h$ and the discrete logarithm of the commitments $r'$ are randomly chosen instead of being the result of the hash function call. 
This change is only conceptual, since the aborts introduced in $G_2$ and $G_3$ ensure that the adversary cannot obtain these values from the hash function, and therefore these values are random to the adversary. Hence, \[ \Pr[G_3^{\adversary{A}} \Rightarrow 1] = \Pr[G_4^{\adversary{A}} \Rightarrow 1]. \] - \item Now $G_4$ is the same as SUF-CMA parameterized with EdDSA'. Therefore, we have + \item Now $G_4$ is the same as SUF-CMA parameterized with EdDSA'. So we have \[ \Pr[G_4^{\adversary{A}} \Rightarrow 1] = \advantage{\text{EdDSA'},\adversary{A}}{\cma}(\secparamter). \] \item This proves theorem \ref{theorem:adveddsa'}. \end{proof} -%TODO: Das kann man sicherlich schöner formulieren -In the following proofs when referring to the EdDSA signature scheme actually the EdDSA' signature scheme is used to make the proof more straight forward. In the end when calculating the loss due to the reduction the loss introduced by the EdDSA' signature scheme will be included. \ No newline at end of file +The proof for EUF-CMA security is the same as the proof for SUF-CMA security, with the only difference being the win condition for the adversary. Now that EdDSA' has been introduced, and it has been shown that an adversary cannot distinguish between these signature schemes in the SUF-CMA and EUF-CMA settings, the EdDSA' signature scheme is used instead of the EdDSA signature scheme for the proofs in the following section. Using EdDSA' makes the proofs in the random oracle model easier. \ No newline at end of file