
\section{EdDSA Signatures}
\label{sec:eddsa}

This section takes a closer look at the differences between the existing EdDSA specifications and points out the differences between the standards and the original Schnorr signature scheme. This section is partly inspired by \cite{SP:BCJZ21}.

As mentioned above, two papers by Bernstein et al. define the EdDSA signature scheme \cite{CHES:BDLSY11,EPRINT:BJLSY15}. The 2015 paper \cite{EPRINT:BJLSY15} describes a more generic version of the scheme than the original publication \cite{CHES:BDLSY11}. According to \cite{EPRINT:BJLSY15}, the EdDSA signature scheme is defined by 11 parameters, as shown in table \ref{tab:parameter}. The paper also describes two variants of EdDSA, called PureEdDSA and HashEdDSA. HashEdDSA is a prehashing variant of PureEdDSA: the message is hashed by a prehash function before it is signed or verified. Both variants are captured by the generic definition of EdDSA by choosing a different prehash function; in PureEdDSA the prehash function is simply the identity function. Another important variation in the EdDSA specification is the decoding of the signature. \cite{EPRINT:BJLSY15} describes two ways in which signatures can be decoded during verification. Both are described later in this section, as they have a major impact on the security of the EdDSA signature scheme.

There are also two major standards for the EdDSA signature scheme. The first is RFC 8032, which was published by the IETF in 2017 \cite{josefsson_edwards-curve_2017}. In addition to publishing concrete parameterizations for the Ed25519 and Ed448 signature schemes, it also specifies a variant of the EdDSA signature scheme that includes a context. The context is a separate string that can be used to separate the use of EdDSA between different protocols. As argued below, the inclusion of this context does not affect the security of the signature scheme and can be modeled as being part of the message.

The 2023 FIPS 186-5 standard \cite{moody_digital_2023} also includes the EdDSA signature scheme as specified in the RFC 8032.

The EdDSA signature scheme is depicted in figure \ref{fig:eddsa}.

% TODO: Is it OK to simply copy this here?
\begin{center}
	\begin{table}[!ht]
		\centering
		\begin{tabularx}{\textwidth}{@{}lX@{}}
			\textbf{Parameter} & \textbf{Description} \\
			\hline
			$q$ & An odd prime power $q$. EdDSA uses an elliptic curve over the finite field $\mathbb{F}_{q}$. \\
			$b$ & An integer $b$ with $2^{b-1} > q$. The bit size of encoded points on the twisted Edwards curve. \\
			$Enc(\inp)$ & A $(b-1)$-bit encoding of elements in the underlying finite field. \\
			$H(\inp)$ & A cryptographic hash function producing $2b$-bit output. \\
			$c$ & The base-2 logarithm of the cofactor of the twisted Edwards curve, i.e. the cofactor is $2^c$. \\
			$n$ & The number of bits used for the secret scalar of the public key. \\
			$a, d$ & The curve parameters of the twisted Edwards curve. \\
			$B$ & A generator point of the prime order subgroup of $E$. \\
			$L$ & The order of the prime order subgroup. \\
			$H'(\inp)$ & A prehash function applied to the message prior to applying the \sign or \verify procedure.
		\end{tabularx}
		\caption{Parameters of the EdDSA signature scheme}
		\label{tab:parameter}
	\end{table}
\end{center}

\begin{figure}[H]
	\hrule
	\begin{multicols}{3}
		\scriptsize
		\begin{algorithmic}[1]
			\Statex \underline{\textbf{\keygen}}
			\State $k \randomsample \{0,1\}^b$
			\State $(h_0, h_1, ..., h_{2b-1}) \assign H(k)$
			\State $s \leftarrow 2^n + \sum_{i=c}^{n-1} 2^i h_i$
			\State $A \assign sB$
			\State \Return (\encoded{$A$}, $k$)
		\end{algorithmic}
		\columnbreak
		\begin{algorithmic}[1]
			\Statex \underline{\textbf{\sign}($k$, $m$)}
			\State $(h_0, h_1, ..., h_{2b-1}) \assign H(k)$
			\State $s \leftarrow 2^n + \sum_{i=c}^{n-1} 2^i h_i$
			\State $(r'_0, r'_1, ..., r'_{2b-1}) \assign H(h_b | ... | h_{2b-1} | m)$
			\State $r \assign \sum_{i=0}^{2b-1} 2^i r'_i$
			\State $R \assign rB$
			\State $S \assign (r + sH(\encoded{R} | \encoded{A} | m)) \pmod L$
			\State \Return $\sigma \assign (\encoded{R}, S)$
		\end{algorithmic}
		\columnbreak
		\begin{algorithmic}[1]
			\Statex \underline{\textbf{\verify}($\encoded{A}, \sigma \assign (\encoded{R}, S), m$)}
			\State \Return $2^c SB \test 2^c R + 2^c H(\encoded{R} | \encoded{A} | m)A$
		\end{algorithmic}
	\end{multicols}
	\hrule
	\caption{Generic description of the algorithms \keygen, \sign and \verify used by the EdDSA signature scheme}
	\label{fig:eddsa}
\end{figure}
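
The correctness of \verify can be checked directly from the definitions in figure \ref{fig:eddsa}. The following is a short sketch of that calculation, where $t$ denotes the integer $H(\encoded{R} | \encoded{A} | m)$; the symbol $t$ is introduced here only for this calculation. Since $S \equiv r + st \pmod L$ and $B$ has order $L$, an honestly generated signature satisfies $SB = (r + st)B$, and therefore
\[ 2^c SB = 2^c (r + st) B = 2^c (rB) + 2^c t (sB) = 2^c R + 2^c t A, \]
which is exactly the equation tested by \verify.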

\subsection{Encoding of Group Elements}

The encoding function encodes points on the twisted Edwards curve into a $b$-bit bitstring and vice versa. It is assumed that decoding a $b$-bit bitstring either yields a valid point on the twisted Edwards curve or fails. In this way, decoding a $b$-bit bitstring into a curve point implicitly ensures that the decoded point is a valid point on the specified twisted Edwards curve. The encoding function does not ensure that each point has exactly one bitstring representation. This means that multiple bitstrings may map to the same curve point during decoding. The effect of this is included in the analysis.
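
One way in which such non-unique encodings can arise is sketched below. The sketch assumes a decoder that interprets the lower $b-1$ bits as an integer $y$ and reduces it modulo $q$ instead of rejecting values $y \geq q$; whether a concrete implementation behaves this way is an assumption made only for this illustration. For the Ed25519 parameters, $q = 2^{255} - 19$ and $b - 1 = 255$, so every $y \in \{0, \dots, 18\}$ satisfies
\[ y + q \leq 18 + 2^{255} - 19 = 2^{255} - 1 < 2^{255}. \]
Hence $y$ and $y + q$ are two distinct $255$-bit strings that represent the same element of $\mathbb{F}_q$, and a point with such a coordinate has more than one bitstring that such a decoder accepts.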

\subsection{Message Space}

The message space $\messagespace$ is defined as a bitstring of arbitrary length. To make the proof applicable to the EdDSA variant with context, the context can be modeled as part of the message.

Looking at the RFC and FIPS standards, the context is passed to a ``dom'' function, which concatenates the context with some additional data. The result is then passed as additional input to each hash function call during signature generation and verification. Since the proofs are performed in the random oracle model, the position of the data in the hash function input, the actual content of the message, and the context are not relevant to the output of the random oracle, unless the reduction explicitly uses the content of the message, which it does not in this case. Therefore, the context can be modeled as part of the message.

\subsection{Signature}

The signature is defined as a $2b$-bit bitstring consisting of the encoding of the curve point $\groupelement{R}$ concatenated with the $b$-bit little-endian encoding of the scalar $S$.

The fact that $S$ is transmitted as a $b$-bit little-endian encoding poses a problem: the decoded $S$ may be greater than or equal to the order $L$ of the generator. The paper \cite{EPRINT:BJLSY15} proposes two ways to handle such values. The first approach is to replace $S$ with $S \pmod L$ and continue verifying the signature. This is called lax parsing. The other approach is to reject every $S \geq L$ and fail the signature verification in that case. This is called strict parsing.

The later proofs show that these two approaches lead to different security properties of the signature scheme. Using strict parsing results in SUF-CMA security, while using lax parsing ``only'' ensures EUF-CMA security.
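
The following short calculation illustrates why lax parsing cannot give SUF-CMA security. It is a sketch under the assumption that $S + L < 2^b$, so that $S + L$ still fits into the $b$-bit encoding; this holds, for example, for Ed25519, where $b = 256$ and $L < 2^{253}$. Let $\sigma = (\encoded{R}, S)$ be a valid signature on $m$ with $S < L$, and define $\sigma' = (\encoded{R}, S + L)$. Under lax parsing the verifier first computes
\[ (S + L) \bmod L = S, \]
so $\sigma'$ is accepted whenever $\sigma$ is accepted, although $\sigma' \neq \sigma$ as bitstrings. Under strict parsing $\sigma'$ is rejected, because $S + L \geq L$.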

Both the RFC and FIPS standards require strict parsing.

\subsection{Differences from Schnorr Signatures}

As pointed out in \cite{SP:BCJZ21}, there are some minor differences from the traditional Schnorr signature that prevent existing proofs of the Schnorr signature scheme from being applied to EdDSA. This section points out the differences between the EdDSA signature scheme and the traditional Schnorr signature scheme.

\subsubsection{Group Structure}

Unlike the standard Schnorr signature scheme, which is defined over a prime order group, the EdDSA signature scheme is defined over a prime order subgroup of a twisted Edwards curve.

This may pose additional challenges, since working with group elements outside the prime order subgroup may have some unintended side effects. In the proofs using the algebraic group model, where this might become relevant, it is argued that the additional group structure of the twisted Edwards curves does not pose an additional threat to the scheme.

\subsubsection{Private Key Clamping}

Instead of choosing the secret scalar uniformly at random, as done in most other schemes, the secret scalar is generated by hashing a random bitstring, fixing some bits of the hash result to a specific value and then interpreting $n$ bits of the result as the little endian representation of an integer.

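To be more precise, consider the lower $b$ bits of the $2b$-bit hash value. The lowest $c$ bits are set to 0, where $2^c$ is the cofactor of the twisted Edwards curve, the bit at position $n$ is set to 1, and all higher bits are discarded. The result is interpreted as the little-endian representation of the secret scalar, that is, $s = 2^n + \sum_{i=c}^{n-1} 2^i h_i$.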

This is strictly less secure, in the sense of the discrete logarithm problem, than choosing the secret scalar uniformly at random. It also makes proofs in the multi-user setting more challenging, since rerandomization of a public key is not easily possible and therefore the multi-user security of EdDSA cannot be easily reduced to the single-user security of EdDSA.

To overcome this challenge, specific variants of the discrete logarithm problem and the one-more discrete logarithm problem are introduced that take into account the specific key generation. The hardness of these problems is then studied in the generic group model.

Such a choice of the secret scalar should help to make the implementation constant time and to prevent the leakage of bits through side-channel attacks.
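
As a concrete illustration of the clamping described above, consider the Ed25519 parameters from RFC 8032, namely $b = 256$, $c = 3$ and $n = 254$. These values are used here only as an example. The secret scalar then has the form
\[ s = 2^{254} + \sum_{i=3}^{253} 2^i h_i, \qquad 2^{254} \leq s < 2^{255}, \qquad s \equiv 0 \pmod 8. \]
Only the bits $h_3, \dots, h_{253}$ are free, so $s$ takes one of $2^{n-c} = 2^{251}$ integer values, all of which are multiples of the cofactor $2^c = 8$.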

\subsubsection{Key Prefixing}

The EdDSA signature scheme also includes the public key as an additional input to the hash function when generating the challenge. This change does not reduce the security of the signature scheme and is mainly related to the multi-user security of the signature scheme. Whether key prefixing actually improves multi-user security is much debated \cite{EPRINT:Bernstein15,C:KilMasPan16}.

\subsubsection{Deterministic Nonce Generation}

The commitment is derived from the output of a hash function instead of being chosen at random each time a signature is generated. This makes signature generation deterministic. Since the hash function is modeled as a random oracle, the deterministic generation of the commitment does not pose any additional security risk, since the hash function call can be replaced by a random function, as shown in section \ref{sec:eddsa'_proof}.

\subsection{Replacing Hash Function Calls}
\label{sec:eddsa'_proof}

To make it easier to work with the random oracle, the following proofs introduce a variant of the EdDSA signature scheme in which some calls to the random oracle are replaced by direct sampling of a value at random or by using a random function. It is then shown that the advantage of winning the \cma game is roughly the same in both versions of the signature scheme.

\paragraph{\underline{Introducing EdDSA'}}

The EdDSA' signature scheme is shown in figure \ref{fig:eddsa'}. The difference from the original EdDSA signature scheme is that the value $h$ is sampled uniformly at random from $\{0,1\}^{2b}$, and $r'$ is the result of a call to a random function $RF$ instead of the hash function.

\begin{figure}
	\hrule
	\begin{multicols}{3}
		\scriptsize
		\begin{algorithmic}[1]
			\Statex \underline{\textbf{\keygen}}
			\State $(h_0, h_1, ..., h_{2b-1}) \randomsample \{0,1\}^{2b}$
			\State $s \leftarrow 2^n + \sum_{i=c}^{n-1} 2^i h_i$
			\State $A \assign sB$
			\State \Return (\encoded{$A$}, $k \assign (s, h_b | ... | h_{2b-1})$)
		\end{algorithmic}
		\columnbreak
		\begin{algorithmic}[1]
			\Statex \underline{\textbf{\sign}($k \assign (s, h_b | ... | h_{2b-1})$, $m$)}
			\State $(r'_0, r'_1, ..., r'_{2b-1}) \assign RF(h_b | ... | h_{2b-1} | m)$
			\State $r \assign \sum_{i=0}^{2b-1} 2^i r'_i$
			\State $R \assign rB$
			\State $S \assign (r + sH(\encoded{R} | \encoded{A} | m)) \pmod L$
			\State \Return $\sigma \assign (\encoded{R}, S)$
		\end{algorithmic}
		\columnbreak
		\begin{algorithmic}[1]
			\Statex \underline{\textbf{\verify}($\encoded{A}, \sigma \assign (\encoded{R}, S), m$)}
			\State \Return $2^c SB \test 2^c R + 2^c H(\encoded{R} | \encoded{A} | m)A$
		\end{algorithmic}
	\end{multicols}
	\hrule
	\caption{Generic description of the algorithms \keygen, \sign and \verify used by the EdDSA' signature scheme}
	\label{fig:eddsa'}
\end{figure}

\begin{theorem}
	\label{theorem:adveddsa'}
	Let $\adversary{A}$ be an adversary against the SUF-CMA security of the EdDSA signature scheme. Then

	\[ \advantage{\text{EdDSA'},\adversary{A}}{\cma}(\secparamter) \leq \advantage{\text{EdDSA},\adversary{A}}{\cma}(\secparamter) + \frac{2 (\hashqueries + 1)}{2^b}. \]
\end{theorem}

\paragraph{\underline{Proof Overview}}

The different games used in the proof are depicted in figure \ref{fig:eddsa'games}. The proof uses the random oracle model. The main idea is that the values $h$ and $r'$ look uniformly random to the adversary as long as it never queries the hash function with $k$ or with a value starting with $h_b | ... | h_{2b-1}$. Since these values are unknown to the adversary, it can only guess them, which succeeds with small probability due to their high entropy. For this reason, these calls to the hash function can be replaced by sampling truly random values.

\paragraph{\underline{Formal Proof}}

\begin{figure}
	\hrule
	\begin{multicols}{2}
		\large
		\begin{algorithmic}[1]
			\Statex \underline{\game $G_0$ / \textcolor{blue}{$G_1$} / \textcolor{red}{$G_2$} / \textcolor{green}{$G_3$} / \textcolor{orange}{$G_4$}}
			\State $k \randomsample \{0,1\}^b$
			\BeginBox[draw=black]
			\State $(h_0, h_1, ..., h_{2b-1}) \assign H(k)$
			\Comment{$G_0$}
			\EndBox
			\BeginBox[draw=blue]
			\State $\textbf{if } \sum[k] = \bot \textbf{ then}$
			\Comment{$G_1 - G_3$}
			\State \quad $\sum[k] \randomsample \{0,1\}^{2b}$
			\State $(h_0, h_1, ..., h_{2b-1}) \assign \sum[k]$
			\EndBox
			\BeginBox[draw=orange]
			\State $(h_0, h_1, ..., h_{2b-1}) \randomsample \{0,1\}^{2b}$
			\Comment{$G_4$}
			\EndBox
			\State $s \leftarrow 2^n + \sum_{i=c}^{n-1} 2^i h_i$
			\State $A \assign sB$
			\State $(\m^*, \signature^*) \randomassign \adversary{A}^{H(\inp), \sign(\inp)}(A)$
			\State \Return $\verify(A, \m^*,\signature^*) \wedge (\m^*, \signature^*) \notin \pset{Q}$
		\end{algorithmic}
		\columnbreak
		\begin{algorithmic}[1]
			\Statex \underline{\oracle \sign($m \in \messagespace$)}
			\BeginBox[draw=black]
			\State $(r'_0, r'_1, ..., r'_{2b-1}) \assign H(h_b | ... | h_{2b-1} | m)$
			\Comment{$G_0$}
			\EndBox
			\BeginBox[draw=blue]
			\State $\textbf{if } \sum[h_b | ... | h_{2b-1} | m] = \bot \textbf{ then}$
			\Comment{$G_1 - G_3$}
			\State \quad $\sum[h_b | ... | h_{2b-1} | m] \randomsample \{0,1\}^{2b}$
			\State $(r'_0, r'_1, ..., r'_{2b-1}) \assign \sum[h_b | ... | h_{2b-1} | m]$
			\EndBox
			\BeginBox[draw=orange]
			\State $(r'_0, r'_1, ..., r'_{2b-1}) \assign RF(h_b | ... | h_{2b-1} | m)$
			\Comment{$G_4$}
			\EndBox
			\State $r \assign \sum_{i=0}^{2b-1} 2^i r'_i$
			\State $R \assign rB$
			\State $S \assign (r + sH(\encoded{R} | \encoded{A} | m)) \pmod L$
			\State $\signature \assign (\encoded{R}, S)$
			\State $\pset{Q} \assign \pset{Q} \cup \{(\m, \signature)\}$
			\State \Return $\signature$
		\end{algorithmic}
	\end{multicols}
	\begin{algorithmic}[1]
		\Statex \underline{\oracle $H(m \in \{0,1\}^*)$}
		\BeginBox[draw=blue]
		\State $\textbf{if } m = k \textbf{ then}$
		\Comment{$G_1 - G_4$}
		\State \quad $bad_1 \assign true$
		\BeginBox[draw=red,dashed]
		\State \quad $abort$
		\Comment{$G_2 - G_4$}
		\EndBox
		\State $\textbf{if } m \text{ starts with } h_b|...|h_{2b-1} \textbf{ then}$
		\State \quad $bad_2 \assign true$
		\BeginBox[draw=green,dashed]
		\State \quad $abort$
		\Comment{$G_3 - G_4$}
		\EndBox
		\EndBox
		\State $\textbf{if } \sum[m] = \bot \textbf{ then}$
		\State \quad $\sum[m] \randomsample \{0,1\}^{2b}$
		\State \Return $\sum[m]$
	\end{algorithmic}
	\hrule
	\caption{Games $G_0$ to $G_4$}
	\label{fig:eddsa'games}
\end{figure}

\begin{proof}
	\item The proof will be conducted by gradually changing the game $G_0$, which is the SUF-CMA game for EdDSA, to $G_4$, which is the SUF-CMA game for EdDSA'. At each step it is argued that the change can be detected with at most negligible probability.

	\item \paragraph{\underline{$G_0:$}} Let $G_0$ be defined in figure \ref{fig:eddsa'games} by excluding all boxes except the black ones. Clearly, $G_0$ is the $\cma$ game for EdDSA. By definition,

	\[ \advantage{\text{EdDSA},\adversary{A}}{\cma}(\secparamter) = \Pr[\cma_{\text{EdDSA}}^{\adversary{A}} \Rightarrow 1] = \Pr[G_0^{\adversary{A}} \Rightarrow 1]. \]

	\item \paragraph{\underline{$G_1:$}} Let $G_1$ be defined by additionally including all blue boxes and excluding the black boxes. This change inlines the hash function calls and introduces two if conditions in the random oracle that set a bad flag if the respective abort condition is true. Inlining the hash function calls ensures that the challenger does not trigger the abort conditions itself. Since the behavior of the game does not change, the changes are purely conceptual and the probability of winning the game is not affected. Hence,

	\[ \Pr[G_0^{\adversary{A}} \Rightarrow 1] = \Pr[G_1^{\adversary{A}} \Rightarrow 1]. \]

	\item \paragraph{\underline{$G_2:$}} $G_2$ now introduces the abort instruction in the red box. The game is aborted if the flag $bad_1$ is set. This abort instruction ensures that the adversary cannot obtain the hash value of the secret key $k$. The flag is set if the input to the hash function is equal to $k$. Since $k$ is chosen uniformly at random from $\{0,1\}^b$ and is hidden from the adversary, the adversary can only guess this value. For each individual query, the $bad_1$ flag is therefore set with probability at most $\frac{1}{2^b}$. By the union bound over all $\hashqueries$ hash queries plus the one performed by the challenger during signature verification, we obtain $\Pr[bad_1] \leq \frac{\hashqueries + 1}{2^b}$. Since $G_1$ and $G_2$ are identical-until-bad games with respect to the $bad_1$ flag, we have

	\[ |\Pr[G_1^{\adversary{A}} \Rightarrow 1] - \Pr[G_2^{\adversary{A}} \Rightarrow 1]| \leq \Pr[bad_1] \leq \frac{\hashqueries + 1}{2^b}. \]

	\item \paragraph{\underline{$G_3:$}} $G_3$ now also introduces the abort instruction in the green box. The game is aborted if the hash function is queried on a value that starts with $h_b | ... | h_{2b-1}$. This abort instruction ensures that the adversary cannot obtain the discrete logarithm of the commitments by querying the hash function. The value $h$ is the result of a random oracle call with $k$ as input. Since the adversary is unable to query the random oracle with input $k$ due to the abort condition introduced in $G_2$, the adversary has no information about $h$ and can only guess its value. For each individual query, the $bad_2$ flag is therefore set with probability at most $\frac{1}{2^b}$. By the union bound over all $\hashqueries$ hash queries plus the one performed by the challenger during signature verification, we obtain $\Pr[bad_2] \leq \frac{\hashqueries + 1}{2^b}$. Since $G_2$ and $G_3$ are identical-until-bad games with respect to the $bad_2$ flag, we have

	\[ |\Pr[G_2^{\adversary{A}} \Rightarrow 1] - \Pr[G_3^{\adversary{A}} \Rightarrow 1]| \leq \Pr[bad_2] \leq \frac{\hashqueries + 1}{2^b}. \]

	\item \paragraph{\underline{$G_4:$}} $G_4$ replaces the blue boxes in the main game and the \Osign oracle with the orange boxes. With this change, the hash value $h$ is sampled at random and the discrete logarithm $r'$ of each commitment is the output of a random function, instead of both being the result of a hash function call. This change is only conceptual, since the aborts introduced in $G_2$ and $G_3$ ensure that the adversary cannot obtain these values from the hash function, and therefore these values are random to the adversary. Hence,

	\[ \Pr[G_3^{\adversary{A}} \Rightarrow 1] = \Pr[G_4^{\adversary{A}} \Rightarrow 1]. \]

	\item Now $G_4$ is the same as SUF-CMA parameterized with EdDSA'. So we have

	\[ \Pr[G_4^{\adversary{A}} \Rightarrow 1] = \advantage{\text{EdDSA'},\adversary{A}}{\cma}(\secparamter). \]

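	\item Since $\Pr[G_0^{\adversary{A}} \Rightarrow 1] = \Pr[G_1^{\adversary{A}} \Rightarrow 1]$ and $\Pr[G_3^{\adversary{A}} \Rightarrow 1] = \Pr[G_4^{\adversary{A}} \Rightarrow 1]$, combining the bounds for $G_2$ and $G_3$ with the triangle inequality yields

	\[ |\Pr[G_0^{\adversary{A}} \Rightarrow 1] - \Pr[G_4^{\adversary{A}} \Rightarrow 1]| \leq \frac{\hashqueries + 1}{2^b} + \frac{\hashqueries + 1}{2^b} = \frac{2 (\hashqueries + 1)}{2^b}. \]

	\item This proves theorem \ref{theorem:adveddsa'}.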
\end{proof}
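
To get a feeling for the size of the additive term in theorem \ref{theorem:adveddsa'}, it can be evaluated for illustrative values. The choices $b = 256$ (as for Ed25519) and $\hashqueries = 2^{64}$ are assumptions made only for this example:
\[ \frac{2 (\hashqueries + 1)}{2^b} = \frac{2^{65} + 2}{2^{256}} < \frac{2^{66}}{2^{256}} = 2^{-190}. \]
Under these assumptions the loss incurred by switching from EdDSA to EdDSA' is far below any practically relevant advantage.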

The proof for EUF-CMA security is the same as the proof for SUF-CMA security, the only difference being the winning condition for the adversary. Now that EdDSA' has been introduced, and it has been shown that an adversary cannot distinguish between these signature schemes in the SUF-CMA and EUF-CMA settings, the EdDSA' signature scheme is used instead of the EdDSA signature scheme for the proofs in the following section. Using EdDSA' makes the proofs in the random oracle model easier.