
\section{EdDSA Signatures}
\label{sec:eddsa}

% TODO: Reference the first paper from 2011, or rather the journal paper from 2012?

This section takes a closer look at the existing specifications of the EdDSA signature scheme and specifies a version which will be analyzed in this thesis.

This thesis analyzes the \cma security of the EdDSA signature scheme. EdDSA was first introduced as the Ed25519 signature scheme, which uses the twisted Edwards curve Edwards25519, a curve that is birationally equivalent to the Montgomery curve Curve25519 \cite{JCEng:BDLSY12}. In 2015, the paper ``EdDSA for more curves'' by Bernstein et al. introduced a more general version of EdDSA together with a variant of EdDSA using prehashing \cite{EPRINT:BJLSY15}. RFC 8032, ``Edwards-Curve Digital Signature Algorithm (EdDSA)'', published in 2017, specifies a version of EdDSA with an additional input parameter \textit{context} for the \sign and \verify procedures \cite{josefsson_edwards-curve_2017}. This version was also adopted in the FIPS 186-5 ``Digital Signature Standard (DSS)'' \cite{moody_digital_2023}.

In the original variant, the message is used twice during the generation of the signature, so it has to be buffered or transmitted twice. In the prehashing variant of EdDSA, the signature is instead calculated on the hash value of the message, which requires only a single pass over the message. The prehashing variant therefore offers a performance advantage on memory- and bandwidth-constrained devices. The context is an additional input parameter that has to be identical during generation and verification of the signature and binds the signature to a given context.

In the following, the term EdDSA signature scheme refers to the original variant without prehashing and without context. It is argued that the context can be modeled as being part of the message. For prehashing, a standard proof for prehashing variants of UF-CMA secure signature schemes can be used. In the original variant of EdDSA the prehash function $H'(\inp)$ is the identity function. Figure~\ref{fig:eddsa} defines the EdDSA signature scheme.
% TODO: cite the standard proof for prehashing

\subsection{EdDSA Parameter}

The generic version of EdDSA from the ``EdDSA for more curves'' paper, RFC 8032 and the FIPS 186-5 standard is parameterized by 11 parameters \cite{EPRINT:BJLSY15} \cite{josefsson_edwards-curve_2017} \cite{moody_digital_2023}.

The parameters are listed in Table~\ref{tab:parameter}.

\subsection{Encoding of Group Elements}

The encoding function maps points on the twisted Edwards curve to $b$-bit bitstrings, and the decoding function maps bitstrings back to points. The encoding is assumed to be unambiguous: each point on the twisted Edwards curve is represented by exactly one bitstring, and invalid bitstrings are rejected during decoding. Decoding a $b$-bit bitstring into a curve point therefore implicitly ensures that the decoded point is a valid point on the specified twisted Edwards curve. This requirement stems from the encoding function specified in \cite{EPRINT:BJLSY15}.

\subsection{Message Space}

The message space $\messagespace$ is defined as the set of bitstrings of arbitrary length. To make the proofs also apply to the EdDSA variant with context, the context is modeled as part of the message.

In the RFC and the FIPS standard, the context is passed into a ``dom'' function, which concatenates the context with some additional data. The result is then passed as additional input to every hash function call during the generation and verification of the signature. Since the proofs are conducted in the random oracle model, neither the position of the data in the hash input nor the actual content of the message and the context is relevant for the output of the random oracle, as long as the reductions do not explicitly use the content of the message, which they do not in this case. Therefore, the context can be modeled as being part of the message.
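
For illustration, the challenge hash of the context variants in RFC 8032 has the form shown below. This is only a sketch of why the context can be absorbed into the message. The symbols $F$ (prehash flag), $C$ (context) and $m'$ are notation introduced for this example and are not used elsewhere in this thesis.
\[ H(\mathrm{dom}(F, C) \,|\, \encoded{R} \,|\, \encoded{A} \,|\, m) \]
In the random oracle model, this call can be read as a query on $\encoded{R} | \encoded{A}$ together with an extended message $m' = (F, C, m)$, assuming that $m'$ can be recovered unambiguously from the hash input. This is plausible here because $\mathrm{dom}(F, C)$ includes the length of $C$ and both $\encoded{R}$ and $\encoded{A}$ have the fixed length $b$.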

\subsection{Signature}

The signature is defined as a $2b$-bit bitstring consisting of the encoded curve point $\groupelement{R}$ concatenated with the $b$-bit little-endian encoding of the scalar $S$.

Defining $S$ via its $b$-bit little-endian encoding poses a problem: the decoded value of $S$ may be greater than or equal to the order $L$ of the generator. The original paper \cite{EPRINT:BJLSY15} proposes two ways to handle such values. The first approach replaces $S$ with $S \bmod L$ and continues the verification of the signature. This is called lax parsing. The second approach rejects every $S \geq L$ and lets the verification of the signature fail in that case. This is called strict parsing.

The later proofs show that these two approaches lead to different security properties of the signature scheme: strict parsing yields SUF-CMA security, whereas lax parsing only ensures EUF-CMA security.
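
The reason for this difference can be illustrated with a short calculation. It is only a sketch and assumes that $S + L < 2^b$, so that $S + L$ still has a valid $b$-bit encoding. Let $(\encoded{R}, S)$ be a valid signature with $S < L$. Since $B$ has order $L$, the point $LB$ is the neutral element and
\[ 2^c (S + L) B = 2^c SB + 2^c LB = 2^c SB . \]
With lax parsing, $(\encoded{R}, S + L)$ is therefore a second valid signature for the same message, which rules out strong unforgeability. Strict parsing rejects this signature because $S + L \geq L$.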

The RFC as well as the FIPS standard both require strict parsing.

\subsection{Differences from Schnorr Signatures}

As already pointed out in \cite{SP:BCJZ21}, EdDSA differs from traditional Schnorr signatures in a few minor ways, which prevent existing proofs for the Schnorr signature scheme from being applied directly to EdDSA. This section points out these differences.

\subsubsection{Group Structure}

Unlike the standard Schnorr signature scheme, which was defined over a prime order group, the EdDSA signature scheme is defined over a prime order subgroup of a twisted Edwards curve.

This might pose additional challenges, since working with group elements outside the prime order subgroup can have unintended side effects. In the proofs using the algebraic group model, where this becomes relevant, it is argued that the additional group structure of the twisted Edwards curve does not pose an additional threat to the scheme.

\subsubsection{Private Key Clamping}

Instead of choosing the secret scalar uniformly at random, as done in most other schemes, the secret scalar is generated by hashing a random bitstring, fixing some bits of the hash result to a specific value and then interpreting $n$ bits of the result as the little endian representation of an integer.

More precisely, only the lower $b$ bits $h_0, \ldots, h_{b-1}$ of the $2b$-bit hash value are used for the secret scalar. The lowest $c$ bits are set to 0, where $2^c$ is the cofactor of the twisted Edwards curve, and bit $n$ is set to 1. The bits $0$ to $n$ are then interpreted as the little-endian representation of the secret scalar $s$.
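
As a concrete instance, consider Ed25519 with the parameters $b = 256$, $c = 3$ and $n = 254$ from RFC 8032. The following values follow directly from the definition of $s$ in \keygen and are given for illustration only:
\[ s = 2^{254} + \sum_{i=3}^{253} 2^i h_i, \qquad s \in \{2^{254},\, 2^{254} + 8,\, \ldots,\, 2^{255} - 8\} . \]
Every secret scalar is thus a multiple of $2^c = 8$, and the $251$ free bits $h_3, \ldots, h_{253}$ yield $2^{251}$ possible values.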

In terms of the discrete logarithm problem, this is strictly less secure than choosing the secret scalar uniformly at random. It also makes proofs in the multi-user setting more challenging: since a public key cannot easily be rerandomized, the multi-user security of EdDSA cannot easily be reduced to its single-user security.

To overcome this challenge, specific variants of the discrete logarithm problem and the one-more discrete logarithm problem are introduced, which take the specific key generation into account. The hardness of those problems is then examined in the generic group model.

Choosing the secret scalar in this way is intended to help make implementations constant time and to prevent the leakage of bits through side-channel attacks.

\subsubsection{Key Prefixing}

The EdDSA signature scheme also includes the public key as an additional input to the hash function when generating the challenge. This change does not reduce the security of the signature scheme and is mainly relevant for its multi-user security.

\subsubsection{Deterministic Nonce Generation}

The commitment is derived from the output of a hash function instead of being chosen uniformly at random for every signature. This makes the signature generation deterministic. Since the hash function is modeled as a random oracle, the deterministic generation of the commitment does not pose an additional security risk: the hash call can be replaced with a random function, as shown in Section~\ref{sec:eddsa'_proof}.

% TODO: Is it OK to simply copy this here?
\begin{center}
	\begin{table}[!ht]
		\centering
		\begin{tabularx}{\textwidth}{@{}lX@{}}
			\textbf{Parameter} & \textbf{Description} \\
			\hline
			$q$ & An odd prime power $q$. EdDSA uses an elliptic curve over the finite field $\field{q}$. \\
			$b$ & An integer $b$ with $2^{b-1} > q$. The bit size of encoded points on the twisted Edwards curve. \\
			$Enc(\inp)$ & A $(b-1)$-bit encoding of elements in the underlying finite field. \\
			$H(\inp)$ & A cryptographic hash function producing $2b$-bit output. \\
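			$c$ & The base-2 logarithm of the cofactor of the twisted Edwards curve, i.e. the cofactor is $2^c$. \\
			$n$ & The bit position of the top bit of the secret scalar. Secret scalars have exactly $n+1$ bits, with bit $n$ always set and the lowest $c$ bits always cleared. \\
			$a, d$ & The curve parameters of the twisted Edwards curve. \\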
			$B$ & A generator point of the prime order subgroup of $E$. \\
			$L$ & The order of the prime order subgroup. \\
			$H'(\inp)$ & A prehash function applied to the message prior to applying the \sign or \verify procedure.
		\end{tabularx}
		\caption{Parameter of the EdDSA signature scheme}
		\label{tab:parameter}
	\end{table}
\end{center}


\begin{figure}[H]
	\hrule
	\begin{multicols}{3}
		\scriptsize
		\begin{algorithmic}[1]
			\Statex \underline{\textbf{\keygen}}
			\State $k \randomsample \{0,1\}^b$
			\State $(h_0, h_1, ..., h_{2b-1}) \assign H(k)$
			\State $s \leftarrow 2^n + \sum_{i=c}^{n-1} 2^i h_i$
			\State $A \assign sB$
			\State \Return (\encoded{$A$}, $k$)
		\end{algorithmic}
		\columnbreak
		\begin{algorithmic}[1]
			\Statex \underline{\textbf{\sign}($k$, $m$)}
			\State $(h_0, h_1, ..., h_{2b-1}) \assign H(k)$
			\State $s \leftarrow 2^n + \sum_{i=c}^{n-1} 2^i h_i$
			\State $(r'_0, r'_1, ..., r'_{2b-1}) \assign H(h_b | ... | h_{2b-1} | m)$
			\State $r \assign \sum_{i=0}^{2b-1} 2^i r'_i$
			\State $R \assign rB$
			\State $S \assign (r + sH(\encoded{R} | \encoded{A} | m)) \pmod L$
			\State \Return $\sigma \assign (\encoded{R}, S)$
		\end{algorithmic}
		\columnbreak
		\begin{algorithmic}[1]
			\Statex \underline{\textbf{\verify}($\encoded{A}, \sigma \assign (\encoded{R}, S), m$)}
			\State \Return $2^c SB \test 2^c R + 2^c H(\encoded{R} | \encoded{A} | m)A$
		\end{algorithmic}
	\end{multicols}
	\hrule
	\caption{Generic description of the algorithms \keygen, \sign and \verify used by the EdDSA signature scheme}
	\label{fig:eddsa}
\end{figure}
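
The correctness of the scheme in Figure~\ref{fig:eddsa} can be checked with a short calculation. As a shorthand introduced only for this check, let $e$ denote the integer interpretation of $H(\encoded{R} | \encoded{A} | m)$. Since $S \equiv r + se \pmod L$ and $LB$ is the neutral element, an honestly generated signature satisfies
\[ 2^c SB = 2^c (r + se) B = 2^c rB + 2^c e (sB) = 2^c R + 2^c eA , \]
which is exactly the equation tested by \verify.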

\subsection{Replacing Hash Function Calls}
\label{sec:eddsa'_proof}

To simplify working with the random oracle in the following proofs, a variant of the EdDSA signature scheme is introduced in which some calls to the random oracle are replaced by directly sampling a value uniformly at random or by a call to a random function. It is then shown that the advantage of an adversary in winning the \cma game is roughly the same for both versions of the signature scheme.

\paragraph{\underline{Introducing EdDSA'}}

The EdDSA' signature scheme is depicted in Figure~\ref{fig:eddsa'}. It differs from the original EdDSA signature scheme in that the value $h$ is sampled uniformly at random from $\{0,1\}^{2b}$ and $r'$ is the result of a call to a random function $RF$ instead of the hash function.

\begin{figure}
	\hrule
	\begin{multicols}{3}
		\scriptsize
		\begin{algorithmic}[1]
			\Statex \underline{\textbf{\keygen}}
			\State $(h_0, h_1, ..., h_{2b-1}) \randomsample \{0,1\}^{2b}$
			\State $s \leftarrow 2^n + \sum_{i=c}^{n-1} 2^i h_i$
			\State $A \assign sB$
			\State \Return (\encoded{$A$}, $k \assign (s, h_b | ... | h_{2b-1})$)
		\end{algorithmic}
		\columnbreak
		\begin{algorithmic}[1]
			\Statex \underline{\textbf{\sign}($k \assign (s, h_b | ... | h_{2b-1})$, $m$)}
			\State $(r'_0, r'_1, ..., r'_{2b-1}) \assign RF(h_b | ... | h_{2b-1} | m)$
			\State $r \assign \sum_{i=0}^{2b-1} 2^i r'_i$
			\State $R \assign rB$
			\State $S \assign (r + sH(\encoded{R} | \encoded{A} | m)) \pmod L$
			\State \Return $\sigma \assign (\encoded{R}, S)$
		\end{algorithmic}
		\columnbreak
		\begin{algorithmic}[1]
			\Statex \underline{\textbf{\verify}($\encoded{A}, \sigma \assign (\encoded{R}, S), m$)}
			\State \Return $2^c SB \test 2^c R + 2^c H(\encoded{R} | \encoded{A} | m)A$
		\end{algorithmic}
	\end{multicols}
	\hrule
	\caption{Generic description of the algorithms \keygen, \sign and \verify used by the EdDSA' signature scheme}
	\label{fig:eddsa'}
\end{figure}

\begin{theorem}
	\label{theorem:adveddsa'}
	Let $\adversary{A}$ be an adversary against the SUF-CMA security of the EdDSA signature scheme that makes at most $\hashqueries$ hash queries. Then

	% TODO: correct direction?
	\[ \advantage{\text{EdDSA},\adversary{A}}{\cma}(\secparamter) \leq \advantage{\text{EdDSA'},\adversary{A}}{\cma}(\secparamter) + \frac{2 (\hashqueries + 1)}{2^b}. \]
\end{theorem}
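
To give a feeling for the size of the additive term, consider $b = 256$ as in Ed25519 and a hypothetical adversary that makes $\hashqueries = 2^{64}$ hash queries. Both the query budget and the rounding are illustrative assumptions:
\[ \frac{2 (\hashqueries + 1)}{2^b} = \frac{2 (2^{64} + 1)}{2^{256}} \approx 2^{-191} . \]
Under these assumptions, the difference between the two schemes is negligible.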

\paragraph{\underline{Proof Overview}}

The games used in the proof are depicted in Figure~\ref{fig:eddsa'games}. The proof is conducted in the random oracle model. The main idea is that the values $h$ and $r'$ look uniformly random to the adversary as long as it never queries the random oracle on $k$ or on a value starting with $h_b | ... | h_{2b-1}$. Therefore, those calls to the random oracle can be replaced by sampling truly random values.

\begin{figure}
	\hrule
	\begin{multicols}{2}
		\large
		\begin{algorithmic}[1]
			\Statex \underline{\game $G_0$ / \textcolor{blue}{$G_1$} / \textcolor{red}{$G_2$} / \textcolor{green}{$G_3$} / \textcolor{orange}{$G_4$}}
			\State $k \randomsample \{0,1\}^b$
			\BeginBox[draw=black]
			\State $(h_0, h_1, ..., h_{2b-1}) \assign H(k)$
			\Comment{$G_0$}
			\EndBox
			\BeginBox[draw=blue]
			\State $\textbf{if } \sum[k] = \bot \textbf{ then}$
			\Comment{$G_1 - G_3$}
			\State \quad $\sum[k] \randomsample \{0,1\}^{2b}$
			\State $(h_0, h_1, ..., h_{2b-1}) \assign \sum[k]$
			\EndBox
			\BeginBox[draw=orange]
			\State $(h_0, h_1, ..., h_{2b-1}) \randomsample \{0,1\}^{2b}$
			\Comment{$G_4$}
			\EndBox
			\State $s \leftarrow 2^n + \sum_{i=c}^{n-1} 2^i h_i$
			\State $A \assign sB$
			\State $(\m^*, \signature^*) \randomassign \adversary{A}^{H(\inp), \sign(\inp)}(A)$
			\State \Return $\verify(A, \m^*,\signature^*) \wedge (\m^*, \signature^*) \notin Q$
		\end{algorithmic}
		\columnbreak
		\begin{algorithmic}[1]
			\Statex \underline{\oracle \sign($m \in \messagespace$)}
			\BeginBox[draw=black]
			\State $(r'_0, r'_1, ..., r'_{2b-1}) \assign H(h_b | ... | h_{2b-1} | m)$
			\Comment{$G_0$}
			\EndBox
			\BeginBox[draw=blue]
			\State $\textbf{if } \sum[h_b | ... | h_{2b-1} | m] = \bot \textbf{ then}$
			\Comment{$G_1 - G_3$}
			\State \quad $\sum[h_b | ... | h_{2b-1} | m] \randomsample \{0,1\}^{2b}$
			\State $(r'_0, r'_1, ..., r'_{2b-1}) \assign \sum[h_b | ... | h_{2b-1} | m]$
			\EndBox
			\BeginBox[draw=orange]
			\State $(r'_0, r'_1, ..., r'_{2b-1}) \assign RF(h_b | ... | h_{2b-1} | m)$
			\Comment{$G_4$}
			\EndBox
			\State $r \assign \sum_{i=0}^{2b-1} 2^i r'_i$
			\State $R \assign rB$
			\State $S \assign (r + sH(\encoded{R} | \encoded{A} | m)) \pmod L$
			\State $\signature \assign (\encoded{R}, S)$
			\State $Q \assign Q \cup \{(\m, \signature)\}$
			\State \Return $\signature$
		\end{algorithmic}
	\end{multicols}
	\begin{algorithmic}[1]
		\Statex \underline{\oracle $H(m \in \{0,1\}^*)$}
		\BeginBox[draw=blue]
		\State $\textbf{if } m = k \textbf{ then}$
		\Comment{$G_1 - G_4$}
		\State \quad $bad_1 \assign true$
		\BeginBox[draw=red,dashed]
		\State \quad $abort$
		\Comment{$G_2 - G_4$}
		\EndBox
		\State $\textbf{if } m \text{ starts with } h_b|...|h_{2b-1} \textbf{ then}$
		\State \quad $bad_2 \assign true$
		\BeginBox[draw=green,dashed]
		\State \quad $abort$
		\Comment{$G_3 - G_4$}
		\EndBox
		\EndBox
		\State $\textbf{if } \sum[m] = \bot \textbf{ then}$
		\State \quad $\sum[m] \randomsample \{0,1\}^{2b}$
		\State \Return $\sum[m]$
	\end{algorithmic}
	\hrule
	\caption{Games $G_0$ to $G_4$}
	\label{fig:eddsa'games}
\end{figure}

\begin{proof}
	\item \paragraph{\underline{$G_0:$}} Let $G_0$ be defined in Figure~\ref{fig:eddsa'games} by excluding all boxes except the black ones. Then $G_0$ is exactly the $\cma$ game for EdDSA. By definition,

	\[ \advantage{\text{EdDSA},\adversary{A}}{\cma}(\secparamter) = \Pr[\cma_{\text{EdDSA}}^{\adversary{A}} \Rightarrow 1] = \Pr[G_0^{\adversary{A}} \Rightarrow 1]. \]

	\item \paragraph{\underline{$G_1:$}} Let $G_1$ be defined by additionally including all blue boxes and excluding the black boxes. This change inlines the calls to the random oracle and introduces two if-conditions in the random oracle, each of which sets a bad flag when it is triggered. Since the behavior of the game does not change, the modification is purely conceptual and the probability of winning the game is not affected. Hence,

	\[ \Pr[G_0^{\adversary{A}} \Rightarrow 1] = \Pr[G_1^{\adversary{A}} \Rightarrow 1]. \]

	\item \paragraph{\underline{$G_2:$}} $G_2$ introduces the abort condition in the red box: the game aborts as soon as the flag $bad_1$ is set. The flag is set if the queried message equals $k$. Since $k$ is chosen uniformly at random from $\{0,1\}^b$ and is hidden from the adversary, the adversary can only guess this value, so each individual query sets $bad_1$ with probability at most $\frac{1}{2^b}$. By the union bound over all $\hashqueries$ hash queries plus the one hash evaluation performed by the challenger during signature verification, we obtain $\Pr[bad_1] \leq \frac{\hashqueries + 1}{2^b}$. Since $G_1$ and $G_2$ are identical-until-bad games with respect to the $bad_1$ flag, we have

	\[ |\Pr[G_1^{\adversary{A}} \Rightarrow 1] - \Pr[G_2^{\adversary{A}} \Rightarrow 1]| \leq \Pr[bad_1] \leq \frac{\hashqueries + 1}{2^b}. \]

	\item \paragraph{\underline{$G_3:$}} $G_3$ additionally introduces the abort condition in the green box: the game also aborts if a message starting with $h_b | ... | h_{2b-1}$ is queried. The value $h$ is the result of a random oracle call with input $k$. Since the adversary is not able to query the random oracle on $k$, due to the abort condition introduced in $G_2$, it has no information about $h$ and can only guess its value. Each individual query therefore sets $bad_2$ with probability at most $\frac{1}{2^b}$. By the union bound over all $\hashqueries$ hash queries plus the one hash evaluation performed by the challenger during signature verification, we obtain $\Pr[bad_2] \leq \frac{\hashqueries + 1}{2^b}$. Since $G_2$ and $G_3$ are identical-until-bad games with respect to the $bad_2$ flag, we have

	\[ |\Pr[G_2^{\adversary{A}} \Rightarrow 1] - \Pr[G_3^{\adversary{A}} \Rightarrow 1]| \leq \Pr[bad_2] \leq \frac{\hashqueries + 1}{2^b}. \]

	% TODO: describe the signature of RF more precisely?
	\item \paragraph{\underline{$G_4:$}} $G_4$ replaces the blue boxes in the main game and in the \Osign oracle with the orange boxes. This change is only conceptual: because of the abort conditions, the adversary cannot query the random oracle on the inputs used for those calls, and in the random oracle model it therefore has no information about the resulting values. Consequently, the adversary cannot distinguish whether these values are outputs of the hash function or are chosen uniformly at random. Hence,

	\[ \Pr[G_3^{\adversary{A}} \Rightarrow 1] = \Pr[G_4^{\adversary{A}} \Rightarrow 1]. \]

	\item Now $G_4$ is the same as SUF-CMA parameterized with EdDSA'. Therefore, we have

	\[ \Pr[G_4^{\adversary{A}} \Rightarrow 1] = \advantage{\text{EdDSA'},\adversary{A}}{\cma}(\secparamter). \]
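
	\item The game hops can be combined as follows. This summary only restates the bounds derived above and applies the triangle inequality:

	\[ |\Pr[G_0^{\adversary{A}} \Rightarrow 1] - \Pr[G_4^{\adversary{A}} \Rightarrow 1]| \leq \Pr[bad_1] + \Pr[bad_2] \leq \frac{2 (\hashqueries + 1)}{2^b}. \]

	Together with the equalities for $G_0$ and $G_4$, the advantages of $\adversary{A}$ against EdDSA and EdDSA' differ by at most $\frac{2 (\hashqueries + 1)}{2^b}$.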

	\item This proves Theorem~\ref{theorem:adveddsa'}.
\end{proof}

% TODO: this can surely be phrased more elegantly
To keep the following proofs straightforward, every reference to the EdDSA signature scheme in those proofs actually means the EdDSA' signature scheme. The loss introduced by switching to EdDSA' is included at the end, when the total loss of the reduction is calculated.