Chapter 7 SVE-based quantum algorithms

In the following section, we will cover some quantum algorithms based on singular value estimation. Some of them are here just because they are simple enough to have a good pedagogical value, while some of them we believe will be really useful in performing data analysis.

7.1 Spectral norm and the condition number estimation

–> We will elaborate more on this result soon. For the moment, we report the main statement.

Theorem 7.1 (Spectral norm estimation) Let there be quantum access to the matrix \(A \in \mathbb{R}^{n\times m}\), and let \(\epsilon>0\) be a precision parameter. There exists a quantum algorithm that estimates \(\|A\|\) to error \(\epsilon\|A\|_F\) in time \(\widetilde{O}\left(\frac{\log(1/\epsilon)}{\epsilon}\frac{\|A\|_F}{\|A\|}\right)\).

7.2 Explained variance: estimating quality of representations

The content of this section is extracted from (Bellante and Zanero 2022).

Let \(A = U\Sigma V^T\) be the singluar value decomposition of a matrix \(A \in \mathbb{R}^{n \times m}\). We call factor scores of \(A\), and denote them with \(\lambda_i = \sigma_i^2\), the squares of its singular values. Similarly, we call factor score ratios the relative magnitudes of the factor scores \(\lambda^{(i)} = \frac{\lambda_i}{\sum_j^{rank(A)} \lambda_j} = \frac{\sigma_i^2}{\sum_j^{rank(A)}\sigma_j^2}\). The factor score ratios are a measure of the amount of variance explained by the singular values.

We state here some nice examples of SVE based algorithms: the first allows us to assess whether a few singular values/factor scores explain most of the variance of the matrix of the dataset; the second one allows computing the cumulative sum of the factor score ratios associated to singular values grater or equal than a certain threshold; the third one is a modified version of the spectral norm estimation result and allows us to define a threshold for the smallest singular value such that the the sum of the above explains more than a given percentage of the total variance; finally, the last two algorithms allow retrieving a classical description of the singular vectors that correspond to the most relevant singular values.

The main intuition behind the first algorithm is that it is possible to create the state \(\sum_{i}^{r} \sqrt{\lambda^{(i)}} |u_i\rangle|v_i\rangle|\overline{\sigma}_i\rangle\) and that the third register, when measured in the computational basis, outputs the estimate \(\overline{\sigma}_i\) of a singular value with probability equal to its factor score ratio \(\lambda^{(i)}\). This allows us to sample the singular values of \(A\) directly from the factor score ratios’ distribution. When a matrix has a huge number of small singular values and only a few of them that are very big, the ones with the greatest factor score ratios will appear many times during the measurements, while the negligible ones are not likely to be measured. This intuition has already appeared in literature (Gyurik, Cade, and Dunjko 2020) and (Cade and Montanaro 2017). Nevertheless, the analysis and the problem solved are different, making the run-time analysis unrelated. This idea in the context of data representation and analysis, this intuition has only been sketched for sparse or low rank square symmetric matrices, by (Seth Lloyd, Mohseni, and Rebentrost 2013), without a precise formalization. We thoroughly formalize it, in a data representation and analysis context, for any real matrix.

Quantum factor score ratio estimation

Figure 7.1: Quantum factor score ratio estimation

(#thm:factor_score_estimation) (Quantum factor score ratio estimation) Assume to have quantum access to a matrix \(A \in \mathbb{R}^{n \times m}\) and \(\sigma_{max} \leq 1\). Let \(\gamma, \epsilon\) be precision parameters. There exists a quantum algorithm that, in time \(\widetilde{O}\left(\frac{1}{\gamma^2}\frac{\mu(A)}{\epsilon}\right)\), estimates:

  • the factor score ratios \(\lambda^{(i)}\), such that \(\|\lambda^{(i)} - \overline{\lambda}^{(i)}\| \leq \gamma\), with high probability;
  • the correspondent singular values \(\sigma_i\), such that \(\|\sigma_i - \overline{\sigma}_i\| \leq \epsilon\), with probability at least \(1-1/\text{poly}(n)\);
  • the correspondent factor scores \(\lambda_i\), such that \(\|\lambda_i - \overline{\lambda}_i\| \leq 2\epsilon\), with probability at least \(1-1/\text{poly}(n)\).

The parameter \(\gamma\) is the one that controls how big a factor score ratio should be for the singular value/factor score to be measured. If we choose \(\gamma\) bigger than the least factor scores ratio of interest, the estimate for the smaller ones is likely to be \(0\), as \(\|\lambda^{(i)}-0\|\leq \gamma\) would be a plausible estimation.

Often in data representations, the cumulative sum of the factor score ratios is a measure of the quality of the representation. By slightly modifying Algorithm in Figure 7.1 to use Theorem 3.4, it is possible to estimate this sum such that \(\|\sum_i^k \lambda^{(i)} - \sum_i^k \overline{\lambda}^{(i)}\| \leq k\epsilon\) with probability \(1-1/\text{poly}(r)\).

However, a slight variation of the algorithm of Theorem 7.1 provides a more accurate estimation in less time, given a threshold \(\theta\) for the smallest singular value to retain.

(#thm:check_explained_variance) (Quantum check on the factor score ratios' sum) Assume to have efficient quantum access to the matrix \(A \in \mathbb{R}^{n \times m}\), with singular value decomposition \(A = \sum_i\sigma_i u_i v_i^T\). Let \(\eta, \epsilon\) be precision parameters and \(\theta\) be a threshold for the smallest singular value to consider. There exists a quantum algorithm that estimates \(p = \frac{\sum_{i: \overline{\sigma}_i \geq \theta} \sigma_i^2}{\sum_j^r \sigma_j^2}\), where \(\|\sigma_i - \overline{\sigma}_i\| \leq \epsilon\), to relative error \(\eta\) in time \(\widetilde{O}\left(\frac{\mu(A)}{\epsilon}\frac{1}{\eta \sqrt{p}}\right)\).

Moreover, we borrow an observation from (Kerenidis and Prakash 2020) on Theorem 7.1, to perform a binary search of \(\theta\) given the desired sum of factor score ratios.

Theorem 7.2 (Quantum binary search for the singular value threshold) Assume to have quantum access to the matrix \(A \in \mathbb{R}^{n \times m}\). Let \(p\) be the factor ratios sum to retain. The threshold \(\theta\) for the smallest singular value to retain can be estimated to absolute error \(\epsilon\) in time \(\widetilde{O}\left(\frac{\log(1/\epsilon)\mu(A)}{\epsilon\sqrt{p}}\right)\).

We will see in the next chapters that in problems such as PCA, CA, and LSA, the desired sum of factor score ratios to retain is a number in the range \(p \in [1/3, 1]\) with precision up to the second decimal digit. In practice, the complexity of these last two algorithms scales as \(\widetilde{O}\left(\frac{\mu(A)}{\epsilon}\right)\).

7.3 Extracting the SVD representations

After introducing the procedures to test for the most relevant singular values, factor scores and factor score ratios of \(A\), we present an efficient routine to extract the corresponding right/left singular vectors. The inputs of this algorithm, other than the matrix, are a parameter \(\delta\) for the precision of the singular vectors, a parameter \(\epsilon\) for the precision of the singular value estimation, and a threshold \(\theta\) to discard the non interesting singular values/vectors. The output guarantees a unit estimate \(\overline{x}_i\) of each singular vector such that \(\left|\left| x_i -\overline{x}_i\right|\right|_{\ell} \leq \delta\) for \({\ell} \in \{2, \infty\}\), ensuring that the estimate has a similar orientation to the original vector. Additionally, this subroutine can provide an estimation of the singular values greater than \(\theta\), to absolute error \(\epsilon\).

Quantum algorithm for top-k singular vector extractor

Figure 7.2: Quantum algorithm for top-k singular vector extractor

(#thm:top-k_sv_extraction) (Top-k singular vectors extraction) Let there be efficient quantum access to the matrix \(A \in \mathbb{R}^{n \times m}\), with singular value decomposition \(A = \sum_i^r \sigma_i u_i v_i^T\) and \(\sigma_{max} \leq 1\). Let \(\delta > 0\) be a precision parameter for the singular vectors, \(\epsilon>0\) a precision parameter for the singular values, and \(\theta>0\) be a threshold such that \(A\) has \(k\) singular values greater than \(\theta\). Define \(p=\frac{\sum_{i: \overline{\sigma}_i \geq \theta} \sigma_i^2}{\sum_j^r \sigma_j^2}\). There exist quantum algorithms that estimate:

  • The top \(k\) left singular vectors \(u_i\) of \(A\) with unit vectors \(\overline{u}_i\) such that \(\|u_i-\overline{u}_i\|_2 \leq \delta\) with probability at least \(1-1/poly(n)\), in time \(\widetilde{O}\left(\frac{1}{\theta}\frac{1}{\sqrt{p}}\frac{\mu(A)}{\epsilon}\frac{kn}{\delta^2}\right)\);
  • The top \(k\) right singular vectors \(v_i\) of \(A\) with unit vectors \(\overline{v}_i\) such that \(\|v_i-\overline{v}_i\|_2 \leq \delta\) with probability at least \(1-1/poly(m)\), in time \(\widetilde{O}\left(\frac{1}{\theta}\frac{1}{\sqrt{p}}\frac{\mu(A)}{\epsilon}\frac{km}{\delta^2}\right)\).
  • The top \(k\) singular values \(\sigma_i\) and factor scores \(\lambda_i\) of \(A\) to precision \(\epsilon\) and \(2\epsilon\) with probability at least \(1 - 1/\text{poly}(m)\), in time \(\widetilde{O}\left(\frac{1}{\theta}\frac{1}{\sqrt{p}}\frac{\mu(A)k}{\epsilon}\right)\) or any of the two above.

Besides \(p\) being negligible, it is interesting to note that the parameter \(\theta\) can be computed using:

  • the procedures of Theorems @ref(thm:factor_score_estimation) and @ref(thm:check_explained_variance);
  • the binary search of Theorem \(\ref{Theorivisto:binarysearch}\);
  • the available literature on the type of data stored in the input matrix \(A\).

About the latter, the original paper of latent semantic indexing (Deerwester et al. 1990) states that the first \(k=100\) singular values are enough for a good representation. We believe that, in the same way, fixed thresholds \(\theta\) can be defined for different machine learning applications. The experiments that you can read in Chapter ?? on the run-time parameters of the polynomial expansions of the MNIST dataset support this expectation: even though in qSFA they keep the \(k\) smallest singular values and refer to \(\theta\) as the biggest singular value to retain, this value does not vary much when the the dimensionality of their dataset grows. In our experiments, we observe that different datasets for image classification have similar \(\theta\)s.

A similar statement to Theorem @ref(thm:top-k_sv_extraction) can be stated with \(\ell_\infty\) guarantees on the vectors (see Corollary 13 of (Bellante and Zanero 2022)).

As we discussed before, given a vector with \(d\) non-zero entries, performing \(\ell_\infty\) tomography with error \(\frac{\delta}{\sqrt{d}}\) provides the same guarantees of \(\ell_2\) tomography with error \(\delta\).

In practice, this result implies that the extraction of the singular vectors, with \(\ell_2\) guarantees, can be faster if we can assume some prior assumptions on their sparseness: \(\widetilde{O}\left(\frac{1}{\theta}\frac{1}{\sqrt{p}}\frac{\mu(A)}{\epsilon}\frac{kd}{\delta^2}\right)\).

7.4 Singular value estimation of a product of two matrices

This is an example of an algorithm that has been superseded by recent development in singular value transformation. Nevertheless, it is a non-trivial way of using SVE, which a nice mathematical error analysis.

Theorem 7.3 (SVE of product of matrices) Assume to have quantum access to matrices \(P \in \mathbb{R}^{d\times d}\) and \(Q \in \mathbb{R}^{d \times d}\). Define \(W=PQ = U\Sigma V^T\) and \(\epsilon > 0\) an error parameter. There is a quantum algorithm that with probability at least \(1-poly(d)\) performs the mapping \(\sum_{i}\alpha|v_i\rangle \to \sum_{i}\alpha_i|v_i\rangle|\overline{\sigma_i}\rangle\) where \(\overline{\sigma_i}\) is an approximation of the eigenvalues \(\sigma_i\) of \(W\) such that \(|\sigma_i - \overline{\sigma}_i | \leq \epsilon\), in time \(\tilde{O}\left(\frac{ ( \kappa(P) + \kappa(Q) ) (\mu(P)+\mu(Q))}{\varepsilon}\right)\).

Proof. We start by noting that for each singular value \(\sigma_{i}\) of \(W\) there is a corresponding eigenvalue \(e^{-i\sigma_i}\) of the unitary matrix \(e^{-iW}\). Also, we note that we know how to multiply by \(W\) by applying theorem 5.9 sequentially with \(Q\) and \(P\). This will allow us to approximately apply the unitary \(U=e^{-iW}\). The last step will consist of the application of phase estimation to estimate the eigenvalues of \(U\) and hence the singular values of \(W\). Note that we need \(W\) to be a symmetric matrix because of the Hamiltonian simulation part. In case \(W\) is not symmetric, we redefine it as \[ W = \begin{bmatrix} 0 & PQ\\ (PQ)^T & 0 \end{bmatrix} \] Note we have \(W=M_1M_2\) for the matrices \(M_1, M_2\) stored in QRAM and defined as

\[ M_1 = \begin{bmatrix} P & 0\\ 0 & Q^T \end{bmatrix}, M_2 = \begin{bmatrix} 0 & Q\\ P^T & 0 \end{bmatrix}. \]

We now show how to approximately apply \(U=e^{-iW}\) efficiently. Note that for a symmetric matrix \(W\) we have \(W=V\Sigma V^T\) and using the Taylor expansion of the exponential function we have

\[U = e^{-iW} = \sum_{j=0}^{\infty} \frac{(-iW)^j}{j!} = V \sum_{j=0}^{\infty} \frac{(-i\Sigma)^j}{j!} V^T\]

With \(\widetilde{U}\) we denote our first approximation of \(U\), where we truncate the sum after \(\ell\) terms.

\[\widetilde{U} = \sum_{j=0}^{\ell} \frac{(-iW)^j}{j!} = V \sum_{j=0}^{\ell} \frac{(-i\Sigma)^j}{j!} V^T\]

We want to chose \(\ell\) such that \(\left\lVert U - \widetilde{U}\right\rVert < \epsilon/4\). We have:

\[\begin{eqnarray*} \left\lVert U - \widetilde{U}\right\rVert & \leq & \| \sum_{j=0}^{\infty} \frac{(-iW)^j}{j!} - \sum_{j=0}^{\ell} \frac{(-iW)^j}{j!} \| \leq \| \sum_{j={\ell+1}}^{\infty} \frac{(-iW)^j}{j!} \| \leq \sum_{j={\ell+1}}^{\infty} \| \frac{(-iW)^j}{j!} \| \leq \sum_{j={\ell+1}}^{\infty} \frac{1}{j!} \\ & \leq & \sum_{j={\ell+1}}^{\infty} \frac{1}{2^{j-1}} \leq 2^{-\ell +1} \end{eqnarray*}\]

where we used triangle inequality and that \(\left\lVert W^j\right\rVert\leq 1\). Choosing \(\ell = O(\log 1/ \varepsilon)\) makes the error less than \(\epsilon/4\). %We can approximate a positive series where the term \(a_n\) satisfy the following two conditions: \(0 \leq a_n \leq Kr^n\) with \(K>0, 0<r<1\) by expressing the error as the geometric series \(\frac{Kr^{N+1}}{1-r}\). In our case \(K=1\) and \(r=1/2\). For a given \(\epsilon\) we have to find \(\ell\) such that \(\frac{(\frac{1}{2})^{\ell+1}}{1-(\frac{1}{2})} \leq \epsilon\). By taking \(\ell = O(\log 1/\epsilon)\) we can easily satisfy the error guarantee.

In fact, we cannot apply \(\widetilde{U}\) exactly but only approximately, since we need to multiply with the matrices \(W^j, j\in[\ell]\) and we do so by using the matrix multiplication algorithm for the matrices \(M_1\) and \(M_2\). For each of these matrices, we use an error of \(\frac{\epsilon}{8 \ell}\) which gives an error for \(W\) of \(\frac{\epsilon}{4 \ell}\) and an error for \(W^j\) of at most \(\frac{\epsilon}{4}\). The running time for multiplying with each \(W^j\) is at most \(O(\ell ( \kappa(M_1)\mu(M_1) \log( 8\ell/\epsilon) + \kappa(M_2)\mu(M_2) \log( 8\ell/\epsilon) ))\) by multiplying sequentially. Hence, we will try to apply the unitary \({U}\) by using the Taylor expansion up to level \(\ell\) and approximating each \(W^j, j\in [\ell]\) in the sum through our matrix multiplication procedure that gives error at most \(\frac{\epsilon}{4}\).

In order to apply \(U\) on a state \(|x\rangle = \sum_{i} \alpha_i |v_i\rangle\), let’s assume \(\ell+1\) is a power of two and define \(N_l = \sum_{j=0}^l (\frac{(-i)^j}{j!})^2\). We start with the state \[\frac{1}{\sqrt{N_l}}\sum_{j=0}^l \frac{-i^j}{j!}|j\rangle|x\rangle\]

Controlled on the first register we use our matrix multiplication procedure to multiply with the corresponding power of \(W\) and get a state at most \(\epsilon/4\) away from the state

\[\frac{1}{\sqrt{N_l}}\sum_{j=0}^l \frac{-i^j}{j!}|j\rangle|W^jx\rangle.\]

We then perform a Hadamard on the first register and get a state \(\epsilon/4\) away from the state

\[\frac{1}{\sqrt{\ell}} |0\rangle \left( \frac{1}{\sqrt{N'}} \sum_{j=0}^l \frac{-i^j}{j!} |W^jx\rangle\right) + |0^\bot\rangle |G\rangle\]

where \(N'\) just normalizes the state in the parenthesis. Note that after the Hadamard on the first register, the amplitude corresponding to each \(|i\rangle\) is the first register is the same. We use this procedure inside an amplitude amplification procedure to increase the amplitude \(1/\sqrt{\ell}\) of \(|0\rangle\) to be close to 1, by incurring a factor \(\sqrt{\ell}\) in the running time. The outcome will be a state \(\epsilon/4\) away from the state

\[\left( \frac{1}{\sqrt{N'}} \sum_{j=0}^l \frac{-i^j}{j!} |W^jx\rangle\right) = |\tilde{U}x\rangle\] which is the application of \(\widetilde{U}\). Since \(\left\lVert U - \widetilde{U}\right\rVert \leq \epsilon/4\), we have that the above procedure applies a unitary \(\overline{U}\) such that \(\left\lVert U - \overline{U}\right\rVert \leq \epsilon/2\). Note that the running time of this procedure is given by the amplitude amplification and the time to multiply with \(W^j\), hence we have that the running time is

\[O(\ell^{3/2} ( \kappa(M_1)\mu(M_1) \log( 8\ell/\epsilon) + \kappa(M_2)\mu(M_2) \log( 8\ell/\epsilon) )\]

Now that we know how to apply \(\overline{U}\), we can perform phase estimation on it with error \(\epsilon/2\). This provides an algorithm for estimating the singular values of \(W\) with overall error of \(\epsilon\). The final running time is

\[O(\frac{\ell^{3/2}}{\epsilon} ( \kappa(M_1)\mu(M_1) \log( 8\ell/\epsilon) + \kappa(M_2)\mu(M_2) \log( 8\ell/\epsilon) )\]

We have \(\mu(M_1)=\mu(M_2)= \mu(P)+\mu(Q)\) and \(\kappa(M_1)=\kappa(M_2) = \frac{max \{ \lambda^{P}_{max}, \lambda^{Q}_{max} \}}{ min \{ \lambda^{P}_{min}, \lambda^{Q}_{min} \}} \leq \kappa(P)+\kappa(Q)\), and since \(\ell=O(\log 1/\epsilon)\)the running time can be simplified to

\[\tilde{O}(\frac{ ( \kappa(P) + \kappa(Q))(\mu(P)+\mu(Q))}{\epsilon} ).\]

7.5 A last example: Slow algorithms for log-determinant

A very simple example of the utility of the SVE subroutines is to estimate quantities associated to a given matrix. In this case we are going to study the log-determinant. As the name sais, this is just the logarithm of the determinant of a (symmetric positive definite) SPD matrix.

Definition 7.1 (Log-determinant of an SPD matrix) Let \(A\in \mathbb{R}^{n \times n}\) be a SPD matrix with singular value decomposition \(A=U\Sigma V^T\). The log-determinant of \(A\) is defined as: \[\log\det(A)=\log(\prod_i^n \sigma_i) = \sum_i^n \log(\sigma_i)\]

Please keep in mind that this is not the fastest algorithm for estimating the log-determinant (we will see that in the appropriate chapter on spectral sums), but it’s worth mentioning here because it perhaps the first thing one would try to do in order to estimate this quantity. It also is a good example of the power of quantum singular value estimation, and checking the correctness of this proof might be a good exercise to learn more some mathematical tricks that are very useful to upper bound quantities that appear in the error analysis or the runtime analysis of algorithms.

SVE based algorithm to estimatethe log-determinant of a matrix

Figure 7.3: SVE based algorithm to estimatethe log-determinant of a matrix

Theorem 7.4 (SVE based algorithm for log-determinant) Assuming to have quantum access to an SPD matrix \(A\), the algorithm in figure 7.3 returns \(\overline{\log\det(A)}\) such that \(|\overline{\log\det(A)} - \log\det(A)| < \epsilon |\log\det(A)|\) in time \(\widetilde{O}(\mu \kappa^3/\epsilon^2).\)

Proof. We can rewrite the quantum state encoding the representation of \(A\) (which we can create with quantum access to \(A\)) as follow: \[\begin{equation} |A\rangle = \frac{1}{\|A\|_F} \sum_{i,j=1}^n a_{ij}|i,j\rangle = \frac{1}{\|A\|_F} \sum_{j=1}^n \sigma_j |u_j\rangle|u_j\rangle. \end{equation}\] Starting from the state \(|A\rangle\), we can apply SVE (see lemma 5.8 to \(|A\rangle\) up to precision \(\epsilon_1\) to obtain \[\frac{1}{\|A\|_F} \sum_{j=1}^n \sigma_j |u_j\rangle|u_j\rangle |\tilde{\sigma}_j\rangle,\] where \(|\tilde{\sigma}_j-\sigma_j|\leq \epsilon_1\). Since \(\left\lVert A\right\rVert \leq 1\), using controlled operations, we can prepare

\[\begin{align} \frac{1}{\|A\|_F} \sum_{i=1}^n \sigma_j |u_j\rangle|u_j\rangle |\tilde{\sigma}_j\rangle \left( C\frac{\sqrt{-\log \tilde{\sigma}_j}}{\tilde{\sigma}_j}|0\rangle + |0^\bot\rangle \right), \tag{7.1} \end{align}\]

where \(C=\min_j \tilde{\sigma}_j/\sqrt{|\log \tilde{\sigma}_j|} \approx \sigma_{\min}/\sqrt{|\log \sigma_{\min}|} =1/\kappa\sqrt{\log \kappa}\). The probability of \(|0\rangle\) is \[P = -\frac{C^2}{\|A\|_F^2} \sum_{j=1}^n \frac{\sigma_j^2}{\tilde{\sigma}_j^2} \log \tilde{\sigma}_j.\]

First, we do the error analysis. Note that \[\begin{align} \left| \sum_{j=1}^n \frac{\sigma_j^2}{\tilde{\sigma}_j^2} \log \tilde{\sigma}_j - \sum_{j=1}^n \log \sigma_j \right| &\leq& \left|\sum_{j=1}^n \frac{\sigma_j^2}{\tilde{\sigma}_j^2} \log \tilde{\sigma}_j - \sum_{j=1}^n \frac{\sigma_j^2}{\tilde{\sigma}_j^2}\log \sigma_j \right| + \left|\sum_{j=1}^n \frac{\sigma_j^2}{\tilde{\sigma}_j^2} \log\sigma_j - \sum_{j=1}^n \log \sigma_j \right| \\ &\leq& \sum_{j=1}^n \frac{\sigma_j^2}{\tilde{\sigma}_j^2} |\log \tilde{\sigma}_j - \log \sigma_j | + \sum_{j=1}^n \frac{|\sigma_j^2-\tilde{\sigma}_j^2|}{\tilde{\sigma}_j^2} |\log \sigma_j | \\ &\leq& \sum_{j=1}^n (1 + \frac{\epsilon_1}{\tilde{\sigma}_j})^2 (\frac{\epsilon_1}{\tilde{\sigma}_j}+O(\frac{\epsilon_1^2}{\tilde{\sigma}_j^2})) + (2\kappa\epsilon_1 +\kappa^2\epsilon_1^2) |\log\det(A)| \\ &\leq& n (\kappa\epsilon_1+O(\kappa^2\epsilon_1^2)) + (2\kappa\epsilon_1 +\kappa^2\epsilon_1^2) |\log \det(A)| \\ &=& (n+2|\log \det(A)|) (\kappa\epsilon_1+O(\kappa^2\epsilon_1^2)). \end{align}\] In the third inequality, we use the result that \(\sigma_j\leq \tilde{\sigma}_j+\epsilon_1\).

Denote \(P'\) as the \(\epsilon_2\)-approximation of \(P\) obtained by amplitude estimation, then the above analysis shows that \(-\|A\|_F^2P'/C^2\) is an \((n+2|\log \det(A)|) (\kappa \epsilon_1 + O(\kappa^2 \epsilon_1^2)) +\epsilon_2\|A\|_F^2/C^2\) approximation of \(\log\det(A)\). Note that

\[\begin{align} && (n+2|\log \det(A)|) (\kappa \epsilon_1 + O(\kappa^2 \epsilon_1^2)) +\epsilon_2\|A\|_F^2/C^2 \\ &=& (n+2|\log \det(A)|) (\kappa \epsilon_1 + O(\kappa^2 \epsilon_1^2)) +\epsilon_2 \|A\|_F^2 \kappa^2\log \kappa \\ &\leq& (n+2n\log \kappa) (\kappa \epsilon_1 + O(\kappa^2 \epsilon_1^2)) +n\epsilon_2 \kappa^2\log \kappa \\ &=& O(n\epsilon_1\kappa\log \kappa+n\epsilon_2\kappa^2\log \kappa). \end{align}\]

To make sure the above error is bounded by \(n\epsilon\) it suffcies to choose \(\epsilon_1=\epsilon/\kappa\log \kappa\) and \(\epsilon_2=\epsilon/\kappa^2\log \kappa\).

Now we do the runtime analysis. The runtime of the algorithm mainly comes from the using of SVE and the performing of amplitude estimation on the state in ((7.1)). Using quantum singular value estimation, the complexity to obtain the state (7.1) is \(\widetilde{O}(\mu /\epsilon_1)\). The complexity to perform amplitude estimation is \(\widetilde{O}(\mu /\epsilon_1\epsilon_2)=\widetilde{O}(\mu \kappa^3(\log \kappa)^2/\epsilon^2)\).

G References

Bellante, Armando, and Stefano Zanero. 2022. “Quantum Matching Pursuit: A Quantum Algorithm for Sparse Representations.” Physical Review A 105 (2): 022414.
Cade, Chris, and Ashley Montanaro. 2017. “The Quantum Complexity of Computing Schatten \(p\)-Norms.” arXiv Preprint arXiv:1706.09279.
Deerwester, Scott, Susan T Dumais, George W Furnas, Thomas K Landauer, and Richard Harshman. 1990. “Indexing by Latent Semantic Analysis.” Journal of the American Society for Information Science 41 (6): 391–407.<391::AID-ASI1>3.0.CO;2-9.
Gyurik, Casper, Chris Cade, and Vedran Dunjko. 2020. “Towards Quantum Advantage for Topological Data Analysis.” arXiv Preprint arXiv:2005.02607.
———. 2020. “Quantum Gradient Descent for Linear Systems and Least Squares.” Physical Review A 101 (2): 022316.
Lloyd, Seth, Masoud Mohseni, and Patrick Rebentrost. 2013. Quantum principal component analysis.” Nature Physics 10 (9): 631–33.