Montag, 4. März 2024

The abc conjecture and positive definite kernel on the natural numbers with inductive Cholesky decomposition.

This is a transcription of the following MathOverflow question, found here. One formulation of the abc-conjecture is: \[\forall a,b \in \mathbb{N}: \frac{a+b}{\gcd(a,b)} < \operatorname{rad}\left ( \frac{ab(a+b)}{\gcd(a,b)^3}\right )^2 \] Let us define: \[K(a,b) := \frac{2(a+b)}{\gcd(a,b)\operatorname{rad}\left ( \frac{ab(a+b)}{\gcd(a,b)^3}\right )^2} \] I have checked in Sagemath up to \(n=100\) that the matrix \[(K(i,j))_{1 \le i,j \le n} \text{ is positive definite }\] and up to \(1\le a,b \le 100\) that : \[K(a,b)\le 1\] In the Encyclopedia of Distances the word "similarity" is described as: A similarity \(s:X\times X \rightarrow \mathbb{R}\) is defined in the Encyclopedia of Distances as:
  1. \(s(x,y) \ge 0 \forall x,y \in X\)
  2. \(s(x,y) = s(y,x) \forall x,y \in X\)
  3. \(s(x,y) \le s(x,x) \forall x,x \in X\)
  4. \(s(x,y) = s(x,x) \iff x=y\)
So one possible formulation of the empirical observation above might be: \[K \text{ is a pos. def. kernel and a sim. function with } K(a,a)=1\] From this formulation the abc conjecture in the variant above follows, because: \[K(a,b) \le K(a,a)=1 \rightarrow \frac{(a+b)}{\gcd(a,b)\operatorname{rad}\left ( \frac{ab(a+b)}{\gcd(a,b)^3}\right )^2} \le \frac{1}{2} < 1 \] Since in the definition of the similarity function, some of the properties are easy to prove for \(K\) such as 1. and 2. and from 3. follows abc, I am asking if it is possible to prove \(\require{enclose}\enclose{horizontalstrike}{\text{ property 4. and / or}}\) that \(K\) is positive definite function? Thanks for your help.
Edit: I think that \(4.\) can be proved by considering that if \(K(x,y)=1\) and assuming there exists a prime \(p\) which divides \(x\) but not \(y\), leads to a contradiction.
Second edit: The abc conjecture (in the version above) would follow, if one could prove that the function \[K(a,b) := \frac{2(a+b)}{\gcd(a,b)\operatorname{rad}\left ( \frac{ab(a+b)}{\gcd(a,b)^3}\right )^2} \] is positive definite over the natural numbers.
Proof: If \(K\) is a positive definite function over the natural numbers, then by the Moore-Aronszajn theorem , there exists a Hilbert space \(H\) and a feature mapping \(\phi: \mathbb{N} \rightarrow H\) such that: \[K(a,b) = \left < \phi(a),\phi(b) \right >_{H}\] But by Cauchy-Schwarz inequality we get: \[K(a,b) = |K(a,b)| = |\left < \phi(a), \phi(b) \right >_{H}|\] \[\le |\phi(a)||\phi(b)| = \sqrt{K(a,a)K(b,b)} = \sqrt{1 \cdot 1} = 1\] But then we get by division of \(2\) and application of the last inequality: \[\frac{(a+b)}{\gcd(a,b)\operatorname{rad}\left ( \frac{ab(a+b)}{\gcd(a,b)^3}\right )^2} \le \frac{1}{2} < 1 \] hence it would follow that: \[\forall a,b \in \mathbb{N}: \frac{a+b}{\gcd(a,b)} < \operatorname{rad}\left ( \frac{ab(a+b)}{\gcd(a,b)^3}\right )^2 \] Modified question: Is there a way to speed up the computations to try and test if the Gram matrix for \(n=1,\cdots,10000 \) is positive definite? Here is my code so far, but it is rather slow:
# inductive cholesky decomposition
def rad(n):
    return prod(prime_divisors(n))


def kk(a,b):
    return (2*(a+b)*a**1*b**1/gcd(a,b))*1/rad(a*b*(a+b)/gcd(a,b)**3)**2

def notPositiveDefinite(a,b):
    return rad(a*b/gcd(a,b))

#kk = notPositiveDefinite

def grammat(kk,n):
    return matrix([[kk(a,b) for a in range(1,n+1)] for b in range(1,n+1)],ring=QQ)


PHI = dict([])

def phi(n,useN=False):
    if n in PHI.keys():
        return PHI[n]
    
    if n==1:
        if not useN:
            x = vector([sqrt(kk(1,1))])
        else:
            x = vector([sqrt(kk(1,1)+0.0)])
        PHI[n] = x
        return x
    else:
        w = CC(n-1)**(-1)*vv(n-1)
        lw = list(w)
        if not useN:
            lw.append(sqrt(kk(n,n)-w.norm()**2))
        else:
            lw.append(sqrt(kk(n,n)-w.norm().n()**2).n())
        x = vector(lw)
        PHI[n]= x
        return x

def vv(n):
    return vector([kk(i,n+1) for i in range(1,n+1)])
    

CCDict = dict([])    

def CC(n):
    if n in CCDict.keys():
        return matrix(CCDict[n])
    mm = []
    for i in range(1,n+1):
        vi = list(phi(i))
        if len(vi) < n:
            vi.extend((n-i)*[0])
        mm.append(vi)
    M= matrix(mm)
    CCDict[n] = mm
    return M
    
    

for n in range(1,200):
    print(n,all(x in RR for x in phi(n,useN=True)))
    #print(grammat(kk,n).rational_form())
```
Comment on code:
It is based on inductive Cholesky decomposition as described below and keeps track in a dictionary if the \(\phi(n)\) or \(C_n\) have already been computed. It uses float approximation for greater speed and in the last step we check if each component of the vector \(\phi(n)\) is a real number. If it is for all \(1 \le k \le n\) then we conclude that the Gram matrix of \(n\) is positive definite. If it is not, then at least the Gram matrix for \(n\) is not positive definite:
Inductive Cholesky decomposition to find an embedding \(\phi(n)\) for each number \(n\):
Equation 1): \[\phi(n+1) = \left(\begin{array}{r} {C_{n}^{-1} v_n} \\ \sqrt{k(n+1,n+1)-{\left| {C_{n}^{-1} v_n} \right|}^{2} } \end{array}\right)\] Equation 2): \[v_n = \left(\begin{array}{r} k(1,n+1) \\ k(2,n+1) \\ \cdots\\ k(n,n+1)\\ \end{array}\right)\] Equation 3): \[C_n = \left(\begin{array}{r} \phi(1)^T \\ \phi(2)^T \\ \cdots\\ \phi(n)^T\\ \end{array}\right)\] Equation 4): \[\phi(1) := \sqrt{k(1,1)} e_1\] If we assume that the function \(k\) is positive definite, then this inductive method, should give us the vectors \(\phi(a),\phi(b)\) such that under the normal inner product we have: \[k(a,b) = \left < \phi(a), \phi(b) \right > \]. Edit (11.03.2024): This conjecture about the positive semidefiniteness of \(K\) is wrong as the determinant of \(G_{192}\) is < \(0\)

Samstag, 2. März 2024

Application of Ehrhart polynomials for counting of numbers and lattices for natural numbers.

This is a transcription of the MO question written here: It is known that \(\log(p)\) is independent as a vector over the rational numbers where \(p\) is a prime. We have \(\log(ab) = \log(a)+\log(b)\) and \(\log(n) = \sum_{p |n}v_p(n) \log(p)\). We define \(\psi(n) := \sum_{p |n}v_p(n) e_p\) where \(e_p\) is the \(p\)-th standard basis vector of the Hilbert space of sequences. for \(q = a/b \in \mathbb{Q}_{>0}\) we define: \[\psi(a/b) := \psi(a/\gcd(a,b))-\psi(b/\gcd(a,b))\]. We have for \(n \in \mathbb{N}\): \[\|\psi(n)\|^2 = \sum_{p|n} v_p(n)^2 \text{ is a natural number}\] Furthermore the function \[K(a,b) = \left < \psi(a),\psi(b)\right > = \sum_{p|\gcd(a,b)} v_p(a)v_p(b)\] is a positive definite kernel on the natural numbers. Let us define the \(\infty\)-dimensional lattice \(\Gamma\) as: \(\Gamma:= \{\psi(q)| q \in \mathbb{Q}_{>0}, \|\psi(q)\|^2 \equiv 0 \mod(2)\}\) and let us also define: \[\eta(n) := (-1)^{\|\psi(n)\|^2}= \prod_{p|n}(-1)^{v_p(n)^2}\] For \(\psi(a),\psi(b) \in \Gamma\) we have: \[\|\psi(ab)\|^2 = \|\psi(a)+\psi(b)\|^2 = \|\psi(a)\|^2+\|\psi(b)\|^2+2K(a,b) \equiv 0+0+0 \equiv 0 \mod(2)\] Therefore \(\psi(a)+\psi(b) \in \Gamma\) and \(\psi(1) \in \Gamma\). The lattice shares some properties with the Leech-lattice and the E8 lattice: 1. It is unimodular in the sense that: For a finite set of primes, the gram matrix \((K(p,q))_{p,q \text{ is prime}}\) has \(\det=1\). 2. It is even: \(\forall x \in \Gamma: |x|^2 \equiv 0 \mod(2)\) 3. For all nonzero vectors in the lattice the squared norm is at least \(2\). My questions are these: 1) Probably an easy question: \[\eta(mn) = \eta(m) \eta(n) \forall m,n\in \mathbb{N}\] 2) Does it satisfy a Riemann-Hypothesis type of random variable equality: \[\forall \epsilon > 0 : \lim_{N\rightarrow \infty} \frac{1}{N^{1/2+\epsilon}}\sum_{n=1}^N \eta(n) = 0\] 3) Is there a relationship to the Riemann Hypothesis and this lattice? Here is some Sagemath code for experiments. And here is some picture concerning the 2. question:
**Edit**: Let \(\lambda(n)=(-1)^{\Omega(n)}\) be the Liouville function. Then because \(v_p(n)^2\equiv v_p(n) \mod(2)\), we have indeed: \[\eta(n)= \lambda(n)\] This answers the first question since it is known that \(\lambda\) is multiplicative. This also answers the second and third question. Modified question: Is this lattice known in literature? Thanks for your help. ------------------------------------------------------------------------------------------------- A maybe useful point of view, is to use the Ehrhart polynomial of a polytope, namely the simplex, to count the numbers as points in a simplex: Let \(Q_N:=\) Polytope of \((\{\psi(p)|1\le p \le N, p \text{ prime }\})\) Then the polytope generated by the primes is the simplex, and it is known ( "Computing the continuous discretely" Matthias Beck, Sinai Robins, Second edition, p.31 ff) that the Ehrhart polynomial is given by: \[L(Q_N,t) = \operatorname{binomial}(d+t,d)\] where \(d= \pi(N)\) is the dimension of the simplex, and it is equal to the number of primes \(\le N\), which is denoted by \(\pi(N)\). Let \(\hat{\pi}(n):=\{p \text{ prime}: p|n \}\) and let \(p_n:=n\)-th prime. Since \(L(Q_N,t)\) counts the points \(\psi(n)=(x_1,\cdots,x_d)\) in the dilated polytope \(t Q_N\) and those points have coordinates \(\ge 0\) with \(x_1+\cdots+x_d \le t\), we conclude that, by observing that the sum of the coordinates correspond to \(\Omega(n) := \sum_{p|n} v_p(n)\): \[L(Q_N,t) = |\{\psi(n) : \Omega(n) \le t , \hat{\pi}(n) \subset \hat{\pi}(N!), 1 \le n \le p_{\pi(N)}^t\}|\] The condition \(\hat{\pi}(n) \subset \hat{\pi}(N!)\) is for making sure we look at those numbers which have only prime divisors \(\le N\). We also observe that: \[L(Q_N,t)-L(Q_N,t-1) = |\{ n : 1 \le n \le p_d^t, \Omega(n)=t, \hat{\pi}(n) \subset \hat{\pi}(N!) \}| = |A_{N,t}|\] where I have defined: \[A_{N,t} = \{ n : 1 \le n \le p_d^t, \Omega(n)=t, \hat{\pi}(n) \subset \hat{\pi}(N!) \}\] We can now define and try to evaluate the following sum: \[F(N,t):= \sum_{k=0}^t \sum_{n \in A_{N,k}} \lambda(n)\] \[=\sum_{k=0}^t (-1)^k ( L(Q_N,k)-L(Q_N,k-1) )\] \[=\sum_{k=0}^t (-1)^k ( \operatorname{binomial}(d+k,d)-\operatorname{binomial}(d+k-1,d) )\] which according to Wolfram Alpha is equal to: \[=((-1)^t \operatorname{binomial}(d+t+1,d) F_{2, 1}(1,d+t+2;t+2;-1)+2^{-d-1})-2^{-d-1}(2^{d+1}(-1)^t \operatorname{binomial}(d+t,d)F_{2,1}(1,d+t+1;t+1;-1))\] \[=((-1)^t \operatorname{binomial}(\pi(N)+t+1,\pi(N)) F_{2,1}(1,\pi(N)+t+2;t+2;-1)+2^{-\pi(N)-1})-2^{-\pi(N)-1}(2^{\pi(N)+1}(-1)^t \operatorname{binomial}(\pi(N)+t,\pi(N))F_{2,1}(1,\pi(N)+t+1;t+1;-1))\] where \(F_{2,1}(a,b;c;z)\) is the hypergeometric function. I must admit that the formula is not really "nice" but I think it should be useful, because using the RHS of the last equality with \(d\) could maybe be extended to other values for \(d,t\) other than the natural numbers. I can also not see, how to relate the sum \(F(N,t)\) to the Liouville sum: \[\sum_{n=1}^N \lambda(n)\] The naive idea is that in the limit \(N\rightarrow \infty, t \rightarrow \infty\), the two sets: \[\mathbb{N}:= \lim_{N\rightarrow \infty}\{1,\cdots,N\} \text{ and } \lim_{N,t\rightarrow \infty} A_{N,t} \] should be intuitively equal. There is also an equivalent formulation of the Riemann Hypothesis with the points on a different polytope, but while it uses the Ehrhardt polynomials, which in this case are very complicated at \(t=1\) I can not see how to derive an analytic formula in this case. Here you can find some sanity check in SageMath. Let \(B_{N,t} = \{n : \Omega(n) \le t , \hat{\pi}(n) \subset \hat{\pi}(N!), 1 \le n \le p_{\pi(N)}^t\}\), so that \[\operatorname{binomial}(d+t,d) = L(Q_N,t) = |B_{N,t}|\] The arguments above show that: \[\sum_{n \in B_{N,t}} \lambda(n) = F(N,t) = \] \[=((-1)^t \operatorname{binomial}(d+t+1,d) F_{2,1}(1,d+t+2;t+2;-1)+2^{-d-1})-2^{-d-1}(2^{d+1}(-1)^t \operatorname{binomial}(d+t,d)F_{2,1}(1,d+t+1;t+1;-1))\] hence dividing by the number of points under which the sum runs, we get: \[\frac{1}{|B_{N,t}|} \sum_{n \in B_{N,t}} \lambda(n) = \frac{1}{|B_{N,t}|} F(N,t) = \] \[=(-1)^t \frac{d}{t+1}F_{2,1}(1,d+t+2;t+2;-1)+(-1)^t F_{2,1}(1,d+t+2;t+2;-1)+\frac{1}{2^{d+1} \operatorname{binomial}(d+t,d)}+(-1)^{t+1}F_{2,1}(1,d+t+1;t+1;-1)\] where \(d:=\pi(N)\). Notice the similarity to the prime number theorem: We have in both cases sets \(C_N\) such that: \[\lim_{N\rightarrow \infty} \frac{1}{|C_N|} \sum_{n \in C_N} \lambda(n) = 0\] The case \(C_N := \{1,\cdots,N\}\) corresponds by Landau equivalently to the prime number theorem. We observe that: \[\lim_{N \rightarrow \infty} B_{N,t} = \{n \in \mathbb{N} | \Omega(n) \le t \} =: B_{\infty,t}\] and that \[\lim_{t \rightarrow \infty}(\lim_{N\rightarrow \infty} B_{N,t} ) = \lim_{t \rightarrow \infty} B_{\infty,t} = \lim_{t \rightarrow \infty} \{n \in \mathbb{N} | \Omega(n) \le t \} = \mathbb{N}\] Similarly: \[\lim_{N \rightarrow \infty}(\lim_{t\rightarrow \infty} B_{N,t} ) = \lim_{N \rightarrow \infty} B_{N,\infty} = \lim_{N \rightarrow \infty} \{n \in \mathbb{N} | \hat{\pi}(n) \subset \hat{\pi}(N!)\} = \mathbb{N}\] Hence we have: \[\lim_{t \rightarrow \infty, N \rightarrow \infty} B_{N,t} = \mathbb{N}\] (Here we have used that there are infinitely many primes.) Now let us put \(C_N := B_{N,\pi(N)}\). Then one empirical observation, which I was not able to prove directly, is: \[\lim_{N \rightarrow \infty} \frac{1}{|C_N|} \sum_{n \in C_N} \lambda(n) =^? 0\] which reminds a little bit on the prime number theorem where \(C_N:=\{1,\cdots,N\}\).