Lucas Sun

Hello World

Hi! I'm Lucas Sun. Welcome to my blog.

I'll write here about things I find genuinely interesting — math, machine learning, and whatever else pulls my attention.

What to expect

Posts here will tend to be technical. Topics I expect to return to:

- Training dynamics and optimizer theory
- Mathematical foundations of learning
- Experiments I run and find surprising
- Occasional notes on things outside ML

No particular posting schedule. Quality over frequency.

Why write?

Writing is thinking. A half-formed idea that feels solid in my head usually falls apart the moment I try to write it down precisely. So the act of writing a post is primarily for my own benefit — to find out whether I actually understand something.

The secondary benefit is that occasionally someone finds it useful. That's a nice bonus.

"The most valuable thing I could do is to try to get ideas out of my head and into a form where other people can engage with them."

Some mathematics

Since this blog will involve a lot of math, let me make sure the rendering works. Here are a few examples.

Euler's identity

Perhaps the most famous equation in mathematics:

$$e^{i\pi} + 1 = 0$$

It connects the five most fundamental constants: $e$, $i$, $\pi$, $1$, and $0$.
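It drops out of Euler's formula $e^{i\theta} = \cos\theta + i\sin\theta$ evaluated at $\theta = \pi$:

$$e^{i\pi} = \cos\pi + i\sin\pi = -1 + 0 \cdot i = -1.$$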

The Gaussian integral

A result that appears everywhere in probability and physics:

$$\int_{-\infty}^{\infty} e^{-x^2}\, dx = \sqrt{\pi}$$

Proof sketch

The standard trick is to compute $I^2$ instead of $I$ directly. Let $I = \int_{-\infty}^{\infty} e^{-x^2}\, dx$. Then:

$$I^2 = \left(\int_{-\infty}^{\infty} e^{-x^2}\, dx\right)\left(\int_{-\infty}^{\infty} e^{-y^2}\, dy\right) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-(x^2+y^2)}\, dx\, dy$$

Converting to polar coordinates $x = r\cos\theta$, $y = r\sin\theta$:

$$I^2 = \int_0^{2\pi}\int_0^{\infty} e^{-r^2}\, r\, dr\, d\theta = 2\pi \cdot \frac{1}{2} = \pi$$

Therefore $I = \sqrt{\pi}$. $\square$
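For fun, here's a quick numerical sanity check of the result — a sketch, assuming NumPy and SciPy are installed:

```python
import numpy as np
from scipy.integrate import quad

# Numerically integrate e^{-x^2} over the whole real line.
value, abs_err = quad(lambda x: np.exp(-x**2), -np.inf, np.inf)

print(value)           # 1.7724538509055159...
print(np.sqrt(np.pi))  # 1.7724538509055159
```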

The Basel problem

Euler's 1734 result, which stunned the mathematical world:

$$\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}$$

More generally, for even positive integers, the Riemann zeta function satisfies:

$$\zeta(2k) = \frac{(-1)^{k+1} B_{2k} (2\pi)^{2k}}{2 \cdot (2k)!}$$

where $B_{2k}$ are the Bernoulli numbers.
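Both identities are easy to check numerically. A quick sketch, using $B_2 = 1/6$ for the Basel case ($k = 1$) and $B_4 = -1/30$ for $k = 2$:

```python
import math

# Partial sum of the Basel series vs. pi^2 / 6.
partial = sum(1 / n**2 for n in range(1, 100_000))
print(partial, math.pi**2 / 6)  # 1.644924..., 1.644934...

# The even-zeta formula at k = 2, with B_4 = -1/30:
zeta4 = (-1) ** (2 + 1) * (-1 / 30) * (2 * math.pi) ** 4 / (2 * math.factorial(4))
print(zeta4, math.pi**4 / 90)   # both 1.082323...
```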

Softmax and attention

The softmax function, which appears everywhere in machine learning:

$$\text{softmax}(x)_i = \frac{e^{x_i}}{\sum_{j} e^{x_j}}$$
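One practical detail: computed naively, the exponentials overflow for large inputs, so implementations shift by the max first — valid because softmax is invariant to adding a constant to every coordinate. A minimal NumPy sketch:

```python
import numpy as np

def softmax(x):
    # softmax(x) == softmax(x - c) for any constant c, so shifting by
    # max(x) keeps the exponentials from overflowing.
    z = np.exp(x - np.max(x))
    return z / z.sum()

print(softmax(np.array([1000.0, 1001.0])))  # [0.26894142 0.73105858], no overflow
```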

The scaled dot-product attention mechanism used in transformers:

$$\text{Attention}(Q, K, V) = \text{softmax}\!\left(\frac{QK^\top}{\sqrt{d_k}}\right) V$$

where $Q \in \mathbb{R}^{n \times d_k}$, $K \in \mathbb{R}^{m \times d_k}$, and $V \in \mathbb{R}^{m \times d_v}$.

Code

I'll sometimes include code. Here's a minimal transformer attention block in Python:

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(Q, K, V, mask=None):
    """
    Q: (batch, heads, seq_q, d_k)
    K: (batch, heads, seq_k, d_k)
    V: (batch, heads, seq_k, d_v)
    """
    d_k = Q.size(-1)
    # Similarity of every query with every key: (batch, heads, seq_q, seq_k)
    scores = torch.matmul(Q, K.transpose(-2, -1)) / d_k**0.5

    if mask is not None:
        # Masked positions get -inf so they receive zero attention weight.
        scores = scores.masked_fill(mask == 0, float('-inf'))

    weights = F.softmax(scores, dim=-1)
    # Output: (batch, heads, seq_q, d_v), plus the weights for inspection.
    return torch.matmul(weights, V), weights
```
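A quick shape check with random tensors and a causal mask (the dimensions here are arbitrary):

```python
batch, heads, seq, d_k, d_v = 2, 4, 8, 16, 16
Q = torch.randn(batch, heads, seq, d_k)
K = torch.randn(batch, heads, seq, d_k)
V = torch.randn(batch, heads, seq, d_v)

# Lower-triangular causal mask: position i attends only to positions <= i.
mask = torch.tril(torch.ones(seq, seq))

out, weights = scaled_dot_product_attention(Q, K, V, mask=mask)
print(out.shape, weights.shape)
# torch.Size([2, 4, 8, 16]) torch.Size([2, 4, 8, 8])
```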

And a simple gradient descent loop:

```python
for step in range(num_steps):
    loss = criterion(model(x), y)  # forward pass
    optimizer.zero_grad()          # clear gradients from the previous step
    loss.backward()                # backpropagate
    optimizer.step()               # update parameters
```
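That loop assumes `model`, `criterion`, `optimizer`, and the data already exist. For completeness, a minimal self-contained version — the toy regression setup is purely illustrative:

```python
import torch

# Toy regression: fit y = 3x + 1 with a single linear layer.
x = torch.randn(256, 1)
y = 3 * x + 1 + 0.01 * torch.randn(256, 1)

model = torch.nn.Linear(1, 1)
criterion = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
num_steps = 200

for step in range(num_steps):
    loss = criterion(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(model.weight.item(), model.bias.item())  # ~3.0, ~1.0
```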

References

A non-exhaustive list of references:

- Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. NeurIPS 2017.
- Euler, L. (1734). De summis serierum reciprocarum. Commentarii academiae scientiarum Petropolitanae.

Final notes

This post is mostly a test of the blog infrastructure — LaTeX rendering, code highlighting, collapsible sections, table of contents, tags. Everything seems to work.

Future posts will be more substantive. If you want to follow along, my email is lucas.gx.sun@gmail.com.

Cite this post

```bibtex
@online{hello_world,
  author = {Lucas Sun},
  title  = {Hello World},
  year   = {2026},
  month  = {04},
  day    = {15},
  url    = {https://xtimecrystal.com/posts/hello-world/},
}
```