DR error measures
Revision as of 22:29, 14 December 2025
This article describes several least-squares error measures for delta-rational chords. They have the advantage of not fixing a particular interval in the chord when constructing the chord of best fit. However, like any other numerical measure of concordance or error, you should take them with a grain of salt.
Conventions and introduction
The idea motivating least-squares error measures on a chord as an approximation to a given delta signature is the following (for simplicity, let’s talk about the fully DR case first):
We want the error of a chord 1:r1:r2:...:rn (in increasing order), with n > 1, in the linear domain as an approximation to a fully delta-rational chord with signature +δ1 +δ2 ... +δn, i.e. a chord

x : (x + Δ1) : (x + Δ2) : ... : (x + Δn)

with root real-valued harmonic x. Here Δi = δ1 + δ2 + ... + δi is the delta signature +δ1 +δ2 ... +δn written cumulatively; for example, the signature +2 +3 +2 written cumulatively is (Δ1, Δ2, Δ3) = (2, 5, 7).
We want to measure the error without having to fix any dyad (as one might naively fix a dyad and measure errors in the other deltas in relation to the fixed dyad). To do this we solve a least-squares error problem: use a root-sum-square error function and optimize x (and any free deltas) to minimize that function.
Domain and error model
We have two choices to make:
- the domain: whether to measure the linear (frequency-ratio) error or the logarithmic (cents) error;
- the error model: the collection of intervals to sum over:
- Rooted: Only intervals from the root real-valued harmonic x are chosen.
- Pairwise: All intervals in the chord are compared.
- All-steps: Only intervals between adjacent notes are compared.
We arrive at the following general formula. Let S be the set of index pairs selected by the error model, with the conventions Δ0 = 0 and r0 = 1, and let g represent the domain function (the identity for linear, or the natural logarithm for logarithmic). Then the error function to be minimized by optimizing x and any free deltas is

sqrt( Σ_{(i,j) ∈ S} ( g((x + Δ_j)/(x + Δ_i)) − g(r_j/r_i) )^2 )
| Domain | Error model | Error function |
|---|---|---|
| Linear | Rooted | sqrt( Σ_{i=1..n} ((x + Δ_i)/x − r_i)^2 ) |
| Linear | Pairwise | sqrt( Σ_{0≤i<j≤n} ((x + Δ_j)/(x + Δ_i) − r_j/r_i)^2 ) |
| Linear | All-steps | sqrt( Σ_{i=1..n} ((x + Δ_i)/(x + Δ_{i−1}) − r_i/r_{i−1})^2 ) |
| Logarithmic (nepers) | Rooted | sqrt( Σ_{i=1..n} (ln((x + Δ_i)/x) − ln r_i)^2 ) |
| Logarithmic (nepers) | Pairwise | sqrt( Σ_{0≤i<j≤n} (ln((x + Δ_j)/(x + Δ_i)) − ln(r_j/r_i))^2 ) |
| Logarithmic (nepers) | All-steps | sqrt( Σ_{i=1..n} (ln((x + Δ_i)/(x + Δ_{i−1})) − ln(r_i/r_{i−1}))^2 ) |

Here Δ_i = δ1 + ... + δi are the cumulative deltas, with the conventions Δ_0 = 0 and r_0 = 1.
To convert nepers to cents, multiply by 1200/ln 2 ≈ 1731.234.
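As an illustration, all six formulas in the table can be implemented by one function over index pairs (a sketch; the function name `dr_error`, its parameter conventions, and the model/domain strings are inventions of this example):

```python
import math
from itertools import combinations

def dr_error(x, cum_deltas, ratios, model="rooted", domain="linear"):
    """Least-squares DR error for root x (sketch of the table's formulas).
    cum_deltas: cumulative deltas [D1..Dn]; ratios: [r1..rn] for the chord 1:r1:...:rn."""
    # Prepend the root with the conventions D0 = 0 and r0 = 1
    D = [0.0] + list(cum_deltas)
    r = [1.0] + list(ratios)
    n = len(ratios)
    if model == "rooted":
        pairs = [(0, j) for j in range(1, n + 1)]       # intervals from the root
    elif model == "pairwise":
        pairs = list(combinations(range(n + 1), 2))     # all intervals
    else:  # "all-steps"
        pairs = [(j - 1, j) for j in range(1, n + 1)]   # adjacent notes only
    g = (lambda v: v) if domain == "linear" else math.log
    return math.sqrt(sum((g((x + D[j]) / (x + D[i])) - g(r[j] / r[i])) ** 2
                         for i, j in pairs))
```

For a single dyad all three error models coincide, which gives a quick consistency check.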
Solution methods
Analytic
Fully DR, rooted linear
Substituting u = 1/x turns each residual (x + Δ_i)/x − r_i into Δ_i u − (r_i − 1), so the inner sum becomes an ordinary linear least-squares problem in u. Setting the derivative to 0 gives us the closed-form solution

x = ( Σ_{i=1..n} Δ_i^2 ) / ( Σ_{i=1..n} Δ_i (r_i − 1) )

which can be plugged back into

sqrt( Σ_{i=1..n} ((x + Δ_i)/x − r_i)^2 )

to obtain the least-squares linear error.
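This closed form is easy to transcribe (a sketch; the function name and argument conventions are this example's own, with delta_cum holding Δ1..Δn and ratios holding r1..rn):

```python
import math

def fdr_rooted_linear(delta_cum, ratios):
    """Closed-form least-squares root x for the fully-DR rooted linear case:
    x = sum(D_i^2) / sum(D_i * (r_i - 1)), then the residual error at that x."""
    x = (sum(d * d for d in delta_cum)
         / sum(d * (r - 1) for d, r in zip(delta_cum, ratios)))
    err = math.sqrt(sum(((x + d) / x - r) ** 2
                        for d, r in zip(delta_cum, ratios)))
    return x, err
```

For instance, the signature +1 +1 against the exactly delta-rational chord 1 : 5/4 : 3/2 gives x = 4 with zero error, as expected.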
Partially DR (one related delta set, one free variable), rooted linear
Suppose we wish to approximate a target delta signature of the form +δ1 +? +δ3 with the chord 1:r1:r2:r3 (where the +? is free to vary). The least-squares problem is to minimize by varying x and y the following:

sqrt( ((x + δ1)/x − r1)^2 + ((x + δ1 + y)/x − r2)^2 + ((x + δ1 + y + δ3)/x − r3)^2 )

where y represents the free delta +?.
Since sqrt is strictly increasing, minimizing this expression is equivalent to minimizing the expression inside the square root. We can therefore set the partial derivatives with respect to x and y of the inner expression equal to zero and use SymPy to solve the system symbolically:
<pre>
import sympy

x = sympy.Symbol("x", real=True)
y = sympy.Symbol("y", real=True)
d1 = sympy.Symbol("\\delta_{1}", real=True)
d2 = sympy.Symbol("\\delta_{2}", real=True)
d3 = sympy.Symbol("\\delta_{3}", real=True)
r1 = sympy.Symbol("r_1", real=True)
r2 = sympy.Symbol("r_2", real=True)
r3 = sympy.Symbol("r_3", real=True)
err_squared = ((x + d1) / x - r1) ** 2 + ((x + d1 + y) / x - r2) ** 2 + ((x + d1 + y + d3) / x - r3) ** 2
err_squared.expand()
err_squared_x = sympy.diff(err_squared, x)
err_squared_y = sympy.diff(err_squared, y)
sympy.nonlinsolve([err_squared_x, err_squared_y], [x, y])
</pre>
The unique solution with x > 0 is

x = (2δ1^2 + δ3^2) / (2δ1(r1 − 1) + δ3(r3 − r2)),  y = x(r2 + r3 − 2)/2 − δ1 − δ3/2.
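The solution can be sanity-checked numerically: at the closed-form (x, y), both partial derivatives of the squared error should vanish. A sketch (the helper names are hypothetical, and the chord 1 : 5/4 : 3/2 : 7/4 with signature +1 +? +2 is an arbitrarily chosen example):

```python
def partial_dr_solution(d1, d3, r1, r2, r3):
    # Closed-form stationary point for signature +d1 +? +d3 against 1:r1:r2:r3
    x = (2 * d1 ** 2 + d3 ** 2) / (2 * d1 * (r1 - 1) + d3 * (r3 - r2))
    y = x * (r2 + r3 - 2) / 2 - d1 - d3 / 2
    return x, y

def err_squared(x, y, d1, d3, r1, r2, r3):
    # Squared rooted linear error (the expression inside the square root)
    return (((x + d1) / x - r1) ** 2
            + ((x + d1 + y) / x - r2) ** 2
            + ((x + d1 + y + d3) / x - r3) ** 2)

x, y = partial_dr_solution(1, 2, 1.25, 1.5, 1.75)   # x = 6.0, y = 1.75
eps = 1e-6
# Central-difference partial derivatives at the claimed optimum
gx = (err_squared(x + eps, y, 1, 2, 1.25, 1.5, 1.75)
      - err_squared(x - eps, y, 1, 2, 1.25, 1.5, 1.75)) / (2 * eps)
gy = (err_squared(x, y + eps, 1, 2, 1.25, 1.5, 1.75)
      - err_squared(x, y - eps, 1, 2, 1.25, 1.5, 1.75)) / (2 * eps)
```

Both gx and gy come out numerically zero, confirming stationarity for this example.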
Grid method (FDR case)
<pre>
class GridSolution:
    def __init__(self, x, fx):
        self.x = x
        self.fx = fx

    def __repr__(self):
        return f"GridSolution(x={self.x:.5f}, fx={self.fx:.5f})"

def grid_method(f, window_size=100, coarse_steps=1000, fine_steps=1000):
    best_error = float("inf")
    best_x = 0
    coarse_step = window_size / coarse_steps
    # Fine step partitions one coarse step into smaller pieces
    fine_step = coarse_step / fine_steps
    # --- Phase 1: Coarse grid search ---
    # Search in the window (0, window_size]
    for i in range(1, coarse_steps + 1):
        x = i * coarse_step
        fx = f(x)
        if fx < best_error:
            best_error = fx
            best_x = x
    # --- Phase 2: Fine grid search ---
    # Center the new search window around the best_x found in Phase 1:
    # search from (best_x - coarse_step/2) to (best_x + coarse_step/2)
    fine_window_lower = best_x - (coarse_step / 2)
    for j in range(1, fine_steps + 1):
        x = fine_window_lower + (j * fine_step)
        fx = f(x)
        if fx < best_error:
            best_error = fx
            best_x = x
    return GridSolution(best_x, best_error)
</pre>
L-BFGS-B (one related delta set, arbitrary free deltas)
We let x1 = x and include additional free variables x2, ..., xn, one for every additional +?, after coalescing strings of consecutive +?'s into one +? and after trimming leading and trailing free delta segments.
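The preprocessing described above can be sketched as follows (a hypothetical helper; representing signatures as lists of string tokens is an assumption of this example):

```python
def normalize_signature(tokens):
    """Hypothetical helper: collapse runs of consecutive '+?' into one and
    trim leading/trailing free-delta segments, as described above.
    tokens: e.g. ['+d1', '+?', '+?', '+d3', '+?']."""
    out = []
    for t in tokens:
        if t == '+?' and out and out[-1] == '+?':
            continue  # coalesce consecutive free deltas into one
        out.append(t)
    # Leading/trailing free deltas impose no constraint, so drop them
    while out and out[0] == '+?':
        out.pop(0)
    while out and out[-1] == '+?':
        out.pop()
    return out
```

After normalization, each remaining '+?' contributes one free variable among x2, ..., xn.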
L-BFGS-B is a quasi-Newton optimization method (based on L-BFGS) particularly suited to this problem:
- The error function is smooth, so gradients can be used.
- Convergence is fast, typically requiring at most around 20 iterations in practice.
- The x > 0 constraint is handled naturally: add a log barrier and minimize the transformed function using regular L-BFGS.
Python code
Here x represents the vector (x1, ..., xn), x0 is the initial guess for the solution, and f is the error function.
<pre>
import numpy as np
import math

class LBFGSSolution:
    def __init__(self, x, fx, iterations, success):
        self.x = x
        self.fx = fx
        self.iterations = iterations
        self.success = success

def numerical_gradient(f, x, eps=1e-8):
    # Central-difference approximation of the gradient
    grad = np.zeros_like(x)
    for i in range(len(x)):
        x_plus = x.copy()
        x_minus = x.copy()
        x_plus[i] += eps
        x_minus[i] -= eps
        grad[i] = (f(x_plus) - f(x_minus)) / (2 * eps)
    return grad

def l_bfgs(f, grad, x0, history_size=10, max_iterations=100, tolerance=1e-5):
    x = np.array(x0, dtype=float)  # Ensure float array
    fx = f(x)
    g = grad(x)
    s_history = []
    y_history = []
    rho_history = []
    for iteration in range(max_iterations):
        grad_norm = np.linalg.norm(g)
        if grad_norm < tolerance:
            return LBFGSSolution(x, fx, iteration, True)
        # --- Two-loop recursion ---
        q = g.copy()
        # We need to store alpha to use it in the second loop
        m = len(s_history)
        alpha = [0.0] * m
        # First loop (backward)
        for i in range(m - 1, -1, -1):
            alpha[i] = rho_history[i] * np.dot(s_history[i], q)
            q = q - alpha[i] * y_history[i]
        # Initial Hessian approximation scaling
        gamma = 1.0
        if m > 0:
            gamma = np.dot(s_history[-1], y_history[-1]) / np.dot(y_history[-1], y_history[-1])
        z = q * gamma
        # Second loop (forward)
        for i in range(m):
            beta = rho_history[i] * np.dot(y_history[i], z)
            z = z + s_history[i] * (alpha[i] - beta)
        p = -z  # Search direction
        # --- Line search (backtracking) ---
        step_size = 1.0
        c1 = 1e-4
        rho_ls = 0.5
        max_line_search = 20
        # Check for sufficient decrease (Armijo rule):
        # f(x + alpha*p) <= f(x) + c1 * alpha * p^T * g
        g_dot_p = np.dot(g, p)
        for _ in range(max_line_search):
            x_new = x + step_size * p
            fx_new = f(x_new)
            if fx_new <= fx + c1 * step_size * g_dot_p:
                break
            step_size *= rho_ls
        else:
            # If the line search fails to find a better point, accept the
            # last (smallest) step_size attempted
            pass
        # --- Update history ---
        s = x_new - x
        g_new = grad(x_new)
        y = g_new - g
        sy = np.dot(s, y)
        if sy > 1e-10:  # Ensure curvature condition
            s_history.append(s)
            y_history.append(y)
            rho_history.append(1.0 / sy)
            # Maintain history size
            if len(s_history) > history_size:
                s_history.pop(0)
                y_history.pop(0)
                rho_history.pop(0)
        x = x_new
        fx = fx_new
        g = g_new
    return LBFGSSolution(x, fx, max_iterations, False)

def l_bfgs_barrier(f, bounds, x0, history_size=10, max_iterations=100, tolerance=1e-5, barrier_weight=1e-4):
    """
    Solves bounded optimization using a log-barrier method wrapped around L-BFGS.
    Note: x0 must be strictly feasible (inside bounds).
    """
    def transformed_f(x):
        penalty = 0
        for i, val in enumerate(x):
            lower, upper = bounds[i]
            # Hard cutoff prevents log domain errors during line search exploration
            if (lower is not None and val <= lower) or (upper is not None and val >= upper):
                return float("inf")
            # Log barrier penalties
            if lower is not None:
                penalty -= barrier_weight * math.log(val - lower)
            if upper is not None:
                penalty -= barrier_weight * math.log(upper - val)
        return f(x) + penalty
    # Use the transformed function for gradients as well
    grad = lambda x: numerical_gradient(transformed_f, x)
    result = l_bfgs(transformed_f, grad, x0, history_size, max_iterations, tolerance)
    # Restore actual function value
    result.fx = f(result.x)
    return result
</pre>
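For comparison, the same bounded problem can be handed to SciPy's ready-made L-BFGS-B implementation (a sketch assuming SciPy is available; the chord 1 : 5/4 : 3/2 : 7/4 with signature +1 +? +2 is an arbitrarily chosen example, whose analytic solution is x = 6, y = 1.75):

```python
import math
import numpy as np
from scipy.optimize import minimize

def rooted_linear_error(v):
    # v = (x, y): root harmonic and the free delta for signature +1 +? +2
    x, y = v
    return math.sqrt(((x + 1) / x - 1.25) ** 2
                     + ((x + 1 + y) / x - 1.5) ** 2
                     + ((x + 1 + y + 2) / x - 1.75) ** 2)

res = minimize(rooted_linear_error, x0=np.array([5.0, 1.0]),
               method="L-BFGS-B", bounds=[(1e-9, None), (None, None)])
# res.x should approach the analytic optimum (6, 1.75)
```

SciPy's L-BFGS-B handles the x > 0 bound natively (by gradient projection), so no explicit log barrier is needed there.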
