DR error measures: Difference between revisions
mNo edit summary |
|||
| (10 intermediate revisions by the same user not shown) | |||
| Line 11: | Line 11: | ||
with some root real-valued harmonic ''x''. Let <math>D_0 = 0, D_i = \sum_{k=1}^i \delta_k</math> be the delta signature +δ<sub>1</sub> +δ<sub>2</sub> ... +δ<sub>''n''</sub> written cumulatively. | with some root real-valued harmonic ''x''. Let <math>D_0 = 0, D_i = \sum_{k=1}^i \delta_k</math> be the delta signature +δ<sub>1</sub> +δ<sub>2</sub> ... +δ<sub>''n''</sub> written cumulatively. | ||
We want to measure the error ''without having to fix any | We want to measure the error ''without having to fix any interval in the target chord'' (as one might naively fix an interval and measure errors in the other deltas in relation to the fixed interval). To do this we solve a least-squares optimization problem: use a root-sum-square objective function and optimize ''x'' (and any free deltas) to minimize that function. | ||
== Domain and error model == | == Domain and error model == | ||
| Line 18: | Line 18: | ||
* the collection of intervals to sum over (which we call the ''error model''): | * the collection of intervals to sum over (which we call the ''error model''): | ||
** ''rooted'': Only intervals from the root real-valued harmonic ''x'' are chosen. | ** ''rooted'': Only intervals from the root real-valued harmonic ''x'' are chosen. | ||
** ''pairwise'': All intervals in the chord are | ** ''pairwise'': All intervals in the chord are chosen. | ||
** ''all-steps'': Only intervals between adjacent notes are | ** ''all-steps'': Only intervals between adjacent notes are chosen. | ||
We arrive at the following general formula: Let <math> | We arrive at the following general formula: Let <math>I \subseteq {\{0, 1, 2, ..., n\} \choose 2}</math> be the error model, and let <math>f_D</math> represent the domain function (identity for linear, or <math>\log</math>). Then the objective function (measuring the error) to be minimized by optimizing <math>x</math> and any free deltas is: | ||
<math> | <math> | ||
| Line 44: | Line 44: | ||
||<math>\sqrt{\sum_{0\leq i<n} \Bigg( \frac{x + D_{i+1}}{x + D_i} - \frac{r_{i+1}}{r_i} \Bigg)^2 }</math> | ||<math>\sqrt{\sum_{0\leq i<n} \Bigg( \frac{x + D_{i+1}}{x + D_i} - \frac{r_{i+1}}{r_i} \Bigg)^2 }</math> | ||
|- | |- | ||
!rowspan="3"|Logarithmic<br/>( | !rowspan="3"|Logarithmic<br/>(nataves) | ||
!|Rooted | !|Rooted | ||
||<math>\sqrt{\sum_{i=1}^n \Bigg(\log \frac{x + D_i}{r_i x} \Bigg)^2}</math> | ||<math>\sqrt{\sum_{i=1}^n \Bigg(\log \frac{x + D_i}{r_i x} \Bigg)^2}</math> | ||
| Line 54: | Line 54: | ||
||<math>\sqrt{\sum_{0\leq i<n} \Bigg(\log \frac{x + D_{i+1}}{x + D_i} - \log \frac{r_{i+1}}{r_i} \Bigg)^2}</math> | ||<math>\sqrt{\sum_{0\leq i<n} \Bigg(\log \frac{x + D_{i+1}}{x + D_i} - \log \frac{r_{i+1}}{r_i} \Bigg)^2}</math> | ||
|} | |} | ||
To convert | To convert nataves to cents, multiply by <math>\frac{1200}{\log 2}.</math> | ||
== Solution methods == | == Solution methods == | ||
This section gets into the depths of mathematical optimization methods used to minimize DR error. (Optimization is a whole field and there are many different methods; the reason most of those methods exist is to find solutions when finding | This section gets into the depths of mathematical optimization methods used to minimize DR error. (Optimization is a whole field and there are many different methods; the reason most of those methods exist is to find solutions when finding formulas for the solutions is infeasible.) | ||
=== Grid method (FDR case) === | === Grid method (FDR case) === | ||
<syntaxhighlight lang="py"> | <syntaxhighlight lang="py" line="1"> | ||
class GridSolution: | class GridSolution: | ||
def __init__(self, x, fx): | def __init__(self, x, fx): | ||
| Line 164: | Line 114: | ||
In the Python implementation below, <code>x</code> represents the vector <math>(x_1, x_2, ..., x_n),</math> <code>x0</code> is the initial guess for the solution, and <code>f</code> is the objective function. | In the Python implementation below, <code>x</code> represents the vector <math>(x_1, x_2, ..., x_n),</math> <code>x0</code> is the initial guess for the solution, and <code>f</code> is the objective function. | ||
<syntaxhighlight lang="py"> | <syntaxhighlight lang="py" line="1"> | ||
import numpy as np | import numpy as np | ||
import math | import math | ||
| Line 303: | Line 253: | ||
L-BFGS-B is an approximation to BFGS-B limiting memory usage, particularly suited for high-dimensional problems but nevertheless agreeing very well with BFGS-B for typical DR test cases. Python code for the L-BFGS method is provided below. | L-BFGS-B is an approximation to BFGS-B limiting memory usage, particularly suited for high-dimensional problems but nevertheless agreeing very well with BFGS-B for typical DR test cases. Python code for the L-BFGS method is provided below. | ||
<syntaxhighlight lang="py"> | <syntaxhighlight lang="py" line="1"> | ||
import numpy as np | import numpy as np | ||
import math | import math | ||
| Line 413: | Line 363: | ||
== External links == | == External links == | ||
* [https:// | * [https://turbofishcrow.github.io/delta/ Inthar's DR chord explorer (includes least-squares error calculation in both domains and multiple error models)] | ||
{{cat|Atypical ratios}} | {{cat|Atypical ratios}} | ||
Latest revision as of 04:22, 24 February 2026
This article will describe several least-squares error measures for delta-rational chords. The idea behind least-squares error measures is to find the chord with the exact specified delta signature such that it deviates as little as possible from the chord that is meant to approximate that delta signature, and to measure the deviation. Least-squares error measures have the advantage of not fixing a particular interval in the chord when constructing the chord of best fit. However, like any other numerical measure of concordance or error, you should take them with a grain of salt.
Conventions and introduction
The idea motivating least-squares error measures on a chord as an approximation to a given delta signature is the following:
We want the error of a chord 1:r1:r2:...:rn (in increasing order; we also take r0 = 1), with n > 1, in the linear domain as an approximation to a delta-rational chord with signature +δ1 +δ2 ... +δn (possibly with some deltas free), i.e. the target chord is
with some root real-valued harmonic x. Let be the delta signature +δ1 +δ2 ... +δn written cumulatively.
We want to measure the error without having to fix any interval in the target chord (as one might naively fix an interval and measure errors in the other deltas in relation to the fixed interval). To do this we solve a least-squares optimization problem: use a root-sum-square objective function and optimize x (and any free deltas) to minimize that function.
Domain and error model
We have two choices:
- to measure either the linear (frequency ratio) error or the logarithmic (cents) one (called the domain);
- the collection of intervals to sum over (which we call the error model):
- rooted: Only intervals from the root real-valued harmonic x are chosen.
- pairwise: All intervals in the chord are chosen.
- all-steps: Only intervals between adjacent notes are chosen.
We arrive at the following general formula: Let be the error model, and let represent the domain function (identity for linear, or ). Then the objective function (measuring the error) to be minimized by optimizing and any free deltas is:
| Domain | Error model | Objective function |
|---|---|---|
| Linear | Rooted | |
| Pairwise | ||
| All-steps | ||
| Logarithmic (nataves) |
Rooted | |
| Pairwise | ||
| All-steps |
To convert nataves to cents, multiply by
Solution methods
This section gets into the depths of mathematical optimization methods used to minimize DR error. (Optimization is a whole field and there are many different methods; the reason most of those methods exist is to find solutions when finding formulas for the solutions is infeasible.)
Grid method (FDR case)
class GridSolution:
def __init__(self, x, fx):
self.x = x
self.fx = fx
def __repr__(self):
return f"GridSolution(x={self.x:.5f}, fx={self.fx:.5f})"
def grid_method(f, window_size=100, coarse_steps=1000, fine_steps=1000):
best_error = float("inf")
best_x = 0
coarse_step = window_size / coarse_steps
# Fine step partitions one coarse step into smaller pieces
fine_step = coarse_step / fine_steps
# --- Phase 1: Coarse Grid Search ---
# Search in the window (0, window_size]
for i in range(1, coarse_steps + 1):
x = i * coarse_step
fx = f(x)
if fx < best_error:
best_error = fx
best_x = x
# --- Phase 2: Fine Grid Search ---
# Center the new search window around the best_x found in Phase 1
# We search from (best_x - coarse_step/2) to (best_x + coarse_step/2)
fine_window_lower = best_x - (coarse_step / 2)
for j in range(1, fine_steps + 1):
x = fine_window_lower + (j * fine_step)
fx = f(x)
if fx < best_error:
best_error = fx
best_x = x
return GridSolution(best_x, best_error)BFGS-B (one related delta set, arbitrary free deltas)
We let x1 = x and include additional free variables x2, ..., xn, one for every additional +?, after coalescing segments of consecutive +?'s into one +? and after trimming leading and trailing free delta segments.
BFGS-B is a quasi-Newton optimization method (based on BFGS) particularly suited for this problem:
- The objective function is smooth, allowing use of gradients
- Fast convergence, requiring at worst 20 iterations for accuracy
- Naturally deals with the x > 0 constraint using a log barrier and minimizing the transformed function using the unconstrained BFGS method
- Acceptable memory usage given a realistic number of parameters for practical DR chords (up to 3 interior free delta segments, thus 4 parameters).
It is a quasi-Newton method because it uses an approximation of the Hessian (matrix of mixed second partial derivatives) of the objective function at each step.
In the Python implementation below, x represents the vector x0 is the initial guess for the solution, and f is the objective function.
import numpy as np
import math
class BFGSSolution:
def __init__(self, x, fx, iterations, success):
self.x = x
self.fx = fx
self.iterations = iterations
self.success = success
def numerical_gradient(f, x, eps=1e-8):
grad = np.zeros_like(x)
for i in range(len(x)):
x_plus = x.copy()
x_minus = x.copy()
x_plus[i] += eps
x_minus[i] -= eps
grad[i] = (f(x_plus) - f(x_minus)) / (2 * eps)
return grad
def bfgs(f, grad, x0, max_iterations=100, tolerance=1e-5):
x = np.array(x0, dtype=float)
n = len(x)
fx = f(x)
g = grad(x)
# Approximate inverse Hessian
# Initial approximation is the identity matrix
H = np.eye(n)
for i in range(max_iterations):
# 0: Check convergence
grad_norm = np.linalg.norm(g)
if grad_norm < tolerance:
return BFGSSolution(x, fx, i, True)
# 1: Set search direction p (negative gradient direction transformed by H)
# p = -H * g
p = -H @ g
# 2: Get alpha satisfying Wolfe conditions (Armijo rule and curvature condition)
c1 = 1e-4
c2 = 0.9
max_line_search = 20
rho_ls = 0.5
gp = np.dot(g, p)
alpha = 1.0
for _ in range(max_line_search):
x_next_guess = x + alpha * p
# Check Armijo rule
# f(x + alpha*p) <= f(x) + c1 * alpha * p^T * g
if f(x_next_guess) <= fx + c1 * alpha * gp:
# Check Curvature condition
# -p^T * grad(x_next) <= -c2 * p^T * g
g_next_guess = grad(x_next_guess)
if -np.dot(p, g_next_guess) <= -c2 * gp:
break
alpha *= rho_ls
# 3: Set s = alpha * p and x_next = x + s
s = alpha * p
x_next = x + s
# 4: Set y = grad(x_next) - grad(x)
# We re-calculate grad(x_next) here to match the strict logic flow,
# though optimization could reuse g_next_guess from the successful line search.
g_next = grad(x_next)
y = g_next - g
# 5: BFGS Update
# Update H += U + V
sy = np.dot(s, y)
# Prevent division by zero if step size was extremely small
if sy == 0:
# In a robust implementation, you might reset H to Identity here
break
Hy = H @ y
# Calculate scalar for the first term: (s^T y + y^T H y) / (s^T y)^2
scalar1 = (sy + np.dot(y, Hy)) / (sy * sy)
U = scalar1 * np.outer(s, s)
# Calculate the second term matrices
# W = (H y) s^T + s (y^T H)
# Note: Since H is symmetric, y^T H is equivalent to (H y)^T
W = np.outer(Hy, s) + np.outer(s, Hy)
V = (-1 / sy) * W
H = H + U + V
# Update x, fx, and g for next iteration
x = x_next
fx = f(x_next)
g = g_next
return BFGSSolution(x, fx, max_iterations, False)
def bfgs_barrier(f, bounds, x0, history_size=10, max_iterations=100, tolerance=1e-5, barrier_weight=1e-4):
"""
Solves bounded optimization using a Log-Barrier method wrapped around BFGS.
Note: x0 must be strictly feasible (inside bounds).
"""
def transformed_f(x):
penalty = 0
for i, val in enumerate(x):
lower, upper = bounds[i]
# Hard cutoff prevents log domain errors during line search exploration
if (lower is not None and val <= lower) or (upper is not None and val >= upper):
return float("inf")
# Log barrier penalties
if lower is not None:
penalty -= barrier_weight * math.log(val - lower)
if upper is not None:
penalty -= barrier_weight * math.log(upper - val)
return f(x) + penalty
# Use the transformed function for gradients as well
grad = lambda x: numerical_gradient(transformed_f, x)
result = bfgs(transformed_f, grad, x0, history_size, max_iterations, tolerance)
# Restore actual function value
result.fx = f(result.x)
return resultL-BFGS-B (one related delta set, arbitrary free deltas)
L-BFGS-B is an approximation to BFGS-B limiting memory usage, particularly suited for high-dimensional problems but nevertheless agreeing very well with BFGS-B for typical DR test cases. Python code for the L-BFGS method is provided below.
import numpy as np
import math
class LBFGSSolution:
def __init__(self, x, fx, iterations, success):
self.x = x
self.fx = fx
self.iterations = iterations
self.success = success
def numerical_gradient(f, x, eps=1e-8):
grad = np.zeros_like(x)
for i in range(len(x)):
x_plus = x.copy()
x_minus = x.copy()
x_plus[i] += eps
x_minus[i] -= eps
grad[i] = (f(x_plus) - f(x_minus)) / (2 * eps)
return grad
def l_bfgs(f, grad, x0, history_size=10, max_iterations=100, tolerance=1e-5):
x = np.array(x0, dtype=float) # Ensure float array
fx = f(x)
g = grad(x)
s_history = []
y_history = []
rho_history = []
for iteration in range(max_iterations):
grad_norm = np.linalg.norm(g)
if grad_norm < tolerance:
return LBFGSSolution(x, fx, iteration, True)
# --- Two-loop recursion ---
q = g.copy()
# We need to store alpha to use it in the second loop
m = len(s_history)
alpha = [0.0] * m
# First loop (Backward)
for i in range(m - 1, -1, -1):
alpha[i] = rho_history[i] * np.dot(s_history[i], q)
q = q - alpha[i] * y_history[i]
# Initial Hessian approximation scaling
gamma = 1.0
if m > 0:
gamma = np.dot(s_history[-1], y_history[-1]) / np.dot(y_history[-1], y_history[-1])
z = q * gamma
# Second loop (Forward)
for i in range(m):
beta = rho_history[i] * np.dot(y_history[i], z)
z = z + s_history[i] * (alpha[i] - beta)
p = -z # Search direction
# --- Line search (Backtracking) ---
step_size = 1.0
c1 = 1e-4
rho_ls = 0.5
max_line_search = 20
# Check for sufficient decrease (Armijo rule)
# f(x + alpha*p) <= f(x) + c1 * alpha * p^T * g
g_dot_p = np.dot(g, p)
for _ in range(max_line_search):
x_new = x + step_size * p
fx_new = f(x_new)
if fx_new <= fx + c1 * step_size * g_dot_p:
break
step_size *= rho_ls
else:
# If line search fails to find a better point, stop or accept best attempt
# (Here we just accept the last reduced step_size)
pass
# --- Update History ---
s = x_new - x
g_new = grad(x_new)
y = g_new - g
sy = np.dot(s, y)
if sy > 1e-10: # Ensure curvature condition
s_history.append(s)
y_history.append(y)
rho_history.append(1.0 / sy)
# Maintain history size
if len(s_history) > history_size:
s_history.pop(0)
y_history.pop(0)
rho_history.pop(0)
x = x_new
fx = fx_new
g = g_new
return LBFGSSolution(x, fx, max_iterations, False)