Machine Learning with Swift

Implementing DTW in Swift

There are two versions of the algorithm: one with a locality constraint and one without it. We'll implement both.

The full source code for the application we are developing in this chapter can be found in the MotionClassification folder of supplementary materials.

Let's define a DTW structure, and create a static function distance in it:

import Foundation 
 
struct DTW { 
    static func distance(sVec: [Double], tVec: [Double]) -> Double { 

First, we create a distance matrix of size (n+1) x (m+1) and populate it with initial values: the first cell of the matrix should be equal to zero, and the first row and the first column should be equal to the maximum double value. This is needed to handle the border conditions properly later on. The first cell plays the role of the initial value: initially, the distance is zero. All other cells are unimportant for now, as we'll overwrite their values later (in the code, we simply fill the whole matrix with the maximum value and then reset the first cell to zero):

let n = sVec.count 
let m = tVec.count 
var dtwMat = [[Double]](repeating: [Double](repeating: Double.greatestFiniteMagnitude, count: m+1), count: n+1) 
dtwMat[0][0] = 0 

After this, we iterate over both arrays, with i going from 1 to n and j from 1 to m, filling the distance matrix. At each position [i, j], we calculate the cost as the squared difference between the corresponding elements of the two series, (s[i-1] - t[j-1])² (the -1 offset accounts for the extra first row and column of the matrix), and add to it the minimum of the three neighboring cells, which correspond to insertion, deletion, and match:

for i in 1...n { 
  for j in 1...m { 
    let cost = pow(sVec[i-1] - tVec[j-1], 2) 
    let insertion = dtwMat[i-1][j] 
    let deletion = dtwMat[i][j-1] 
    let match = dtwMat[i-1][j-1] 
    let prevMin = min(insertion, deletion, match) 
    dtwMat[i][j] = cost + prevMin 
  } 
} 

The value we are looking for is in the last cell of the matrix, dtwMat[n][m]. To make the result comparable between series of different lengths, we normalize it by the length of the longer series:

    return dtwMat[n][m]/Double(max(n, m))  
} 

This gives us an average distance between two series.
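
To get a feel for what this number means, here is a tiny hand-checkable example (my own sanity check, not part of the book's project). When warping [1, 2, 3] onto [1, 3], the optimal path matches 1 with 1, 2 with 3, and 3 with 3, accumulating a squared cost of 0 + 1 + 0 = 1, which is then divided by max(3, 2) = 3:

let toyDistance = DTW.distance(sVec: [1, 2, 3], tVec: [1, 3]) 
// toyDistance is 1/3, or about 0.333 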

To avoid warping a whole sequence onto a small segment of its counterpart, a locality constraint was introduced. It sets an upper limit on how many insertions or deletions can occur in a row.

Here is a version of the algorithm with the locality constraint w, added to the DTW structure as a second static method:

    static func distance(sVec: [Double], tVec: [Double], w: Int) -> Double { 
    let n = sVec.count 
    let m = tVec.count 
    var dtwMat = [[Double]](repeating: [Double](repeating: Double.greatestFiniteMagnitude, count: m+1), count: n+1) 
    dtwMat[0][0] = 0 
    let constraint = max(w, abs(n-m)) // widen the window to at least |n - m| so that a warping path exists 
     
    for i in 1...n { 
        for j in max(1, i-constraint)...min(m, i+constraint) { 
            let cost = pow(sVec[i-1] - tVec[j-1], 2) 
            let insertion = dtwMat[i-1][j] 
            let deletion = dtwMat[i][j-1] 
            let match = dtwMat[i-1][j-1] 
            dtwMat[i][j] = cost + min(insertion, deletion, match) 
        } 
    } 
    return dtwMat[n][m]/Double(max(n, m)) 
    } 
} // end of struct DTW 

Let's test our algorithm. The first two vectors are similar:

let aVec: [Double] = [1,2,3,4,5,6,7,6,5,4,3,2,1] 
let bVec: [Double] = [2,3,4,5,7,7,6,5,4,3,2,1,0,-2] 
 
let distance1 = DTW.distance(sVec: aVec, tVec: bVec) 
let distance2 = DTW.distance(sVec: aVec, tVec: bVec, w: 3) 

The result is about 0.857 in both cases.

Now we have two very different vectors:

let cVec: [Double] = [1,2,3,4,5,6,7,6,5,4,3,2,1,0] 
let dVec: [Double] = [30,2,2,0,1,1,1,14,44] 
 
let distance3 = DTW.distance(sVec: cVec, tVec: dVec) 
let distance4 = DTW.distance(sVec: cVec, tVec: dVec, w: 3) 

The results are 216.571 and 218.286, respectively. Note that the distance with the locality constraint is even bigger than the one without it. This makes sense: the constraint excludes some warping paths, so the cost of the best remaining path can only stay the same or grow.

Our implementation of DTW is naïve and can be accelerated using parallel computing. To calculate a new row or column of the distance matrix, you don't need to wait until the previous one is completely finished; it only needs to stay one cell ahead of the one you are filling. DTW can be parallelized effectively on a GPU; see Accelerating Dynamic Time Warping Subsequence Search with GPUs and FPGAs for more details [3].
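
As a rough CPU-side illustration of this idea (my own sketch, not code from the book's project or from [3]), here is a wavefront variant that fills the matrix one anti-diagonal at a time. All cells with the same i + j depend only on cells from earlier anti-diagonals, so each diagonal can be computed concurrently; here this is done with DispatchQueue.concurrentPerform:

import Foundation 
import Dispatch 
 
// A sketch of wavefront-parallel DTW; like the versions above, it assumes 
// both series are non-empty. 
func parallelDTWDistance(sVec: [Double], tVec: [Double]) -> Double { 
    let n = sVec.count 
    let m = tVec.count 
    let cols = m + 1 
    // Flat (n+1) x (m+1) matrix; cell [i][j] lives at index i * cols + j. 
    var flat = [Double](repeating: .greatestFiniteMagnitude, count: (n + 1) * cols) 
    flat[0] = 0 
    flat.withUnsafeMutableBufferPointer { mat in 
        // k enumerates the anti-diagonals of the matrix. 
        for k in 2...(n + m) { 
            let iLow = max(1, k - m) 
            let iHigh = min(n, k - 1) 
            // Every cell on this diagonal writes to a distinct index and reads 
            // only from earlier diagonals, so the iterations are independent. 
            DispatchQueue.concurrentPerform(iterations: iHigh - iLow + 1) { offset in 
                let i = iLow + offset 
                let j = k - i 
                let cost = pow(sVec[i - 1] - tVec[j - 1], 2) 
                let insertion = mat[(i - 1) * cols + j] 
                let deletion = mat[i * cols + (j - 1)] 
                let match = mat[(i - 1) * cols + (j - 1)] 
                mat[i * cols + j] = cost + min(insertion, deletion, match) 
            } 
        } 
    } 
    return flat[n * cols + m] / Double(max(n, m)) 
} 

On real workloads, the per-cell work is tiny, so a practical implementation would give each thread a chunk of the diagonal (or move to the GPU, as in [3]); the point of the sketch is only the dependency structure.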