I noticed that both the covariance and the Frobenius norm are computed differently in your implementation.
You compute the Frobenius norm as below:
# frobenius norm between source and target
loss = torch.mean(torch.mul((xc - xct), (xc - xct)))
However, as stated at http://mathworld.wolfram.com/FrobeniusNorm.html , the Frobenius norm is the square root of the sum of the squared elements, not the mean of the squared elements.
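For reference, the exact Frobenius norm per that definition could be computed like this (a minimal sketch; the small matrices here are only example stand-ins for `xc` and `xct`):

```python
import torch

xc = torch.tensor([[2.0, 0.0], [0.0, 2.0]])   # example source covariance
xct = torch.tensor([[1.0, 0.0], [0.0, 1.0]])  # example target covariance

diff = xc - xct
# Frobenius norm: square root of the sum of squared elements
fro = torch.sqrt(torch.sum(diff * diff))
# equivalent to the built-in matrix norm
assert torch.allclose(fro, torch.linalg.norm(diff, ord='fro'))
```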
In the original paper the covariances are computed as below:
https://arxiv.org/abs/1607.01719

While in your implementation:
# source covariance
xm = torch.mean(source, 0, keepdim=True) - source
xc = xm.t() @ xm
# target covariance
xmt = torch.mean(target, 0, keepdim=True) - target
xct = xmt.t() @ xmt
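For comparison, Eq. (2) of the paper defines the covariance with a 1/(n−1) normalization and an explicit mean-offset term. A minimal sketch of that formula (the function name is mine, not from the repository):

```python
import torch

def coral_covariance(D):
    # Eq. (2)/(3) of arXiv:1607.01719:
    # C = 1/(n-1) * (D^T D - (1/n) * (1^T D)^T (1^T D))
    n = D.size(0)
    ones = torch.ones(1, n)
    col_sum = ones @ D                                   # 1^T D, shape (1, d)
    return (D.t() @ D - (col_sum.t() @ col_sum) / n) / (n - 1)

source = torch.randn(8, 3)
C = coral_covariance(source)
# agrees with the centered formulation, up to the missing 1/(n-1) factor
xm = torch.mean(source, 0, keepdim=True) - source
assert torch.allclose(C, (xm.t() @ xm) / (source.size(0) - 1), atol=1e-5)
```

So the implementation's `xm.t() @ xm` matches the paper's covariance only up to the 1/(n−1) factor.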