A multi-GPU implementation of a D2Q37 Lattice Boltzmann code