Optimization of lattice Boltzmann simulations on heterogeneous computers