GPU solvers

gpu_solvers.py

Because GPUs execute work asynchronously, we can release the GIL in Python while a GPU call is in flight and run in parallel over multiple GPUs, driving each one from its own thread. The only constraint is the input/output transfer between the hardware devices.
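
The threading pattern described above can be sketched with the standard library alone. This is a minimal illustration, not the code in gpu_solvers.py: `solve_on_device` and `solve_multi_gpu` are hypothetical names, and `time.sleep` stands in for a blocking GPU call because, like such a call in a GIL-releasing library, it lets other Python threads run while it waits.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def solve_on_device(device_id, chunk):
    """Hypothetical stand-in for a solver call on one GPU.

    A real implementation would copy `chunk` to the device, launch the
    kernel, and copy the result back. time.sleep releases the GIL, which
    mimics waiting on a GPU call that releases it.
    """
    time.sleep(0.1)  # assumption: simulated device work, no real GPU involved
    return device_id, sum(x * x for x in chunk)


def solve_multi_gpu(data, n_devices):
    """Split `data` across `n_devices` and solve each part in its own thread."""
    # Interleaved split: device d receives elements d, d + n_devices, ...
    chunks = [data[d::n_devices] for d in range(n_devices)]
    with ThreadPoolExecutor(max_workers=n_devices) as pool:
        futures = [
            pool.submit(solve_on_device, d, chunk)
            for d, chunk in enumerate(chunks)
        ]
        # Collect in submission order so results line up with device ids.
        return [f.result() for f in futures]


if __name__ == "__main__":
    start = time.perf_counter()
    results = solve_multi_gpu(list(range(8)), n_devices=2)
    elapsed = time.perf_counter() - start
    print(results)
    print(f"elapsed: {elapsed:.2f}s")
```

Because each worker thread spends its time waiting rather than holding the GIL, the two simulated devices overlap and the total wall time stays close to that of a single call. In a real setup the host-to-device and device-to-host copies are the part that does not overlap for free, which is the transfer constraint noted above.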