scs: ./out/demo_socp_gpu fails to solve its problem
Specifications
- OS: Arch Linux
- SCS Version: master at 5be0e1684d12c4cfd4d22c5fba236a84a092ab5b
- Compiler: gcc
Description
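SCS fails to solve the problem generated by ./out/demo_socp_gpu 1000 0.5 0.5 1: it runs until max_iters (100000) without converging and reports an objective of -6159.03, while the true optimum is 2022.07.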
How to reproduce
Build the GPU target, linking against Julia's OpenBLAS:
JULIA_HOME="/opt/julias/julia-1.6"
JULIA_LD_PATH="$JULIA_HOME/lib/julia"
BLASLDFLAGS="-L$JULIA_LD_PATH -lopenblas64_"
SCSFLAGS="USE_OPENMP=1 BLAS64=1 BLASSUFFIX=_64_"
make -j4 CFLAGS="-march=native" DLONG=0 ${SCSFLAGS} BLASLDFLAGS="${BLASLDFLAGS}" gpu
Then run it via:
LD_LIBRARY_PATH=$JULIA_LD_PATH:$LD_LIBRARY_PATH ./out/demo_socp_gpu 1000 0.5 0.5 1
Additional information
The direct and indirect CPU solvers, compiled the same way, work just fine.
Output
seed : 1
A is 4000 by 1000, with 32 nonzeros per column.
A has 32000 nonzeros (0.800000% dense).
Nonzeros of A take 0.000238 GB of storage.
Row idxs of A take 0.000119 GB of storage.
Col ptrs of A take 0.000004 GB of storage.
ScsCone information:
Zero cone rows: 2000
LP cone rows: 2000
Number of second-order cones: 0, covering 0 rows, with sizes
[]
Number of rows covered is 4000 out of 4000.
true pri opt = 2022.070521
true dua opt = 2022.070521
------------------------------------------------------------------
SCS v3.0.0 - Splitting Conic Solver
(c) Brendan O'Donoghue, Stanford University, 2012
------------------------------------------------------------------
problem: variables n: 1000, constraints m: 4000
cones: z: primal zero / dual free vars: 2000
l: linear vars: 2000
settings: eps_abs: 1.0e-04, eps_rel: 1.0e-04, eps_infeas: 1.0e-07
alpha: 1.50, scale: 1.00e-01, adaptive_scale: 1
max_iters: 100000, normalize: 1, warm_start: 0
acceleration_lookback: 10, acceleration_interval: 10
lin-sys: sparse-indirect GPU
nnz(A): 32000, nnz(P): 0
------------------------------------------------------------------
iter | pri res | dua res | gap | obj | scale | time (s)
------------------------------------------------------------------
0| 6.90e+00 9.46e+01 3.33e+04 -1.66e+04 1.00e-01 1.03e-03
250| 1.76e+04 4.31e+01 1.23e+04 -6.15e+03 1.00e-01 1.65e-01
500| 2.74e+04 4.29e+01 1.23e+04 -6.16e+03 1.00e-01 3.29e-01
750| 1.57e+04 4.26e+01 1.23e+04 -6.16e+03 1.00e-01 4.94e-01
1000| 1.64e+04 4.29e+01 1.23e+04 -6.16e+03 1.00e-01 6.85e-01
1250| 4.30e+21 2.67e+22 6.54e+22 -3.27e+22 1.00e-01 8.48e-01
1500| 1.90e+04 4.29e+01 1.23e+04 -6.16e+03 1.00e-01 9.48e-01
1750| 2.14e+04 4.29e+01 1.23e+04 -6.16e+03 1.00e-01 1.04e+00
2000| 2.48e+04 4.29e+01 1.23e+04 -6.16e+03 1.00e-01 1.13e+00
2250| 6.45e+20 2.19e+22 4.21e+22 2.11e+22 1.00e-01 1.22e+00
2500| 2.07e+04 4.29e+01 1.23e+04 -6.16e+03 1.00e-01 1.30e+00
2750| 2.53e+04 4.29e+01 1.23e+04 -6.16e+03 1.00e-01 1.39e+00
3000| 2.02e+04 4.29e+01 1.23e+04 -6.16e+03 1.00e-01 1.48e+00
3250| 5.72e+20 3.01e+22 3.73e+22 1.87e+22 1.00e-01 1.57e+00
3500| 2.09e+04 4.29e+01 1.23e+04 -6.16e+03 1.00e-01 1.66e+00
3750| 2.43e+04 4.29e+01 1.23e+04 -6.16e+03 1.00e-01 1.75e+00
4000| 2.31e+04 4.29e+01 1.23e+04 -6.16e+03 1.00e-01 1.84e+00
[ ... ]
99500| 2.48e+04 4.29e+01 1.23e+04 -6.16e+03 1.00e-01 3.65e+01
99750| 2.48e+04 4.29e+01 1.23e+04 -6.16e+03 1.00e-01 3.67e+01
100000| 2.48e+04 4.29e+01 1.23e+04 -6.16e+03 1.00e-01 3.68e+01
------------------------------------------------------------------
status: solved (inaccurate - reached max_iters)
timings: total: 3.68e+01s = setup: 5.47e-02s + solve: 3.68e+01s
lin-sys: 3.16e+01s, cones: 7.88e-01s, accel: 4.77e-01s
------------------------------------------------------------------
objective = -6159.028853 (inaccurate)
------------------------------------------------------------------
true pri opt = 2022.070521
true dua opt = 2022.070521
scs pri obj= 0.000000
scs dua obj = -12318.057707
About this issue
- State: closed
- Created 3 years ago
- Comments: 30 (27 by maintainers)
Try the following patch. I got all the tests to pass with this fix.
I presume this issue can be closed after #251 is merged
Hmmm, actually this is likely something to do with the GPU solver specifically. There is some issue in there that only trips on some GPUs that I have run into before. It’s probably something to do with type sizes that I have not been able to figure out. I would probably recommend shelving the GPU solver for now, the MKL one is typically faster anyway.
Looks like the tests are passing except for hs21, which is probably just because the numerics are slightly different on the GPU and it’s producing a bad flag.
Thanks for posting. I am unable to reproduce this. When I run the command I get:
It might be the case that you are missing the gpu fixes I submitted here: https://github.com/cvxgrp/scs/commit/13e675d8c1f17e8f1e184281b25b8196c4ac74da.
I did not cut a new release / tag with those fixes. Is that the issue?
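One way to check this locally is to ask git whether the fix commit is an ancestor of the checked-out HEAD. The following is only a sketch: it assumes the SCS repository is cloned with full history, and has_commit is a hypothetical helper name, not something SCS provides.

```shell
# Hypothetical helper (not part of SCS): succeeds if commit $1 is
# contained in the history of HEAD in the repository at $2.
has_commit() {
  # $1 = commit hash to look for
  # $2 = path to the git checkout (defaults to the current directory)
  # `git merge-base --is-ancestor A B` exits 0 when A is an ancestor of B,
  # and non-zero otherwise (including when A is unknown to the repository).
  git -C "${2:-.}" merge-base --is-ancestor "$1" HEAD 2>/dev/null
}
```

For example, running has_commit 13e675d8c1f17e8f1e184281b25b8196c4ac74da /path/to/scs && echo "fix present" || echo "fix not found" would report whether the build was made from a tree that includes the GPU fixes. A "not found" result can also mean the commit has not been fetched, so a git fetch beforehand may be needed.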
By the way, you can better test the gpu using: