Logs: freenode/#haskell
| 2021-05-12 14:01:31 | <merijn> | No, I'm serious |
| 2021-05-12 14:01:39 | <cheater> | no. you're not. |
| 2021-05-12 14:01:45 | <merijn> | Benchmarking stuff like that is really tricky and there is no quick and simple solution |
| 2021-05-12 14:02:02 | × | ddellac__ quits (~ddellacos@86.106.143.189) (Ping timeout: 265 seconds) |
| 2021-05-12 14:02:07 | <cheater> | so rather than talk about criterion let's go back to what i was originally trying to do |
| 2021-05-12 14:02:08 | <cheater> | which is |
| 2021-05-12 14:02:13 | <merijn> | You'll have to pry apart the accelerate single IO action into something you can splice timings into |
| 2021-05-12 14:02:15 | <cheater> | make 1000 copies of a monadic action run |
| 2021-05-12 14:02:19 | <cheater> | lol, no |
| 2021-05-12 14:02:28 | <cheater> | i'll just run the thing so many times that the upload time is insignificant |
| 2021-05-12 14:03:02 | <merijn> | Yeah, but getting the compiler to actually run trivial loops like that multiple times is not easy |
| 2021-05-12 14:03:02 | <exarkun> | cheater: Why don't you just subtract the upload time |
| 2021-05-12 14:03:06 | <cheater> | or i can run it 1000, 2000, and 3000 x, look at the wall times, and curve fit the cost of just the computation. |
| 2021-05-12 14:03:16 | <cheater> | exarkun: the upload time cannot be measured separately from the computation. |
| 2021-05-12 14:03:20 | <exarkun> | cheater: Sure it can |
| 2021-05-12 14:03:26 | <exarkun> | You control the computation right? |
| 2021-05-12 14:03:36 | <cheater> | what's your idea? |
| 2021-05-12 14:03:57 | <exarkun> | Upload a basically-free computation. Maybe repeat it a lot of times to get the distribution. |
| 2021-05-12 14:04:00 | <cheater> | just change the computation? |
| 2021-05-12 14:04:09 | <merijn> | cheater: by "hosed" I meant "you will have to do a lot of manual effort to ensure the compiler doesn't defeat your logic and actually runs it 1000 times" |
| 2021-05-12 14:04:16 | <exarkun> | Now you know what an upload costs, right? |
| 2021-05-12 14:04:19 | <cheater> | a single computation will get hidden in the noise |
| 2021-05-12 14:04:25 | <merijn> | exarkun: Not really |
| 2021-05-12 14:04:27 | <exarkun> | Where does the noise come from? |
| 2021-05-12 14:04:33 | <merijn> | Data transfer to GPU is rather noisy |
| 2021-05-12 14:04:45 | <merijn> | cheater: nvidia gpu? |
| 2021-05-12 14:04:48 | <cheater> | yes |
| 2021-05-12 14:04:50 | <cheater> | it's a T4 |
| 2021-05-12 14:05:00 | <merijn> | cheater: Here's an entirely different idea |
| 2021-05-12 14:05:06 | <cheater> | i'm comparing against the cpu backend (accelerate allows both) |
| 2021-05-12 14:05:08 | <merijn> | cheater: Run your code inside nvprof? |
| 2021-05-12 14:05:21 | <merijn> | cheater: nvprof can time actual kernel invocations |
| 2021-05-12 14:05:23 | <cheater> | never heard of that. what would that do? |
| 2021-05-12 14:05:33 | <merijn> | cheater: It's nvidia's profiling tool |
| 2021-05-12 14:05:44 | <cheater> | ok, and what does it tell me? |
| 2021-05-12 14:05:45 | <exarkun> | People can extract the signal that comes from the difference between memcmp looking at 4 bytes instead of 8 bytes in JavaScript over the public internet |
| 2021-05-12 14:05:52 | <cheater> | not sure how to run it, let me look |
| 2021-05-12 14:05:53 | <exarkun> | Does transfer to a GPU have more noise than that? |
| 2021-05-12 14:06:09 | <cheater> | exarkun: that's an interesting idea, but i'll try other approaches first. |
| 2021-05-12 14:06:18 | <merijn> | cheater: Well, I mostly work with handwritten CUDA, but presumably accelerate just generates CUDA/openCL kernels |
| 2021-05-12 14:06:31 | <merijn> | cheater: nvprof tracks kernel launches (and finishing) in the driver |
| 2021-05-12 14:06:40 | <merijn> | cheater: So you bypass the host code entirely in your measurements |
| 2021-05-12 14:06:44 | <merijn> | cheater: https://docs.nvidia.com/cuda/profiler-users-guide/index.html |
| 2021-05-12 14:06:48 | <cheater> | merijn: what it does is it invokes llvm at runtime to compile its program into cuda, and then runs that. |
| 2021-05-12 14:06:58 | <cheater> | i'm not sure if nvprof will catch that out |
| 2021-05-12 14:07:05 | <merijn> | cheater: Sure it will |
| 2021-05-12 14:07:08 | <cheater> | ok |
| 2021-05-12 14:07:15 | <cheater> | let me try that then |
| 2021-05-12 14:07:18 | <merijn> | cheater: It can't bypass the nvidia driver in the kernel |
| 2021-05-12 14:07:26 | <cheater> | *shrug* yeah |
| 2021-05-12 14:07:32 | <merijn> | cheater: That's the only way to talk to the GPU and nvprof interacts with the driver |
| 2021-05-12 14:07:52 | → | larsan1 joins (~larsan@37.120.211.188) |
| 2021-05-12 14:08:05 | <merijn> | cheater: Can't hurt to try anyway, at worst you spend 10 minutes and it fails, at best it saves you hours of coding :p |
| 2021-05-12 14:08:06 | <cheater> | so what, just nvprof my-binary ? |
| 2021-05-12 14:08:34 | <cheater> | hmm yeah that's nice |
| 2021-05-12 14:08:41 | <cheater> | that actually did a thing |
| 2021-05-12 14:08:49 | <merijn> | \o/ |
| 2021-05-12 14:08:52 | <cheater> | however |
| 2021-05-12 14:08:58 | <cheater> | i'm not sure how to measure the cpu then |
| 2021-05-12 14:09:05 | <cheater> | because the same tool does not exist for the cpu backend. |
| 2021-05-12 14:09:13 | <cheater> | so i'm back to square one on that. |
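
For the CPU backend, cheater's earlier idea (run the action 1000, 2000, and 3000 times, then curve fit the wall times) can be sketched as a straight-line fit. This is only a rough sketch, assuming the total wall time is roughly a fixed cost (compilation, upload) plus n times a constant per-run cost. The names `timeRepeated`, `fitLine`, and `demo` are hypothetical and not part of accelerate or criterion. It uses only `base`, and it does nothing to address merijn's warning that the compiler may optimise repeated runs away.

```haskell
import Control.Monad (replicateM_)
import GHC.Clock (getMonotonicTime)

-- Wall-clock seconds taken to run the action n times in a row.
-- Assumption: the action does real work on every run; nothing here
-- stops the compiler from sharing or eliminating trivial work.
timeRepeated :: Int -> IO a -> IO Double
timeRepeated n act = do
  start <- getMonotonicTime
  replicateM_ n act
  end <- getMonotonicTime
  pure (end - start)

-- Ordinary least-squares fit of t = overhead + perRun * n.
-- Input: (n, t) pairs. Output: (overhead, perRun).
-- Needs at least two distinct n values, otherwise it divides by zero.
fitLine :: [(Double, Double)] -> (Double, Double)
fitLine pts = (intercept, slope)
  where
    k         = fromIntegral (length pts)
    meanX     = sum (map fst pts) / k
    meanY     = sum (map snd pts) / k
    sxy       = sum [ (x - meanX) * (y - meanY) | (x, y) <- pts ]
    sxx       = sum [ (x - meanX) ^ (2 :: Int) | (x, _) <- pts ]
    slope     = sxy / sxx
    intercept = meanY - slope * meanX

-- Time the action at three repetition counts and print the fitted costs.
demo :: IO a -> IO ()
demo act = do
  samples <- mapM (\n -> (,) (fromIntegral n) <$> timeRepeated n act)
                  [1000, 2000, 3000]
  let (overhead, perRun) = fitLine samples
  putStrLn ("fixed overhead (s): " ++ show overhead)
  putStrLn ("cost per run (s):   " ++ show perRun)
```

Loaded in GHCi, `demo someAction` prints the two fitted numbers. The slope is the estimate of the per-run computation cost, and the intercept absorbs whatever is paid once. Three points give a very noisy fit, so more repetition counts and repeated samples would likely be needed in practice.
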
| 2021-05-12 14:10:38 | → | nineonine joins (~nineonine@50.216.62.2) |
| 2021-05-12 14:12:59 | → | kidbuu joins (~Thunderbi@116.40.185.87) |
| 2021-05-12 14:13:00 | <cheater> | i wish accelerate had a LiftIO of some sort |
| 2021-05-12 14:13:03 | <cheater> | but it doesn't seem to |
| 2021-05-12 14:13:57 | × | nineonin_ quits (~nineonine@2604:3d08:777e:900:e4fe:87c8:c43b:fc90) (Ping timeout: 250 seconds) |
| 2021-05-12 14:18:19 | → | isovector joins (~isovector@172.103.216.166.cable.tpia.cipherkey.com) |
| 2021-05-12 14:19:43 | × | LKoen_ quits (~LKoen@5.166.9.109.rev.sfr.net) (Quit: “It’s only logical. First you learn to talk, then you learn to think. Too bad it’s not the other way round.”) |
| 2021-05-12 14:21:19 | <cheater> | ah, crap. the thing i thought was a monad isn't a monad. |
| 2021-05-12 14:21:30 | <cheater> | https://hackage.haskell.org/package/accelerate-1.3.0.0/docs/Data-Array-Accelerate.html#t:Acc |
| 2021-05-12 14:21:48 | → | waleee-cl joins (uid373333@gateway/web/irccloud.com/x-tqrlgrprcnuqmtaa) |
| 2021-05-12 14:24:22 | → | dvdp73 joins (59736826@38.104.115.89.rev.vodafone.pt) |
| 2021-05-12 14:25:07 | × | kritzefitz quits (~kritzefit@2003:5b:203b:200::10:49) (Remote host closed the connection) |
| 2021-05-12 14:30:47 | × | nerdypepper quits (znc@152.67.162.71) (Ping timeout: 245 seconds) |
| 2021-05-12 14:31:04 | → | cr3 joins (~cr3@192-222-143-195.qc.cable.ebox.net) |
| 2021-05-12 14:31:12 | → | Shuppiluliuma joins (~shuppilul@153.33.68.161) |
| 2021-05-12 14:34:47 | → | wroathe joins (~wroathe@c-68-54-25-135.hsd1.mn.comcast.net) |
| 2021-05-12 14:37:47 | → | ddellac__ joins (ddellacost@gateway/vpn/mullvad/ddellacosta) |
| 2021-05-12 14:38:41 | × | nut quits (~gtk@roc37-h01-176-170-197-243.dsl.sta.abo.bbox.fr) (Ping timeout: 240 seconds) |
| 2021-05-12 14:40:11 | × | stree quits (~stree@68.36.8.116) (Ping timeout: 240 seconds) |
| 2021-05-12 14:41:01 | × | frozenErebus quits (~frozenEre@37.231.244.249) (Ping timeout: 260 seconds) |
| 2021-05-12 14:41:22 | × | Shuppiluliuma quits (~shuppilul@153.33.68.161) (Ping timeout: 252 seconds) |
| 2021-05-12 14:42:06 | × | ddellac__ quits (ddellacost@gateway/vpn/mullvad/ddellacosta) (Ping timeout: 240 seconds) |
| 2021-05-12 14:44:15 | → | frozenErebus joins (~frozenEre@37.231.244.249) |
| 2021-05-12 14:44:41 | → | tromp joins (~tromp@dhcp-077-249-230-040.chello.nl) |
| 2021-05-12 14:45:59 | → | nerdypepper joins (znc@152.67.162.71) |
| 2021-05-12 14:48:11 | × | merijn quits (~merijn@83-160-49-249.ip.xs4all.nl) (Ping timeout: 240 seconds) |
| 2021-05-12 14:48:13 | <mupf> | merijn: yes, three years to the day. |
| 2021-05-12 14:49:37 | × | tromp quits (~tromp@dhcp-077-249-230-040.chello.nl) (Ping timeout: 252 seconds) |
| 2021-05-12 14:49:53 | × | jgt_ quits (~jgt@92-247-237-116.spectrumnet.bg) (Ping timeout: 260 seconds) |
| 2021-05-12 14:50:26 | → | Franciman joins (~francesco@host-79-13-131-112.retail.telecomitalia.it) |
| 2021-05-12 14:50:34 | <Franciman> | Hi, is there any arm build for cabal? |
| 2021-05-12 14:51:53 | → | auri_ joins (~admin@fsf/memeber/auri-) |
| 2021-05-12 14:52:11 | <siraben> | Can someone clarify what the difference is between deep/shallow embedding of DSLs? |
| 2021-05-12 14:52:14 | <siraben> | and which kind is tagless-final? |
| 2021-05-12 14:52:38 | × | isovector quits (~isovector@172.103.216.166.cable.tpia.cipherkey.com) (Ping timeout: 246 seconds) |
| 2021-05-12 14:52:42 | × | frozenErebus quits (~frozenEre@37.231.244.249) (Ping timeout: 252 seconds) |
| 2021-05-12 14:53:22 | → | stree joins (~stree@68.36.8.116) |
| 2021-05-12 14:54:08 | <dminuoso> | https://www.cs.ox.ac.uk/people/jeremy.gibbons/publications/embedding-short.pdf |
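
As a rough illustration of siraben's question (not taken from the linked paper): in a deep embedding, terms are values of a syntax-tree data type and each interpretation is a separate function over that tree. In a shallow embedding, a term is directly its meaning in the host language. Tagless-final is often described as a shallow embedding that is abstracted over its interpretation through a type class. The toy language and all names below (`Expr`, `ExprSym`, `Pretty`, and so on) are made up for this sketch.

```haskell
-- Deep embedding: programs are data; interpreters are written separately.
data Expr = Lit Int | Add Expr Expr

evalDeep :: Expr -> Int
evalDeep (Lit n)   = n
evalDeep (Add a b) = evalDeep a + evalDeep b

prettyDeep :: Expr -> String
prettyDeep (Lit n)   = show n
prettyDeep (Add a b) = "(" ++ prettyDeep a ++ " + " ++ prettyDeep b ++ ")"

-- Shallow embedding: a term is its meaning (here, the resulting Int).
-- Only this one interpretation is available.
type ExprS = Int

litS :: Int -> ExprS
litS = id

addS :: ExprS -> ExprS -> ExprS
addS = (+)

-- Tagless-final: the operations live in a class, and each instance
-- supplies one interpretation of the same term.
class ExprSym repr where
  lit :: Int -> repr
  add :: repr -> repr -> repr

-- Interpretation 1: evaluate to an Int.
instance ExprSym Int where
  lit = id
  add = (+)

-- Interpretation 2: render as a String.
newtype Pretty = Pretty { runPretty :: String }

instance ExprSym Pretty where
  lit n   = Pretty (show n)
  add a b = Pretty ("(" ++ runPretty a ++ " + " ++ runPretty b ++ ")")

-- One term, usable at any interpretation.
example :: ExprSym repr => repr
example = add (lit 1) (add (lit 2) (lit 3))
```

Here `(example :: Int)` evaluates the term, while `runPretty example` renders the same term as text. The deep version gets the same flexibility by writing more functions over `Expr`.
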
| 2021-05-12 14:55:10 | <cheater> | tomsmeding: hey are you around? :) |
All times are in UTC.