User talk:Mgkrupa
From cppreference.com
high_resolution_clock
Hi, your cpp/chrono/high_resolution_clock#Example is larger than the rest of the page it's on, and it's not clear that it shows good practice. While we've all done something like that (in fact, cpp/chrono/c/clock#Example and cpp/thread/sleep_for#Example do exactly that), there are no barriers to prevent code motion around the calls to now() or dead store elimination (in fact, the ideal compiler should return zero for both of your timings!), and it's not trying to make the measurements statistically significant. If you really want to show timing, how about reducing it to a simple difference like in cpp/thread/sleep_for#Example? --Cubbi (talk) 11:06, 17 July 2017 (PDT)
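For readers following the thread, the "simple difference" pattern suggested above can be sketched roughly as follows. This is only an illustration, not the code from either example page: the helper name elapsed_ns and its Clock template parameter are hypothetical, and, as the comment above points out, nothing in it stops the compiler from moving or eliminating work that has no observable effect.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper (not from any cppreference page): times one call to
// `work` as a plain difference between two now() calls.
// Clock defaults to high_resolution_clock; any chrono clock can be substituted.
template <class Clock = std::chrono::high_resolution_clock, class F>
std::chrono::nanoseconds elapsed_ns(F&& work)
{
    const auto start = Clock::now();
    std::forward<F>(work)();            // the code being timed
    const auto stop = Clock::now();
    // stop - start is a duration in the clock's native period; convert it.
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start);
}
```

A call such as elapsed_ns([] { /* work */ }) returns the elapsed time of a single run; a single run says nothing about variance, so it only demonstrates the clock, not a benchmark.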
- My understanding is that compilers can't rearrange calls to functions whose definitions are not available at compile time. Presumably there is a call to such a function somewhere inside now()'s definition, or is now() independent of the OS? As for the variance, you're correct. I had previously divided by only one of the two variables in the denominator, but now that that's no longer the case, only one loop is necessary. "the ideal compiler should return zero for both of your timings!" - then I'm fortunate that I don't live in an ideal world (unfortunately). Mgkrupa (talk) 17:01, 17 July 2017 (PDT)
- Somewhere in now() there is an opaque system call, true, but both the vector assignment and the sorts can be inlined, allocate/deallocate can annihilate each other (clang loves doing that), and then the compiler will see that now() can't possibly access any of the vector elements. To be fair, I checked and neither gcc nor clang was that smart on your example. --Cubbi (talk) 19:29, 17 July 2017 (PDT)
- Thank you for that great insight. I think that the code below should solve all of the problems that you mentioned. Mgkrupa (talk) 01:04, 19 July 2017 (PDT)
#include <algorithm>
#include <chrono>
#include <iostream>
#include <limits>
#include <numeric>
#include <random>
#include <vector>

int main(int argc, char** argv)
{
    std::chrono::nanoseconds total{0};
    typedef unsigned long int_type;
    std::random_device rnd_device;
    std::mt19937_64 generator(rnd_device());
    // Bounds use int_type's own limits: numeric_limits<int>::min() converted to
    // an unsigned type exceeds the upper bound, which violates the
    // distribution's precondition a <= b.
    std::uniform_int_distribution<int_type> dist(std::numeric_limits<int_type>::min(),
                                                 std::numeric_limits<int_type>::max());
    std::vector<int_type> vec_original(1 << 14);
    for (auto &i : vec_original)
        i = dist(generator);
    auto vec = vec_original;
    std::size_t num_reps = 1 << 7;
    auto start_time = std::chrono::high_resolution_clock::now();

    for (auto i = 0ul; i < num_reps; i++) {
        //Each statement gets its own integer to compare argc to so that
        // the compiler can't merge contiguous expressions.
        //Unless you enter a LOT of command line arguments, these statements will
        // always be executed, but the compiler doesn't know that.
        //Also, dependence on vec[0] should prevent reordering.
        if (argc != 0 || vec[0] != 0)
            start_time = std::chrono::high_resolution_clock::now();
        if (argc != -1 || vec[0] != 1)
            std::sort(vec.begin(), vec.end());
        if (argc != -2 || vec[0] != 2)
            total += std::chrono::high_resolution_clock::now() - start_time;

        //This if() prevents the compiler from knowing what value vec[0]
        // and vec_original[0] will have.
        if (argc != -3 || vec[0] != 3)
            vec = vec_original; //Again, this will always be executed.
        else
            vec_original[0] += vec[0]; //The compiler can't rule out a change in
                                       // vec_original[0] (and hence in vec[0]), but we can.
    }

    std::chrono::nanoseconds::rep count = total.count();
    //Make sure that the output of the program depends on vec[0] and total.
    std::cout << "It took an average of " << count / num_reps
              << " nanoseconds to std::sort() the vector "
              << vec[0] << ", ..., " << vec[vec.size() - 1]
              << " of size " << vec.size() << std::endl;
    return 0;
}
- The main problem is that this is an attempt to demonstrate microbenchmarking and not an example of std::chrono::high_resolution_clock: the majority of the code does not do anything with it, and the part that does (the three calls to now()) does not add anything to what's already in cpp/chrono/high_resolution_clock/now#Example. If you're interested in microbenchmark design, take a look at google/benchmark for ideas, in particular ClobberMemory() and DoNotOptimize(). --Cubbi (talk) 07:29, 19 July 2017 (PDT)
- Thanks a lot. BTW, the code above wasn't meant to be placed on cpp/chrono/high_resolution_clock. Mgkrupa (talk) 07:33, 20 July 2017 (PDT)
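As a rough illustration of the kind of barrier discussed in this thread, here is a minimal sketch in the spirit of google/benchmark's DoNotOptimize() and ClobberMemory(). It is not that library's code: the helpers escape(), clobber() and timed_sort() are hypothetical names, the empty inline asm statements are an assumption that only holds for GCC and Clang, and they constrain the compiler only, not the CPU.

```cpp
#include <algorithm>
#include <chrono>
#include <vector>

// Hypothetical helper, GCC/Clang only: the pointer is an input to an empty asm
// statement that claims to touch memory, so the compiler has to assume the
// pointed-to data may be read or modified there.
inline void escape(const void* p)
{
    asm volatile("" : : "g"(p) : "memory");
}

// Hypothetical helper, GCC/Clang only: a compiler-level memory barrier.
// Pending stores to escaped memory must be completed before it and
// memory accesses cannot be moved across it.
inline void clobber()
{
    asm volatile("" : : : "memory");
}

// Times a single std::sort() of `data`, in place.
std::chrono::nanoseconds timed_sort(std::vector<unsigned long>& data)
{
    escape(data.data());                 // the elements are now "observable"
    const auto start = std::chrono::steady_clock::now();
    clobber();                           // keep the sort after the first now()
    std::sort(data.begin(), data.end());
    clobber();                           // keep the sort before the second now()
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start);
}
```

Compared with the argc comparisons in the code above, this keeps the measured region free of extra branches, at the cost of portability. A real microbenchmark would still repeat the measurement many times and report a distribution rather than one number.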