[PATCH] Extend prifiling tool to support Vsize and Rss memory usage profile per call

Shuang He shuang.he at intel.com
Wed Jan 23 17:06:02 PST 2013


On 2013/1/24 0:50, José Fonseca wrote:
> Hi,
>
> This looks useful. Thanks.

Thanks for your review and coment :)

>
> But os_memory.hpp and CMakeLists.txt will need to be improved a bit
> further so that at least the code builds and runs gracefully on
> platforms that don't provide this info. Even on my Linux development
> machine, I got a "fatal error: proc/readproc.h: No such file or
> directory" until I installed libproc-dev package.
>
> Is there any other way of getting this info on Linux without adding
> the libproc dependency? That might be easier to accomplish that
> detecting libproc availability.

Yes, that's absolutely need to be resolved. I would fix it to at least 
not fail to build in case of no libproc support.
I have considered to read proc fs to get those info, but it may be a 
little slower.

Thanks
     --Shuang

>
> Jose
>
> On Tue, Jan 22, 2013 at 2:51 AM, Shuang He <shuang.he at intel.com> wrote:
>> Add one new option -pmem for retrace tool to track Visze and Rss usage during every call
>>
>> Signed-off-by: Shuang He <shuang.he at intel.com>
>> ---
>>   common/os_memory.hpp       |   51 ++++++++++++++++++++++++++++++++
>>   common/trace_profiler.cpp  |   69 +++++++++++++++++++++++++++++++++++++++++---
>>   common/trace_profiler.hpp  |   25 ++++++++++++++--
>>   retrace/CMakeLists.txt     |    2 ++
>>   retrace/glretrace_main.cpp |   44 ++++++++++++++++++++++++++--
>>   retrace/retrace.hpp        |    1 +
>>   retrace/retrace_main.cpp   |   13 ++++++++-
>>   7 files changed, 196 insertions(+), 9 deletions(-)
>>   create mode 100644 common/os_memory.hpp
>>
>> diff --git a/common/os_memory.hpp b/common/os_memory.hpp
>> new file mode 100644
>> index 0000000..cca227b
>> --- /dev/null
>> +++ b/common/os_memory.hpp
>> @@ -0,0 +1,51 @@
>> +/**************************************************************************
>> + *
>> + * Copyright 2011 Jose Fonseca
>> + * All Rights Reserved.
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining a copy
>> + * of this software and associated documentation files (the "Software"), to deal
>> + * in the Software without restriction, including without limitation the rights
>> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
>> + * copies of the Software, and to permit persons to whom the Software is
>> + * furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice shall be included in
>> + * all copies or substantial portions of the Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
>> + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
>> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
>> + * THE SOFTWARE.
>> + *
>> + **************************************************************************/
>> +
>> +/*
>> + * Simple OS time measurement abstraction.
>> + */
>> +
>> +#ifndef _OS_MEMORY_HPP_
>> +#define _OS_MEMORY_HPP_
>> +
>> +#include <proc/readproc.h>
>> +
>> +namespace os {
>> +    inline long long
>> +    getVsize(void) {
>> +        proc_t proc;
>> +       look_up_our_self(&proc);
>> +        return proc.vsize;
>> +    }
>> +
>> +    inline long long
>> +    getRss(void) {
>> +        proc_t proc;
>> +       look_up_our_self(&proc);
>> +        return proc.rss;
>> +    }
>> +} /* namespace os */
>> +
>> +#endif /* _OS_MEMORY_HPP_ */
>> diff --git a/common/trace_profiler.cpp b/common/trace_profiler.cpp
>> index 0f90ee2..fa114ce 100644
>> --- a/common/trace_profiler.cpp
>> +++ b/common/trace_profiler.cpp
>> @@ -34,9 +34,12 @@ Profiler::Profiler()
>>       : baseGpuTime(0),
>>         baseCpuTime(0),
>>         minCpuTime(1000),
>> +      baseVsizeUsage(0),
>> +      baseRssUsage(0),
>>         cpuTimes(false),
>>         gpuTimes(true),
>> -      pixelsDrawn(false)
>> +      pixelsDrawn(false),
>> +      memoryUsage(false)
>>   {
>>   }
>>
>> @@ -44,13 +47,14 @@ Profiler::~Profiler()
>>   {
>>   }
>>
>> -void Profiler::setup(bool cpuTimes_, bool gpuTimes_, bool pixelsDrawn_)
>> +void Profiler::setup(bool cpuTimes_, bool gpuTimes_, bool pixelsDrawn_, bool memoryUsage_)
>>   {
>>       cpuTimes = cpuTimes_;
>>       gpuTimes = gpuTimes_;
>>       pixelsDrawn = pixelsDrawn_;
>> +    memoryUsage = memoryUsage_;
>>
>> -    std::cout << "# call no gpu_start gpu_dura cpu_start cpu_dura pixels program name" << std::endl;
>> +    std::cout << "# call no gpu_start gpu_dura cpu_start cpu_dura vsize_start vsize_dura rss_start rss_dura pixels program name" << std::endl;
>>   }
>>
>>   int64_t Profiler::getBaseCpuTime()
>> @@ -63,6 +67,16 @@ int64_t Profiler::getBaseGpuTime()
>>       return baseGpuTime;
>>   }
>>
>> +int64_t Profiler::getBaseVsizeUsage()
>> +{
>> +    return baseVsizeUsage;
>> +}
>> +
>> +int64_t Profiler::getBaseRssUsage()
>> +{
>> +    return baseRssUsage;
>> +}
>> +
>>   void Profiler::setBaseCpuTime(int64_t cpuStart)
>>   {
>>       baseCpuTime = cpuStart;
>> @@ -73,6 +87,16 @@ void Profiler::setBaseGpuTime(int64_t gpuStart)
>>       baseGpuTime = gpuStart;
>>   }
>>
>> +void Profiler::setBaseVsizeUsage(int64_t vsizeStart)
>> +{
>> +    baseVsizeUsage = vsizeStart;
>> +}
>> +
>> +void Profiler::setBaseRssUsage(int64_t rssStart)
>> +{
>> +    baseRssUsage = rssStart;
>> +}
>> +
>>   bool Profiler::hasBaseTimes()
>>   {
>>       return baseCpuTime != 0 || baseGpuTime != 0;
>> @@ -83,7 +107,9 @@ void Profiler::addCall(unsigned no,
>>                          unsigned program,
>>                          int64_t pixels,
>>                          int64_t gpuStart, int64_t gpuDuration,
>> -                       int64_t cpuStart, int64_t cpuDuration)
>> +                       int64_t cpuStart, int64_t cpuDuration,
>> +                      int64_t vsizeStart, int64_t vsizeDuration,
>> +                      int64_t rssStart, int64_t rssDuration)
>>   {
>>       if (gpuTimes && gpuStart) {
>>           gpuStart -= baseGpuTime;
>> @@ -109,12 +135,23 @@ void Profiler::addCall(unsigned no,
>>           pixels = 0;
>>       }
>>
>> +    if (!memoryUsage || !vsizeStart || !rssStart) {
>> +        vsizeStart = 0;
>> +       vsizeDuration = 0;
>> +       rssStart = 0;
>> +       rssDuration = 0;
>> +    }
>> +
>>       std::cout << "call"
>>                 << " " << no
>>                 << " " << gpuStart
>>                 << " " << gpuDuration
>>                 << " " << cpuStart
>>                 << " " << cpuDuration
>> +             << " " << vsizeStart
>> +             << " " << vsizeDuration
>> +             << " " << rssStart
>> +             << " " << rssDuration
>>                 << " " << pixels
>>                 << " " << program
>>                 << " " << name
>> @@ -132,6 +169,8 @@ void Profiler::parseLine(const char* in, Profile* profile)
>>       std::string type;
>>       static int64_t lastGpuTime;
>>       static int64_t lastCpuTime;
>> +    static int64_t lastVsizeUsage;
>> +    static int64_t lastRssUsage;
>>
>>       if (in[0] == '#' || strlen(in) < 4)
>>           return;
>> @@ -139,6 +178,8 @@ void Profiler::parseLine(const char* in, Profile* profile)
>>       if (profile->programs.size() == 0 && profile->calls.size() == 0 && profile->frames.size() == 0) {
>>           lastGpuTime = 0;
>>           lastCpuTime = 0;
>> +       lastVsizeUsage = 0;
>> +       lastRssUsage = 0;
>>       }
>>
>>       line >> type;
>> @@ -151,6 +192,10 @@ void Profiler::parseLine(const char* in, Profile* profile)
>>                >> call.gpuDuration
>>                >> call.cpuStart
>>                >> call.cpuDuration
>> +            >> call.vsizeStart
>> +            >> call.vsizeDuration
>> +            >> call.rssStart
>> +            >> call.rssDuration
>>                >> call.pixels
>>                >> call.program
>>                >> call.name;
>> @@ -163,6 +208,14 @@ void Profiler::parseLine(const char* in, Profile* profile)
>>               lastCpuTime = call.cpuStart + call.cpuDuration;
>>           }
>>
>> +       if (lastVsizeUsage < call.vsizeStart + call.vsizeDuration) {
>> +            lastVsizeUsage = call.vsizeStart + call.vsizeDuration;
>> +       }
>> +
>> +       if (lastRssUsage < call.rssStart + call.rssDuration) {
>> +            lastRssUsage = call.rssStart + call.rssDuration;
>> +       }
>> +
>>           profile->calls.push_back(call);
>>
>>           if (call.pixels >= 0) {
>> @@ -174,6 +227,8 @@ void Profiler::parseLine(const char* in, Profile* profile)
>>               program.cpuTotal += call.cpuDuration;
>>               program.gpuTotal += call.gpuDuration;
>>               program.pixelTotal += call.pixels;
>> +           program.vsizeTotal += call.vsizeDuration;
>> +           program.rssTotal += call.rssDuration;
>>               program.calls.push_back(profile->calls.size() - 1);
>>           }
>>       } else if (type.compare("frame_end") == 0) {
>> @@ -183,15 +238,21 @@ void Profiler::parseLine(const char* in, Profile* profile)
>>           if (frame.no == 0) {
>>               frame.gpuStart = 0;
>>               frame.cpuStart = 0;
>> +           frame.vsizeStart = 0;
>> +           frame.rssStart = 0;
>>               frame.calls.begin = 0;
>>           } else {
>>               frame.gpuStart = profile->frames.back().gpuStart + profile->frames.back().gpuDuration;
>>               frame.cpuStart = profile->frames.back().cpuStart + profile->frames.back().cpuDuration;
>> +           frame.vsizeStart = profile->frames.back().vsizeStart + profile->frames.back().vsizeDuration;
>> +           frame.rssStart = profile->frames.back().rssStart + profile->frames.back().rssDuration;
>>               frame.calls.begin = profile->frames.back().calls.end + 1;
>>           }
>>
>>           frame.gpuDuration = lastGpuTime - frame.gpuStart;
>>           frame.cpuDuration = lastCpuTime - frame.cpuStart;
>> +       frame.vsizeDuration = lastVsizeUsage - frame.vsizeStart;
>> +       frame.rssDuration = lastRssUsage - frame.rssStart;
>>           frame.calls.end = profile->calls.size() - 1;
>>
>>           profile->frames.push_back(frame);
>> diff --git a/common/trace_profiler.hpp b/common/trace_profiler.hpp
>> index d833242..87888b8 100644
>> --- a/common/trace_profiler.hpp
>> +++ b/common/trace_profiler.hpp
>> @@ -45,6 +45,11 @@ struct Profile {
>>           int64_t cpuStart;
>>           int64_t cpuDuration;
>>
>> +       int64_t vsizeStart;
>> +       int64_t vsizeDuration;
>> +       int64_t rssStart;
>> +       int64_t rssDuration;
>> +
>>           int64_t pixels;
>>
>>           std::string name;
>> @@ -59,6 +64,11 @@ struct Profile {
>>           int64_t cpuStart;
>>           int64_t cpuDuration;
>>
>> +       int64_t vsizeStart;
>> +       int64_t vsizeDuration;
>> +       int64_t rssStart;
>> +       int64_t rssDuration;
>> +
>>           /* Indices to profile->calls array */
>>           struct {
>>               unsigned begin;
>> @@ -72,6 +82,8 @@ struct Profile {
>>           uint64_t gpuTotal;
>>           uint64_t cpuTotal;
>>           uint64_t pixelTotal;
>> +       int64_t vsizeTotal;
>> +       int64_t rssTotal;
>>
>>           /* Indices to profile->calls array */
>>           std::vector<unsigned> calls;
>> @@ -88,14 +100,16 @@ public:
>>       Profiler();
>>       ~Profiler();
>>
>> -    void setup(bool cpuTimes_, bool gpuTimes_, bool pixelsDrawn_);
>> +    void setup(bool cpuTimes_, bool gpuTimes_, bool pixelsDrawn_, bool memoryUsage_);
>>
>>       void addCall(unsigned no,
>>                    const char* name,
>>                    unsigned program,
>>                    int64_t pixels,
>>                    int64_t gpuStart, int64_t gpuDuration,
>> -                 int64_t cpuStart, int64_t cpuDuration);
>> +                 int64_t cpuStart, int64_t cpuDuration,
>> +                int64_t vsizeStart, int64_t vsizeDuration,
>> +                int64_t rssStart, int64_t rssDuration);
>>
>>       void addFrameEnd();
>>
>> @@ -103,9 +117,13 @@ public:
>>
>>       void setBaseCpuTime(int64_t cpuStart);
>>       void setBaseGpuTime(int64_t gpuStart);
>> +    void setBaseVsizeUsage(int64_t vsizeStart);
>> +    void setBaseRssUsage(int64_t rssStart);
>>
>>       int64_t getBaseCpuTime();
>>       int64_t getBaseGpuTime();
>> +    int64_t getBaseVsizeUsage();
>> +    int64_t getBaseRssUsage();
>>
>>       static void parseLine(const char* line, Profile* profile);
>>
>> @@ -113,10 +131,13 @@ private:
>>       int64_t baseGpuTime;
>>       int64_t baseCpuTime;
>>       int64_t minCpuTime;
>> +    int64_t baseVsizeUsage;
>> +    int64_t baseRssUsage;
>>
>>       bool cpuTimes;
>>       bool gpuTimes;
>>       bool pixelsDrawn;
>> +    bool memoryUsage;
>>   };
>>   }
>>
>> diff --git a/retrace/CMakeLists.txt b/retrace/CMakeLists.txt
>> index b271727..6c6b636 100644
>> --- a/retrace/CMakeLists.txt
>> +++ b/retrace/CMakeLists.txt
>> @@ -98,6 +98,7 @@ if (WIN32 OR APPLE OR X11_FOUND)
>>               # http://stackoverflow.com/questions/2702628/gdb-cannot-find-new-threads-generic-error
>>               ${CMAKE_THREAD_LIBS_INIT}
>>               dl
>> +            proc
>>           )
>>
>>           if (${CMAKE_SYSTEM_NAME} MATCHES "Linux")
>> @@ -123,6 +124,7 @@ if (ENABLE_EGL AND X11_FOUND AND NOT WIN32 AND NOT APPLE)
>>           ${X11_X11_LIB}
>>           ${CMAKE_THREAD_LIBS_INIT}
>>           dl
>> +        proc
>>       )
>>
>>       if (${CMAKE_SYSTEM_NAME} MATCHES "Linux")
>> diff --git a/retrace/glretrace_main.cpp b/retrace/glretrace_main.cpp
>> index d0298fc..1bbd15f 100755
>> --- a/retrace/glretrace_main.cpp
>> +++ b/retrace/glretrace_main.cpp
>> @@ -31,6 +31,7 @@
>>   #include "glstate.hpp"
>>   #include "glretrace.hpp"
>>   #include "os_time.hpp"
>> +#include "os_memory.hpp"
>>
>>   /* Synchronous debug output may reduce performance however,
>>    * without it the callNo in the callback may be inaccurate
>> @@ -59,6 +60,10 @@ struct CallQuery
>>       const trace::FunctionSig *sig;
>>       int64_t cpuStart;
>>       int64_t cpuEnd;
>> +    int64_t vsizeStart;
>> +    int64_t vsizeEnd;
>> +    int64_t rssStart;
>> +    int64_t rssEnd;
>>   };
>>
>>   static bool supportsElapsed = true;
>> @@ -137,10 +142,20 @@ getTimeFrequency(void) {
>>       }
>>   }
>>
>> +static inline void
>> +getCurrentVsize(int64_t& vsize) {
>> +    vsize = os::getVsize();
>> +}
>> +
>> +static inline void
>> +getCurrentRss(int64_t& rss) {
>> +    rss = os::getRss();
>> +}
>> +
>>   static void
>>   completeCallQuery(CallQuery& query) {
>>       /* Get call start and duration */
>> -    int64_t gpuStart = 0, gpuDuration = 0, cpuDuration = 0, pixels = 0;
>> +    int64_t gpuStart = 0, gpuDuration = 0, cpuDuration = 0, pixels = 0, vsizeDuration = 0, rssDuration = 0;
>>
>>       if (query.isDraw) {
>>           if (retrace::profilingGpuTimes) {
>> @@ -163,10 +178,15 @@ completeCallQuery(CallQuery& query) {
>>           cpuDuration = query.cpuEnd - query.cpuStart;
>>       }
>>
>> +    if (retrace::profilingMemoryUsage) {
>> +        vsizeDuration = query.vsizeEnd - query.vsizeStart;
>> +        rssDuration = query.rssEnd - query.rssStart;
>> +    }
>> +
>>       glDeleteQueries(NUM_QUERIES, query.ids);
>>
>>       /* Add call to profile */
>> -    retrace::profiler.addCall(query.call, query.sig->name, query.program, pixels, gpuStart, gpuDuration, query.cpuStart, cpuDuration);
>> +    retrace::profiler.addCall(query.call, query.sig->name, query.program, pixels, gpuStart, gpuDuration, query.cpuStart, cpuDuration, query.vsizeStart, vsizeDuration, query.rssStart, rssDuration);
>>   }
>>
>>   void
>> @@ -213,6 +233,12 @@ beginProfile(trace::Call &call, bool isDraw) {
>>           CallQuery& query = callQueries.back();
>>           query.cpuStart = getCurrentTime();
>>       }
>> +
>> +    if (retrace::profilingMemoryUsage) {
>> +       CallQuery& query = callQueries.back();
>> +        query.vsizeStart = os::getVsize();
>> +        query.rssStart = os::getRss();
>> +    }
>>   }
>>
>>   void
>> @@ -234,6 +260,12 @@ endProfile(trace::Call &call, bool isDraw) {
>>               glEndQuery(GL_SAMPLES_PASSED);
>>           }
>>       }
>> +
>> +    if (retrace::profilingMemoryUsage) {
>> +        CallQuery& query = callQueries.back();
>> +       query.vsizeEnd = os::getVsize();
>> +       query.rssEnd = os::getRss();
>> +    }
>>   }
>>
>>   void
>> @@ -287,6 +319,14 @@ initContext() {
>>               retrace::profiler.setBaseGpuTime(currentTime);
>>           }
>>       }
>> +
>> +    if (retrace::profilingMemoryUsage) {
>> +        GLint64 currentVsize, currentRss;
>> +       getCurrentVsize(currentVsize);
>> +       retrace::profiler.setBaseVsizeUsage(currentVsize);
>> +       getCurrentRss(currentRss);
>> +       retrace::profiler.setBaseRssUsage(currentRss);
>> +    }
>>   }
>>
>>   void
>> diff --git a/retrace/retrace.hpp b/retrace/retrace.hpp
>> index 7c8dcd5..f1be8c8 100644
>> --- a/retrace/retrace.hpp
>> +++ b/retrace/retrace.hpp
>> @@ -92,6 +92,7 @@ extern bool profiling;
>>   extern bool profilingCpuTimes;
>>   extern bool profilingGpuTimes;
>>   extern bool profilingPixelsDrawn;
>> +extern bool profilingMemoryUsage;
>>
>>   /**
>>    * State dumping.
>> diff --git a/retrace/retrace_main.cpp b/retrace/retrace_main.cpp
>> index ca03745..6e3a2e6 100644
>> --- a/retrace/retrace_main.cpp
>> +++ b/retrace/retrace_main.cpp
>> @@ -72,6 +72,7 @@ bool profiling = false;
>>   bool profilingGpuTimes = false;
>>   bool profilingCpuTimes = false;
>>   bool profilingPixelsDrawn = false;
>> +bool profilingMemoryUsage = false;
>>   bool useCallNos = true;
>>
>>   unsigned frameNo = 0;
>> @@ -514,6 +515,7 @@ usage(const char *argv0) {
>>           "      --pcpu              cpu profiling (cpu times per call)\n"
>>           "      --pgpu              gpu profiling (gpu times per draw call)\n"
>>           "      --ppd               pixels drawn profiling (pixels drawn per draw call)\n"
>> +       "      --pmem              memory usage profiling (vsize rss per call)\n"
>>           "  -c, --compare=PREFIX    compare against snapshots with given PREFIX\n"
>>           "  -C, --calls=CALLSET     calls to compare (default is every frame)\n"
>>           "      --call-nos[=BOOL]   use call numbers in snapshot filenames\n"
>> @@ -536,6 +538,7 @@ enum {
>>       PCPU_OPT,
>>       PGPU_OPT,
>>       PPD_OPT,
>> +    PMEM_OPT,
>>       SB_OPT,
>>   };
>>
>> @@ -556,6 +559,7 @@ longOptions[] = {
>>       {"pcpu", no_argument, 0, PCPU_OPT},
>>       {"pgpu", no_argument, 0, PGPU_OPT},
>>       {"ppd", no_argument, 0, PPD_OPT},
>> +    {"pmem", no_argument, 0, PMEM_OPT},
>>       {"sb", no_argument, 0, SB_OPT},
>>       {"snapshot-prefix", required_argument, 0, 's'},
>>       {"snapshot", required_argument, 0, 'S'},
>> @@ -676,6 +680,13 @@ int main(int argc, char **argv)
>>
>>               retrace::profilingPixelsDrawn = true;
>>               break;
>> +        case PMEM_OPT:
>> +            retrace::debug = false;
>> +           retrace::profiling = true;
>> +           retrace::verbosity = -1;
>> +
>> +           retrace::profilingMemoryUsage = true;
>> +           break;
>>           default:
>>               std::cerr << "error: unknown option " << opt << "\n";
>>               usage(argv[0]);
>> @@ -685,7 +696,7 @@ int main(int argc, char **argv)
>>
>>       retrace::setUp();
>>       if (retrace::profiling) {
>> -        retrace::profiler.setup(retrace::profilingCpuTimes, retrace::profilingGpuTimes, retrace::profilingPixelsDrawn);
>> +        retrace::profiler.setup(retrace::profilingCpuTimes, retrace::profilingGpuTimes, retrace::profilingPixelsDrawn, retrace::profilingMemoryUsage);
>>       }
>>
>>       os::setExceptionCallback(exceptionCallback);
>> --
>> 1.7.10.1
>>
>> _______________________________________________
>> apitrace mailing list
>> apitrace at lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/apitrace



More information about the apitrace mailing list