Recent (4/2011) updates Randell Jesup (see bugzilla for contact info)
jprof-mapfiles. By collecting a large number of these call stacks, it is possible to deduce where mozilla is spending its time.
Configure your mozilla with jprof support by adding
--enable-jprof to your configure options (eg adding
ac_add_options --enable-jprof to your
making sure that you do not have the
--enable-strip configure option set -- jprof needs symbols to
operate. On many architectures with GCC, you'll need to add
--enable-optimize="-O3 -fno-omit-frame-pointer" or the
equivalent to ensure frame pointer generation in the compiler you're using.
Finally, build mozilla with your new configuration. Now you can run jprof.
jprof [-v] [-t] [-e exclude] [-i include] [-s stackdepth] [--last] [--all] [--start n [--end m]] [--output-dir dir] prog log [log2 ...]Options:
kill -PROF, or with
kill -ALRMif JP_REALTIME is used, or with
kill -POLL(also known as
kill -IO) if JP_RTC_HZ is used).
freqis the frequency at which the timer should fire, measured in Hz. It must be a power of 2. The maximal frequency allowed by the kernel can be changed by writing to
/proc/sys/dev/rtc/max-user-freq; the maximum value it can be set to is 8192. Note that
/dev/rtcwill need to be readable by the Firefox process; making that file world-readable is a simple way to accomplish that.
SIGUSR2 will cause the circular buffer to be cleared.
A build with jprof enabled adds four functions to the Window object:
JProfStopProfiling(): When used with JP_DEFER, these
allow one to start and stop the timer just around whatever critical section is
These clear the circular buffer and save the buffer (without stopping), respectively.
setenv JPROF_FLAGS "JP_START JP_FIRST=3 JP_PERIOD=0.025"
setenv JPROF_FLAGS "JP_DEFER JP_PERIOD=0.001"
setenv JPROF_FLAGS "JP_DEFER JP_PERIOD=0.010 JP_REALTIME"
setenv JPROF_FLAGS "JP_DEFER JP_RTC_HZ=8192"
setenv JPROF_FLAGS "JP_DEFER JP_CIRCULAR=1048576 JP_PERIOD=0.001"
jprof can be paused at any time by sending a SIGUSR1 to mozilla (
-USR1). This will cause the timer signals to stop and jprof-map to be
written, but it will not close jprof-log. Combining SIGUSR1 with the JP_DEFER
option allows profiling of one sequence of actions by starting the timer right
before starting the actions and stopping the timer right afterward.
After a SIGUSR1, sending another timer signal (SIGPROF, SIGALRM, or SIGPOLL (aka SIGIO), depending on the mode) can be used to continue writing data to the same output.
SIGUSR2 will cause the circular buffer to be cleared, if it's in use. This is useful right before running a test when you're using a large, continuous circular buffer, or programmatically at the start of an action which might take too long (JProfClearCircular()).
jprof-mapfiles, we can use the jprof executable is used to turn them into readable output. To do this jprof needs the name of the mozilla binary and the log file. It deduces the name of the map file:
./jprof /home/user/mozilla/objdir/dist/bin/firefox ./jprof-log > tmp.htmlThis will generate the file
tmp.htmlwhich you should view in a web browser.
./jprof --output-dir=/tmp /home/user/mozilla/objdir/dist/bin/firefox ./jprof-log*This will generate a set of files in /tmp for each process.
This shows that of the 151603 times the timer fired, 1777 (1.2% of the total) were inside ComputedStyle::GetStyleData() and 1552 (1.0% of the total) were in the nsCOMPtr_base destructor.Total hit count: 151603 Count %Total Function Name 8806 5.8 __libc_poll 2254 1.5 __i686.get_pc_thunk.bx 2053 1.4 _int_malloc 1777 1.2 ComputedStyle::GetStyleData(nsStyleStructID) 1600 1.1 __libc_malloc 1552 1.0 nsCOMPtr_base::~nsCOMPtr_base()
In general, the functions with the highest count are the functions which are taking the most time.
The function names are linked to the entry for that function in the hierarchical profile, which is described in the next section.
The information this block tells us is:index Count Hits Function Name 545 (46.4%) nsBlockFrame::ReflowInlineFrames(nsBlockReflowState&, nsLineList_iterator, int*) 100 (8.5%) nsBlockFrame::ReflowDirtyLines(nsBlockReflowState&) 72870 4 (0.3%) 645 (54.9%) nsBlockFrame::DoReflowInlineFrames(nsBlockReflowState&, nsLineLayout&, nsLineList_iterator, nsFlowAreaRect&, int&, nsFloatManager::SavedState*, int*, LineReflowStatus*, int) 545 (46.4%) nsBlockFrame::ReflowInlineFrame(nsBlockReflowState&, nsLineLayout&, nsLineList_iterator, nsIFrame*, LineReflowStatus*) 83 (7.1%) nsBlockFrame::PlaceLine(nsBlockReflowState&, nsLineLayout&, nsLineList_iterator, nsFloatManager::SavedState*, nsRect&, int&, int*) 9 (0.8%) nsLineLayout::BeginLineReflow(int, int, int, int, int, int) 1 (0.1%) nsTextFrame::GetType() const 1 (0.1%) nsLineLayout::RelativePositionFrames(nsOverflowAreas&) 1 (0.1%) __i686.get_pc_thunk.bx 1 (0.1%) PL_ArenaAllocate
nsBlockFrame::DoReflowInlineFrames. Of these:
This block corresponds to the function
nsBlockFrame::DoReflowInlineFrames, which is
therefore bolded and not a link. The name of this function is preceded by
five numbers which have the following meaning. The number on the left (72870)
is the index number, and is not important. The next number (4) and the
percentage following (0.3%) are the number
of times this function was interrupted by the timer and the percentage of
the total hits that is. The last number pair ("645 (54.9%)")
are the number of times this function was in the call stack when the timer went
off. That is, the timer went off while we were in code that was ultimately
For our example we can see that our function was in the call stack for 645 interrupt ticks, but we were only the function that was running when the interrupt arrived 4 times.
The functions listed above the line for
nsBlockFrame::DoReflowInlineFrames are its
callers. The numbers to the left of these function names are the numbers of
times these functions were in the call stack as callers of
nsBlockFrame::DoReflowInlineFrames. In our example, we were called 545 times by
nsBlockFrame::ReflowInlineFrames and 100 times by
The functions listed below the line for
nsBlockFrame::DoReflowInlineFrames are its
callees. The numbers to the left of the function names are the numbers of
times these functions were in the callstack as callees of
nsBlockFrame::DoReflowInlineFrames and the corresponding percentages. In our example, of the 645 profiler hits under
nsBlockFrame::DoReflowInlineFrames 545 were under
nsBlockFrame::ReflowInlineFrame, 83 were under
nsBlockFrame::PlaceLine, and so forth.
NOTE: If there are loops of execution or recursion, the numbers will not add up and percentages can exceed 100%. If a function directly calls itself "(self)" will be appended to the line, but indirect recursion will not be marked.
If you get an error:
Inconsistency detected by ld.so: dl-open.c: 260: dl_open_worker: Assertion `_dl_debug_initialize (0, args->nsid)->r_state == RT_CONSISTENT' failed!
that means you've hit a timing hole in the version of glibc you're running. See Redhat bug 4578.