This is the mail archive of the cygwin-developers mailing list for the Cygwin project.



Re: Performance optimization in av::fixup - use buffered IO, not mapped file


On 12/12/2012 11:58 AM, Corinna Vinschen wrote:
On Dec 12 09:47, Eric Blake wrote:
On 12/12/2012 08:39 AM, Ryan Johnson wrote:

Does gcc/ld/whatever know the final file size before the first write?
No, but does it need to?  posix_fallocate() does not change file
contents; it merely says that anywhere there was previously a hole must
now be guaranteed to be backed by disk.  So gcc would write the file as
usual, and then just before close()ing the fd, do a final
posix_fallocate(fd, 0, len) with len determined by the final file size.
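
For illustration, the sequence suggested above (write the file as usual, then call posix_fallocate() over the whole length just before close()) might look like the minimal C sketch below. The helper name densify_before_close is hypothetical and does not come from gcc, binutils, or Cygwin; only fstat() and posix_fallocate() are real POSIX calls. The sketch shows the API usage only. Whether this is enough to avoid the slow loading is exactly what the rest of the thread questions.

```c
#define _POSIX_C_SOURCE 200112L

#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (an assumption for illustration, not part of any
 * real toolchain): ask the file system to back every byte of fd, from
 * offset 0 to the current end of file, with allocated storage.
 * The file contents and size are left unchanged.
 * Returns 0 on success or an errno-style error number on failure. */
int densify_before_close(int fd)
{
    struct stat st;

    if (fstat(fd, &st) != 0)
        return errno;
    if (st.st_size == 0)
        return 0;   /* posix_fallocate() rejects len == 0 with EINVAL */

    /* posix_fallocate() returns the error number directly
     * and does not set errno. */
    return posix_fallocate(fd, 0, st.st_size);
}
```

A writer would call densify_before_close(fd) after its last write() and before close(fd), treating any nonzero return value as an error number.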

You have to posix_fallocate the entire file before any write that might
create a hole, because the sparse flag poisons the loader,
Is there really a flag stuck into the file when it becomes sparse?
Yes. And, as I wrote, you can't remove it pre-Vista.

cp --sparse=never $(which emacs-nox) dense
for f in sparse dense; do echo $f; time ./$f -Q --batch --eval '(kill-emacs)'; done
cp --sparse=never dense sparse
for f in sparse dense; do echo $f; time ./$f -Q --batch --eval '(kill-emacs)'; done
du dense sparse
This doesn't point to a flag in the file, so much as cached information
(the file system is remembering that 'sparse' used to be sparse, even if
it is no longer sparse).  But your point about a file being cached at
some point while it is sparse, even if it is later made non-sparse, is
interesting.

The relevant output is:
sparse
real    0m1.791s

dense
real    0m0.606s

sparse
real    0m3.158s

dense
real    0m0.081s

16728   dense
16768   sparse
Given that we're talking about cygwin-specific patches for emacs and
binutils anyway, would it be better to add a cygwin-specific fcntl call
that clears the file's sparse flag?
What flag is there to clear?  Your cp demonstration showed that even
when we do a byte-for-byte copy of every byte (and the file is
non-sparse), the file system cache remembers that it used to be sparse.
How do we defeat that file system cache?
Another question is, is that behaviour reproducible?  Does it happen the
second time the "new" non-sparse sparse file is called?  You don't even
know whether the slowness is because the file's writes are still in flight.
Windows caching can be pretty slow at times, but it recovers quickly
if a file is used again, usually.
It's painfully reproducible. It takes nearly two hours for a gcc bootstrap compiler to configure the various bits of the next stage. It's the same for emacs unexec (as OP reported).

I've seen how slow the cache is: it can take up to a minute before du reports the actual number of pages in a freshly-copied sparse file. At first I thought cp --sparse=always had a bug...

Even after du stabilizes, though, the slow loading persists indefinitely. No matter how many times or how recently the binary was last executed, you still pay the full cost to pull it off disk again. This is easily confirmed with Resource Monitor, which shows the same file being read by umpteen different processes simultaneously.

$ for i in $(seq 20); do time ./sparse -Q --batch --eval '(kill-emacs)'; done 2>&1 | grep real | awk '{print $2}'
0m1.714s
0m1.548s
0m1.588s
0m1.570s
0m1.528s
0m1.563s
0m1.512s
0m1.676s
0m1.638s
0m1.663s
0m1.533s
0m1.567s
0m1.466s
0m1.669s
0m1.575s
0m1.489s
0m1.658s
0m1.497s
0m1.515s
0m1.541s

Ryan




