Performance optimization in av::fixup - use buffered IO, not mapped file
Ryan Johnson
ryan.johnson@cs.utoronto.ca
Wed Dec 12 17:06:00 GMT 2012
On 12/12/2012 11:58 AM, Corinna Vinschen wrote:
> On Dec 12 09:47, Eric Blake wrote:
>> On 12/12/2012 08:39 AM, Ryan Johnson wrote:
>>
>>> Does gcc/ld/whatever know the final file size before the first write?
>> No, but does it need to? posix_fallocate() does not change file
>> contents; it merely says that anywhere there was previously a hole must
>> now be guaranteed to be backed by disk. So gcc would write the file as
>> usual, and then just before close()ing the fd, do a final
>> posix_fallocate(fd, 0, len) with len determined by the final file size.
>>
>>> You have to posix_fallocate the entire file before any write that might
>>> create a hole, because the sparse flag poisons the loader,
>> Is there really a flag stuck into the file when it becomes sparse?
> Yes. And, as I wrote, you can't remove it pre-Vista.
>
>>> cp --sparse=never $(which emacs-nox) dense
>>> for f in sparse dense; do echo $f; time ./$f -Q --batch --eval
>>> '(kill-emacs)'; done
>>> cp --sparse=never dense sparse
>>> for f in sparse dense; do echo $f; time ./$f -Q --batch --eval
>>> '(kill-emacs)'; done
>>> du dense sparse
>> This doesn't point to a flag in the file, so much as cached information
>> (the file system is remembering that 'sparse' used to be sparse, even if
>> it is no longer sparse). But your point about a file being cached at
>> some point while it is sparse, even if it is later made non-sparse, is
>> interesting.
>>
>>> The relevant output is:
>>>> sparse
>>>> real 0m1.791s
>>>>
>>>> dense
>>>> real 0m0.606s
>>>>
>>>> sparse
>>>> real 0m3.158s
>>>>
>>>> dense
>>>> real 0m0.081s
>>>>
>>>> 16728 dense
>>>> 16768 sparse
>>> Given that we're talking about cygwin-specific patches for emacs and
>>> binutils anyway, would it be better to add a cygwin-specific fcntl call
>>> that clears the file's sparse flag?
>> What flag is there to clear? Your cp demonstration showed that even
>> when we do a byte-for-byte copy of every byte (and the file is
>> non-sparse), the file system cache remembers that it used to be sparse.
>> How do we defeat that file system cache?
> Another question is, is that behaviour reproducible? Does it happen the
> second time the "new" non-sparse sparse file is called? You don't even
> know if the slowness is a result of writing the file is still in flight.
> Windows caching can be pretty slow at times, but it recovers quickly
> if a file is used again, usually.
It's painfully reproducible. It takes nearly two hours for a gcc
bootstrap compiler to configure the various bits of the next stage. It's
the same for emacs unexec (as OP reported).
I've seen how slow the cache is: it can take up to a minute before du
reports the actual number of pages in a freshly-copied sparse file. I
thought cp --sparse=always had a bug at first...
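For anyone who wants to check allocation without eyeballing du, here is a rough sketch. It assumes GNU coreutils stat is available, and is_sparse is a made-up helper name, not an existing tool. It treats a file as sparse when the space allocated on disk is smaller than its apparent size; given the lag described above, the answer for a freshly written file may not be trustworthy right away.

```shell
# Sketch only: assumes GNU coreutils `stat`; is_sparse is a hypothetical helper.
# Succeeds (exit 0) when fewer bytes are allocated on disk than the file's
# apparent size, fails (exit 1) otherwise, and returns 2 if stat fails.
is_sparse() {
  # %s = apparent size in bytes, %b = allocated blocks, %B = bytes per block
  info=$(stat -c '%s %b %B' "$1") || return 2
  # Split the three fields into $1 (size), $2 (blocks), $3 (block size)
  set -- $info
  [ $(( $2 * $3 )) -lt "$1" ]
}
```

Usage would be along the lines of: for f in sparse dense; do is_sparse $f && echo "$f looks sparse"; done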
Even after du stabilizes, though, the slow loading persists
indefinitely. It doesn't matter how many times or how recently the
binary was last executed; you'll still pay the full cost to pull it off
disk again. This is easily confirmed with Resource Monitor (it shows the
same file being read by umpteen different processes simultaneously).
$ for i in $(seq 20); do time ./sparse -Q --batch --eval '(kill-emacs)';
done 2>&1 | grep real | awk '{print $2}'
> 0m1.714s
> 0m1.548s
> 0m1.588s
> 0m1.570s
> 0m1.528s
> 0m1.563s
> 0m1.512s
> 0m1.676s
> 0m1.638s
> 0m1.663s
> 0m1.533s
> 0m1.567s
> 0m1.466s
> 0m1.669s
> 0m1.575s
> 0m1.489s
> 0m1.658s
> 0m1.497s
> 0m1.515s
> 0m1.541s
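To reduce a run like this to a single number, something like the following could be appended to the pipeline. It is only a sketch: avg_real is a hypothetical helper name, and it assumes the values arrive one per line in the 0m1.714s form that bash's time prints.

```shell
# Sketch only: avg_real is a hypothetical helper, not part of any package.
# Reads "real" times such as 0m1.714s on stdin (one per line) and prints
# their mean in seconds.
avg_real() {
  awk '{
    split($1, t, "m")      # t[1] = minutes, t[2] = seconds with trailing "s"
    sub(/s$/, "", t[2])    # drop the trailing "s"
    total += t[1] * 60 + t[2]
    n++
  } END { if (n) printf "%.3f\n", total / n }'
}
```

Usage: add "| avg_real" after the awk '{print $2}' stage of the loop above.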
Ryan