A 64MB machine (like the minimum one specced for Debian) would have an issue with this, but we don't need to move the minimum up to 2GB to fix that problem like one poster suggested. 128MB would be quite sufficient. After all, it's not like you can cache much of anything while installing. Trying to do so would just waste memory--most I/O is 'read once' or 'write and forget'. If the kernel and the install program take up more than 64MB, then someone's doing it wrong. (And the current value of 64MB for a minimum Debian system would already be broken.)
I used to run Debian on a 64MB Geode system with a 300MHz processor. I ran out of CPU before I ran out of memory. Tasks that we now take for granted (like logging in with ssh using public-key crypto, or running 'aptitude') took quite a while.
Does xz do multithreaded decompression yet? Last I saw, it was "coming sometime".
I've always found Ubuntu kind of slow with package installs, even on multicore CPUs.
Although the time to update package lists is unrelated to the decompression, I wish that would be improved too.
The install process for a package (or group of packages) is written so that it can safely fail at any point in time. To achieve this, each file operation (e.g. putting a new file onto the filesystem, then renaming the new file over the old one) is completed and then followed by an fsync (or two). This ultimately adds up to several fsyncs per file. (Run strace on dpkg to see this going on.)
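The per-file pattern looks roughly like this -- a minimal Python sketch of the write/rename/fsync dance, not dpkg's actual code (function and suffix names here are made up):

```python
import os

def replace_file_durably(path, data):
    # Illustrative sketch of the crash-safe replacement pattern
    # described above -- not dpkg's actual implementation.
    tmp = path + ".new"
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())      # fsync #1: new contents hit the media
    os.rename(tmp, path)          # atomic: readers see old or new, never half
    # fsync #2: sync the directory so the rename itself is durable
    dirfd = os.open(os.path.dirname(path) or ".", os.O_RDONLY)
    try:
        os.fsync(dirfd)
    finally:
        os.close(dirfd)
```

A crash at any point leaves either the old file or the new file fully intact, which is the whole point -- and also why you pay at least two fsyncs per file.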
fsyncs are the slowest possible operation on a filesystem, since they only return once the data is on the media. This involves the kernel flushing outstanding data (and on some filesystems, like ext3, it will often flush all pending data for the filesystem, not just the one file), issuing a barrier, and then waiting for the media to confirm the writes are on media. A 7200 rpm drive spins at 120 revolutions per second, so under perfect circumstances you would get 120 fsyncs per second, but in practice there will be time lost waiting for the rotation, and possibly more than one write per fsync. SSDs are also slow at this, since they like to buffer writes up and commit big blocks at a time; tiny little synchronous writes can be a lot slower.
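You can measure this yourself. A rough sketch (all names are mine, and the result depends entirely on your storage and filesystem):

```python
import os
import tempfile
import time

def fsyncs_per_second(duration=1.0):
    # Rough measurement of how many tiny write+fsync cycles the
    # underlying storage completes per second. On a 7200 rpm disk
    # expect something in the low hundreds at best; a write-cached
    # SSD (or a lying disk cache) can report far more.
    with tempfile.NamedTemporaryFile() as f:
        count = 0
        deadline = time.monotonic() + duration
        while time.monotonic() < deadline:
            f.write(b"x")
            f.flush()
            os.fsync(f.fileno())
            count += 1
    return count / duration
```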
You can disable fsync by using 'eatmydata', which stubs it out - e.g. 'sudo eatmydata apt-get install ....' - and see the actual install speed ignoring the media. I do this for Ubuntu distro upgrades and it makes them orders of magnitude faster.
The better solution to this sort of thing is filesystem transactions, where a group of operations either all succeeds or all fails/rolls back. You start a transaction, do all the operations, and commit at the end. It should be noted that NTFS now has this functionality, and that btrfs can sort of fake it.
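Without real transaction support you can approximate the commit semantics with plain POSIX renames. A hypothetical sketch (names are mine) -- prepare everything in a staging directory, then a single atomic rename is the commit point:

```python
import os
import shutil

def commit_tree(staging_dir, target_dir):
    # Fake an all-or-nothing "transaction": every file was prepared in
    # staging_dir beforehand; the rename onto target_dir is the commit.
    # Note a crash between the two renames still leaves target_dir
    # missing -- exactly the gap that real filesystem transactions
    # would close. This is a sketch, not production code.
    backup = target_dir + ".old"
    os.rename(target_dir, backup)           # move the old tree aside
    try:
        os.rename(staging_dir, target_dir)  # commit: one atomic rename
    except OSError:
        os.rename(backup, target_dir)       # roll back
        raise
    shutil.rmtree(backup)                   # discard old tree post-commit
```

The win is that you only need durability barriers around the commit point, not one or two fsyncs per file.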
Someone already mentioned lzham in the previous thread; similar compression ratio to lzma but much faster decompression: https://code.google.com/p/lzham/
- In case a package contains something that can do serious harm to your computer, it is possible for the admins to remove it or remove its read permissions to prevent as much additional harm as possible from happening, until a new, fixed package is available. Setting it as eternally cacheable would take away that option.
- Another possible reason for a changed file is that an earlier copy contained wrong data or was incomplete as a result of e.g. I/O errors, and gets overwritten with a correct copy later.
BTW: if you run your own caching server, you can always override those caching headers if you prefer.
Removing read or similar permissions won't make much difference. Files will already be cached based on how long ago they were modified, and it will have no effect on machines that have already installed the package. The correct fix is to release a new package version.

Remember that it is only the caching of the packages (dpkg files) we are talking about - the catalog of packages is not cached, and it is what gets updated to point at the new package version. The same applies if the package was borked. Note that the signature check will fail if there was a problem after build time. Ubuntu's PPA servers do not let you overwrite an existing uploaded package, and I'd assume the main archive works the same way.

So my original premise stands: package issues are addressed by a version bump, not by overwriting, and unless they set the packages to be never cacheable, the approaches you mentioned will have little effect.
As for the last point, I don't see why it makes more sense for every cache administrator to go in and add extra configuration, rather than for the source to set the headers correctly in the first place.