mischief [Thu, 26 Jun 2014 05:06:29 +0000 (22:06 -0700)]
upas/fs: disable imap mail fetch pipeline due to race
pipeline = 1 with a dovecot imap server causes FETCH and OK responses
get interleaved so some message bodies accidentally get merged together.
disabling it will make fetching mail over imap slower, but it works.
cinap_lenrek [Sun, 22 Jun 2014 13:12:45 +0000 (15:12 +0200)]
kernel: new pagecache, remove Lock from page, use cmpswap for Ref instead of Lock
make the Page stucture less than half its original size by getting rid of
the Lock and the lru.
The Lock was required to coordinate the unchaining of pages that where
both cached and on the lru freelist.
now pages have a single next pointer that is used for palloc.head
freelist xor for page cache hash chains in Image.pghash[].
cached pages are not on the freelist anymore, but will be reclaimed
from images by the pager when the freelist runs out of pages.
each Image has its own 512 hash chains for cached page lookup. That is
2MB worth of pages and there should be no collisions for most text images.
page reclaiming can be done without holding palloc.lock as the Image is
the owner of the page hash chains protected by the Image's lock.
reclaiming Image structures can be done quickly by only reclaiming pages from
inactive images, that is images which are not currently in use by segments.
the Ref structure has no Lock anymore. Only a single long that is atomically
incremented or decremnted using cmpswap().
there are various other changes as a consequence code. and lots of pikeshedding,
sorry.
cinap_lenrek [Wed, 11 Jun 2014 16:01:20 +0000 (18:01 +0200)]
revert ramfs example
the code was correct. erealloc9p() terminates the process
on error, but the code was handling realloc() error explicitely
and responded the request with Enomem error.
mischief [Mon, 9 Jun 2014 07:22:11 +0000 (00:22 -0700)]
pc: clip rectangles before sending them to the hardware in flushmemscreen
the vmware svga video card emulated by qemu (qemu -vga vmware) complains and eventually causes a panic if the rectangles aren't clipped.
messages like the following can be observed from qemu before the kernel panics:
vmsvga_update_rect: update h was < 0 (-20000)
vmsvga_update_rect: update height too large y: 10000, h: 0
vmsvga_update_rect: update w was < 0 (-20000)
vmsvga_update_rect: update width too large x: 10000, w: 0
i could only reproduce this in qemu 2.0.50 on the master branch, when using the ui and had selected 'Zoom To Fit' from the View menu.
cinap_lenrek [Sun, 8 Jun 2014 15:39:40 +0000 (17:39 +0200)]
swap: make sure swap address sticks arround until page is written to swap
we have to make sure the *swap address* doesnt go away,
after putting the swap address in the segment pte.
after we unlock the segment, the process could be
killed or fault which would cause the swap address to
be freed *before* we write the page to disk when it
pulls the page from the cache and putswap() swap pte.
keeping a reference to the page is no good. we have
to hold on the swap address. this also has the advantage
that we can now test if the swap address is still
referenced and can avoid writing to disk.
cinap_lenrek [Thu, 5 Jun 2014 19:54:32 +0000 (21:54 +0200)]
kernel: dont use atomic increment for Proc.nlocks, maintain Lock.m for lock(), use uintptr intstead of long for pc values
change Proc.nlocks from Ref to int and just use normal increment and decrelemt
as done in erik quanstros 9atom.
It is not clear why we used atomic increment in the fist place as even if we
get preempted by interrupt and scheduled before we write back the incremented
value, it shouldnt be a problem and we'll just continue where we left off as
our process is the only one that can write to it.
Yoann Padioleau found that the Mach pointer Lock.m wasnt maintained
consistently for lock() vs canlock() and ilock(). Fixed.
Use uintptr instead of ulong for maxlockpc, maxilockpc and ilockpc debug variables.
cinap_lenrek [Wed, 4 Jun 2014 15:45:08 +0000 (17:45 +0200)]
webfs: explicitely unmount old /mnt/web (thanks BurnZeZ)
webfs forks the namespace to isolate itself from its mount
point which has the side effect that it captures the mount
of previous instances of webfs mounted on /mnt/web.
explicitely unmount the mountpoint in our namespace copy
to drop the reference.
cinap_lenrek [Tue, 3 Jun 2014 05:21:48 +0000 (07:21 +0200)]
nusb/usbd: serialize /dev/usbevent processing
when there are multiple readers of /dev/usbevent, we have to
serialize the processing to make sure that only one driver
is opening the devices control endpoint at a time.
to do this, we assume the device is busy after reading the
event file until the next read or clunk on the same fid.
to mark a device busy, we set the dev->aux pointer to the
fid processing a event. And the Event structure takes a
reference to the device producing the event.
the problem arised from cdc ethernet and nusb/serial sharing
the same device class, and we need to run the particular driver
to figure out if the device can be used. doing this concurrently
fails because devusb allows only one open per endpoint.
cinap_lenrek [Sun, 1 Jun 2014 01:13:58 +0000 (03:13 +0200)]
pc64: allocate palloc.pages from upages
the palloc.pages array takes arround 5% of the upages which
gives us:
16GB = ~0.8GB
32GB = ~1.6GB
64GB = ~3.2GB
we only have 2GB of address space above KZERO so this will not
work for long.
instead, pageinit() was altered to accept a preallocated memory
in palloc.pages. and preallocpages() in pc64/main.c allocates the
in upages memory, mapping it in the VMAP area (which has 512GB).
the drawback is that we cannot poke at Page structures now from
/proc/n/mem as the VMAP area is not accessible from it.
Aram Hăvărneanu [Fri, 30 May 2014 10:28:01 +0000 (12:28 +0200)]
6a, 6c, 6l: fix copy propagation
Without an explicit signal for a truncation, copy propagation will
sometimes propagate a 32-bit truncation and end up overwriting uses of
the original 64-bit value.
This was independently discovered and fixed in Go. See:
http://golang.org/issue/1315
https://codereview.appspot.com/6002043/
cinap_lenrek [Thu, 29 May 2014 16:50:52 +0000 (18:50 +0200)]
pc, pc64: simplify reboot code
as we do system reset and reboot only from boot processor cpu0 now,
theres no need for active.rebooting conditional variable.
mpshutdown() will unconditionally park application processors and
and cpu0 boots the new kernel or calls mpshutdown() causing system
reset.
cinap_lenrek [Thu, 29 May 2014 16:24:40 +0000 (18:24 +0200)]
pc: initiate machine reset only from boot processors in mpshutdown()
in vmware, mpshutdown() used to hang in i8042reset() when not
called from the boot processor, so instead of reseting from first
cpu that acquires the shutdown lock, we park all application
processors and let the boot processor do the reset.
cinap_lenrek [Sun, 25 May 2014 22:27:06 +0000 (00:27 +0200)]
devproc: handle 64bit address writes to /proc/n/mem files
procwrite() did truncate the offset to 32bit ulong.
introduce off2addr() function that does the sign
extension hack and use it conststently for Qmem
reads and writes.
cinap_lenrek [Sat, 24 May 2014 00:09:52 +0000 (02:09 +0200)]
cpu: remove duplicate environment and chdir($home) code (thanks qrstuv)
newns() (called by auth_chuid()) already prepares the
environment variables and puts us in a sane working
directory (as specified by the namespace file).
cinap_lenrek [Fri, 23 May 2014 23:27:57 +0000 (01:27 +0200)]
kernel: fix read size calculation in pio() demand load
on amd64, the text segment is aligned and padded to
2MB, but segment granularity is 4K which can give
us page faults that are beyond the highest file
offset. this is perfectly valid, but was not handled
correctly in pio().
cinap_lenrek [Fri, 23 May 2014 16:56:20 +0000 (18:56 +0200)]
libc: avoid static table and supurious reads in nsec()
use two per process memory slots, one for the
pid and one for the fd instead of a global table
avoiding the case when the table gets full.
instead of calling pread() on the cached fd
(dangerous as it has side effects when the
fd was not closed), we check if the cached fd
is still good using fd2path() when called
the first time in this process.
cinap_lenrek [Tue, 20 May 2014 05:05:53 +0000 (07:05 +0200)]
libc: revert nsec() change, bring back filedescriptor caching
theres big performance regression with this using
cwfs. cwfs calls time() to update atime on every
read/write which now causes walks on /dev.
reverting to the previous version for now. in the
long run, we'll use new _nsec() syscall but this
has to wait for a later release once new kernels
are established.