Problem with the ext4 file system

Author: B@F, 03 December 2013, 09:11


B@F

Good morning.

Here is the setup. An HP server running Debian 7, acting as a Xen 4.1 Domain-0. The guest is also a Xen domain running the same Debian 7.

The guest has a problem with its ext4 file system. It runs fine for a while, and then suddenly this appears:
EXT4-fs error (device xvda2): htree_dirblock_to_tree:587: inode #528458: block 2106047: comm tar: bad entry in directory: rec_len is smaller than minimal - offset=0(0), inode=0, rec_len=0, name_len=0
[4798396.423227] Aborting journal on device xvda2-8.
[4798396.423543] EXT4-fs (xvda2): Remounting filesystem read-only
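
For anyone triaging logs like the one above, here is a minimal sketch of pulling the device, the kernel function, the inode and the block number out of an EXT4-fs error line. The regular expression is only an assumption based on this single sample line; other ext4 messages may be laid out differently. The names `EXT4_ERROR_RE` and `parse_ext4_error` are made up for this illustration.

```python
import re

# Pattern modelled on the one error line quoted in this thread (an assumption,
# not a complete grammar of ext4 kernel messages).
EXT4_ERROR_RE = re.compile(
    r"EXT4-fs error \(device (?P<device>[^)]+)\): "
    r"(?P<function>\w+):(?P<line>\d+): "
    r"inode #(?P<inode>\d+): "
    r"(?:block (?P<block>\d+): )?"
    r"comm (?P<comm>.+?): "
    r"(?P<message>.*)"
)


def parse_ext4_error(line):
    """Return a dict of fields from an EXT4-fs error line, or None if it does not match."""
    match = EXT4_ERROR_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the numeric fields; "block" is optional and may be None.
    for key in ("line", "inode", "block"):
        if fields[key] is not None:
            fields[key] = int(fields[key])
    return fields


sample = (
    "EXT4-fs error (device xvda2): htree_dirblock_to_tree:587: inode #528458: "
    "block 2106047: comm tar: bad entry in directory: rec_len is smaller than "
    "minimal - offset=0(0), inode=0, rec_len=0, name_len=0"
)
info = parse_ext4_error(sample)
print(info["device"], info["function"], info["inode"], info["block"], info["comm"])
```

The inode number extracted this way is the one to feed to an inode search on the affected file system.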


Here is the guest's fstab file:
# /etc/fstab: static file system information.
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
proc            /proc           proc    defaults        0       0
devpts          /dev/pts        devpts  rw,noexec,nosuid,gid=5,mode=620 0  0
/dev/xvda1 none swap sw 0 0
/dev/xvda2 / ext4 noatime,nodiratime,errors=remount-ro,usrjquota=quota.user,grpjquota=quota.group,jqfmt=vfsv0,barrier=0 0 1
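
To make the root entry above easier to read, here is a small sketch that splits one fstab line into its fields and turns the option list into a dictionary, so that settings such as `barrier=0` and `errors=remount-ro` can be checked programmatically. The function name `parse_fstab_entry` is hypothetical, and the parser is deliberately simplified (it does not handle escaped spaces, for example).

```python
def parse_fstab_entry(line):
    """Parse one fstab line into a dict, or return None for comments and blank lines."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    fields = stripped.split()
    if len(fields) < 4:
        return None
    # The dump and pass fields are optional and default to 0.
    fields += ["0"] * (6 - len(fields))
    device, mount_point, fs_type, raw_options, dump, fsck_pass = fields[:6]
    options = {}
    for item in raw_options.split(","):
        key, sep, value = item.partition("=")
        # Flags without "=" (such as noatime) are stored as True.
        options[key] = value if sep else True
    return {
        "device": device,
        "mount_point": mount_point,
        "type": fs_type,
        "options": options,
        "dump": int(dump),
        "pass": int(fsck_pass),
    }


root_line = (
    "/dev/xvda2 / ext4 noatime,nodiratime,errors=remount-ro,"
    "usrjquota=quota.user,grpjquota=quota.group,jqfmt=vfsv0,barrier=0 0 1"
)
entry = parse_fstab_entry(root_line)
print(entry["options"]["barrier"], entry["options"]["errors"], entry["pass"])
```

In this entry `barrier=0` turns write barriers off, `errors=remount-ro` is what triggers the read-only remount seen in the log, and the final `1` puts the root file system in the first fsck pass at boot.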


Originally barrier=0 was not set. I read that there is a bug in Xen that is worked around by disabling barriers, so I turned them off. About 3 months passed and the problem seemed solved. Then suddenly the same failure again.

When errors are detected, the file system is remounted read-only. I then notice that everything is up but nothing actually works, reboot the guest, and get:
[   13.548767] EXT4-fs (xvda2): re-mounted. Opts: (null)
[....] Checking root file system...fsck from util-linux 2.20.1
/dev/xvda2 contains a file system with errors, check forced.
Deleted inode 655364 has zero dtime.  FIXED.                                               
/dev/xvda2: Inodes that were part of a corrupted orphan linked list found. 

/dev/xvda2: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
        (i.e., without -a or -p options)
fsck died with exit status 4
failed (code 4).
[....] An automatic file system check (fsck) of the root filesystem failed. A manual fsck must be performed, then the system restarted. The fsck should be performed in maintenance mode with the root filesystem mounted in read-only mode. ... failed!
[....] The root filesystem is currently mounted in read-only mode. A maintenance shell will now be started. After performing system maintenance, press CONTROL-D to terminate the maintenance shell and restart the system. ... (warning).
Give root password for maintenance
(or type Control-D to continue):


After the reboot the automatic file system check should have run to completion, but as the output above shows, it aborted and asked for a manual run. I run fsck manually and reboot as the system demands. After that everything works, but for how long is unclear.
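
The "exit status 4" in the boot output is a bit mask, as documented in the fsck(8) manual page. A minimal sketch of decoding it follows; the names `FSCK_EXIT_FLAGS` and `decode_fsck_status` are invented for this example.

```python
# Exit status bits as listed in fsck(8); the status is the sum of the bits set.
FSCK_EXIT_FLAGS = {
    1: "filesystem errors corrected",
    2: "system should be rebooted",
    4: "filesystem errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "checking canceled by user request",
    128: "shared-library error",
}


def decode_fsck_status(status):
    """Return the list of conditions encoded in an fsck exit status."""
    if status == 0:
        return ["no errors"]
    return [text for bit, text in sorted(FSCK_EXIT_FLAGS.items()) if status & bit]


print(decode_fsck_status(4))
```

Status 4 means errors were found and left uncorrected, which matches the request to run fsck manually without -a or -p.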

Does anyone have any thoughts on this? If the root cause cannot be fixed, is there a way to automate the recovery?

Correct me if I'm wrong; I'll only be glad.
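
One possible first step toward automation is simply noticing the read-only remount early, instead of discovering that "everything is up but nothing works". The sketch below scans text in /proc/mounts format for ext4 file systems mounted with the `ro` option. The function name `read_only_mounts` and the sample text are assumptions made for illustration; what to do once the condition is detected (send an alert, schedule a reboot) is left open.

```python
def read_only_mounts(mounts_text, fs_types=("ext4",)):
    """Return mount points of the given file system types that are mounted read-only."""
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        device, mount_point, fs_type, options = fields[:4]
        # Compare whole options, so "errors=remount-ro" is not mistaken for "ro".
        if fs_type in fs_types and "ro" in options.split(","):
            result.append(mount_point)
    return result


# Hypothetical sample in /proc/mounts format; on a live system one would pass
# the contents of /proc/mounts instead.
sample_mounts = (
    "rootfs / rootfs rw 0 0\n"
    "/dev/xvda2 / ext4 ro,noatime,nodiratime,errors=remount-ro 0 0\n"
    "proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0\n"
)
print(read_only_mounts(sample_mounts))
```

Run periodically (for instance from cron), a check like this could report the problem within minutes of the journal abort.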

hedgeven

Don't these recommendations help? http://www.novell.com/support/kb/doc.php?id=3554036
Quote: This error is caused by a file that has been marked as a directory. This is a non-fatal error and can be fixed by removing the file in question.

  • Mount the file-system in question

  • Locate the file that has been corrupted. The file's inode is the number after "bad entry in directory" Using the example error code the file would be found by typing
        find /MOUNT_POINT -inum 5556142

  • Delete the file identified in step two

  • Umount the file-system

  • Check the disk, and check for errors.
        fsck /dev/PHYSICAL_DEVICE

  • Repeat step 5. If no errors, the file-system is clean.
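
The "locate the file by inode" step from the quoted procedure can also be sketched in Python. This is only an illustration of what `find -xdev -inum` does, not part of the Novell article; `find_by_inode` is a made-up name, and the demo runs on a throwaway temporary directory rather than a real mount point.

```python
import os
import stat
import tempfile


def find_by_inode(root, inode):
    """Return paths under root with the given inode number, staying on root's file system."""
    root_stat = os.lstat(root)
    root_dev = root_stat.st_dev
    matches = [root] if root_stat.st_ino == inode else []
    for dirpath, dirnames, filenames in os.walk(root):
        same_fs_dirs = []
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # entry vanished during the walk, as happens under /proc
            if st.st_dev != root_dev:
                continue  # other file system (e.g. /proc); inode numbers are per device
            if st.st_ino == inode:
                matches.append(path)
            if stat.S_ISDIR(st.st_mode):
                same_fs_dirs.append(name)
        # Descend only into directories on the same file system.
        dirnames[:] = same_fs_dirs
    return matches


# Demo: create a file, read its inode number, and find it again.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "example.txt")
    with open(target, "w") as handle:
        handle.write("demo")
    found = find_by_inode(tmp, os.lstat(target).st_ino)
    demo_ok = found == [target]
print(demo_ok)
```

Restricting the search to one device matters because inode numbers are only unique within a single file system.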

Jah will give us everything...

B@F

03 December 2013, 13:01 #2 Last edit: 03 December 2013, 13:18 by B@F
I think these recommendations are slightly off target, since my error comes from htree_dirblock_to_tree while the article talks about "bad entry in directory". Nevertheless, I did what it says:
find / -inum 528458       
find: `/proc/4686/task/4686/fd/5': No such file or directory
find: `/proc/4686/task/4686/fdinfo/5': No such file or directory
find: `/proc/4686/fd/5': No such file or directory
find: `/proc/4686/fdinfo/5': No such file or directory
... /web/administrator

I downloaded this folder and then deleted it. Next I switched to runlevel 1, remounted the file system read-only, and ran fsck. No errors were found. Then I restored everything.
I don't think this will help.  :huh:


After the restore, find / -inum 528458 points to a file located inside the folder found by the first search:
.../web/administrator/components/com_modules/models/select.php

At first I thought I had solved the problem with http://www.prolinux.org/node/149, but no such luck.

hedgeven

It is not quite clear what "restored everything" means.
Do I understand correctly that the error went away after the file was deleted?

B@F

Quote from: hedgeven, 03 December 2013, 13:58
It is not quite clear what "restored everything" means.
Do I understand correctly that the error went away after the file was deleted?

I deleted the folder, checked the file system in ro mode, and restored the folder. There was no error as such by then, because the server had already been brought back up in the morning. Now I just need to make sure this does not happen again.

B@F

Here is what I managed to dig up:
[spoiler]WARNING:  Kernel Errors Present
   EXT4-fs (xvda2): re-mounted. Opts: errors=remount-ro,usrj ...:  2 Time(s)
   Error: Driver 'pcspkr' ...:  2 Time(s)
   rtc_cmos: probe of rtc_cmos failed with error -38 ...:  2 Time(s)

2 Time(s): .data : 0xc12c93c0 - 0xc141a100   (1347 kB)
2 Time(s): .init : 0xc141b000 - 0xc1483000   ( 416 kB)
2 Time(s): .text : 0xc1000000 - 0xc12c93c0   (2852 kB)
2 Time(s): /build/linux-n2St39/linux-3.2.51/drivers/rtc/hctosys.c: unable to open rtc device (rtc0)
2 Time(s): 0000000000 - 002d1fe000 page 4k
2 Time(s): 0: 0x00000010 -> 0x000000a0
2 Time(s): 0: 0x00000100 -> 0x0002e000
2 Time(s): 14MB HIGHMEM available.
2 Time(s): 721MB LOWMEM available.
2 Time(s): ACPI in unprivileged domain disabled
2 Time(s): ACPI: Interpreter disabled.
2 Time(s): APIC: disable apic facility
2 Time(s): APIC: switched to apic NOOP
2 Time(s): Adding 1048572k swap on /dev/xvda1.  Priority:-1 extents:1 across:1048572k SS
2 Time(s): Allocating PCI resources starting at 2e000000 (gap: 2e000000:d2000000)
2 Time(s): AppArmor: AppArmor disabled by boot time parameter
2 Time(s): BIOS-provided physical RAM map:
2 Time(s): Base memory trampoline at [c009c000] 9c000 size 16384
2 Time(s): Block layer SCSI generic (bsg) driver version 0.4 loaded (major 253)
2 Time(s): Booting paravirtualized kernel on Xen
2 Time(s): Brought up 2 CPUs
2 Time(s): Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 186831
2 Time(s): CPU 0 irqstacks, hard=ec408000 soft=ec40a000
2 Time(s): CPU 1 irqstacks, hard=ec478000 soft=ec47a000
2 Time(s): CPU: Physical Processor ID: 0
2 Time(s): CPU: Processor Core ID: 0
1 Time(s): Calibrating delay loop (skipped), value calculated using timer frequency.. 6400.57 BogoMIPS (lpj=12801144)
1 Time(s): Calibrating delay loop (skipped), value calculated using timer frequency.. 6400.62 BogoMIPS (lpj=12801240)
2 Time(s): Console: colour dummy device 80x25
2 Time(s): DMA      0x00000010 -> 0x00001000
2 Time(s): DMA zone: 0 pages reserved
2 Time(s): DMA zone: 32 pages used for memmap
2 Time(s): DMA zone: 3952 pages, LIFO batch:0
2 Time(s): DMI not present or invalid.
2 Time(s): Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
1 Time(s): Detected 3200.286 MHz processor.
1 Time(s): Detected 3200.310 MHz processor.
2 Time(s): Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
2 Time(s): EXT4-fs (xvda2): mounted filesystem with ordered data mode. Opts: (null)
2 Time(s): EXT4-fs (xvda2): re-mounted. Opts: (null)
2 Time(s): FS-Cache: Loaded
2 Time(s): FS-Cache: Netfs 'nfs' registered for caching
2 Time(s): Freeing initrd memory: 23776k freed
2 Time(s): Freeing unused kernel memory: 416k freed
2 Time(s): Grant table initialized
2 Time(s): Hierarchical RCU implementation.
2 Time(s): HighMem  0x0002d1fe -> 0x0002e000
2 Time(s): HighMem zone: 29 pages used for memmap
2 Time(s): HighMem zone: 3557 pages, LIFO batch:0
2 Time(s): HugeTLB registered 2 MB page size, pre-allocated 0 pages
2 Time(s): IP route cache hash table entries: 32768 (order: 5, 131072 bytes)
2 Time(s): Initialising Xen virtual ethernet driver.
2 Time(s): Initializing CPU#0
2 Time(s): Initializing CPU#1
2 Time(s): Initializing HighMem for node 0 (0002d1fe:0002e000)
2 Time(s): Initializing cgroup subsys blkio
2 Time(s): Initializing cgroup subsys cpu
2 Time(s): Initializing cgroup subsys cpuacct
2 Time(s): Initializing cgroup subsys cpuset
2 Time(s): Initializing cgroup subsys devices
2 Time(s): Initializing cgroup subsys freezer
2 Time(s): Initializing cgroup subsys memory
2 Time(s): Initializing cgroup subsys net_cls
2 Time(s): Initializing cgroup subsys perf_event
2 Time(s): Initializing network drop monitor service
2 Time(s): Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
2 Time(s): Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
2 Time(s): Kernel command line: root=/dev/xvda2 ro root=/dev/xvda2 ro
2 Time(s): Linux agpgart interface v0.103
2 Time(s): Linux version 3.2.0-4-686-pae (debian-kernel@lists.debian.org) (gcc version 4.6.3 (Debian 4.6.3-14) ) #1 SMP Debian 3.2.51-1
2 Time(s): Local APIC disabled by BIOS -- you can enable it with "lapic"
2 Time(s): Memory: 707008k/753664k available (2852k kernel code, 38016k reserved, 1347k data, 416k init, 6152k highmem)
2 Time(s): Mobile IPv6
2 Time(s): Mount-cache hash table entries: 512
2 Time(s): Movable zone start PFN for each node
2 Time(s): NET: Registered protocol family 1
2 Time(s): NET: Registered protocol family 10
2 Time(s): NET: Registered protocol family 16
2 Time(s): NET: Registered protocol family 17
2 Time(s): NET: Registered protocol family 2
2 Time(s): NMI watchdog disabled (cpu0): hardware events not enabled
2 Time(s): NMI watchdog disabled (cpu1): hardware events not enabled
2 Time(s): NR_IRQS:2304 nr_irqs:288 16
2 Time(s): NX (Execute Disable) protection: active
2 Time(s): NX-protecting the kernel data: 3288k
2 Time(s): Normal   0x00001000 -> 0x0002d1fe
2 Time(s): Normal zone: 1412 pages used for memmap
2 Time(s): Normal zone: 179322 pages, LIFO batch:31
2 Time(s): On node 0 totalpages: 188304
2 Time(s): PCI: CLS 0 bytes, default 64
4 Time(s): PCI: System does not support PCI
2 Time(s): PCI: max bus depth: 0 pci_try_num: 1
2 Time(s): PCI: pci_cache_line_size set to 64 bytes
2 Time(s): PCI: setting up Xen PCI frontend stub
2 Time(s): PERCPU: Embedded 14 pages/cpu @ecc1b000 s33280 r0 d24064 u57344
2 Time(s): PID hash table entries: 4096 (order: 2, 16384 bytes)
2 Time(s): PM: Hibernation image not present or could not be loaded.
2 Time(s): PM: Registered nosave memory: 00000000000a0000 - 0000000000100000
2 Time(s): Performance Events: unsupported Netburst CPU model 4 no PMU driver, software events only.
2 Time(s): PnPBIOS: Disabled
2 Time(s): RAMDISK: 01785000 - 02ebd000
2 Time(s): RCU dyntick-idle grace-period acceleration is enabled.
2 Time(s): RPC: Registered named UNIX socket transport module.
2 Time(s): RPC: Registered tcp NFSv4.1 backchannel transport module.
2 Time(s): RPC: Registered tcp transport module.
2 Time(s): RPC: Registered udp transport module.
2 Time(s): Registering the dns_resolver key type
2 Time(s): Released 0 pages of unused memory
2 Time(s): Reserving virtual address space above 0xf5800000
2 Time(s): SFI: Simple Firmware Interface v0.81 http://simplefirmware.org
2 Time(s): SMP alternatives: switching to SMP code
2 Time(s): SMP alternatives: switching to UP code
2 Time(s): SMP: Allowing 2 CPUs, 0 hotplug CPUs
2 Time(s): Security Framework initialized
2 Time(s): Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
2 Time(s): Set 0 page(s) to 1-1 mapping
2 Time(s): Setting capacity to 125829120
1 Time(s): Setting capacity to 2097152
2 Time(s): Switching to clocksource xen
2 Time(s): TCP bind hash table entries: 65536 (order: 7, 524288 bytes)
2 Time(s): TCP cubic registered
2 Time(s): TCP established hash table entries: 131072 (order: 8, 1048576 bytes)
2 Time(s): TCP reno registered
2 Time(s): TCP: Hash tables configured (established 131072 bind 65536)
2 Time(s): UDP hash table entries: 512 (order: 2, 16384 bytes)
2 Time(s): UDP-Lite hash table entries: 512 (order: 2, 16384 bytes)
2 Time(s): Unpacking initramfs...
2 Time(s): Using APIC driver default
2 Time(s): Using IPI No-Shortcut mode
2 Time(s): VFS: Disk quotas dquot_6.5.2
2 Time(s): Write protecting the kernel read-only data: 1084k
2 Time(s): Write protecting the kernel text: 2856k
2 Time(s): XENBUS: Device with no driver: device/console/0
2 Time(s): XENBUS: Device with no driver: device/vbd/51713
2 Time(s): XENBUS: Device with no driver: device/vbd/51714
2 Time(s): XENBUS: Device with no driver: device/vif/0
2 Time(s): Xen version: 4.1.4 (preserve-AD)
2 Time(s): Xen: 0000000000000000 - 00000000000a0000 (usable)
2 Time(s): Xen: 00000000000a0000 - 0000000000100000 (reserved)
2 Time(s): Xen: 0000000000100000 - 000000002e000000 (usable)
2 Time(s): Xen: using vcpuop timer interface
2 Time(s): Zone PFN ranges:
2 Time(s): acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
2 Time(s): alg: No test for stdrng (krng)
2 Time(s): audit: initializing netlink socket (disabled)
2 Time(s): bio: create slab <bio-0> at 0
2 Time(s): blkfront: xvda1: flush diskcache: enabled
2 Time(s): blkfront: xvda2: flush diskcache: enabled
2 Time(s): console [hvc0] enabled
2 Time(s): console [tty0] enabled
2 Time(s): devtmpfs: initialized
2 Time(s): e820 remove range: 00000000000a0000 - 0000000000100000 (usable)
2 Time(s): e820 update range: 0000000000000000 - 0000000000010000 (usable) ==> (reserved)
2 Time(s): early_node_map[2] active PFN ranges
2 Time(s): eth0: no IPv6 routers present
2 Time(s): fixmap  : 0xf5536000 - 0xf57ff000   (2852 kB)
2 Time(s): free_area_init_node: node 0, pgdat c1412740, node_mem_map ecc3d200
2 Time(s): highmem bounce pool size: 64 pages
1 Time(s): hrtimer: interrupt took 180321360 ns
2 Time(s): i8042: No controller found
2 Time(s): i8042: PNP: No PS/2 controller found. Probing ports directly.
2 Time(s): imklog 5.8.11, log source = /proc/kmsg started.
2 Time(s): init_memory_mapping: 0000000000000000-000000002d1fe000
2 Time(s): initial memory mapped : 0 - 03bff000
2 Time(s): input: PC Speaker as /devices/platform/pcspkr/input/input0
2 Time(s): installing Xen timer for CPU 0
2 Time(s): installing Xen timer for CPU 1
2 Time(s): io scheduler cfq registered (default)
2 Time(s): io scheduler deadline registered
2 Time(s): io scheduler noop registered
1 Time(s): ip6_tables: (C) 2000-2006 Netfilter Core Team
2 Time(s): ip_tables: (C) 2000-2006 Netfilter Core Team
2 Time(s): isapnp: No Plug & Play device found
2 Time(s): isapnp: Scanning for PnP cards...
2 Time(s): kernel direct mapping tables up to 2d1fe000 @ 3a92000-3bff000
2 Time(s): last_pfn = 0x2e000 max_arch_pfn = 0x1000000
2 Time(s): low ram: 0 - 2d1fe000
2 Time(s): lowmem  : 0xc0000000 - 0xed1fe000   ( 721 MB)
2 Time(s): mapped low ram: 0 - 2d1fe000
2 Time(s): mousedev: PS/2 mouse device common for all mice
2 Time(s): msgmni has been set to 1415
2 Time(s): nf_conntrack version 0.5.0 (11425 buckets, 45700 max)
2 Time(s): nr_irqs_gsi: 16
2 Time(s): pci_hotplug: PCI Hot Plug PCI Core version: 0.5
2 Time(s): pciehp: PCI Express Hot Plug Controller Driver version: 0.4
2 Time(s): pcpu-alloc: [0] 0 [0] 1
2 Time(s): pcpu-alloc: s33280 r0 d24064 u57344 alloc=14*4096
2 Time(s): pid_max: default: 32768 minimum: 301
2 Time(s): pkmap   : 0xf5200000 - 0xf5400000   (2048 kB)
2 Time(s): platform rtc_cmos: registered platform RTC device (no PNP device found)
2 Time(s): pnp: PnP ACPI: disabled
2 Time(s): print_constraints: dummy:
2 Time(s): registered taskstats version 1
2 Time(s): rtc_cmos rtc_cmos: rtc core: registered rtc_cmos as rtc0
2 Time(s): setup_percpu: NR_CPUS:32 nr_cpumask_bits:32 nr_cpu_ids:2 nr_node_ids:1
2 Time(s): vgaarb: loaded
2 Time(s): virtual kernel memory layout:
2 Time(s): vmalloc : 0xed9fe000 - 0xf51fe000   ( 120 MB)
2 Time(s): xen-balloon: Initialising balloon driver.
2 Time(s): xen/balloon: Initialising balloon driver.
2 Time(s): xen: setting RW the range 3bdf000 - 3bff000
1 Time(s): xvda1: detected capacity change from 0 to 1073741824
2 Time(s): xvda2: detected capacity change from 0 to 64424509440[/spoiler]
Can anyone make something of this?

B@F

It looks like I have beaten the problem.
I disabled barriers on Dom0 and removed the noatime,nodiratime options from the guest's fstab.
Apparently the host system could not keep up with writing the guest's data to disk.
uptime
12:38:44 up 31 days, 23:29,  1 user,  load average: 0,03, 0,02, 0,05

That is not proof, but it is at least some kind of result.