Dear Members,
I run Kubuntu 22.04(.3) on a Lenovo ThinkStation P330 Tiny. It has an NVIDIA Corporation GP107GL [Quadro P1000] (rev a1) graphics card, which I use with the default nouveau driver.
For a few months now, I have been experiencing problems that can be traced back to the nouveau driver, and things have only gotten worse over time.
Now I have to boot the machine 3-4 times every day just to get it running properly, because it tends to crash after the GUI login.
At first I only saw timeout error messages like:
Code:
nouveau 0000:01:00.0: fifo: FB_FLUSH_TIMEOUT
fifo: PBDMA0: 00000004 [MEMACK_EXTRA] ch 2 [00ff962000 Xorg[2523]] subc 0 mthd 0e04 data 00f90062
DRM: base-1: timeout
To mitigate the problem, I first stopped playing long-running YouTube videos. As the problem got worse, I also disabled hardware acceleration in all of my browsers, which improved things for a while.
Now I seem to have run out of options, and it is really starting to hurt the overall usability of the machine.
Currently, the nouveau crashes produce logs like this:
Code:
Aug 23 06:58:14 janos-work-host kernel: [ 5.509825] fbcon: nouveaudrmfb (fb0) is primary device
Aug 23 06:58:14 janos-work-host kernel: [ 7.695620] nouveau 0000:01:00.0: DRM: core notifier timeout
Aug 23 06:58:14 janos-work-host kernel: [ 8.487136] nouveau 0000:01:00.0: [drm] fb0: nouveaudrmfb frame buffer device
Aug 23 06:58:14 janos-work-host kernel: [ 8.512533] [drm] Initialized nouveau 1.3.1 20120801 for 0000:01:00.0 on minor 1
Aug 23 06:58:14 janos-work-host kernel: [ 8.512553] nouveau 0000:01:00.0: DRM: Disabling PCI power management to avoid bug
Aug 23 06:58:14 janos-work-host kernel: [ 20.649242] snd_hda_intel 0000:01:00.1: bound 0000:01:00.0 (ops nv50_audio_component_bind_ops [nouveau])
Aug 23 06:58:14 janos-work-host sensors[1676]: nouveau-pci-0100
Aug 23 06:58:39 janos-work-host kernel: [ 46.813257] nouveau 0000:01:00.0: fifo: fault 01 [WRITE] at 0000000000020000 engine 06 [HOST0] client 07 [HUB/HOST_CPU] reason 02 [PTE] on channel 8 [00fef16000 Xorg[2683]]
Aug 23 06:58:39 janos-work-host kernel: [ 46.813269] nouveau 0000:01:00.0: fifo: channel 8: killed
Aug 23 06:58:39 janos-work-host kernel: [ 46.813272] nouveau 0000:01:00.0: fifo: runlist 0: scheduled for recovery
Aug 23 06:58:39 janos-work-host kernel: [ 46.813305] WARNING: CPU: 3 PID: 3183 at drivers/gpu/drm/nouveau/nvkm/engine/fifo/gk104.c:284 gk104_fifo_engine_id+0x4f/0x80 [nouveau]
Aug 23 06:58:39 janos-work-host kernel: [ 46.813408] snd_seq_midi_event libarc4 snd_rawmidi kvm_intel snd_seq btusb kvm btrtl btbcm btintel btmtk snd_seq_device iwlwifi bluetooth nls_iso8859_1 snd_timer rapl mei_me intel_cstate input_leds joydev snd think_lmi ecdh_generic intel_wmi_thunderbolt firmware_attributes_class ecc wmi_bmof cfg80211 ee1004 mei soundcore mac_hid acpi_pad acpi_tad sch_fq_codel msr parport_pc ppdev lp parport ramoops reed_solomon pstore_blk pstore_zone efi_pstore ip_tables x_tables autofs4 dm_crypt hid_generic i915 nouveau r8152 mxm_wmi drm_ttm_helper i2c_algo_bit ttm cdc_ncm drm_kms_helper cdc_ether syscopyarea sysfillrect usbnet usbhid sysimgblt fb_sys_fops mii hid crct10dif_pclmul cec crc32_pclmul ghash_clmulni_intel rc_core aesni_intel crypto_simd drm nvme cryptd e1000e i2c_i801 xhci_pci nvme_core i2c_smbus xhci_pci_renesas wmi video
Aug 23 06:58:39 janos-work-host kernel: [ 46.813448] RIP: 0010:gk104_fifo_engine_id+0x4f/0x80 [nouveau]
Aug 23 06:58:39 janos-work-host kernel: [ 46.813548] gk104_fifo_fault+0x10c/0x230 [nouveau]
Aug 23 06:58:39 janos-work-host kernel: [ 46.813606] nvkm_fifo_fault+0x15/0x20 [nouveau]
Aug 23 06:58:39 janos-work-host kernel: [ 46.813662] gp100_fifo_intr_fault+0xe0/0x110 [nouveau]
Aug 23 06:58:39 janos-work-host kernel: [ 46.813716] gk104_fifo_intr+0x2a6/0x3c0 [nouveau]
Aug 23 06:58:39 janos-work-host kernel: [ 46.813791] nvkm_fifo_intr+0x1d/0x20 [nouveau]
Aug 23 06:58:39 janos-work-host kernel: [ 46.813866] nvkm_engine_intr+0x1f/0x30 [nouveau]
Aug 23 06:58:39 janos-work-host kernel: [ 46.813901] nvkm_subdev_intr+0x1a/0x20 [nouveau]
Aug 23 06:58:39 janos-work-host kernel: [ 46.813937] nvkm_mc_intr+0x14e/0x190 [nouveau]
Aug 23 06:58:39 janos-work-host kernel: [ 46.813984] ? gp100_mc_intr_unarm+0x3a/0x50 [nouveau]
Aug 23 06:58:39 janos-work-host kernel: [ 46.814031] nvkm_pci_intr+0x5e/0xb0 [nouveau]
Aug 23 06:58:39 janos-work-host kernel: [ 46.814142] nouveau 0000:01:00.0: Xorg[2683]: channel 8 killed!
Aug 23 06:59:47 janos-work-host kernel: [ 115.087275] nouveau 0000:01:00.0: fifo: fault 01 [WRITE] at 000000000127a000 engine 05 [BAR2] client 08 [HUB/HOST_CPU_NB] reason 02 [PTE] on channel -1 [00ffebf000 unknown]
Aug 23 06:59:47 janos-work-host kernel: [ 115.144046] nouveau 0000:01:00.0: fifo: fault 00 [READ] at 0000000011402000 engine 15 [ce0] client 01 [HUB/CE0] reason 00 [PDE] on channel 2 [00ff962000 Xorg[2683]]
Aug 23 06:59:47 janos-work-host kernel: [ 115.144083] nouveau 0000:01:00.0: fifo: channel 2: killed
Aug 23 06:59:47 janos-work-host kernel: [ 115.144090] nouveau 0000:01:00.0: fifo: runlist 0: scheduled for recovery
Aug 23 06:59:47 janos-work-host kernel: [ 115.144100] nouveau 0000:01:00.0: fifo: engine 0: scheduled for recovery
Aug 23 06:59:47 janos-work-host kernel: [ 115.144106] nouveau 0000:01:00.0: fifo: engine 7: scheduled for recovery
Aug 23 06:59:47 janos-work-host kernel: [ 115.144117] nouveau 0000:01:00.0: Xorg[2683]: channel 2 killed!
Aug 23 07:00:05 janos-work-host kernel: [ 132.316992] nouveau 0000:01:00.0: fifo: fault 01 [WRITE] at 0000000000020000 engine 06 [HOST0] client 07 [HUB/HOST_CPU] reason 02 [PTE] on channel 9 [00fedbc000 gst-plugin-scan[7003]]
Aug 23 07:00:05 janos-work-host kernel: [ 132.317005] nouveau 0000:01:00.0: fifo: channel 9: killed
Aug 23 07:00:05 janos-work-host kernel: [ 132.317008] nouveau 0000:01:00.0: fifo: runlist 0: scheduled for recovery
Aug 23 07:00:05 janos-work-host kernel: [ 132.317042] WARNING: CPU: 3 PID: 6824 at drivers/gpu/drm/nouveau/nvkm/engine/fifo/gk104.c:284 gk104_fifo_engine_id+0x4f/0x80 [nouveau]
Aug 23 07:00:05 janos-work-host kernel: [ 132.317140] snd_pcm mac80211 snd_seq_midi snd_seq_midi_event libarc4 snd_rawmidi kvm_intel snd_seq btusb kvm btrtl btbcm btintel btmtk snd_seq_device iwlwifi bluetooth nls_iso8859_1 snd_timer rapl mei_me intel_cstate input_leds joydev snd think_lmi ecdh_generic intel_wmi_thunderbolt firmware_attributes_class ecc wmi_bmof cfg80211 ee1004 mei soundcore mac_hid acpi_pad acpi_tad sch_fq_codel msr parport_pc ppdev lp parport ramoops reed_solomon pstore_blk pstore_zone efi_pstore ip_tables x_tables autofs4 dm_crypt hid_generic i915 nouveau r8152 mxm_wmi drm_ttm_helper i2c_algo_bit ttm cdc_ncm drm_kms_helper cdc_ether syscopyarea sysfillrect usbnet usbhid sysimgblt fb_sys_fops mii hid crct10dif_pclmul cec crc32_pclmul ghash_clmulni_intel rc_core aesni_intel crypto_simd drm nvme cryptd e1000e i2c_i801 xhci_pci nvme_core i2c_smbus xhci_pci_renesas wmi video
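To make the pattern in these logs easier to compare across boots, the fault lines can be tallied per process, engine, and reason. The following is only a minimal Python sketch, assuming the "fifo: fault" line format shown above. The names FAULT_RE, summarize_faults, and SAMPLE are made up for illustration, and other kernel versions may format these lines differently.

```python
import re
from collections import Counter

# Matches the "fifo: fault" lines as they appear in the excerpt above.
# Assumption: this layout is specific to the kernel/nouveau version in use.
FAULT_RE = re.compile(
    r"nouveau \S+: fifo: fault \w+ \[(?P<access>\w+)\] at (?P<addr>[0-9a-f]+) "
    r"engine \w+ \[(?P<engine>[^\]]+)\] client \w+ \[(?P<client>[^\]]+)\] "
    r"reason \w+ \[(?P<reason>\w+)\] on channel (?P<channel>-?\d+) "
    r"\[\w+ (?P<process>[^\[\]]+)"
)


def summarize_faults(lines):
    """Count fault lines grouped by (process, engine, reason)."""
    counts = Counter()
    for line in lines:
        match = FAULT_RE.search(line)
        if match:
            key = (match["process"], match["engine"], match["reason"])
            counts[key] += 1
    return counts


# Three fault lines copied from the log excerpt above (timestamps trimmed).
SAMPLE = [
    "nouveau 0000:01:00.0: fifo: fault 01 [WRITE] at 0000000000020000 "
    "engine 06 [HOST0] client 07 [HUB/HOST_CPU] reason 02 [PTE] "
    "on channel 8 [00fef16000 Xorg[2683]]",
    "nouveau 0000:01:00.0: fifo: fault 01 [WRITE] at 000000000127a000 "
    "engine 05 [BAR2] client 08 [HUB/HOST_CPU_NB] reason 02 [PTE] "
    "on channel -1 [00ffebf000 unknown]",
    "nouveau 0000:01:00.0: fifo: fault 01 [WRITE] at 0000000000020000 "
    "engine 06 [HOST0] client 07 [HUB/HOST_CPU] reason 02 [PTE] "
    "on channel 9 [00fedbc000 gst-plugin-scan[7003]]",
]

# Print one line per (process, engine, reason) combination, most frequent first.
for (process, engine, reason), n in summarize_faults(SAMPLE).most_common():
    print(f"{n:3d}  process={process}  engine={engine}  reason={reason}")
```

In practice the same function could be fed the lines of a saved kernel log instead of SAMPLE; a summary like this only shows which processes and engines are involved, not why the faults occur.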
Does anyone have any idea how I could improve the situation? Should I perhaps switch over to the official NVIDIA driver?