Author Topic: Sometimes system hangs - baytrail - help to find out the reason and to fix  (Read 7211 times)

Offline mark001

  • Sr. Mitglied
  • ****
  • Posts: 263
  • Branch: stable
  • Desktop: xfce4
  • GPU Card: Intel HD Graphics
  • GPU driver: i915
  • Skill: Novice
Re: Sometimes system hangs - help to find out the reason and to fix
« Reply #30 on: 08. April 2016, 14:01:31 »
Just adding a piece of information: C_State 0 and 1 should have the same effect.
http://www.breakage.org/2012/11/14/processor-max_cstate-intel_idle-max_cstate-and-devcpu_dma_latency/

Do you mean that some default value causes the problem? Is it -1, or something else? Please comment.


Offline mark001

  • Sr. Mitglied
  • ****
  • Posts: 263
  • Branch: stable
  • Desktop: xfce4
  • GPU Card: Intel HD Graphics
  • GPU driver: i915
  • Skill: Novice
Re: Sometimes system hangs - help to find out the reason and to fix
« Reply #31 on: 08. April 2016, 14:08:46 »
The moment I used intel_idle.max_cstate=2 on my system those
freezes disappeared for good!

So you should try the value two (2).

As far as I understand it, this parameter limits the maximum level
of power saving states/modes used by the cpu. So a value of one (1)
would lead to more power consumption as it practically disables most
of its power saving capabilities. A value of two (2) disables less
and keeps the power consumption moderate.

I never used/tried a value of one (1) as proposed in the bug report.

Don't be mistaken! This is only a workaround and HAS TO be
solved by kernel developers/intel soon.

To verify that this setting has been applied correctly you can do:
Code: [Select]
$ dmesg | grep intel_idle
and should see something like:
Code: [Select]
[    0.000000] Command line: BOOT_IMAGE=/vmlinuz-4.4-x86_64 root=/dev/mapper/croot rw cryptdevice=/dev/sda3:croot quiet nosplash noresume intel_idle.max_cstate=2
[    0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-4.4-x86_64 root=/dev/mapper/croot rw cryptdevice=/dev/sda3:croot quiet nosplash noresume intel_idle.max_cstate=2
[    1.089734] intel_idle: MWAIT substates: 0x33000020
[    1.089737] intel_idle: v0.4 model 0x37
[    1.089740] intel_idle: lapic_timer_reliable_states 0xffffffff
[    1.089743] intel_idle: max_cstate 2 reached

Today I changed the cstate value to 2, and this is what dmesg shows after a reboot:
Code: [Select]
$ dmesg | grep intel_idle
...
[    0.567517] intel_idle: MWAIT substates: 0x33000020
[    0.567520] intel_idle: v0.4 model 0x37
[    0.567523] intel_idle: lapic_timer_reliable_states 0xffffffff
[    0.567525] intel_idle: max_cstate 2 reached

But I just wanted to ask why you put intel_idle.max_cstate=2 after quiet nosplash. In the boot parameters, I mean:
Code: [Select]
...quiet nosplash noresume intel_idle.max_cstate=2
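
On the ordering question: as far as I know, the kernel looks parameters up by name, so the position of intel_idle.max_cstate=2 on the command line should not matter. Below is a minimal sketch that reads the value back wherever it sits. The helper name get_cmdline_param is made up for this example, and it prints only the first match.

```shell
#!/bin/sh
# Print the value of a key=value kernel parameter found in a command-line
# string. get_cmdline_param is a hypothetical helper name, not a standard tool.
get_cmdline_param() {
    name=$1
    cmdline=$2
    # Word splitting is intentional here: one parameter per word.
    for word in $cmdline; do
        case $word in
            "$name"=*)
                # Strip the "name=" prefix and print the value.
                printf '%s\n' "${word#"$name"=}"
                return 0
                ;;
        esac
    done
    return 1   # parameter not present
}

# Usage on a running system:
#   get_cmdline_param intel_idle.max_cstate "$(cat /proc/cmdline)"
```

If your kernel exposes it, /sys/module/intel_idle/parameters/max_cstate should show the same value that the driver actually picked up.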

Offline c00ter

  • Held Mitglied
  • *****
  • Posts: 1534
  • Towelie's cupcake
  • Branch: ☮Olive☮
  • Desktop: Depends©
  • GPU Card: Intel HD4400M CPU: Core i7-4510U
  • GPU driver: Intel/Free
  • Kernel: 4.4-lts & 4.5
  • Skill: Novice
Re: Sometimes system hangs - help to find out the reason and to fix
« Reply #32 on: 08. April 2016, 14:43:21 »
Wow. OP, you just hijacked your own thread.

On second thought, go for it.  >:D >:D >:D
“What, me worry?” ― Alfred E. Newman

Manjaro Wiki: https://wiki.manjaro.org/
Arch Wiki: https://wiki.archlinux.org/
Pacman Rosetta: https://wiki.archlinux.org/index.php/Pacman/Rosetta

Offline BbigTree

  • Neuling
  • *
  • Posts: 25
  • Skill: Intermediate
Re: Sometimes system hangs - help to find out the reason and to fix
« Reply #33 on: 08. April 2016, 15:51:50 »
Hello mark001!

Any chance your cpu is an intel baytrail like e.g. N2930, N3540 or N3050 ?

Then there are some serious freeze issues as elaborately discussed in https://bugzilla.kernel.org/show_bug.cgi?id=109051

On my Acer Notebook with N3540 cpu I could remedy this problem with kernel parameter "intel_idle.max_cstate=2".

This is the reason why I sent the new hardware for my new router setup back.
I'm so sad; such a cute little board, even with 2 NICs onboard.

ATM, IMO, there is too much wrong with Braswell / Bay Trail in the kernel. Bad job on Intel's side,  :-[


===


@Topic: I'm with jonathon on this one.
Manjaro is a rolling release. Why don't you update? If you don't, why choose Manjaro in the first place?
Sure, it is your choice, but then it is really hard to help you!

"Grep" your dmesg (journalctl) and participate here: https://bugzilla.kernel.org/show_bug.cgi?id=109051
Intel needs to fix this in the kernel!

Offline schaschlik

  • Neuling
  • *
  • Posts: 13
  • Yes, it's pork!
  • Branch: testing
  • Desktop: Xfce
  • GPU Card: Intel ValleyView Gen7
  • GPU driver: free
  • Kernel: linux44-x64
  • Skill: Intermediate
Re: Sometimes system hangs - help to find out the reason and to fix
« Reply #34 on: 09. April 2016, 14:50:45 »
Do you mean that some default value causes the problem? Is it -1, or something else? Please comment.


Have a look at the Linux kernel 4.1 source code for the intel_idle.c module:
http://lxr.free-electrons.com/source/drivers/idle/intel_idle.c?v=4.1#L211

Starting at line #211 it lists all available cstates for the Intel Bay Trail (BYT) CPU. Therefore setting
intel_idle.max_cstate=2 limits cstate usage to the first two: "C1-BYT" and "C6N-BYT". The
default max_cstate setting is the number of array elements of
Code: [Select]
struct cpuidle_state byt_cstates[]
minus one. Here it's 5.
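
To see which states the driver actually registered on a running system, the cpuidle sysfs tree can be listed. This is a minimal sketch; list_cstates is a hypothetical helper name, and the optional directory argument exists only so the function can be pointed at a copy of the tree. Note that on x86, state0 in sysfs is usually a polling state, so the sysfs numbering can be shifted by one against the driver table.

```shell
#!/bin/sh
# List the idle states that cpuidle registered for one CPU.
# list_cstates is a hypothetical helper name for this sketch.
list_cstates() {
    dir=${1:-/sys/devices/system/cpu/cpu0/cpuidle}
    for state in "$dir"/state[0-9]*; do
        # Skip the unexpanded glob or unreadable entries.
        [ -r "$state/name" ] || continue
        printf '%s %s\n' "${state##*/}" "$(cat "$state/name")"
    done
}

# Usage: list_cstates
```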

Offline mark001

  • Sr. Mitglied
  • ****
  • Posts: 263
  • Branch: stable
  • Desktop: xfce4
  • GPU Card: Intel HD Graphics
  • GPU driver: i915
  • Skill: Novice
Re: Sometimes system hangs - help to find out the reason and to fix
« Reply #35 on: 10. April 2016, 11:44:08 »
As far as I know, Linux has supported Bay Trail at least since version ~3.19.x (correct me if I am wrong; see the source file /linux-4.0/drivers/pinctrl/intel/pinctrl-baytrail.c), so it should be more or less fine. Anyway, how can I find out exactly which Linux version has the fix for this issue? As far as I understand, it is still an ACPI issue.

BTW, as I was saying, I've added the parameter intel_idle.max_cstate=2, so we'll see how it goes  ::) The worst freezes are caused by the kazam app (I couldn't find out why): it freezes so badly that I cannot even switch to a console with Ctrl+Alt+F1   :'(  Do you think the kazam issue could be caused by a memory leak, since it happens after about 1 h of kazam running? I am still looking into that.

p.s. http://lxr.free-electrons.com/source/drivers/pinctrl/intel/pinctrl-baytrail.c?v=3.19
« Last Edit: 10. April 2016, 12:06:13 by mark001 »

Offline mark001

  • Sr. Mitglied
  • ****
  • Posts: 263
  • Branch: stable
  • Desktop: xfce4
  • GPU Card: Intel HD Graphics
  • GPU driver: i915
  • Skill: Novice
Re: Sometimes system hangs - help to find out the reason and to fix
« Reply #36 on: 10. April 2016, 12:47:53 »
This is the reason why I sent the new hardware for my new router setup back.
I'm so sad; such a cute little board, even with 2 NICs onboard.

ATM, IMO, there is too much wrong with Braswell / Bay Trail in the kernel. Bad job on Intel's side,  :-[


===


@Topic: I'm with jonathon on this one.
Manjaro is a rolling release. Why don't you update? If you don't, why choose Manjaro in the first place?
Sure, it is your choice, but then it is really hard to help you!

"Grep" your dmesg (journalctl) and participate here: https://bugzilla.kernel.org/show_bug.cgi?id=109051
Intel needs to fix this in the kernel!

Oh, I am sorry about your board, that's a really sad case... I wish you good luck with that  ^-^

As far as I understand, the cstate problem cannot be caught with dmesg or journalctl, which makes it the hardest thing to analyze  :P I am still not sure that the ACPI issue is really what is happening, so as a test I added the parameter intel_idle.max_cstate=2 and will just wait and see how it goes  ::)  As I was saying, since xf86-video-intel 1:2.99.917+556+g0532031-0.1 was installed, the freeze basically happens after a resume on about day 10. Before that version was installed, it happened on day 1-3, depending on how GPU-heavy apps (say, kazam or guvcviewer) are run. (I sometimes don't switch off my notebook.) So at first I thought the issue might somehow be triggered by a "cache" or similar, because my notebook runs for so long on suspend/resume alone. I am still analyzing  ^-^

« Last Edit: 10. April 2016, 13:07:04 by mark001 »

Offline mark001

  • Sr. Mitglied
  • ****
  • Posts: 263
  • Branch: stable
  • Desktop: xfce4
  • GPU Card: Intel HD Graphics
  • GPU driver: i915
  • Skill: Novice
OK guys, my system froze 10 hours after I set intel_idle.max_cstate=2, rebooted and resumed once :P Now I am testing
Code: [Select]
intel_idle.max_cstate=1
The dmesg shows:
Code: [Select]
$ dmesg | grep intel_idle
...
[    0.567878] intel_idle: MWAIT substates: 0x33000020
[    0.567881] intel_idle: v0.4 model 0x37
[    0.567884] intel_idle: lapic_timer_reliable_states 0xffffffff
[    0.567886] intel_idle: max_cstate 1 reached
« Last Edit: 10. April 2016, 19:47:06 by mark001 »

Offline c00ter

  • Held Mitglied
  • *****
  • Posts: 1534
  • Towelie's cupcake
  • Branch: ☮Olive☮
  • Desktop: Depends©
  • GPU Card: Intel HD4400M CPU: Core i7-4510U
  • GPU driver: Intel/Free
  • Kernel: 4.4-lts & 4.5
  • Skill: Novice
OK guys, my system froze 10 hours after I set intel_idle.max_cstate=2, rebooted and resumed once :P Now I am testing
Code: [Select]
intel_idle.max_cstate=1
The dmesg shows:
Code: [Select]
$ dmesg | grep intel_idle
...
[    0.567878] intel_idle: MWAIT substates: 0x33000020
[    0.567881] intel_idle: v0.4 model 0x37
[    0.567884] intel_idle: lapic_timer_reliable_states 0xffffffff
[    0.567886] intel_idle: max_cstate 1 reached

If you quit messing with CPU states, perhaps this stuff might be self-correcting?  :o

Regards

Offline jonathon

  • Core Team
  • *****
  • Posts: 2104
  • Technologist - Teacher - Tea drinker
  • Branch: Unstable
  • Desktop: MATE 1.14
  • GPU Card: Nvidia GTX680M
  • GPU driver: Bumblebee nvidia+intel
  • Kernel: 4.6.0-*-MANJARO x86_64
  • Skill: Advanced
You appear to be messing about with various settings and looking for feedback from others without giving any information about the current configuration.

What kernel are you making these changes under? The last thing you said was that you were running kernel 4.0 and an older version of the Intel drivers because you hadn't updated your system.

Better Baytrail etc. support is introduced in later kernel versions. If you're faffing with kernel 4.0 you're on to a losing battle and you should stop it.

Unless you're working from a known starting point (i.e. fully updated with a specified, maintained, kernel on the 'stable' branch) you're wasting everyone's time.
--
MSI GT70: Core i7-3630QM, 16GB, Nvidia GTX680M, Intel 2230, Manjaro-MATE-amd64-EFI
Lenovo X230: Core i5-3320M, 4GB, Intel HD4000, Intel 6205, Manjaro-MATE-amd64
Dell Studio 1749: Core i5 540, 8GB, ATi HD5650, Intel WLAN, Manjaro-Xfce-amd64
Let's go in the garden; you'll find something waiting.

Offline mark001

  • Sr. Mitglied
  • ****
  • Posts: 263
  • Branch: stable
  • Desktop: xfce4
  • GPU Card: Intel HD Graphics
  • GPU driver: i915
  • Skill: Novice
You appear to be messing about with various settings and looking for feedback from others without giving any information about the current configuration.

What kernel are you making these changes under? The last thing you said was that you were running kernel 4.0 and an older version of the Intel drivers because you hadn't updated your system.

Better Baytrail etc. support is introduced in later kernel versions. If you're faffing with kernel 4.0 you're on to a losing battle and you should stop it.

Unless you're working from a known starting point (i.e. fully updated with a specified, maintained, kernel on the 'stable' branch) you're wasting everyone's time.

I am using Linux 4.0.9 right now because it is much more stable on my hardware. I am still testing intel_idle.max_cstate=1; since I set it, no freezes have occurred, so intel_idle.max_cstate=1 seems to work better in my case. I am still testing and will report my results a bit later. The system has now gone about a week without a freeze, using only suspend/resume and never powering off, so it seems more or less fine...
« Last Edit: 17. April 2016, 17:28:13 by mark001 »

Offline curiouseag

  • Sr. Mitglied
  • ****
  • Posts: 267
  • I'm new. Be nice!
  • Branch: stable
  • Desktop: undicieded
  • GPU Card: Intel Core i3-5010U
  • GPU driver: Intel
  • Kernel: grsec self build
  • Skill: Advanced
I own a baytrail tablet and wanted to create a patched manjaro kernel for it. But I do not have time to do it right now.
Maybe next weekend.

Offline jonathon

  • Core Team
  • *****
  • Posts: 2104
  • Technologist - Teacher - Tea drinker
  • Branch: Unstable
  • Desktop: MATE 1.14
  • GPU Card: Nvidia GTX680M
  • GPU driver: Bumblebee nvidia+intel
  • Kernel: 4.6.0-*-MANJARO x86_64
  • Skill: Advanced
I am using linux 4.0.9 right now cause it is much stable for my hardware

This kernel is end-of-life and is receiving no further updates. It has vulnerabilities and bugs that will never be corrected.

You will face problems and the fact you are apparently unwilling to take any advice, test any other way of doing things, or even perform regular updates means helping you further would appear to be pointless.

A rolling distro such as Manjaro is entirely unsuitable for your approach. Please move to another distro so you don't waste anyone's time any further.


Offline mark001

  • Sr. Mitglied
  • ****
  • Posts: 263
  • Branch: stable
  • Desktop: xfce4
  • GPU Card: Intel HD Graphics
  • GPU driver: i915
  • Skill: Novice
OK, guys, I can finally confirm that cstate=1 really works for the Intel(R) Pentium(R) CPU N3520 @ 2.16GHz (Bay Trail).
The "how to fix" solution for bug 109051 has now been added to the wiki: https://wiki.archlinux.org/index.php/intel_graphics#X_freeze.2Fcrash_with_intel_driver

The kernel parameter

intel_idle.max_cstate=1

is a usable workaround for me, so I am fine with it.
« Last Edit: 30. May 2016, 14:00:30 by mark001 »
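
For anyone who wants to make the parameter permanent with GRUB, here is a minimal sketch. It assumes the usual /etc/default/grub layout with a double-quoted GRUB_CMDLINE_LINUX_DEFAULT="..." line; add_kernel_param is a hypothetical helper name, and a line using single quotes is left untouched. Back up the file first.

```shell
#!/bin/sh
# Append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT in a grub defaults
# file. add_kernel_param is a hypothetical helper name for this sketch.
add_kernel_param() {
    param=$1
    file=${2:-/etc/default/grub}
    # Do nothing if the parameter is already present anywhere in the file.
    if grep -qF -- "$param" "$file"; then
        return 0
    fi
    # Insert the parameter just before the closing double quote.
    sed -i "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"|\1 $param\"|" "$file"
}

# Usage (as root):
#   add_kernel_param intel_idle.max_cstate=1
```

After editing the file, the GRUB configuration has to be regenerated (on Manjaro typically with sudo update-grub, otherwise with grub-mkconfig -o /boot/grub/grub.cfg) and the machine rebooted.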

Offline CommonSourceCS

  • Neuling
  • *
  • Posts: 9
  • Branch: Unstable
  • Desktop: Xfce
  • GPU Card: Integrated
  • Kernel: 4.7.X
  • Skill: Intermediate
Repost from the other forum:
https://forum.manjaro.org/t/intel-baytrail-freezes-the-linux-kernel/1931/9

Thanks to Wolfgang M. Reimer we might have a fix for this issue!
https://bugzilla.kernel.org/show_bug.cgi?id=109051#c437
This fix should work with the following processors :

J2850, J1850, J1750, N3510, N2810, N2805, N2910, N3520, N2920, N2820, N2806, N2815, J2900, J1900, J1800, N3530, N2930, N2830, N2807, N3540, N2940, N2840, N2808

I am currently testing out the script on my N3540. No problems so far.

cstateInfo.sh verifies that the script has worked: its "Disabled" column shows a "1" for disabled states. (Alternatively, compare its output before and after running the script and check that the offending C6* states are no longer used, i.e. their 'time' does not change.)
c6off+c7on.sh is the script that actually disables the C6* states and enables C7.

This will not work if you have booted with any cstate parameters, so remove intel_idle.max_cstate=1 first.
The script should be run at boot. If it works, make the change permanent by adding it to startup.

If all is well, you should also benefit from better power savings than before. :smiley:
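
For reference, individual idle states can be switched off at runtime through the "disable" file that cpuidle exposes in sysfs for each state. The sketch below only illustrates that mechanism and is not the script from the bug report. set_cstates_disabled and its arguments are hypothetical names, and the third argument exists only so the function can be tried on a copy of the tree.

```shell
#!/bin/sh
# Write 1 (disable) or 0 (enable) to the "disable" file of every idle state
# whose name matches a shell pattern, on all CPUs.
# set_cstates_disabled is a hypothetical helper name for this sketch.
set_cstates_disabled() {
    pattern=$1      # shell pattern matched against the state name, e.g. 'C6*'
    value=$2        # 1 = disable, 0 = enable
    base=${3:-/sys/devices/system/cpu}
    for state in "$base"/cpu[0-9]*/cpuidle/state[0-9]*; do
        [ -r "$state/name" ] || continue
        case $(cat "$state/name") in
            $pattern) echo "$value" > "$state/disable" ;;
        esac
    done
}

# Usage (as root), following what the post describes:
#   set_cstates_disabled 'C6*' 1
#   set_cstates_disabled 'C7*' 0
```

The setting does not survive a reboot, which is why it has to be applied again at every boot.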

A quick walkthrough: on systemd you'll want to create a service.

Put the script at /usr/bin/c6off+c7on.sh

Make it executable with the following command:

sudo chmod 755 /usr/bin/c6off+c7on.sh

Then create the file /etc/systemd/system/cstatefix.service
with the following contents:
[Unit]
Description=Disable C6 and enable C7 idle states (Bay Trail freeze workaround)

[Service]
Type=oneshot
ExecStart=/usr/bin/c6off+c7on.sh

[Install]
WantedBy=multi-user.target

Then enable the service with the command:
sudo systemctl enable cstatefix.service

After you restart, use the info script to check that all is well.
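
If the info script is not at hand, the same check can be approximated by reading sysfs directly. This is a minimal sketch and not cstateInfo.sh itself; show_cstate_info is a hypothetical helper name. It prints the name, the disable flag and the accumulated residency time (in microseconds) for every state of every CPU.

```shell
#!/bin/sh
# Print one line per CPU idle state: cpu, state, name, disable flag, time.
# show_cstate_info is a hypothetical helper name for this sketch.
show_cstate_info() {
    base=${1:-/sys/devices/system/cpu}
    printf '%-6s %-8s %-10s %-9s %s\n' CPU STATE NAME DISABLED TIME_US
    for state in "$base"/cpu[0-9]*/cpuidle/state[0-9]*; do
        [ -r "$state/name" ] || continue
        cpu=${state%/cpuidle/*}
        printf '%-6s %-8s %-10s %-9s %s\n' \
            "${cpu##*/}" "${state##*/}" \
            "$(cat "$state/name")" "$(cat "$state/disable")" "$(cat "$state/time")"
    done
}

# Usage: show_cstate_info
```

Running it twice a few minutes apart should show no change in the time column for the disabled states, and systemctl status cstatefix.service shows whether the service itself ran at boot.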