I did an experiment with some interesting results. It started out as my beginner's attempt to compare two kernels.
It evolved into providing insight (I think) regarding up/down threshhold parameters for the "conservative" cpu governor
If you dont’ want to read the whole thing, jump to the conclusion section posted at the end.
Phone configuration – installed qkster’s UCLB3 with AT&T bloatware removed, added custom kernel, rooted.
To remove variable cpu loads, I turned wifi/data off and turned off the continuously-running programs that I have installed myself (Power Tutor and Tasker).
To create a steady cpu load, I started the program “relax and sleep” (calm background noise program, available for free). I checked one audio channel in the program, and pushed back button to place program in the background, still creating noise. (I think relax and sleep is a good choice of program for cpu testing in general because it allows to check variable number of channels which does put variable cpu load.. although in this case I only used only one channel.. and note that you cannot recreate this experiment if you use mp3 music app instead, because it uses much less cpu than one relax and sleep channel)
Then I started setcpu for monitoring and experimentation. Repeated with several different kernels.
Results with Entropy’s daily driver kernel.
I set the test setup in setcpu to conservative governor, Fmax = 1200, Fmin = 100, i/o scheduler = noop.
The cpu frequency in Mhz now has the following pattern:
800, 1000,800, 1000, repeat ... (frequency changing approx once per second)
Results with with Zen Infusion-Z A/1600 kernel.
I set the test setup in setcpu to conservative governor, Fmax = 1200, Fmin = 100, i/o scheduler = noop (same as before, intending to compare the performance of the kernels).
The cpu frequency in Mhz now has the following pattern:
1200 (constant).
ok. On the surface one might conclude Entropy’s kernel is somehow handling the load better without ratcheting up the frequency. But the story gets more interesting than that:..
Next I tried Zen’s same kernel Infusion Z/1600 with everything the same except change the cpu governor from “conservative” to “on-demand “
The cpu frequency in Mhz now has the following pattern:
200, 400, 200, 400, 200, 400, 200, 400, 200, 400, 200, 400, 200, 400, 200, 400, 1200 repeat
(changing about once per second, mostly 200, 400, pops into 1200 only very infrequently).
But wait! The "conservative" governor is supposed to be better on the battery than the on-demand governor, and yet for the exact same conditions, we’re gettings higher cpu frequency (1200 constant) with conservative than with on-demand (mostly 200/400 with occasionaly 1200). It’s the exact opposite of how it's supposed to be. Surprising, don’t you think!!???!!
So now let’s look at some other governor settings that don't seem to get much attention.
Go to “governor” page of setcpu with “conservative” selected on the main page. The following values appear repeatably after kernel installation for each kernel, so I am ASSUMING these are default values provided in the kernels themselves (open to comment if I have somehow come to the wrong conclusion)
For Zen’s Infusion-Z (A or B, 1600 or 1400)
up threshold = 80
down threshold =20
(also freq step = 5, sampling rate = 78124 although I don’t think these are important for this post)
For Entropy’s DD
up threshold = 50
down threshold =35
(also freq step = 20, sampling rate = 40000 although I don’t think these are important for this post)
(both kernels have sampling down factor = 1, ignore nice load = 0).
I think we can explain my "experimental" results by examining the above up and down thresholds and making some assumptions about the nature of the load (my assumptions are admittedly contrived in attempt to explain these observations, but they seem reasonable to me).
I ASSUME the steady cpu load I have created in my steup varies in the range 350-400 Mhz quasi-steady state (not perfectly constant due to other processes jumping up in the background).
I ASSUME that before the steady cpu load is reached, there is a temporary increase in cpu loading to 700Mhz or more associated with me flipping screens around to get from the relax and sleep appliation to the setcpu application. Within several seconds, this temporary increase will be gone and only the quasi-steady portion 350-400 Mhz remains.
First look at performance of Zen's Infusion-Z A/1600 while in conservative with default settings in the above experiment. That initial spike of 700Mhz load was enough to get us above the up-threshhold of the 800M-hz level (80%*800Mhz = 640Mhz) and push us to 1200M-hz (1200 comes after 800 in progresion for Zen A, which has no 1000). Once we got to 1,200Mhz, we are NEVER going to get down from there until we reach a load corresponding to the down-threshhold of that level which is 240Mhz (20%*1200Mhz = 240Mhz). And with my relax and sleep application running at 350-400Mhz, it won't happen. That is quite a depressing thing to think – I could put my relax and sleep on for an hour as backgorund noise, and my cpu would be buzzing at 1,200Mhz even though the load is only 350-400Mhz.
This seems very undesriable for battery life.
Now lets look at performance of Entropy’s daily driver in conservative/default setings in the above experiment. The postulated 350-400Mhz cpu load occasionally exceeds the up threshhold of 800Mhz (50%*800Mhz = 400) and once in 1000Mhz occasionally drops below dropout setting of the 1000M-hz (35%*1000Mhz = 350). (And now you know why I postulated 350-400). I have two comments about these Entropy results. The first is minor/tangential, the second more important.
1 - First comment (minor/detour) has to do with cycling between different cpu frequencies which is created by the governor (not the load). I don't think it's any problem at all, but this type of cycling is more likely to occur when the diferences between adjacent frequencies are large. For example let’s say the cpu load was rock solid pure steady state (not varying) at 250Mhz. The up threshhold for 400Mhz setting is 200 (50%*200) while the down threshhold for 800Mhz is 280 (35%*800). So we have postulated a situation where the cpu demand is pure steady state (250Mhz), yet the governor will never find a steady state solution... if it’s in 400Mhz it wants to upshift and if it’s in 800Mhz it wants to downshift. Again I don’t think it’s a problem (it's probably ok to let the two frequencies time-share back and forth) but there is a strategy to avoid it if we want to avoid it, as follows. Considering the highest possible ratio between adjacent frequencies (for these kernels) is 2.0, then we should set things so the ratio of (UpThreshold / DownThreshold) > 2.0 in order to avoid this cycling (which is probably not a problem, more later)
2 – Second comment is more important because it relates to battery usage (as I percieve it). Original postulated load that explains this experiment results is 350Mhz-400Mhz. Yet the cpu is running at 800-1000Mhz. Twice as high. That’s wasting some battery I think.
To summarize results so far, it seems to me that Zen’s kernel default thresholds have potential to waste battery due to low down threshhold (20%), which can keep it at high CPU rate forever, even though the load has decreased substantially. In theory we could be setting the cpu almost 5 times as it needs to be in the situation where steady load decreases to just above 20% of the higher level.. The Entropy’s kernel default threshhold have potential to waste battery due to the low up threshold (50%). In theory we can be setting the cpu almost twice as fast as it needs to be in the situation where the steady load increases to just above 50% of the lower level. Entropy's kernel defaults also create the potential for continuos cycling between frequencies even in the presence a of perfectly steady cpu load, since ratio up/down is less than 2 (I don’t think that's a problem, the only reason I mention cycling is because it feeds into my strategy for selecting down threshold - see below).
So what settings should we use for up/down thresholds? Actually I haven't done my complete due diligence in searching before posting this thread, if someone has a good link with recommendations and/or discussion on this subject I'd be interested. I have seen the generic xda thread on governors and I don't think it was covered there in terms of specific recommended values. Here's my thought process fwiw:. Higher is better for battery on both numbers (at some possible expense of performance). I remember seeing on other sites a default up threshhold of 95% (listed, but not discussed). That makes sense to me for battery saving... shift up at the last minute. Perhaps this high value gives small slowing of response to demand increase, but I don’t think it’s much slower (especially for rapid cpu load increase which is the most critical for response...rapid increase means short time to get from 50% to 95%... short time means not a big response penalty difference) and certainly it seems worthwhile to try to strive for an efficient operating point in long-term steady state. Additionally, we're talking about the "conservative" governor which is supposed to favor battery (we can set up a setcpu profile to invoke on-demand or interactiveX in situations when we want more responsiveness and don't care as much about battery, at least these are availble in Zen's). I don’t recall seeing any number for down threshhold, but should be high as possible again to save battery. How high? I don't know. The only way to put a limit I can think of it is to impose an arbitrary (maybe unnecessary) requirement that we don't want any cycling in pure steady state as discussed above. This means we need down threshhold at least a factor of 2 below up. So I pick 95/2, rounded down to nearest round nubmer of 45%. There may be further improvements if we drop that requirement to avoid cycling and allow even higher down-threshhold, but at least we know the down threshhold of 45% would seem to have moved in the right direction for battery from the defaults. So up/down 95/45 is my pick for now.
Using conservative governor with 95% up threshold and 45% down threshhold (still noop i/o) in above conditions on Zen’s kernel, I’m seeing frequency pattern
400, 400, 400, 400, 400, 400, 400, 400, 800, repeat
in other words mostly 400, intermittent jump to 800.
Certainly the up/down 95/45 settings for conservative governor perform better batterywise than the default settings for conservative governor given in both kernels for this one experiment. To me, it seems very reasonable to expect it to also be better batterywise accross a wide range of expected operations, but it's open to comment.
Small detour - why did we do better batterywise on Zen's on-demand default settings than on Zen's conservative default settings for this particular cpu loading?. The settings for Zen's on-demand default include an up threshhold of 95% and no down threshhold. So on-demand governor apparenlty finds some other way to shift down. Since the 20% down threshold that was causing the problem in Zen's conservative governor default settings is not present in the on-demand governor... that probably explains why the on-demand didn't get hung up at the higher level and performed better afterwards. Another thing to note, if 95% up threshhold is responsive enough for the on-demand, it should surely be responsive enough for the for the conservative...supports the previous sugestion to increase conservative up threshhold to 95%.
CONCLUSIONS:
There is only thing in this entire thread that I am completely 100% positive about, and it is that Zen and Entropy know lightyears more about this stuff than me. In fact, that is the very reason that I was extremely careful to record the as-found default settings in order to preserve any intelligence that went into those defaults before I started tweaking.
So I can reach one of two conclusions:
#1 – I am completely misunderstanding how this conservative cpu governor works
or...?
#2 – The developers never intended for the “default” values to be used, instead they envisioned the users would adjust them as needed.
In the event that #2 were correct, then it would seem logical for battery-concious users to tweak these up/down threshhold settings of the conservative governor. My thoughts would be to set them to 95/45 by the logic above.... may or may not be considering all relevant factors. I'm open to thoughts and comments....
In the above analysis, I have assumed that power consumed by the cpu can be predicted from the cpu frequency (for a given voltage setting of course).
I now believe that assumption might be incorrect.
The reason I believe it is false is a result of another experiment I just did.
I set the cpu governor to performance to maintain constant 1200Mhz.
Then I looked at the cpu power usage trace in "Power Tutor" program.
I expected to see power attributed to cpu usage as constant, but it was varying up and down.
And by moving the homescreens around, I could create a dramatic and predictable increase in power usage of cpu (as indicated on Power Tutor).
All of this change in power consumption of the cpu occurred while the cpu governor was in performance mode with cpu frequency constant at1200Mhz.
I didn't expect that. I can't really explain it (can anyone else?). But clearly there is more to the story than I thought (assuming that the cpu power usage reported by the Power Tutor is correct, which I'm not sure of either).
To evaluate the battery friendliness of variouis governor settings, it might be more useful to watch the power tutor results when performing the above experiments, instead of just watching the cpu frequency as I did before.
Over my head...
How does the battery cycle pan out?
It would be nice if you or someone else had a spare phone to test this battery consumption theory.
I would wonder if the report of consumption is also correct.
The bottom line that I would be interested in seeing is how long can the phone, running a certain kernel and governor, last.
For example: Charge to 100%. Take off charger. Wifi off, Cell data off and in Airplane mode (removing signal variable).
Then run kernel with governor - record the battery duration.
e.pete - nice work! I appreciate your empirical approach to this topic.
I can add the following: you have described accurately how the conservative governor works. For OnDemand, the governor behavior results in the CPU at max under load and minimum when idle, with a smaller amount of time being spent at the steps in between based on thresholds. On my phone today with OnDemand, for example, I'm at 1600mhz 6% of the time, 100mhz 5% of the time, and 800mhz 2% of the time. Deep sleep is 83% and the other CPU frequencies are all below 0.5%.
In general use, conservative should be kinder to the battery and to the hardware. The recommendations I have seen for Linux platforms is to use conservative where battery life matters and ondemand when there is a constant external source of power (i.e. PC or server). Of course, actual use determines how the governor performs. Most smartphones have a lot going on even when the screen is off. A good indicator of average CPU use over the course of a day is CPUSpy. This app, combined with a decent battery monitor can help tell the story from a macro/whole system perspective over time. On the "be kind to hardware" topic, conservative should increment and decrement to adjacent frequencies based on load. This behavior might be happening too fast for SetCPU or other realtime monitor to capture..that's where CPUSpy can show what is happening over a larger period of time. These more gradual transitions may result in less wear and tear on the phone hardware, but I have not seen any significant evidence that this is a factor in the usual life span of a smartphone. (On the flip side, setting the governor to performance and OC to max setting...that is NOT recommended and could harm the phone).
That said, the two kernels you tested have the following default characteristics:
Entropy DD - Conservative/BFQ - Optimized for stability and battery life
Infusion (Bedwa/Zen) - OnDemand/CFQ - Optimized for performance
The Infusion kernels do not include optimization settings for conservative. As you surmised, the expectation is that if you are going to change these settings you have some idea of what you are aiming for and will adjust accordingly.
If battery life is your aim, I've found that the best savings are realized in optimizing transition to sleep when the phone is not being used, minimizing the number of apps that attempt to keep the phone awake, and being selective in your use of wifi and data radios (although too much mucking around with this last option can lead to triggering some of the known bugs in these kernels which manifest as a higher than normal Android OS or Dialer/RILD drains - as seen on the standard battery usage screen in Settings).
On this last topic, there's another thread (which I see you have visited) which covers discussion and work on these known battery drain anomalies: http://forum.xda-developers.com/showthread.php?t=1408433
Here is some additional information on governors courtesy of Big Blue... http://publib.boulder.ibm.com/infoc...?topic=/liaai/cpufreq/TheOndemandGovernor.htm
And a bit more info ... https://wiki.archlinux.org/index.php/CPU_Frequency_Scaling
Truckerglenn said:
Over my head...
Click to expand...
Click to collapse
:what: I'm so glad there are people much smarter then myself here. Great work electricpete :thumbup: even if I followed about half of it
Sent from a de-FUNKt Infuse
Pete - Here's a link to a thread that has a lot of information about governors, i/o schedulers, tweaks, scripts, and kernel objects:
http://forum.xda-developers.com/showthread.php?t=1369817
Thanks everyone, a lot of good info.
Especially Zen, very useful info and links.
It was a very interesting comment about certain governor strategies being hardware unfriendly if they jump increase the cpu from min to max.
I never realized that was a factor (only thought the max speed was important).
But it definitely sounds plausible. Maybe (?) the rapid temperature increase causes uneven temperature (cpu gets hot before attached plate gets hot) and therefore uneven thermal expansion, which causes mechanical stresses. I can imagine there are other subtle aspect of the cycling up/down that can be important. If you have any more info or links on the effects of cpu governor strategy upon hardware life readily available, I'd be very interested to hear it. (If not readily available, that's fine too, I will do some googling).
I did find this link which suggested it's better for the hardware life to run at 100% (I guess for us that's 1200Mhz) than it is to cycle up/down. It's not written about phones but about PCs. There might be some differences in the technical aspects. There are of course big differences in the priorities for PCs... they don't care about power usage as much as phones do, and pc users probably expect a longer life than phone users do.
http://www.overclockers.com/overclockings-impact-on-cpu-life/
Thanks again.
I will report back if I get some free time to continue experimenting... maybe this weekend.
Primary consideration (as you've noted) with smartphones is battery conservation. ARM processors are engineered to operate at multiple frequency steps, and to turn off where possible. Without this capability phones would need a much higher capacity battery. As for PCs, current processors include frequency stepping technology to reduce power consumption and heat, and perhaps extend life by keeping temps lower.
The main conclusion of the article you referenced (which is 13 years old, btw, but does contain a wealth of good foundational information) is that heat is the primary enemy. This is a major factor with smartphones as they have limited ability to dissipate heat. An Infuse running at 1200mhz (or 1600mhz OCed) confined to a purse or pocket, or (as reported in these forums a while back) under a pillow gets hot very quickly. This will lead to conditions that will harm the phone. At one time, my phone had an error condition causing the Dialer to go crazy (rild process) and peg the CPU at 100% for an extended period of time, while the phone was also plugged into a charger (thus heat from the charging process too). The end result of this was a temperature that tripped a heat sensor threshold, causing the phone to shut itself off. So there are, at least, limited protections against extreme events.
As I noted above, I've not seen any evidence that the normal (or even OCed) frequency stepping that occurs with smartphones leads to failures within the normal in service period for these devices - 2 to 4 years in most cases. Running at 100% all the time may put your phone's health at risk and will definitely impair your battery life.
Zen - Good points. One thing I do take away from the article (along with your comments) is the cuumulative effect of cycling, So when I settle on up/down threshholds, I may try to avoid putting them too close together in order to avoid extra cycling (keep Max/Min threshhold ratio >2). Although I do realize these particular cycles between two adjacent frequencies are not as bad as bad as the cycle between min and max frequency.
more test results
I have completed some testing using PowerTutor and results reported in attached spreadsheet.
I would say the results only muddy things further. Don’t read any further if you don’t have a tolerance for ambiguity.
SETUP (common to all tests)
In all tests, I had similar setup as in the original post: 1 channel of “relax and sleep” running to create a constant cpu load, and all other continuous-run programs turned off except power tutor.
Some other details common to all tests: Tasker off, Wifi off, data off, Power Tutor on
No uv
Noop I/o governor used throughout.
WHAT CHANGES BETWEEN TESTS:
See the spreadsheet, tab labeled “summary”.
The things that changed between tests are in rows 3 thru 8, labeled “Tested Configuration”
As you can see, between tests, I varied the Up threshhold and Down threshhold. I varied the kernel. I varied the governor (mostly conservative, but performance used).
WHAT WAS RECORDED DURING EACH TEST:
1 – Recorded power usage of CPU, LCD, Audio as reported in Power Tutor over the course of one minute. I converted them to battery %/hr (conversions shown in tab “Notes”) and listed them in rows 12-14
2 – Recorded the actual cpu frequencies seen in setcpu “home” screen, similar to original post and listed these in row 15. I attempted to guess the average frequency over time and put this in row 16.
3 Rows 18-23 are the Quadrant results for the six categories that Quadrant reports (yes, I know people don’t like Quadrant, just recorded as a datapoint)
WHAT PATTERNs EMERGE:
1 – Entropy’s DD and Zen’s Infusion-A use comparable power (as reported by Power Tutor) in this particular experiment.
2 – Zen’s Infusion does better on quadrant score in this particular experiment, when both are set at the same governor configuration (100-1200, conservative), . Not surprising since Zen said he has optimized for performance.
3 – How does the governor frequency affect power usage? This is the muddy part. There is no doubt that if we blindly take the data at face value, then there IS a correlation between cpu frequency and power attributed to the cpu by Power Tutor. However the correlation that emerges from the data is in the opposite direction from what anyone in the world would think: this data suggests that increasing CPU frequency causes decrease in power consumption reported by power tutor. See tab labeled “chart 1” for graphical depiction of this result.
As you can see in the graph, there is not a random spread of results (as would be the case if random unacconted-for errors were at work). There is a definite correlation. What it suggests perhaps is that there can be a systematic error in the way PowerTutor measures power that depends on cpu frequency.... in other words the error itself (between measured and actual) somehow depends on cpu frequency.
So, I am just reporting some results. I am definitely not suggesting anyone overclock to save power (that would be truly bizarre and I’d probably be kicked out of xda for suggesting something so silly).
On the other hand, as stated in the 2nd post of this thread, I’m still very leery of using cpu frequency as an indicator of power that cpu is drawing... because there is just too much going on inside that black box that I don’t know about. For one thing, the cpu itself may draw different amounts of power at a given frequency depending on it’s loading because the registers may not be doing anything at low loads. For another thing, there are a lot of other things in the phone (like RAM, bus) that may draw some power but probably get lumped in with the reported cpu power in Power Tutor and others. Perhaps the cpu is somehow more efficient at interfacing with these others parts of the system when cpu is at high speed, enabling it to reduce the power they draw. The point is, it’s a lot more complicated then I assumed in my first post.
I have heard Entropy mention that before changing kernels we should always reset UV settings (and reboot) and reset other cpu related settings (Fmin and Fmax I assume).
I would like to add another item: always uninstall setcpu before changing kernels and reinstall it after you change.
The reason: I have seen some very weird results of setcpu when I left it installed in between swapping kernels. Like for example cpu running at 1600Mhz even though Fmax is 1200Mhz and there are no profiles allowing 1600. Those weird results are not included in the above data (I observed the frequencies during each trial as reported in the spreadsheet).
Power Tutor has a great interface and very detailed stats available. Seems to have great credentials based on their website.
But I can only conclude we can’t trust it for our particular phone because of results above (power draw goes down as cpu speed goes up) and some other results I have seen (it seems to suggest that power used by my display does not change depending on dark/light backround, and also seems to suggest that power used by the phone does not change when I change the volume of music playing).
So, I’m looking for another way to be able to track power usage closely.
I kind of like qkster’s idea to just watch the battery go down.
I’d like to try to automate that using Tasker. I can write a program which will help me build a log of power usage.
The interface will be:
push a start icon and it prompts me to enter description of conditions that will be tested
wait some period of time (this is the constant-load test period that we're evaluating...may be listening to mp3)
push a stop icon and it prompts me for comments about anything that happened during the test.
At time of pressing the start icon, it will also record from the system:
1 - clock time (in seconds)
2 - voltage in millivolts to 4 digits of resolution (like 3784 millivolts)
3 - battery life remaining in percent to 2 digits of resolution (like 43%)
The same info will be recorded from the system upon pressing the stop icon.
All this info will be appended to a logfile and we can compute drain based on change in battery divided by change in time.
I can get these voltage and % life stats using the method suggested by Brandall’s tutorial here:
http://tasker.wikidot.com/using-linux-shell-with-tasker-for-a-technical-battery-widget
I couldn’t get the grep command to work, but I can still extract the required voltage and percent-life-remaining from the Battery sysdump using the tasker variable splitter command (I’ve already got that part programmed).
No-load voltage has a roughly known relation to battery life, but there’s also the matter of voltage drop accros the internal battery impedance that varies with load at the time of the measurement, so we don’t see no-load voltage, we see something lower which makes the whole thing somewhat variable.
Percent Remaining is the exact thing I want. But it is only given rounded to two digits, (43%). If I wanted to do a trial run listening to a 5-minute MP3 draining something like 12% per hour, the battery drop during that that 5 minutes will be only around 1%... the difference between two kernels or cpu frequency settings will be only a very small fraction of that 1%, so comparing the start and stop values which are both rounded to 1% would introduce an enormous error compared to the thing we're interested in. I can surely reduce that error by working with longer times, but that starts to become a PITA. That may end up being the only solution, but if there's any way to avoid it I'd like to be able to gather data in shorter chunks.
Which leads me to a QUESTION:
Does anyone know whether there is any way to retrieve or estimate “battery % remaining” with greater resolution that two digits (ie 43.26% instead of just 43%)?
unintended consequences from changing up threshhold
I used the following setup for almost a month:
Zen’s Infusion A Kernel (with my stock GB), conservative governor
UpThreshold = 95
DownThreshhold = 45
Only twice during the month, I saw the following:
Received a phone call. I could see the name of the caller. I couldn’t hear the caller. When I finally got hold of them later, they told me they could hear me even though I couldn't hear them.
That was very tough to figure out because it only occurred on two out of probably 30 or 40 phone calls received in a month.
The two phone calls did originate from cell phones in the same geographic area (near my work, an hour away from my home).
Then I had a breakthrough when I set up my work voicemail to automatically call my Android phone. Almost every time it called, the problem appeared (I couldn’t hear the robot voice telling me I had a message).
I kept leaving myself messages to reproduce the problem and narrow it down.
I found out it only occurs when my phone is asleep at time of the call (doesn’t occur if phone is awake at time of the call).
I removed my UV and problem continued.
I adjusted my governor and could make problem go away.
I narrowed it down to the up threshhold.
Repeatable with 95/45 up/down, the problem occurs.
Repeatably with 80/45 up/down, the problem does not occur.
I have gone back and forth between those two settings at least four times and each time it confirms the symptom is directly related to the governor setting.
Exactly why that is I’m not sure. Maybe the cpu is to slow to wake up to handle the call? Sounds kind of hokey, but I guess it doens’t really matter.
The bottom line for me: 80/45 is a great place for me to stay. Eliminates the “can’t hear caller” problem and still does pretty good at preventing cpu from going to high frequency when listening to my relax and sleep program for long periods of time.
If anyone has gone to 95/45 based on my recommendation, you might rethink it, especially if you see unusual behavior.