[Q] 10 Minutes or Less for AOSP Builds (distcc anyone?) - Miscellaneous Android Development

Current build time: ~23 minutes
Target build time: <10 minutes
I'm getting to the point where I'm customizing my own ROMs, typically from forks of other ROMs that are *close* to what I like personally. Having been in the systems engineering field for a number of years supporting (primarily Linux-based systems/clouds/clusters), I feel the natural desire to make things go fast(er) - I'll be straightforward here and admit that I'm a ricer.
In the few builds that I have done, I can see that the builds are heavy on the CPU layer. I ran a test build on a beefy Amazon EC2 instance that had the following specs:
1 x r3.8xlarge (Ubuntu 14.04 LTS)
32 CPU threads (E5-2670 v2's @2.50 GHz)
244GB RAM
Building AOSP for the Nexus 6 (shamu), I allocated a little over 200GB of the memory as a ramdisk for the AOSP files and ccache. I ran make with make -j48, but in the end the build took a hair over 23 minutes from scratch. Sure, I didn't have in depth monitoring tools in place when I did the build to log metrics, but I could see pretty clearly from the command line that CPU was the main resource consumer.
I feel that a distributed build setup is the way to go. Have any of you deployed distcc or any other sort of distributed Android build environment?

Side note, the sad story here was that it took almost 30 minutes to rsync AOSP over to the ramdisk from the SSD-backed storage volume that I attached.

Related

Understanding Android platform in a nutshell (in layman's terms)

System.out.println("Hello peoples");
==>The purpose of this guide is to help people who don't know anything about programming,aren't modders,guys with knowledge about technology.
==>Initially I loved computers and their capabilities and have a little knowledge on the C and Java languages and just how computers (think).
You have to understand that computing tries to emulate human behaviors on how to solve problems.This is where programming kicks in.
So what is Android?
===================
1)Android in general is an operating system that was meant to run on mobile devices e.g. cell phones
and has expanded to tablets and notebooks.Android is divided into three language groups:
a)the system's framework and apps are written in java.
b)the Android's core [kernel] is pure C -language.
c)the Android's libraries are written in C++.
App libraries are called by apps that need more functionality that java can't provide.These are usually plugins like
decoders e.g. ffmpeg libraries that are used in video decoding,flash-players e.t.c. Here,java native methods are used and the android NDK platform
is used to enable the java apps call these libraries during execution.(Please read further on Android SDK,NDK and Jni).
So what happens when you have just flashed that new Rom.
====================================================================
First you have to understand that cell phones have their own embedded firmware not including the recovery and you will see why.
a)the recovery partition can be flashed to install aftermarket recovery roms.So even if you mess your recovery,you can still install again.(This varies with different phones).
b)their is that system chip which you cannot touch and there is a reason for it.Think of this partition like a PC BIOS.If you mess with your bios
your system is toast aka Bricking the system.Since phones are classed as embedded systems,manufactures don't want people messing with it as it would result into
cryptic errors and system vulnerability.
when the on button is pressed something simple yet complicated happens:
1==>The kernel which is compressed to save space usually in a (zImage) format is deflated or expanded.Since your NAND chips are partitioned,the kernel is given a very
special chunk in which is protected from user data.
2==>The kernel finds which base address it needs to start executing e.g.(0x00200000) and mostly when you put a wrong kernel base
address your phone enters into a boot-loop because arguments are being passed to invalid locations.It is important to know where your kernel base address
starts.You can try looking it up in your kernel sources(try searching for the mach msm folder and into the makefile) or just goggle it up or use Xda-Kitchen.
3==>with the correct kernel base address,execution starts.Usually the (init.rc) file in the (ramdisk folder) gives symlinks and creates structures i.e. folders that will house
modules and sets the correct paths to android files and framework.
4==>After arguments have successfully been passed,the handles are now passed to Android.Basically Android checks for (init.d) scripts that are available
this is true to GingerBread and Cyanogenmod 7.After that audio checks are done followed by camera services and then arguments are passed to (core.jar),a
critical framework file which is huge around 50Mb in size in CyanogenMod 7.
5==>Here the DalvikVM (Dalvik virtual machine) is called and the process of optimizing your system files starts.The framework get optimized first as this contains critical
code needed to run your device.Then your flingers and renderers and called.This are engines used by android e.g. (pixelflnger) which is used in touchscreen.
Your system's sensors are usually started around this point (your compass,light sensors e.t.c)your phone apps e.g. contacts,calendar get optimized around this point
and depending on the number of apps your manufacturer installed,it will take some time.
6==>your network get's activated around this point and is probably the time the capacitive buttons and lights on your gadget light up.This is usually a good indication
that your system has loaded.When your bootanimation ends the handles are passed to activate your home launcher.There is usually still a lot of activity going
on to fully ready your system and this is why if you try to use it immediately especially on low-end phones,your system lags or get (not responding) errors.
common misconceptions
=====================
1==>There is are reason why goggle gave minimum specifications needed to run Android because this is a full operating system(OS) unlike the past relic
phones that ran on 50MHz processors.
2==>please don't complain that some of your Ram is not the same as specified on the phones catalog(e.g. you have 256Mb of Ram but in the task manager it indicates you have 179Mb).
There is a reason to it.There are core processes that eat a chunk of your Ram and are hidden so that you don't try silly stuff like trying to kill them in thinking that your are trying to get more memory.
Think of it like this,it's like trying to kill (services.exe or svchost.exe) processes in Windows.You will just be trying to get system hangs,bluescreens
or just a system crash.
Will Add more info later.Please feel free to correct anything i might have not addressed properly or share your views.
Happy modding.
Kernel topic
============
In simple terms,the kernel is the core of any operating system e.g. windows, Mac,or any Linux distro like Ubuntu and Android.Android kernels come mostly
in the form of a compressed kernel (zImage).The kernel is written in pure C-language,which gives it direct access to memory and registers unlike java
which has to pass through the java VM(virtual machine).This makes code written in C-language to be very fast and robust but also dangerous.
==>Many Androids in the eclair regime ran on kernel 2.6.29. This was not a complete kernel and as by my experience there was alot of code missing from it.
2.6.29
======
==>a lot of androids did not have adb functionality due to the framework being embedded to allow USB mounting to PC.This was a very rigid method
of doing it(also a very old method).
==>In the case of other devices, when viewing the internal task manager,many processes were viewed as (0.00) byte files.In essence you could not determine
the amount of RAM your app was taking.This is true in the case of huawei u8120.
==>In the case of shutting down the phone,even in some cases under load,it did so very fast.It killed threads and handles mercilessly. Many people misunderstood
this concept and thought their phones ran faster as compared to 2.6.32 kernel.
2.6.32.9
========
Here there was a ton of improvements as developers and modders became more aware of support and tweaks.
==>Fixed issues like the internal task managers.It was now possible to accurately know how much RAM your apps were taking.
==>Resolved how the android system shutdown.Instead of merciless killing of handles and threads still running,it killed them appropriately.This is why
when shutting down your system takes a while.You can use adb to see these events.
==>Usb mounting to pc was also made somewhat generic and flexible across many devices.
==>It did change some methods on how the camera is being accessed mostly in eclair,donout and earlier versions of android phones.This issue made
cameras not function.
==>wifi methods were also changed and developers and modders had to re-write their code to allow compatibility.
==>This was also the year of many froyo phones and overclocking was a common thing.
extra notes
===========
During eclair era,many developers were in such a hurry to produce new android phones every few months that they never even thought of long-term support
to newer versions of android that would come much later.This is where i have to praise Iphones for standardisation of its OS across all of its devices.
==>If you try to copy paste code from 2.6.32.9 and paste it into 2.6.29 and expect it to work then expect tons of errors when compiling.The (g_android)
module was properly coded into 2.6.32.9 so if you try enabling it an older kernel,expect errors.
==>If you are a modder and looking to buy a new phone,please see if it has a fan base of coders or support.Avoid buying phones where you will not get
any support from the manufacturers or other devs e.g Huawei Technologies.This company sucks alot.After they produce a phone they forget about you the customer
so you will have to handle the upgrades all by yourself.
Do not use this command (make -i)when compiling kernels.It will skip errors and you may smile after it's finished but the end is just a tragedy.your kernel is bound
not to function properly or even function at all.
Happy modding.

[DevTOOL][2012-10-01] Fast AAPT (#2) - Speed up Eclipse/apktool/etc

Presenting: Fast AAPT - aka FAAPT
Lately Android development has been getting me down. Slow builds all over the place in many of my app projects, and my PC is blazing fast - it shouldn't all be that slow, even if you're running Eclipse!
Working on DSLR Controller has been driving me mad - testing a minor change in the underlying communications library, then building and launching the app - ugh! So I set out to fix this. I had done all the usual tricks, even gave Eclipse loads more memory (helped with regular performance, but not building) but nothing major seemed to change. Then I figured out most of the time building was spent in AAPT. So I synced my AOSP repo (2012.09.26, took a few minutes), tried to get the Windows SDK to build on my Linux box (took several hours) and finally got to actually mucking with the source.
Found the bottleneck (for my long-build-time projects at least, related to XML file compilation) and fixed it (by introducing a simple cache). "DLSR Controller" build time has gone down from 35 seconds minimum, to 2-3 seconds ( >10 times faster). Hell, I can even turn "Build Automatically" back on without getting constant delays!
Note that my build times quoted only apply to incremental internal builds. If your images still need to be "crushed" (optimized), or you're "exporting" an APK (final build for publication), build time will still be significantly longer. However, during normal development and testing (by far most builds if you're making an app in Eclipse) those stages are not performed, and builds should be lightning fast.
"Fixed" is a big word though, right now it's more of a "hack", and it needs some pollish, so the patch can be submitted to AOSP. I don't want to keep it from you for that long, so my first test build is attached - don't use it in production builds..
Patch code has been submitted to:
AOSP - #1 Cancelled, #2 Review in Progress ... superseded by ctate rewrite
AOKP - #1 Merged, #2 Merged
CM - #1 Cancelled, #2 Merged
Attached ZIP includes Linux, Windows and Mac OS X versions.
The files are drop-in replacement, but I would certainly advise you to backup the originals for your production builds! Also, don't forget to chmod/chown on Linux or it won't work.
Enjoy and leave some feedback
Will this help your project build ?
A quick way to spot if this will have effect on your slow build is as follows:
- In Eclipse, set Build output to Verbose under Window -> Preferences -> Android -> Build.
- Clean and build your project.
- If the build pauses on lines in the "(new resource id <filename> from <filename>)" format, you have the problem FAAPT fixes
(of course, you can also run aapt manually if you know how, you'll get the same output)
In a full framework build the optimizations only affect a very small portion of the actions done during the build, so you won't see any spectaculair speed increases there.
Update (#2)
I have updated the patch code to fix problems with Mac OS X compatibility, I've also included a Mac OS X binary in the new zip file.
-----
( v1: 557 )
So that's what's been killing the speed of my eclipse then. It froze so often I had to switch to AIDE on my phone. Hopefully this'll speed it up a little
Sent from my Galaxy Nexus using Tapatalk 2
Awesome work Chainfire, will play around with this in a few
Thanks brother, gonna give it a try right now on Linux
EDIT:
It works. Gave it three tries. Went consistently around 19.1 sec with FAAPT and 33.9 sec with regular AAPT. This is on the Linux version. Good job
Amazing work. Handles large resources like framework-res a lot faster.
wildstang83 said:
Thanks brother, gonna give it a try right now on Linux
EDIT:
It works. Gave it three tries. Went consistently around 19.1 sec with FAAPT and 33.9 sec with regular AAPT. This is on the Linux version. Good job
Click to expand...
Click to collapse
Glad to hear the Linux version also works! Too bad your increase is not as much as mine, but I guess it depends heavily on the amount and type of assets in your project.
Chainfire said:
Glad to hear the Linux version also works! Too bad your increase is not as much as mine, but I guess it depends heavily on the amount and type of assets in your project.
Click to expand...
Click to collapse
I tested on a theme project of mine, so its heavy in img files. Thats probably why. I'm not complaining one bit. Absolutely love it and can't wait to try it out on my other apps
Chainfire said:
Glad to hear the Linux version also works! Too bad your increase is not as much as mine, but I guess it depends heavily on the amount and type of assets in your project.
Click to expand...
Click to collapse
Indeed, tested the Windows version and yes it gives a very good result with a amount of assets, i got 7 time faster than the normal one on Eclipse, too bad we can't use it for production, we need one for Apktool .
wanam said:
Indeed, tested the Windows version and yes it gives a very good result with a amount of assets, i got 7 time faster than the normal one on Eclipse, too bad we can't use it for production, we need one for Apktool .
Click to expand...
Click to collapse
Actually, I was using it with APKTool.
Thanks CF.
Decompiling recompiling now is so fast
apktool and Maps.apk on Linux.
aapt - 29.2s to compile
faapt - 2.2s to compile
Amazing job, Chainfire!
Tested both in linux and winz....
It works great!
Untested time but noticeble faster.
What's the next step after elite rec dev for chainfire?
Thanks for the great work!
Could you post your tweaked source file(s)?
Would like to compile from source for OS X..
On my projects I see anywhere from 5 to 20x speed increase on build. Thank you for this magic.
I don't get how the google android SDK team did not optimize this (I know they done some crazy optimizations for different stuff).
Yeah, could you post your patch?
berdon said:
Yeah, could you post your patch?
Click to expand...
Click to collapse
He did say in the OP. He had to clean it up for AOSP inclusion, so just wait till you see it on Gerrit
Sent from MIUIAndroid Phone.
That is good stuff. It's gonna save me a whole lot of time. The thanks button was just not enough to thank you!
Thanks for the great work Chainfire.
Whoa! Now I will surely stop cursing while waiting for xenoAmp tu build (just not enough time to say "kurwa!")! Thank you!
I'am using ArchLinux x64 as my normal desktop OS. (Yes, I don't have Windows.) and tried it with Eclipse, but it don't work.

Porting Chromium to Windows RT

So, I've been at this for about 48 hours now (not continuously, but closer than you might think) and I figured I should take a break from modifying project files and puzzling over alignment issues to discuss the project, share some of the problems I've been having and ask if anybody can help, and so on.
The general idea is "Chromium build for Windows (on x86/x64) and build on ARM (for Linux), so there must be a way to build it for Windows on ARM". For the most part, that even looks like it's true. Probably at least 80% of 654 Visual Studio projects (no, that's not a joke) either build just fine with only minor amounts of work, or are things that we don't actually need (I'll try building the test suites... once everything else builds!!)
Areas that have given me problems (caution: some chance of brief rants ahead):
v8. Less than you might think, though. Setting the flags for Arm seems to have been enough.
Sandbox. There's a fair bit of thunking coded in assembly going on in the sandbox for x86. Not sure what's up with it (I don't know exactly how the Chromium sandbox works) but it'll have to come out or be replaced. The Linux (including ARM) sandbox seems to be SELinux-based, which doesn't help at all.
Native Client (NaCl). I think all the assembly is in test code, though, so I may just boldly #ifdef if all away.
libjpg-turbo (libjpg). Piles of carefully optimized assembly... for x86 and x64. There is a set of ARM assembly (for Linux) that Visual Studio won't compile, but something else might... or I may tweak until it works. Of course, I could also just accept the speed hit and use the version of libjpg implemented in nice, portable C.
Anything where the developers tried to use some SSE to speed things up. I may be able to replace it with NEON code, or I may just remove it and hope **** doesn't break. We'll see.
Inline assembly in general. Even when it's ARM assembly, Visual Studio / CL.exe don't want anything to do with it (__asm is apparently now an invalid keyword). I suspect I'll have to just pull the assembly out into stand-alone functions in their own files, then compile them to object files and link them back in later. If I can figure out the best way to do this (for example, I'll want to inline the asm functions) then it shouldn't impact performance. Seriously though, I kind of hate inline assembly. I can read assembly just fine, but I'm usually staring at it in a debugger or disassembly tool, not in the middle of source code I'm trying to build...
Everywhere that the current state of the CPU is cared about (exception and crash handlers, in particular) because the CONTEXT structure is, of course, CPU-specific. They're pretty easy to get past, though.
Low-level functions, like MemoryBarrier. Fortunately, it's implemented in ntdll.h... but as a macro, which breaks at least half the places it's referenced. Solution: where it breaks things, undefine the macro and just have it be an inline function that does what the macro did.
Running out of memory. Not even joking... well, OK, a little bit. I've got 32GB; I won't actually run out. Both Visual Studio and cl.exe do at times, though!. Task Manager says VS is currently using 1,928 MB, and before I restarted it, it broke 2.5GB private working set. Pretty good for a program that for some reason is still 32-bit...
Goddamn compiler flags. Seriously, every single project (I mentioned there are over 600, right?) has its LIBPATHs hardcoded to point at x86. Several projects have /D:_X86_ or similar (that's supposed to be set by the build tools, not the user, you idiots...) which plays merry hell with the #ifdef guards. Everything has /SAFESEH specified, not in the actual property table where the IDE could have removed it (unneeded and invlaid on ARM) but in the "extra stuff we'll pass on the build command line" field, which means every single .EXE/.DLL project must be modified or the linker will fail.
My current biggest goal is the JPG library; nobody wants to use a browser without it. After that, I'll tackle the sandbox, leaving NaCl for last... well, last before whatever else crops up.
Anyhow, thoughts/comments/advice are welcome... in the mean time, I'm going to go eat something (for the first time in ~22 hours) and then get some sleep.
Kudos for having the patience to look though this monster.
It's my understanding that NaCl is still a pretty niche thing at the moment. Is it possible to easily either disable it or completely hack it out, or do other more critical parts of Chromium now depend on it?
I don't think anything truly depends on it. I'll look in the VS dependency hierarchy and see how many things list it, and how awful it would be to remove them.. after I get the other stuff working. I may pass on the sandbox as well, if possible; it makes the security guy in me cringe something awful, but as they say, shipping is a feature..
great
Please make that happen !
Working on it! I've gotten over half of the projects to build and link, but some other stuff is adamantly refusing to work. I'm beginning to suspect I'll need to work from the other direction - rather than starting at the bottom and building all the dependencies, then combining them into browser components, and then eventually combining all the components into a complete piece of software, I may have to work from the top, removing components until the whole thing builds (at which point it will likely be useless, or all-but) and then seeing what I can add back in. I thought it would be faster to just assume everything can be made to work and only exclude something if it proved intractable, but at this point I've got a ton of very small components and almost no ability to combine them.
It would also help if VS was better at managing such truly immense tasks. For example, I have no simple graph of what all is and is not building, so I'm being forced to manually map that onto the VS dependency tree and see what is blocking a given component from building successfully, and how much is dependent upon it, one erroring project at a time (and there are a *lot* of erroring projects - my last attempt to build any substantial part of the system saw 50 of 400 projects fail).
GoodDayToDie said:
Working on it! I've gotten over half of the projects to build and link, but some other stuff is adamantly refusing to work. I'm beginning to suspect I'll need to work from the other direction - rather than starting at the bottom and building all the dependencies, then combining them into browser components, and then eventually combining all the components into a complete piece of software, I may have to work from the top, removing components until the whole thing builds (at which point it will likely be useless, or all-but) and then seeing what I can add back in. I thought it would be faster to just assume everything can be made to work and only exclude something if it proved intractable, but at this point I've got a ton of very small components and almost no ability to combine them.
It would also help if VS was better at managing such truly immense tasks. For example, I have no simple graph of what all is and is not building, so I'm being forced to manually map that onto the VS dependency tree and see what is blocking a given component from building successfully, and how much is dependent upon it, one erroring project at a time (and there are a *lot* of erroring projects - my last attempt to build any substantial part of the system saw 50 of 400 projects fail).
Click to expand...
Click to collapse
I thinkt tht is a mutch better taktic and mutch less frustrading.
I would love to see just a minimal version of it. After that all the small componens can follow.
50 of 400 is pretty good i think. Better then i expected
Bear in mind that the entire thing is 650 projects. If 50 fail at that level, many of the higher-level ones (dependent upon the lower-level) will fail too. I'll see what I can do. I may or may not be able to get v8 actually working (without it, the JS speed will be very bad, think IE8 at best) and I may have to fall back to the legacy libjpeg (which will cut JPEG render speeds by at least a factor of 2). Skia (2D drawing library used by Chrome) has a bunch of assembly optimizations that I need to get it to use the Arm version of instead. There's a couple of total hacks with the library files I've had to pull, which may or may not result in a working final build. We'll see.
GoodDayToDie said:
Bear in mind that the entire thing is 650 projects. If 50 fail at that level, many of the higher-level ones (dependent upon the lower-level) will fail too. I'll see what I can do. I may or may not be able to get v8 actually working (without it, the JS speed will be very bad, think IE8 at best) and I may have to fall back to the legacy libjpeg (which will cut JPEG render speeds by at least a factor of 2). Skia (2D drawing library used by Chrome) has a bunch of assembly optimizations that I need to get it to use the Arm version of instead. There's a couple of total hacks with the library files I've had to pull, which may or may not result in a working final build. We'll see.
Click to expand...
Click to collapse
the v8 engine ( used in nodejs ) has been ported to ARM :
I still can't link : htt p://ww w.it-wars.com/article305/compiler-node-js-pour-arm-v5
perhaps it will help you
Edit : oups, I just see that another great user of this forum made the port of nodejs to RT
Yep... but they did it without v8. That's not an encouraging result, but I feel like I'm so close...
Is there a GitHub repo so we can help or track the progress of the project ?
Sorry, not at present. There probably should be. The sheer size of the codebase is incredible (about 2.4GB) and having some way to share it practically would be good.
Also, I suspect this would go a lot faster if I don't have to repeat the work of others. I know that there's a working Webkit DLL out there, for example (though with several features, including the V8 JS engine, missing) and if I could get my hands on that it would drastically reduce the number of additional components I need to build. Currently I'm working on the sandbux, but expect that I will need to rip the whole thing out and basically have the browser run as though it was always passed the --no-sandbox parameter, at least for the first build. Too damn much assembly.
http://www.engadget.com/2013/01/22/google-chrome-native-client-arm-support/
This wouldn't have any impact on this project, would it?
Sent from my SCH-I535 using xda-developers app, complete with annoying signatures.
It probably means that NaCl on Windows RT will be possible in the future. At present, I'm cutting it out of the build - too much x86-specific stuff there to port it over myself, and it owuldn't be able to run x86-compiled NaCl code anyhow.
You might have bit off more than you could chew. It'd better if you put your current progress under version control on some public site so that other people may be able to help you.
It's a big and complex project. You are taking a lot of time, and understandably so. But just open up to other people and you could get this done faster.
Yeah, this is probably true. My life also got unexpectedly *busy* in the last week; a couple weeks ago I had many times as much free time as I do now, and so porting has slowed down.
My upload speed would take ages (literally probably at least a day of solid activity; it's embarassingly slow) to push the full source anywhere, but I may make the effort anyhow. I'll have to post it somewhere for GPL compliance in any case...
You may upload only the diff files, they'll probably be smaller then the whole distribution.
Not to pour cold water on you however, IE10 is already faster than the latest Chrome build in Windows Phone, Windows 8.
I don't see the point of this.
I have personally jumped from IE8 > FF > Chrome and finally back to IE10 over the years depends on its usability, smoothness, speed, etc
Speed isn't the only reason to use a browser. I actually prefer IE myself, but there are some things that other browsers do better than it (in the case of Chrome, parts of HTML5, the syncing across Google services, etc.) Also, Chrome gets updated far more often than IE; IE9 was equal with Chrome on speed at its release, and was far behind by the time IE10 came out.
The reason for this project, though, is a mixture of interest in what it takes, and a desire to benefit the community. Microsoft has deeped that only software which they have blessed may run on the Windows RT desktop. I disagree, and have chosen (among several other things) to port a web browser because I feel that it's important for users to have choice.
LastBattle said:
Not to pour cold water on you however, IE10 is already faster than the latest Chrome build in Windows Phone, Windows 8.
I don't see the point of this.
Click to expand...
Click to collapse
Some websites do not get along with the trident rendering engine. Some webdevs are so "Oh f*** IE I don't care" and block access to features just because it is IE. I have experienced this first hand on IE10 on my surface where it tells me to come back when I have a decent browser, only to not have the choice to do that.
This really isn't the webdevs fault either, for years IE was the scum of the internet, only recently has IE caught up to the rest of the browsers (and in my opinion exceeded some) but the years of IE being bad have left a lot of disjointed webdevs who won't even consider giving the latest IE a chance.

[app port] [1/23] dosbox with thumb2 dynrec

hi all,
I've updated DOSBOX's dynrec core to take advantage of the THUMB2 ARMv7 instructions that must be supported on Windows RT. Among other things, this includes being able to move immediates to registers without using a PC relative load, 32-bit instructions that allow access to all registers, etc. I also added a PLI instruction to preload a cacheblock into the instruction cache before it is executed, and PLD instructions to prefetch data for a minor speedup.
Testing on doom timedemo, I get 3721 realtics, which is a modest 5% speedup from the standard dynrec core. Probably not having directX is a significant bottleneck on video drawing, this can get fixed once I get a wrapper around directinput working. Hopefully these changes can also help out mamaich's x86 pe loader out a well.
-- old post below--
I managed to get dosbox compiled with mamaich's new updates (see below), so we now have dosbox with the dynamic recompiling core. I also included network support which seems to work.
The dynamic recompiling core does give better performance for newer apps. 3982 realtics vs 18088 realtics on doom timedemo, so about a 5x gain
The next step would be to get a better graphics core. There is an opengl wrapper for DX11 called gl2dx that I'm looking into.
I'm also looking at putting at optimizing dynrec for thumb2, for example, it seems that CBZ might be useful (except it only lets you branch forward!!!)
Hmm, I have to unfortunately note that DOS4GW doesn't seem to work in this build. I'm working on it!
In the mean time, enjoy commander keen 1 -6?...
no2chem said:
Hmm, I have to unfortunately note that DOS4GW doesn't seem to work in this build. I'm working on it!
In the mean time, enjoy commander keen 1 -6?...
Click to expand...
Click to collapse
So in the end, I think the dynamic recompiler has some issues.
I know that the VirtualAlloc function is correctly setting execute permissions on memory pages, but I don't know if FlushInstructionCache is actually flushing the instruction cache. It's also possible that the dynamic recompiler is emitting bad assembly, but inspecting the code... doesn't look like it!
no2chem said:
So in the end, I think the dynamic recompiler has some issues.
Click to expand...
Click to collapse
The same issue for me (but I'm using a bit obsolete DosBox sources, need to update). May be this is due to EABI changes (for example registers are not preserved), as DosBox dynamic core targets Linux.
Regarding DosBox. Look here - http://vogons.zetafleet.com/viewtopic.php?t=31787
This is the ARMv7 optimized dynamic core for DosBox by M-HT. It is not yet integrated into the official DosBox branch.
If you're concerned about FlushInstructionCache not being implemented correctly, let me know. As part of my work to port Chromium, I've learned how to do cache management in ARM a little closer to the metal (moving the relevant values to the coprocessor registers directly) and have both more control (for example, clearing the data cache lines as well) and more precise control over exactly what is done to the cache lines.
However, bear in mind that I haven't been able to properly test this yet, as too much of the Chromium core still doesn't want to compile for me... *sigh*.
GoodDayToDie said:
If you're concerned about FlushInstructionCache not being implemented correctly, let me know.
Click to expand...
Click to collapse
As far as I see - FlushInstructionCache works as expected (flushes icache). DosBox crashes due to some other reason, maybe due to the ARM-THUMB interworking problems (it is just a guess).
Good to know. It would be annoying if the official API were wrong...
I've found whu DosBox dynamic core crashes at random.
All existing dynamic cores use ARM code, even the thumb files are using the ARM prologue. Though our CPUs can run the ARM code, there is a bug (?) in RT that switches ARM code to be executed as a THUMB at random. For example you call an ARM procedure 100000 times, but at 100001 something happens and the code starts to execute as THUMB. Maybe a task switch occurs, or maybe some interrupt happens, and when OS returns to your code - it sets T bit in CPSR, so the code is interpreted as THUMB.
I was able to modify the prologue generator in risc_armv4le-thumb-iw.h so that it generates only THUMB code, and now everything started to work in my project. Attached. Code is ugly and unfinished and uses some dirty C hacks, I'll fix them later as it is deep night now. But you may look on the changes and implement similar things in a DosBox build.
mamaich said:
I've found whu DosBox dynamic core crashes at random.
All existing dynamic cores use ARM code, even the thumb files are using the ARM prologue. Though our CPUs can run the ARM code, there is a bug (?) in RT that switches ARM code to be executed as a THUMB at random. For example you call an ARM procedure 100000 times, but at 100001 something happens and the code starts to execute as THUMB. Maybe a task switch occurs, or maybe some interrupt happens, and when OS returns to your code - it sets T bit in CPSR, so the code is interpreted as THUMB.
I was able to modify the prologue generator in risc_armv4le-thumb-iw.h so that it generates only THUMB code, and now everything started to work in my project. Attached. Code is ugly and unfinished and uses some dirty C hacks, I'll fix them later as it is deep night now. But you may look on the changes and implement similar things in a DosBox build.
Click to expand...
Click to collapse
I'll take a look at getting that implemented in my build.
Edit: Haven't looked into it much, but DosBox crashes whenever I open any 32-bit stuff.
mamaich said:
I've found whu DosBox dynamic core crashes at random.
All existing dynamic cores use ARM code, even the thumb files are using the ARM prologue. Though our CPUs can run the ARM code, there is a bug (?) in RT that switches ARM code to be executed as a THUMB at random. For example you call an ARM procedure 100000 times, but at 100001 something happens and the code starts to execute as THUMB. Maybe a task switch occurs, or maybe some interrupt happens, and when OS returns to your code - it sets T bit in CPSR, so the code is interpreted as THUMB.
I was able to modify the prologue generator in risc_armv4le-thumb-iw.h so that it generates only THUMB code, and now everything started to work in my project. Attached. Code is ugly and unfinished and uses some dirty C hacks, I'll fix them later as it is deep night now. But you may look on the changes and implement similar things in a DosBox build.
Click to expand...
Click to collapse
thanks for figuring that out! its a bit odd though, don't you think - wouldn't normal ARM apps crash if this was the case? I'm sure most stuff is compiled with interworking, maybe ms interworking convention is different?
Anyway, I was thinking, doesn't homm3 use directx? Is your emulation layer also doing conversion of that code? It seems like it would be useful to have native wrappers for old versions of DX so they can take direct advantage of DX11 native. Alas, that's for another day.... Also, mfc would be nice too... (its there... MS just didn't provide the .lib, ordinals are [noname]'d out, and .pdb from debug server seems to be missing some exports....)
no2chem said:
hi all,
I managed to get dosbox compiled with mamaich's new updates (see below), so we now have dosbox with the dynamic recompiling core. I also included network support which seems to work.
The dynamic recompiling core does give better performance for newer apps.
The next step would be to get a better graphics core. There is an opengl wrapper for DX11 called gl2dx that I'm looking into.
I'm also looking at putting at optimizing dynrec for thumb2, for example, it seems that CBZ might be useful (except it only lets you branch forward!!!)
Click to expand...
Click to collapse
How big performance gains are you talking about?
bartekxyz said:
How big performance gains are you talking about?
Click to expand...
Click to collapse
Eh, 3982 realtics vs 18088 realtics on doom timedemo, so about a 5x gain
Nice! That should (eventually) allow quite a few legacy apps, if they're old enough and/or not to performance-sensitive (i.e. not realtime) to run acceptably.
no2chem said:
doesn't homm3 use directx? Is your emulation layer also doing conversion of that code?
Click to expand...
Click to collapse
No conversion, I just pass DX to the native OS libraries with some minimal wrappers. RT provides even DirectX 1.0 interfaces
The only problem is that RT prohibits non 32-bit color modes, though video driver supports them (at least 16 bit color is supported by tegra driver).
no2chem said:
I managed to get dosbox compiled with mamaich's new updates
Click to expand...
Click to collapse
I'm using ARM/Tumb dynamic code created by M-HT: http://vogons.zetafleet.com/viewtopic.php?t=31787
I've changed just one function.
I'm also looking at putting at optimizing dynrec for thumb2, for example, it seems that CBZ might be useful (except it only lets you branch forward!!!)
Click to expand...
Click to collapse
Please post your code changes somewhere, so that I could use them in my project too
don't you think - wouldn't normal ARM apps crash if this was the case? I'm sure most stuff is compiled with interworking, maybe ms interworking convention is different?
Click to expand...
Click to collapse
NTARM kernel is a THUM2-only kernel. There is no interworking it it. So it is possible that MS code intentionally sets the T bit to THUMB when returning from interrupt or a task-switch. Unfortunately this means that we can't use ARM code in our programs (I have not seen any ARM code in MS progs, only THUMB2). Anyway, THUMB2 code is very efficient now, its speed can be the same as of ARM code, and size is much smaller.
Generally, THUMB2 should outperform ARM, yes. Also, the state of Thumb vs. Arm is actually set in a kind of weird way: rather than manually setting a processor bit, it's controlled by the jump address. Since even THUMB2 instructions are always 16-bit (2-byte) aligned, the least significant bit of an instruction address is never 1. Therefore, that bit is used (on any jump-type instruction) as a flag to indicate whether the processor should use ARM or THUMB2 mode. In theory, this should mean that the OS doesn't care about the mode being used... but I've never written an interrupt handler for ARM (a couple other platforms, but not ARM) and there might be something weird going on with them.
GoodDayToDie said:
rather than manually setting a processor bit, it's controlled by the jump address
Click to expand...
Click to collapse
This is true for normal jumps. And that is why we have to set lowest bit to 1 manually on indirect branches inside autogenerated THUMB code. But for CPU itself the mode is kept in T bit in CPSR. And "something" sets it to 1 "sometimes". I have not done much testing, but I think that this should be an interrupt, as I've seen that bit changed in the middle of a generated ARM code.
Anyway the THUMB2 code is better then ARM for us - it is more compact, so more code may be kept in cache (both in CPU and DosBox dynamic recompilator's).
updated with a thumb2 dynrec.
Touch input for mouse
For anyone that wants to use their touchscreen for mouse input, navigate to their %AppData%\Local\DosBox folder and modify the dosbox-0.74.conf file so that "Autolock=true" is "Autolock=false"
It works pretty well with the touchscreen and an on-screen keyboard for gaming and such

[DEV][ROM] Help with Building Vanilla Android 13 GSI with 16GB RAM and 24GB Swap!!

Hey XDA community!
(I wasn't sure what tags to type and where to post pardon me. I had read the posting guides...)
I'm reaching out to you all for help with building Vanilla Android 13 GSI on a system with limited RAM. Unfortunately, I'm unable to upgrade my RAM due to various constraints. Here's a breakdown of the issue I'm facing:
Memory Consumption and Terminal Closure: When I start the build process, everything seems to work fine initially. However, after approximately 3 minutes, the terminal abruptly closes itself. During this time, the memory consumption reaches its peak, utilizing nearly all available RAM. Next the terminal to close unexpectedly.
High RAM and CPU Usage: Throughout the brief duration that the build process runs, the RAM and CPU usage remain consistently high. This behavior contributes to the subsequent closure of the terminal.
Limited Swap Usage: Despite having a sizable swap space of 24GB, the swap usage remains within 7.5GB limit. It doesn't exceed this threshold and the terminal closure occurs.
Given the constraints preventing me from upgrading the RAM, I'm seeking your expertise to find alternative solutions or workarounds for this issue. I'm open to suggestions, such as optimizing the build environment, modifying specific configuration parameters, or implementing any strategies to stabilize the build process within the limited resources available.
Your valuable insights, experiences, and troubleshooting suggestions would be greatly appreciated. Together, let's explore different avenues and find a way to successfully build the Vanilla Android 13 GSI on this system configuration and if its useful to know I'm on Ubuntu 23.04 LTS and 8GBx2 DDR4 2400Mhz RAM.
Thank you for your support and contributions!
Best regards,
FiniteCode

Categories

Resources