Related
Ok, I've seen numerous questions about the app called HTC Performance & have disassembled the executable. While my knowledge of these things is by no means great, I have found some very interesting functions.
Maybe someone with more reverse engineering & code experience can take a look, but with IDA Pro there are some very interesting functions & strings.
Some of the calls & code are deprecated & no longer used in WM6+, but some still are.
It is possible, especially for eVB-equipped ROMs, that this program acts like a server of sorts for some programs & processes. But since it is initialized with Smartphone-only functions, I doubt it.
Some of the more interesting functions in the HTC Performance app are:
SHInitExtraControls = appears to be Smartphone-only
GetSystemMetrics = WM6 Pro valid - gets the system width & height in pixels. Possible uses include program optimization based on the returned pixel values
CreateMutexW = coredll - used to connect to the core via .NET CF for obtaining device info. eVB-related apps usually use this to call coredll info
memmove = more expensive than memcpy, but may be used to keep Unicode strings off odd (unaligned) memory addresses; this could increase speed in apps that otherwise get this wrong
InterlockedCompareExchange, InterlockedDecrement, InterlockedExchange, InterlockedExchangeAdd, and InterlockedIncrement = functions provide a simple mechanism for synchronizing access to a variable that is shared by multiple threads. The threads of different processes can use this mechanism if the variable is in shared memory.
InterlockedCompareExchange = function performs an atomic comparison of the Destination value with the Comperand value. If the Destination value is equal to the Comperand value, the Exchange value is stored in the address specified by Destination. Otherwise, no operation is performed
YAXPAX = looks like a fragment of an MSVC name-mangled C++ symbol (a __cdecl function taking a void pointer and returning void) rather than an API call in its own right
ReleaseMutex = releases ownership of a mutex so that other threads can acquire the shared resource
EnumWindows = (..) to execute a task. EnumWindows (..) enumerates through all existing windows (visible or not) and provides a handle to all of the currently open top-level windows. Each time a window is located, the function calls the delegate (named IECallBack) which in turn calls the EnumWindowCallBack (..) function and passes the window handle to it. Not sure how this is used though.
LoadAcceleratorsW = ??? Appears to be old CE function. Deprecated???
realloc = resizes an existing heap allocation (often used when growing strings)
malloc = heap (RAM) allocation
GetDeviceCaps = gets device capability info; can be used to optimize redraw based on device constraints already known
LocalReAlloc = This function changes the size or the attributes of a specified local memory object. The size can increase or decrease.
EnterCriticalSection = The threads of a single process can use a critical section object for mutual-exclusion synchronization. The process is responsible for allocating the memory used by a critical section object, which it can do by declaring a variable of type CRITICAL_SECTION. In short, it grants a thread exclusive access to shared memory (a rough sketch of these synchronization primitives follows this list)
ReleaseDC = This function releases a device context (DC), freeing it for use by other applications. The effect of ReleaseDC depends on the type of device context.
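Since several of the entries above (the Interlocked* family and EnterCriticalSection) are about thread synchronization, here is a rough Python sketch of their semantics - purely an illustration of what the calls do, not the actual coredll API:
Code:
import threading

def interlocked_compare_exchange(dest, exchange, comperand):
    # InterlockedCompareExchange semantics: if dest[0] equals comperand, store
    # exchange there; either way, return the value dest[0] held beforehand.
    # On real hardware this happens as one uninterruptible operation.
    old = dest[0]
    if old == comperand:
        dest[0] = exchange
    return old

# EnterCriticalSection / LeaveCriticalSection behave much like a process-local lock:
shared = [0]
section = threading.Lock()
with section:        # EnterCriticalSection(&cs)
    shared[0] += 1   # exclusive access to the shared data
                     # LeaveCriticalSection(&cs) happens when the block exits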
Again, I am not a programmer. I know a few things, & am pretty competent with the lower operations of firmware, but the rest of the CE code is not my cup of tea. There are many more functions in HTC Performance; these are only a few found after a brief 20-minute peek.
But maybe, maybe, some of the function calls can help us to understand whether this app can be modified to properly function on the Kaiser.
It is possible that with some eVB-enabled apps parts of the HTC Performance app are retained & still function, but that is pure speculation, & again I doubt it.
Any CE code experts out there wanna take a look? I have, & based on what I've seen, I'll have to say FICTION!
Info
Hi,
Since I haven't really had time to see what's new and all, I haven't the foggiest idea what HTC Performance is/what it is supposed to do.
But I can tell you that the functions you listed are not special in any way. Most of them would appear in every application that displays anything on the screen. For instance getting system metrics is required for any application displaying scroll bars, etc. All the interlocked and critical section stuff is just thread synchronization.
But that's OK, the use of Windows APIs really doesn't mean much, other than that the application runs on Windows... it's the non-API stuff that defines an application. If the application you're looking at writes changes to registry keys, etc. you may want to look into that, as those would be the lasting changes to the device.
Cheers,
Why is there concurrency related stuff in there? Surely that should all be handled by the operating system, rather than a running application? (That said, most of my concurrency knowledge is either theoretical or based at a high level, so I could be wrong here).
High Performance Cab
You can also check this thread...
http://forum.xda-developers.com/showthread.php?t=366792
Quentin- said:
Hi,
Since I haven't really had time to see what's new and all, I haven't the foggiest idea what HTC Performance is/what it is supposed to do.
But I can tell you that the functions you listed are not special in any way. Most of them would appear in every application that displays anything on the screen. For instance getting system metrics is required for any application displaying scroll bars, etc. All the interlocked and critical section stuff is just thread synchronization.
But that's OK, the use of Windows APIs really doesn't mean much, other than that the application runs on Windows... it's the non-API stuff that defines an application. If the application you're looking at writes changes to registry keys, etc. you may want to look into that, as those would be the lasting changes to the device.
Cheers,
No, the registry would not necessarily be the place to look. For this application the registry will only report whether or not the app is running. It is supposed to be a speed optimization application. My thoughts were that it could possibly be acting as a server of sorts, handling some thread optimization & resource allocation. Correct though, most of those APIs are importing device info; beyond that, I am lost as to how it handles it, if it does at all. That said, there are many things that don't show up in the registry & many things can't be altered via the registry because they are set or handled before initialization or loading of the registry, possibly through the OAL. Even tougher to say in a two-chip device with as little known info as the msm7xxx processors. If anyone with real coding knowledge could take a look at the executable & see just what it's doing with the info, that would be great.
dperren said:
Why is there concurrency related stuff in there? Surely that should all be handled by the operating system, rather than a running application? (That said, most of my concurrency knowledge is either theoretical or based at a high level, so I could be wrong here).
That is indeed the center of my question & also what leads me to question how the app functions. Is it playing a role in thread-priority optimization, & possibly redraw based on the polls, or is it just a partially gutted application missing a ton of registry data that never worked?
I made a kitchen that creates ROMs in both XPR and LZX compressions.
I'm also trying to port it to the Artemis, the Trinity, and the Hermes.
The way the kitchen works is:
Run "RunMe.bat"
Choose compression algorithm. (XPR or LZX)
Follow the normal Bepe's kitchen process.
Wait as the kitchen creates the ROM (like Bepe's kitchen, but with whatever compression you chose.)
The kitchen will automatically open up the imgfs.bin in a hex editor and automatically adjust it for the wanted compression before it builds the ROM.
It automatically inserts the proper XIP drivers.
It will automatically set the Pagepool to 4MB but give you the option to change it to something else as it does.
It then automatically creates the NBH and then finally launches whatever flasher (CustomerRUU, FlashCenter, or whatever your devices use) to flash the ROM.
For those who don't know what LZX compression is:
It's a compression algorithm that, although slower (by 1-4% in real-life use), frees up a good amount of storage space. In some cases (like the Herald) it makes the ROM so small that it has to be flashed through an SD card due to the Herald's flashing size requirements. On an average 50 MB ROM, it takes off about 10 MB. The actual cooking itself does take a LOT more CPU and RAM on your PC, though. Especially the RAM. (It's because the tools that actually do the compression weren't really optimized for the job.)
Anyhow, let me know if you want it.
I would love to use it if you're willing to share. I really want to get into cooking. I just don't have time these days.
Here is a bit that one of my buddies wrote on compression, for those interested. He's actually discussing LZMA (which is what 7-Zip uses by default), which uses an LZ77 (LZ1)-style dictionary, the same kind of dictionary used in LZX compression algorithms. This should help people to understand what settings to use depending on the situation.
Hehe. I've long been fascinated by compression algorithms ever since studying the original Lempel-Ziv algorithm some years ago.
The key here is the dictionary, which, in a simplified nutshell, stores patterns that have been encountered. You encounter pattern A, compress/encode it, and then the next time you see pattern A, since it's already in the dictionary, you can just use that. With a small dictionary, what happens is that you'll encounter pattern A, then pattern B, C, D, etc., and by the time you encounter pattern A again, it's been pushed out of the dictionary so that you can't re-use it, and thus you take a hit on your size. All compression algorithms basically work on eliminating repetition and patterns, so being able to recognize them is vital, and for dictionary-based algorithms (which comprise most mainstream general-purpose algorithms), a small dictionary forces you to forget what you saw earlier, thus hurting your ability to recognize those patterns and repetitions; small dictionary == amnesia.
Solid compression is also important because without it, you are using a separate dictionary for every file. Compress file 1, reset dictionary, compress file 2, reset dictionary, etc. In other words, regular non-solid compression == amnesia. With solid compression, you treat all the files as one big file and never reset your dictionary. The downside is that it's harder to get to a file in the middle or end of the archive, as it means that you must decompress everything before it to reconstruct the dictionary, and it also means that any damage to the archive will affect every file beyond the point of damage. The first problem is not an issue if you are extracting the whole archive anyway, and the second issue is really only a problem if you are using it for archival backup and can be mitigated with Reed-Solomon (WinRAR's recovery data, or something external, like PAR2 files). Solid archiving is very, very important if you have lots of files that are similar, as is the case for DirectX runtimes (since you have a dozen or so versions of what is basically the same DLL). For example, when I was compressing a few versions of a DVD drive's firmware some years ago, the first file in the archive was compressed to about 40% or so of the original size, but every subsequent file was compressed to less than 0.1% of their original size, since they were virtually duplicates of the first file (with only some minor differences). Of course, with a bunch of diverse files, solid archiving won't help as much; the more similar the files are, the more important solid archiving becomes.
The dictionary is what really makes 7-Zip's LZMA so powerful. DEFLATE (used for Zip) uses a 32KB dictionary, and coupled with the lack of solid archiving, Zip has the compression equivalent of Alzheimer's. WinRAR's maximum dictionary size is 4MB. LZMA's maximum dictionary size is 4GB (though in practice, anything beyond 128MB is pretty unwieldy and 64MB is the most you can select from the GUI; plus, it doesn't make sense to use a dictionary larger than the size of your uncompressed data, and 7-Zip will automatically adjust the dictionary size down if the uncompressed data isn't that big). Take away the dictionary size and the solid compression, and LZMA loses a lot of its edge.
In the case of the DirectX runtimes, because of their repetitive nature and the large size of the uncompressed data, a large dictionary and solid compression really shine here, much more so than they would in other scenarios. My command line parameters set the main stream dictionary to 128MB, and the other much smaller streams to 16MB. For the 32-bit runtimes, where the total uncompressed size is less than 64MB, this isn't any different than the Ultra setting in the GUI, and 7-Zip will even reduce the dictionary down to 64MB. For the 64-bit runtimes, the extra dictionary size has only a modest effect because 64MB is already pretty big, and also because the data with the most similarity are already placed close to each other (by the ordering of the files). The other parameters (fast bytes and search cycles) basically make the CPU work harder and look more closely when searching for patterns, but their effect is also somewhat limited. In all, these particular parameters are only slightly better than Ultra from the GUI, but I'd much rather run a script than wade through a GUI (just as I prefer unattended installs over wading through installs manually).
Oh, and another caveat: Ultra from the GUI will apply the BCJ2 filter to anything that it thinks is an executable. This command line will apply it to everything. Which means that this command line should not be used if you are doing an archive with lots of non-executable code (in this case, the only non-executable is a tiny INF file, so it's okay). This also means that this command line will perform much better with files that don't get auto-detected (for example, *.ax files are executable, but are not recognized as such by 7-Zip, so an archive with a bunch of .ax files will do noticeably better with these parameters than with GUI-Ultra). If you want to use a 7z CLI for unattended compression but would prefer that 7-Zip auto-select BCJ2 for appropriate files, use -m0=LZMA:d27:fb=128:mc=256 (which is also much shorter than that big long line; for files that do get BCJ2'ed, it will just use the default settings for the minor streams).
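To make the "small dictionary == amnesia" point concrete, here's a toy Python sketch of the core LZ77-style step that DEFLATE, LZX and LZMA are all built on: finding the longest earlier match inside a sliding window. Shrink the window and the matcher simply forgets data it saw earlier. This is only an illustration, not any real codec:
Code:
def longest_match(data, pos, window_size):
    """Find the longest match for data[pos:] within the last `window_size` bytes."""
    best_len, best_dist = 0, 0
    for cand in range(max(0, pos - window_size), pos):
        length = 0
        # LZ77 matches may run past `pos`, which is how long runs get encoded cheaply
        while pos + length < len(data) and data[cand + length] == data[pos + length]:
            length += 1
        if length > best_len:
            best_len, best_dist = length, pos - cand
    return best_len, best_dist   # an encoder emits (distance, length) instead of the raw bytes

data = b"abcabcabcXabcabc"
print(longest_match(data, 10, 1024))  # (6, 10): a big window still remembers the "abcabc" at the start
print(longest_match(data, 10, 2))     # (0, 0): a 2-byte window has already forgotten it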
dumpydooby said:
I would love to use it if you're willing to share. I really want to get into cooking. I just don't have time these days.
Here is a bit that one of my buddies wrote on compression, for those interested. He's actually discussing LZMA (which is what 7zip uses by default), which is similar to the LZ1 dictionary (the same dictionary used in LZX compression algorithms). This should help people to understand what settings to use depending on the situation.
You have no idea how much the geek in me loved this post. ^_^ I have a thing about compression, too, but I'm just not very well versed in it. I only know the basic details of compression.
Answered 2 of my questions I tried finding out the other day. Thank you guys. I learn so much here.
The problem
Since HTC introduced Sense 3.5, themers faced a huge problem. The previously used software "M10Tools" wouldn't work with the new version of Sense.
Flemmard and I spent countless hours trying to decode the new image format, but without any success. The new image format is totally different from anything previously seen.
I made this thread to search for help from all the awesome devs on XDA, hoping that we might find one who can help.
The history
Let me start this with some introduction to the m10 format itself.
The images I am talking about are parts of one big file - the m10 file. We usually have multiple images per m10 file, but the number doesn't really matter.
Together with the raw image data we get a set of meta information. We are not exactly sure what the values mean, but we can guess the meaning from the history of the old, decodable images.
We used to have information like width, height and payload of the image, plus an integer indicating what kind of image type we have. We know the actual image type for a few of these integers, but with Sense 3.5, 3.6 and 4.0 HTC added at least two new types.
The facts
We don't have any hard facts for these image types but looking at the "old" image types, we can guess a few things:
The images are in a format the GPU can render directly (Like s3tc, ATC, QTC, etc) (At least this used to be the case, might have changed)
Images are most likely compressed. The ratio between assumed size (based on meta data) and the actual data size indicates some heavy compression. The data itself obviously looks compressed too.
There are no headers or any other help. It is just raw data.
We don't know exactly what the decoded images actually look like, so we can't say what the images display. However, due to recent achievements we "might" know this for images from Sense 3.5 and 3.6 if needed.
The handling on the software side is all in a few libs and NOT in smali/Java, so we can't look for stuff there. However, we do have the libs, so if someone is a pro with assembly they might find something out.
I will provide a download which contains several chunks of image data and the according meta data.
If you consider working on this, please do not refrain from thinking about super simple solutions, we worked so long on this that we might be totally confused.
One thing though, this might sound arrogant, but this here really is only for people who have some decent knowledge about file formats, image compression or OpenGL.
The image types
Here is a list of the image types we already know (remember, we don't know where the numbers come from; it might be some enum in native code or so):
Type 4: Raw RGB
Type 6: Raw RGBA (still used rather often)
Type 8: ATC RGB (doesn't seem to be used at all anymore)
Type 9: ATC RGBA Explicit (doesn't seem to be used at all anymore)
As you can see we got types WITH and WITHOUT alpha encoding.
Here is the list of UNKNOWN formats:
Type 13 (used way less than type 14, so maybe no alpha?)
Type 14 (this is the most used type, so I assume this one supports alpha encoding)
When thinking about what the data might be, don't throw away crazy ideas like "The data is S3TC /ATC /whatever but compressed again by some 'normal' compression algorithm". Maybe they just replaced type 8 and 9 with an additional compression on top of these types.
The meta data
Okay, so now let's talk about the meta data we get together with the actual data:
We get 4 more or less known chunks of information per image (plus a few unknown things)
Image type (described earlier) (Example: 6)
Image width (Example: 98)
Image height (Example: 78)
A more complex value containing multiple values at once.
Example: "98:78:0:30576"
We used to know the meaning of three of these values. However, we are not sure about the new images. Let's explain the old meaning first:
98: Width, same value as the value above
78: Height, same value as the value above
0: It's always 0, we have no idea what it means, but since it's static we didn't care
30576: this used to be the data size. This image has a resolution of 98*78 = 7644 pixels. With a data size of 30576 that means we got 4bytes per pixel.
Let's take a look at the new images now. We still get the same information, however the meaning seems to have changed a bit:
Image type (described earlier) (Example: 14)
Image width (Example: 997)
Image height (Example: 235)
A more complex value containing multiple values at once.
Example: "1000:236:0:118000"
This is the assumed new meaning
1000: Width, but rounded up to a multiple of 4
236: Height, but rounded up to a multiple of 4
0: It's always 0, we have no idea what it means, but since it's static we didn't care
118000: this value is now exactly half of the rounded resolution (1000 * 236 / 2 = 118000)
This would mean only half a byte per pixel. One big problem here: the actual data size does not match this value at all!
The data is way smaller than this value, which indicates that it got compressed a second time
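For what it's worth, "half a byte per pixel" over dimensions rounded up to a multiple of 4 is exactly what 4x4-block GPU formats such as ETC1 or DXT1 produce (8 bytes per 4x4 block). A quick sanity check against the two examples above - my own arithmetic, not anything pulled from the libs:
Code:
def round4(x):
    return (x + 3) // 4 * 4

def block4x4_size(width, height, bytes_per_block=8):
    # ETC1 and DXT1 both store each 4x4 pixel block in 8 bytes (0.5 byte/pixel)
    return (round4(width) // 4) * (round4(height) // 4) * bytes_per_block

print(98 * 78 * 4)              # 30576  -> the old type 6 example: raw RGBA, 4 bytes per pixel
print(block4x4_size(997, 235))  # 118000 -> matches the size field of the type 14 example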
Now let's talk about a very important piece of information: HTC uses the SAME image formats on BOTH a Tegra 3 and a Qualcomm Snapdragon S4.
This obviously means that both Tegra and Snapdragon need to be able to handle this. However, also keep in mind that HTC bought S3 Graphics and therefore might have some advantages here.
You can find a statistic on the used formats in the download, it's an Excel sheet with two diagrams showing the usage.
Now this was a long post, I hope someone is still reading this and might have some ideas about what's going on here.
Feel free to ask any questions concerning this.
I am also available in #virtuousrom on Freenode, per PM here or via email: diamondback [at] virtuousrom [dot] com
Download:
The download contains a bunch of unknown images of types 13 and 14 together with their meta data (as explained above).
Download image pack
Solution
After some digging I worked out that the data is compressed with fastlz [0]. Also, if you decompress it, you get exactly Width*Height bytes of data. I don't know the format this data is in, but I guess it's the same as the old uncompressed data (type 8 or 9 or so?) was. Maybe someone could check up on that.
[0] http://fastlz.org/
onlyolli said:
After some digging I worked out that the data is compressed with fastlz [0]. Also, if you decompress it, you get exactly Width*Height bytes of data. I don't know the format this data is in, but I guess it's the same as the old uncompressed data (type 8 or 9 or so?) was. Maybe someone could check up on that.
[0] http://fastlz.org/
You are indeed right. We actually found the same thing a few hours ago. What a weird coincidence... :victory:
Types 4 and 6 have changed, they are compressed now too. Which actually breaks backwards compatibility with older Sense versions...
Inside the compressed data are ETC images, which also explains how they can use the same format on the S4 and Tegra 3.
Type 14 actually contains TWO images, both ETC. Since ETC doesn't support alpha, one is the image and one is an alpha mask...
Funny trick HTC!
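Putting the two findings together, unpacking a type 14 chunk would look roughly like the sketch below. Note the hedges: fastlz_decompress stands for whatever FastLZ binding you happen to have (there is no standard Python one), and the assumption that the colour blocks come before the alpha-mask blocks is a guess:
Code:
def unpack_type14(raw, width, height, fastlz_decompress):
    w4, h4 = (width + 3) // 4 * 4, (height + 3) // 4 * 4   # the meta data rounds dimensions up to 4
    etc1_size = (w4 // 4) * (h4 // 4) * 8                   # one ETC1 image: 8 bytes per 4x4 block
    data = fastlz_decompress(raw, 2 * etc1_size)            # colour image + alpha mask
    assert len(data) == 2 * etc1_size, "unexpected decompressed size"
    return data[:etc1_size], data[etc1_size:]               # (colour blocks, alpha-mask blocks) - order is a guess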
I implemented a browser-based encryption solution which runs on Windows RT (and many other Windows computers). All I wrote was the HTML page; I am leveraging the Crypto.JS JavaScript library for the encryption algorithm. I am using the HTML5 File API implementation which Microsoft provides for reading and writing files.
I make no claims about this, but it seems to work well for me. Feel free to give feedback if you have any suggestions. The crypto.js library supports many different algorithms and configurations, so feel free to modify it for your own purposes.
You can download the zip file to your surface, extract it and load the TridentEncode.htm file into Internet Explorer.
If you want to save to a custom directory you probably need to load it from desktop IE instead of Metro IE (to get the file save dialog). I usually drag and drop the file onto desktop IE and from there I can make it a favorite. This should work in all IE 11 and probably IE 10 browsers... if you use other browsers you may need to copy-paste into the fields, since the File API implementation seems rather browser-specific. Running the HTML page from the local filesystem means that there is no man-in-the-middle, which helps eliminate some of the vulnerabilities of using a JavaScript crypto implementation. You could also copy the attached zip file to your SkyDrive to decrypt your files from other computers.
SkyDrive files are in theory secure (unless they are shared publicly), so this might be useful for adding another layer of protection to certain info.
Again, use at your own risk, but feel free to play around and test it, and offer any suggestions or critiques of its soundness, or just use it as a template for your own apps.
Ok... this is really cool! Nice idea, and a good first implementation.
With that said, I have a few comments (from a security perspective). As an aside, minimized JS is the devil and should be annihilated with extreme prejudice (where not actually being used in a bandwidth-sensitive context). Reviewing this thing took way too long...
1) Your random number generation is extremely weak. Math.random() in JS (or any other language I'm aware of, for that matter) is not suitable for use in cryptographic operations. I recommend reading http://stackoverflow.com/questions/4083204/secure-random-numbers-in-javascript for suggestions. The answer by user ZeroG (bottom one, with three votes, as of this writing) gets my recommendation. Unfortunately, the only really good options require IE11 (or a recent, non-IE browser) so RT8.0 users are SOL.
NOTE: For the particular case in question here (where the only place I can see that random numbers are needed is the salt for the key derivation), a weak PRNG is not a critical failing so long as the attacker does not know, before the attack, what time the function is called at. If they do know, they can pre-compute the likely keys and possibly succeed in a dictionary attack faster than if they were able to generate every key only after accessing the encrypted file.
2) Similarly, I really recommend not using a third-party crypto lib, if possible; window.crypto (or window.msCrypto, for IE11) will provide operations that are both faster and *much* better reviewed. In theory, using a JS library means anybody who wants to can review the code; in practice, the vast majority of people are unqualified to either write or review crypto implementations, and it's very easy for weaknesses to creep in through subtle errors.
3) The default key derivation function (as used for CryptoJS.AES.encrypt({string}, {string})) is a single iteration of MD5 with a 64-bit salt. This is very fast, but that is actually a downside here; an attacker can extremely quickly derive different keys to attempt a dictionary attack (a type of brute-force attack where commonly used passwords are attempted; in practice, people choose fairly predictable passwords so such attacks often succeed quickly). Dictionary attacks can be made vastly more difficult if the key derivation process is made more computationally expensive. While this may not matter so much for large files (where the time to perform the decryption will dominate the total time required for the attack), it could matter very much for small ones. The typical approach here is to use a function such as PBKDF2 (Password-Based Key Derivation Function) with a large number of iterations (in native code, values of 20000-50000 are not uncommon; tune this value to avoid an undesirably long delay) although other "slow" KDFs exist.
4) There's no mechanism in place to determine whether or not the file was tampered with. It is often possible to modify encrypted data, without knowing the exact contents, in such a way that the data decrypts "successfully" but to the wrong output. In some cases, an attacker can even control enough of the output to achieve some goal, such as compromising a program that parses the file. While the use of PKCS7 padding usually makes naïve tampering detectable (because the padding bytes will be incorrect), it is not a safe guarantee. For example, a message of 15 bytes (or 31 or 47 or any other multiple of 16 + 15, given AES's 16-byte blocks) will have only 1 byte of padding; thus there is about a 0.4% (1 / 256) chance that even a random change to the ciphertext will produce a valid padding. To combat this, use an HMAC (Hash-based Message Authentication Code) and verify it before attempting decryption. Without knowing the key, the attacker will be unable to correct the HMAC after modifying the ciphertext. See http://en.wikipedia.org/wiki/HMAC
5) The same problem as 4, but from a different angle: there's no way to be sure that the correct key was entered. In the case of an incorrect key, the plaintext will almost certainly be wrong... but it is possible that the padding byte(s) will be correct anyhow. With a binary file, it may not be possible to distinguish a correct decryption from an incorrect one. The solution (an HMAC) is the same, as the odds of an HMAC collision (especially if a good hash function is used) are infinitesimal.
6) Passwords are relatively weak and often easily guessed. Keyfiles (binary keys generated from cryptographically strong random number generators and stored in a file - possibly on a flashdrive - rather than in your head) are more secure, assuming you can generate them. It is even possible to encrypt the keyfile itself with a password, which is a form of two-factor authentication: to decrypt the data that an attacker wants to get at, they need the keyfile (a thing you have) and its password (a thing you know). Adding support for loading and using keyfiles, and possibly generating them too, would be a good feature.
The solutions to 3-5 will break backward compatibility, and will also break compatibility with the default parameters for openssl's "enc" operation. This is not a bad thing; backward compatibility can be maintained by either keeping the old version around or adding a decrypt-version selector, and openssl's defaults for many things are bad (it is possible, and wise, to override the defaults with more secure options). For forward compatibility, some version metadata could be prepended to the ciphertext (or appended to the file name, perhaps as an additional extension) to allow you to make changes in the future, and allow the encryption software to select the correct algorithms and parameters for a given file automatically.
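To make points 1 and 3-5 concrete, here is a minimal Python sketch (standard library only, so obviously not the CryptoJS code itself) of the slow-KDF plus encrypt-then-MAC layout being suggested. The field sizes and file layout are purely illustrative, not a proposed format:
Code:
import hashlib, hmac, os

ITERATIONS = 50_000                 # tune so key derivation takes a noticeable fraction of a second

def derive_keys(password, salt):
    # Point 3: a deliberately slow KDF; split the output into separate cipher and MAC keys
    km = hashlib.pbkdf2_hmac("sha256", password, salt, ITERATIONS, dklen=64)
    return km[:32], km[32:]

def seal(ciphertext, mac_key, salt, iv):
    # Points 4/5: encrypt-then-MAC, with the MAC covering salt + IV + ciphertext
    tag = hmac.new(mac_key, salt + iv + ciphertext, hashlib.sha256).digest()
    return salt + iv + ciphertext + tag

def open_sealed(blob, password):
    salt, iv, body, tag = blob[:16], blob[16:32], blob[32:-32], blob[-32:]
    _, mac_key = derive_keys(password, salt)
    expected = hmac.new(mac_key, salt + iv + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong password or tampered file")   # refuse before touching the cipher
    return iv, body        # hand the verified IV and ciphertext to the AES decryption step

# Point 1: salt and IV must come from a CSPRNG, e.g. salt, iv = os.urandom(16), os.urandom(16)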
Wow thanks GDTD that's great feedback
Not sure about his minified sources, the unminified aes.js in components is smaller than the minified version (which I am using) in rollups. I'll have to look into what his process for 'rollup' is to see if I can derive a functional set of non-minified script includes. If I can do that it would be easier to replace (what I would guess is) his reliance on Math.random.
His source here mirrors the unminified files in components folder : https://code.google.com/p/crypto-js/source/browse/tags/3.1.2/src
msCrypto - that would be great; I had no idea that was in there. I found a few (Microsoft) samples, so I will have to test them out and see if I can completely substitute that for crypto.js. It would also be more in keeping with the name I came up with.
Currently this version only works for text files, I am using the FileAPI method reader.readAsText(). I have been trying to devise a solution for binary files utilizing reader.readAsArrayBuffer but as yet I haven't been able to convert or pass this to crypto.js. I will need to experiment more with base64 or other interim buffer formats (which Crypto.js or msCrypto can work with) until I can get a better understanding of it.
Metadata is a great idea; maybe I can accommodate that with a hex-encoded interim format.
You seem extremely knowledgeable in the area of encryption; hopefully I can refine the approach to address some of the issues you raised by setting up proper key, salt, and IV configuration... I'm sure I will understand more of your post as I progress (and after reading it about 20 times more as a reference).
Too bad we don't have a web server for RT; that would at least open up localStorage for JSON serialization (mostly for other apps I had in mind). I guess they might not allow that in the app store though. Could probably run one off a developer's license though (renewed every 1-2 months)?
nazoraios said:
Too bad we don't have a web server for RT; that would at least open up localStorage for JSON serialization (mostly for other apps I had in mind). I guess they might not allow that in the app store though. Could probably run one off a developer's license though (renewed every 1-2 months)?
I can't comment too much on the encryption, GoodDayToDie has covered anything I could contribute and more. But there is a functioning web server on RT. Apache 2.0 was ported: http://forum.xda-developers.com/showthread.php?t=2408106 I don't know if everything is working on it; I don't own an RT device, and last time I tried I couldn't get Apache to run on 64-bit Windows 8 anyway (needed it at uni, spent hours going through troubleshooting guides and it never worked on my laptop, gave up and ran it under Linux in VirtualBox, where it took 2 minutes to get it functioning the way I needed).
Curious about the performance. Speaking of encryption, 7-Zip has it built-in, and from the discussion on StackExchange, it seems pretty good.
One of the neat things about this thing (local web app? Pseudo-HTA (HTml Application)? Not sure if there's a proper name for such things) is that it runs just fine even on non-jailbroken devices. That's a significant advantage, at least for now.
Running a web server should be easy enough. I wrote one for WP8 (which has a subset of the allowed APIs for WinRT) and while the app *I* use it in won't be allowed in the store, other developers have taken the HTTP server component (I open-sourced it) and packaged it in other apps which have been allowed just fine. With that said, there are of course already file crypto utilities in the store anyhow... but they're "Modern" apps so you might want to develop such a server anyhow so you can use it from a desktop web browser instead.
Web cryptography (window.crypto / window.msCrypto) is brand new; it's not even close to standardization yet. I'm actually kind of shocked MS implemented it already, even if they put it in a different name. It's pretty great, though; for a long time, things like secure random numbers have required plugins (Flash/Java/Silverlight/whatever). Still, bear in mind that (as it's still far from standardized), the API might change over time.
Yep, I think of them as Trident apps, since Trident is what Microsoft calls their IE rendering engine, but I guess they are sort of offline web apps (which come from the null domain). Being from the null domain you are not allowed to use localStorage, which is domain-specific. You also are not allowed to make AJAX requests. You just have the File API and JSON object serialization to make do for I/O.
Another app I am working on is a kind of Fiddler app similar to http://jsfiddle.net/ where you can sandbox some simple script programs.
Kind of turning an RT device into a modern/retro version of a commodore 64 or other on-device development environments. Instead of basic interpreter you've got your html markup and script.
I have an attached demo version which makes available jquery, jquery-ui, alertify javascript libraries in a sandbox environment that you can save as .prg files.
I put a few sample programs in the samples subfolder. Some of the animation samples (like solar system) set up timers which may persist even after cleared so you might need to reload the page to clear those.
It takes a while to extract (lots of little files for all the libraries) but once it extracts you can run the html page and I included a sample program 'Demo Fiddle.prg' you can load and run to get an idea.
I added syntax-highlighting editors (EditArea), which seem to work OK and let you zoom each editor full screen.
The idea would be to take the best third-party JavaScript libraries, make them available, and even make shortcuts or a minimal API to make them easier to use: common global variables, global helper methods, IDE manipulation. I'd like to include jqplot for charting graphs, maybe for mathematical programs, and provide an API for the user to do their own I/O within the environment.
These are just rough initial demos, and obviously open source, so if anyone wants to take the ideas and run with them I'd be interested in seeing what others do. Otherwise I will slowly evolve the demos and release when there are significant changes.
Author: Apriorit (Research and Reverse Engineering Team)
Permanent link: www(dot)apriorit(dot)com/dev-blog/76-reversing-of-mobile-phones-insertions
Once we faced the need to investigate how Samsung cellular phones work: we required some information from them which is not documented (and probably never will be). So this article is about interesting points our reverser encountered while working with Samsung cellular phone firmware.
Reversing the Firmware of ARM-based Mobile Phones
I have managed to research firmware from all Samsung generations, including CDMA (except for the smartphones). Every Samsung phone uses an ARM-compatible processor with the ARM7TDMI instruction set. The firmware is built on one of three OSes - RTCX, RTK or Nucleus - and compiled with different compilers; I have seen firmware compiled with ADS (SDT) and IAR.
On forums people name Samsung's generations in different ways: some divide them into Gumi/Suwon (two cities in Korea), others use code names - "Sysol", "Agere", "VLSI", "Conexant" and "Ancient". I have come to the conclusion that it is more correct to divide them according to the phone's processor.
Processor - Models
OM6357 (aka Sysol) - E100, E700, E720, E800, E820, S50x, X100, X460, X60x
M46 (aka Conexant) - A100, A110, A200, A300, A400, M100, T208
SkyWorks (aka Conexant) - C100, C108, C110, P510, P518
ONE-C (aka VLSI) - R2XX, Nxxx, Txxx (except for T208)
Trident (aka Agere) - Dxxx, Qxxx, Sxxx (except for S50x), Vxxx, C200, E105, E310, E400, E600, E710, E810, X105, X400, X42x, X450 etc.
MSMxxxx - all CDMA
I hope I haven't made any mistakes in this list. :)
According to the list, firmware within the same generation is very similar and, to be honest, sometimes practically identical (with extremely slight changes). For example, in the X100 there are obvious traces of the E100/E700/X600 - why else would there be code for working with a second display, a camera and IrDA, which the X100 never had?
Naturally, the OS is the same for the whole generation:
OM6357 - RTK
M46 - RTCX OSE
SkyWorks - RTCX OSE
ONE-C - RTK
Trident - Nucleus
MSMxxxx - don't know exactly, it might be any OS from Qualcomm. It's just clear that they are built with ADS/SDT.
If you are going to investigate the low level, the SDK for the corresponding OS will come in handy. Another helpful thing is the symbolic information that can be found in some firmware archives. Sometimes you come across firmware with .lst, .sym, .map and .out files containing information that is extremely useful for research. In particular, such files occur with almost all C100 and S500 firmware. For the other models the situation is worse, and you have to content yourself with symbol signatures made from firmware of the same generation. For example, for M46 I managed to find just one firmware with symbols, and it was from the A110. But signatures made from it apply perfectly to the A200, A300, etc.
Interpreting the symbolic information
MAP format
.map files contain information on the modules included in the firmware and look like this:
Code:
Base Size Type RO? Name
0 20 CODE RO AAA_vectors from object file obj/isr.o
20 38e8 CODE RO C$$code from object file../../src/t9latin.o
3908 30 CODE RO C$$code from object file obj/mmi_date.o
3938 5a4 CODE RO C$$code from object file hw_slow.o
3edc 874 CODE RO C$$code from object file rtkgo.o
etc.
where
Base - offset within the firmware file.
Size - length.
Type - region type.
RO? - region access type.
Name - original file name, part of which was included in the firmware
How can all this be interpreted? For example: starting at offset 20 there is a code block (CODE) of length 38e8 with Read Only access. The fact that a block has the CODE attribute does not at all mean that the WHOLE area is filled with code; actually it is code plus data. Likewise, if a block has the DATA type it does not mean you should mark it all as data in IDA.
Without the names/symbols file this information can only be used to determine the firmware code size (i.e. so you don't stray into the graphics). Therefore, let's examine the SYM format instead.
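If you want to use the MAP data programmatically (for example to check which regions are code before marking anything in IDA), a rough Python sketch for pulling the columns apart could look like this; real .map files vary a little between linker versions, so treat the pattern as illustrative:
Code:
import re

MAP_LINE = re.compile(r"^\s*([0-9A-Fa-f]+)\s+([0-9A-Fa-f]+)\s+(CODE|DATA)\s+(RO|RW|ZI)\s+(.+)$")

def parse_map(path):
    regions = []
    for line in open(path, errors="replace"):
        m = MAP_LINE.match(line)
        if m:
            base, size, rtype, access, name = m.groups()
            regions.append((int(base, 16), int(size, 16), rtype, access, name.strip()))
    return regions

# e.g. the total size of RO CODE regions gives a rough upper bound on where the code ends
regions = parse_map("firmware.map")          # file name is just a placeholder
print(hex(sum(size for _, size, rtype, ro, _ in regions if rtype == "CODE" and ro == "RO")))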
SYM Format
.sym files are a gold mine of information. They look like this:
Code:
Symbol Table
AAA_vectors$$Base 000000
AAA_vectors$$Limit 000020
VectorMap$$Base 1006a3c
VectorMap$$Limit 1006a60
isr$$Base 12774c
isr$$Limit 127bb0
gl_MaskIT 1000078
Rtk_RegionCount 100564c
rtk_WorthItSched 10056a0
Rtk11_Schedule 11f5c8
etc.
Things are a little easier here, because there is a name-to-address correspondence. But the addresses hold some secrets - there is a set of names containing the $ sign which have special status. Symbols ending in $$Base indicate the beginning of an area of the virtual address space, and $$Limit indicates its end; in other words, here we have information on segments. It is possible to build a memory map from these segments and see how parts of the binary are mapped to different addresses. Building the memory map should start with these symbols:
Code:
Image$$RO$$Base 000000
Image$$RO$$Limit 1afef4
Image$$RW$$Base 1000000
Image$$RW$$Limit 107dad4
Image$$ZI$$Base 1006a60
Image$$ZI$$Limit 107dad4
RO - Read Only, indicates code addresses.
RW - Read/Write, i.e. RAM.
ZI - Zero Initialized: RAM that is filled with zeroes when the phone is turned on.
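These three pairs are all you need to set up the initial segments, so here is a small Python sketch of reading a .sym table and pulling exactly these regions out (the file name is just a placeholder):
Code:
def parse_sym(path):
    symbols = {}
    for line in open(path, errors="replace"):
        parts = line.split()
        if len(parts) == 2:
            try:
                symbols[parts[0]] = int(parts[1], 16)
            except ValueError:
                pass                 # skips the "Symbol Table" header and similar lines
    return symbols

syms = parse_sym("firmware.sym")
ro = (syms["Image$$RO$$Base"], syms["Image$$RO$$Limit"])   # code
rw = (syms["Image$$RW$$Base"], syms["Image$$RW$$Limit"])   # RAM initialised from the image
zi = (syms["Image$$ZI$$Base"], syms["Image$$ZI$$Limit"])   # zero-initialised RAM
print([tuple(hex(a) for a in pair) for pair in (ro, rw, zi)])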
Thus segments can easily be created at these addresses. Now let's look further:
Code:
AAA_vectors$$Base 000000
AAA_vectors$$Limit 000020
C$$code$$Base 000020
C$$code$$Limit 127310
C$$code$$__call_via$$Base 127310
C$$code$$__call_via$$Limit 127320
Example$$Base 127320
Example$$Limit 127324
HAL_boot$$Base 127324
HAL_boot$$Limit 12735c
RtkCode$$Base 12735c
RtkCode$$Limit 127408
SysSupportCode$$Base 127408
SysSupportCode$$Limit 12744c
boot$$Base 12744c
boot$$Limit 127654
clib$$Base 127654
clib$$Limit 12774c
isr$$Base 12774c
isr$$Limit 127bb0
C$$constdata$$Base 127bb0
C$$constdata$$Limit 1afef4
C$$data$$Base 1000000
C$$data$$Limit 1005a38
Stacks$$Base 1005a38
Stacks$$Limit 1006a3c
VectorMap$$Base 1006a3c
VectorMap$$Limit 1006a60
C$$zidata$$Base 1006a60
C$$zidata$$Limit 107dad4
In this interesting way they follow one after another. If you wish, it is possible to divide them into segments at the corresponding addresses, but this is merely a logical division. Moreover, in the .sym file these lines are badly scattered. Sooner or later a question appears: why is the code size 1afef4 if the length of the firmware file is 1b6950? Where do the remaining 6a60 bytes go? We look again at the initial memory map:
Code:
Image$$RW$$Base 1000000
Image$$RW$$Limit 107dad4
Image$$ZI$$Base 1006a60
Image$$ZI$$Limit 107dad4
RAM ends at address 107dad4 and the block 1006a60-107dad4 is zero initialized, hence the question: what initializes the block 1000000-1006a60, whose size is exactly 6a60? Exactly right - those leftover bytes. If you analyse the OS start code, you will find precisely this copying in the RAM initialization procedure.
In newer firmware you may come across entries like these:
Code:
Load$$IRAM$$Base 639a74
Image$$IRAM$$Base 2010000
Image$$IRAM$$Length 0015a4
They should be understood this way: data of length 15a4 is loaded from file offset 639a74 to address 2010000.
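Applied to the example above, the triple can be used directly to cut the extra segment out of the firmware file (a sketch; the file name is a placeholder):
Code:
# Load$$IRAM$$Base (file offset), Image$$IRAM$$Base (target address), Image$$IRAM$$Length
load_base, image_base, length = 0x639A74, 0x2010000, 0x15A4

fw = open("firmware.bin", "rb").read()
iram = fw[load_base:load_base + length]   # these bytes belong at image_base, e.g. as an extra IDA segment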
We continue the analysis of symbols with the $ sign:
x$litpool$ - Literal Pool, pieces of data belonging to functions. At the end of many functions indexes, strings and constants are placed, and x$litpool$ marks the beginning of such constants.
x$litpool_e$ - Literal Pool end.
$T is merely for the debugger. It indicates the addresses where changes to the PC register take place, so at these addresses branch instructions (BL/BEQ/B/BX etc.) are located.
$$ - addresses where a change of ARM/THUMB state occurs.
There are also C$$code symbols, but I haven't worked out what they are.
Other names without the $ sign are the names of constants and functions. They can be used freely.
If the firmware archive contains both MAP and SYM, that is the ideal case - when you apply a name taken from SYM, you can check whether it lies in the code area using the data from MAP. If it does, you can freely mark it as code without fear that code/data will be identified incorrectly in IDA.
LST Format
It's a real paradise for a reverser: these files contain everything at once. They consist of five parts:
Image Symbol Table - symbols... I have not yet understood their meaning
Local Symbols - everything is clear from the name
Global Symbols - .sym file analogue.
Memory Map of the image - memory map! All at once!
Image component sizes - .map file analogue
The information is so detailed that even the processor mode for each function is specified.
OUT Format
I have met it only in Nucleus-based firmware. There can be tlink.out and tsymb.out files:
tsymb.out - an ordinary SYM
tlink.out - a MAP file with some almost useless linker information added.
Now that we are armed with the symbolic information, we can load the firmware into IDA.
What to do if there are no symbols at all
"When there is no toothbrush at hand..." Yes, we take IDA, emulating debuggerand brains in the hands. IDA is "must have". The emulating debugger for ARM, called Trace32, can be taken here.
First of all, we load the insertion in IDA to 0 address. I.e. the whole insertion is being loaded to default addresses. Then look what is on 0 address.
Code:
BOOT:00000000 B ResetHandler
BOOT:00000004 B loc_3B4
BOOT:00000008 B loc_410
BOOT:0000000C B loc_42C
BOOT:00000010 B loc_488
etc.
In any case the code begins at address 0. In all Samsungs - and, I guess, not only Samsungs - the firmware begins with the interrupt vectors. These are eight B instructions in ARM state, i.e. 8 vectors. Address 0 is the reset vector, i.e. firmware start/restart. This reset interrupt simply starts the phone, and its handler leads to the system loader:
Code:
BOOT:00000048 ResetHandler; CODE XREF: BOOT:loc_0 _ j
BOOT:00000048 MRS R0,CPSR
BOOT:0000004C BIC R0,R0,#0x1F
BOOT:00000050 ORR R0,R0,#0x13
BOOT:00000054 ORR R0,R0,#0xC0
BOOT:00000058 MSR CPSR_cxsf,R0
BOOT:0000005C LDR R3,=(InitialHWConfig+1)
BOOT:00000060 MOV LR,PC
BOOT:00000064 BX R3
If the jump from address zero goes to a non-existent address, it means the rest of the code is mapped at some other address. It's easy to determine exactly which one. For example, suppose we have this beginning:
Code:
BOOT:00000000 B 0x4003CE
And there is no code at address 4003CE. We look at file offset 3CE and see ARM code there. That means the rest of the firmware is rebased to 0x400000. So we copy the piece of firmware containing the interrupt handlers, load it at address zero, and then load the firmware at address 400000. Now our code is in the right place. We go further. It is necessary to find out where the RAM and the input/output port areas are. The ports are usually either at the end of memory (addresses from about e0000000 upward) or at the beginning (up to 0x200000), depending on where the firmware is loaded. There can be several RAM areas. First of all, we see the port initialization:
Code:
BOOT:00000588 MOV R1,#1
BOOT:0000058A LDR R0,=0xE0006000
BOOT:0000058C LSL R1,R1,#0x1B
BOOT:0000058E STR R1,[R0]
BOOT:00000590 STR R1,[R0,#0x10]
BOOT:00000592 STR R1,[R0,#0x20]
BOOT:00000594 LDR R1,=loc_20102
BOOT:00000596 LDR R0,=0xE0003040
BOOT:00000598 STR R1,[R0, #4]
BOOT:0000059A LDR R1,=0x20003
BOOT:0000059C STR R1,[R0, #8]
BOOT:0000059E LDR R0,=0xE0003000
BOOT:000005A0 MOV R1,#0xC
BOOT:000005A2 STR R1,[R0,#0x24]
I.e. starting around E0000000 there is an area of input/output ports. Its size doesn't exceed the size of a segment, so it's possible to create a segment of size 0x10000. Now we go further. In any firmware there is a RAM area that is initialized with zero values and an area that is filled with initial values taken from the firmware itself. We are looking for the copy loops, so we need the debugger.
Here we see copying:
Code:
BOOT:000000D4 LDR R0,=0x63B018
BOOT:000000D8 LDR R1,=0x1000000
BOOT:000000DC LDR R3,=0x1045B38
BOOT:000000E0 CMP R1,R3
BOOT:000000E4 BEQ loc_F8
BOOT:000000E8
BOOT:000000E8 loc_E8; CODE XREF: BOOT:000000F4 _ j
BOOT:000000E8 CMP R1,R3
BOOT:000000EC LDRCC R2,[R0],#4
BOOT:000000F0 STRCC R2,[R1],#4
BOOT:000000F4 BCC loc_E8
The block is copied from offset 63B018 of the firmware to address 1000000. The length is 45B38.
This is the first RAM area. Now we look for the second one, whose zero initialization should be nearby:
Code:
BOOT:000000F8 LDR R1,=0x11ED9E4
BOOT:000000FC MOV R2,#0
BOOT:00000100 CMP R3,R1
BOOT:00000104
BOOT:00000104 loc_104; BOOT:00000108 _ j
BOOT:00000104
BOOT:00000104 STRCC R2,[R3],#4
BOOT:00000108 BCC loc_100
Indeed, the area from 1045B38 to 11ED9E4 is filled with zero values, so here we have the second part. If there are any other areas, there will certainly be zero- or copy-initialization for them too. Other memory regions can only be found analytically, but we already have the basics.
Further research depends on the presence of symbols/signatures. If they exist, everything comes down to looking up the needed function in the name list. What if there are none? First of all, it is necessary to determine the approximate code bounds and, if possible, to find functions in the code. The most primitive and effective way is to search for the push instruction with which 60% of the firmware's functions begin. Firmware code usually consists of about 90% Thumb code, so we should look for the B5 byte (push) and try to define it as code in IDA. The code usually takes up less than 50% of the whole firmware; the rest is graphics and language resources. I can also add that very often at the end of the code there are copyright strings, something like "Samsung corp. 199x-200x ARM ADS 1.2".
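A Thumb PUSH that saves LR has 0xB5 as the high byte of its halfword, and the firmware is little-endian, so the candidates are simply the even offsets whose second byte is 0xB5. A quick sketch of that scan (the code size comes from the earlier .sym example, the file name is a placeholder):
Code:
def push_candidates(firmware, code_end):
    # Thumb "PUSH {..., LR}" is stored little-endian as <reglist> B5
    return [off for off in range(0, code_end - 1, 2) if firmware[off + 1] == 0xB5]

fw = open("firmware.bin", "rb").read()
hits = push_candidates(fw, min(len(fw), 0x1AFEF4))
print(len(hits), [hex(h) for h in hits[:10]])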
Some code has now been revealed; around 20% of it was mangled by IDA itself, because it often can't cope with THUMB/ARM transitions. Now we take whatever was left lying around, i.e. whatever the programmers left behind. And what did they leave? Trace and Assert. And no trace or assert goes without sprintf/printf, so we have to find it. It's easy - just look for the "%s" string, the one that obviously contains an error-message pattern. Using xrefs we find where this string is used; that will be sprintf, followed by Trace or Assert. Now, based on the error messages, we can name functions: walking the xrefs to the Trace/Assert function, we can find where more than half of the errors are reported. Further function naming is possible by searching for the following words:
Code:
Bad
Fail
Incorrect
Invalid
Error
Memory
File
Null
No
Critical
Abnormal
etc.
This way we will find some more error-output functions. Thus we will gradually gain information, based on nothing but the firmware itself.
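And a matching sketch for that string hunt: dump the offsets of the keywords above, then follow the xrefs to them in IDA back to the Trace/Assert wrappers (a pure byte search, nothing firmware-specific; the file name is a placeholder):
Code:
KEYWORDS = [b"Bad", b"Fail", b"Incorrect", b"Invalid", b"Error",
            b"Memory", b"File", b"Null", b"Critical", b"Abnormal"]

def find_keywords(firmware):
    hits = []
    for word in KEYWORDS:
        start = 0
        while (pos := firmware.find(word, start)) != -1:
            hits.append((pos, word.decode()))
            start = pos + 1
    return sorted(hits)

fw = open("firmware.bin", "rb").read()
for off, word in find_keywords(fw)[:20]:
    print(hex(off), word)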