The goal of this project is to build a Spotify client that can learn my listening habits and skip some songs that I would normally skip. I have to admit, this need comes from my laziness. I don't want to have to create or find playlists when I'm in the mood for something. What I want is to select a song in my library and be able to shuffle other songs and remove songs that don't "flow" from the queue.
To achieve this, I need to train some kind of model that can perform this task (maybe more on that in a future post). But to train a model, I first need data to train it on.
I need the complete listening history, including those songs I skipped. Getting the history is easy. While the Spotify API only allows getting the last 50 played songs, we can set up a cron job to repeatedly poll the endpoint. The full code has been posted here: https://gist.github.com/SamL98/c1200a30cdb19103138308f72de8d198
The hardest part is tracking the skips. The Spotify Web API does not provide any endpoint for this. Previously I created some services to control playback using the Spotify AppleScript API (the rest of this article covers the macOS Spotify client). I could use these services to keep track of skipped songs, but that feels like sidestepping the challenge. So how can I do it properly?
I recently learned about the technique of hooking, where you can "intercept" function calls made by the target binary. I think this would be the best way to track skips.
The most common type of hook is the interpose hook. This type of hook overrides relocations in the PLT, but what does that actually mean?
The PLT, or Procedure Linkage Table, allows your code to reference an external function (think libc) without knowing where that function is in memory: you just reference an entry in the PLT. The dynamic linker performs a "relocation" at runtime for each function or symbol in the PLT. One benefit of this approach is that if the external function is loaded at a different address, only the relocation in the PLT needs to change, rather than every reference to the function in the code.
So when we create an interpose hook for printf, whenever the process we are hooking calls printf, it will call our implementation of printf instead of libc's (our custom library will usually call the standard implementation as well).
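The indirection described above can be imitated in plain C. The following is only a rough sketch under stated assumptions: a single function pointer stands in for a PLT entry, and overwriting it stands in for the redirected relocation. All names (`plt_add`, `real_add`, `hooked_add`, `install_hook`, `target_code`) are hypothetical, and a real interpose hook is set up by the dynamic linker rather than by code like this.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an external library function. */
static int real_add(int a, int b) { return a + b; }

/* One simulated "PLT slot": callers go through this pointer instead of
 * calling real_add directly, so only the slot has to change to redirect them. */
typedef int (*add_fn)(int, int);
static add_fn plt_add = real_add;

int hook_calls = 0;                 /* number of calls seen by the hook */
static add_fn original_add = NULL;  /* saved so the hook can forward the call */

/* Replacement function: record the call, then forward to the original,
 * the way an interpose hook usually calls the real implementation. */
static int hooked_add(int a, int b) {
    hook_calls++;
    assert(original_add != NULL);
    return original_add(a, b);
}

/* Simulated interposition: save the current target, point the slot at the hook.
 * Installing twice is a no-op so the hook never forwards to itself. */
void install_hook(void) {
    if (plt_add != hooked_add) {
        original_add = plt_add;
        plt_add = hooked_add;
    }
}

/* Code in the "target binary": it only ever references the slot. */
int target_code(void) { return plt_add(2, 3); }
```

Calling `target_code()` before and after `install_hook()` returns the same result, but only the second call passes through `hooked_add`, which is the behavior we want from a hook: observe the call without breaking the caller.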
After having some basic background knowledge about hooks, we are ready to try to insert a hook into Spotify. But first we need to figure out what we want to hook.
As mentioned before, you can only create an interpose hook for an external function, so we will look for a function in libc or the Objective-C runtime.
When researching where to hook, I thought a good place to start would be where Spotify handles the "media control keys", F7-F9 on my MacBook. I assumed the handlers for these keys call the same functions that run when the Next button is clicked in the Spotify app. I eventually found the SPMediaKeyTap library at: https://github.com/nevyn/spmediakeytap. I thought I'd give it a try and see if Spotify had copied and pasted the code from this library. In the SPMediaKeyTap library, there is a method startWatchingMediaKeys. I ran the strings command on the Spotify binary to see if it contained this method, and sure enough:
Bingo!! If we load the Spotify binary into IDA (the free version of course) and search for this string, we find the corresponding method:
If we look at the source code corresponding to this function, we find an interesting parameter, tapEventCallback, passed to the CGEventTapCreate function:
If we look back at the disassembly, we can see that the subroutine sub_10010C230 is passed as the tapEventCallback parameter. If we look at the source code or disassembly of this function, we see that only one library function, CGEventTapEnable, is called:
Let's try hooking this function.
The first thing we need to do is create a library to define our custom CGEventTapEnable. The code is as follows:
#include <CoreFoundation/CoreFoundation.h>
#include <dlfcn.h>
#include <stdbool.h>
#include <stdlib.h>
#include <stdio.h>

/* Our replacement for CGEventTapEnable. */
void CGEventTapEnable(CFMachPortRef tap, bool enable)
{
    typeof(CGEventTapEnable) *old_tap_enable;
    printf("I'm hooked!\n");
    /* Look up the next (real) definition of the symbol and forward to it. */
    old_tap_enable = dlsym(RTLD_NEXT, "CGEventTapEnable");
    (*old_tap_enable)(tap, enable);
}
We use the dlsym function call to obtain the address of the actual library's CGEventTapEnable function. Then we call the old implementation so we don't accidentally break anything. Let's compile our library like this (https://ntvalk.blogspot.com/2013/11/hooking-explained-detouring-library.html):
gcc -fno-common -c <filename>.c
gcc -dynamiclib -o <library> <filename>.o
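Now, let's try running Spotify with our hook inserted: DYLD_FORCE_FLAT_NAMESPACE=1 DYLD_INSERT_LIBRARIES=
Spotify opens normally, but Apple's System Integrity Protection (SIP) does not let us load the unsigned library :(.
Luckily, I am a member of Apple's reasonably priced developer program, so I can code-sign the library. Problem solved, or so it seems. Let's sign our library with our $100 certificate, run the previous command, and...
Failure. This is not surprising: Apple does not let you insert a library signed with just any old identity, only one signed with the identity used to sign the original binary. It looks like we will have to find another way to hook Spotify.
As a side note, the careful reader may have noticed that the function we hooked, CGEventTapEnable, is only called when the media key event tap times out. So even if we could insert the hook, we would probably not see any output. The main purpose of this section was to document my initial failure (and oversight) as a learning experience.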
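After some digging, I found a really great library, HookCase: https://github.com/steven-michaud/HookCase. HookCase lets us implement a type of hook that is more powerful than the interpose hook: the patch hook.
Patch hooks are inserted by modifying the function you wish to hook so that it triggers an interrupt. The kernel can then handle this interrupt and transfer execution to our own code. For those interested, I highly recommend reading the HookCase documentation, as it goes into much more detail.
Patch hooks not only let us hook calls to external functions, they also let us hook any function inside the target binary (since they do not rely on the PLT). HookCase provides a framework for inserting patch and/or interpose hooks, as well as a kernel extension that handles the interrupts generated by patch hooks and runs our custom code.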
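Now that we have a way to hook any function inside the Spotify binary, only one question remains: where?
Let's revisit the SPMediaKeyTap source code and see how the media control keys are handled. In the callback function, we can see that if F7, F8, or F9 is pressed (NX_KEYTYPE_PREVIOUS, NX_KEYTYPE_PLAY, etc.), the handleAndReleaseMediaKeyEvent selector is performed:
The delegate is then notified in that selector:
Let's look at this delegate method in the repo:
It turns out it just sets up a template for handling the keys. Let's search for the receiveMediaKeyEvent function in IDA and look at the graph view of the corresponding function:
Looks very similar, doesn't it? We can see that a common function, sub_10006FE10, is called for each type of key, with only an integer argument set to tell them apart. Let's hook it and see if we can log which key is pressed.
We can see from the disassembly that sub_10006FE10 takes two arguments: 1) a pointer to the playerDelegate property of the SPTClientAppDelegate singleton, and 2) an integer specifying what type of event occurred (0 for pause/play, 3 for next, and 4 for previous).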
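Looking at sub_10006FE10 (I won't include it here, but I highly recommend checking it out yourself), we can see that it is really a wrapper around sub_10006DE40, which contains most of the logic:
Whoa! That looks complicated. Let's try to break it down.
From the structure of the graph, there is one node near the top with many outgoing edges:
As IDA suggests, this is a switch statement on esi (the second, integer argument described earlier). It looks like Spotify handles more than just Previous, Pause/Play, and Next. Let's focus on the block that handles Next, or 3: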
Admittedly, it took me some time to get here, but I want to draw your attention to the call r12 instruction on the fourth line from the bottom. If you look at some of the other cases, you will find a very similar pattern of calling through a register. This seems like a good function to hook, but how do we know where it is?
Let's bring in a new tool: the debugger. I had a lot of trouble initially trying to debug Spotify. Maybe that is because I am not very familiar with debuggers, but I think I came up with a pretty clever solution.
We first set a hook on sub_10006DE40, and then we trigger a breakpoint from inside the hook. We can do this by executing the assembly instruction int 3 (which is what debuggers like GDB and LLDB use for breakpoints).
Here's what a hook looks like in the HookCase framework:
After adding this to the HookCase template library, you must also add it to the user_hooks array:
Then we can compile it using the Makefile provided with the HookCase template. The library can then be inserted into Spotify using the following command: HC_INSERT_LIBRARY=
We can then run LLDB and attach it to the running Spotify process like this:
Try pressing F9 (if Spotify is not the active window, it may open iTunes). The int $3 line in the hook should trigger the debugger.
Now we can step to the entry point of sub_10006DE40. Note that the PC will not match the address shown in IDA (I think this is due to where the process is loaded in memory). In my current process, the push r15 instruction is at 0x10718ee44:
In IDA, the address of this instruction is 0x10006DE44, which gives us an offset of 0x7121000. In IDA, the address of the call r12 instruction is 0x10006E234. We can add the offset to that address, set a breakpoint accordingly with b -a 0x10718f234, and continue.
When we hit the target instruction, we can print out the contents of register r12:
All we have to do is subtract the offset from this address and, lo and behold, we get the function's address in IDA: 0x100CC2E20.
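The address arithmetic used in the last few steps can be written down as a few small helpers. This is only a sketch: the function names are hypothetical, and it assumes, as in my session, that the binary is loaded at a higher address than the one IDA shows.

```c
#include <assert.h>
#include <stdint.h>

/* Offset = where an instruction is in the running process minus where IDA
 * shows it. Both addresses must refer to the same instruction. */
uint64_t compute_slide(uint64_t runtime_addr, uint64_t ida_addr) {
    assert(runtime_addr >= ida_addr);  /* assumption: loaded above the IDA address */
    return runtime_addr - ida_addr;
}

/* IDA address -> address in the running process (for setting breakpoints). */
uint64_t ida_to_runtime(uint64_t ida_addr, uint64_t slide) {
    return ida_addr + slide;
}

/* Address seen in the debugger (e.g. a register value) -> IDA address. */
uint64_t runtime_to_ida(uint64_t runtime_addr, uint64_t slide) {
    assert(runtime_addr >= slide);
    return runtime_addr - slide;
}
```

With the values from this session, `compute_slide(0x10718ee44, 0x10006DE44)` yields 0x7121000 and `ida_to_runtime(0x10006E234, 0x7121000)` yields 0x10718f234, the breakpoint address. `runtime_to_ida` is the step applied to the value of r12. The offset will generally differ between runs, so it has to be recomputed for each process.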
Now, let’s hook this function:
Add it to the user_hooks array, compile, run, and observe: every time we press F9 or click the next button in the Spotify app, our message is logged.
Now that we have hooked the skip function, I will post the rest of the code, but I will not walk through the rest of the reverse engineering because this article is already long enough.
In short, I also hooked the previous function (a good exercise if you are following along). Then, in both hooks, I first check whether the current song is already at least halfway through. If so, I do nothing and assume I was just bored of the song rather than finding it out of place. Otherwise, on next (F9) I record a skip, and on back (F7) I pop the last skip.
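The halfway rule is simple enough to sketch in a few lines. The function name and the use of double are my own choices for illustration, and treating exactly the midpoint as "halfway through" is an assumption; the only requirement is that position and duration use the same unit.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true when the hooks should act on the key press, i.e. the track
 * was abandoned before its midpoint. position and duration must be in the
 * same unit (seconds, milliseconds, ...). */
bool should_record_skip(double position, double duration) {
    assert(duration > 0.0);
    return position < duration / 2.0;
}
```

For example, skipping 30 seconds into a 200-second track counts as a skip, while skipping at 150 seconds does not.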
I would like to say a few words about how to check whether the current song is halfway through. My original approach was to actually call popen and then run the corresponding AppleScript command, but that doesn't feel right.
I ran class-dump on the Spotify binary and found two classes: SPAppleScriptObjectModel and SPAppleScriptTrack. These classes expose the properties I need: playback position, duration, and track ID. I then hooked the getters for these properties and called them from the next and back hooks (I think swizzling would make more sense, but I couldn't get it to work).
I use a file to track skips. The first line contains the number of skips. On a skip, we increment this counter and write the track ID and timestamp to the file on the line specified by the counter. On the back button, we just decrement the counter. This way, after we press the back button, new skips simply overwrite the entries we backtracked over.
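One way such a file could be laid out is sketched below. This is not the code from my hooks, just a minimal illustration that assumes fixed-width lines so that an entry can be overwritten in place. The function names, the field widths, and the 36-character ID field are all hypothetical.

```c
#include <assert.h>
#include <stdio.h>

#define HEADER_LEN 11  /* "%010d\n": the skip counter on the first line */
#define RECORD_LEN 48  /* "%-36.36s %010ld\n": track ID, space, timestamp */

/* Open the log for reading and writing, creating it if it does not exist. */
static FILE *open_log(const char *path) {
    FILE *f = fopen(path, "r+");
    return f ? f : fopen(path, "w+");
}

/* Read the counter from the first line; an empty file counts as 0. */
static int read_count(FILE *f) {
    int count = 0;
    rewind(f);
    if (fscanf(f, "%d", &count) != 1) count = 0;
    return count;
}

/* Overwrite the counter on the first line. */
static void write_count(FILE *f, int count) {
    rewind(f);
    fprintf(f, "%010d\n", count);
}

/* Write the entry at the slot given by the counter, then increment it.
 * Returns the new counter, or -1 if the file cannot be opened. */
int record_skip(const char *path, const char *track_id, long timestamp) {
    assert(track_id != NULL);
    FILE *f = open_log(path);
    if (!f) return -1;
    int count = read_count(f);
    fseek(f, HEADER_LEN + (long)count * RECORD_LEN, SEEK_SET);
    fprintf(f, "%-36.36s %010ld\n", track_id, timestamp);
    write_count(f, count + 1);
    fclose(f);
    return count + 1;
}

/* Back button: decrement the counter so the next skip overwrites the
 * last entry. Returns the new counter, or -1 if the file cannot be opened. */
int undo_skip(const char *path) {
    FILE *f = open_log(path);
    if (!f) return -1;
    int count = read_count(f);
    if (count > 0) count--;
    write_count(f, count);
    fclose(f);
    return count;
}
```

Because the counter, not the file length, defines which entries are valid, anything past the counter is stale data and should be ignored when the file is read back for training.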