Reverse engineering the Vuze XR camera

15-3-2020

The Humaneyes Vuze XR is a 2D360 and 3D180 camera in one. It can be controlled over Wi-Fi through an app from iOS or Android. It allows live streaming to Facebook and YouTube (although it seems the iOS app is pending some verification at YouTube) and allows streaming to arbitrary RTMP services from Android.

Here, we look at the camera’s hardware and software to figure out if it is possible to obtain a live camera feed from it. Up to now, I have not been successful; however, there are some interesting areas to further explore:

  • Finding out how to use the already running RTMP server on port 1935 to provide me with live video. It appears all that I would need is the correct stream identifier.
  • Modifying the firmware, e.g. to enable a remote shell (this should be fairly easy as the functionality is already in there but disabled; see the remote execution script on port 2211).
  • Reverse engineering the WebSocket/Protobuf protocol used between the iOS app and the camera. This appears to be plaintext and allows access to the video feed.

Hardware

  • The device appears to be based on the Ambarella H2 platform. This SoC is designed for video applications and contains a quad-core ARM Cortex-A53 (AArch64), codecs (H.264/H.265) and an ISP (image signal processor).

Software

  • The Ambarella H2 is advertised with a “Fast Boot ThreadX / Linux Dual OS”.
  • The device’s firmware can be downloaded here. Binwalk reveals that this is likely a firmware image containing multiple partitions, each with what appear to be cryptographic certificates (this means changing the firmware is likely not going to be easy).
  • Extracting with the amba_fwpak script from dji-firmware-tools reveals this is likely an Ambarella standard firmware file:
$ dji-firmware-tools/amba_fwpak.py -s -vv -m ./firmwareVXR.bin 
 ./firmwareVXR.bin: Opening for search
 ./firmwareVXR.bin: Extracting entry  0, pos     6144, len 11817984 bytes
 ./firmwareVXR.bin: Entry  0 checksum 03E5F1B7
 ./firmwareVXR.bin: Extracting entry  1, pos 11824384, len  6438912 bytes
 ./firmwareVXR.bin: Entry  1 checksum A7CE918A
 ./firmwareVXR.bin: Extracting entry  2, pos 18263552, len  8777728 bytes
 ./firmwareVXR.bin: Entry  2 checksum 5548B611
 ./firmwareVXR.bin: Extracting entry  3, pos 27041536, len  6386312 bytes
 ./firmwareVXR.bin: Entry  3 checksum A10973BB
 ./firmwareVXR.bin: Extracting entry  4, pos 33428104, len 14610432 bytes
 ./firmwareVXR.bin: Entry  4 checksum B37C192B
 ./firmwareVXR.bin: Extracting entry  5, pos 48038792, len    21587 bytes
 ./firmwareVXR.bin: Entry  5 checksum 3102C068
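
The offsets and lengths in this log can be checked for consistency. The following is a minimal sketch that parses the log lines above and computes the number of bytes between the end of one entry and the start of the next. The helper names (parse_entries, gaps_between, slice_entry) are mine and are not part of dji-firmware-tools.

```python
import re

# Log lines copied from the amba_fwpak output above.
LOG = """\
Extracting entry  0, pos     6144, len 11817984 bytes
Extracting entry  1, pos 11824384, len  6438912 bytes
Extracting entry  2, pos 18263552, len  8777728 bytes
Extracting entry  3, pos 27041536, len  6386312 bytes
Extracting entry  4, pos 33428104, len 14610432 bytes
Extracting entry  5, pos 48038792, len    21587 bytes
"""

ENTRY_RE = re.compile(r"entry\s+(\d+), pos\s+(\d+), len\s+(\d+) bytes")


def parse_entries(log):
    """Return (index, pos, length) tuples for every 'Extracting entry' line."""
    return [tuple(int(g) for g in m.groups()) for m in ENTRY_RE.finditer(log)]


def gaps_between(entries):
    """Bytes between the end of each entry and the start of the next one."""
    return [nxt[1] - (cur[1] + cur[2]) for cur, nxt in zip(entries, entries[1:])]


def slice_entry(image, pos, length):
    """Cut one entry out of a firmware image that is held in memory as bytes."""
    return image[pos:pos + length]


entries = parse_entries(LOG)
print(gaps_between(entries))  # -> [256, 256, 256, 256, 256]
```

Every entry is followed by exactly 256 bytes before the next one starts. This would be consistent with a fixed-size header in front of each entry, but that is an assumption on my part; I have not checked it against the file format.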

Looking at the extracted firmware contents:

  • Part 0: contains libjpeg, audio things, and some references to “HEVM RTOS” (the “HE” in “HEVM” appears to refer to “Humaneyes”). This could very well be the ThreadX part of the software.
  • Part 1: contains references to “orccode” and “ucode”, which makes this very likely to be firmware for the ISP.
  • Part 2: an 8MB blob containing references to an audio file and “Rhonda Software”. Looking at the string contents this appears to contain things related to testing/calibrating the device at manufacturing.
  • Part 3: appears to be Linux kernel files.
    • A Linux make config file
    • Two CPIO archives (init ram disks?), one of which is XZ compressed
    • A Linux kernel for aarch64, identified as Linux kernel version “4.4.13 (root@atrbuildl11.rhonda.vtc.ru) (gcc version 7.2.1 20171011 (Linaro GCC 7.2-2017.11) ) #1 SMP PREEMPT Tue Aug 13 17:51:1”.
  • Part 4: a SquashFS file system
  • Part 5: A ‘device tree blob’, a file indicating to a Linux kernel how a device’s hardware is laid out.
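
Several of these parts can be recognised from well-known magic bytes. The sketch below is a minimal illustration of that idea, not the method binwalk uses internally. The magic values are the standard ones for each format; the function names and the file path in the usage note are hypothetical.

```python
# Well-known magic numbers found at the start of each format.
MAGICS = [
    (b"hsqs", "SquashFS (little-endian)"),
    (b"\xd0\x0d\xfe\xed", "flattened device tree blob"),
    (b"\xfd7zXZ\x00", "XZ stream"),
    (b"070701", "CPIO archive (newc)"),
    (b"\x1f\x8b", "gzip stream"),
]


def identify(blob):
    """Name the format of a blob based on its leading magic bytes."""
    for magic, name in MAGICS:
        if blob.startswith(magic):
            return name
    return "unknown"


def find_all(blob, magic):
    """Offsets of every occurrence of magic, for parts that embed several files."""
    offsets, start = [], blob.find(magic)
    while start != -1:
        offsets.append(start)
        start = blob.find(magic, start + 1)
    return offsets
```

For example, identify(open("part4.bin", "rb").read(16)) should report SquashFS for Part 4 (the file name here is a placeholder for whatever the extraction tool writes out), and find_all on Part 3 with the CPIO or XZ magic gives candidate offsets for the embedded archives. A short magic such as the gzip one produces false positives when searched for inside a larger blob, so hits need to be checked by hand.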

Looking at the SquashFS file system, a few interesting things can be found:

  • The hostname of the machine is “H2” and the Linux installation is based on Buildroot.
  • In /etc/init.d, several services are started in the following order:
    • syslog (Busybox)
    • mdev (Busybox). The mdev.conf file does not contain anything out of the ordinary.
    • Kernel modules are loaded from /etc/modules.d. The loaded modules are likely related to the camera hardware (rh_sockets, v4l2loopback, rh_v4l2).
    • Random seed is loaded
    • A script mounts ‘ambafs’ file systems FL0, FL1 and SD0.
    • dbus
    • network (dhcp server and loopback are configured)
    • inetd. The /etc/inetd.conf file contains two services:
      • a (Busybox) telnetd service on port 23 with no further special configuration.
      • The ‘/bin/tx-rexec’ script on 127.0.0.1:2211. This script simply executes any command it receives as a shell command (!). It might be used as a quick way to execute things as root (but as everything appears to run as root anyway, it is more likely called from the ThreadX part of the system).
    • nginx. This provides a web server at port 81 which allows access to the DCIM folder on the SD card, allowing (probably) the app to download videos and pictures.
    • Some “rh_conn_app”. This daemon appears to set up a WebSocket server on port 9001 and appears to talk to ThreadX (likely to configure the camera).
    • dnsmasq
    • rygel (a DLNA server). DLNA can be started by holding down certain keys on the device.
  • The /opt folder contains scripts to start/stop certain functionality (streaming, dlna, wifi).
  • It appears the ThreadX part is responsible for the basic device functionality (controlling the camera, responding to button presses, and so on). It is very likely the ThreadX part is controlled from Linux over a socket (through the rh_conn_app) and that ThreadX controls Linux by calling the /opt shell scripts (e.g. to start DLNA in response to a button press).
  • Below we will see that an RTMP server is running. This service is not started by the Linux init scripts but can be started from the /opt/streamer.sh or /opt/playback.sh scripts. It is likely that the service is started after a call from the ThreadX firmware.

Protocol

Nmap

  • When connecting over Wi-Fi the device hands out IP addresses in the 192.168.42.x range. The device itself can be found at 192.168.42.1.
  • Running nmap shows a few open ports:
PORT     STATE SERVICE
 23/tcp   open  telnet
 53/tcp   open  domain
 81/tcp   open  hosts2-ns
 1935/tcp open  rtmp
 9001/tcp open  tor-orport 
  • The DNS server is likely necessary to tell devices that connect to the device over Wi-Fi that they in fact cannot reach the internet through this device.
  • The telnet service accepts connections but immediately disconnects.
  • On port 81, the nginx server is running.
  • On port 1935, an RTMP server is running. VLC is able to connect to it but cannot obtain any video, presumably because the required stream name is unknown.
  • Port 9001 appears to be running a WebSocket server based on libwebsockets.
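
To confirm that port 1935 really speaks RTMP without involving VLC, the plain (unencrypted) RTMP handshake can be performed by hand. Below is a minimal sketch, assuming the server implements the simple handshake from the public RTMP specification: the client sends C0 (one version byte, 3) and C1 (1536 bytes: a 4-byte timestamp, 4 zero bytes and random padding), the server answers with S0, S1 and S2, and the client finishes by echoing S1 as C2. The function names are mine. Nothing here discovers the stream name; that is still the missing piece.

```python
import os
import socket
import struct

RTMP_VERSION = 3
HANDSHAKE_SIZE = 1536


def build_c0_c1(timestamp=0):
    """C0 (version byte) followed by C1 (timestamp, zero field, random padding)."""
    c0 = bytes([RTMP_VERSION])
    c1 = struct.pack(">II", timestamp, 0) + os.urandom(HANDSHAKE_SIZE - 8)
    return c0 + c1


def recv_exact(sock, n):
    """Read exactly n bytes from the socket or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed during handshake")
        buf += chunk
    return bytes(buf)


def rtmp_handshake(host, port=1935, timeout=3.0):
    """Run the simple handshake and return the version byte the server sent in S0."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_c0_c1())
        s0 = recv_exact(sock, 1)
        s1 = recv_exact(sock, HANDSHAKE_SIZE)
        recv_exact(sock, HANDSHAKE_SIZE)  # S2, the server's echo of our C1
        sock.sendall(s1)                  # C2 is an echo of S1
        return s0[0]
```

Calling rtmp_handshake("192.168.42.1") while connected to the camera's Wi-Fi should return 3 if the service is a regular RTMP server. Anything else, or a timeout, would suggest a non-standard implementation.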

App communication

Using ‘rvictl -s UDID’, a virtual interface can be created through which Wireshark can capture the traffic between an iPhone and the device. This reveals that the app is talking to port 3000 (this is a bit strange, as nmap only sees an open port 9001). It is very likely that the other end is the rh_conn_app. The communication appears to be based on Protobuf RPC calls. At some point the app requests to start streaming, and the stream is sent over the connection itself (i.e. the RTMP service is not used).
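
Without the .proto definitions, captured messages can still be split into fields, because the Protobuf wire format is self-describing at the top level: each field starts with a varint key holding the field number and a wire type. The following is a minimal sketch of such a schema-less decoder. It assumes the Protobuf payload has already been extracted from the capture (if the transport is WebSocket, client-to-server frames are masked and need unmasking first). The function names are mine, and the sample bytes are the standard example from the Protobuf encoding documentation, not data from the camera.

```python
def read_varint(buf, pos):
    """Decode one base-128 varint starting at pos; return (value, new_pos)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def decode_fields(buf):
    """Split a protobuf message into (field_number, wire_type, value) tuples."""
    fields, pos = [], 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:      # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:    # 64-bit fixed
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:    # length-delimited (bytes, string, nested message)
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:    # 32-bit fixed
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        fields.append((number, wire_type, value))
    return fields


sample = bytes.fromhex("089601120774657374696e67")
print(decode_fields(sample))  # -> [(1, 0, 150), (2, 2, b'testing')]
```

Length-delimited values may themselves be nested messages, so running decode_fields again on such a value is a quick way to test whether it parses as one. Field names and types cannot be recovered this way; only numbers and raw values.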