The mobile threat landscape continues to evolve quickly, shifting from simple credential harvesting toward complete device subversion. The Zimperium zLabs research team recently identified a highly invasive, multi-stage Android banking trojan called Rokarolla, named after its underlying Command and Control (C2) infrastructure. The malware is engineered not just to steal data but to gain full administrative and operational control over infected devices. Rokarolla is distributed primarily via malicious websites, such as hxxps[://]infocontablidades[.]it[.]com/, where it uses social engineering to masquerade as mainstream applications like TikTok or Google Chrome. Once active, the trojan targets 217 distinct cryptocurrency and financial applications and supports 137 unique commands, which it uses to commit financial fraud without the victim noticing.
The Infection Lifecycle: Dropper Mechanics and Permission Escalation
Typical structural workflow of Accessibility Service exploitation by mobile malware. Source: Guardsquare
The compromise begins with a multi-stage infection chain designed to bypass modern Android security configurations. Initially, a user unwittingly downloads a malicious dropper application. This primary stage acts as a delivery vehicle, tricking the victim into executing a secondary installation under the guise of an urgent Google Play Protect update. By decoupling the initial delivery from the core malicious execution block, the malware effectively sidesteps automated gatekeeping and static analysis tools.
Once the secondary payload is installed, it immediately targets Android’s Accessibility Services. This framework, originally intended to assist users with disabilities, is abused by Rokarolla to programmatically parse on-screen User Interface (UI) nodes, map visual coordinates, and simulate touch events. To seal its control over device communications, the malware aggressively requests specialized system privileges, explicitly forcing the user to grant it the default SMS handler and default Call handler roles, alongside notification listener permissions.


Network Telemetry and Dynamic C2 Infrastructure Resilience
Upon completing its local privilege escalation, Rokarolla establishes an encrypted HTTPS communication channel with its C2 server to transmit a comprehensive JSON-formatted telemetry payload. This device fingerprinting packet gives the threat actors deep visibility into the host’s operating environment. The transmitted data includes the unique botId, exact appVersion, hardware identifier (hardware: "ranchu", the value reported by the Android emulator), androidVersion (tested up to SDK 32/Android 12), time zone, battery level, display density, and exact RAM and storage availability (e.g., total vs. free megabytes). This precise profiling enables attackers to determine whether the malware is running within an emulator sandbox or on an actual victim’s device. To safeguard its communication lines against network-level blocks, domain revocation, or security sinkholes, Rokarolla supports remote reconfiguration. Through commands like update_config_domen, the C2 server can push a live array of backup domains, such as beralisvc.info, blestorians.cfd, abiorime.cfd, and morevoms.cfd, maintaining operational persistence even under active incident response.
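For defenders, the rotating domain list above translates directly into network indicators. The following is a minimal Python sketch of an indicator matcher, assuming you have proxy or DNS log entries to check. The function names (refang, host_of, matches_ioc) are illustrative and not taken from any vendor tooling, and the domain set contains only the hosts named in this article.

```python
from urllib.parse import urlsplit

# Backup C2 domains named in this article. A real blocklist would be fed
# from a maintained threat-intelligence source, not hard-coded.
ROKAROLLA_DOMAINS = {
    "beralisvc.info",
    "blestorians.cfd",
    "abiorime.cfd",
    "morevoms.cfd",
}


def refang(indicator: str) -> str:
    """Turn a defanged indicator (hxxp, [://], [.]) back into a parseable string."""
    return (
        indicator.replace("hxxp", "http")
        .replace("[://]", "://")
        .replace("[.]", ".")
    )


def host_of(value: str) -> str:
    """Extract a lowercase hostname from a URL or a bare host string."""
    value = refang(value.strip())
    if "://" not in value:
        value = "//" + value  # lets urlsplit treat a bare host as the netloc
    try:
        host = urlsplit(value).hostname or ""
    except ValueError:
        return ""  # malformed input, e.g. stray brackets
    return host.rstrip(".").lower()


def matches_ioc(value: str, iocs=ROKAROLLA_DOMAINS) -> bool:
    """True if the host equals a listed domain or is a subdomain of one."""
    host = host_of(value)
    return any(host == d or host.endswith("." + d) for d in iocs)
```

Matching on the exact domain or a dot-prefixed suffix avoids false positives on unrelated hosts that merely end with the same characters. Because the operators can push new domains at any time, a static list like this is only a starting point.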
Exploiting UI Overlays for Credential Theft and Background Concealment
A core capability of Rokarolla is its highly effective overlay framework, which allows it to draw arbitrary visual windows directly on top of legitimate applications. The malware exploits this to harvest the host’s master unlock credentials—including PINs, unlock patterns, and alphanumeric passwords. By rendering a fake lock screen overlay that perfectly mimics the native Android system interface, the malware intercepts the user’s keystrokes, instantly exfiltrating the credentials back to the C2 server. This lets the threat actors maintain persistent interactive access, even if the device is locked. Beyond credential theft, overlays are used tactically to mask malicious background actions. During unauthorized processing or deep background data extraction, the trojan triggers commands like <liveoverlay16> or <show_loading_overlay> to display full-screen “installing update” animations. This completely suppresses user interaction and visually blinds the victim to the malicious tasks executing beneath the surface.
Financial Target Mapping and Automated SQLite Injection
The targeted theft of banking and cryptocurrency assets relies on a highly automated web injection pipeline. Rokarolla continuously communicates with a specialized server-side endpoint labeled <monitored_app_full> to pull an active list of target package names. Each application entry in the server’s payload contains a specific package name, a target phishing URL, and a binary status value. A status value of 0 means the application is in passive monitoring mode, while a value of 1 immediately activates the overlay injection engine.
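An analyst who captures one of these target-list responses may want to separate actively injected applications from passively monitored ones. The sketch below is a rough illustration in Python. The JSON field names ("package", "url", "status") are assumptions made for the example, since the actual wire format is not documented here, and the sample values in the usage note are placeholders.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetEntry:
    package: str       # targeted application package name
    phishing_url: str  # overlay page served for that package
    status: int        # 0 = passive monitoring, 1 = overlay injection active

    @property
    def active(self) -> bool:
        return self.status == 1


def parse_targets(raw: str) -> list[TargetEntry]:
    """Parse a captured target list. Field names are hypothetical."""
    entries = []
    for item in json.loads(raw):
        status = int(item["status"])
        if status not in (0, 1):
            raise ValueError(f"unexpected status value: {status}")
        entries.append(TargetEntry(str(item["package"]), str(item["url"]), status))
    return entries


def split_by_mode(entries: list[TargetEntry]) -> tuple[list[str], list[str]]:
    """Return (actively injected packages, passively monitored packages)."""
    active = [e.package for e in entries if e.active]
    passive = [e.package for e in entries if not e.active]
    return active, passive
```

Given a captured payload with one entry at status 1 and another at status 0, split_by_mode returns the first package in the active list and the second in the passive list, which helps prioritize which institutions to notify first.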
The malware inventories the applications installed on the device using requests such as <get_html_mapping> and <save_apps>. If an installed application matches an active target (such as Imagin bank), the trojan silently fetches the corresponding phishing payload from the C2 server and caches it inside a local SQLite database. When the victim attempts to open the genuine financial app, Rokarolla detects the launch event via Accessibility Services, intercepts the window focus, and instantly draws the locally stored HTML phishing interface over the legitimate application to steal credit card details and login credentials.
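Because the phishing pages are cached locally, a SQLite database recovered from the malware's data directory is a useful forensic artifact. The schema is not described in this article, so the following Python sketch makes no assumption about table or column names and simply scans every table for values that look like HTML. The function name and the marker strings are illustrative choices.

```python
import sqlite3


def find_cached_html(conn: sqlite3.Connection,
                     markers=("<html", "<form", "<!doctype")):
    """Return sorted (table, column) pairs whose values contain HTML-like markers."""
    hits = set()
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
        )
    ]
    for table in tables:
        quoted = '"' + table.replace('"', '""') + '"'  # quote the identifier safely
        cursor = conn.execute(f"SELECT * FROM {quoted}")
        columns = [d[0] for d in cursor.description]
        for row in cursor:
            for column, value in zip(columns, row):
                if isinstance(value, bytes):
                    value = value.decode("utf-8", errors="ignore")
                if isinstance(value, str) and any(m in value.lower() for m in markers):
                    hits.add((table, column))
    return sorted(hits)
```

Run against a copy of the recovered database, opened read-only where possible, this points the analyst at the columns worth exporting for closer inspection of the phishing templates.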

Advanced Data Exfiltration: Keylogging, Crypto-Swapping, and Pseudo-VNC
For broader data collection, Rokarolla implements deep content scanning and active data manipulation. By monitoring active on-screen UI nodes, the malware scans for specific terms tied to popular communications platforms, such as WhatsApp identifiers like ‘Chats’, ‘Calls’, and ‘New group’. This lets the malware isolate, categorize, and extract specific contacts via the <get_contact> command. Simultaneously, it runs background utilities like <start_keylogger>, <startuilogger>, and <textextract> to capture raw keystrokes and textual outputs across all apps. It also features clipboard manipulation capabilities, allowing it to silently modify data copied to the system clipboard without user knowledge. This is heavily exploited during cryptocurrency transactions to swap out a legitimate recipient’s wallet address with an attacker-controlled address, routing funds directly to the thieves.
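Wallet applications can defend against the address swap described above by comparing what the user copied with what arrives in the recipient field. The Python sketch below illustrates the idea. The address patterns are deliberately loose approximations of Ethereum-style and Bitcoin-style formats, not full validators, and the function names are hypothetical.

```python
import re

# Loose shape checks for illustration only; they do not verify checksums.
ADDRESS_PATTERNS = [
    re.compile(r"0x[0-9a-fA-F]{40}"),                    # Ethereum-style
    re.compile(r"bc1[0-9a-z]{11,71}"),                   # Bitcoin bech32-style
    re.compile(r"[13][1-9A-HJ-NP-Za-km-z]{25,34}"),      # Bitcoin legacy-style
]


def looks_like_wallet_address(text: str) -> bool:
    candidate = text.strip()
    return any(p.fullmatch(candidate) for p in ADDRESS_PATTERNS)


def _normalize(address: str) -> str:
    address = address.strip()
    # Hex addresses are case-insensitive apart from their optional checksum casing.
    return address.lower() if address.lower().startswith("0x") else address


def clipboard_was_swapped(copied: str, pasted: str) -> bool:
    """True when both values look like wallet addresses but differ."""
    return (
        looks_like_wallet_address(copied)
        and looks_like_wallet_address(pasted)
        and _normalize(copied) != _normalize(pasted)
    )
```

A positive result should block the transaction and ask the user to re-verify the recipient rather than silently correcting the value, since the app cannot know which address was intended.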
Furthermore, instead of relying on the standard Android MediaProjection API for video streaming—which frequently triggers explicit user permission prompts and visible recording icons—Rokarolla introduces a unique “Pseudo-VNC” surveillance mechanism. The malware programmatically captures localized screenshots of the screen, compresses the raw image data into standard PNG files, appends precise millisecond timestamps, and exfiltrates them silently. Immediately following each transfer, the trojan runs a strict cleanup and memory reset routine, ensuring stability and erasing its local forensic footprint before the next capture cycle.
Communication Hijacking and Absolute Evasion Tactics
To prevent financial institutions from notifying victims of anomalous account activity, Rokarolla hijacks the device’s communication channels. Because it holds default calling and SMS application roles, it can execute commands like <disable_calls> and <calls_block> to intercept inbound phone calls and mute live alerts from fraud detection centers. It can also read, suppress, and send SMS messages on behalf of the user, facilitating the silent interception and deletion of critical One-Time Passwords (OTPs) and Two-Factor Authentication (2FA) tokens.
To maintain persistence, the trojan actively cripples the device’s built-in defenses by targeting Google Play Protect using commands such as <disable_google_play> and <protectorgoogle_disable>. Finally, it hides its launcher icon from the application drawer to hinder uninstallation, mutes all system audio and haptic feedback to mask unauthorized activity, and forces the display backlight to remain on indefinitely so that network connections and background overlay routines are never interrupted by system sleep cycles.
Our Technical Perspective and Strategic Opinion on the Rokarolla Variant
The discovery of the Rokarolla banking trojan underscores a shifting paradigm in mobile malware design, moving away from simple data collection and toward comprehensive, automated device subversion. In our opinion, the most alarming aspect of Rokarolla is not just its multi-stage injection framework, but its highly tactical avoidance of standard Android APIs. By bypassing the MediaProjection API in favor of a customized, snapshot-based Pseudo-VNC surveillance engine, the developers of Rokarolla have highlighted a critical blind spot in standard runtime behavioral monitoring.
Furthermore, this case illustrates that relying solely on user-perceived permission boundaries is no longer enough to secure mobile operating systems. Once a user is socially engineered into enabling Accessibility Services, the traditional Android application sandbox is effectively compromised. Rokarolla’s ability to silently load tailored HTML phishing templates into an internal SQLite database and swap them dynamically based on active package names shows a high level of operational maturity.
Defending against threats like Rokarolla requires moving beyond legacy signature-based detection and shifting toward continuous, behavioral AI-driven analysis. Security systems must look for foundational anomalies—such as an application requesting default communication roles while simultaneously muting system audio and disabling device protections. Ultimately, Rokarolla serves as a stark reminder that as mobile OS security tightens, threat actors will continue to weaponize legitimate accessibility and administrative features, turning the system’s own design principles against the user.
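To make the idea of correlating these anomalies concrete, here is a minimal Python sketch of a rule-based risk score over signals mentioned in this article. The weights and the threshold are arbitrary values chosen for illustration, and the AppBehavior structure is hypothetical. A production system would derive such signals from device telemetry and tune them against real data.

```python
from dataclasses import dataclass


@dataclass
class AppBehavior:
    accessibility_service: bool = False
    default_sms_role: bool = False
    default_dialer_role: bool = False
    notification_listener: bool = False
    mutes_audio: bool = False
    hides_launcher_icon: bool = False
    tampers_play_protect: bool = False


# Illustrative weights only: each signal can be benign alone, so the
# score rewards combinations rather than any single capability.
WEIGHTS = {
    "accessibility_service": 2,
    "default_sms_role": 2,
    "default_dialer_role": 2,
    "notification_listener": 1,
    "mutes_audio": 1,
    "hides_launcher_icon": 2,
    "tampers_play_protect": 3,
}


def risk_score(behavior: AppBehavior) -> int:
    """Sum the weights of every signal the app exhibits."""
    return sum(w for name, w in WEIGHTS.items() if getattr(behavior, name))


def is_suspicious(behavior: AppBehavior, threshold: int = 6) -> bool:
    return risk_score(behavior) >= threshold
```

Under these example weights, a legitimate messaging app that holds the default SMS role and a notification listener scores 3 and stays below the threshold, while an app exhibiting every signal scores 13 and is flagged.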
