Machine Learning-Based Mobile Threat Monitoring and Detection

Mobile computing is now ubiquitous throughout our society, providing new avenues for communication, productivity, and commerce. Mobile networks are freely accessible throughout public spaces, laptops provide a platform for on-the-go business management, and smartphones and tablets extend our access to information from the moment we wake up in the morning. Yet, as we have seen with the adoption of each new technology, end users are often at significant risk. Malicious intent, combined with knowledge of the underlying technology, provides the means for cyber attacks that compromise personal and business data. The need for dynamic defense systems that analyze and prevent malicious intrusion is therefore self-evident.
To address the pertinent issue of security in mobile technology, in this paper we propose a security system to detect malicious activities on Android OS devices. Our proposed system is designed to operate in a cloud environment, incurs low overhead on the Android device, and supports multiple smartphones simultaneously. The system centers around four primary components: the Android App, the Security Server, the Google Cloud Messaging (GCM) service, and the Analysis Module. The GCM service facilitates message delivery, relaying requests from the security server to the Android app. Data transmitted by the mobile app on multiple devices is collected and stored on the security server for preprocessing. In the analysis module, static and dynamic analysis are performed simultaneously, allowing for rapid inspection of common attributes in Android malware, while complex algorithms are applied in extended examination. Once the analysis is completed, a report can be sent to the device, and a security administrator overseeing the system can view the status of the various devices in the web visualization to improve security awareness and act on security risks.
The remainder of the paper is organized as follows: In Section II, we give the background and provide a literature review on the topics of smart mobile security and cloud computing security. In Section III, we describe the designed system architecture and outline the basic workflow. In Section IV, we describe the data analysis module and process, and present the evaluation results. Finally, we conclude the paper in Section V.

SYSTEM ARCHITECTURE AND WORKFLOW
Our developed security framework is designed to be generic, and can operate as a cloud-based service. The primary components are the Security Server, the Google Cloud Messaging (GCM) service, the Mobile Application, and the Analysis Testbed, as outlined below. In combination, they provide the scaffolding for the interconnection of the mobile device to a powerful analysis testbed.
• Security Server: The security hub is a typical LAMP (Linux, Apache, MySQL, PHP) server. Specifically, the Linux operating system is Ubuntu 14.04 server, running Apache2, MySQL 5.5 and PHP-5. The server is managed by the web application programmed in PHP, implementing the Laravel 5 framework, and the requisite dependencies. The web application utilizes the MySQL relational database model to store and manage smartphone system information, and application and log data, received from connected Android devices. It also provides the interface for security visualization for the security operator.
• GCM: Google Cloud Messaging is a cloud-based messaging service provided by Google for developing applications compatible with Android, iOS, and Chrome. The primary feature of the GCM is to provide an authenticated project message host that queues messages while the device is not connected, and supports upstream and downstream messaging.
• Mobile App: The mobile application is developed for Android OS devices. While operating, the mobile application is designed to listen for GCM messages and send system, application, and log data to the security server upon request for security analysis.
• Data Analysis: The Data Analysis module utilizes Weka software [17] to analyze the test dataset comprised of dynamically obtained Android system calls and static permission information of malicious and benign applications. From the training analysis, the module can make predictive assertions about new applications based on their attributes.
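To make the input of the Data Analysis module more concrete, the following minimal sketch shows one way static permission flags and dynamic system call counts could be combined into a single labeled dataset in ARFF, the text format read by Weka. The function name `to_arff`, the attribute naming scheme, the chosen permissions and system calls, and the two sample records are all assumptions made for illustration; the paper does not specify its feature encoding.

```python
def to_arff(relation, permissions, syscalls, samples):
    """Render labeled app samples as ARFF text (the file format Weka reads)."""
    lines = ["@relation " + relation, ""]
    # Static features: one binary attribute per permission of interest.
    for perm in permissions:
        lines.append("@attribute perm_%s {0,1}" % perm)
    # Dynamic features: one numeric attribute per system call (invocation count).
    for call in syscalls:
        lines.append("@attribute sys_%s numeric" % call)
    # Class label used for training.
    lines.append("@attribute class {benign,malicious}")
    lines.extend(["", "@data"])
    for sample in samples:
        row = ["1" if perm in sample["permissions"] else "0" for perm in permissions]
        row += [str(sample["syscalls"].get(call, 0)) for call in syscalls]
        row.append(sample["label"])
        lines.append(",".join(row))
    return "\n".join(lines) + "\n"


# Hypothetical samples; real input would come from the preprocessed device data.
samples = [
    {"permissions": {"SEND_SMS", "INTERNET"},
     "syscalls": {"open": 12, "ioctl": 40},
     "label": "malicious"},
    {"permissions": {"INTERNET"},
     "syscalls": {"open": 3, "read": 9},
     "label": "benign"},
]
arff_text = to_arff("android_apps", ["INTERNET", "SEND_SMS"],
                    ["open", "read", "ioctl"], samples)
```

A file written from `arff_text` could then be loaded into Weka for training; which classifier is applied is outside the scope of this sketch.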
The workflow, shown in Figure 1, illustrates the typical interaction between the system components. The two time-dependent system operations are Startup, which runs when the application is launched, and Daily updates, which identify system changes. The daily updates can additionally be initiated from the visualization in the security hub, at the discretion of a security administrator.
Startup – (1) Upon initialization, the Android application contacts the GCM server to retrieve the registration token. This supports both the initialization of new devices and situations where the registration ID has been refreshed. After (2) retrieving the registration token, (3) the application contacts the web server and passes three key values: the GCM registration token, the device brand, and the device serial. The application server then queries the database for the target data. If the information matches, no further action is taken. If the GCM registration token has changed, it is updated in the database. If the device identifying information is not found, the device is immediately added to the database, and (4) the server messages the GCM server, requesting additional system information from the device. (5) The GCM server passes the message to the device, and (6) the device passes the requested data to the web server to be added to the newly created database entry.
Daily – (7) Independently, the web server messages the GCM server daily, requesting application data for analysis. (8) The GCM server passes along the request when the device is connected. (9) The device then transmits the requested data to the web server for analysis. The received device information is stored in the database, preprocessed, and (10) transmitted to the analysis module. The analysis module then operates on the data and determines the risks, if any. The module composes a report that is (11) returned to the web server.
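The server-side decision in step (3) of the Startup sequence can be sketched as follows. This is an illustrative sketch only: the actual server is a PHP/Laravel application backed by MySQL, whereas this example uses Python's built-in `sqlite3` module as a stand-in, and the table layout, the function name `register_device`, and the returned status strings are assumptions rather than the authors' implementation.

```python
import sqlite3

# Hypothetical schema: a device is identified by its brand and serial.
SCHEMA = """
CREATE TABLE IF NOT EXISTS devices (
    brand     TEXT NOT NULL,
    serial    TEXT NOT NULL,
    gcm_token TEXT NOT NULL,
    PRIMARY KEY (brand, serial)
)
"""


def register_device(conn, gcm_token, brand, serial):
    """Handle a startup request and report which of the three cases applied."""
    row = conn.execute(
        "SELECT gcm_token FROM devices WHERE brand = ? AND serial = ?",
        (brand, serial),
    ).fetchone()
    if row is None:
        # Unknown device: add it. The caller would then ask GCM to request
        # additional system information from the device (steps 4 to 6).
        conn.execute(
            "INSERT INTO devices (brand, serial, gcm_token) VALUES (?, ?, ?)",
            (brand, serial, gcm_token),
        )
        conn.commit()
        return "created"
    if row[0] != gcm_token:
        # Known device whose registration token was refreshed: update it.
        conn.execute(
            "UPDATE devices SET gcm_token = ? WHERE brand = ? AND serial = ?",
            (gcm_token, brand, serial),
        )
        conn.commit()
        return "token_updated"
    # Everything matches: no further action is taken.
    return "unchanged"
```

For example, after `conn = sqlite3.connect(":memory:")` and `conn.execute(SCHEMA)`, the first call for a given brand and serial creates the entry, a repeated call with the same token changes nothing, and a call with a new token updates the stored value.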
This report is stored in the database for review, and copies are transmitted to the security administrator and (12) to the GCM server. Finally, the GCM server (13) delivers the report to the device.
In summary, once a device has been registered, the security server, running in the cloud, sends daily messages to the GCM server. The GCM server queues the messages and transmits the requests to the mobile device. The mobile app, listening for GCM messages, processes the requests and responds to the server directly. Once the requested data is received by the server, it updates the database and triggers the analysis module. The module reduces the data and determines the status of the mobile device. If the device has been compromised, a notification is sent to both the security administrator and the mobile device.
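The store-and-forward behavior that the workflow relies on, in which requests are held while a device is offline and released once it connects, can be modeled with a small in-memory sketch. This is a toy model and not the GCM API: the class `MessageRelay`, its methods, and the message strings are hypothetical names introduced only to illustrate the queuing behavior.

```python
from collections import defaultdict, deque


class MessageRelay:
    """Toy model of a relay that queues messages for disconnected devices."""

    def __init__(self):
        self._pending = defaultdict(deque)   # token -> messages awaiting delivery
        self._online = set()                 # tokens of currently connected devices
        self.delivered = defaultdict(list)   # token -> messages handed to the device

    def send(self, token, message):
        """Deliver immediately if the device is connected, otherwise queue."""
        if token in self._online:
            self.delivered[token].append(message)
        else:
            self._pending[token].append(message)

    def connect(self, token):
        """Mark the device as connected and flush its queue in arrival order."""
        self._online.add(token)
        queued = self._pending.pop(token, deque())
        self.delivered[token].extend(queued)

    def disconnect(self, token):
        """Mark the device as disconnected; later messages will be queued."""
        self._online.discard(token)
```

In this model, a daily request sent while the device is offline stays in the queue and reaches the device only after `connect` is called, which mirrors steps (7) and (8) of the workflow.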