An Overview of On-going Projects

IIS: Collaborative Research: Harnessing Big Data for Improving Career Mobility

EAGER: Collaborative Research: Substructure-aware Spatiotemporal Representation Learning

IIS: A Multi-source Data Driven Optimization Framework for Interconnected Express Delivery System Design and Inventory Rebalance

Enhancing the Capacity for Information Assurance Education Through Interdisciplinary Collaboration

MILAN: Multi-Modal Passive Intrusion Learning in Pervasive Wireless Environments

Financial Fraud Detection with Data Mining Techniques

Recent years have witnessed increased interest in financial fraud detection and prevention. This is driven by the ever-worsening financial crisis and an increased awareness of the importance of financial risk management. Indeed, financial losses due to fraudulent financial statements are substantial. A number of high-profile companies, such as Enron, Lucent, Xerox, and WorldCom, were charged with fraud by the U.S. Securities and Exchange Commission (SEC). It is critical to develop an effective and efficient financial fraud detection framework in the best interest of investors, auditors, regulators, and governments.

The wide availability of fine-grained financial data, such as financial statements and stock transactions, creates unprecedented opportunities to change the computing paradigm for financial fraud detection and prevention. However, as these financial data become more detailed and multi-dimensional, it becomes ever more difficult for analysts to sift through them, even though they may contain valuable information. Data mining holds great promise for addressing this challenge by providing efficient techniques to uncover useful information hidden in large data repositories. Along this line, in this project we investigate the characteristics of misstating firms by exploiting their financial statements. The results of this investigation will lead to a set of financial fraud indicators, which will in turn be used to build financial fraud prediction models.

Energy-Efficient Knowledge Discovery in Location Traces

The increasing availability of large-scale location traces and car sensing data creates unprecedented opportunities to change the paradigm for knowledge discovery in transportation systems. A particularly promising area is the extraction of energy-efficient transportation patterns (green knowledge), which can serve as guidance for reducing inefficiencies in the energy consumption of the transportation sector. However, extracting green knowledge from location traces is not a trivial task. Conventional data analysis tools might not be suitable for handling the massive quantity and the complex, dynamic, and distributed nature of location traces. To that end, in this project we propose to develop an analytical foundation for extracting energy-efficient transportation patterns from location traces. Specifically, we initially focus on the following challenging tasks. First, we will profile driver behaviors according to the driving patterns identified from driving traces. Second, we will find correlations between road topologies and energy use. Third, we will identify frequently used segments of trajectories, with seasonal adjustment. Finally, we will exploit data analysis techniques to identify abnormal traffic discontinuities/gaps.

Customer Service Support with Multi-focal Learning

In this study, we formalize a multi-focal learning problem, where the training data are partitioned into several focal groups and a prediction model is learned within each focal group. The multi-focal learning problem is motivated by numerous real-world learning applications. For instance, for the same type of problem encountered in a customer service center, the problem descriptions from different customers can be quite different. Experienced customers usually give more precise and focused descriptions of the problem. In contrast, inexperienced customers usually provide more diverse descriptions. In this case, examples from the same class in the training data can naturally fall into different focal groups. As a result, it is necessary to identify those natural focal groups and exploit them for learning at different focuses. The key developmental challenge is how to identify those focal groups in the training data. As a case study, we exploit multi-focal learning for profiling problems in customer service centers. The results show that multi-focal learning can significantly boost the learning accuracy of existing learning algorithms, such as Support Vector Machines (SVMs), for classifying customer problems.
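The two steps described above (partition the training data into focal groups, then learn a model inside each group) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the project's actual method: focal groups are found with a simple k-means-style clustering, the per-group learner is a nearest-centroid classifier standing in for an SVM, and the function names (`find_focal_groups`, `fit_multifocal`, `predict`) and the toy two-dimensional data are hypothetical.

```python
import math
from collections import defaultdict


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(col) / len(points) for col in zip(*points))


def nearest(x, centers):
    """Index of the center closest to x (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(x, centers[i]))


def find_focal_groups(points, k, iters=10):
    """Assumed grouping step: k-means with deterministic farthest-point init."""
    centers = [points[0]]
    while len(centers) < k:
        # Pick the point that is farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    for _ in range(iters):
        members = [[] for _ in range(k)]
        for p in points:
            members[nearest(p, centers)].append(p)
        # Keep the old center if a group happens to be empty.
        centers = [mean(m) if m else c for m, c in zip(members, centers)]
    return centers


def fit_multifocal(X, y, k):
    """Learn one nearest-centroid model (a stand-in for an SVM) per focal group."""
    centers = find_focal_groups(X, k)
    grouped = defaultdict(lambda: defaultdict(list))
    for x, label in zip(X, y):
        grouped[nearest(x, centers)][label].append(x)
    local_models = {
        g: {label: mean(pts) for label, pts in classes.items()}
        for g, classes in grouped.items()
    }
    return centers, local_models


def predict(model, x):
    """Route x to its nearest focal group, then classify with that group's model."""
    centers, local_models = model
    g = min(local_models, key=lambda i: math.dist(x, centers[i]))
    prototypes = local_models[g]
    return min(prototypes, key=lambda label: math.dist(x, prototypes[label]))


# Toy data (invented): the same two classes appear in two distant focal groups.
X = [(0.0, 0.0), (0.0, 1.0), (2.0, 0.0), (2.0, 1.0),
     (12.0, 10.0), (12.0, 11.0), (10.0, 10.0), (10.0, 11.0)]
y = ["billing", "billing", "network", "network",
     "billing", "billing", "network", "network"]

model = fit_multifocal(X, y, k=2)
print(predict(model, (0.5, 0.5)), predict(model, (10.2, 10.8)))
```

In this toy data the two classes have the same overall centroid, so a single global nearest-centroid model could not separate them, while the per-group models can. That is only meant to show why learning within focal groups may help; how focal groups are actually identified is the open challenge the text describes.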

An Overview of On-going Projects

IIS: Collaborative Research: Harnessing Big Data for Improving Career Mobility

Sponsored by the National Science Foundation (NSF), 2020 - 2023

http://datamining.rutgers.edu/project/iis_2.htm

EAGER: Collaborative Research: Substructure-aware Spatiotemporal Representation Learning

Sponsored by the National Science Foundation (NSF), 2020 - 2022

http://datamining.rutgers.edu/project/iis_1.htm

IIS: A Multi-source Data Driven Optimization Framework for Interconnected Express Delivery System Design and Inventory Rebalance

Sponsored by the National Science Foundation (NSF), 2018 - 2020

http://datamining.rutgers.edu/project/IIS.htm

Enhancing the Capacity for Information Assurance Education Through Interdisciplinary Collaboration

Sponsored by the National Science Foundation (NSF), 2012 - 2014

http://datamining.rutgers.edu/project/due.htm

MILAN: Multi-Modal Passive Intrusion Learning in Pervasive Wireless Environments

Sponsored by the National Science Foundation (NSF), 2010 - 2013

http://datamining.rutgers.edu/project/milan.htm
