School of Advanced Technology

ADDRESS
School of Advanced Technology
Xi'an Jiaotong-Liverpool University
111 Ren'ai Road, Suzhou Dushu Lake Science and Education Innovation District, Suzhou Industrial Park
Suzhou, Jiangsu Province, P. R. China, 215123
1. Customer churn prediction in the telecommunication sector using a rough set approach

Author:Amin, A;Anwar, S;Adnan, A;Nawaz, M;Alawfi, K;Hussain, A;Huang, KZ

Source:NEUROCOMPUTING,2017,Vol.237

Abstract:Customer churn is a critical and challenging problem affecting business and industry, in particular the rapidly growing, highly competitive telecommunication sector. It is of substantial interest to both academic researchers and industrial practitioners seeking to forecast customer behavior in order to differentiate churn from non-churn customers. The primary motivation is the dire need of businesses to retain existing customers, coupled with the high cost associated with acquiring new ones. A review of the field has revealed a lack of efficient, rule-based Customer Churn Prediction (CCP) approaches in the telecommunication sector. This study proposes an intelligent rule-based decision-making technique, based on rough set theory (RST), to extract important decision rules related to customer churn and non-churn. The proposed approach effectively classifies churn from non-churn customers, along with predicting those customers who will churn or may possibly churn in the near future. Extensive simulation experiments are carried out to evaluate the performance of our proposed RST-based CCP approach using four rule-generation mechanisms, namely the Exhaustive Algorithm (EA), Genetic Algorithm (GA), Covering Algorithm (CA) and the LEM2 algorithm (LA). Empirical results show that RST based on GA is the most efficient technique for extracting implicit knowledge in the form of decision rules from the publicly available benchmark telecom dataset. Further, comparative results demonstrate that our proposed approach offers a globally optimal solution for CCP in the telecom sector when benchmarked against several state-of-the-art methods. Finally, we show how attribute-level analysis can pave the way for developing a successful customer retention policy that could form an indispensable part of the strategic decision-making and planning process in the telecom sector.
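
As a hedged illustration of what a rule-based churn classifier of this kind consumes, the sketch below applies a few hypothetical if-then decision rules (not taken from the paper) to a customer record; a rough-set rule generator such as the GA variant would be what produces such rules.

```python
# Illustrative only: applying if-then decision rules of the kind a rough-set
# rule generator might output. Rules and attribute names are hypothetical.

RULES = [
    # (conditions, decision) - every condition must hold for the rule to fire
    ({"intl_plan": "yes", "day_minutes": "high"}, "churn"),
    ({"service_calls": "high", "voice_plan": "no"}, "churn"),
    ({"service_calls": "low"}, "non-churn"),
]

def classify(customer: dict, default: str = "non-churn") -> str:
    """Return the decision of the first rule whose conditions all match."""
    for conditions, decision in RULES:
        if all(customer.get(attr) == value for attr, value in conditions.items()):
            return decision
    return default

print(classify({"intl_plan": "yes", "day_minutes": "high", "service_calls": "low"}))
# -> churn (first matching rule wins in this simple sketch)
```
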
2. A novel classifier ensemble method with sparsity and diversity

Author:Yin, XC;Huang, KZ;Hao, HW;Iqbal, K;Wang, ZB

Source:NEUROCOMPUTING,2014,Vol.134

Abstract:We consider the classifier ensemble problem in this paper. Due to their superior performance over individual classifiers, classifier ensembles have been intensively studied in the literature. Generally speaking, there are two prevalent research directions: diversely generating classifier components, and sparsely combining multiple classifiers. While most current approaches emphasize either sparsity or diversity only, we investigate the classifier ensemble by learning both sparsity and diversity simultaneously. We formulate the classifier ensemble problem with sparsity and/or diversity learning in a general framework. In particular, the classifier ensemble with sparsity and diversity can be represented as a mathematical optimization problem. We then propose a heuristic algorithm capable of obtaining ensemble classifiers with consideration of both sparsity and diversity. We exploit the genetic algorithm, and optimize sparsity and diversity for classifier selection and combination heuristically and iteratively. As one major contribution, we introduce the concept of diversity contribution ability so as to select proper classifier components and eventually evolve classifier weights. Finally, we compare our proposed novel method with other conventional classifier ensemble methods such as Bagging, least squares combination, sparsity learning, and AdaBoost, extensively on UCI benchmark data sets and the Pascal Large Scale Learning Challenge 2008 webspam data. The experimental results confirm that our approach leads to better performance in many aspects. (C) 2014 Elsevier B.V. All rights reserved.
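
A minimal sketch, under assumed toy data and fitness weights, of the kind of evolutionary selection the abstract describes: a binary mask over base classifiers is mutated and kept when a fitness trading off accuracy, sparsity and pairwise diversity improves. This is a generic (1+1)-style heuristic, not the authors' algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 10, 200                              # classifiers, validation samples
y = rng.integers(0, 2, N)                   # toy validation labels
preds = np.stack([np.where(rng.random(N) < 0.7, y, 1 - y) for _ in range(M)])

def fitness(mask, lam=0.02, gamma=0.1):
    if mask.sum() == 0:
        return -np.inf
    sel = preds[mask.astype(bool)]
    vote = (sel.mean(axis=0) > 0.5).astype(int)        # majority vote
    acc = (vote == y).mean()
    selected_fraction = mask.sum() / M                 # penalised to favour sparsity
    # diversity: mean pairwise disagreement among the selected members
    k = sel.shape[0]
    dis = np.mean([(sel[i] != sel[j]).mean()
                   for i in range(k) for j in range(i + 1, k)]) if k > 1 else 0.0
    return acc - lam * selected_fraction + gamma * dis

mask = rng.integers(0, 2, M)
for _ in range(500):                                   # (1+1)-style mutation loop
    child = mask.copy()
    child[rng.integers(M)] ^= 1                        # flip one selection bit
    if fitness(child) >= fitness(mask):
        mask = child
print("selected classifiers:", np.flatnonzero(mask))
```
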
3. Triple loss for hard face detection

Author:Fang, ZY;Ren, JC;Marshall, S;Zhao, HM;Wang, Z;Huang, KZ;Xiao, B

Source:NEUROCOMPUTING,2020,Vol.398

Abstract:Although face detection has been well addressed in recent decades, effective detection of small, blurred and partially occluded faces in the wild remains a challenging task. Meanwhile, the trade-off between computational cost and accuracy is also an open research problem in this context. To tackle these challenges, in this paper, a novel context-enhanced approach is proposed with structural optimization and loss function optimization. For loss function optimization, we introduce a hierarchical loss, referred to as "triple loss" in this paper, to optimize the feature pyramid network (FPN) (Lin et al., 2017) based face detector. Additional layers are only applied during the training process. As a result, the computational cost is the same as FPN during inference. For structural optimization, we propose a context-sensitive structure to increase the capacity of the prediction network and improve the accuracy of the output. In detail, a three-branch inception subnet (Szegedy et al., 2015) based feature fusion module is employed to refine the original FPN without significantly increasing the computational cost, further improving the low-level semantic information that is originally extracted from a single convolutional layer in the backward pathway of FPN. The proposed approach is evaluated on two publicly available face detection benchmarks, FDDB and WIDER FACE. By using a VGG-16 based detector, experimental results indicate that the proposed method achieves a good balance between the accuracy and computational cost of face detection. (C) 2020 Elsevier B.V. All rights reserved.
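
A hedged sketch of the hierarchical-loss idea (the head names and weights are illustrative assumptions, not the paper's exact formulation): supervision is attached at three stages during training, while inference keeps only the final head, so test-time cost matches the plain FPN detector.

```python
import torch
import torch.nn.functional as F

def triple_loss(logits_low, logits_mid, logits_final, targets,
                weights=(0.25, 0.25, 1.0)):
    """Weighted sum of the same classification loss applied at three training stages."""
    losses = [F.cross_entropy(l, targets)
              for l in (logits_low, logits_mid, logits_final)]
    return sum(w * l for w, l in zip(weights, losses))

# toy usage: three heads scoring face vs. background for 8 anchors
targets = torch.randint(0, 2, (8,))
loss = triple_loss(torch.randn(8, 2), torch.randn(8, 2), torch.randn(8, 2), targets)
print(float(loss))
```
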
4. DE2: Dynamic ensemble of ensembles for learning nonstationary data

Author:Yin, XC;Huang, KZ;Hao, HW

Source:NEUROCOMPUTING,2015,Vol.165

Abstract:Learning nonstationary data with concept drift has received much attention in machine learning and has been an active topic in ensemble learning. Specifically, batch growing ensemble methods present one important direction for dealing with concept drift in nonstationary data. However, current batch growing ensemble methods combine only the available component classifiers, each trained independently from a batch of nonstationary data. They simply discard interim ensembles and hence may lose useful information obtained from the fine-tuned interim ensembles. Distinctively, we introduce a comprehensive hierarchical approach called Dynamic Ensemble of Ensembles (DE2). The novel method combines classifiers as an ensemble of all the interim ensembles dynamically from consecutive batches of nonstationary data. DE2 includes two key stages: component classifiers and interim ensembles are dynamically trained; and the final ensemble is then learned by exponentially-weighted averaging over the available experts, i.e., interim ensembles. Moreover, we engage Sparsity Learning to choose component classifiers selectively and intelligently. We also incorporate the techniques of Dynamic Weighted Majority and Learn(++).NSE for better integrating different classifiers dynamically. We perform experiments with two benchmark test sets in real nonstationary environments, and compare our DE2 method to other conventional competitive ensemble methods. Experimental results confirm that our approach consistently leads to better performance and has promising generalization ability for learning in nonstationary environments. (C) 2015 Elsevier B.V. All rights reserved.
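
A minimal sketch, with an illustrative learning rate, of the exponentially-weighted averaging step over experts (interim ensembles): each expert's weight decays with its cumulative loss, and predictions are combined with those weights.

```python
import numpy as np

def expert_weights(cumulative_losses, eta=0.5):
    """Exponentially-weighted expert weights: lower cumulative loss, higher weight."""
    w = np.exp(-eta * np.asarray(cumulative_losses, dtype=float))
    return w / w.sum()

def combine(expert_probs, cumulative_losses):
    """expert_probs: (n_experts, n_samples, n_classes) predicted probabilities."""
    w = expert_weights(cumulative_losses)
    return np.tensordot(w, expert_probs, axes=1)       # weighted average over experts

probs = np.random.rand(3, 5, 2)
probs /= probs.sum(-1, keepdims=True)
print(combine(probs, cumulative_losses=[1.2, 0.4, 2.0]).argmax(-1))
```
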
5. Graphical lasso quadratic discriminant function and its application to character recognition

Author:Xu, B;Huang, KZ;King, I;Liu, CL;Sun, J;Satoshi, N

Source:NEUROCOMPUTING,2014,Vol.129

Abstract:The multivariate Gaussian distribution is a popular assumption in many pattern recognition tasks. The quadratic discriminant function (QDF) is an effective classification approach based on this assumption. An improved algorithm, called modified QDF (or MQDF in short), has achieved great success and is widely recognized as the state-of-the-art method in character recognition. However, because both approaches estimate the mean and covariance by maximum-likelihood estimation (MLE), they often lose classification accuracy when the number of training samples is small. To address this problem, in this paper, we engage the graphical lasso method to estimate the covariance and propose a new classification method called the graphical lasso quadratic discriminant function (GLQDF). By exploiting a coordinate descent procedure for the lasso, GLQDF can estimate the covariance matrix (and its inverse) more precisely. Experimental results demonstrate that the proposed method performs better than competitive methods on two artificial and nine real datasets (including both benchmark digit and Chinese character data). (C) 2013 Elsevier B.V. All rights reserved.
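
A sketch, under assumed hyper-parameters, of the combination the abstract describes: each class covariance is estimated with the graphical lasso (here via scikit-learn's GraphicalLasso) instead of plain MLE, and the resulting precision matrices are plugged into a standard quadratic discriminant function.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

def fit_glqdf(X, y, alpha=0.05):
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        gl = GraphicalLasso(alpha=alpha).fit(Xc)       # sparse precision estimate
        sign, logdet = np.linalg.slogdet(gl.covariance_)
        params[c] = (Xc.mean(axis=0), gl.precision_, logdet,
                     np.log(len(Xc) / len(X)))
    return params

def predict_glqdf(params, X):
    scores = []
    for c, (mu, prec, logdet, logprior) in params.items():
        d = X - mu
        # g_c(x) = -0.5 (x-mu)^T P (x-mu) - 0.5 log|Sigma| + log prior
        scores.append(-0.5 * np.einsum("ij,jk,ik->i", d, prec, d)
                      - 0.5 * logdet + logprior)
    classes = np.asarray(list(params))
    return classes[np.argmax(scores, axis=0)]

# toy usage on synthetic two-class data
X = np.random.randn(200, 5); y = np.repeat([0, 1], 100); X[y == 1] += 1.5
model = fit_glqdf(X, y)
print((predict_glqdf(model, X) == y).mean())
```
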
6. Siamese network ensemble for visual tracking

Author:Jiang, CR;Xiao, JM;Xie, YC;Tillo, T;Huang, KZ

Source:NEUROCOMPUTING,2018,Vol.275

Abstract:Visual object tracking is a challenging task considering illumination variation, occlusion, rotation, deformation and other problems. In this paper, we extend the Siamese INstance search Tracker (SINT) with a model updating mechanism to improve its tracking robustness. SINT uses convolutional neural network (CNN) features, and compares the new frame features with the target features in the first frame. The candidate region with the highest similarity score is taken as the tracking result. However, SINT is not robust against large target variation because the matching model is not updated during the whole tracking process. To combat this defect, we propose an Ensemble Siamese Tracker (EST), where the final similarity score is also affected by the similarity with tracking results in recent frames instead of solely considering the first frame. Tracking results in recent frames are used to adjust the model for continuous target change. Meanwhile, we combine a large displacement optical flow method with EST to further improve the performance (called EST+). We test the proposed EST and EST+ on the standard tracking benchmark OTB. The average overlap ratios of EST and EST+ increase by 2.72% and 3.55% respectively compared with SINT on OTB 2013, which contains 51 video sequences. For OTB 100, the average overlap ratio gain is 4.2%. (C) 2017 Elsevier B.V. All rights reserved.
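
An illustrative sketch of the ensemble scoring idea: a candidate's score mixes its similarity to the first-frame template with similarities to templates from recent tracking results. The mixing weight alpha and the use of cosine similarity are assumptions for illustration, not the paper's exact scoring rule.

```python
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

def ensemble_score(candidate_feat, first_feat, recent_feats, alpha=0.6):
    recent = np.mean([cosine(candidate_feat, f) for f in recent_feats]) \
             if len(recent_feats) else 0.0
    return alpha * cosine(candidate_feat, first_feat) + (1 - alpha) * recent

# pick the candidate region with the highest ensemble score
candidates = [np.random.rand(256) for _ in range(5)]     # candidate CNN features
first, recent = np.random.rand(256), [np.random.rand(256) for _ in range(3)]
best = max(range(len(candidates)),
           key=lambda i: ensemble_score(candidates[i], first, recent))
print("tracked candidate index:", best)
```
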
7. Computational intelligence techniques for new product development

Author:Chan, KY;Yuen, KKF;Kwong, CK

Source:NEUROCOMPUTING,2014,Vol.142

8. Coarse-grained generalized zero-shot learning with efficient self-focus mechanism

Author:Yang, Guanyu;Huang, Kaizhu;Zhang, Rui;Goulermas, John Y.;Hussain, Amir

Source:NEUROCOMPUTING,2021,Vol.463

Abstract:For image classification in computer vision, the performance of conventional deep neural networks (DNN) usually drops when labeled training samples are limited. In this case, few-shot learning (FSL), or in particular zero-shot learning (ZSL), i.e. classification of target classes with few or zero labeled training samples, was proposed to imitate the strong learning ability of humans. However, recent investigations show that most existing ZSL models may easily overfit and tend to misclassify the target instance as one of the classes seen in the training set. To alleviate this problem, we propose an embedding-based ZSL method that introduces a self-focus mechanism, i.e. a focus-ratio capturing the importance of each dimension, into the model optimization process. The objective function is reconstructed according to these focus-ratios, encouraging the embedding model to focus exclusively on important dimensions in the target space. As the self-focus module only takes part in the training process, the over-fitting knowledge is apportioned to it, and hence the remaining embedding model can generalize better to new classes at test time. Experimental results on four benchmarks, including AwA1, AwA2, aPY and CUB, show that our method outperforms state-of-the-art methods on coarse-grained ZSL tasks while not affecting the performance of fine-grained ZSL. Additionally, several comparisons demonstrate the superiority of the proposed mechanism. © 2021 Elsevier B.V.
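
A hedged sketch of a per-dimension "focus-ratio" re-weighting of an embedding loss: a learnable vector, normalised with softmax, scales each dimension of the regression error so training concentrates on the more informative dimensions of the target space. The exact objective in the paper may differ; this only illustrates the idea.

```python
import torch
import torch.nn as nn

class SelfFocusEmbeddingLoss(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.focus_logits = nn.Parameter(torch.zeros(dim))    # one logit per dimension

    def forward(self, predicted_emb, class_prototype):
        ratios = torch.softmax(self.focus_logits, dim=0)      # focus-ratios sum to 1
        per_dim_err = (predicted_emb - class_prototype) ** 2  # (batch, dim)
        return (per_dim_err * ratios).sum(dim=1).mean()

loss_fn = SelfFocusEmbeddingLoss(dim=85)                       # e.g. an attribute space
loss = loss_fn(torch.randn(16, 85), torch.randn(16, 85))
print(float(loss))
```
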
9. Improving deep neural network performance by integrating kernelized Min-Max objective

Author:Wang, QF;Yao, K;Zhang, R;Hussain, A;Huang, KZ

Source:NEUROCOMPUTING,2020,Vol.408

Abstract:Deep neural networks (DNN), such as convolutional neural networks (CNN) have been widely used for object recognition. However, they are usually unable to ensure the required intra-class compactness and inter-class separability in the kernel space. These are known to be important in pattern recognition for achieving both robustness and accuracy. In this paper, we propose to integrate a kernelized Min-Max objective in the DNN training in order to explicitly enforce both kernelized within-class compactness and between-class margin. The involved kernel space is implicitly mapped from the feature space associated with a certain upper layer of DNN by exploiting a kernel trick, while the Min-Max objective in this space is interpolated with the original DNN loss function and finally optimized in the training phase. With a very small additional computation cost, the proposed strategy can be easily integrated in different DNN models without changing any other part of the original model. The comparative recognition accuracy of the proposed method is evaluated with multiple DNN models (including shallow CNN, deep CNN and deep residual neural network models) on two benchmark datasets: CIFAR-10 and CIFAR-100. Extensive experimental results demonstrate that the integration of kernelized Min-Max objective in the training of DNN models can achieve better results compared to state-of-the-art models, without incurring additional model complexity. (C) 2020 Elsevier B.V. All rights reserved.
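
A sketch (not the authors' implementation; lambda_mm and the RBF bandwidth are illustrative choices) of interpolating a kernelized Min-Max-style penalty with the usual cross-entropy: an RBF kernel over the features of an upper layer rewards within-class similarity and penalises between-class similarity.

```python
import torch
import torch.nn.functional as F

def kernel_min_max_penalty(feats, labels, sigma=1.0):
    d2 = torch.cdist(feats, feats).pow(2)
    K = torch.exp(-d2 / (2 * sigma ** 2))                 # RBF kernel matrix
    same = labels.unsqueeze(0).eq(labels.unsqueeze(1)).float()
    eye = torch.eye(len(labels), device=feats.device)
    within = (K * (same - eye)).sum() / ((same - eye).sum() + 1e-12)
    between = (K * (1 - same)).sum() / ((1 - same).sum() + 1e-12)
    return between - within                               # smaller is better

def total_loss(logits, feats, labels, lambda_mm=0.01):
    return F.cross_entropy(logits, labels) + lambda_mm * kernel_min_max_penalty(feats, labels)

# toy usage with random upper-layer features and logits
feats, labels = torch.randn(16, 64), torch.randint(0, 4, (16,))
print(float(total_loss(torch.randn(16, 4), feats, labels)))
```
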
10. A new two-layer mixture of factor analyzers with joint factor loading model for the classification of small dataset problems

Author:Yang, X;Huang, KZ;Zhang, R;Goulermas, JY;Hussain, A

Source:NEUROCOMPUTING,2018,Vol.312

Abstract:Dimensionality Reduction (DR) is a fundamental topic of pattern classification and machine learning. For classification tasks, DR is typically employed as a pre-processing step, followed by an independent classifier training stage. However, such independent operation of the two stages often notably limits the final classification performance, as the generated subspace may not be maximally beneficial or appropriate to the learning task at hand. This problem is further accentuated for high-dimensional data classification with a limited number of samples. To address this problem, we develop a novel joint learning model for classification, referred to as the two-layer mixture of factor analyzers with joint factor loading (2L-MJFA). Specifically, the model adopts a special two-layer mixture, or mixture of mixtures, structure, where each component represents each specific class as a mixture of factor analyzers (MFA). Importantly, all the involved factor analyzers are intentionally designed so that they share the same loading matrix. This, apart from operating as the DR matrix, largely reduces the number of parameters and makes the proposed algorithm very suitable for small dataset situations. Additionally, we propose a modified expectation maximization algorithm to train the proposed model. A series of simulation experiments demonstrate that the proposed approach significantly outperforms other state-of-the-art algorithms on various benchmark datasets. Finally, since factor analyzers are closely linked with auto-encoder networks, the proposed idea could be of particular utility to the neural network community. (C) 2018 Elsevier B.V. All rights reserved.
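
As a sketch of the shared-loading structure the abstract describes (the notation here is assumed, not taken from the paper), each class-conditional density is a mixture of factor analyzers whose component covariances all use one joint loading matrix W:

```latex
% Notation assumed for illustration: \pi_{ck} mixing weights, \mu_{ck} component
% means, \Psi a diagonal noise covariance, and a single loading matrix W shared
% across all classes c and components k (the joint factor loading).
p(\mathbf{x} \mid \text{class } c)
  = \sum_{k=1}^{K_c} \pi_{ck}\,
    \mathcal{N}\!\left(\mathbf{x}\;\middle|\;\boldsymbol{\mu}_{ck},\;
    \mathbf{W}\mathbf{W}^{\top} + \boldsymbol{\Psi}\right)
```
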
11. Special issue on advances in graph algorithm and applications

Author:Liu, ZY;Huang, KZ;Yang, X;Liu, CL

Source:NEUROCOMPUTING,2019,Vol.336

12. Hybrid channel based pedestrian detection

Author:Tesema, FB;Wu, H;Chen, MJ;Lin, JP;Zhu, W;Huang, KZ

Source:NEUROCOMPUTING,2020,Vol.389

Abstract:Pedestrian detection has achieved great improvements with the help of Convolutional Neural Networks (CNNs). A CNN can learn high-level features from input images, but the insufficient spatial resolution of CNN feature channels (feature maps) may cause a loss of information, which is especially harmful to small instances. In this paper, we propose a new pedestrian detection framework, which extends the successful RPN+BF framework to combine handcrafted features and CNN features. RoI-pooling is used to extract features from both handcrafted channels (e.g. HOG+LUV, CheckerBoards or RotatedFilters) and CNN channels. Since handcrafted channels always have higher spatial resolution than CNN channels, we apply RoI-pooling with a larger output resolution to the handcrafted channels to keep more detailed information. Our ablation experiments show that the developed handcrafted features can reach better detection accuracy than the CNN features extracted from the VGG-16 net, and a performance gain can be achieved by combining them. Experimental results on the Caltech pedestrian dataset with the original annotations and the improved annotations demonstrate the effectiveness of the proposed approach. When using a more advanced RPN in our framework, our approach can be further improved and achieves competitive results on both benchmarks. (C) 2020 Elsevier B.V. All rights reserved.
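
A minimal sketch (channel contents, map sizes and the assumed 640-pixel-wide input are placeholders, not the paper's configuration) of pooling the same RoIs from a CNN channel and a higher-resolution handcrafted channel with different output sizes, then concatenating the two descriptors.

```python
import torch
from torchvision.ops import roi_pool

cnn_maps = torch.randn(1, 512, 40, 60)        # low-resolution CNN channels
hand_maps = torch.randn(1, 10, 160, 240)      # e.g. HOG+LUV channels, 4x finer
rois = torch.tensor([[0, 32.0, 48.0, 96.0, 160.0]])   # (batch_idx, x1, y1, x2, y2)

# larger output resolution for the finer handcrafted channels
cnn_feat = roi_pool(cnn_maps, rois, output_size=(7, 7), spatial_scale=40 / 640)
hand_feat = roi_pool(hand_maps, rois, output_size=(14, 14), spatial_scale=160 / 640)

descriptor = torch.cat([cnn_feat.flatten(1), hand_feat.flatten(1)], dim=1)
print(descriptor.shape)    # per-RoI feature fed to the downstream classifier
```
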
13. A hybrid fuzzy quality function deployment framework using cognitive network process and aggregative grading clustering: An application to cloud software product development

Author:Yuen, KKF

Source:NEUROCOMPUTING,2014,Vol.142

Abstract:Quality function deployment (QFD) is an essential decision tool for product development in various domains. QFD enables a cross-functional team to translate customer requirements into engineering characteristics during product development. As there are some limitations in criteria evaluation and analysis in QFD, this study proposes a hybrid framework of Fuzzy Cognitive Network Process, Aggregative Grading Clustering, and Quality Function Deployment (F-CNP-AGC-QFD) for criteria evaluation and analysis in QFD. Applying fuzzy numbers to QFD, i.e. FQFD, enables rating flexibility for expert judgment to handle uncertainty. The Fuzzy Cognitive Network Process (FCNP) is used to evaluate criteria weights/priorities. The Fuzzy Aggregative Grading Clustering (FAGC) classifies the weights/priorities into ordinal grades. The proposed hybrid QFD approach is demonstrated on cloud software product development to show its validity and applicability. (C) 2014 Published by Elsevier B.V.
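
A generic illustration (not the paper's F-CNP-AGC-QFD pipeline; the ratings, the centroid defuzzification and the k-means grouping are all assumptions) of the two steps the abstract outlines: turning triangular fuzzy ratings into crisp criterion weights, then grouping the weights into ordinal grades.

```python
import numpy as np
from sklearn.cluster import KMeans

# triangular fuzzy ratings (low, mode, high) for five criteria
fuzzy_ratings = np.array([[2, 3, 4], [6, 7, 8], [4, 5, 6], [7, 8, 9], [1, 2, 3]])
crisp = fuzzy_ratings.mean(axis=1)              # simple centroid defuzzification
weights = crisp / crisp.sum()                   # normalised priorities

grades = KMeans(n_clusters=3, n_init=10, random_state=0) \
            .fit_predict(weights.reshape(-1, 1))
print(dict(zip(["C1", "C2", "C3", "C4", "C5"], grades)))
```
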
14. Domain adaptation with feature and label adversarial networks

Author:Zhao, P;Zang, WH;Liu, B;Kang, Z;Bai, K;Huang, KZ;Xu, ZL

Source:NEUROCOMPUTING,2021,Vol.439

Abstract:Learning a cross-domain representation from labeled source domains to unlabeled target domains is an important research problem in representation learning. Despite their success, traditional adversarial methods align only the features from each domain when fooling a special domain discriminator network, neglecting the importance of labels. Thus, the discriminator of these approaches merely distinguishes whether the generated features are in-domain or not, which may lead to less class-discriminative features. In this paper, by considering the joint distributions of features and labels in both domains, we present Feature and Label Adversarial Networks (FLAN). As a result, FLAN can generate more discriminative features in both domains. Experimental results on standard unsupervised domain adaptation benchmarks demonstrate that FLAN can outperform state-of-the-art domain-invariant representation learning methods. (c) 2021 Elsevier B.V. All rights reserved.
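
A hedged sketch of a joint feature-and-label adversarial discriminator: instead of seeing features alone, the domain discriminator receives features concatenated with the classifier's label probabilities, so alignment accounts for class information. Layer sizes are illustrative, not the paper's architecture.

```python
import torch
import torch.nn as nn

class JointDomainDiscriminator(nn.Module):
    def __init__(self, feat_dim, num_classes, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim + num_classes, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))                    # source-vs-target logit

    def forward(self, features, class_probs):
        return self.net(torch.cat([features, class_probs], dim=1))

disc = JointDomainDiscriminator(feat_dim=512, num_classes=31)
logit = disc(torch.randn(8, 512), torch.softmax(torch.randn(8, 31), dim=1))
print(logit.shape)
```
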
15. A review on transfer learning in EEG signal analysis

Author:Wan, ZT;Yang, R;Huang, MJ;Zeng, NY;Liu, XH

Source:NEUROCOMPUTING,2021,Vol.421

Abstract:Electroencephalogram (EEG) signal analysis, which is widely used for human-computer interaction and neurological disease diagnosis, requires a large amount of labeled data for training. However, collecting substantial EEG data can be difficult owing to its randomness and non-stationarity. Moreover, there are notable individual differences in EEG data, which affect the reusability and generalization of models. To mitigate the adverse effects of the above factors, transfer learning is applied in this field to transfer the knowledge learnt in one domain into a different but related domain. Transfer learning adjusts models with small-scale data of the target task, and also maintains learning ability in the presence of individual differences. This paper describes four main methods of transfer learning and explores their practical applications in EEG signal analysis in recent years. Finally, we discuss challenges and opportunities of transfer learning and suggest areas for further study. (c) 2020 Elsevier B.V. All rights reserved.
16. Residual attention-based multi-scale script identification in scene text images

Author:Ma, MK;Wang, QF;Huang, S;Huang, S;Goulermas, Y;Huang, KZ

Source:NEUROCOMPUTING,2021,Vol.421

Abstract:Script identification is an essential step in the text extraction pipeline for multi-lingual applications. This paper presents an effective approach to identify scripts in scene text images. Due to complicated backgrounds, various text styles, and character similarity across different languages, script identification has not yet been fully solved. Under the general classification framework for script identification, we investigate two important components: feature extraction and the classification layer. In feature extraction, we utilize a hierarchical feature fusion block to extract multi-scale features. Furthermore, we adopt an attention mechanism to obtain the locally discriminative parts of feature maps. In the classification layer, we utilize a fully convolutional classifier to generate channel-level classifications, which are then processed by a global pooling layer to improve classification efficiency. We evaluated the proposed approach on the benchmark datasets RRC-MLT2017, SIW-13, CVSI-2015 and MLe2e, and the experimental results show the effectiveness of each elaborately designed component. Finally, we achieve better performance than competitive models, with correct rates of 89.66%, 96.11%, 98.78% and 97.20% on RRC-MLT2017, SIW-13, CVSI-2015 and MLe2e, respectively. (c) 2020 Elsevier B.V. All rights reserved.
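
A minimal sketch of the classification layer described in the abstract: a 1x1 convolution produces one channel of per-location scores for each script, and a global pooling layer reduces them to image-level logits. The channel count and script count are assumptions for illustration.

```python
import torch
import torch.nn as nn

class ConvClassifierHead(nn.Module):
    def __init__(self, in_channels=256, num_scripts=13):
        super().__init__()
        self.score_conv = nn.Conv2d(in_channels, num_scripts, kernel_size=1)
        self.pool = nn.AdaptiveAvgPool2d(1)            # global average pooling

    def forward(self, feature_map):                    # (B, C, H, W)
        channel_scores = self.score_conv(feature_map)  # (B, num_scripts, H, W)
        return self.pool(channel_scores).flatten(1)    # (B, num_scripts) logits

head = ConvClassifierHead()
logits = head(torch.randn(2, 256, 8, 32))
print(logits.shape)
```
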