Find Research Output

SELECTED FILTERS

School of Advanced Technology

1.Numerical Study to Examine the Effect of Porosity on In-Flight Particle Dynamics

Author:Kamnis, S;Gu, S;Vardavoulias, M

Source:JOURNAL OF THERMAL SPRAY TECHNOLOGY,2011,Vol.20

Abstract:High velocity oxygen fuel (HVOF) thermal spray has been widely used to deposit hard composite materials such as WC-Co powders for wear-resistant applications. Powder morphology varies according to production methods, while new powder manufacturing techniques produce porous powders containing air voids which are not interconnected. The porous microstructure within the powder will influence the in-flight thermal and aerodynamic behavior of particles, which is expected to be different from that of fully solid powder. This article is devoted to studying the heat and momentum transfer in HVOF-sprayed WC-Co particles with different sizes and porosity levels. The results highlight the importance of thermal gradients inside the particles as a result of microporosity and how HVOF operating parameters need to be modified considering such temperature gradients.

2.Driving Posture Recognition by a Hierarchal Classification System with Multiple Features

Author:Yan, C;Zhang, BL;Coenen, F

Source:2014 7TH INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING (CISP 2014),2014,Vol.

Abstract:This paper presents a novel system for vision-based driving posture recognition. The driving posture dataset was prepared by a side-mounted camera looking at a driver's left profile. After pre-processing for illumination variations, eight action classes of constitutive components of the driving activities were segmented, including normal driving, operating a cell phone, eating and smoking. A global grid-based representation for the action sequence was emphasized, which featured two consecutive steps. Step 1 generates a motion descriptive shape based on a motion frequency image (MFI), and step 2 applies the pyramid histogram of oriented gradients (PHOG) for more discriminating characterization. A three-level hierarchal classification system is designed to overcome the difficulties of some overlapping classes. Four commonly applied classifiers, including k-nearest neighbor (KNN), random forest (RF), support vector machine (SVM) and multiple layer perceptron (MLP), are evaluated in each level. The overall classification accuracy is over 87.2% for the eight classes of driving actions by the proposed classification system.

3.Multi-scale Attention Consistency for Multi-label Image Classification

Author:Xu, Haotian ; Jin, Xiaobo ; Wang, Qiufeng ; Huang, Kaizhu

Source:Communications in Computer and Information Science,2020,Vol.1332

Abstract:Humans have demonstrated cognitive consistency over image transformations such as flipping and scaling. To learn from this consistency in human visual perception, researchers have found that a convolutional neural network’s discriminative capacity can be further improved by forcing the network to concentrate on certain areas of the picture, in accordance with natural human visual perception. The attention heatmap, a supplementary tool that reveals the essential region the network chooses to focus on, has been developed and widely adopted for CNNs. Based on this notion of visual consistency, we propose a novel end-to-end trainable CNN architecture with multi-scale attention consistency. Specifically, our model takes an original picture and its flipped counterpart as inputs, and then sends them into a single standard ResNet with additional attention-enhanced modules to generate a semantically strong attention heatmap. We also compute the distance between the multi-scale attention heatmaps of these two pictures and take it as an additional loss to help the network achieve better performance. Our network shows superiority on the multi-label classification task and attains compelling results on the WIDER Attribute Dataset. © 2020, Springer Nature Switzerland AG.
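
The consistency loss described in this abstract can be illustrated with a minimal sketch. The abstract does not say which distance is used, so the mean squared difference below is an assumption, as are the function names `hflip`, `consistency_loss` and `multi_scale_consistency`. Heatmaps are plain nested lists here rather than network outputs.

```python
from typing import List

Heatmap = List[List[float]]  # rows of attention values


def hflip(heatmap: Heatmap) -> Heatmap:
    """Mirror a heatmap left to right."""
    return [row[::-1] for row in heatmap]


def consistency_loss(h_orig: Heatmap, h_from_flipped: Heatmap) -> float:
    """Mean squared difference between the heatmap of the original image
    and the heatmap of the flipped image after flipping it back.
    The choice of squared distance is an assumption, not from the abstract."""
    aligned = hflip(h_from_flipped)  # bring both maps into the same coordinates
    total, count = 0.0, 0
    for row_a, row_b in zip(h_orig, aligned):
        for a, b in zip(row_a, row_b):
            total += (a - b) ** 2
            count += 1
    return total / count


def multi_scale_consistency(orig_maps: List[Heatmap], flip_maps: List[Heatmap]) -> float:
    """Sum the per-scale losses over heatmaps taken at several resolutions."""
    return sum(consistency_loss(a, b) for a, b in zip(orig_maps, flip_maps))
```

A network whose attention is perfectly flip-consistent would give a loss of zero, since the heatmap of the flipped image is exactly the mirror of the original heatmap.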

4.Towards Better Forecasting by Fusing Near and Distant Future Visions

Author:Cheng, JZ;Huang, KZ;Zheng, ZB

Source:THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE,2020,Vol.34

Abstract:Multivariate time series forecasting is an important yet challenging problem in machine learning. Most existing approaches only forecast the series value of one future moment, ignoring the interactions between predictions of future moments with different temporal distance. Such a deficiency probably prevents the model from getting enough information about the future, thus limiting the forecasting accuracy. To address this problem, we propose Multi-Level Construal Neural Network (MLCNN), a novel multi-task deep learning framework. Inspired by the Construal Level Theory of psychology, this model aims to improve the predictive performance by fusing forecasting information (i.e., future visions) of different future times. We first use a Convolutional Neural Network to extract multi-level abstract representations of the raw data for near and distant future predictions. We then model the interplay between multiple predictive tasks and fuse their future visions through a modified Encoder-Decoder architecture. Finally, we combine a traditional Autoregression model with the neural network to solve the scale insensitivity problem. Experiments on three real-world datasets show that our method achieves statistically significant improvements compared to state-of-the-art baseline methods, with an average 4.59% reduction on the RMSE metric and an average 6.87% reduction on the MAE metric.
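
For readers unfamiliar with the two metrics quoted above, the following sketch shows how RMSE and MAE are conventionally computed. The helper `percent_reduction` assumes the reported reductions are relative to a baseline score; the abstract does not state this, and all function names are illustrative.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error: penalizes large errors more heavily."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error: average magnitude of the errors."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n


def percent_reduction(baseline: float, model: float) -> float:
    """Relative improvement of `model` over `baseline`, in percent
    (assumed interpretation of the reductions quoted in the abstract)."""
    return 100.0 * (baseline - model) / baseline
```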

5.A Covert Ultrasonic Phone-to-Phone Communication Scheme

Author:Shi,Liming;Yu,Limin;Huang,Kaizhu;Zhu,Xu;Wang,Zhi;Li,Xiaofei;Wang,Wenwu;Wang,Xinheng

Source:Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST,2021,Vol.349

Abstract:Smartphone ownership has increased rapidly over the past decade, and the smartphone has become a popular technological product in modern life. The universal wireless communication scheme on smartphones leverages electromagnetic wave transmission, where the spectrum resource becomes scarce in some scenarios. As a supplement for some face-to-face transmission scenarios, we design an aerial ultrasonic communication scheme. The scheme uses a chirp-like signal and BPSK modulation, convolutional code encoding with ID-classified interleaving, and a pilot method to estimate the room impulse response. Through experiments, the error rate of the ultrasonic communication system designed for mobile phones can be within 0.001% in a 1 m range. The limitations of this scheme and further research work are discussed as well. © 2021, ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering.

6.Towards an experimental analysis of android phone: GSM network positioning

Author:Man,Ka Lok;Wang,Wei;Liu,Dawei;Tayahi,Moncef;Hsu,Hui Huang;Lim,Eng Gee

Source:International Journal of Applied Engineering Research,2014,Vol.9

Abstract:Network positioning technology has been widely used in existing smart phones. Traditional network positioning methods are carried out by the network provider; this could violate users' privacy. In this paper we present a work in progress on the positioning of smart phone users using wireless networks. We propose a self-positioning scheme based on the fingerprint method, a positioning method commonly used in indoor environments in previous studies. We outline the proposed self-positioning scheme and propose a k-nearest neighbor method to improve the positioning accuracy. In future, we will evaluate the proposed scheme and method in field experiments.
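
A fingerprint-based k-nearest neighbor position estimate can be sketched as follows. This is a generic illustration under stated assumptions, not the authors' scheme: the fingerprint is taken to be a map from cell identifier to received signal strength in dBm, unseen cells are filled with a hypothetical floor value of -110 dBm, and the estimate is the unweighted mean of the k closest surveyed positions.

```python
import math
from typing import Dict, List, Tuple

Fingerprint = Dict[str, float]   # cell id -> signal strength in dBm (assumed format)
Position = Tuple[float, float]   # (x, y) in any consistent unit


def signal_distance(a: Fingerprint, b: Fingerprint, missing: float = -110.0) -> float:
    """Euclidean distance in signal space; cells absent from one
    fingerprint are filled with an assumed floor value."""
    cells = set(a) | set(b)
    return math.sqrt(sum((a.get(c, missing) - b.get(c, missing)) ** 2 for c in cells))


def knn_position(observed: Fingerprint,
                 database: List[Tuple[Fingerprint, Position]],
                 k: int = 3) -> Position:
    """Average the positions of the k surveyed fingerprints closest to `observed`."""
    if not database:
        raise ValueError("fingerprint database is empty")
    nearest = sorted(database, key=lambda entry: signal_distance(observed, entry[0]))[:k]
    x = sum(pos[0] for _, pos in nearest) / len(nearest)
    y = sum(pos[1] for _, pos in nearest) / len(nearest)
    return (x, y)
```

Because the matching runs against a locally stored database, the phone can estimate its own position without sending measurements to the network provider, which is the privacy motivation given in the abstract.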

7.A Systematic Analysis of Link Prediction in Complex Network

Author:Gul, H;Amin, A;Adnan, A;Huang, KZ

Source:IEEE ACCESS,2021,Vol.9

Abstract:Link mining is an important task in the field of data mining and has numerous applications in informal communities. Given a real-world complex network, the task is to predict links that have not yet occurred in that network. Given the significance of link prediction (LP), the task has received considerable attention from researchers in different disciplines, and many methods for solving it have been proposed in recent decades. Various articles on link prediction are available; however, they are outdated because many new methods have since been introduced. In this paper, we give a systematic assessment of prevailing link mining approaches. The investigation is thorough: it covers the earliest scoring-based approaches and extends to the latest techniques, which rely on different link prediction strategies. We also categorize link prediction methods by their technical approach and discuss the strengths and weaknesses of the various techniques. Additionally, we compare and explain several top link prediction techniques. The experimental results of these techniques over twelve datasets are ordered here by performance: RA, 0.7411 > AA, 0.7285 > PA, 0.7202 > Katz, 0.7141 > CN, 0.6951 > HP, 0.6924 > LHN, 0.6017 > PD, 0.3978.
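
Four of the scoring-based methods ranked above have short standard definitions, sketched below. The abstract does not expand the abbreviations, so reading CN, AA, RA and PA as common neighbors, Adamic-Adar, resource allocation and preferential attachment is an assumption. The graph is an undirected adjacency map, and a higher score means a link between the two nodes is considered more likely.

```python
import math
from typing import Dict, Hashable, Set

Graph = Dict[Hashable, Set[Hashable]]  # node -> set of neighbors (undirected)


def common_neighbors(g: Graph, u: Hashable, v: Hashable) -> int:
    """CN: number of neighbors shared by u and v."""
    return len(g[u] & g[v])


def adamic_adar(g: Graph, u: Hashable, v: Hashable) -> float:
    """AA: shared neighbors weighted by 1 / log(degree)."""
    return sum(1.0 / math.log(len(g[z])) for z in g[u] & g[v] if len(g[z]) > 1)


def resource_allocation(g: Graph, u: Hashable, v: Hashable) -> float:
    """RA: shared neighbors weighted by 1 / degree."""
    return sum(1.0 / len(g[z]) for z in g[u] & g[v])


def preferential_attachment(g: Graph, u: Hashable, v: Hashable) -> int:
    """PA: product of the two node degrees."""
    return len(g[u]) * len(g[v])
```

AA and RA both discount shared neighbors that are hubs; RA does so more strongly because it divides by the degree itself rather than its logarithm.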

8.Fast graph-based semi-supervised learning and its applications

Author:Zhang,Yan Ming;Huang,Kaizhu;Geng,Guang Gang;Liu,Cheng Lin

Source:Semi-Supervised Learning: Background, Applications and Future Directions,2018,Vol.

Abstract:Despite the great success of graph-based transductive learning methods, most of them have serious problems in scalability and robustness. In this chapter, we propose an efficient and robust graph-based transductive classification method, called minimum tree cut (MTC), which is suitable for large scale data. Motivated by the sparse representation of graphs, we approximate a graph by a spanning tree. Exploiting the simple structure, we develop a linear-time algorithm to label the tree such that the cut size of the tree is minimized. This significantly improves graph-based methods, which typically have a polynomial time complexity. Moreover, we theoretically and empirically show that the performance of MTC is robust to the graph construction, overcoming another big problem of traditional graph-based methods. Extensive experiments on public data sets and applications on text extraction from images demonstrate our method’s advantages in terms of accuracy, speed, and robustness.
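
The idea of labeling a tree so that the cut is minimized can be illustrated with a standard dynamic program over the tree. This is a textbook sketch, not the MTC algorithm from the chapter: the function name, the input format and the tie-breaking are all assumptions. Its running time is linear in the number of nodes for a fixed number of labels.

```python
from typing import Dict, Hashable, List, Optional, Sequence, Tuple

Node = Hashable
INF = float("inf")


def min_cut_tree_labeling(
    adj: Dict[Node, List[Tuple[Node, float]]],  # node -> [(neighbor, edge weight)]
    known: Dict[Node, int],                     # labels fixed in advance
    labels: Sequence[int],
    root: Node,
) -> Tuple[Dict[Node, int], float]:
    """Label every node so that the total weight of edges joining
    differently labeled nodes is minimal, respecting `known`."""
    # Preorder traversal: each parent appears before its children.
    parent: Dict[Node, Optional[Node]] = {root: None}
    order: List[Node] = []
    stack = [root]
    while stack:
        v = stack.pop()
        order.append(v)
        for u, _w in adj[v]:
            if u not in parent:
                parent[u] = v
                stack.append(u)

    # Bottom-up pass: cost[v][l] is the best cut weight inside v's
    # subtree when v takes label l.
    cost: Dict[Node, Dict[int, float]] = {}
    choice: Dict[Tuple[Node, int], int] = {}
    for v in reversed(order):
        cost[v] = {l: (0.0 if known.get(v, l) == l else INF) for l in labels}
        for u, w in adj[v]:
            if u != root and parent[u] == v:  # u is a child of v
                for l in labels:
                    best = min(labels, key=lambda m: cost[u][m] + (w if m != l else 0.0))
                    choice[(u, l)] = best
                    cost[v][l] += cost[u][best] + (w if best != l else 0.0)

    # Top-down pass: read the optimal labels back from the root.
    assignment = {root: min(labels, key=lambda l: cost[root][l])}
    for v in order[1:]:
        assignment[v] = choice[(v, assignment[parent[v]])]
    return assignment, cost[root][assignment[root]]
```

On a path a-b-c-d with edge weights 2, 1, 2 and the two end nodes fixed to different labels, the sketch cuts the middle edge, which is the lightest one.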

9.Analyzing Healthcare Big Data for Patient Satisfaction

Author:Wan, KY;Alagar, V

Source:2017 13TH INTERNATIONAL CONFERENCE ON NATURAL COMPUTATION, FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY (ICNC-FSKD),2017,Vol.

Abstract:Healthcare Big Data (HBD) is more complex than Big Data (BD) arising from any other critical sector because a variety of data sources and procedures are followed in traditional hospital settings and in healthcare networks (e-Health). In order to achieve their primary goal, which is to enhance patient experience while sustaining dependable care within financial viability and respect for government regulations, the HBD should be analyzed to determine the patient satisfaction level. In general, there exists no accepted method yet for measuring patient satisfaction. The traditional approach for evaluating hospital-based healthcare is through a statistical analysis of responses of clients to a survey, often conducted by a third party. Such methods are often affected by incomplete information, inaccurate hypotheses, and error-prone analysis. Analyzing data generated through automated healthcare networks for assessing the effectiveness of service provision and patient satisfaction is more challenging. It is in this context that we discuss in this paper factors that contribute to patient satisfaction, and propose an algorithmic method to assess it from HBD analysis.

10.Random subspace support vector machine ensemble for reliable face recognition

Author:Zhang,Bailing

Source:International Journal of Biometrics,2014,Vol.6

Abstract:Face recognition still meets challenges despite the progress made. One of the less addressed problems is rejecting unregistered subjects. Aiming to tackle this problem, this paper proposes a random subspace support vector machine (SVM) ensemble to provide classification confidence and implement a reject option to accommodate situations where no classification should be made. The ensemble is created using the random subspace (RS) method, together with four feature descriptions including local binary pattern (LBP), pyramid histogram of oriented gradient (PHOG), Gabor filtering and wavelet transform. The consensus degree from the ensemble's voting conforms to the confidence measure, and rejection is accomplished accordingly when the confidence falls below a threshold. The reliable recognition scheme is empirically evaluated on several benchmark face databases including AR faces, FERET faces and Yale B faces, all of which yielded highly reliable results, thus demonstrating the effectiveness of the proposed approach. © 2014 Inderscience Enterprises Ltd.
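
The reject option described above can be sketched as a majority vote with a confidence threshold. The sketch covers only the voting step and assumes the ensemble members have already produced their predictions; the function name `vote_with_reject` and the default threshold of 0.6 are illustrative and do not come from the paper.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence, Tuple


def vote_with_reject(votes: Sequence[Hashable],
                     threshold: float = 0.6) -> Tuple[Optional[Hashable], float]:
    """Return (label, confidence), where confidence is the fraction of
    ensemble members agreeing with the majority label. The label is
    None (rejected) when confidence falls below `threshold`."""
    if not votes:
        raise ValueError("at least one vote is required")
    label, count = Counter(votes).most_common(1)[0]
    confidence = count / len(votes)
    if confidence < threshold:
        return None, confidence  # no classification should be made
    return label, confidence
```

Raising the threshold rejects more inputs, trading coverage for reliability; an unregistered face would ideally split the votes across many identities and fall below it.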

11.Feature Representation Matters End-to-End Learning for Reference-Based Image Super-Resolution

Author:Xie, Yanchun ; Xiao, Jimin ; Sun, Mingjie ; Yao, Chao ; Huang, Kaizhu

Source:Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics),2020,Vol.12349 LNCS

Abstract:In this paper, we aim for a general reference-based super-resolution setting: it does not require the low-resolution image and the high-resolution reference image to be well aligned or to have a similar texture. Instead, we only intend to transfer the relevant textures from reference images to the output super-resolution image. To this end, we engaged neural texture transfer to swap texture features between the low-resolution image and the high-resolution reference image. We identified the importance of designing super-resolution task-specific features rather than classification-oriented features for neural texture transfer, making the feature extractor more compatible with the image synthesis task. We develop an end-to-end training framework for the reference-based super-resolution task, where the feature encoding network prior to matching and swapping is jointly trained with the image synthesis network. We also discovered that learning the high-frequency residual is an effective way for the reference-based super-resolution task. Without bells and whistles, the proposed method E2ENT achieved better performance than the state-of-the-art method (i.e., SRNTT with five loss functions) with only two basic loss functions. Extensive experimental results on several datasets demonstrate that the proposed method E2ENT can achieve superior performance to existing best models both quantitatively and qualitatively. © 2020, Springer Nature Switzerland AG.

12.Compressing Deep Networks by Neuron Agglomerative Clustering

Author:Wang, LN;Liu, WX;Liu, X;Zhong, GQ;Roy, PP;Dong, JY;Huang, KZ

Source:SENSORS,2020,Vol.20

Abstract:In recent years, deep learning models have achieved remarkable successes in various applications, such as pattern recognition, computer vision, and signal processing. However, high-performance deep architectures are often accompanied by a large storage space and long computational time, which make it difficult to fully exploit many deep neural networks (DNNs), especially in scenarios in which computing resources are limited. In this paper, to tackle this problem, we introduce a method for compressing the structure and parameters of DNNs based on neuron agglomerative clustering (NAC). Specifically, we utilize the agglomerative clustering algorithm to find similar neurons, while these similar neurons and the connections linked to them are then agglomerated together. Using NAC, the number of parameters and the storage space of DNNs are greatly reduced, without the support of an extra library or hardware. Extensive experiments demonstrate that NAC is very effective for the neuron agglomeration of both the fully connected and convolutional layers, which are common building blocks of DNNs, delivering similar or even higher network accuracy. Specifically, on the benchmark CIFAR-10 and CIFAR-100 datasets, using NAC to compress the parameters of the original VGGNet by 92.96% and 81.10%, respectively, the compact network obtained still outperforms the original networks.

13.Analysis of liquid feedstock behavior in high velocity suspension flame spraying for the development of nanostructured coatings

Author:Gozali, Ebrahim ; Kamnis, Spyros ; Gu, Sai

Source:Proceedings of the International Thermal Spray Conference,2013,Vol.

Abstract:Over the last decade the interest in thick nano-structured layers has been increasingly growing. Several new applications, including nanostructured thermoelectric coatings, thermally sprayed photovoltaic systems and solid oxide fuel cells, require reduction of micro-cracking, resistance to thermal shock and/or controlled porosity. The high velocity suspension flame spray (HVSFS) is a promising method to prepare advanced materials from nano-sized particles with unique properties. However, compared to the conventional thermal spray, HVSFS is by far more complex and difficult to control because the liquid feedstock phase undergoes aerodynamic break up and vaporization. The effects of suspension droplet size, injection velocity and mass flow rate were parametrically studied and the results were compared for axial, transverse and external injection. The numerical simulation consists of modeling aerodynamic droplet break-up and evaporation, heat and mass transfer between liquid droplets and gas phase.

14.Manifold adversarial training for supervised and semi-supervised learning

Author:Zhang, SF;Huang, KZ;Zhu, JK;Liu, Y

Source:NEURAL NETWORKS,2021,Vol.140

Abstract:We propose a new regularization method for deep learning based on manifold adversarial training (MAT). Unlike previous regularization and adversarial training methods, MAT further considers the local manifold of latent representations. Specifically, MAT manages to build an adversarial framework based on how the worst perturbation could affect the statistical manifold in the latent space rather than the output space. Particularly, a latent feature space with the Gaussian Mixture Model (GMM) is first derived in a deep neural network. We then define the smoothness by the largest variation of Gaussian mixtures when a local perturbation is given around the input data point. On one hand, the perturbations are added in the way that would roughen the statistical manifold of the latent space the most. On the other hand, the model is trained to promote the manifold smoothness the most in the latent space. Importantly, since the latent space is more informative than the output space, the proposed MAT can learn a more robust and compact data representation, leading to further performance improvement. The proposed MAT is important in that it can be considered as a superset of one recently proposed discriminative feature learning approach called center loss. We conduct a series of experiments in both supervised and semi-supervised learning on four benchmark data sets, showing that the proposed MAT can achieve remarkable performance, much better than that of the state-of-the-art approaches. In addition, we present a series of visualizations which could generate further understanding or explanation of adversarial examples. © 2021 Published by Elsevier Ltd.

15.Disentangling Semantic-to-visual Confusion for Zero-shot Learning

Author:Ye,Zihan;Hu,Fuyuan;Lyu,Fan;Li,Linyan;Huang,Kaizhu

Source:IEEE Transactions on Multimedia,2021,Vol.

Abstract:Using generative models to synthesize visual features from semantic distribution is one of the most popular solutions to ZSL image classification in recent years. The triplet loss (TL) is popularly used to generate realistic visual distributions from semantics by automatically searching discriminative representations. However, the traditional TL cannot search reliable unseen disentangled representations due to the unavailability of unseen classes in ZSL. To alleviate this drawback, we propose in this work a multi-modal triplet loss (MMTL) which utilizes multi-modal information to search a disentangled representation space. As such, all classes can interplay, which can benefit learning disentangled class representations in the searched space. Furthermore, we develop a novel model called Disentangling Class Representation Generative Adversarial Network (DCR-GAN) focusing on exploiting the disentangled representations in training, feature synthesis, and final recognition stages. Benefiting from the disentangled representations, DCR-GAN could fit a more realistic distribution over both seen and unseen features. Extensive experiments show that our proposed model can lead to superior performance to the state of the art on four benchmark datasets.

16.Driver Behavior Recognition Based on Deep Convolutional Neural Networks

Author:Yan, SY;Teng, YX;Smith, JS;Zhang, BL

Source:2016 12TH INTERNATIONAL CONFERENCE ON NATURAL COMPUTATION, FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY (ICNC-FSKD),2016,Vol.

Abstract:Traffic safety is a severe problem around the world. Many road accidents are related to the driver's unsafe driving behavior, e.g. eating while driving. In this work, we propose a vision-based solution to recognize the driver's behavior based on convolutional neural networks. Specifically, given an image, skin-like regions are extracted by a Gaussian Mixture Model and passed to a deep convolutional neural network model, namely R*CNN, to generate action labels. The skin-like regions are able to provide abundant semantic information with sufficient discriminative capability. Also, R*CNN is able to select the most informative regions from candidates to facilitate the final action recognition. We tested the proposed method on the Southeast University Driving-posture Dataset and achieved a mean Average Precision (mAP) of 97.76% on the dataset, which shows that the proposed method is effective in drivers' action recognition.

17.Semi-supervised learning: Background, applications and future directions

Author:Zhong,Guoqiang;Huang,Kaizhu

Source:Semi-Supervised Learning: Background, Applications and Future Directions,2018,Vol.

Abstract:Semi-supervised learning is an important area of machine learning. It deals with problems that involve a lot of unlabeled data and very scarce labeled data. The book focuses on state-of-the-art research on semi-supervised learning. In the first chapter, Weng, Dornaika and Jin introduce a graph construction algorithm named the constrained data self-representative graph construction (CSRGC). In the second chapter, to reduce the graph construction complexity, Zhang et al. use anchors that were a special subset chosen from the original data to construct the full graph, while randomness was injected into graphs to improve the classification accuracy and deal with the high dimensionality issue. In the third chapter, Dornaika et al. introduce a kernel version of the Flexible Manifold Embedding (KFME) algorithm. In the fourth chapter, Zhang et al. present an efficient and robust graph-based transductive classification method known as the minimum tree cut (MTC), for large scale applications. In the fifth chapter, Salazar, Safont and Vergara investigated the performance of semi-supervised learning methods in two-class classification problems with a scarce population of one of the classes. In the sixth chapter, by breaking the sample identically and independently distributed (i.i.d.) assumption, one novel framework called the field support vector machine (F-SVM) with both classification (F-SVC) and regression (F-SVR) purposes is introduced. In the seventh chapter, Gong employs the curriculum learning methodology by investigating the difficulty of classifying every unlabeled example. As a result, an optimized classification sequence was generated during the iterative propagations, and the unlabeled examples are logically classified from simple to difficult. In the eighth chapter, Tang combines semi-supervised learning with geo-tagged photo streams and concept detection to explore situation recognition. 
This book is suitable for university students (undergraduate or graduate) in computer science, statistics, electrical engineering, and anyone else who would potentially use machine learning algorithms; professors, who research artificial intelligence, pattern recognition, machine learning, data mining and related fields; and engineers, who apply machine learning models into their products.

18.MPSSD: Multi-Path Fusion Single Shot Detector

Author:Qu, SY;Huang, KZ;Hussain, A;Goulermas, Y

Source:2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN),2019,Vol.2019-July

Abstract:Recent prevalent one stage detectors, such as single shot detector (SSD) and RetinaNet, are able to detect objects faster than two stage ones while maintaining comparable accuracy. To further boost the accuracy, many studies focus on enhancing the multi-scale feature pyramid. Most of these current proposals focus on strengthening features on one pyramid, ignoring the rich connection among different scale features. In contrast, we propose a novel multi-path design to fully utilize the localization and semantics information. First, we exploit the original SSD multi-scale features as our base pyramid. Then we fuse these features in different groups to generate multi-path feature pyramids. Finally, we combine these pyramids through a novel and effective aggregation module, to obtain the final informative pyramid for detection. Comparative experiments on benchmark PASCAL VOC and MS COCO datasets have shown that our proposed method outperforms many state-of-the-art detectors. As an illustrative example, for input image with size 512x512, we can achieve a mean Average Precision (mAP) of 81.8% on VOC2007 test and 33.1% mAP on COCO test-dev2015.

19.Vehicle identification by improved stacking via kernel principal component regression

Author:Zhang,Bailing;Pan,Hao

Source:International Journal of Intelligent Computing and Cybernetics,2014,Vol.7

Abstract:Purpose - Many applications in intelligent transportation demand accurate categorization of vehicles. The purpose of this paper is to propose a working image-based vehicle classification system. The first component, vehicle detection, is implemented by applying Dalal and Triggs’s histograms of oriented gradients features and a linear support vector machine (SVM) classifier. The second component, vehicle classification, which is the emphasis of this paper, is accomplished by an improved stacked generalization. As an effective ensemble learning strategy, stacked generalization has been proposed to combine multiple models using the concept of a meta-learner. However, it was found that the well-known meta-learning scheme for stacked generalization, multi-response linear regression (MLR), performs poorly on vehicle classification.

20.Resource-aware Service-oriented Approach for Elderly Healthcare

Author:Wan, KY;Alagar, V

Source:2018 5TH INTERNATIONAL CONFERENCE ON SYSTEMS AND INFORMATICS (ICSAI),2018,Vol.

Abstract:To provide healthcare services to the elderly in an optimal fashion, while supporting their safety and privacy concerns, it is essential to develop systems that provide dependable healthcare services whenever they are demanded and from wherever they are demanded by the elderly. In this paper we propose a resource-aware, service-oriented service model as a way to design systems that can efficiently provide quality healthcare services. The primary virtues of this approach are scalability, reusability, dependability, and adaptability of the resource and service specifications.
Total 151 results found