-
Minoru WATANABE
2018 Volume E101.D Issue 2 Pages
277
Published: February 01, 2018
Released on J-STAGE: February 01, 2018
JOURNAL
FREE ACCESS
-
Motoki AMAGASAKI, Masato IKEBE, Qian ZHAO, Masahiro IIDA, Toshinori SU ...
Article type: PAPER
Subject area: Device and Architecture
2018 Volume E101.D Issue 2 Pages
278-287
Published: February 01, 2018
Released on J-STAGE: February 01, 2018
JOURNAL
FREE ACCESS
Three-dimensional (3D) field-programmable gate arrays (FPGAs) are expected to offer higher logic density as well as improved delay and power performance by utilizing 3D integrated circuit technology. However, because through-silicon vias (TSVs) for conventional 3D FPGA interlayer connections have a large area overhead, there is an inherent tradeoff between connectivity and small size. To find a balance between cost and performance, and to explore 3D FPGAs with realistic 3D integration processes, we propose two types of 3D FPGA and construct design tool sets for architecture exploration. In previous research, we created a TSV-free 3D FPGA with a face-down integration method; however, this was limited to two layers. In this paper, we discuss the face-up stacking of several face-down stacked FPGAs. To minimize the number of TSVs, we place them on the periphery of the FPGA layers in the 4-layer 3D FPGA. According to our results, a 2-layer 3D FPGA offers reasonable performance when the design is limited to two layers, but a 4-layer 3D FPGA is the better choice when area is emphasized.
-
Hoang-Gia VU, Shinya TAKAMAEDA-YAMAZAKI, Takashi NAKADA, Yasuhiko NAKA ...
Article type: PAPER
Subject area: Device and Architecture
2018 Volume E101.D Issue 2 Pages
288-302
Published: February 01, 2018
Released on J-STAGE: February 01, 2018
JOURNAL
FREE ACCESS
Modern FPGAs have been integrated into computing systems as accelerators for long-running applications. This integration puts more pressure on the fault tolerance of computing systems, and the requirement for dependability becomes essential. As in the case of CPU-based systems, checkpoint/restart techniques are expected to improve the dependability of FPGA-based computing. Three issues arise in this situation: how to checkpoint and restart FPGAs, how well this checkpoint/restart model works with the checkpoint/restart model of the whole computing system, and how to build the model with a software tool. In this paper, we first present a new checkpoint/restart architecture along with a checkpointing mechanism on FPGAs. We then propose a method to capture consistent snapshots of the FPGA and the rest of the computing system. Third, we provide “fine-grained” management for checkpointing to reduce performance degradation. For the host CPU, we also provide a software stack that includes API functions to manage checkpoint/restart procedures on FPGAs. Fourth, we present a Python-based tool that inserts the checkpointing infrastructure. Experimental results show that the checkpointing architecture causes less than 10% degradation of the maximum clock frequency, low checkpointing latencies, small memory footprints, and small increases in power consumption, while the LUT overhead varies from 17.98% (Dijkstra) to 160.67% (Matrix Multiplication).
-
Toshihiro KATASHITA, Masakazu HIOKI, Yohei HORI, Hanpei KOIKE
Article type: PAPER
Subject area: Device and Architecture
2018 Volume E101.D Issue 2 Pages
303-313
Published: February 01, 2018
Released on J-STAGE: February 01, 2018
JOURNAL
FREE ACCESS
Field-programmable gate array (FPGA) devices are used to accelerate specific calculations and reduce power consumption in a wide range of areas. One of the challenges associated with FPGAs is reducing static power to improve their power efficiency. We propose a method involving fine-grained reconfiguration of the body biases of logic and net resources to reduce the static power of FPGA devices. In addition, we developed an FPGA device called the Flex Power FPGA with SOTB technology and demonstrated its power reduction function with a 32-bit counter circuit. In this paper, we describe the construction of an experimental platform to precisely evaluate the power consumption and maximum operating frequency of the device under various operating voltages and body biases with various practical circuits. Using this platform, we evaluate the Flex Power FPGA chip at operating voltages of 0.5-1.0 V and body biases of 0.0-0.5 V. In the evaluation, we use a 32-bit adder, a 16-bit multiplier, and an SBOX circuit for AES cryptography. We virtually operate the chip with a uniform body bias voltage to drive all of the logic resources with the same threshold voltage. We demonstrate the advantage of the Flex Power FPGA by comparing its performance with non-reconfigurable biasing.
-
Hidenori GYOTEN, Masayuki HIROMOTO, Takashi SATO
Article type: PAPER
Subject area: Device and Architecture
2018 Volume E101.D Issue 2 Pages
314-323
Published: February 01, 2018
Released on J-STAGE: February 01, 2018
JOURNAL
FREE ACCESS
An area-efficient FPGA-based annealing processor based on the Ising model is proposed. The proposed processor eliminates random number generators (RNGs) and temperature schedulers, which are key components of conventional annealing processors and occupy a large portion of the design. Instead, a shift-register-based spin-flipping scheme keeps the Ising model from becoming stuck in local optimum solutions. An FPGA implementation and a software-based evaluation on max-cut problems with a 2D-grid torus structure demonstrate that our annealing processor solves the problems 10-10^4 times faster than conventional optimization algorithms while obtaining solutions of equal accuracy.
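For reference, the following is a minimal software sketch of conventional simulated annealing on an Ising model for max-cut on a 2D-grid torus. It deliberately uses an explicit RNG and a temperature schedule, i.e., exactly the components the proposed processor replaces with its shift-register-based spin-flipping scheme; all sizes and parameter values are illustrative, not taken from the paper.

```python
import numpy as np

def maxcut_annealing(L=16, steps=100_000, t_start=3.0, t_end=0.05, seed=0):
    """Simulated annealing for max-cut on an LxL 2D-grid torus.

    Max-cut with edge weights w_ij maps to minimizing the Ising energy
    E = sum_{<ij>} w_ij * s_i * s_j with spins s_i in {-1, +1};
    the cut value is (sum of weights - E) / 2.
    """
    rng = np.random.default_rng(seed)
    w = rng.choice([1.0, -1.0], size=(L, L, 2))   # weights of right/down edges
    s = rng.choice([-1, 1], size=(L, L))          # random initial spins

    def local_field(i, j):
        # Sum of w_ij * s_j over the four torus neighbors of site (i, j).
        return (w[i, j, 0] * s[i, (j + 1) % L] + w[i, (j - 1) % L, 0] * s[i, (j - 1) % L]
                + w[i, j, 1] * s[(i + 1) % L, j] + w[(i - 1) % L, j, 1] * s[(i - 1) % L, j])

    for k in range(steps):
        t = t_start * (t_end / t_start) ** (k / steps)    # geometric cooling schedule
        i, j = rng.integers(L), rng.integers(L)
        dE = -2.0 * s[i, j] * local_field(i, j)           # energy change if this spin flips
        if dE <= 0 or rng.random() < np.exp(-dE / t):     # Metropolis acceptance
            s[i, j] = -s[i, j]

    energy = sum(w[i, j, 0] * s[i, j] * s[i, (j + 1) % L]
                 + w[i, j, 1] * s[i, j] * s[(i + 1) % L, j]
                 for i in range(L) for j in range(L))
    cut = (w.sum() - energy) / 2.0
    return s, cut

spins, cut_value = maxcut_annealing()
print("cut value:", cut_value)
```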
-
Akira YAMAWAKI, Seiichi SERIKAWA
Article type: PAPER
Subject area: Design Methodology and Platform
2018 Volume E101.D Issue 2 Pages
324-334
Published: February 01, 2018
Released on J-STAGE: February 01, 2018
JOURNAL
FREE ACCESS
This paper presents a method for describing image-processing software in C for high-level synthesis (HLS) technology that takes function chaining into account so as to realize efficient hardware. Sophisticated image processing is typically built as a sequence of primitives represented as sub-functions, such as gray-scaling, filtering, binarization, and thinning. Conventionally, generic description methods for each individual sub-function, which allow HLS technology to generate an efficient hardware module, have been shown. However, few studies have focused on a systematic way to describe the single top function that consists of the chained sub-functions. With the proposed method, any number of sub-functions can be chained while maintaining the pipeline structure, so the image processing can achieve near-ideal performance of 1 pixel per clock even when the processing chain is long. In addition, deadlock caused by a mismatch between the number of pushes and pops on the FIFOs connecting the functions is implicitly eliminated, and border pixels are interpolated. A case study on Canny edge detection, which chains several sub-functions, demonstrates that our proposal can easily realize the expected hardware. Experimental results on a ZYNQ FPGA show that code following our proposal can be converted into pipelined hardware of moderate size and achieves a performance gain of more than 70 times compared with software execution. Moreover, a comparative evaluation performed on the Cortex-A9 embedded processor in the ZYNQ FPGA shows that the C program restructured according to our method incurs only a small performance degradation of 8% compared with the pure C software. This indicates that our description method can be used to establish a unified image-processing library of HLS software that can be executed either on a CPU or as a hardware module for HW/SW co-design.
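The chained-sub-function structure can be mimicked in plain software with streaming stages. The sketch below (hypothetical function names, not the paper's C coding style) chains gray-scaling, a small smoothing filter, and binarization as Python generators so that pixels flow through the whole chain one at a time, which is the software analogue of the 1-pixel-per-clock pipeline the HLS description targets.

```python
from typing import Iterable, Iterator

def to_gray(pixels: Iterable[tuple]) -> Iterator[float]:
    """Stage 1: convert streamed (r, g, b) pixels to luminance."""
    for r, g, b in pixels:
        yield 0.299 * r + 0.587 * g + 0.114 * b

def smooth3(stream: Iterable[float]) -> Iterator[float]:
    """Stage 2: 3-tap moving average; the small window plays the role of
    the shift/line buffer an HLS tool would infer for a streaming filter."""
    window = []
    for v in stream:
        window.append(v)
        if len(window) > 3:
            window.pop(0)
        yield sum(window) / len(window)   # border pixels use a shorter window

def binarize(stream: Iterable[float], threshold: float = 128.0) -> Iterator[int]:
    """Stage 3: threshold the smoothed stream."""
    for v in stream:
        yield 1 if v >= threshold else 0

# Top function: the chain of sub-functions; adding another stage keeps
# the streaming (pipeline-like) structure intact.
def top(pixels):
    return binarize(smooth3(to_gray(pixels)))

frame = [(200, 200, 200), (10, 10, 10), (255, 0, 0), (0, 255, 0)]
print(list(top(frame)))
```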
-
Qian ZHAO, Motoki AMAGASAKI, Masahiro IIDA, Morihiro KUGA, Toshinori S ...
Article type: PAPER
Subject area: Design Methodology and Platform
2018 Volume E101.D Issue 2 Pages
335-343
Published: February 01, 2018
Released on J-STAGE: February 01, 2018
JOURNAL
FREE ACCESS
Major cloud service providers, including Amazon and Microsoft, have started employing field-programmable gate arrays (FPGAs) to build high-performance, low-power cloud capability. However, utilizing an FPGA-enabled cloud is still challenging for two main reasons. First, the introduction of software and hardware co-design leads to high development complexity. Second, FPGA virtualization and accelerator scheduling techniques have not been fully researched for cluster deployment. In this paper, we propose an open-source FPGA-as-a-service (FaaS) platform, hCODE, to simplify the design, management, and deployment of FPGA accelerators at cluster scale. The proposed platform implements a Shell-and-IP design pattern and an open accelerator repository to reduce the design and management costs of FPGA projects. Efficient FPGA virtualization and accelerator scheduling techniques are proposed to deploy accelerators on the FPGA-enabled cluster easily. With the proposed hCODE, hardware designers and accelerator users can be organized on one platform to efficiently build an open-hardware ecosystem.
-
Shimpei SATO, Ryohei KOBAYASHI, Kenji KISE
Article type: PAPER
Subject area: Design Methodology and Platform
2018 Volume E101.D Issue 2 Pages
344-353
Published: February 01, 2018
Released on J-STAGE: February 01, 2018
JOURNAL
FREE ACCESS
LSIs are generally designed through four stages: architectural design, logic design, circuit design, and physical design. In architectural design and logic design, designers describe their target hardware at the register-transfer level (RTL), but they generally use different languages in each phase: typically a general-purpose programming language such as C or C++ for architectural design and a hardware description language such as Verilog HDL or VHDL for logic design. This is a time-consuming way to design hardware, and a more efficient design environment is required. In this paper, we propose a new hardware modeling and high-speed simulation environment for architectural design and logic design. Our environment allows hardware to be written and verified in a single language. The environment consists of (1) a new hardware description language called ArchHDL, which enables faster simulation than Verilog HDL simulation, and (2) a source-code translation tool from ArchHDL to Verilog HDL. ArchHDL is a new language for hardware RTL modeling based on C++. The key features of this language are that (1) designers describe a combinational circuit as a function and (2) the ArchHDL library realizes non-blocking assignment in C++. Using these features, designers can write hardware seamlessly, from abstract-level descriptions to RTL descriptions in a Verilog HDL-like style. ArchHDL source code is converted to Verilog HDL code by the translation tool and can then be synthesized for FPGAs or ASICs. To evaluate our environment, we implemented a practical many-core processor in ArchHDL and measured the simulation speed on an Intel CPU and an Intel Xeon Phi processor. On the Intel CPU, ArchHDL simulation is about 4.5 times faster than simulation with Synopsys VCS. We also confirmed that RTL simulation with ArchHDL is efficiently parallelized on the Intel Xeon Phi processor. Finally, we converted the ArchHDL code to Verilog HDL and estimated the hardware utilization on an FPGA: implementing a 48-node many-core processor consumes 71% of the resources of a Virtex-7 FPGA.
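ArchHDL itself is a C++ library, but the semantics of its second key feature, non-blocking assignment, can be illustrated with a short language-neutral sketch (here in Python, with hypothetical class and method names): every register computes its next value from the current values, and all registers are committed together at the clock edge.

```python
class Reg:
    """A register with Verilog-style non-blocking assignment semantics."""
    def __init__(self, init=0):
        self.q = init        # current value (what combinational logic reads)
        self._next = init    # value scheduled with '<='

    def nb_assign(self, value):   # plays the role of the '<=' operator
        self._next = value

    def clock(self):              # commit at the clock edge
        self.q = self._next

# Two registers that swap their contents every cycle: with non-blocking
# assignment both right-hand sides are evaluated before either update.
a, b = Reg(1), Reg(2)

def cycle(regs):
    # Phase 1: "combinational" evaluation using only current values.
    a.nb_assign(b.q)
    b.nb_assign(a.q)
    # Phase 2: simultaneous update, as at a clock edge.
    for r in regs:
        r.clock()

for _ in range(3):
    cycle([a, b])
    print(a.q, b.q)   # 2 1 / 1 2 / 2 1 — a correct swap, no ordering hazard
```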
-
Akira JINGUJI, Shimpei SATO, Hiroki NAKAHARA
Article type: PAPER
Subject area: Emerging Applications
2018 Volume E101.D Issue 2 Pages
354-362
Published: February 01, 2018
Released on J-STAGE: February 01, 2018
JOURNAL
FREE ACCESS
A random forest (RF) is an ensemble machine learning algorithm used for classification and regression. It consists of multiple decision trees that are built from randomly sampled data. Compared with other machine learning algorithms, the RF offers simple and fast learning and identification, and it is widely applied to various recognition systems. Since each tree requires an unbalanced traversal and all-to-all communication among the trees is needed, the random forest is not well suited to SIMD architectures such as GPUs. Although FPGA-based accelerators have been proposed, such implementations were based on HDL design and thus required a longer design time than software-based realizations. In previous work, we showed a high-level synthesis design of the RF including a fully pipelined architecture and all-to-all communication. In this paper, to further reduce the amount of hardware, we use k-means clustering to share the comparators of the branch nodes of the decision trees. We also develop the krange tool flow, which generates the bitstream from a small number of hyperparameters. Since the proposed tool flow is based on high-level synthesis, we can obtain a high-performance RF with a short design time compared with conventional HDL design. We implemented the RF on the Xilinx Inc. ZC702 evaluation board. Compared with CPU (Intel Xeon E5607) and GPU (NVIDIA GeForce Titan) implementations, the FPGA realization was 8.4 times faster than the CPU and 62.8 times faster than the GPU, and in terms of power efficiency it was 7.8 times better than the CPU and 385.9 times better than the GPU.
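The comparator-sharing idea can be sketched in a few lines: collect the threshold constants of all branch nodes, cluster them with k-means, and replace each threshold by its cluster centroid so that branch nodes in the same cluster can reuse one comparator. This is a simplified software illustration with pure NumPy and made-up threshold data, not the krange tool flow itself.

```python
import numpy as np

def kmeans_1d(values, k, iters=50, seed=0):
    """Plain 1-D k-means over branch-node thresholds."""
    rng = np.random.default_rng(seed)
    centroids = rng.choice(values, size=k, replace=False)
    for _ in range(iters):
        labels = np.argmin(np.abs(values[:, None] - centroids[None, :]), axis=1)
        for c in range(k):
            if np.any(labels == c):
                centroids[c] = values[labels == c].mean()
    return centroids, labels

# Thresholds of the branch nodes of a (hypothetical) trained random forest.
thresholds = np.array([0.11, 0.13, 0.12, 0.48, 0.52, 0.50, 0.91, 0.89, 0.47, 0.10])

centroids, labels = kmeans_1d(thresholds, k=3)
shared = centroids[labels]          # each node now uses a shared comparator value
print("clusters :", labels)
print("shared   :", np.round(shared, 3))
print("comparators needed:", len(np.unique(labels)), "instead of", len(thresholds))
```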
-
Takeshi OHKAWA, Kazushi YAMASHINA, Hitomi KIMURA, Kanemitsu OOTSU, Tak ...
Article type: PAPER
Subject area: Emerging Applications
2018 Volume E101.D Issue 2 Pages
363-375
Published: February 01, 2018
Released on J-STAGE: February 01, 2018
JOURNAL
FREE ACCESS
A component-oriented FPGA design platform is proposed for robot system integration. FPGAs are known to be a power-efficient hardware platform, but the development cost of FPGA-based systems is currently too high to integrate them into robot systems. To solve this problem, we propose an FPGA component that allows FPGA devices to be easily integrated into robot systems based on the Robot Operating System (ROS). ROS-compliant FPGA components offer a seamless interface between the FPGA hardware and the software running on the CPU. Two experiments were conducted using the proposed components. In the first experiment, an FPGA component for image processing ran 1.7 times faster than the original software-based component and was 2.51 times more power efficient than an ordinary PC processor, despite substantial communication overhead. The second experiment showed that an FPGA component for sensor fusion was able to process multiple sensor inputs efficiently and with very low latency via parallel processing.
-
Tomoya FUJII, Shimpei SATO, Hiroki NAKAHARA
Article type: PAPER
Subject area: Emerging Applications
2018 Volume E101.D Issue 2 Pages
376-386
Published: February 01, 2018
Released on J-STAGE: February 01, 2018
JOURNAL
FREE ACCESS
A pre-trained deep convolutional neural network (CNN) for an embedded system requires high speed and low power consumption. The front part of a CNN consists of convolutional layers, while the back part consists of fully connected layers. In the convolutional layers, multiply-accumulate operations are the bottleneck, while in the fully connected layers, memory access is the bottleneck. The binarized CNN has been proposed to realize many multiply-accumulate circuits on an FPGA, so the convolutional layers can be computed at high speed. However, even if binarization is applied to the fully connected layers, the amount of memory remains a bottleneck. In this paper, we propose a neuron pruning technique which eliminates most of the weight memory, and we apply it to the fully connected layers of the binarized CNN. In that case, since the weight memory can be realized with on-chip memory on the FPGA, high-speed memory access is achieved. To further reduce the memory size, we retrain the CNN after neuron pruning. We also propose a sequential-input parallel-output circuit for the binarized fully connected layers and a streaming circuit for the binarized 2D convolutional layers. Experimental results showed that neuron pruning reduced the number of neurons in the fully connected layers of the VGG-11 CNN by 39.8% while retaining 99% of the baseline accuracy. We implemented the pruned CNN on the Xilinx Inc. Zynq ZedBoard. Compared with an ARM Cortex-A57, it was 1773.0 times faster, dissipated 3.1 times less power, and was 5781.3 times better in performance per watt; compared with a Maxwell GPU, it was 11.1 times faster, dissipated 7.7 times less power, and was 84.1 times better in performance per watt. Thus, the binarized CNN on an FPGA is well suited to embedded systems.
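A minimal sketch of magnitude-based neuron pruning of a fully connected layer is shown below (NumPy, with illustrative layer shapes and pruning ratio; the paper's exact pruning criterion and the subsequent retraining of the binarized network are not reproduced here). Neurons whose incoming weight rows have the smallest L1 norm are removed, which shrinks the weight matrices of both the pruned layer and the following one.

```python
import numpy as np

def prune_neurons(w_in, w_out, keep_ratio=0.6):
    """Remove hidden neurons of a fully connected layer by weight magnitude.

    w_in  : (n_hidden, n_inputs)  incoming weights of the layer to prune
    w_out : (n_next, n_hidden)    weights of the following layer
    """
    importance = np.abs(w_in).sum(axis=1)              # L1 norm per neuron
    n_keep = max(1, int(round(keep_ratio * w_in.shape[0])))
    keep = np.sort(np.argsort(importance)[-n_keep:])   # indices of surviving neurons
    return w_in[keep, :], w_out[:, keep], keep

rng = np.random.default_rng(0)
w1 = rng.standard_normal((512, 1024))   # fully connected layer: 1024 -> 512
w2 = rng.standard_normal((10, 512))     # classifier: 512 -> 10

w1_p, w2_p, kept = prune_neurons(w1, w2, keep_ratio=0.6)
print("hidden neurons:", w1.shape[0], "->", w1_p.shape[0])
print("weight memory :", w1.size + w2.size, "->", w1_p.size + w2_p.size)
# After pruning, the (binarized) network would be retrained to recover accuracy.
```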
-
Haiyan HUANG, Chenxi LI
Article type: PAPER
Subject area: Fundamentals of Information Systems
2018 Volume E101.D Issue 2 Pages
387-395
Published: February 01, 2018
Released on J-STAGE: February 01, 2018
JOURNAL
FREE ACCESS
Considering that people differ in their linguistic preferences, and in order to determine the consensus state when using Computing with Words (CWW) to support consensus decision making, this paper first proposes an interval composite scale based 2-tuple linguistic model, which realizes translation from words to interval numerical values and retranslation from interval numerical values back to words. Second, this paper proposes an interval composite scale based personalized individual semantics model (ICS-PISM), which can provide different linguistic representation models for different decision-makers. Finally, this paper proposes a consensus decision making model with ICS-PISM, which includes semantic translation and retranslation phases during the decision process and determines the consensus state of the whole decision process. The proposed models take full account of the fact that human language contains vague expressions and that real-world preferences are usually uncertain, and they provide efficient computation models to support consensus decision making.
-
Huimin CAI, Eryun LIU, Hongxia LIU, Shulong WANG
Article type: PAPER
Subject area: Software System
2018 Volume E101.D Issue 2 Pages
396-404
Published: February 01, 2018
Released on J-STAGE: February 01, 2018
JOURNAL
FREE ACCESS
A real-time road-direction point detection model is developed based on a convolutional neural network architecture that can adapt to complex environments. First, the concept of a road-direction point is defined for both single roads and crossroads. For a single road, the predicted road-direction point serves as a guiding point for a self-driving vehicle to move ahead. At a crossroad, multiple road-direction points can be detected, which helps the vehicle choose among the possible directions. Meanwhile, the model classifies different types of road surface, covering both paved and unpaved roads. This information is beneficial for a self-driving vehicle to speed up or slow down according to various road conditions. Finally, the performance of the model is evaluated on different platforms, including the Jetson TX1. The processing speed reaches 12 FPS on this portable embedded system, providing an effective and economical solution for road-direction estimation in autonomous navigation applications.
-
Chunyan HOU, Jinsong WANG, Chen CHEN
Article type: PAPER
Subject area: Software Engineering
2018 Volume E101.D Issue 2 Pages
405-414
Published: February 01, 2018
Released on J-STAGE: February 01, 2018
JOURNAL
FREE ACCESS
System scenarios derived from the system design specification play an important role in the reliability engineering of component-based software systems. Several scenario-based approaches have been proposed to predict the reliability of a system at design time; most of them adopt a flat construction of scenarios, which does not conform to software design specifications and is prone to the state-space explosion problem in large systems. This paper identifies various challenges related to scenario modeling at the early design stages based on the software architecture specification. A novel scenario-based reliability modeling and prediction approach is introduced. The approach adopts a hierarchical scenario specification to model software reliability, avoiding state-space explosion and reducing computational complexity. Finally, an evaluation experiment shows the potential of the approach.
-
Eita FUJISHIMA, Kenji NAKASHIMA, Saneyasu YAMAGUCHI
Article type: PAPER
Subject area: Data Engineering, Web Information Systems
2018 Volume E101.D Issue 2 Pages
415-427
Published: February 01, 2018
Released on J-STAGE: February 01, 2018
JOURNAL
FREE ACCESS
Hadoop is a popular open-source MapReduce implementation. For jobs such as TeraSort, in which the huge output files of all relevant Map tasks are transmitted to the Reduce tasks, the Reduce tasks become the bottleneck and are I/O-bound because they process many large output files. In most cases, including TeraSort, the intermediate data, which include the output files of the Map tasks, are large and accessed sequentially; to improve the performance of such jobs, it is therefore important to increase sequential access performance. In this paper, we propose methods for improving the performance of the Reduce tasks of such jobs by considering two facts: these files are accessed sequentially on an HDD, and each zone of an HDD has different sequential I/O performance. The proposed methods control where intermediate data are stored by modifying the block bitmap of the filesystem, which manages the utilization (free or used) of blocks on the HDD. In addition, we propose a striping layout for applying these methods to virtualized environments that use image files. We then present a performance evaluation of the proposed methods and demonstrate that they improve Hadoop application performance.
-
Toru NAKASHIKA
Article type: PAPER
Subject area: Artificial Intelligence, Data Mining
2018 Volume E101.D Issue 2 Pages
428-436
Published: February 01, 2018
Released on J-STAGE: February 01, 2018
JOURNAL
FREE ACCESS
Two different types of representations, such as an image and its manually assigned corresponding labels, generally have complex and strong relationships with each other. In this paper, we represent such deep relationships between two different types of visible variables using an energy-based probabilistic model, called a deep relational model (DRM), to improve prediction accuracy. A DRM stacks several layers from one visible layer on to another visible layer, sandwiching several hidden layers between them. As with restricted Boltzmann machines (RBMs) and deep Boltzmann machines (DBMs), all connections (weights) between two adjacent layers are undirected. During maximum likelihood (ML)-based training, the network attempts to capture the latent complex relationships between the two visible variables with its deep architecture. Unlike deep neural networks (DNNs), 1) the DRM is a fully generative model, 2) it allows us to generate one set of visible variables given the other, and 3) its parameters can be optimized in a probabilistic manner. The DRM can also be fine-tuned using DNNs, as with deep belief net (DBN) or DBM pre-training. This paper presents experiments conducted to evaluate the performance of a DRM in image recognition and generation tasks using the MNIST data set. In the image recognition experiments, we observed that the DRM outperformed DNNs even without fine-tuning. In the image generation experiments, the images generated by the DRM were much more realistic than those from the other generative models.
-
Chengxiang YIN, Hongjun ZHANG, Rui ZHANG, Zilin ZENG, Xiuli QI, Yuntia ...
Article type: PAPER
Subject area: Artificial Intelligence, Data Mining
2018 Volume E101.D Issue 2 Pages
437-446
Published: February 01, 2018
Released on J-STAGE: February 01, 2018
JOURNAL
FREE ACCESS
The main idea of filter methods in feature selection is to construct a feature-assessing criterion and search for the feature subset that optimizes it. The primary principle in designing such a criterion is to capture the relevance between the feature subset and the class as precisely as possible. Computing this relevance directly becomes difficult as the size of the feature subset grows, owing to the computational complexity involved, so researchers adopt approximate strategies to measure relevance. Although these strategies work well in some applications, they suffer from three problems: the parameter-determination problem, the neglect of feature-interaction information, and the overestimation of some features. We propose a new feature selection algorithm that computes the mutual information between a feature subset and the class directly, based on the computation of partitions, without worsening the computational complexity. In light of specific properties of mutual information and partitions, we propose a pruning rule and a stopping criterion to accelerate the search. To evaluate the effectiveness of the proposed algorithm, we compare it with five other algorithms in terms of the number of selected features and the classification accuracy of three classifiers. Results on six synthetic datasets show that our algorithm captures interaction information well, and results on thirteen real-world datasets show that it selects smaller yet better feature subsets.
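The core quantity, the mutual information between a feature subset and the class computed from partitions, can be sketched directly: each feature subset induces a partition of the samples (one block per distinct feature-value tuple), and I(S; Y) follows from the joint counts of partition blocks and class labels. This is a generic illustration of partition-based mutual information on toy data, not the paper's pruning rule or stopping criterion.

```python
from collections import Counter
from math import log2

def mutual_information(X, y, subset):
    """I(S; Y) where the feature subset S partitions samples by value tuple.

    X: list of feature vectors, y: class labels, subset: feature indices.
    """
    n = len(y)
    blocks = [tuple(row[i] for i in subset) for row in X]   # partition block of each sample
    p_block = Counter(blocks)
    p_class = Counter(y)
    p_joint = Counter(zip(blocks, y))
    return sum((c / n) * log2((c / n) / ((p_block[b] / n) * (p_class[cls] / n)))
               for (b, cls), c in p_joint.items())

# Toy data for illustration only.
X = [[0, 1, 0], [0, 0, 1], [1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 0, 0]]
y = [0, 1, 1, 0, 1, 1]

for subset in ([0], [1], [0, 2]):
    print(subset, round(mutual_information(X, y, subset), 3))
```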
-
Yu YAN, Kohei HARA, Takenobu KAZUMA, Yasuhiro HISADA, Aiguo HE
Article type: PAPER
Subject area: Educational Technology
2018 Volume E101.D Issue 2 Pages
447-454
Published: February 01, 2018
Released on J-STAGE: February 01, 2018
JOURNAL
FREE ACCESS
Studies have shown that program visualization (PV) is effective for supporting student programming exercises and self-study. However, very few instructors actively use PV tools for programming lectures. This article discusses the impediments instructors face when incorporating PV tools into lecture classrooms and proposes a classroom instruction support tool for C programming based on program visualization, PROVIT-CI (PROgram VIsualization Tool for Classroom Instruction). PROVIT-CI has been actively and continuously used by instructors at the authors' university to enhance their lectures since 2015. An evaluation of its application in an introductory C programming course shows that PROVIT-CI is effective and helpful for instructors' classroom use.
-
Tetsuya WATANABE, Hirotsugu KAGA, Shota SHINKAI
Article type: PAPER
Subject area: Rehabilitation Engineering and Assistive Technology
2018 Volume E101.D Issue 2 Pages
455-461
Published: February 01, 2018
Released on J-STAGE: February 01, 2018
JOURNAL
FREE ACCESS
Many text entry methods are available on touch-interface devices used with a screen reader, and blind smartphone users and their supporters are eager to know which is the easiest to learn and the fastest. We therefore compared the text entry speeds and error counts of four combinations of software keyboards and character-selecting gestures over a period of five days. The split-tap gesture on the Japanese numeric keypad was the fastest across the five days even though this text entry method produced the most errors. The two entry methods on the QWERTY keyboard were slower than the two entry methods on the numeric keypad. This difference in text entry speed is explained by differences in key-pointing and tapping times and in the number of repetitions among the methods.
-
Nobukatsu HOJO, Yusuke IJIMA, Hideyuki MIZUNO
Article type: PAPER
Subject area: Speech and Hearing
2018 Volume E101.D Issue 2 Pages
462-472
Published: February 01, 2018
Released on J-STAGE: February 01, 2018
JOURNAL
FREE ACCESS
Deep neural network (DNN)-based speech synthesis can produce more natural synthesized speech than conventional HMM-based speech synthesis. However, it has not been shown whether the synthesized speech quality can be improved by utilizing a multi-speaker speech corpus. To address this problem, this paper proposes DNN-based speech synthesis using speaker codes as a method to improve the performance of the conventional speaker-dependent DNN-based method. In order to model speaker variation in the DNN, an augmented feature (the speaker code) is fed to the hidden layer(s) of the conventional DNN. This paper investigates the effectiveness of introducing speaker codes to DNN acoustic models for speech synthesis for two tasks: multi-speaker modeling and speaker adaptation. For the multi-speaker modeling task, the proposed method trains the connection weights of the whole DNN using a multi-speaker speech corpus. When performing multi-speaker synthesis, the speaker code corresponding to the selected target speaker is fed to the DNN to generate that speaker's voice. When performing speaker adaptation, a set of connection weights of the multi-speaker model is re-estimated to generate a new target speaker's voice. We investigated the relationship between prediction performance and the architecture of the DNNs through objective measurements. Objective evaluation experiments revealed that the proposed model outperformed conventional methods (HMMs, speaker-dependent DNNs, and multi-speaker DNNs based on a shared-hidden-layer structure). Subjective evaluation results showed that the proposed model again outperformed the conventional methods (HMMs, speaker-dependent DNNs), especially when using a small number of target-speaker utterances.
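The way a speaker code augments the acoustic model can be illustrated with a single forward pass: a one-hot speaker code is concatenated to the input of a chosen hidden layer, so the same shared weights produce different outputs for different speakers. The layer sizes and the layer receiving the code below are arbitrary; the paper's actual DNN topology is not reproduced in this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
n_ling, n_hidden, n_acoustic, n_speakers = 300, 256, 187, 8

# Shared multi-speaker model parameters (randomly initialized for illustration).
W1 = rng.standard_normal((n_hidden, n_ling)) * 0.01
W2 = rng.standard_normal((n_hidden, n_hidden + n_speakers)) * 0.01  # layer fed with the code
W3 = rng.standard_normal((n_acoustic, n_hidden)) * 0.01

def synthesize_frame(linguistic_feat, speaker_id):
    """Forward pass with the speaker code injected at the second hidden layer."""
    code = np.zeros(n_speakers)
    code[speaker_id] = 1.0                         # one-hot speaker code
    h1 = np.tanh(W1 @ linguistic_feat)
    h2 = np.tanh(W2 @ np.concatenate([h1, code]))  # augmented hidden-layer input
    return W3 @ h2                                 # acoustic features for this frame

x = rng.standard_normal(n_ling)
out_spk0 = synthesize_frame(x, speaker_id=0)
out_spk3 = synthesize_frame(x, speaker_id=3)
print("same text, different speakers differ:", not np.allclose(out_spk0, out_spk3))
```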
-
Kyeongmin JEONG, Kwangyeon CHOI, Donghwan KIM, Byung Cheol SONG
Article type: PAPER
Subject area: Image Processing and Video Processing
2018 Volume E101.D Issue 2 Pages
473-480
Published: February 01, 2018
Released on J-STAGE: February 01, 2018
JOURNAL
FREE ACCESS
An advanced driver assistance system (ADAS) can recognize traffic signals, vehicles, pedestrians, and other objects around the vehicle. However, because an ADAS relies on images taken outdoors, it is susceptible to ambient weather such as fog, so preprocessing such as defogging and dehazing is required to prevent degradation of object recognition performance due to decreased visibility. On the other hand, if such a fog removal technique is applied in an environment with little or no fog, visual quality may deteriorate because of excessive contrast enhancement, and in foggy road environments typical fog removal algorithms suffer from color distortion. In this paper, we propose a temporal-filter-based fog detection algorithm that selectively applies defogging only in the presence of fog. We also propose a method to avoid color distortion by detecting the sky region and applying different methods to the sky region and the non-sky region. Experimental results on real images show that the proposed algorithm achieves an average fog detection accuracy of more than 97% and improves the subjective image quality of existing defogging algorithms. In addition, the proposed algorithm has a very fast computation time of less than 0.1 ms per frame.
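The selective-application idea can be sketched as follows: a per-frame fog score is smoothed by a temporal filter (here a simple exponential moving average, with an illustrative score based on brightness and contrast) and defogging is enabled only while the smoothed score stays above a threshold. The score, filter, and thresholds below are assumptions for illustration and are not the ones used in the paper.

```python
import numpy as np

def fog_score(gray_frame):
    """Illustrative per-frame fog cue: foggy frames tend to be bright and low-contrast."""
    contrast = gray_frame.std() / 255.0
    brightness = gray_frame.mean() / 255.0
    return brightness * (1.0 - contrast)

def fog_detector(frames, alpha=0.9, on_thresh=0.55, off_thresh=0.45):
    """Temporal filtering with hysteresis: defog only when fog persists over time."""
    smoothed, defog_on = None, False
    for frame in frames:
        s = fog_score(frame)
        smoothed = s if smoothed is None else alpha * smoothed + (1.0 - alpha) * s
        if not defog_on and smoothed > on_thresh:
            defog_on = True
        elif defog_on and smoothed < off_thresh:
            defog_on = False
        yield smoothed, defog_on

rng = np.random.default_rng(0)
clear = [rng.integers(0, 256, (120, 160)).astype(np.float64) for _ in range(30)]
foggy = [np.clip(180 + 10 * rng.standard_normal((120, 160)), 0, 255) for _ in range(30)]

for i, (s, on) in enumerate(fog_detector(clear + foggy)):
    if i % 10 == 0:
        print(f"frame {i:2d}  score={s:.2f}  defogging={'ON' if on else 'off'}")
```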
-
Yoshiki ITO, Takahiro OGAWA, Miki HASEYAMA
Article type: PAPER
Subject area: Image Processing and Video Processing
2018 Volume E101.D Issue 2 Pages
481-490
Published: February 01, 2018
Released on J-STAGE: February 01, 2018
JOURNAL
FREE ACCESS
A method for accurate estimation of personalized video preference using multiple users' viewing behavior is presented in this paper. The proposed method uses three kinds of features: video features, users' viewing behavior, and evaluation scores for the video given by a target user. First, the proposed method applies Supervised Multiview Spectral Embedding (SMSE) to obtain lower-dimensional video features suitable for the subsequent correlation analysis. Next, supervised Multi-View Canonical Correlation Analysis (sMVCCA) is applied to integrate the three kinds of features. This yields optimal projections for obtaining new visual features, “canonical video features”, that reflect the target user's individual preference for a video based on sMVCCA. Furthermore, our method uses not only the target user's viewing behavior but also other users' viewing behavior to obtain the optimal canonical video features of the target user. This unique approach is the biggest contribution of this paper. Finally, by integrating these canonical video features, Support Vector Ordinal Regression with Implicit Constraints (SVORIM) is trained. Consequently, the target user's preference for a video can be estimated by using the trained SVORIM. Experimental results show the effectiveness of our method.
-
Yung-Yao CHEN, Yi-Cheng ZHANG
Article type: PAPER
Subject area: Image Recognition, Computer Vision
2018 Volume E101.D Issue 2 Pages
491-503
Published: February 01, 2018
Released on J-STAGE: February 01, 2018
JOURNAL
FREE ACCESS
Tracking-by-detection methods treat the tracking task as a continuous detection problem applied over video frames. Modern tracking-by-detection trackers have online learning ability; the update stage is essential because it determines how to modify the classifier inherent in a tracker. However, most trackers search for the target within a fixed region centered at the previous object position and thus lack spatiotemporal consistency. This becomes a problem when the tracker detects an incorrect object during short-term occlusion. In addition, the scale of the bounding box that contains the target object is usually assumed not to change. This assumption is unrealistic for long-term tracking, where the scale of the target varies as the distance between the target and the camera changes. The accumulation of errors resulting from these shortcomings leads to the drift problem, i.e., drifting away from the target object. To resolve this problem, we present a drift-free, online learning-based tracking-by-detection method using a single static camera. We improve the latent structured support vector machine (SVM) tracker by designing a more robust tracker update step that incorporates two Kalman filter modules: the first is used to predict an adaptive search region in consideration of the object motion; the second is used to adjust the scale of the bounding box by accounting for the background model. We propose a hierarchical search strategy that combines Bhattacharyya coefficient similarity analysis and Kalman predictors. This strategy facilitates overcoming occlusion and increases tracking efficiency. We thoroughly evaluate this work using publicly available videos. Experimental results show that the proposed method outperforms state-of-the-art trackers.
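The first Kalman module, which predicts an adaptive search region from the object's motion, can be sketched with a standard constant-velocity filter: the predicted position centers the search window, and the prediction covariance can be used to scale it. Below is a generic constant-velocity Kalman filter in NumPy with illustrative noise parameters, not the paper's exact tuning.

```python
import numpy as np

class ConstantVelocityKF:
    """Kalman filter with state [x, y, vx, vy] for predicting the search-region center."""
    def __init__(self, x0, y0, dt=1.0, q=1e-2, r=1.0):
        self.x = np.array([x0, y0, 0.0, 0.0])
        self.P = np.eye(4) * 10.0
        self.F = np.array([[1, 0, dt, 0], [0, 1, 0, dt], [0, 0, 1, 0], [0, 0, 0, 1]], float)
        self.H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], float)
        self.Q = np.eye(4) * q
        self.R = np.eye(2) * r

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2], self.P[:2, :2]        # predicted center and its uncertainty

    def update(self, z):                         # z = detected object center
        y = np.asarray(z, float) - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P

kf = ConstantVelocityKF(100.0, 50.0)
for t in range(1, 6):
    center, cov = kf.predict()
    half = 40 + 2.0 * np.sqrt(np.trace(cov))     # grow the search window with uncertainty
    print(f"t={t}: search region center={np.round(center, 1)}, half-size={half:.1f}")
    kf.update([100.0 + 3.0 * t, 50.0 + 1.5 * t]) # detection found inside the region
```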
-
Lishuang LI, Xinyu HE, Jieqiong ZHENG, Degen HUANG, Fuji REN
Article type: PAPER
Subject area: Natural Language Processing
2018 Volume E101.D Issue 2 Pages
504-511
Published: February 01, 2018
Released on J-STAGE: February 01, 2018
JOURNAL
FREE ACCESS
Protein-Protein Interaction Extraction (PPIE) from the biomedical literature is an important task in biomedical text mining and has achieved great success on public datasets. However, in real-world applications, existing PPI extraction methods are limited by the labeling effort they require. Therefore, transfer learning is applied to reduce the cost of manual labeling. Current transfer learning methods suffer from negative transfer and degraded performance. To tackle this problem, an improved TrAdaBoost algorithm is proposed: the relative distribution is introduced to initialize the weights of TrAdaBoost in order to overcome the negative transfer caused by domain differences. To further improve the performance of transfer learning, an approach combining active learning with the improved TrAdaBoost is presented. Experimental results on publicly available PPI corpora show that our method outperforms TrAdaBoost and SVM when the labeled data are insufficient, and results on document classification corpora also show that the proposed approaches ultimately achieve better performance than TrAdaBoost and TPTSVM, which verifies the effectiveness of our methods.
-
Seung-Hoon NA, Young-Kil KIM
Article type: PAPER
Subject area: Natural Language Processing
2018 Volume E101.D Issue 2 Pages
512-522
Published: February 01, 2018
Released on J-STAGE: February 01, 2018
JOURNAL
FREE ACCESS
In this paper, we propose a novel phrase-based model for Korean morphological analysis that treats a phrase as the basic processing unit, which generalizes all the other existing processing units. The use of phrases in this way is largely motivated by the success of phrase-based statistical machine translation (SMT), which convincingly shows that the larger the processing unit, the better the performance. Experimental results using the SEJONG dataset show that the proposed phrase-based models outperform the morpheme-based models used as baselines. In particular, when combined with the conditional random field (CRF) model, our model leads to statistically significant improvements over the state-of-the-art CRF method.
-
XueTing LIM, Kenjiro SUGIMOTO, Sei-ichiro KAMATA
Article type: PAPER
Subject area: Biological Engineering
2018 Volume E101.D Issue 2 Pages
523-530
Published: February 01, 2018
Released on J-STAGE: February 01, 2018
JOURNAL
FREE ACCESS
Seed detection, sometimes known as nuclei detection, is a prerequisite step for nuclei segmentation, which plays a critical role in quantitative cell analysis. A detection result is considered accurate if each detected seed lies in only one nucleus and is close to the nucleus center. In previous works, voting methods were employed to detect nucleus centers by extracting nucleus saliency features. However, these methods still risk false seeding, especially for images with heterogeneous intensity. To overcome the drawbacks of previous works, a novel detection method called secant normal voting is proposed. Secant normal voting achieves good performance with the proposed skipping range, which avoids over-segmentation by preventing false seeding in occlusion regions. Nucleus centers are obtained by mean-shift clustering from the clouds of voting points. In the experiments, we show that our proposed method outperforms the comparison methods by achieving high detection accuracy without sacrificing computational efficiency.
-
Jin-Taek SEONG
Article type: LETTER
Subject area: Fundamentals of Information Systems
2018 Volume E101.D Issue 2 Pages
531-534
Published: February 01, 2018
Released on J-STAGE: February 01, 2018
JOURNAL
FREE ACCESS
In this paper, we develop a recovery algorithm for sparse signals in a compressed sensing (CS) framework over finite fields. A basic CS framework for discrete rather than continuous signals is established, from the linear measurement step to reconstruction. Given a predetermined prior distribution of the sparse signal, we reconstruct it using a message passing algorithm and evaluate the performance obtained from simulation. We compare our simulation results with theoretical bounds obtained from a probability analysis.
-
Jiajun ZHOU, Bo LIU, Lu DENG, Yaofeng CHEN, Zhefeng XIAO
Article type: LETTER
Subject area: Fundamentals of Information Systems
2018 Volume E101.D Issue 2 Pages
535-538
Published: February 01, 2018
Released on J-STAGE: February 01, 2018
JOURNAL
FREE ACCESS
Graph sampling is an effective method for sampling a representative subgraph from a large-scale network. Recent research has shown that several classical sampling methods are able to produce graph samples that do not match the distribution of graph properties in the original graph well. Moreover, the validation of these sampling methods and the scale of a good graph sample have not been examined on weighted graphs. In this paper, we formulate the weighted graph sampling problem. We consider the proper size of a good graph sample, propose novel methods to verify the effectiveness of sampling, and test several algorithms on real datasets. Most notably, we obtain new practical results that shed new insight on weighted graph sampling: the weighted random walk performs best among the tested algorithms, and a graph sample of 20% is sufficient for weighted graph sampling.
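The best-performing method in this study, the weighted random walk, is easy to state: at each step the walker moves to a neighbor chosen with probability proportional to the edge weight, and the visited nodes (here until a target fraction of the graph is covered) induce the sample subgraph. Below is a generic sketch of that procedure using only the standard library; the details of the paper's variant may differ.

```python
import random

def weighted_random_walk_sample(adj, sample_fraction=0.2, seed=0):
    """adj: {node: [(neighbor, weight), ...]}  ->  set of sampled nodes."""
    rng = random.Random(seed)
    target = max(1, int(sample_fraction * len(adj)))
    current = rng.choice(list(adj))
    sampled = {current}
    while len(sampled) < target:
        neighbors = adj[current]
        if not neighbors:                       # dead end: restart the walk
            current = rng.choice(list(adj))
        else:
            nodes, weights = zip(*neighbors)
            current = rng.choices(nodes, weights=weights, k=1)[0]
        sampled.add(current)
    return sampled

# Small weighted toy graph (undirected, weights attached to both directions).
adj = {
    0: [(1, 5.0), (2, 1.0)],
    1: [(0, 5.0), (2, 2.0), (3, 0.5)],
    2: [(0, 1.0), (1, 2.0), (3, 4.0)],
    3: [(1, 0.5), (2, 4.0), (4, 1.0)],
    4: [(3, 1.0)],
}
print(weighted_random_walk_sample(adj, sample_fraction=0.4))
```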
-
Yun-Feng XING, Xiao CHEN, Ming-Xiang GUAN, Zhe-Ming LU
Article type: LETTER
Subject area: Fundamentals of Information Systems
2018 Volume E101.D Issue 2 Pages
539-542
Published: February 01, 2018
Released on J-STAGE: February 01, 2018
JOURNAL
FREE ACCESS
Considering that the traditional local-world evolving network model cannot fully reflect the characteristics of real-world power grids, this Letter proposes a new evolving model based on geographical location clusters. The proposed model takes into account both the geographical locations and the degree values of nodes, and its growth process is in line with the characteristics of power grids. Comparison with the characteristics of real-world power grids shows that the proposed model can reproduce the degree distribution of China's power grids when the number of nodes is small, and the degree distribution of the western USA power grid when the number of nodes exceeds 800. The average distances and clustering coefficients of the proposed model are also close to those of real-world power grids. All these properties confirm the validity and rationality of our model.
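A generic sketch of growth that combines node degree with geographical distance is given below: each new node is placed at a random location and connects to existing nodes with probability proportional to degree divided by distance. This only illustrates the general idea of location-aware preferential attachment; the cluster construction and the exact attachment rule of the proposed model are not reproduced.

```python
import numpy as np

def grow_spatial_network(n_nodes=500, m=2, seed=0):
    """Grow a network where attachment favors high-degree, nearby nodes."""
    rng = np.random.default_rng(seed)
    pos = [rng.random(2)]                 # geographical coordinates in the unit square
    degree = [0]
    edges = []
    for new in range(1, n_nodes):
        p_new = rng.random(2)
        d = np.array([np.linalg.norm(p_new - p) + 1e-3 for p in pos])
        score = (np.array(degree) + 1.0) / d          # degree-over-distance preference
        prob = score / score.sum()
        k = min(m, new)
        targets = rng.choice(new, size=k, replace=False, p=prob)
        for t in targets:
            edges.append((new, int(t)))
            degree[t] += 1
        pos.append(p_new)
        degree.append(k)
    return pos, degree, edges

pos, degree, edges = grow_spatial_network()
hist = np.bincount(degree)
print("degree distribution (degree: count):",
      {k: int(c) for k, c in enumerate(hist) if c > 0})
```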
-
Hyun KWON, Yongchul KIM, Hyunsoo YOON, Daeseon CHOI
Article type: LETTER
Subject area: Information Network
2018 Volume E101.D Issue 2 Pages
543-546
Published: February 01, 2018
Released on J-STAGE: February 01, 2018
JOURNAL
FREE ACCESS
We propose new CAPTCHA image generation systems that use generative adversarial network (GAN) techniques to strengthen CAPTCHAs against CAPTCHA solvers. CAPTCHA images are widely used in the web industry today to verify whether a user is human. We introduce two different systems for generating CAPTCHA images, namely the distance GAN (D-GAN) and the composite GAN (C-GAN). The D-GAN adds distance values to the original CAPTCHA images to generate new ones, and the C-GAN generates a CAPTCHA image by composing multiple source images. To evaluate the performance of the proposed schemes, we used CAPTCHA-breaker software as the CAPTCHA solver and compared the resistance of the original source images and the generated CAPTCHA images against it. The results show that the proposed schemes improve resistance to the CAPTCHA solver by over 67.1% and 89.8%, depending on the system.
-
Joyce Jiyoung WHANG, Yunseob SHIN
Article type: LETTER
Subject area: Artificial Intelligence, Data Mining
2018 Volume E101.D Issue 2 Pages
547-551
Published: February 01, 2018
Released on J-STAGE: February 01, 2018
JOURNAL
FREE ACCESS
In social and information network analysis, ranking has been considered one of the most fundamental and important tasks, where the goal is to rank the nodes of a given graph according to their importance. For example, the PageRank and HITS algorithms are well-known ranking methods. While these traditional ranking methods focus only on the structure of the entire network, we propose to incorporate a local view into node ranking by exploiting the clustering structure of real-world networks. We develop localized ranking mechanisms by partitioning a graph into a set of tightly knit groups and computing a localized ranking within each extracted group. Experimental results show that our localized ranking methods rank the nodes quite differently from the traditional global ranking methods, which indicates that our methods provide new insights and meaningful viewpoints for network analysis.
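A minimal version of the localized ranking idea is shown below using NetworkX: given a partition of the graph into tightly knit groups (supplied here by hand, since the paper's partitioning algorithm is not reproduced), PageRank is computed inside each extracted subgraph, so a node's rank reflects its importance within its own group rather than in the whole network.

```python
import networkx as nx

def localized_pagerank(G, partition):
    """partition: {node: group_id}. Returns {node: rank within its own group}."""
    ranks = {}
    for gid in set(partition.values()):
        members = [n for n, g in partition.items() if g == gid]
        sub = G.subgraph(members)
        ranks.update(nx.pagerank(sub))
    return ranks

# Two tightly knit groups joined by a single bridge edge.
G = nx.Graph()
G.add_edges_from([(0, 1), (1, 2), (2, 0), (0, 3),      # group A
                  (4, 5), (5, 6), (6, 4), (4, 7),      # group B
                  (3, 7)])                             # bridge
partition = {0: "A", 1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B", 7: "B"}

global_rank = nx.pagerank(G)
local_rank = localized_pagerank(G, partition)
for n in sorted(G):
    print(n, f"global={global_rank[n]:.3f}", f"local={local_rank[n]:.3f}")
```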
-
Yangyu FAN, Rui DU, Jianshu WANG
Article type: LETTER
Subject area: Pattern Recognition
2018 Volume E101.D Issue 2 Pages
552-555
Published: February 01, 2018
Released on J-STAGE: February 01, 2018
JOURNAL
FREE ACCESS
Identification of urban road targets using radar systems usually depends heavily on the aspect angle between the target velocity and the line of sight of the radar. To improve classification performance when the target is in a cross-range position relative to the radar, a method based on the range micro-Doppler signature is proposed in this paper. Joint time-frequency analysis is applied in every range cell to extract the time-Doppler signature, and the spectrograms from all of the target's range cells are combined to form the range micro-Doppler signature for further identification. Experiments were conducted to investigate the performance of the proposed method, and the results demonstrate its effectiveness.
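The signature construction amounts to computing a spectrogram in each range cell and stacking the results over range. A schematic NumPy/SciPy sketch on synthetic slow-time data is given below; the radar parameters and the micro-Doppler model are purely illustrative.

```python
import numpy as np
from scipy.signal import spectrogram

prf = 1000.0                       # pulse repetition frequency (Hz), illustrative
n_pulses, n_range_cells = 1024, 8
t = np.arange(n_pulses) / prf

# Synthetic slow-time signal per range cell: a bulk return plus a
# micro-Doppler component (e.g., a rotating part) in a few cells.
rng = np.random.default_rng(0)
data = np.zeros((n_range_cells, n_pulses), dtype=complex)
for r in range(n_range_cells):
    body = np.exp(2j * np.pi * 60.0 * t)                              # bulk Doppler at 60 Hz
    micro = np.exp(2j * np.pi * 120.0 * np.sin(2 * np.pi * 3.0 * t))  # micro-Doppler modulation
    data[r] = body + (0.5 * micro if r in (3, 4) else 0) + 0.1 * rng.standard_normal(n_pulses)

# Joint time-frequency analysis in every range cell, then stack over range.
spectrograms = []
for r in range(n_range_cells):
    f, tau, Sxx = spectrogram(data[r], fs=prf, nperseg=128, noverlap=96,
                              return_onesided=False)
    spectrograms.append(Sxx)

range_micro_doppler = np.stack(spectrograms)    # shape: (range, Doppler, slow time)
print("range micro-Doppler signature shape:", range_micro_doppler.shape)
```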
-
JianFeng WU, HuiBin QIN, YongZhu HUA, LingYan FAN
Article type: LETTER
Subject area: Speech and Hearing
2018 Volume E101.D Issue 2 Pages
556-559
Published: February 01, 2018
Released on J-STAGE: February 01, 2018
JOURNAL
FREE ACCESS
In this paper, a novel method for pitch estimation and voicing classification is proposed that uses a spectrum reconstructed from Mel-frequency cepstral coefficients (MFCC). The proposed algorithm reconstructs the spectrum from MFCC using the Moore-Penrose pseudo-inverse of the Mel-scale weighting functions. The reconstructed spectrum is compressed and filtered in log-frequency. Pitch estimation is achieved by modeling the joint density of the pitch frequency and the filtered spectrum with a Gaussian mixture model (GMM). Voicing classification is also achieved with a GMM-based model, and test results show that over 99% of frames can be correctly classified. The pitch estimation results demonstrate that the proposed GMM-based pitch estimator has high accuracy, with a relative error of 6.68% on the TIMIT database.
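The reconstruction step can be sketched with NumPy/SciPy: build the Mel filterbank matrix M, invert the truncated DCT to recover the log Mel spectrum from the MFCCs, and apply the Moore-Penrose pseudo-inverse of M to approximate the linear-frequency power spectrum. The filterbank size, FFT length, and number of coefficients are illustrative, and the GMM-based pitch model itself is not shown.

```python
import numpy as np
from scipy.fftpack import dct, idct

def mel_filterbank(n_mels=26, n_fft=512, sr=16000):
    """Triangular Mel-scale weighting functions (rows: filters, cols: FFT bins)."""
    mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    inv_mel = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    pts = inv_mel(np.linspace(mel(0), mel(sr / 2), n_mels + 2))
    bins = np.floor((n_fft + 1) * pts / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        fb[i - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fb[i - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    return fb

sr, n_fft, n_mels, n_ceps = 16000, 512, 26, 13
M = mel_filterbank(n_mels, n_fft, sr)

# Forward analysis of one synthetic voiced frame (200 Hz fundamental).
t = np.arange(n_fft) / sr
frame = sum(np.sin(2 * np.pi * 200 * k * t) / k for k in range(1, 6))
power = np.abs(np.fft.rfft(frame * np.hamming(n_fft))) ** 2
mfcc = dct(np.log(M @ power + 1e-10), type=2, norm='ortho')[:n_ceps]

# Reconstruction: inverse DCT of the zero-padded MFCCs, then pseudo-inverse of M.
log_mel_rec = idct(np.pad(mfcc, (0, n_mels - n_ceps)), type=2, norm='ortho')
spec_rec = np.maximum(np.linalg.pinv(M) @ np.exp(log_mel_rec), 0.0)

peak_bin = 1 + np.argmax(spec_rec[1:])
print(f"dominant reconstructed frequency ~ {peak_bin * sr / n_fft:.0f} Hz")  # near the 200 Hz fundamental
```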
-
Jinhua WANG, Weiqiang WANG, Guangmei XU, Hongzhe LIU
Article type: LETTER
Subject area: Image Recognition, Computer Vision
2018 Volume E101.D Issue 2 Pages
560-563
Published: February 01, 2018
Released on J-STAGE: February 01, 2018
JOURNAL
FREE ACCESS
In this paper, we describe the direct learning of an end-to-end mapping between under-/over-exposed images and well-exposed images. The mapping is represented as a deep convolutional neural network (CNN) that takes multiple-exposure images as input and outputs a high-quality image. Our CNN has a lightweight structure, yet gives state-of-the-art fusion quality. Furthermore, for a given pixel, the influence of the surrounding pixels gradually increases as the distance decreases, so if the only pixels considered are those in the convolution kernel neighborhood, the final result will be affected. To overcome this problem, the size of the convolution kernel is often increased; however, this also increases the complexity of the network (too many parameters) and the training time. Instead, we present a method in which a number of sub-images of the source image are obtained using the same CNN model, providing more neighborhood information for the convolution operation. Experimental results demonstrate that the proposed method achieves better performance in terms of both objective evaluation and visual quality.
-
Jingjie YAN, Bojie YAN, Ruiyu LIANG, Guanming LU, Haibo LI, Shipeng XI ...
Article type: LETTER
Subject area: Image Recognition, Computer Vision
2018 Volume E101.D Issue 2 Pages
564-567
Published: February 01, 2018
Released on J-STAGE: February 01, 2018
JOURNAL
FREE ACCESS
In this paper, we present a novel regression-based robust locality preserving projections (RRLPP) method to effectively deal with noise and occlusion in facial expression recognition. Similar to the robust principal component analysis (RPCA) and robust regression (RR) approaches, the basic idea of RRLPP is to introduce the low-rank term and the sparse term of the facial expression image sample matrix so as to simultaneously overcome the shortcomings of the locality preserving projections (LPP) method and enhance the robustness of facial expression recognition. Unlike these approaches, however, RRLPP is a nonlinear robust subspace method that can effectively describe the local structure of facial expression images. Test results on the Multi-PIE facial expression database indicate that RRLPP can effectively eliminate noise and handle occlusion in facial expression images, while achieving recognition rates that are better than or comparable to those of both non-robust and robust subspace methods.
-
SungIk CHO, JungHyun HAN
Article type: LETTER
Subject area: Computer Graphics
2018 Volume E101.D Issue 2 Pages
568-571
Published: February 01, 2018
Released on J-STAGE: February 01, 2018
JOURNAL
FREE ACCESS
Supplementary material
This paper proposes a painterly morphing algorithm for mobile smart devices, where each frame in the morphing sequence looks like an oil-painted picture with brush strokes. It can be presented, for example, during the transition between the main screen and a specific application screen. For this, a novel dissimilarity function and acceleration data structures are developed. The experimental results show that the algorithm produces visually stunning effects at interactive rates.