Special Section on Selected Papers from the 20th Workshop on Circuits and Systems in Karuizawa
-
Masahide ABE
2008 Volume E91.A Issue 4 Pages
925-926
Published: April 01, 2008
Released on J-STAGE: July 01, 2018
JOURNAL
RESTRICTED ACCESS
-
Toru EZAWA, Hiroo SEKIYA, Takashi YAHAGI
Article type: PAPER
2008 Volume E91.A Issue 4 Pages
927-934
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
This paper investigates the design curves of the class DE amplifier with nonlinear shunt capacitances for any output Q and any grading coefficient m of the diode junction in the MOSFET. The design curves are derived by numerical calculation using Spice. The results have two important implications. First, the nonlinearities of the shunt capacitances affect the design curves of the class DE amplifier, especially for low output Q. Moreover, the supply voltage is an important parameter in designing the class DE amplifier with nonlinear shunt capacitances. Second, the numerical design tool using Spice, which was proposed by the authors, can be applied to the derivation of the design curves. This shows the potential of the algorithm as a powerful tool for the analysis of class E switching circuits. The waveforms from Spice simulations confirm the validity of the design curves.
-
Seungwoo CHUN, Yoshihiro HAYAKAWA, Koji NAKAJIMA
Article type: PAPER
2008 Volume E91.A Issue 4 Pages
935-942
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
The visual inspection of defects in products depends heavily on human experience and instinct. In this situation, it is difficult to reduce production costs and to shorten the inspection time and hence the total process time. Consequently, people involved in this area desire an automatic inspection system. In this paper, we propose a hardware neural network that is expected to provide high-speed operation for automatic inspection of products. Since neural networks can learn, they are suitable for self-adjustment of classification criteria. To achieve high-speed operation, we use parallel and pipelining techniques. Furthermore, we use a piecewise linear function instead of a conventional activation function in order to save hardware resources. As a result, our proposed hardware neural network achieved 6 GCPS and 2 GCUPS, which proved to be sufficiently fast for our test sample.
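To illustrate why a piecewise linear function saves hardware, the following is a minimal sketch of a three-segment stand-in for the logistic sigmoid. The breakpoints at ±4 and the slope of 1/8 are assumptions chosen for illustration (a power-of-two slope can be realized with a shift); the abstract does not state which segments the authors use.

```python
import math


def sigmoid(x: float) -> float:
    """Reference logistic activation, which needs an exponential."""
    return 1.0 / (1.0 + math.exp(-x))


def pwl_sigmoid(x: float) -> float:
    """Three-segment piecewise linear approximation (hypothetical breakpoints).

    Saturates at 0 and 1 outside [-4, 4]; in between it follows a single
    line of slope 1/8 through (0, 0.5), so only a shift and an add are needed.
    """
    if x <= -4.0:
        return 0.0
    if x >= 4.0:
        return 1.0
    return 0.5 + x / 8.0


if __name__ == "__main__":
    # Largest deviation from the true sigmoid on a grid over [-8, 8].
    worst = max(abs(pwl_sigmoid(i / 100) - sigmoid(i / 100)) for i in range(-800, 801))
    print(f"max abs error: {worst:.3f}")
```

More segments would reduce the approximation error at the cost of extra comparators, which is the trade-off a hardware design of this kind has to weigh.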
-
Johan SVEHOLM, Yoshihiro HAYAKAWA, Koji NAKAJIMA
Article type: PAPER
2008 Volume E91.A Issue 4 Pages
943-950
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
A further development of a network based on the Inverse Function Delayed (ID) model, which can recall temporal sequences of patterns, is proposed. Additional advantage is taken of the negative resistance region of the ID model and its hysteretic properties by widening the negative resistance region and letting the output of the ID neuron be almost instantaneous. We call this neuron the limit ID neuron. A model with limit ID neurons connected pairwise with conventional neurons enlarges the storage capacity, and the capacity increases even further with a weight matrix that is calculated to guarantee storage after transforming the sequence of patterns into a linear separation problem. The network's tolerance, that is, the model's ability to recall a sequence when starting from a pattern with initial distortion, is also investigated. By choosing a suitable value for the output delay of the conventional neuron, the distortion is gradually reduced and finally vanishes.
-
Shiho HAGIWARA, Takumi UEZONO, Takashi SATO, Kazuya MASU
Article type: PAPER
2008 Volume E91.A Issue 4 Pages
951-956
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
Stochastic approaches for effective power distribution network optimization are proposed. Considering node voltages obtained using dynamic voltage drop analysis as sample variables, multivariate regression is conducted to optimize clock timing metrics, such as clock skew or jitter. An aggregate correlation coefficient (ACC), which quantifies connectivity between different chip regions, is defined in order to find possible insufficiencies in the wire connections of a power distribution network. Based on the ACC, we also propose a procedure using linear regression to find the most effective region for improving clock timing metrics. Using the proposed procedure, effective fixing points were obtained two orders of magnitude faster than with brute-force circuit simulation.
-
Masanori IMAI, Takashi SATO, Noriaki NAKAYAMA, Kazuya MASU
Article type: PAPER
2008 Volume E91.A Issue 4 Pages
957-964
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
We present an evaluation method for estimating the lower bound on the number of Monte Carlo STA trials required to obtain at least one sample that falls within the top-k% of its parent population. The sample can be used to ensure, with a predefined probability and at minimum computational cost, that target designs are free of timing errors. The lower bound is expressed as a closed-form formula that is general enough to be applied to other verifications. For validation, Monte Carlo STA was carried out on various benchmark data including ISCAS circuits. The minimum number of Monte Carlo runs determined using the proposed method successfully extracted one or more top-k% delay instances.
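The abstract does not reproduce the closed-form formula. As a hedged illustration, the sketch below uses the standard argument for independent samples: the probability that none of N trials lands in the top fraction q is (1 - q)^N, so requiring 1 - (1 - q)^N ≥ p gives N ≥ ln(1 - p) / ln(1 - q). The function name and parameters are hypothetical, and the paper's formula may differ in form.

```python
import math


def min_trials(top_fraction: float, confidence: float) -> int:
    """Smallest N with P(at least one of N i.i.d. samples is in the top fraction) >= confidence.

    top_fraction: k/100 for "top-k%" (e.g. 0.01 for the top 1%).
    confidence:   required probability (e.g. 0.99).
    Assumes independent trials; this is a textbook bound, not necessarily
    the formula derived in the paper.
    """
    if not (0.0 < top_fraction < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("both arguments must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - top_fraction))


if __name__ == "__main__":
    n = min_trials(0.01, 0.99)
    print(n, 1.0 - 0.99 ** n)
```

Under these assumptions the required number of trials grows only logarithmically as the confidence approaches 1, but roughly in inverse proportion to the size of the top fraction.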
-
Hao SAN, Hajime KONAGAYA, Feng XU, Atsushi MOTOZAWA, Haruo KOBAYASHI, ...
Article type: PAPER
2008 Volume E91.A Issue 4 Pages
965-970
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
This paper proposes a novel feedforward architecture for the second-order multibit ΔΣAD modulator with a single DAC-feedback topology. The ΔΣAD modulator realizes high resolution by oversampling and noise shaping techniques. However, its SNDR (Signal to Noise and Distortion Ratio) is limited by the dynamic range of the input signal and by non-idealities of circuit building blocks, particularly the harmonic distortion in amplifier circuits. A full feedforward ΔΣAD modulator structure has a signal transfer function of unity under ideal circumstances, which means that the signal swings through the loop filter become smaller than in a feedback ΔΣAD modulator. Therefore, the harmonic distortion generated inside the loop filter can be significantly reduced in the feedforward structure, because the effect of non-idealities in amplifiers is suppressed when the signal swing is small. Moreover, the reduction of the internal signal swings also relaxes the output swing requirements for amplifiers with a low supply voltage. However, in a conventional feedforward ΔΣAD modulator, an analog adder is needed before the quantizer, and especially in a multibit modulator, an additional amplifier is necessary to realize the summation of feedforward signals, which leads to extra chip area and power dissipation. In this paper, we propose a novel architecture of a feedforward ΔΣAD modulator that realizes the summation of feedforward signals without an additional amplifier. The proposed architecture is functionally equivalent to the conventional one but has smaller chip area and lower power dissipation. We conducted MATLAB and SPICE simulations to validate the proposed architecture and modulator circuits.
-
Yibo FAN, Takeshi IKENAGA, Satoshi GOTO
Article type: PAPER
2008 Volume E91.A Issue 4 Pages
971-977
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
With the increase of key length used in public-key cryptographic algorithms such as RSA and ECC, the speed of Montgomery multiplication becomes a bottleneck. This paper proposes a high-speed design of a Montgomery multiplier. First, a modified scalable high-radix Montgomery algorithm is proposed to reduce the critical path. Second, a high-radix clock-saving dataflow is proposed to support high-radix operation and one clock cycle delay in the dataflow. Finally, a hardware-reused architecture is proposed to reduce the hardware cost, and a parallel radix-16 data path design is proposed to increase speed. Using the HHNEC 0.25μm standard cell library, the implementation results show that the total cost of the Montgomery multiplier is 130 KGates, the clock frequency is 180MHz and the throughput of 1024-bit RSA encryption is 352kbps. This design is suitable for high-speed RSA or ECC encryption/decryption. As a scalable design, it supports encryption/decryption with any key length up to the size of the on-chip memory.
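For readers unfamiliar with the operation being accelerated, here is a minimal software sketch of textbook Montgomery multiplication (REDC) and a modular exponentiation built on it. It is not the modified scalable high-radix algorithm of the paper; the function names and the choice R = 2^(bit length of the modulus) are assumptions made for illustration.

```python
def montgomery_setup(modulus: int):
    """Return (r, n_prime) for an odd modulus, with r a power of two > modulus
    and n_prime = -modulus^{-1} mod r."""
    if modulus < 3 or modulus % 2 == 0:
        raise ValueError("modulus must be odd and at least 3")
    r = 1 << modulus.bit_length()
    n_prime = (-pow(modulus, -1, r)) % r  # modular inverse needs Python 3.8+
    return r, n_prime


def mont_mul(a_bar: int, b_bar: int, modulus: int, r: int, n_prime: int) -> int:
    """Compute a_bar * b_bar * r^{-1} mod modulus without dividing by modulus."""
    t = a_bar * b_bar
    m = (t * n_prime) % r          # reduction modulo a power of two is a bit mask
    u = (t + m * modulus) // r     # exact division: the low bits cancel
    return u - modulus if u >= modulus else u


def mont_pow(base: int, exponent: int, modulus: int) -> int:
    """Left-to-right square-and-multiply carried out in Montgomery form."""
    r, n_prime = montgomery_setup(modulus)
    x_bar = (base * r) % modulus   # convert the base into Montgomery form
    acc = r % modulus              # Montgomery form of 1
    for bit in bin(exponent)[2:]:
        acc = mont_mul(acc, acc, modulus, r, n_prime)
        if bit == "1":
            acc = mont_mul(acc, x_bar, modulus, r, n_prime)
    return mont_mul(acc, 1, modulus, r, n_prime)  # convert back


if __name__ == "__main__":
    print(mont_pow(65, 17, 3233) == pow(65, 17, 3233))
```

The point of the technique is that the only reductions are modulo a power of two, which is why it maps well to hardware; a high-radix design processes several bits of the operand per step instead of the whole product at once as done here.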
-
Yoshihisa TAKAHASHI, Kentaro HIRAKI, Hisakazu KIKUCHI, Shogo MURAMATSU
Article type: PAPER
2008 Volume E91.A Issue 4 Pages
978-986
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
This paper presents a color demosaicing method applied to the Bayer pattern color filter array (CFA). Reliable estimation of the edge direction, edge-directed asymmetric interpolation, and the use of color samples at immediate neighbors are considered the key guidelines for smooth and sharp image restoration. Special attention is also given to local areas that are rich in high spatial frequency variations. To suppress false colors likely to occur in those areas, a hue vector representation is introduced so that the spatial correlation between different color components can be exploited consistently with the local constant-hue principle. Smoothing is repeated in the hue vector field a few times. Experimental results show favorable performance in terms of PSNR, CIELAB color difference, hue angle difference, CIE chromaticity and visual appearance, in particular resulting in fewer false colors.
-
Yiqing HUANG, Zhenyu LIU, Yang SONG, Satoshi GOTO, Takeshi IKENAGA
Article type: PAPER
2008 Volume E91.A Issue 4 Pages
987-997
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
A hardware-efficient, high-speed architecture for variable block size motion estimation (VBSME) in H.264 is presented in this paper. By improving the pipeline structure and processing element (PE) circuits, the system latency and hardware cost are reduced, which makes this structure more hardware efficient than the original Propagate Partial SAD architecture. For coding of small and medium frame size pictures, the proposed structure saves 12.1% of the hardware cost compared with the original Propagate Partial SAD structure. In the case of HDTV, since small inter modes contribute only trivially to the coding quality, we remove modes below 8×8 in our design. With this mode reduction technique, when the number of PE array sets is less than 8, the proposed mode-reduction-based Propagate Partial SAD structure can work at a faster clock speed and with lower hardware cost than the widely used SAD Tree architecture. It is also more robust to high-speed timing constraints when parallel processing is considered. With TSMC 0.18μm technology under worst-case operating conditions (1.62V, 125°C), the peak throughput of the 8-set PE array structure is 720p@30Hz with a 128×64 search range and 5 reference frames. Our design reduces the hardware cost by 12k gates compared with the parallel SAD Tree architecture.
-
Yoshihiro KITAURA, Mitsuji MUNEYASU, Katsuaki NAKANISHI
Article type: PAPER
2008 Volume E91.A Issue 4 Pages
998-1005
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
The JPEG2000 still image coding standard has a feature called Region of Interest (ROI) coding. This feature can encode a restricted region in an image prior to its background (BG) region. In low bit rate compression, the code of the ROI region occupies most of the bit stream of the whole image, which causes serious deterioration of the image quality in the BG region. This paper proposes a new method for controlling image quality between the ROI region and the BG region in a single encoding pass, which achieves more detailed image quality control. The use of ROI masks in the encoder makes this possible. The standard JPEG2000 Part 1 decoder can decode the data encoded by the proposed method.
-
Sakol UDOMSIRI, Masahiro IWAHASHI, Shogo MURAMATSU
Article type: PAPER
2008 Volume E91.A Issue 4 Pages
1006-1014
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
This paper proposes a new type of layered video coding intended for monitoring the water level of a river. A sensor node of the system decomposes an input video signal into several kinds of component signals and produces a bit stream functionally separated into three layers. The first layer contains the minimum components effective for detecting the water level. It is transmitted at a very low bit rate for regular monitoring. The second layer contains signals for thumbnail video browsing. The third layer contains additional data for decoding the original video signal. These two layers are transmitted when necessary. A video signal is decomposed into several bands with the three-dimensional Haar transform. In this paper, the optimum bands to be contained in the first layer are experimentally investigated considering both water level detection and the data size to be transmitted. As a result, the bit rate for transmitting the first layer is reduced by 32.5% at the cost of a negligible 3.7% decrease in recognition performance for one of the video examples.
-
Zhenxing CHEN, Yang SONG, Takeshi IKENAGA, Satoshi GOTO
Article type: PAPER
2008 Volume E91.A Issue 4 Pages
1015-1022
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
Compared with search pattern motion estimation (ME) algorithms, adaptive search range (ASR) algorithms are more fundamental, regular and flexible. In variable block size motion estimation (VBSME), ASR algorithms can be applied either on a whole frame (frame level), on an entire macroblock that includes up to forty-one blocks (macroblock level), or on just a single block (block level). On the other hand, in H.264/AVC, it is not the motion vectors (MVs) but the motion vector differences (MVDs) that are coded, and the median motion vector predictors (median-MVPs) are used to place the search centers. In this sense, the search windows (SWs) can be thought of as centered at the positions pointed to by the median-MVPs, while the search ranges (SRs) play the role of limiting the MVDs. Thus it is reasonable to consider using MVDs to predict SRs. In this paper, three MVD-based SR prediction algorithms are proposed: one at the MB level and two at the block level. VBSME-based experiments are carried out to assess the proposed algorithms. The three proposed algorithms are compared with the previously proposed one given in [8] in terms of encoding quality and computational complexity.
-
Koichi ITO, Takafumi AOKI, Hiroshi NAKAJIMA, Koji KOBAYASHI, Tatsuo HI ...
Article type: PAPER
2008 Volume E91.A Issue 4 Pages
1023-1030
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
This paper presents a palmprint recognition algorithm using Phase-Only Correlation (POC). The use of phase components in 2D (two-dimensional) discrete Fourier transforms of palmprint images makes it possible to achieve highly robust image registration and matching. In the proposed algorithm, POC is used to align scaling, rotation and translation between two palmprint images, and evaluate similarity between them. Experimental evaluation using a palmprint image database clearly demonstrates efficient matching performance of the proposed algorithm.
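As a rough illustration of the idea behind Phase-Only Correlation, the sketch below implements a one-dimensional analogue in pure Python: the cross spectrum is normalized to unit magnitude so that only phase remains, and its inverse transform peaks at the displacement between the two signals. The paper works with 2D transforms of images and also handles rotation and scaling; the naive DFT, the function names and the sign convention here are assumptions made for brevity.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (adequate for a short demo)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[k] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for k in range(n))
        for m in range(n)
    ]
    return [v / n for v in out] if inverse else out


def phase_only_correlation(f, g):
    """1D POC of two equal-length signals.

    Uses conj(F) * G normalized to unit magnitude, so that if g is f
    circularly delayed by d samples, the result peaks at index d.
    """
    cross = []
    for a, b in zip(dft(f), dft(g)):
        c = a.conjugate() * b
        mag = abs(c)
        cross.append(c / mag if mag > 1e-12 else 0j)  # discard magnitude, keep phase
    return [v.real for v in dft(cross, inverse=True)]


if __name__ == "__main__":
    f = [1, 3, -2, 5, 0, 4, -1, 2]
    g = f[-3:] + f[:-3]            # f circularly delayed by 3 samples
    poc = phase_only_correlation(f, g)
    print(poc.index(max(poc)))
```

Because every frequency contributes equally after normalization, the correlation peak is sharp, which is the property that makes the approach attractive for registration and matching.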
-
Shinnosuke HIRATA, Minoru Kuribayashi KUROSAWA, Takashi KATAGIRI
Article type: PAPER
2008 Volume E91.A Issue 4 Pages
1031-1037
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
Ultrasonic distance measurement using the pulse-echo method is based on the determination of the time of flight of ultrasonic waves. The pulse-compression technique, in which the cross-correlation function of a detected ultrasonic wave and a transmitted ultrasonic wave is obtained, is the conventional method used for improving the resolution of distance measurement. However, the calculation of a cross-correlation operation requires high-cost digital signal processing. This paper presents a new method of sensor signal processing within the pulse-compression technique using a delta-sigma modulated single-bit digital signal. The proposed sensor signal processing method consists of a cross-correlation operation employing single-bit signal processing and a smoothing operation involving a moving average filter. The proposed method reduces the calculation cost of the digital signal processing of the pulse-compression technique.
-
Kyeong-Yuk MIN, Jong-Wha CHONG
Article type: PAPER
2008 Volume E91.A Issue 4 Pages
1038-1043
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
In this paper, we propose a memory- and performance-optimized architecture to accelerate the adaptive deblocking filter for H.264/JVT/AVC video coding. The proposed deblocking filter executes loading/storing and filtering operations in only 192 cycles per macroblock. Only 2×4×4 internal buffers and a 32×16 internal SRAM are used for the buffering operation of the deblocking filter, with an I/O bandwidth of 32 bits. The proposed architecture can filter one macroblock with fewer filtering cycles and smaller memory than some conventional deblocking filter implementations. The efficient hardware architecture is implemented with a novel data arrangement, hybrid filter scheduling and a minimum number of buffers. The proposed architecture is suitable for low-cost and real-time applications, and real-time decoding of 1080HD (1920×1088@30fps) can be easily achieved at a working frequency of 70MHz.
-
Keisuke INOUE, Mineo KANEKO, Tsuyoshi IWAGAKI
Article type: PAPER
2008 Volume E91.A Issue 4 Pages
1044-1053
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
As the feature size of VLSI becomes smaller, delay variations become a serious problem. In this paper, we propose a novel class of robustness for a datapath against delay variations, named structural robustness against delay variation (SRV), and propose sufficient conditions for a datapath to have SRV. A circuit designed under these conditions has a larger timing margin against delay variations than previous designs, without sacrificing effective computation time. In addition, under any degree of delay variation, we can always find a clock frequency at which a datapath having the SRV property operates correctly, which could be a desirable characteristic in IP-based design.
-
Kazunori SHIMIZU, Nozomu TOGAWA, Takeshi IKENAGA, Satoshi GOTO
Article type: PAPER
2008 Volume E91.A Issue 4 Pages
1054-1061
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
Reducing the power dissipation of LDPC code decoders is a major challenge in applying them to practical digital communication systems. In this paper, we propose a low-power LDPC code decoder architecture based on an intermediate message compression technique, with the following features: (i) An intermediate message compression technique enables the decoder to reduce the required memory capacity and write power dissipation. (ii) A clock-gated shift register based intermediate message memory architecture enables the decoder to decompress the compressed messages in a single clock cycle while reducing the read power dissipation. The combination of these two techniques enables the decoder to reduce the power dissipation while maintaining the decoding throughput. The simulation results show that the proposed architecture improves the power efficiency by up to 52% and 18% compared to decoders based on the overlapped schedule and the rapid convergence schedule, respectively, without the proposed techniques.
-
Shin-ichi OHKAWA, Hiroo MASUDA, Yasuaki INOUE
Article type: PAPER
2008 Volume E91.A Issue 4 Pages
1062-1070
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
We have proposed a random curved surface model as a new mathematical concept which enables the expression of spatial correlation. The model gives us an appropriate methodology to deal with the systematic components of device variation in an LSI chip. The key idea of the model is the fitting of a polynomial to an array of Gaussian random numbers. The curved surface is expressed by a new extension from the Legendre polynomials to form two-dimensional formulas. The formulas were proven to be suitable to express the spatial correlation with reasonable computational complexity. In this paper, we show that this approach is useful in analyzing characteristics of device variation of actual chips by using experimental data.
-
Toshihiko TAKAHASHI, Ryo FUJIMAKI
Article type: PAPER
2008 Volume E91.A Issue 4 Pages
1071-1076
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
A floorplan is a subdivision of a rectangle into rectangular faces with horizontal and vertical line segments. We call a floorplan room-to-room when adjacencies between rooms are considered. Fujimaki and Takahashi showed that any room-to-room floorplan can be represented as a permutation. In this paper, we give an O(n)-time algorithm that constructs the vertical and the horizontal constraint graphs of a floorplan for a given permutation under this representation.
-
Atsushi KUROKAWA, Hiroshi FUJITA, Tetsuya IBE
Article type: PAPER
2008 Volume E91.A Issue 4 Pages
1077-1083
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
Developing LSIs with EMI suppression, particularly for use in automobiles, is important for improving warranties and customer acquisition. First, we explain why measures against EMI noise caused by an X'tal (crystal) oscillator are important. Next, we present a practical method for analyzing the noise with models of the inside and outside of a chip. In addition, we propose a within-chip measure against EMI noise that takes chip cost into account. The noise is suppressed by using an appropriate resistance and capacitance on the power line. Simulation results demonstrate the method's effectiveness in suppressing noise.
-
Tsuyoshi SADAKATA, Yusuke MATSUNAGA
Article type: PAPER
2008 Volume E91.A Issue 4 Pages
1084-1091
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
This paper proposes a novel behavioral synthesis method that tries to reduce the number of clock cycles under clock cycle time and total functional unit area constraints by using special functional units efficiently. Special functional units are designed to have shorter delay and/or smaller area than cascaded basic functional units for specific operation patterns. A multiply-accumulator is one example. However, special functional units may offer less flexibility for resource sharing because intermediate operation results may not be obtainable. Hence, almost all conventional methods cannot handle special functional units efficiently to reduce clock cycles in practical time, especially under a tight area constraint. The proposed method makes it possible to solve the module selection, scheduling, and functional unit allocation problems using special functional units in practical time with some heuristics. Experimental results show that the proposed method achieves up to a 33% reduction in cycles for a small application and a 14% reduction for a realistic application in practical time.
-
Chengjie ZANG, Shigeki IMAI, Steven FRANK, Shinji KIMURA
Article type: PAPER
2008 Volume E91.A Issue 4 Pages
1092-1100
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
Simultaneous Multithreading (SMT) technology enhances instruction throughput by issuing multiple instructions from multiple threads within one clock cycle. With an in-order pipeline for each thread, SMT processors can issue a number of instructions close to or exceeding that achieved with an out-of-order pipeline. In this work, we show an efficient issue logic for predicated instruction sequences with a parallel flag in each instruction, where predicate register based issue control is adopted and consecutive instructions with a parallel flag of ‘0’ are executed in parallel. The flag is pre-defined by a compiler. Instructions from different threads are issued in round-robin order. We also introduce an Instruction Queue skip mechanism that skips a thread if its queue is empty. Using this kind of issue logic, we designed a 6-thread, 7-stage, in-order pipeline processor. Based on this processor, we compare the round-robin issue policy (RR(T1-Tn)) with other policies: thread one always has the highest priority (PR(T1)), and thread one or thread n has the highest priority in turn (PR(T1-Tn)). The results show that the RR(T1-Tn) policy outperforms the others and that PR(T1-Tn) is almost the same as RR(T1-Tn) in terms of issued instructions per cycle.
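The round-robin policy with queue skipping can be sketched in a few lines. This is a deliberately simplified model that issues at most one instruction per cycle and ignores predicates and the parallel flag, so it is not the issue logic of the paper; the function name and the representation of queues as deques are assumptions.

```python
from collections import deque


def round_robin_issue(queues, cycles):
    """Issue at most one instruction per cycle, rotating over thread queues.

    queues: list of deques, one instruction queue per thread.
    A thread whose queue is empty is skipped within the same cycle, so the
    issue slot goes to the next thread that has work.
    Returns the list of (thread_index, instruction) pairs in issue order.
    """
    issued = []
    next_thread = 0
    n = len(queues)
    for _ in range(cycles):
        for offset in range(n):
            t = (next_thread + offset) % n
            if queues[t]:                      # skip empty instruction queues
                issued.append((t, queues[t].popleft()))
                next_thread = (t + 1) % n      # the following thread goes first next cycle
                break
    return issued


if __name__ == "__main__":
    qs = [deque(["a0", "a1"]), deque(), deque(["c0"])]
    print(round_robin_issue(qs, 4))
```

Without the skip, the cycle in which the empty thread holds the turn would be wasted, which is the inefficiency the skip mechanism is meant to avoid.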
-
Yun YANG, Shinji KIMURA
Article type: PAPER
2008 Volume E91.A Issue 4 Pages
1101-1111
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
This paper proposes an efficient systolic array construction method for the optimal planar systolic design of matrix multiplication. By adjusting the connection network among the systolic array processing elements (PEs), the input/output data jump within the systolic array as required by the multiplication operations. Various 2-D systolic array topologies, such as the square topology and the hexagonal topology, have been studied to construct an appropriate systolic array configuration and realize high-performance matrix multiplication. Based on the traditional Kung-Leiserson systolic architecture, the proposed “Jumping Systolic Array (JSA)” algorithm can increase the matrix multiplication speed with fewer processing elements and few additional data registers. New systolic arrays, such as the square jumping array, the redundant dummy latency jumping hexagonal array, and the compact parallel flow jumping hexagonal array, are also proposed to improve the efficiency of concurrent system operation. Experimental results show that the JSA algorithm can realize fully concurrent operation and outperforms other systolic architectures in specific systolic array system characteristics, such as bandwidth, matrix complexity, or expansion capability.
-
Yoshinobu KAWABE, Hideki SAKURADA
Article type: PAPER
2008 Volume E91.A Issue 4 Pages
1112-1120
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
The use of a formal method is a promising approach to developing reliable computer programs. This paper presents a formal method for anonymity, which is an important security property of communication protocols with regard to a user's identity. When verifying the anonymity of security protocols, we need to consider the presence of adversaries. To formalize stronger adversaries, we introduce an adversary model for simulation-based anonymity proof. This paper also demonstrates the formal verification of a communication protocol. We employ Crowds, which is an implementation of an anonymous router, and verify its anonymity. After describing Crowds in a formal specification language, we prove its anonymity with a theorem prover.
-
Yiyuan GONG, Senlin GUAN, Morikazu NAKAMURA
Article type: PAPER
2008 Volume E91.A Issue 4 Pages
1121-1128
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
This paper investigates the migration effects of parallel genetic algorithms (GAs) on the line topology of heterogeneous computing resources. The evolution process of parallel GAs is evaluated experimentally on two types of arrangements of heterogeneous computing resources: the ascending and descending order arrangements. Migration effects are evaluated from the viewpoints of scalability, chromosome diversity, migration frequency and solution quality. The results reveal that the performance of parallel GAs strongly depends on the design of the chromosome migration, in which we need to consider factors such as the arrangement of heterogeneous computing resources and the migration frequency. The results provide a reference scheme for implementing parallel GAs on heterogeneous computing resources.
-
Daisuke TAKAFUJI, Satoshi TAOKA, Yasunori NISHIKAWA, Toshimasa WATANAB ...
Article type: PAPER
2008 Volume E91.A Issue 4 Pages
1129-1139
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
The subject of this paper is maximum weight matchings of graphs. An edge set M of a given graph G is called a matching if and only if no pair of edges in M shares an endvertex. A maximum weight matching is a matching whose total weight (the sum of its edge weights) is maximum among those of G. The maximum weight matching problem (MWM for short) is to find a maximum weight matching of a given graph. Polynomial algorithms for finding an optimum solution to MWM have already been proposed: for example, an O(|V|^4) time algorithm proposed by J. Edmonds, and an O(|E||V| log|V|) time algorithm proposed by H. N. Gabow. Some applications require obtaining a matching of large total weight (not necessarily a maximum one) in realistic computing time. These existing algorithms, however, take extremely long computing time as the size of a given graph becomes large, and several fast approximation algorithms for MWM have therefore been proposed. In this paper, we propose six approximation algorithms: GRS+, GRS_F+, GRS_R+, GRS_S+, LAM_a+ and LAM_as+. They are enhanced versions of known approximation algorithms, obtained by adding post-processing that consists of an improved search for weight-augmenting paths. Their performance is evaluated through computational experiments.
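To make the notion of a fast approximation concrete, the sketch below shows the classic greedy heuristic: scan edges in order of decreasing weight and keep each edge whose endpoints are still free. This is a well-known 1/2-approximation for MWM; it is offered only as an example of the kind of baseline that post-processing could improve, and the abstract does not say whether any of the named algorithms (GRS, LAM) work this way. The function name and edge representation are assumptions.

```python
def greedy_matching(edges):
    """Greedy approximate maximum weight matching.

    edges: iterable of (u, v, weight) tuples for an undirected graph.
    Returns (matching, total_weight), where matching is a list of the
    chosen edges. The total weight is at least half of the optimum.
    """
    matched = set()
    matching = []
    for u, v, w in sorted(edges, key=lambda e: e[2], reverse=True):
        if u != v and u not in matched and v not in matched:
            matching.append((u, v, w))
            matched.update((u, v))
    return matching, sum(w for _, _, w in matching)


if __name__ == "__main__":
    # Path a-b-c-d: greedy takes the heavy middle edge (weight 4),
    # although the two outer edges together weigh 6.
    path = [("a", "b", 3), ("b", "c", 4), ("c", "d", 3)]
    print(greedy_matching(path))
```

In the path example, replacing the middle edge with the two outer edges increases the total weight, which is exactly the kind of improvement a search for weight-augmenting paths looks for.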
-
Satoshi TAOKA, Daisuke TAKAFUJI, Toshimasa WATANABE
Article type: PAPER
2008 Volume E91.A Issue 4 Pages
1140-1149
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
A branch-and-bound algorithm (BB for short) is the most general technique for dealing with various combinatorial optimization problems. Even when it is used, however, computation time is likely to increase exponentially, so we consider parallelizing it to reduce computation time. It has been reported that the computation time of a parallel BB depends heavily upon node-variable selection strategies. In a parallel BB, it is also necessary to prevent an increase in communication time, so it is important to pay attention to how many and what kind of nodes are to be transferred (called the sending-node selection strategy). In this paper, for the graph coloring problem, we propose some sending-node selection strategies for a parallel BB algorithm that adopts MPI for parallelization, and we experimentally evaluate how these strategies affect the computation time of a parallel BB on a PC cluster network.
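For orientation, here is a minimal sequential branch-and-bound sketch for graph coloring. It contains no parallelism, MPI communication or sending-node selection, which are the actual subject of the paper; the vertex order, the bounding rule and the function name are assumptions chosen for simplicity.

```python
def chromatic_number(num_vertices, edges):
    """Minimum number of colors for a proper vertex coloring, by branch and bound.

    Vertices are 0..num_vertices-1; edges is an iterable of (u, v) pairs.
    Branching: vertex v takes any color already in use or one new color.
    Bounding: a partial coloring that already uses as many colors as the
    best complete coloring found so far is abandoned.
    """
    adj = [set() for _ in range(num_vertices)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    best = num_vertices           # one color per vertex is always feasible
    colors = [-1] * num_vertices  # -1 marks an uncolored vertex

    def branch(v, used):
        nonlocal best
        if used >= best:          # bound: cannot improve on the incumbent
            return
        if v == num_vertices:     # complete coloring with fewer colors
            best = used
            return
        forbidden = {colors[u] for u in adj[v] if colors[u] >= 0}
        for c in range(used + 1):  # existing colors, then one new color
            if c not in forbidden:
                colors[v] = c
                branch(v + 1, max(used, c + 1))
                colors[v] = -1

    branch(0, 0)
    return best


if __name__ == "__main__":
    five_cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
    print(chromatic_number(5, five_cycle))
```

In a parallel version, the subtrees rooted at different partial colorings are the search nodes that would be distributed among processes, and the incumbent value would have to be shared to keep the bound effective.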
-
Yusuke SAKAGUCHI, Yuhei NAGAO, Masayuki KUROSAKI, Hiroshi OCHI
Article type: LETTER
2008 Volume E91.A Issue 4 Pages
1150-1154
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
This paper discusses the effect of channel fluctuation on channel estimation in digital terrestrial television broadcasting. The channel estimation uses a two-dimensional (2D) filter. In our previous work, only the structure of a lattice was considered in generating a nonrectangular 2D filter. We investigate the generation of a nonrectangular 2D filter with an adaptive method, because not only the lattice but also the channel conditions should be taken into account. Computer simulations show that the bit error rate of the proposed filter is improved compared to that of the filter depending only on lattices.
-
Satoshi OHTA, Yoshinobu KAJIKAWA, Yasuo NOMURA
Article type: PAPER
Subject area: Digital Signal Processing
2008 Volume E91.A Issue 4 Pages
1155-1161
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
In the acoustic echo canceller (AEC), the step-size parameter of the adaptive filter must be varied according to the situation if double talk occurs and/or the echo path changes. We propose an AEC that uses a sub-adaptive filter. The proposed AEC can control the step-size parameter according to the situation. Moreover, it offers superior convergence compared to the conventional AEC even when the double talk and the echo path change occur simultaneously. Simulations demonstrate that the proposed AEC can achieve higher ERLE and faster convergence than the conventional AEC. The computational complexity of the proposed AEC can be reduced by reducing the number of taps of the sub-adaptive filter.
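To make the role of the step-size parameter concrete, here is a minimal sketch of a conventional normalized LMS (NLMS) echo canceller with a fixed step size. It is not the proposed sub-adaptive-filter scheme, whose control rule is not given in the abstract. The function names, the three-tap echo path, and the white-noise far-end signal are assumptions for illustration.

```python
import math
import random


def nlms_echo_canceller(far_end, mic, num_taps, step_size, eps=1e-8):
    """Adapt an FIR estimate of the echo path; return (weights, errors)."""
    w = [0.0] * num_taps    # current echo-path estimate
    buf = [0.0] * num_taps  # recent far-end samples, newest first
    errors = []
    for x, d in zip(far_end, mic):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))  # echo replica
        e = d - y                                   # residual echo
        norm = sum(b * b for b in buf) + eps        # input power normalization
        gain = step_size * e / norm
        w = [wi + gain * bi for wi, bi in zip(w, buf)]
        errors.append(e)
    return w, errors


def erle_db(mic, errors):
    """Echo return loss enhancement over the whole record, in dB."""
    return 10.0 * math.log10(sum(d * d for d in mic) / sum(e * e for e in errors))


# Usage with a hypothetical echo path and no near-end speech (no double talk).
random.seed(0)
echo_path = [0.5, -0.3, 0.1]
far = [random.gauss(0.0, 1.0) for _ in range(2000)]
mic = [sum(echo_path[k] * far[n - k] for k in range(3) if n - k >= 0)
       for n in range(len(far))]
weights, residual = nlms_echo_canceller(far, mic, num_taps=3, step_size=0.5)
```

With a fixed `step_size`, a large value adapts quickly after an echo path change but is disturbed by double talk, while a small value has the opposite behaviour. That trade-off is the reason a step-size control such as the one described in the abstract is needed.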
-
Sunao MURASHIGE
Article type: PAPER
Subject area: Nonlinear Problems
2008 Volume E91.A Issue 4 Pages
1162-1168
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
This paper considers numerical methods for stability analyses of periodic solutions of ordinary differential equations. Stability of a periodic solution can be determined by the corresponding monodromy matrix and its eigenvalues. Some commonly used numerical methods can compute them inaccurately in some cases, for example, near bifurcation points or when one of the eigenvalues is very large or very small. This work proposes a numerical method using a periodic boundary condition for vector fields, which preserves a critical property of the monodromy matrix. Numerical examples demonstrate the effectiveness and a drawback of this method.
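For readers unfamiliar with the monodromy matrix, the sketch below computes it in the commonly used way, by integrating the variational equation dM/dt = J(x(t)) M along the periodic orbit for one period. This is not the method proposed in the paper. The example system (a planar system whose periodic solution is the unit circle with period 2π), the RK4 integrator, and all function names are assumptions chosen for illustration.

```python
import cmath
import math


def vector_field(x, y):
    r2 = x * x + y * y
    return x - y - x * r2, x + y - y * r2


def jacobian(x, y):
    # Row-major entries of the 2x2 Jacobian of vector_field.
    return (1 - 3 * x * x - y * y, -1 - 2 * x * y,
            1 - 2 * x * y, 1 - x * x - 3 * y * y)


def rhs(s):
    x, y, m11, m12, m21, m22 = s
    fx, fy = vector_field(x, y)
    a, b, c, d = jacobian(x, y)
    # State equation together with the variational equation dM/dt = J M.
    return (fx, fy,
            a * m11 + b * m21, a * m12 + b * m22,
            c * m11 + d * m21, c * m12 + d * m22)


def rk4_step(s, h):
    k1 = rhs(s)
    k2 = rhs(tuple(si + 0.5 * h * ki for si, ki in zip(s, k1)))
    k3 = rhs(tuple(si + 0.5 * h * ki for si, ki in zip(s, k2)))
    k4 = rhs(tuple(si + h * ki for si, ki in zip(s, k3)))
    return tuple(si + h / 6.0 * (a + 2 * b + 2 * c + d)
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))


def monodromy_multipliers(x0, y0, period, steps=2000):
    """Return both eigenvalues of the monodromy matrix, larger root first."""
    s = (x0, y0, 1.0, 0.0, 0.0, 1.0)  # M(0) is the identity
    h = period / steps
    for _ in range(steps):
        s = rk4_step(s, h)
    _, _, m11, m12, m21, m22 = s
    tr = m11 + m22
    det = m11 * m22 - m12 * m21
    root = cmath.sqrt(tr * tr - 4.0 * det)
    return (tr + root) / 2.0, (tr - root) / 2.0


lam_1, lam_2 = monodromy_multipliers(1.0, 0.0, 2.0 * math.pi)
```

For this example the exact eigenvalues are 1 and exp(-4π), since radial perturbations decay at rate 2 over a period of 2π. One eigenvalue equal to 1 is a general property of periodic solutions of autonomous systems. The second eigenvalue is very small, which is one of the situations the abstract names as numerically delicate.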
-
Hideki SATOH
Article type: PAPER
Subject area: Nonlinear Problems
2008 Volume E91.A Issue 4 Pages
1169-1176
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
An orthonormal basis adaptation method for function approximation was developed and applied to reinforcement learning with multi-dimensional continuous state space. First, a basis used for linear function approximation of a control function is set to an orthonormal basis. Next, basis elements with small activities are replaced with other candidate elements as learning progresses. As this replacement is repeated, the number of basis elements with large activities increases. Example chaos control problems for multiple logistic maps were solved, demonstrating that the method for adapting an orthonormal basis can modify a basis while holding the orthonormality in accordance with changes in the environment to improve the performance of reinforcement learning and to eliminate the adverse effects of redundant noisy states.
-
Won-Young JUNG, Hyungon KIM, Yong-Ju KIM, Jae-Kyung WEE
Article type: PAPER
Subject area: VLSI Design Technology and CAD
2008 Volume E91.A Issue 4 Pages
1177-1184
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
In order for the interconnect effects due to process-induced variations to be applied to the designs in 0.13μm and below, it is necessary to determine and characterize the realistic interconnect worstcase models with high accuracy and speed. This paper proposes new statistically-based approaches to the characterization of realistic interconnect worstcase models which take into account process-induced variations. The Effective Common Geometry (ECG) and Accumulated Maximum Probability (AMP) algorithms have been developed and implemented into the new statistical interconnect worstcase design environment. To verify this statistical interconnect worstcase design environment, the 31-stage ring oscillators are fabricated and measured with UMC 0.13μm Logic process. The 15-stage ring oscillators are fabricated and measured with 0.18μm standard CMOS process for investigating its flexibility in other technologies. The results show that the relative errors of the new method are less than 1.00%, which is two times more accurate than the conventional worstcase method. Furthermore, the new interconnect worstcase design environment improves optimization speed by 29.61-32.01% compared to that of the conventional worstcase optimization. The new statistical interconnect worstcase design environment accurately predicts the worstcase and bestcase corners of non-normal distribution where conventional methods cannot do well.
-
Mohammad ZALFANY URFIANTO, Tsuyoshi ISSHIKI, Arif ULLAH KHAN, Dongju L ...
Article type: PAPER
Subject area: VLSI Design Technology and CAD
2008 Volume E91.A Issue 4 Pages
1185-1196
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
This paper presents a Multiprocessor System-on-Chips (MPSoC) architecture used as an execution platform for the new C-language based MPSoC design framework we are currently developing. The MPSoC architecture is based on an existing SoC platform with a commercial RISC core acting as the host CPU. We extend the existing SoC with a multiprocessor-array block that is used as the main engine to run parallel applications modeled in our design framework. Utilizing several optimizations provided by our compiler, efficient communication between processing elements is implemented with minimum overhead. A host interface is designed to integrate the existing RISC core with the multiprocessor array. The experimental results show that an effective integration is achieved, demonstrating that the designed communication module can be used to efficiently incorporate off-the-shelf processors as processing elements for MPSoC architectures designed using our framework.
-
Sumek WISAYATAKSIN, Dongju LI, Tsuyoshi ISSHIKI, Hiroaki KUNIEDA
Article type: PAPER
Subject area: VLSI Design Technology and CAD
2008 Volume E91.A Issue 4 Pages
1197-1205
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
We propose a low-cost, stand-alone platform-based SoC for an H.264/AVC decoder, targeting practical mobile applications such as a handheld video player. Both low cost and stand-alone operation are particularly emphasized. The SoC, consisting of a RISC core and a decoder core, has advantages in terms of flexibility, testability and various I/O interfaces. For the decoder core design, the proposed H.264/AVC coprocessor in the SoC employs a new block pipelining scheme instead of a conventional macroblock or hybrid one, which greatly contributes to reducing the size of the core and its pipelining buffer. In addition, the decoder schedule is optimized at the block level, which is easy to program. The core size is reduced to 138 KGate with 3.5 kbyte memory. In our practical development, a single external SDRAM is sufficient for both the reference frame buffer and the display buffer. Various peripheral interfaces such as a compact flash, a digital broadcast receiver and an LCD driver are also provided on the chip.
-
Jeesung LEE, Hanho LEE
Article type: PAPER
Subject area: VLSI Design Technology and CAD
2008 Volume E91.A Issue 4 Pages
1206-1211
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
This paper presents a novel high-speed, low-complexity two-parallel 128-point radix-2^4 FFT/IFFT processor for MB-OFDM ultra-wideband (UWB) systems. The proposed high-speed, low-complexity FFT architecture can provide a higher throughput rate and low hardware complexity by using a two-parallel data-path scheme and a single-path delay-feedback (SDF) structure. The radix-2^4 FFT algorithm is also realized in our processor to reduce the number of complex multiplications. The proposed FFT/IFFT processor has been designed and implemented with 0.18μm CMOS technology in a supply voltage of 1.8V. The proposed two-parallel FFT/IFFT processor has a throughput rate of up to 900Msample/s at 450MHz while requiring much smaller hardware complexity and low power consumption.
-
Han-gil MOON, Yulee CHOI
Article type: LETTER
Subject area: Engineering Acoustics
2008 Volume E91.A Issue 4 Pages
1212-1217
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
To provide the distance information of a sound source in virtual audio space, we must have some information about effective distance cues and there must be some way to handle them properly. It is well known that the conventional cues comprise loudness, spectral information, reverberation and binaural information. Some research works have shown that most of these cues can give listeners only limited distance information. Among these cues, reverberation can give listeners effective distance information, but the implementation using this cue is not a simple problem because there are no well-defined parameters and methods. This paper discusses methods to control the perceived auditory depth with the reverberation cue. A two-stepped linear envelope method and an artificial reverberator, which can control the early reflection slope of an impulse response but does not alter the reverberation time, are proposed as solutions. To validate these methods, subjective assessment was performed.
-
Noboru NAKASAKO, Tetsuji UEBO, Atsushi MORI, Norimitsu OHMATA
Article type: LETTER
Subject area: Engineering Acoustics
2008 Volume E91.A Issue 4 Pages
1218-1221
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
In the research field of microwave radar, a range finding method based on standing wave is known to be effective for measuring short distances. In this paper, we focus our attention on audible sound and fundamentally examine the distance estimation method in which acoustical standing wave is used.
-
Heesik YANG, Sangbae JEONG, Minsoo HAHN
Article type: PAPER
Subject area: Speech and Hearing
2008 Volume E91.A Issue 4 Pages
1222-1225
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
In our previous study, a distortion measure based variable bit rate (DM-VBR) scheme in waveform interpolation (WI) coders was proposed. In this paper, the repetition method is proposed to estimate non-transmitted parameters instead of the extrapolation method. For the further reduction of slowly evolving waveform (SEW) bit rates, the dimensions of the past parameters, which are different from those of the current parameters, are converted to match the dimension of the current ones. Distortions between interpolated sub-frames and original sub-frames are measured for the reduction of the SEW parameters. And the usefulness of several other distortion measures is also investigated instead of the simple log spectral distortion. Experimental results show that the coder adopting the new schemes offers above 41% bit rate reduction with almost unnoticeable output speech degradation.
-
Kwangwook CHOI, Cheolwoo YOU, Intae HWANG, Sangjin RYOO, Kyunghwan LEE ...
Article type: LETTER
Subject area: Digital Signal Processing
2008 Volume E91.A Issue 4 Pages
1226-1228
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
In this paper, we propose a system that adopts an independent MCS (modulation and coding scheme) level for each layer in the AMC (adaptive modulation and coding) scheme combined with the V-BLAST (vertical Bell Labs layered space-time) system. From the simulation results, we observe that since the independent MCS level case adapts the modulation and coding rate for maximum throughput to each channel condition in separate layers, the combined AMC-V-BLAST system with independent MCS level selection results in improved throughput compared to the combined AMC-V-BLAST system with common MCS level selection and the conventional AMC system based on the 1x EV-DO standard. In particular, the combined AMC-V-BLAST system with the independent MCS level achieves a gain of 700kbps in the 7-9dB SNR (signal-to-noise ratio) range.
-
Yasuyuki NOGAMI, Ryo NAMBA, Yoshitaka MORIKAWA
Article type: LETTER
Subject area: Cryptography and Information Security
2008 Volume E91.A Issue 4 Pages
1229-1232
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
This paper shows a necessary condition for type-‹k, m› and ‹k', m› Gauss period normal bases in F_{p^m} to be the same normal basis by using their traces.
-
Takafumi HAYASHI
Article type: LETTER
Subject area: Coding Theory
2008 Volume E91.A Issue 4 Pages
1233-1237
Published: April 01, 2008
Released on J-STAGE: March 01, 2010
JOURNAL
RESTRICTED ACCESS
The present paper describes a method for the construction of a zero-correlation zone sequence set from a perfect sequence. Both the cross-correlation function and the side-lobe of the auto-correlation function of the proposed sequence sets are zero for phase shifts within the zero-correlation zone. These sets can be generated from an arbitrary perfect sequence, the length of which is the product of a pair of odd integers ((2n+1)(2k+1) for k≥1 and n≥0). The proposed sequence construction method can generate an optimal zero-correlation zone sequence set that achieves the theoretical bounds of the sequence member size given the size of the zero-correlation zone and the sequence period. The peak in the out-of-phase correlation function of the constructed sequences is restricted to be lower than the half of the power of the sequence itself. The proposed sequence sets could successfully provide CDMA communication without co-channel interference, or, in an ultrasonic synthetic aperture imaging system, improve the signal-to-noise ratio of the acquired image.
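The two properties named in the abstract can be checked numerically. The sketch below does not reproduce the paper's construction. It only builds one well-known perfect sequence (a Zadoff-Chu sequence of odd length, chosen here as an assumption) and provides a checker for the zero-correlation zone property. The function names and the length 15 = (2·1+1)(2·2+1) are illustrative choices.

```python
import cmath
import math


def zadoff_chu(length, root=1):
    """Zadoff-Chu sequence; needs odd length and gcd(root, length) == 1."""
    return [cmath.exp(-1j * math.pi * root * n * (n + 1) / length)
            for n in range(length)]


def periodic_correlation(a, b, shift):
    """Periodic correlation of equal-length sequences a and b at a shift."""
    n = len(a)
    return sum(a[i] * b[(i + shift) % n].conjugate() for i in range(n))


def is_zcz_set(seqs, zone, tol=1e-9):
    """True if every correlation with |shift| <= zone vanishes, except the
    in-phase auto-correlation peak of each sequence."""
    for i, a in enumerate(seqs):
        for j, b in enumerate(seqs):
            for shift in range(-zone, zone + 1):
                if i == j and shift == 0:
                    continue
                if abs(periodic_correlation(a, b, shift)) > tol:
                    return False
    return True


# A perfect sequence has zero auto-correlation at every out-of-phase shift.
seq = zadoff_chu(15)
sidelobes = [abs(periodic_correlation(seq, seq, s)) for s in range(1, 15)]
```

A single perfect sequence trivially passes `is_zcz_set` for any zone width. The interest of a construction such as the one in the abstract is in producing many sequences that pass the check together.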