Journal of Signal Processing
Online ISSN : 1880-1013
Print ISSN : 1342-6230
ISSN-L : 1342-6230
Volume 25, Issue 6
Special Issue on Nonlinear Circuits, Communications and Signal Processing (Editor-in-Chief: Keikichi Hirose, Editor: Tetsuya Shimamura, Guest Editor: Yoko Uwate, Honorary Editor-in-Chief: Takashi Yahagi)
  • Hikaru Onda, Masayuki Yamauchi
    2021 Volume 25 Issue 6 Pages 203-212
    Published: November 01, 2021
    Released on J-STAGE: November 01, 2021
    JOURNAL FREE ACCESS

    Synchronization phenomena are observed in many situations and places, and are used for various purposes. Many synchronization phenomena in coupled van der Pol oscillators, which are electronic circuits, have been analyzed and reported. We previously discovered and reported a wavelike propagating phase state between adjacent oscillators. However, many synchronization phenomena have not yet been analyzed sufficiently, and their analysis remains an important task. In this paper, we analyze synchronization states on tori with three oscillators per column and three or four oscillators per row. Furthermore, theoretical results are compared with simulation results.

  • Kanato Ishii, Yuma Kinoshita, Yukoh Wakabayashi, Nobutaka Ono
    2021 Volume 25 Issue 6 Pages 213-220
    Published: November 01, 2021
    Released on J-STAGE: November 01, 2021
    JOURNAL FREE ACCESS

    In this paper, we propose a real-time pitch visualization system with a “Blinky” sound-to-light conversion device. A Blinky transmits sound information as the intensity of an onboard light-emitting diode (LED). In conventional research using Blinkies, sound intensity has been converted into light intensity without utilizing spectral information. In contrast, we focus on pitch information of sound in this study and realize real-time pitch visualization as a new application of a Blinky. The proposed system consists of seven Blinkies. Each Blinky calculates a chroma vector in real time, and a pitch is visualized by activating an LED on a Blinky. We modify each Blinky for a smoother frequency response. Experimental results in a real environment show that the proposed system classifies 12 pitch classes of a musical instrumental tone reproduced by a loudspeaker with high accuracy. Furthermore, it is shown that a Specmurt-based chroma vector calculation makes it possible to suppress overtone components in a chroma vector.
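
    As a rough illustration of the chroma vector mentioned in this abstract, the sketch below folds spectral peaks into 12 pitch-class bins. It is not the authors' implementation: the function names, the 440 Hz tuning reference, and the peak-list input format are assumptions made for the example, and the Specmurt-based overtone suppression is not included.

```python
import math

# Pitch-class names, with C as index 0 (a common convention, assumed here).
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_class(freq_hz, a4_hz=440.0):
    """Return the pitch-class index (0 = C, ..., 11 = B) nearest to freq_hz."""
    semitones_from_a4 = round(12 * math.log2(freq_hz / a4_hz))
    return (semitones_from_a4 + 9) % 12  # A is index 9 when C is index 0


def chroma_vector(peaks, a4_hz=440.0):
    """Fold (frequency in Hz, magnitude) pairs into a 12-bin chroma vector."""
    chroma = [0.0] * 12
    for freq_hz, magnitude in peaks:
        if freq_hz > 0:  # skip the DC component, which has no pitch
            chroma[pitch_class(freq_hz, a4_hz)] += magnitude
    return chroma
```

    With this folding, A4 (440 Hz) and its octave (880 Hz) land in the same bin, while the third harmonic of A4 (1320 Hz) lands in the E bin, which is the kind of overtone leakage the abstract says the Specmurt-based calculation suppresses.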

  • Shaohua Kan, Yoshiaki Sasaki, Tetsuya Asai, Megumi Akai-Kasaya
    2021 Volume 25 Issue 6 Pages 221-225
    Published: November 01, 2021
    Released on J-STAGE: November 01, 2021
    JOURNAL FREE ACCESS

    Stochastic computing (SC) has attracted the attention of researchers for decades and plays an important role in the field of information science. However, the relatively large area and power cost of stochastic number generators (SNGs) greatly offsets the advantage of SC in hardware implementations. In this paper, we propose a scheme in which the SNG's function during the weight-update process is taken over by a molecular device that spontaneously generates random spikes and noise [1]. In simulation, a control electrode (gate electrode) is added to the previous electron-cascading model; the frequency with which the charge collected on the collector electrode exceeds a threshold is related to the number of times a voltage is applied to this gate electrode. Specifically, as the voltage is applied repeatedly, the frequency may gradually increase or decrease. This is broadly consistent with the basic principle of weight updating. With some parameter adjustment and model optimization, we expect that this molecular device can replace the network weight-update part of a hardware SC circuit.

  • Riki Watabe, Hiroyuki Kamata
    2021 Volume 25 Issue 6 Pages 227-231
    Published: November 01, 2021
    Released on J-STAGE: November 01, 2021
    JOURNAL FREE ACCESS

    In this paper, we propose a novel method for estimating the time delay in chaotic time series analysis. In recent years, methods that use persistent homology to focus on the shape of an attractor have attracted attention. However, such methods have the problem that their calculation cost is enormous. The proposed method aims to improve the calculation speed while still considering the geometric shape of the attractor by focusing on the distances between the points in the data.

  • Taichi Fukawa, Kenya Jin'no
    2021 Volume 25 Issue 6 Pages 233-237
    Published: November 01, 2021
    Released on J-STAGE: November 01, 2021
    JOURNAL FREE ACCESS

    We propose a method to enhance noise reduction performance by separating a speech spectrum into spectral envelopes and fine structures using cepstrum analysis and linear predictive coding (LPC) analysis, and removing noise using an autoencoder (AE). One technique for removing noise from the spectrum of noisy speech is to use an AE to reconstruct the speech spectrum through latent variables of the speech. By separating in advance the spectral envelopes and fine structures that constitute speech, we improved the independence between the latent variables of the AE that reconstructs the speech spectrum. We confirmed that, when cepstrum analysis was used, noise reduction performance improved in exchange for a slight decrease in the reproducibility of the speech spectra. We also confirmed that cepstrum analysis was superior to LPC analysis in noise reduction.

  • Yuki Kaneko, Takeshi Yamada, Shoji Makino
    2021 Volume 25 Issue 6 Pages 239-243
    Published: November 01, 2021
    Released on J-STAGE: November 01, 2021
    JOURNAL FREE ACCESS

    Acoustic scene classification is one of the important technologies for classifying domestic activities. When considering domestic activities as acoustic scenes, unlike the general task of acoustic scene classification, there is the problem that the sounds of the target scene and interference scene can become mixed. To deal with this problem, we propose a classification method using multiple beamformers and an attention mechanism. In the proposed method, multiple beamformers for different target directions are prepared and their outputs are input to a classifier. The proposed method then estimates the importance of each beamformer output by using an attention mechanism. To verify the effectiveness of the proposed method, we generated acoustic data by mixing the sounds of the target scene and the interference scene, and conducted a classification experiment. The experimental results confirmed that the F-score could be greatly improved by the proposed method.
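
    The attention step in this abstract, which estimates the importance of each beamformer output, can be pictured as a softmax-weighted pooling. The sketch below is only an illustration under assumptions: the abstract does not say how the importance scores are computed, so they are passed in as plain numbers, and the function names and list-based feature format are hypothetical.

```python
import math


def softmax(scores):
    """Convert raw scores to positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(beamformer_features, scores):
    """Pool one feature vector per beamformer into a single weighted vector.

    beamformer_features: list of equal-length feature lists, one per direction.
    scores: one raw importance score per beamformer (assumed to come from
            some trained scoring network, which is not modeled here).
    """
    weights = softmax(scores)
    dim = len(beamformer_features[0])
    pooled = [0.0] * dim
    for weight, features in zip(weights, beamformer_features):
        for i in range(dim):
            pooled[i] += weight * features[i]
    return weights, pooled
```

    Equal scores reduce the pooling to a plain average over beamformers, while a much larger score for one direction makes the pooled vector follow that beamformer's output almost exclusively.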

  • Riki Takahashi, Li Li, Shoji Makino, Takeshi Yamada
    2021 Volume 25 Issue 6 Pages 245-250
    Published: November 01, 2021
    Released on J-STAGE: November 01, 2021
    JOURNAL FREE ACCESS

    In this paper, we propose a new method for the interpolation of virtual signals between two real microphones to improve speech enhancement performance in underdetermined situations. The virtual microphone technique is a recently proposed technique that can virtually increase the number of channels of observed signals by linearly interpolating the phase and nonlinearly interpolating the amplitude based on β-divergence in the short-time Fourier transform (STFT) domain. This technique has been shown to be effective in improving the speech enhancement performance of beamforming in underdetermined situations. It is reasonable to linearly interpolate the phase based on the sound propagation model and nonlinearly interpolate the amplitude to increase the information content of the observed signals. However, there is no theoretical proof that β-divergence is the optimal criterion for amplitude interpolation due to the complexity of the physical model of amplitude. In this paper, we propose the use of an autoencoder to search for the optimal interpolation domain in a data-driven manner. We perform amplitude interpolation in the latent space, a low-dimensional representation space of observed mixture signals that is trained so that the interpolated virtual signals are optimal for conducting beamforming with high performance. Experimental results revealed that the proposed method achieved higher speech enhancement performance than conventional methods.
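
    To make the idea of a virtual microphone concrete, the sketch below interpolates a single STFT bin between two real microphones: the phase is interpolated linearly, as the abstract describes, and the amplitude nonlinearly. The amplitude rule used here is a geometric (log-domain) interpolation chosen only as a simple stand-in; it is neither the β-divergence criterion nor the autoencoder latent-space interpolation of the paper, and the function name and the parameter `alpha` are assumptions for the example.

```python
import cmath
import math


def virtual_mic(x1, x2, alpha):
    """Interpolate one complex STFT bin between two microphones.

    x1, x2: complex STFT coefficients at the same time-frequency bin.
    alpha:  position of the virtual microphone, 0 (at mic 1) to 1 (at mic 2).
    """
    a1, a2 = abs(x1), abs(x2)
    p1, p2 = cmath.phase(x1), cmath.phase(x2)

    # Linear phase interpolation along the shorter arc between the two phases.
    diff = (p2 - p1 + math.pi) % (2 * math.pi) - math.pi
    phase = p1 + alpha * diff

    # Nonlinear amplitude interpolation (geometric mean; an assumed stand-in).
    amplitude = (a1 ** (1 - alpha)) * (a2 ** alpha)

    return cmath.rect(amplitude, phase)
```

    For example, `virtual_mic(1 + 0j, 4j, 0.5)` gives a bin with amplitude 2 and phase π/4, halfway between the two observations in phase and at their geometric mean in amplitude.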

  • Bowen Zhang, Masahiro Tanaka
    2021 Volume 25 Issue 6 Pages 251-255
    Published: November 01, 2021
    Released on J-STAGE: November 01, 2021
    JOURNAL FREE ACCESS

    In this work, we develop a real-time computer vision system to detect people and judge whether each person is wearing a mask. We construct a two-stage algorithm based on deep convolutional neural networks, in which the masks are treated as objects in an image. Furthermore, we adopt dilated convolution to improve the accuracy of mask recognition when the human face occupies a large area of the image. Based on recent research on COVID-19 infection risk and reported infection danger criteria, the system issues an alarm at one of three levels according to the proportion of people without masks in the captured image, with the borders at 20% and 50%. This notifies people in the area that the situation is safe, a little dangerous, or particularly dangerous.
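
    The three-level alarm with borders at 20% and 50% can be sketched as a simple threshold rule. The abstract does not state which level a ratio exactly on a border falls into, or what happens when no people are detected, so both choices below are assumptions, as is the function name.

```python
def alarm_level(num_people, num_unmasked):
    """Return 1 (safe), 2 (a little dangerous) or 3 (particularly dangerous).

    The level depends on the proportion of detected people without masks.
    Assumptions: a ratio exactly on a border counts as the higher level,
    and an image with no detected people is treated as safe.
    """
    if num_people <= 0:
        return 1  # assumed: nobody in view means no alarm
    ratio = num_unmasked / num_people
    if ratio < 0.20:
        return 1
    if ratio < 0.50:
        return 2
    return 3
```

    For instance, 1 unmasked person out of 10 gives level 1, 3 out of 10 gives level 2, and 5 out of 10 gives level 3 under these assumptions.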

  • Daiki Arai, Taishi Iriyama, Masatoshi Sato, Hisashi Aomori, Tsuyoshi O ...
    2021 Volume 25 Issue 6 Pages 257-261
    Published: November 01, 2021
    Released on J-STAGE: November 01, 2021
    JOURNAL FREE ACCESS

    Demosaicking is an image reconstruction process for restoring full-color images from color filter array (CFA) data. In recent years, many deep convolutional neural network (CNN)-based demosaicking methods have been reported, and state-of-the-art accuracy has been achieved. In this paper, we propose a novel demosaicking method using the predictive filter flow (PFF) network for various CFA patterns. The PFF is a model that predicts a spatially variant linear filter that transforms an input image into a target image. To incorporate the PFF into demosaicking, the proposed network synthesizes the filter flow corresponding to each channel by means of a network trained by integrating RGB channels. Our model, designed to apply demosaicking with the PFF to various CFA patterns, provides versatility and extensibility. Experimental results demonstrate that the proposed method provides better or competitive results compared with several state-of-the-art deep-CNN-based demosaicking algorithms.

  • Taishi Iriyama, Masatoshi Sato, Hisashi Aomori, Tsuyoshi Otake
    2021 Volume 25 Issue 6 Pages 263-268
    Published: November 01, 2021
    Released on J-STAGE: November 01, 2021
    JOURNAL FREE ACCESS

    A digital camera acquires images using a single electronic sensor with a color filter array (CFA). The raw image contains luminance, defined as a spatial map of intensity, and chrominance, defined as a spatial map of the color information. Since the luminance and chrominance components have different demosaicking complexities, they should be modeled separately. In this paper, we propose a novel convolutional neural network (CNN)-based demosaicking method that separately estimates the luminance and chrominance components. Specifically, we apply two-stage CNNs consisting of a luminance component estimation network and a chrominance component estimation network. The proposed method suppresses artifacts such as false colors and reduces the computational complexity. Experimental results on several benchmark datasets demonstrate that the proposed method provides results that are better than or competitive with those of conventional demosaicking algorithms while reducing the computational complexity.
