Special Issue on Image Electronics and Related Technologies towards User Value Creation and Innovation
-
Shuichi KAMIJO, Yuichi MIYAJIMA, Atsushi MATSUI, Yohei NAKADA, Daigo M ...
2010 Volume 39 Issue 5 Pages
571-579
Published: September 25, 2010
Released on J-STAGE: August 25, 2011
JOURNAL
FREE ACCESS
For multi-object tracking using a particle filter, a tracking method in which cross entropy is incorporated into the likelihood function is proposed, with the aim of improving tracking speed. Baseline methods have used the Bhattacharyya distance, KL divergence, and similar measures in the likelihood function. However, these methods incur a non-negligible computational cost because a color histogram must be calculated for every sample drawn at each frame. In contrast, with the cross entropy method, likelihoods can be calculated without generating sample histograms, which is expected to speed up tracking. Moreover, incorporating background information into a tracking algorithm is a possible way to improve performance, and background information can be used together with cross entropy without increasing the computational cost. Therefore, a fast tracking algorithm that is robust to occlusion can be obtained by combining background information with cross entropy. The proposed method was experimentally compared with a baseline method using the Bhattacharyya distance, and the effectiveness of the proposed method and the effect of the number of samples were examined.
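As a rough illustration of why a cross entropy likelihood can skip the per-sample histogram, the sketch below contrasts the two computations. It is not taken from the paper: it assumes the cross entropy is measured between a sample's empirical color distribution and a reference histogram, in which case it reduces to averaging log reference probabilities looked up pixel by pixel. The function names and the `eps` smoothing term are hypothetical.

```python
import numpy as np

def bhattacharyya_distance(ref_hist, bins):
    """Baseline: a histogram must be built for every sample region."""
    # bins: 1-D integer array holding the color-bin index of each pixel
    cand = np.bincount(bins, minlength=len(ref_hist)).astype(float)
    cand /= cand.sum()
    bc = np.sum(np.sqrt(ref_hist * cand))  # Bhattacharyya coefficient
    return np.sqrt(max(0.0, 1.0 - bc))

def cross_entropy(ref_hist, bins, eps=1e-12):
    """Assumed form: H(p, q) = -mean_i log q(bin_i); no sample histogram."""
    # The log table depends only on the reference, so in a tracker it
    # could be computed once and reused for every sample and frame.
    log_q = np.log(ref_hist + eps)
    return -np.mean(log_q[bins])
```

Under this assumption, the per-sample work for the cross entropy is a table lookup and a mean over the region's pixels, while the baseline also has to accumulate and normalize a histogram for each sample.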
-
Chengjiao GUO, Ying LU, Takeshi IKENAGA
2010 Volume 39 Issue 5 Pages
580-589
Published: September 25, 2010
Released on J-STAGE: August 25, 2011
JOURNAL
FREE ACCESS
Object tracking is one of the most important applications in the field of computer vision, and object occlusion is one of its common problems. Tracking is especially difficult under long-term (also called long-lived) full occlusion, during which the target remains invisible for tens of frames. This paper proposes an occlusion handling scheme based on a particle filter. In contrast to the conventional particle filter, which usually uses color as the tracking cue, multiple likelihood models, namely HSV color and gradient orientation likelihoods, are employed in the observation model during occlusion. Combining these two features keeps the target distinguishable even if it is occluded by a similarly colored object in the background. In addition, multiple state noises are introduced to ensure that the target is redetected at the end of full occlusion while tracking accuracy is maintained under occlusion. Experimental results under different occlusion conditions show that the proposed particle filter achieves robust and accurate performance compared with the particle filter with appearance-adaptive models and the color particle filter, even under long-lived full occlusion.
-
Tsuyoshi SASAKI, Kodai KAWANE, Takeshi IKENAGA
2010 Volume 39 Issue 5 Pages
590-597
Published: September 25, 2010
Released on J-STAGE: August 25, 2011
JOURNAL
FREE ACCESS
Surveillance camera systems play an important role in creating a safe and secure society. In particular, real-time motion detection is key to detecting abnormal scenes. We therefore selected the KLT (Kanade-Lucas-Tomasi) tracker and attempted to implement a system based on it. However, many problems remain in accuracy and system cost. This paper proposes score control by a weighted mask and an adaptive feature point interval algorithm to increase the accuracy of object detection. Moreover, to implement these algorithms on a low-cost FPGA, hardware architectures such as a weighted value generation circuit, an insert position calculation circuit, and a feature point data update circuit are proposed. Evaluation results show that the proposed algorithms can detect motion vectors with high accuracy for various surveillance scenes. Moreover, hardware implementation results show that the proposed architecture attains real-time processing with around 20% of the FPGA resources.
-
Fang ZHANG, Yoshihiro SUGAYA, Shinichiro OMACHI, Hirotomo ASO
2010 Volume 39 Issue 5 Pages
598-605
Published: September 25, 2010
Released on J-STAGE: August 25, 2011
JOURNAL
FREE ACCESS
It is difficult to recognize characters in a scene image taken by a digital camera because the image usually suffers from geometrical distortions, changing lighting conditions, and so on. We focus on geometrical distortions. If the transformation is restricted to an affine transformation, it is a combination of expansion, rotation, and slant transformations. Although the size can be adjusted and the rotation angle can be estimated, it is difficult to estimate the slant angle of a character image. In this paper, we propose an algorithm for slanted character recognition. The proposed technique is based on the subspace method and uses the fast Fourier transform to efficiently calculate the similarity of patterns at various slant angles. The experimental results show the effectiveness of the proposed method.
-
Dongzhu YIN, Yoshihiro SUGAYA, Shinichiro OMACHI, Hirotomo ASO
2010 Volume 39 Issue 5 Pages
606-614
Published: September 25, 2010
Released on J-STAGE: August 25, 2011
JOURNAL
FREE ACCESS
Although skin color segmentation using different color spaces has been investigated with various approaches, there is still much to explore. Previous comparative analyses of color space models did not sufficiently address their performance in cases where the training samples for skin color and the test samples are taken under different environments. In this paper, we present a comparative study of different color space models for skin color segmentation. To show the significance of choosing the proper color space model, we investigated thirteen different color space models using two different sets of skin color samples on two databases. The results of the comparative experiments show that the CIECAM02 color appearance model segments true skin color most precisely across various skin color samples.
-
Shinsuke NAKAMURA, Shigeo MORISHIMA
2010 Volume 39 Issue 5 Pages
615-620
Published: September 25, 2010
Released on J-STAGE: August 25, 2011
JOURNAL
FREE ACCESS
Characteristics of human motion, such as walking, running, or jumping, vary from person to person. These differences enable people to identify themselves or a friend. However, it is challenging to generate computer graphics animation in which individual characters exhibit characteristic motion. In our research, the differences between a target motion and the average of a set of sample motions are regarded as the characteristics of the target motion. We can synthesize gait animation with exaggerated characteristics by increasing these differences. The synthesized motion is represented as a PCA (Principal Component Analysis) score in the PCA space composed of the sample motions. In an experiment in which subjects look for a target motion in a crowd, we estimate the optimum degree of exaggeration that minimizes the time needed to find the target motion.
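The exaggeration step described above can be sketched as scaling a motion's PCA scores about the sample mean. This is only one possible reading of the abstract, not the authors' implementation: it assumes each motion is flattened into a fixed-length vector, and the function name `exaggerate` and the scaling factor `alpha` are hypothetical.

```python
import numpy as np

def exaggerate(samples, target, alpha):
    """Scale the target's deviation from the average motion in PCA space.

    samples: (n_motions, n_features) array of flattened sample motions
    target:  (n_features,) flattened target motion
    alpha:   1.0 keeps the motion as projected, >1.0 exaggerates it
    """
    mean = samples.mean(axis=0)
    # Rows of vt are the principal axes of the centered sample motions.
    _, _, vt = np.linalg.svd(samples - mean, full_matrices=False)
    score = vt @ (target - mean)          # PCA score of the target
    return mean + (alpha * score) @ vt    # back to motion space
```

With `alpha = 0.0` the result is the average motion, and values above 1.0 push the motion further from the average along the same directions.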
-
Satoshi SHIMADA, Akira SUZUKI, Shunichi YONEMURA, Akira KOJIMA
2010 Volume 39 Issue 5 Pages
621-630
Published: September 25, 2010
Released on J-STAGE: August 25, 2011
JOURNAL
FREE ACCESS
We propose a method of recording an experience with videos captured by coworkers' head-mounted cameras and automatically detecting useful scenes in the videos based on the relation among changes in the coworkers' eye directions. When members are interested in an experience, they naturally move to a position from which they can see the object of interest well. As a result, each member's head-mounted camera acquires video that is effective for recording the experience. Moreover, members who share an experience show the following characteristics in how their eye directions change: either all members pay attention to a specific object at the same time, or the eye direction changes in a typical pattern according to each person's role in the experience. Experimental results show that the proposed method can capture videos with various angles and shot sizes. In addition, it can detect useful scenes in the videos by using the relation among the coworkers' eye direction movements.
-
Michihiko GOTO, Yuko UEMATSU, Hideo SAITO, Shuji SENDA, Akihiko IKETAN ...
2010 Volume 39 Issue 5 Pages
631-643
Published: September 25, 2010
Released on J-STAGE: August 25, 2011
JOURNAL
FREE ACCESS
Augmented Reality (AR) based work support systems can help users intuitively. To develop AR systems, it is necessary to create the content to be overlaid. In most cases, such content is newly created with CG for each application. In our system, we use existing videos as instruction videos, which are transformed to the user's viewpoint and overlaid onto the user's view. To solve the problem that the instruction video and the user's view may be visually confused, we add various visual effects to the instruction video, such as transparency and contour enhancement. Moreover, because the instruction video is divided into sections according to the work, the user interactively advances to the next step in the instruction video after finishing the current operation. The user can therefore carry out the work at his/her own pace. In the usability test, users evaluated the way our system presents the instruction video through two types of work: making origami and building blocks. The results show that visibility for the user improves when the instruction video is transformed to the user's view. As for the visual effects, we were able to classify them according to the work and obtain guidelines for applying our system to other work.
-
Terumasa AOKI, Hiroshi YASUDA
2010 Volume 39 Issue 5 Pages
644-653
Published: September 25, 2010
Released on J-STAGE: August 25, 2011
JOURNAL
FREE ACCESS
In recent years, the distribution of personal information via the Web and Weblogs has been increasing steadily. However, there are few technologies and laws to protect the individuals involved. For example, it is very difficult for the author of a Web page to prove his/her innocence when the page is altered against the author's will and published by magazines as if he/she had actually written it. In this paper, we propose WebFingerprint, which makes it possible to solve these kinds of mismatch problems. Furthermore, this paper shows through prototyping that WebFingerprint is easy to implement.
-
Ye Kyaw Thu, Yoshiyori URANO
2010 Volume 39 Issue 5 Pages
654-662
Published: September 25, 2010
Released on J-STAGE: August 25, 2011
JOURNAL
FREE ACCESS
Mobile devices such as mobile phones, PDAs, music players, and game players are coming to play an important role in today's communication, education, and entertainment. Text input on these small mobile devices is a challenging research topic for daily tasks such as emailing, word processing, browsing or searching for information, note taking, and adding a new contact to a phone book. In this paper, we propose a new text input interface for Khmer (the language of Cambodia) for mobile devices that use a clickwheel like that of the Apple iPod. We used our proposed Positional Prediction (PP) text input concept to predict possible combinations of a consonant and vowels or a syllable, and named the new text input interface PP_Clickwheel (Positional Prediction with Clickwheel). We conducted a user study of PP_Clickwheel with ten native participants in Phnom Penh, Cambodia, to judge its user-friendliness based on first-time users' typing speed. The results are acceptable and positive for the current version of the PP_Clickwheel prototype.
-
Chen LIU, Tianruo ZHANG, Xin JIN, Minghui WANG, Satoshi GOTO
2010 Volume 39 Issue 5 Pages
663-671
Published: September 25, 2010
Released on J-STAGE: August 25, 2011
JOURNAL
FREE ACCESS
H.264/AVC introduces variable block size motion estimation (VBSME), which imposes a huge computational cost on the encoder. In this paper, a novel fast inter mode decision algorithm for H.264/AVC is proposed. The proposed algorithm evaluates the modes based on residual features. The residual is obtained after the motion search of the P16 × 16 mode or the P8 × 8 mode. Then, based on the extracted residual features, complexity and similarity are evaluated for the inter mode decision. According to the similarity between different sub-blocks and the complexity of each sub-block, the most probable inter modes for the current block are chosen and evaluated. In the worst case, the whole proposed inter mode decision scheme evaluates only 4 modes, which is much more efficient than evaluating all 8 modes as in the conventional approach. The simulation results show that, compared to JM14.1, the proposed algorithm achieves on average 57.98% and 55.72% time savings on CIF and 720p sequences, respectively, with an equivalent 0.219 dB PSNR drop and 5.55% bit rate increase for CIF and a 0.107 dB PSNR drop and 3.53% bit rate increase for 720p. Compared to an existing inter mode decision algorithm, the proposed algorithm achieves 10.68% and 13.26% time reductions on CIF and 720p sequences, respectively, with less performance loss.
-
Guifen TIAN, Tianruo ZHANG, Satoshi GOTO
2010 Volume 39 Issue 5 Pages
672-681
Published: September 25, 2010
Released on J-STAGE: August 25, 2011
JOURNAL
FREE ACCESS
The variable block sizes for intra and inter coding in H.264/AVC achieve significant coding gain compared with coding a macroblock (MB) at a fixed size. However, an extremely heavy computational burden arises when the Rate Distortion Optimization (RDO) process searches by brute force for the optimal coding block. This paper proposes an MB homogeneity detection method to accelerate H.264/AVC intra and inter coding. The luminance values of all pixels in an MB are used to calculate an entropy feature, which is defined as the MB's spatial homogeneity. Based on the homogeneity judgment, either the 16×16 or the 4×4 block size is selected for intra coding; meanwhile, either the large blocks in {16×16, 16×8, 8×16} or the sub-blocks in {8×8, 8×4, 4×8, 4×4} are chosen for inter coding. In particular, a cost function is defined to select a near-optimal threshold for choosing the block size. The proposed methods are verified on a wide range of video sequences with different spatial and motion characteristics. Extensive simulations demonstrate that a consistent encoding gain is achieved for all videos with different motion and spatial features. Encoding complexity for intra coding alone can be reduced by 31%-34%, and the time savings for inter mode decision are 43.7%-58.7%, both with negligible loss in bitrate and PSNR.
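A minimal sketch of an entropy-based homogeneity test follows, assuming the entropy feature is the Shannon entropy of the macroblock's luminance histogram. The abstract does not give the exact definition or the threshold, so `mb_entropy`, `choose_intra_size`, and the `threshold` argument are hypothetical, and the cost-function-based threshold selection is not modeled.

```python
import numpy as np

def mb_entropy(mb):
    """Shannon entropy (bits) of the luminance values in one macroblock.

    mb: 2-D uint8 array, e.g. 16x16 luma samples.
    """
    counts = np.bincount(mb.ravel(), minlength=256)
    p = counts[counts > 0] / mb.size
    return float(-np.sum(p * np.log2(p)))

def choose_intra_size(mb, threshold):
    """Low entropy is treated as homogeneous, so the large block is used.

    threshold is a placeholder; the paper selects it with a cost function.
    """
    return "16x16" if mb_entropy(mb) < threshold else "4x4"
```

A flat macroblock has entropy 0 and would be coded as a single 16×16 block under this rule, while a highly textured one would fall through to 4×4.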
-
Minghui WANG, Tianruo ZHANG, Chen LIU, Satoshi GOTO
2010 Volume 39 Issue 5 Pages
682-691
Published: September 25, 2010
Released on J-STAGE: August 25, 2011
JOURNAL
FREE ACCESS
H.264/AVC achieves a low bit-rate video stream that meets the requirements of video communication. The problem with H.264/AVC is its large computational burden. A fast algorithm should therefore be adopted to reduce this burden and fit the limited power of mobile devices. This paper uses a region-of-interest (ROI) detector to locate an “important” region and applies unequal coding in the encoder engine according to the ROI. Several coding parameters, including the quantization parameter (QP), the candidates for mode decision, the number of reference frames, and the search range of motion estimation, are adaptively adjusted at the macroblock (MB) level. This design is decoding-friendly. Experimental results show that a large amount of computation is saved and that subjective visual quality is maintained or even improved.
-
Keiki YAMADA, Ichiro FURUKI, Takaaki KASE, Naoshi NAKAYA, Yuji KOUI
2010 Volume 39 Issue 5 Pages
697-705
Published: September 25, 2010
Released on J-STAGE: August 25, 2011
JOURNAL
FREE ACCESS
We have investigated the optimum mechanical design and the cooling method that influence the recording and erasing characteristics of a continuous-tone rewritable medium suitable for digital signage systems. We clarified that the dominant mechanical parameter for optical density is limited to the nip pressure, and showed the conditions for obtaining a nip pressure sufficient to erase records, including the shape of the heating element of the thermal head, the nip width, the nip points, etc. We then showed that mechanical parameters such as the diameter and length of the platen rollers and the rubber hardness can be obtained by a 3-dimensional contact analysis method, and that the parameters for designing the thermal head pressing mechanism can be determined by simulation. We also showed that the cooling characteristic (heat radiation characteristic) is efficiently improved by a cooling method using a heat pipe connected to the body mechanism.
-
Fumihito ITO, Oky Dicky Ardiansyah PRIMA, Ikuko UWANO, Kenzo ITO
2010 Volume 39 Issue 5 Pages
706-713
Published: September 25, 2010
Released on J-STAGE: August 25, 2011
JOURNAL
FREE ACCESS
As a result of the aging society, the number of patients with advanced osteoarthritis who undergo total knee arthroplasty (TKA) in Japan has increased significantly. Pre-operative planning for TKA uses upright x-ray fluoroscopy images of the affected knee to measure the position of the prosthetic joint. This process requires an experienced doctor to interpret the skeletal conformation from the images. Creating a 3D image from a CT image of the affected knee, instead of using these images, would help with this task. However, taking an upright CT image is currently difficult due to the functional constraints of CT devices. In this paper, a registration method for generating an upright 3D model of the knee joint from a CT image is proposed. This method uses an upright x-ray fluoroscopy image as a reference. Anatomically based registration is performed on the femur and the tibia separately. Our quantitative examination and a clinician's assessment of the registered images show that the resulting model provides practical benefits for better interpretation of the skeletal conformation. Furthermore, the proposed method demonstrates its ability to generate an upright 3D model of a knee with severe osteoarthritis from a CT image.
-
Ming FANG, Hidenori TAKAUJI, Shun'ichi KANEKO, Hidemi WATANABE
2010 Volume 39 Issue 5 Pages
714-724
Published: September 25, 2010
Released on J-STAGE: August 25, 2011
JOURNAL
FREE ACCESS
A novel robust method for estimating optical flow in a dynamic image sequence of poor quality is proposed. To estimate the optical flow, we divide the local region in the neighborhood of each position of interest into several sub-regions and then compute the similarity profile of each sub-region using Orientation Code Matching. These similarity profiles around the position can be used to extract two kinds of votes: positive votes (candidate vectors) and negative votes (suppressing areas). Positive voting enhances the signal corresponding to correct optical flows, and negative voting reduces the noise corresponding to incorrect optical flows. The two kinds of voting are integrated into complementary voting in order to extract a reasonable flow, together with proper parameters that maximize the signal-to-noise ratio. Experiments with real image sequences are conducted to show the effectiveness of the proposed method.
-
Yuko TASHIRO, Tsuyoshi SAITOH
2010 Volume 39 Issue 5 Pages
725-732
Published: September 25, 2010
Released on J-STAGE: August 25, 2011
JOURNAL
FREE ACCESS
With the rapid expansion of information technology and low-cost storage media, numerous movie films are being digitized and archived in databases. These movies and animated images are stored for their own specific purposes. It is therefore important to have a means of retrieving and referencing a specified scene and of extracting features in the time domain. The purpose of our research is to develop the following two methods. The first method compactly represents the situation in a specific space from a long movie taken by a fixed-angle camera and uses this representation to access a specified frame in the movie quickly. The second method extracts spatial features from the representation. In this research, we represent the contents of a movie in a specified time interval as one “slit image”, created by applying the principle of the slit camera, which can display the actual images in a specified subspace of the captured space in time sequence. The time axis of the slit image corresponds to the frame number of the original movie, so the slit image can be used as an index of the movie. Moreover, we can extract many features by image processing on the created slit image alone, without the original stored movie. As a practical application, we have developed a way to clearly display the behavior of a group of cattle from a long movie taken in a paddock where multiple cattle move freely. This research will make it possible to control cattle behavior.
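The slit image idea can be sketched as taking the same pixel column from every frame and laying the columns side by side in time order. This is a simplified illustration under stated assumptions, not the authors' procedure: it assumes grayscale frames stacked in an array of shape (frames, height, width) and a single vertical slit at a fixed column `x`; the function name `make_slit_image` is hypothetical.

```python
import numpy as np

def make_slit_image(frames, x):
    """Build a slit image from a fixed-camera movie.

    frames: array of shape (n_frames, height, width), grayscale
    x:      column index of the (assumed vertical) slit
    Returns an array of shape (height, n_frames); column t is the slit
    as seen in frame t, so the horizontal axis is the frame number.
    """
    frames = np.asarray(frames)
    return frames[:, :, x].T
```

Because column `t` of the result comes from frame `t`, a position picked on the slit image maps directly back to a frame number in the original movie, which is what lets it serve as an index.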
-
Hayato YAMAMORI, Fumihiko SAITOH
2010 Volume 39 Issue 5 Pages
733-740
Published: September 25, 2010
Released on J-STAGE: August 25, 2011
JOURNAL
FREE ACCESS
There are many signboards in the street, and they show useful information. Because many signboards indicate the direction of a target with an arrow, it is very important to detect the location and the angle of the arrow in a scene image. This paper proposes a method for detecting the location and the angle of arrow patterns in a scene image using the multi-point combinational Hough transform. Scanning starts from a pair of edge pixels; if the requirements based on the geometrical parameters of the arrow are fulfilled, a vote is cast in the parameter space at the point corresponding to the location judged to be the tip of the arrowhead and the angle judged to be the direction the arrow indicates in the object image. After all voting is finished, the parameter space is scanned, and the point with the largest number of votes is detected as the location and the angle of an arrow in the object image. The experimental results show that arrows of various shapes are detected by the proposed method.
-
Tomoyuki SASAKI, Jun-ichi KUDOH
2010 Volume 39 Issue 5 Pages
741-747
Published: September 25, 2010
Released on J-STAGE: August 25, 2011
JOURNAL
FREE ACCESS
In this paper, we propose an efficient method of Asian dust extraction based on a 3-dimensional histogram of satellite images (NOAA/AVHRR, TERRA/MODIS). Previous methods of Asian dust extraction have the problem that they do not distinguish between Asian dust and other objects. Our method can therefore visualize the indexes corresponding to vegetation, soil, and water through simultaneous processing. We then identify the Asian dust visually and extract it using the 3-dimensional histogram. We compared the proposed method with previous methods using images taken when Asian dust reached Japan. The proposed method proved more useful for extraction.
-
Yousuke KANBE, Fumihiko SAITOH
2010 Volume 39 Issue 5 Pages
748-755
Published: September 25, 2010
Released on J-STAGE: August 25, 2011
JOURNAL
FREE ACCESS
Drill texts are used for iterative practice. However, they are usually used by writing directly on them in pencil, which makes repeated use difficult. This paper proposes a method for eliminating pencil marks from used drill texts. Building such a function into photocopiers would solve the above problem. The proposed method uses the image features of writing materials to eliminate pencil marks. Distinguishing pencil marks by image features makes elimination possible regardless of their shapes. Experimental results show that pencil marks could be eliminated with above 99% accuracy using the proposed method, with little false elimination of machine printing.
-
Keiji SHIBATA, Kei MAEDA, Soshi URAKAMI, Yuukou HORITA
2010 Volume 39 Issue 5 Pages
756-763
Published: September 25, 2010
Released on J-STAGE: August 25, 2011
JOURNAL
FREE ACCESS
Hazard maps have mainly been made as a disaster mitigation measure for earthquakes and tsunamis. In this paper, we investigate a new type of hazard map and construct a prototype. The layers of the hazard map were obtained by a spatial interpolation method. Using these databases and the layer information, a scalable, real-time hazard map was constructed at different ground resolutions.