Speaker recognition has recently become important from a security standpoint. However, the circuit configuration tends to be complicated because an FFT is usually performed to analyze the sound signal. This paper presents a new speaker recognition algorithm that works on the voice waveform and can be implemented in a simple circuit.
Simultaneous transmission of an image and a sound requires communication resources. Conventionally, compression of sound data has not been widely used because the data loss is conspicuous. This paper presents a new algorithm that embeds a sound in a picture in order to save communication resources.
This paper presents a Chinese language learning system that combines visualization of prosody with speech conversion. It can display the pitch contours of a native speaker's and a learner's utterances, and can also correct the learner's prosody to that of the native speaker. A skill measure defined by the distance between the native speaker's and the learner's pitch contours is effective, as is scoring by Chinese listeners.
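The abstract does not define the distance used for the skill measure. As a minimal sketch of one possible choice, the following computes a root-mean-square distance between two pitch contours. Everything here is an assumption for illustration, not the authors' definition: the function names, the conversion to semitones relative to each speaker's own mean (so that different voice ranges are comparable), and the linear resampling to a common length `n`.

```python
import math

def resample(contour, n):
    """Linearly resample a contour to n points (n >= 2)."""
    if len(contour) == 1:
        return [contour[0]] * n
    out = []
    for i in range(n):
        pos = i * (len(contour) - 1) / (n - 1)
        lo = int(pos)
        hi = min(lo + 1, len(contour) - 1)
        frac = pos - lo
        out.append(contour[lo] * (1 - frac) + contour[hi] * frac)
    return out

def to_semitones(contour_hz):
    """Convert F0 values in Hz to semitones relative to the contour's mean."""
    st = [12 * math.log2(f) for f in contour_hz]
    mean = sum(st) / len(st)
    return [s - mean for s in st]

def contour_distance(native_hz, learner_hz, n=50):
    """RMS distance (in semitones) between two pitch contours.

    Hypothetical skill measure: a smaller value means the learner's
    contour shape is closer to the native speaker's.
    """
    a = resample(to_semitones(native_hz), n)
    b = resample(to_semitones(learner_hz), n)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / n)
```

With this normalization, a learner who produces the same contour shape one octave lower scores a distance of about zero, while a falling contour compared with a rising one scores a clearly positive distance.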
This paper describes an algorithm for detecting solitary pulmonary nodules in chest radiology using belief propagation. We discuss how belief propagation is used to detect solitary pulmonary nodules.
A new dynamic image processing system with multiple cameras has been proposed and applied to fire detection in chemical plants, power stations, etc. This paper presents a smoke detection method using optical flow for the dynamic image processing system. Experiments confirm that the method can recognize smoke.
With the spread of broadband networks, it has become easy to transmit and edit images over networks. However, retrieval by keywords takes a long time because it is difficult to derive keywords from images. In this paper, we describe a new image retrieval method that uses color ratios obtained from text data.
In order to realize a new broadcasting service in which viewers can obtain various information from the objects on the TV screen, we have proposed a new networked database system for efficient metadata input during program production. The proposed system dynamically links program-related information with the objects. By using the network, many independent processes can run in parallel on distributed databases, so efficient program production can be realized. We carried out a program-production simulation to confirm the validity of the proposed system.
This paper describes the examination of an original MPEG-7-based retrieval profile as a means of efficiently searching video material, including archives, centrally via the web; the development of a retrieval application using the profile; and the verification of the profile's layered structure description.
In this paper, we present the results of research that the Yamagata Digital Content Center (YDCC) has been conducting towards the development of a large-scale video archival and retrieval system. These include a number of important findings regarding digital content preservation, distribution and production. We carried out verification tests with an MPEG-7-based video material retrieval application and a content management approach based on the standard proposed by the Content ID Forum. Our proposed business solutions can be effectively deployed in any archival and retrieval system in the future network community and can be used to achieve smooth management of secure content distribution.
The authors are developing an HDTV wireless camera system for studio use that transmits a full-size HD-SDI signal in the form of multiple OFDM carrier signals. This report discusses "OFDM-FDM" and "OFDM-CDM" for multiplexing the OFDM carrier signals. The transmission properties of "OFDM-CDM" and "OFDM-FDM" were examined by computer simulation and compared. It was confirmed that "OFDM-FDM" has better properties than "OFDM-CDM" at higher C/N.
Three diversity combining methods are compared in field tests. Maximum ratio combining is confirmed to give the best performance. A larger improvement is obtained by applying it together with time interleaving. The results show the feasibility of mobile reception of HDTV.
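For readers unfamiliar with maximum ratio combining, the following is a minimal textbook sketch, not the receiver used in the field tests. It assumes each branch receives `y_i = h_i * s + n_i` with a known complex channel gain `h_i` and equal noise power on every branch; the function names are illustrative.

```python
def mrc_combine(received, channels):
    """Maximum ratio combining of one symbol received on several branches.

    Each branch is weighted by the complex conjugate of its channel gain,
    so stronger branches contribute more, and the sum is normalized by the
    total channel power. Assumes equal noise power on all branches.
    """
    num = sum(h.conjugate() * y for h, y in zip(channels, received))
    den = sum(abs(h) ** 2 for h in channels)
    return num / den

def mrc_output_snr(branch_snrs):
    """With equal noise power per branch, the SNR after MRC is the sum of
    the branch SNRs (linear scale, not dB)."""
    return sum(branch_snrs)
```

In the noiseless case the combiner returns the transmitted symbol exactly, which is a quick sanity check on the weights.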
We have developed an optical transmission method that converts UHF signals into millimeter-wave signals at an optical receiver. In a transmission experiment with eight channels of terrestrial digital broadcasting signals using an optical SSB modulator, a CNR of 40 dB was obtained for the millimeter-wave signals.
The Japan Aerospace Exploration Agency (JAXA) plans to launch a geostationary satellite called Engineering Test Satellite-VIII (ETS-VIII) in 2004. The mobile-station antenna used for this experiment will be mounted on a car roof. Therefore, the antenna should be small, thin, cheap and simple in structure. A phase shifter is not necessary, and the beam is switched by a simple ON-OFF feeding method. In this paper, a switching circuit that can be used for the antenna is discussed. The circuit is required to be miniaturized, to have low insertion loss so that the signal passes efficiently, and to have satisfactory power-handling characteristics for the transmitting circuit.
We have been developing a sharp-cut superconducting bandpass filter, which is effective in drastically reducing adjacent channel interference, for digital terrestrial broadcasting. A compact ultra-high frequency (UHF) superconducting filter was fabricated experimentally and its sharp-cut property was confirmed.
Superconducting filters feature low loss and sharp-skirt characteristics. These features can be used to reduce adjacent channel interference in digital terrestrial television broadcasting networks. This paper describes the effect of employing superconducting filters in relay stations for digital terrestrial broadcasting.
We have developed a video multiplexer that transmits the various video signal formats used in the digital terrestrial broadcasting service in Japan between broadcasting stations over a single uncompressed HDTV transmission line.
Producers currently use different forms of telop as part of broadcast programs. This paper evaluates the use of telop from the viewers' standpoint and considers its use and importance in future broadcast programs.
Interest in closed captions has been growing with the introduction of digital television. Currently, the most debated topic in closed captioning is its technical issues, while little attention is given to the creativity of the original TV program. This study focuses on closed caption production in the broadcasting industry.
The Telecommunications Advancement Organization (TAO) of Japan initiated a project to develop efficient production of closed captions for TV programs. As the final result of the project, we have developed a "utility model system2" that combines automatic procedures with assistant functions for manual operation.
The TAO of Japan has been developing efficient production of closed captions for TV programs. We report on the automatic text formatting technique of the closed-caption production system, which uses image recognition to avoid open captions and adds "RUBI" (KANA alongside KANJI characters for reading) over the captions.
The Telecommunications Advancement Organization (TAO) has built an off-line closed caption production system for hearing-impaired people. We have developed a speech synchronization method that detects the position of each sentence in the speech of a TV program. We report an outline of the method and the results of its assessment.
A smart image sensor for an augmented reality (AR) system using optical devices is presented; such a system can provide an enhanced view of the real world with meaningful information from a computer world. Our 128 x 128 image sensor achieves high-sensitivity ID beacon detection of less than -10.0 dB over a wide 40 dB range of background intensity. Moreover, it can obtain 20 bytes of ID information per frame using a 40 kHz carrier at 30 fps. It makes it possible to obtain a scene image together with the locations, IDs and additional information of multiple target objects simultaneously in real time. This advanced performance is a step toward a practical AR system.
Hitachi Zosen has developed a low-voltage-driven field emitter using the brush-type carbon nanotubes that it had already developed. Only 30 V was enough to start electron emission. This paper describes the field emission and luminescence performance.
Sales of CRT-type projection TVs are still growing in the North American market. The market particularly demands power saving and low cost, since these are the points of difference from PDP TVs. We developed a new deflection yoke (DY) for projection TVs, using the new development methods "TRIZ" and "Robust Engineering". The new deflection yoke achieves low power consumption and low crosstalk from the horizontal deflection coil.
We have developed a smart card that enables access control for digital broadcasting systems using receivers with storage devices. The developed smart card can control both broadcast content and stored content, and allows the security software in the card to be renewed.
A new broadcasting service for mobile receivers using the terrestrial broadcast radio band will be launched. The reception conditions, however, will not always be stable enough for recording. Assuming that the reception conditions at home are stable, we developed a system that uses two receivers, a mobile one and one at home, connected to each other.
In order to realize the "easy DVD copy" function for HDD/DVD hybrid recorders, we developed a new software multiplex module. The module is included in the recorder application software, which serves as the user I/F and system control program. We achieved a maximum HDD-to-DVD copying speed of 18 times the real-time playback speed.
We have developed a frontend module for 1-segment receivers of ISDB-T/ISDB-Tsb, which includes an RF downconverter and a demodulator. By employing a low-IF architecture, the module achieves an ultra-small size and the lowest power consumption among DTV frontends. It will make digital TV on mobile devices a reality.
We propose a speech dialogue interface for TV as one of the most effective solutions for making operation easier and helping users cope with the complicated digital-TV operating environment. To clarify the relationships between viewers' intentions and utterances in TV operation, we conducted Wizard of Oz experiments on program selection tasks targeting elderly users. In this paper, we describe the characteristic operating strategies and utterances obtained from the experiments.
With the progress of digital broadcasting, the complexity of TV operation has attracted attention. To make operation easier, it is necessary to support users by inferring their intentions. In this paper, we regard zapping as being related to these intentions and describe the relationship between program viewing duration and the outcome of the selection.
When teaching motions of the human body, distance-learning systems focus not only on how to exercise but also on how to explain the meaning of the motion. We therefore propose a new method of presenting human motion as an interface for distance-learning systems.
In recent years, computers have become simpler and more convenient to use. In this research, we study a system in which a computer understands the action a human intends to perform by processing images from a monocular camera. In particular, its evaluation for controlling a remote communication robot was considered.
This paper analyzes the timing of eyeblink during visual identification of katakana characters on a display, which were presented under the constraint of a restricted observation window (R.O.W.). Blinks frequently occurred when the subject slowly brought the R.O.W. near a feature point (e.g., terminal point, crossing point).
For display units, visual information is the most critical information compared with other information such as vestibular and haptic information. In order to check the validity of disregarding the other information, we presented vestibular and visual information independently to human subjects. We found that presenting independently controlled vestibular and visual information under the same conditions lowered the subjects' threshold of rotational perception. Visual information was dominant over vestibular information when the two were contradictory. Motion perception and rotation direction perception are observed to be independent in vestibular information, because the direction can be undetectable in rotational perception even when the movement is detectable.
We have proposed a display method that makes on-board displays easily visible to aged people. In an automobile operating environment, visual acuity is greatly influenced by the viewing distance to the display screen, the reading time, the time required for far-near focal accommodation, and the operating load. In this paper, as a display method that takes into account the decline of visual function in aged drivers, a correction value according to age is calculated from an experiment, and a method of determining the recommended character display size using the correction value is proposed. From the experiment, we obtained the recommended character sizes that young and aged people can read, and evaluated the minimum readable character size. The experimental results showed that aged drivers need characters enlarged 1.6 times under driving conditions.
We have evaluated and confirmed the effectiveness of this method for the aged. Until now, vehicle warnings have mainly been warning sounds that signal an emergency according to, for example, the distance between two cars. In this research, we propose a new warning method that takes the driver's arousal level into consideration in addition to the degree of vehicle emergency. We also aimed at a warning that is comprehensible to aged people, and carried out an evaluation experiment. Aged drivers experience a decline in physical strength, eyesight and hearing with aging. The experiment was therefore carried out separately for young and aged subjects to verify the effectiveness of this warning presentation method for aged people. The method was evaluated both objectively and subjectively, and its effectiveness was verified. In the objective evaluation, the reaction time from the presentation of the warning until the subject noticed it and clicked the mouse was measured. In the subjective evaluation, the degree of danger felt with the conventional warning was taken as the standard, and the new warning was rated against it on a five-point scale. Here, the effectiveness of a warning for normal arousal and a warning for reduced arousal was verified with the subjects' arousal reduced. The subjects' arousal level was self-reported. The results of these evaluation experiments confirmed, for both young and aged people, the effectiveness of changing the warning according to the state of arousal.
In driving support systems in ITS, the human interface between the driver and the system is growing in importance. To realize a human interface friendly to drivers, it is essential to detect the driver's state of consciousness or attention. The purpose of this study is to develop a capturing system that captures the driver's facial image, and a method of measuring blinks by motion picture processing in order to detect degradation of consciousness. For in-car use, a method robust against the wide change in illumination from daytime to nighttime is essential. To solve this issue, weak pulsed infrared light is synchronized with image capture by a CCD camera. In the blink measurement, the facial and eye areas are extracted from the obtained facial image. After the upper and lower eyelids are detected in the eye areas, blinking is measured from the temporal change of the upper and lower eyelids. To detect the eyelids, the captured image is sliced into vertical sections and candidate points of the upper and lower eyelids are detected in each section. With this method, the candidates for the upper and lower eyelids can be detected reliably irrespective of the shadow of the upper eyelid. We tested the usability of the capturing system in an in-car experiment. Under the wide illumination change from daytime (2,700 to 32,000 lx) to nighttime (2 to 12 lx), facial images were taken with a CCD camera while driving at 40 km/h. The obtained facial images were verified by visual evaluation and by histogram. The experimental results indicate that good facial images were obtained in 30 scenes with different illumination. In the blink measurement experiment, the extraction rate of this method was 96% for five subjects, and over 80% for subjects wearing sunglasses.
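The final step described above, measuring blinks from the temporal change of the eyelids, can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' method: it takes a per-frame eyelid opening (the distance between the detected upper and lower eyelids, in pixels) as already available, and treats a frame as "closed" when the opening falls below a hypothetical fraction `closed_ratio` of the median opening.

```python
from statistics import median

def detect_blinks(openings, closed_ratio=0.5):
    """Find blinks in a sequence of per-frame eyelid openings.

    Returns a list of (start_frame, duration_in_frames) tuples. The
    threshold rule (closed_ratio times the median opening) is an
    assumption made for this sketch.
    """
    threshold = closed_ratio * median(openings)
    blinks = []
    start = None
    for i, opening in enumerate(openings):
        if opening < threshold and start is None:
            start = i                      # eye has just closed
        elif opening >= threshold and start is not None:
            blinks.append((start, i - start))  # eye has reopened
            start = None
    if start is not None:                  # sequence ended while closed
        blinks.append((start, len(openings) - start))
    return blinks
```

The duration of each detected closure is the kind of quantity from which a lengthening of eye closure over time could then be tracked.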
Driving support systems are among the ITS systems that attract the most interest. Examples include systems that detect the distance to vehicles ahead and lane departure detection systems. These systems give warning information depending on the urgency level. To realize driving support systems with an excellent human interface, it is essential to change the method or timing of giving warning information depending on the driver's status. A driver status monitor detects degradation of the driver's consciousness or attention during driving; the conditions to be detected include drowsiness, inattention, excessive concentration through cellular phone use, intoxication by alcohol or drugs, fatigue, and other factors. The developed system detects drowsiness and inattention. In this driver status monitor system, the method or timing of offering information to the driver is changed according to the level of the driver's consciousness or attention, and the medium or method used to offer the information is changed according to the assent or urgency level of the information. The purpose of this study is to realize a system that wins the driver's confidence in these ways. The driver status monitor detects drowsiness from changes in the duration of eye closure during blinking and inattention from changes in the gaze direction. This paper describes the gaze direction detection technique that has been developed to detect the driver's inattention during driving. The eye areas are extracted and tracked in the facial images captured by a camera installed in the rear-view mirror. The facial direction and the gaze direction are detected separately from the extracted eye areas. Detection of facial direction: the direction and angle of the face are calculated from the distances between the centers of gravity of the two eyes and the center point of the nose.
Detection of gaze direction: the positions of the pupil and the inner corner of the eye are detected in the extracted image of the eye area, and the gaze direction is then calculated from the relative distance between the two positions. Verification experiments show that this method is applicable to inattention detection. Many studies on gaze direction detection have been performed, mostly as part of human interface studies or visibility evaluations of displays, but few such studies address inattention detection. This paper describes the above in detail.
Until now, halftoning by the ordered dither method has mostly used a matrix size of n=8. However, there is no objective method for evaluating it. The purpose of this paper is therefore to investigate an evaluation of the gray level L and matrix size n of the ordered dither matrix using the correlation coefficient and the measure of approximation.
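As background on what the matrix size n controls, here is a minimal sketch of ordered dithering. The abstract does not say which dither matrix is used; the Bayer matrix below is one common choice and is an assumption of this sketch, as are the function names and the 0 to 255 input range.

```python
def bayer_matrix(n):
    """Return an n x n Bayer index matrix (n must be a power of two).

    Built by the usual recursion: each step replaces the matrix M with
    the blocks [[4M, 4M+2], [4M+3, 4M+1]].
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    m = [[0]]
    size = 1
    while size < n:
        top = [[4 * v for v in row] + [4 * v + 2 for v in row] for row in m]
        bottom = [[4 * v + 3 for v in row] + [4 * v + 1 for v in row] for row in m]
        m = top + bottom
        size *= 2
    return m

def ordered_dither(image, n=8, max_level=255):
    """Binarize a grayscale image (list of rows) by ordered dithering.

    Each pixel is compared with a position-dependent threshold taken
    from the tiled n x n matrix; the output pixels are 0 or 1.
    """
    m = bayer_matrix(n)
    levels = n * n
    return [[1 if pixel / max_level > (m[y % n][x % n] + 0.5) / levels else 0
             for x, pixel in enumerate(row)]
            for y, row in enumerate(image)]
```

An n x n matrix has n*n distinct thresholds, so a uniform mid-gray patch dithered with n=8 turns on about half of the 64 cells in each tile.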
It has been shown that the measure of approximation agrees with the subjective evaluation method MOS for low-frequency images. The purpose of this paper is to compare, using high-frequency images, the validity for the modulated methods of the conventional objective evaluation method PSNR and of GSNR, which we have proposed.
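For reference, the conventional PSNR mentioned above can be computed as in the sketch below, using the standard definition with a peak value of 255 for 8-bit images. GSNR is the authors' own measure and is not defined in the abstract, so it is not sketched here. The function name and the list-of-rows image representation are illustrative choices.

```python
import math

def psnr(reference, test, peak=255):
    """Peak signal-to-noise ratio in dB between two same-sized images.

    Images are lists of rows of gray levels. Returns infinity when the
    images are identical (zero mean squared error).
    """
    count = 0
    squared_error = 0.0
    for ref_row, test_row in zip(reference, test):
        for r, t in zip(ref_row, test_row):
            squared_error += (r - t) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / mse)
```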
In this paper, we present a system that provides synchronized speech guides for movies. Since there is no space on movie film to store additional information, we use a personal computer to play speech guides, which are divided into short waveforms, in synchronization with the film movement.
This paper discusses recovery processing that makes it possible to transmit high-quality real-time video. As an information recovery technique, this paper proposes a method based on a new approach to motion vector interpolation and on the characteristics of intra coding. Simulation results show that this method keeps video quality degradation to a minimum.
This paper proposes a novel method for reducing reproduction errors of JPEG2000 compressed images. The method optimizes only the lifting coefficients used in the decoding process and attaches them to the compressed data as side information. Simulation results indicate that it can improve the SNR of reproduced images by about 0.2 dB.
We previously proposed a coding scheme based on iteration of motion compensation and matching pursuit for monochrome video. This paper describes an extension of the scheme to color video coding. A cost function that simultaneously considers the rate-distortion of both luminance and chrominance signals allows appropriate bit allocation between them. As a result, the proposed scheme attains better coding performance than H.263.
This paper describes experiments on the coding performance of H.264/AVC aimed at optimizing the combination of coding tools. The results show that the rate-distortion optimization (RDO) of the reference software JM is reasonably effective for any image format size. We also analyze the occurrence of the block modes that are related to the high coding efficiency of H.264/AVC with the RDO procedure.
The various block sizes used in H.264/AVC motion compensation incur heavy computation, although compression efficiency is high. We tested several restrictions on block sizes to compare the bit-rate saving with the computation time, and found the small block partitions of H.264 to be efficient.
A video store-and-forward system, "VAST-ip", is evaluated over an IPv6 network between the USA and Japan. The evaluation shows that VAST-ip achieves 1) high throughput efficiency over a long-distance link, 2) peaceful coexistence with UDP streaming, and 3) peer-to-peer remote video sharing.
A new non-real-time modulation method called Time Base Modulation (TBM) has been studied by computer simulation using an FFT. It has been confirmed that the frequency spectrum of a TBM signal can be approximated by a Bessel function with parameters different from those of an FM signal.
We propose a remote monitoring system that uses PLC (Power Line Communication) networks for sending and receiving data. PLC is based on power lines. Since power lines are now available almost anywhere, a PLC network can easily be built anywhere, which makes it easy and useful to build a remote security monitoring system on PLC networks. However, the transmission speed of PLC is currently very slow because of the limits of what is permitted in Japan, and also because transmission is often interrupted by noise from household appliances. Thus large data such as a raw image cannot be transmitted in real time. To overcome this speed problem, the proposed remote monitoring system first analyzes an input image to detect a possible intruder and then determines its size and location in the observed scene. Only the resulting information is transmitted via the PLC network. On the receiving side, the location and size of the intruder are reproduced in a 3D space from the received information. In this way, we reduce the amount of transmitted data enough to realize a remote monitoring system over such a slow PLC network. More precisely, we analyze the input image by motion detection. Location detection is performed by motion detection using two cameras, and the size is also calculated. Experimental results suggest that the proposed system is promising for remote monitoring.
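The core idea, sending a few bytes that describe the intruder instead of the raw image, can be sketched as follows. This is a single-camera simplification under stated assumptions, not the proposed system: motion is detected by plain frame differencing with a hypothetical `threshold`, the location and size are reduced to a 2D bounding box, and the 8-byte payload layout is invented for illustration. The two-camera location step and the 3D reproduction are not modeled.

```python
import struct

def motion_box(prev, curr, threshold=30):
    """Bounding box (x, y, width, height) of pixels that changed between
    two grayscale frames (lists of rows), or None if nothing moved."""
    xs, ys = [], []
    for y, (row_prev, row_curr) in enumerate(zip(prev, curr)):
        for x, (p, c) in enumerate(zip(row_prev, row_curr)):
            if abs(c - p) > threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    x0, y0 = min(xs), min(ys)
    return (x0, y0, max(xs) - x0 + 1, max(ys) - y0 + 1)

def encode_box(box):
    """Pack the box into 8 bytes (four little-endian unsigned 16-bit ints)."""
    return struct.pack("<4H", *box)

def decode_box(payload):
    """Inverse of encode_box, as run on the receiving side."""
    return struct.unpack("<4H", payload)
```

For comparison, a raw 640 x 480 8-bit frame is 307,200 bytes, whereas the payload here is 8 bytes per detection, which illustrates why a slow link can suffice.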
In recent years, opportunities for individuals to perform image processing have been increasing with the spread of computers and digital cameras and the appearance of camera-equipped cellular phones. Filters are a representative example of image processing, but a huge number of image processing filters exist, with as many functions. This increases troublesome work such as selecting and programming filters when building them into an application. Moreover, the names and effects of filters are hard to understand. This research aims to construct a system that can reproduce multiple filter functions and does not require the user to have any knowledge of filters. The technique requires the following three input images from the user. The first is the image that the user wants to process (Image C). The second is an image to which the desired processing has already been applied (Image B). The third is the original image (Image A) of Image B. The system finds, for every pixel of Image C, the corresponding point in Image A; determines the target RGB values of the output image (Image D) so that the RGB values of Image C and the target RGB values of Image D have the same relation as the RGB values of Image A and the RGB values of the pixel of Image B at the same coordinates as the corresponding point; and generates Image D.
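The A/B/C/D procedure can be illustrated with the small sketch below. The abstract does not specify how corresponding points are found or what "the same relation" means, so two assumptions are made here and should not be read as the authors' method: the corresponding point is the pixel of Image A whose RGB value is closest to the Image C pixel, and the relation is an additive per-channel difference (D = C + (B - A), clamped to 0 to 255). The function names are illustrative, and the brute-force search is only suitable for tiny images.

```python
def clamp(value):
    """Keep a channel value inside the 8-bit range."""
    return max(0, min(255, value))

def transfer_filter(image_a, image_b, image_c):
    """Apply to Image C the change that turned Image A into Image B.

    Images are lists of rows of (R, G, B) tuples; A and B must have the
    same size. Returns Image D with the same size as Image C.
    """
    # Pair every pixel of A with the pixel of B at the same coordinates.
    pairs = [(a, b)
             for row_a, row_b in zip(image_a, image_b)
             for a, b in zip(row_a, row_b)]
    image_d = []
    for row_c in image_c:
        row_d = []
        for c in row_c:
            # Assumed correspondence: nearest colour in A (squared RGB distance).
            a, b = min(pairs,
                       key=lambda ab: sum((x - y) ** 2 for x, y in zip(ab[0], c)))
            # Assumed relation: add the A-to-B difference to the C pixel.
            row_d.append(tuple(clamp(cv + (bv - av))
                               for cv, av, bv in zip(c, a, b)))
        image_d.append(row_d)
    return image_d
```

For example, if the A-to-B example brightens dark pixels by 20 and darkens bright pixels by 20, a dark pixel of Image C is brightened by 20 and a bright one is darkened by 20 in Image D.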