We propose a method for extracting bigram knowledge from GPT-2 models. Based on the observation that the first layer of GPT-2 is useful for predicting the token that follows a given input token, we propose an algorithm that uses the self-attention heads of the first layer alone to predict next tokens. We also propose an algorithm that finds contextual words highly related to a given bigram by backpropagating through the GPT-2 parameters for the next-token prediction. Experimental results showed that our algorithms for predicting next words and for inducing context words achieved higher average precision than the baseline methods.
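A minimal sketch of the first idea, assuming the Hugging Face transformers GPT-2 implementation, might look as follows; running only the first transformer block before the output projection is our simplification, not necessarily the authors' exact pipeline.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Sketch: predict next-token (bigram) candidates from the first
# transformer block only. The layer-norm and positional handling
# here are assumptions, not the paper's exact method.
tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

ids = tok("machine", return_tensors="pt").input_ids           # encode an input word
with torch.no_grad():
    h = model.transformer.wte(ids) \
        + model.transformer.wpe(torch.arange(ids.size(1)))    # token + position embeddings
    h = model.transformer.h[0](h)[0]                          # first self-attention block only
    logits = model.lm_head(model.transformer.ln_f(h))         # project to the vocabulary
top = torch.topk(logits[0, -1], k=5).indices
print([tok.decode(t) for t in top])                           # candidate bigram continuations
```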
This paper aims to enhance the efficiency of participatory workshops (WS) in municipalities by proposing a hybrid WS support framework that combines Human-in-the-Loop (HITL) and Machine-in-the-Loop (MITL) approaches utilizing generative AI. In the HITL process, generative AI plays the role of workshop participants, with a human facilitator collaborating with the AI to achieve rapid and comprehensive problem identification and organization. In contrast, the MITL process uses outputs generated by the AI as support for discussions among human participants. This hybrid approach ensures that both human expertise and AI capabilities are optimally utilized. By strategically applying these processes across different WS phases, it becomes possible to progress efficiently through the WS with minimal information loss and achieve the desired outputs. Specifically, in the HITL process, we present a novel methodology using facilitation-based prompts, providing concrete guidance for WS designers. The proposed framework and HITL methodology were applied in actual municipalities, resulting in the successful extraction and organization of problems within a short timeframe and ultimately achieving the objectives of the overall WS process. The application showed that the framework and methodology can significantly reduce the time and resources required for effective WS execution. The findings of this study offer a new perspective on WS design and operation, supporting more efficient and effective policy making. Future challenges include expanding the application scope of the framework and methodology to other WS phases and analytical techniques, and exploring their applicability to other domains. This will enable more organizations to leverage generative AI for effective decision-making.
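As a purely hypothetical illustration of what a facilitation-based prompt might look like (the wording below is our own assumption, not the prompt used in the study):

```python
# Hypothetical facilitation-based prompt for the HITL process, in which
# the generative AI stands in for workshop participants. The template
# text and function name are illustrative assumptions only.
FACILITATION_PROMPT = """You are a participant in a municipal workshop on {theme}.
Step 1: List concrete problems residents face, one per line.
Step 2: Group related problems and give each group a short label.
Step 3: For each group, explain briefly why it matters to residents."""

def build_prompt(theme: str) -> str:
    # The facilitator fills in the workshop theme and sends the prompt
    # to a generative AI; the output seeds problem identification.
    return FACILITATION_PROMPT.format(theme=theme)

print(build_prompt("improving local public transport"))
```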
The purpose of this study is to clarify the overall framework of inquiry-based learning (IBL) in Japanese high schools through knowledge modeling using an ontology. Recently, IBL, which aims at solving problems, has become popular in Japanese high school education. However, because there are no teachers specializing in IBL and no textbooks, teaching resources are lacking. One solution to this problem is to conduct knowledge modeling as an alternative to textbooks. This study proposes the Problem-Plan-Data-Analysis-Conclusion cycle (PPDAC cycle) ontology for use in teaching IBL. The PPDAC cycle is one of the statistical problem-solving processes. As a result of this study, we were able to construct a knowledge model that clarifies the overall framework of IBL. We verified that it can contribute to improving the instructional skills of teachers inexperienced in teaching IBL. A future prospect is to develop an instructional advisory support system for inquiry-based learning; this paper reports the initial stage of system development.
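As a toy illustration of this kind of knowledge modeling, a PPDAC-cycle skeleton could be expressed with the owlready2 library as follows; the class and property names are our own assumptions, not the authors' actual ontology.

```python
from owlready2 import get_ontology, Thing, ObjectProperty

# Illustrative PPDAC-cycle skeleton; names and structure are assumptions.
onto = get_ontology("http://example.org/ppdac.owl")

with onto:
    class Phase(Thing): pass
    class Problem(Phase): pass
    class Plan(Phase): pass
    class Data(Phase): pass
    class Analysis(Phase): pass
    class Conclusion(Phase): pass

    class precedes(ObjectProperty):   # ordering of phases within the cycle
        domain = [Phase]
        range = [Phase]

# Chain the five phases into one PPDAC cycle.
p, pl, d, a, c = Problem("p1"), Plan("pl1"), Data("d1"), Analysis("a1"), Conclusion("c1")
p.precedes = [pl]; pl.precedes = [d]; d.precedes = [a]; a.precedes = [c]; c.precedes = [p]
onto.save(file="ppdac.owl")
```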
Multimodal learning is generally expected to make more accurate predictions than text-only analysis. Although various methods for fusing multimodal inputs have been proposed for sentiment analysis tasks, we found that fusion methods based on attention-based language models may be inhibited from learning non-verbal modalities: the non-verbal modalities are isolated from linguistic semantics and contexts and do not include them, which makes them unsuitable targets for attention from the text modality during the fusion phase. To address this issue, we propose Word-Aware Modality Stimulation Fusion (WA-MSF) for facilitating the integration of non-verbal modalities with the text modality. The Modality Stimulation Unit layer (MSU-layer), the core concept of WA-MSF, integrates linguistic contexts and semantics into non-verbal modalities, thereby instilling linguistic essence into these modalities. Moreover, WA-MSF uses an MLP in the fusion phase in order to utilize the spatial and temporal representations of non-verbal modalities more effectively than transformer-based fusion. In our experiments, WA-MSF set a new state-of-the-art level of performance on sentiment prediction tasks.
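As a rough sketch, one plausible reading of an MSU-style layer is cross-attention that writes text context into a non-verbal stream, followed by MLP fusion over pooled modality vectors; the structure and sizes below are our assumptions, not the published architecture.

```python
import torch
import torch.nn as nn

# Illustrative sketch only: injecting text context into a non-verbal
# modality via cross-attention, then fusing pooled vectors with an MLP.
class ModalityStimulationUnit(nn.Module):
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, nonverbal, text):
        # Queries come from the non-verbal stream; keys/values from text,
        # so linguistic context is written into the non-verbal features.
        stimulated, _ = self.cross_attn(nonverbal, text, text)
        return self.norm(nonverbal + stimulated)

class MLPFusion(nn.Module):
    def __init__(self, dim: int, n_modalities: int = 3):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(dim * n_modalities, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, *pooled):                  # one pooled vector per modality
        return self.mlp(torch.cat(pooled, dim=-1))  # sentiment score
```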
Ray Kurzweil argues that by 2045 the Singularity will arrive, at which point the computing power of computers will surpass that of humanity. His claim is based on Moore's Law, which states that technological advancements increase chip density, thereby shortening signal transmission distances and increasing transmission speeds, ultimately resulting in faster computations. However, this argument does not account for the programs written on these chips: it assumes that if program size remains fixed while chip density increases, computations will automatically become faster. Now, suppose the physical size of the chip remains unchanged. As density increases, the size of the programs the chip can store increases. Furthermore, a program that is N times larger can handle more than N times the amount of information. If computations rely solely on local communication on the chip, the speed increases by a factor of √N, while the computational capacity becomes N times larger. If computations require global communication, on the other hand, the speed remains unchanged, in line with Kurzweil's prediction. In the case of vector computations, such as those used in GPUs for Transformers operating on SIMD (Single Instruction, Multiple Data) architectures, computation is localized. This implies that not only does the speed increase, but the complexity of the program must also increase. The central argument of this paper is that the overall improvement in speed would not be merely exponential, as Kurzweil claims, but rather an exponential of an exponential. Although we do not explore this further, our argument should apply to a broader range of parallel computing architectures.
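One way to formalize this scaling argument in symbols (our own reading, with the superlinearity of capability in program size made explicit as an exponential, which is an assumption):

```latex
\begin{align*}
d(t) &= 2^{t/\tau} && \text{chip density under Moore's law}\\
N(t) &\propto d(t) && \text{storable program size}\\
v(t) &\propto \sqrt{d(t)} = 2^{t/(2\tau)} && \text{speed under local communication}\\
C(t) &\propto e^{N(t)} = e^{2^{t/\tau}} && \text{capability, if superlinear (here exponential) in } N
\end{align*}
```

Under these assumptions, capability grows as an exponential of an exponential in time, which is consistent with the paper's central claim.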
Nowadays, generative AI has had a considerable influence on learning and education and has found application in various contexts. However, it remains unclear how it should be utilized to promote learning. In this paper, we address the issue of whether and how generative AI could be used when investigating a question by exploring information obtained from the Web to construct knowledge in a wider and deeper way. In our previous research, we modeled the investigative learning process with Web resources and developed a cognitive tool called the interactive Learning Scenario Builder (iLSB), which provides scaffolds for conducting the Web-based investigative learning process as modeled. In this paper, we describe case studies in which the use of generative AI was examined in comparison with the use of iLSB in the following three contexts: (1) learning through communicating with generative AI before knowing the way to learn represented by the Web-based investigative learning model, (2) learning through communicating with generative AI after knowing the model, and (3) learning by means of iLSB integrated with generative AI. The results of the studies demonstrated that, in comparison to iLSB, generative AI was less effective in helping learners acquire information related to the question. Furthermore, it was observed that learners were able to acquire a greater amount of information with generative AI after knowing the model. Additionally, the results suggest that generative AI could effectively assist in acquiring background knowledge about the question to be investigated before exploring Web resources and in organizing information to construct knowledge.
We propose a new method to evaluate machine learning models, parameters, and algorithms. The method takes into account the human labor costs of verification and correction, and it is based on machine learning systems with reject models. Reject models are used to handle the errors that machine learning systems inevitably make: a portion of the system's outputs is rejected, so that the desired accuracy is attained, and errors are thereby reduced, on the remaining accepted results. On the other hand, the rejected results must be verified and corrected by humans, so verification and correction costs should be considered when developing the evaluation method. In addition, reject models are sensitive to the thresholds that determine rejection, so the evaluation method must handle varying thresholds. Hence, a method for evaluating such machine learning systems should have two features: (1) handling varying thresholds and (2) accounting for verification and correction costs. Conventional methods such as the ROC curve and the PR curve can handle varying thresholds; however, they cannot account for verification and correction costs. In this paper, we first define a performance measure for evaluating machine learning based on verification and correction costs. Second, we propose the ARAC curve (Acceptance Rate-Accuracy after Correction curve) and the ARAC-AUC (Area Under the Curve), in which the horizontal axis shows the acceptance rate and the vertical axis shows the accuracy after correction. The ARAC curve can handle varying thresholds just as the conventional methods can. Third, we explain the relationship between the performance measure and the ARAC curve: the horizontal axis is closely related to the verification costs, and the vertical axis is closely related to the correction costs, so the ARAC curve can express both kinds of cost. Finally, we show experimental results in which the proposed ARAC curve and ARAC-AUC express the performance measure better than the conventional methods.
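The following sketch illustrates one plausible reading of the ARAC construction (our assumptions: accepted outputs keep the model's correctness, while rejected outputs are verified and corrected by humans and are therefore counted as correct):

```python
import numpy as np

# Illustrative ARAC-style curve, not the authors' exact definition:
# predictions below a confidence threshold are rejected and assumed
# to be fixed by human verification/correction.
def arac_curve(conf, correct):
    """conf: model confidence per sample; correct: 1 if the prediction is right."""
    order = np.argsort(-conf)                  # accept the most confident first
    correct = correct[order].astype(float)
    n = len(conf)
    k = np.arange(1, n + 1)                    # number of accepted samples
    acc_rate = k / n                           # horizontal axis: acceptance rate
    # Accepted samples keep the model's accuracy; the n-k rejected
    # samples are corrected by humans and counted as correct.
    acc_after = (np.cumsum(correct) + (n - k)) / n
    return acc_rate, acc_after

def arac_auc(acc_rate, acc_after):
    return np.trapz(acc_after, acc_rate)       # area under the ARAC curve

rng = np.random.default_rng(0)
conf = rng.random(1000)
correct = (rng.random(1000) < conf).astype(int)  # confidence loosely tracks truth
x, y = arac_curve(conf, correct)
print("ARAC-AUC:", arac_auc(x, y))
```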