We propose a novel driving policy for self-driving vehicles to reduce traffic jams. In previous research, driving policies were designed empirically for a given traffic situation, so they had to be reconfigured for every new situation and every change in traffic. In contrast, our driving policy is learned by a learner agent through reinforcement learning, using data collected from self-driving vehicles in simulation. The learned policy is relayed to the self-driving vehicles, which then drive according to it. We conducted traffic flow simulations with manually driven and self-driving vehicles in several scenarios in which two key parameters, vehicle density and self-driving vehicle penetration rate, were assigned different values. Our findings show that the driving policy reduces traffic jams under conditions such as (1) a vehicle density of 42 vehicles/km with self-driving vehicles making up 10% of total traffic, and (2) a vehicle density of 50 vehicles/km with self-driving vehicles making up 70% of total traffic (at which point traffic flow is nearly optimized). In addition, we found that intervehicle communication among self-driving vehicles provides real-time information that reduces traffic jams even more effectively.