Tree Search Based End-to-End Soft Q-Learning in Reinforcement Learning 


Vol. 48, No. 1, pp. 81-84, Jan. 2023
DOI: 10.7840/kics.2023.48.1.81


  Abstract

In sparse-reward reinforcement learning, exploration driven by conventional temporal-difference (TD) learning struggles on two fronts: it rarely discovers state-action pairs that yield non-zero reward, and even when the agent does find a rewarding state-action pair, propagating that reward through the value function is slow. In this paper, we propose Boltzmann guided sparse sampling (BGSS), an end-to-end reinforcement learning method that efficiently learns to find high-return trajectories in sparse-reward systems. Our experiments demonstrate that BGSS learns faster than both tree-search-based methods and TD learning.
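The abstract does not spell out the BGSS algorithm itself, but a minimal sketch can illustrate the soft Q-learning idea it builds on: in soft Q-learning the policy follows a Boltzmann distribution over Q-values, pi(a|s) proportional to exp(Q(s,a)/alpha), and a sparse tree search can use that distribution to expand only a few sampled actions per node rather than all of them. The Python snippet below is a hypothetical illustration of that sampling step only; all names and parameters are assumptions for illustration, not the authors' code.

import numpy as np

def boltzmann_probs(q_values, alpha=1.0):
    # Softmax over Q-values with temperature alpha (numerically stabilized).
    logits = np.asarray(q_values, dtype=float) / alpha
    logits -= logits.max()  # shift to avoid overflow in exp
    p = np.exp(logits)
    return p / p.sum()

def sample_actions(q_values, k, alpha=1.0, rng=None):
    # Sample k distinct candidate actions from the Boltzmann distribution,
    # instead of expanding every action at a tree node (sparse sampling).
    if rng is None:
        rng = np.random.default_rng()
    p = boltzmann_probs(q_values, alpha)
    k = min(k, len(p))
    return rng.choice(len(p), size=k, replace=False, p=p)

# Example: six actions, only two expanded per node; a lower alpha
# concentrates sampling on high-Q actions.
q = [0.1, 2.0, -0.5, 1.5, 0.0, 0.3]
print(sample_actions(q, k=2, alpha=0.5))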



  Cite this article

[IEEE Style]

S. Han, T. Cho, H. Han, H. Lee, H. Kim, J. Lee, "Tree Search Based End-to-End Soft Q-Learning in Reinforcement Learning," The Journal of Korean Institute of Communications and Information Sciences, vol. 48, no. 1, pp. 81-84, 2023. DOI: 10.7840/kics.2023.48.1.81.

[ACM Style]

Seungyub Han, Taehyun Cho, Hyeonggeun Han, Hefesoo Lee, Hyungjin Kim, and Jungwoo Lee. 2023. Tree Search Based End-to-End Soft Q-Learning in Reinforcement Learning. The Journal of Korean Institute of Communications and Information Sciences 48, 1 (2023), 81-84. DOI: 10.7840/kics.2023.48.1.81.

[KICS Style]

Seungyub Han, Taehyun Cho, Hyeonggeun Han, Hefesoo Lee, Hyungjin Kim, Jungwoo Lee, "Tree Search Based End-to-End Soft Q-Learning in Reinforcement Learning," The Journal of Korean Institute of Communications and Information Sciences, vol. 48, no. 1, pp. 81-84, 1. 2023. (https://doi.org/10.7840/kics.2023.48.1.81)