LLMs augmented hierarchical reinforcement learning with action primitives for long-horizon manipulation tasks

Yun, W. J., Mohaisen, D., Jung, S., Kim, J.-K. & Kim, J. Hierarchical reinforcement learning using gaussian random trajectory generation in autonomous furniture assembly. In Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 3624–3633 (2022).

Levine, S., Finn, C., Darrell, T. & Abbeel, P. End-to-end training of deep visuomotor policies. The J. Mach. Learn. Res. 17, 1334–1373 (2016).

MathSciNet

Google Scholar

Schulman, J., Moritz, P., Levine, S., Jordan, M. & Abbeel, P. High-dimensional continuous control using generalized advantage estimation. arXiv preprint arXiv:1506.02438 (2015).

Jiang, Y., Gu, S. S., Murphy, K. P. & Finn, C. Language as an abstraction for hierarchical deep reinforcement learning. Adv. Neural Inf. Process. Syst. 32, (2019).

Nasiriany, S., Liu, H. & Zhu, Y. Augmenting reinforcement learning with behavior primitives for diverse manipulation tasks. In 2022 International Conference on Robotics and Automation (ICRA), 7477–7484 (IEEE, 2022).

Bacon, P.-L., Harb, J. & Precup, D. The option-critic architecture. In Proceedings of the AAAI conference on artificial intelligence, vol. 31 (2017).

Eysenbach, B., Salakhutdinov, R. R. & Levine, S. Search on the replay buffer: Bridging planning and reinforcement learning. Adv. Neural Inf. Process. Syst. 32, (2019).

Nachum, O., Gu, S. S., Lee, H. & Levine, S. Data-efficient hierarchical reinforcement learning. Adv. Neural Inf. Process. Syst. 31, (2018).

Co-Reyes, J. et al. Self-consistent trajectory autoencoder: Hierarchical reinforcement learning with trajectory embeddings. In International conference on machine learning, 1009–1018 (PMLR, 2018).

Andrychowicz, O. M. et al. Learning dexterous in-hand manipulation. The Int. J. Robotics Res. 39, 3–20 (2020).

Article

Google Scholar

Kalashnikov, D. et al. Qt-opt: Scalable deep reinforcement learning for vision-based robotic manipulation. arXiv preprint arXiv:1806.10293 (2018).

Dalal, M., Pathak, D. & Salakhutdinov, R. R. Accelerating robotic reinforcement learning via parameterized action primitives. Adv. Neural Inf. Process. Syst. 34, 21847–21859 (2021).

Google Scholar

Wang, H., Zhang, H., Li, L., Kan, Z. & Song, Y. Task-driven reinforcement learning with action primitives for long-horizon manipulation skills. IEEE Transactions on Cybern. (2023).

Huang, W. et al. Inner monologue: Embodied reasoning through planning with language models. arXiv preprint arXiv:2207.05608 (2022).

Frans, K., Ho, J., Chen, X., Abbeel, P. & Schulman, J. Meta learning shared hierarchies. arXiv preprint arXiv:1710.09767 (2017).

Li, A. C., Florensa, C., Clavera, I. & Abbeel, P. Sub-policy adaptation for hierarchical reinforcement learning. arXiv preprint arXiv:1906.05862 (2019).

Allshire, A. et al. Laser: Learning a latent action space for efficient reinforcement learning. In 2021 IEEE International Conference on Robotics and Automation (ICRA), 6650–6656 (IEEE, 2021).

Hausman, K., Springenberg, J. T., Wang, Z., Heess, N. & Riedmiller, M. Learning an embedding space for transferable robot skills. In International Conference on Learning Representations (2018).

Sharma, A., Gu, S., Levine, S., Kumar, V. & Hausman, K. Dynamics-aware unsupervised discovery of skills. arXiv preprint arXiv:1907.01657 (2019).

Xie, K., Bharadhwaj, H., Hafner, D., Garg, A. & Shkurti, F. Latent skill planning for exploration and transfer. arXiv preprint arXiv:2011.13897 (2020).

Lynch, C. et al. Learning latent plans from play. In Conference on robot learning, 1113–1132 (PMLR, 2020).

Shankar, T., Tulsiani, S., Pinto, L. & Gupta, A. Discovering motor programs by recomposing demonstrations. In International Conference on Learning Representations (2019).

Brown, T. et al. Language models are few-shot learners. Adv. Neural Inf. Process. Syst. 33, 1877–1901 (2020).

Google Scholar

Huang, W., Abbeel, P., Pathak, D. & Mordatch, I. Language models as zero-shot planners: Extracting actionable knowledge for embodied agents. In International Conference on Machine Learning, 9118–9147 (PMLR, 2022).

Ahn, M. et al. Do as i can, not as i say: Grounding language in robotic affordances. arXiv preprint arXiv:2204.01691 (2022).

Suglia, A., Gao, Q., Thomason, J., Thattai, G. & Sukhatme, G. Embodied bert: A transformer model for embodied, language-guided visual task completion. arXiv preprint arXiv:2108.04927 (2021).

Pashevich, A., Schmid, C. & Sun, C. Episodic transformer for vision-and-language navigation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 15942–15952 (2021).

Sharma, P., Torralba, A. & Andreas, J. Skill induction and planning with latent language. arXiv preprint arXiv:2110.01517 (2021).

Li, S. et al. Pre-trained language models for interactive decision-making. Adv. Neural Inf. Process. Syst. 35, 31199–31212 (2022).

ADS

Google Scholar

Prakash, B., Oates, T. & Mohsenin, T. Llm augmented hierarchical agents. arXiv preprint arXiv:2311.05596 (2023).

Bohg, J., Morales, A., Asfour, T. & Kragic, D. Data-driven grasp synthesis-a survey. IEEE Trans. Robot. 30, 289–309 (2013).

Article

Google Scholar

Ijspeert, A. J., Nakanishi, J., Hoffmann, H., Pastor, P. & Schaal, S. Dynamical movement primitives: learning attractor models for motor behaviors. Neural Comput. 25, 328–373 (2013).

Article
MathSciNet
PubMed
MATH

Google Scholar

Karaman, S. & Frazzoli, E. Sampling-based algorithms for optimal motion planning. Int. J. Rob. Res. 30, 846–894 (2011).

Article

Google Scholar

Fu, H., Yu, S., Tiwari, S., Littman, M. & Konidaris, G. Meta-learning parameterized skills. arXiv preprint arXiv:2206.03597 (2022).

Bousmalis, K. et al. Robocat: A self-improving generalist agent for robotic manipulation. Trans. Mach. Learn. Res. arXiv:2306.11706 (2023).

Vemprala, S., Bonatti, R., Bucker, A. & Kapoor, A. Chatgpt for robotics: Design principles and model abilities. Microsoft Auton. Syst. Robot. Res. 2, 20 (2023).

Google Scholar

Mu, Y. et al. Embodiedgpt: Vision-language pre-training via embodied chain of thought. Adv. Neural Inf. Process. Syst. 36, (2024).

Liu, H. et al. Enhancing the llm-based robot manipulation through human-robot collaboration (IEEE Robotics Autom, Lett, 2024).

Book

Google Scholar

Kim, Y. et al. A survey on integration of large language models with intelligent robots. Intell. Serv. Robotics 17, 1091–1107 (2024).

Article

Google Scholar

Brohan, A. et al. Rt-1: Robotics transformer for real-world control at scale. arXiv preprint arXiv:2212.06817 (2022).

Brohan, A. et al. Rt-2: Vision-language-action models transfer web knowledge to robotic control. arXiv preprint arXiv:2307.15818 (2023).

Driess, D. et al. Palm-e: An embodied multimodal language model. arXiv preprint arXiv:2303.03378 (2023).

Mees, O. et al. Octo: An open-source generalist robot policy. In First Workshop on Vision-Language Models for Navigation and Manipulation at ICRA 2024 (2024).

Du, Y. et al. Guiding pretraining in reinforcement learning with large language models. arXiv preprint arXiv:2302.06692 (2023).

Dalal, M., Chiruvolu, T., Chaplot, D. & Salakhutdinov, R. Plan-seq-learn: Language model guided rl for solving long horizon robotics tasks. arXiv preprint arXiv:2405.01534 (2024).

Song, S. et al. Playing fps games with environment-aware hierarchical reinforcement learning. In IJCAI, 3475–3482 (2019).

Wang, R., Zhao, D., Gupte, A. & Min, B.-C. Initial task allocation in multi-human multi-robot teams: An attention-enhanced hierarchical reinforcement learning approach (IEEE Robotics Autom, Lett, 2024).

Google Scholar

Gao, X., Liu, J., Wan, B. & An, L. Hierarchical reinforcement learning from demonstration via reachability-based reward shaping. Neural Process. Lett. 56, 184 (2024).

Article
CAS

Google Scholar

Yang, X. et al. Hierarchical reinforcement learning with universal policies for multistep robotic manipulation. IEEE Transactions on Neural Networks Learn. Syst. 33, 4727–4741 (2021).

Google Scholar

Zhang, D. et al. Explainable hierarchical imitation learning for robotic drink pouring. IEEE Trans. Autom. Sci. Eng. 19, 3871–3887 (2021).

Article

Google Scholar

Zhang, H., Wang, H., Qian, T. & Kan, Z. Temporal logic guided affordance learning for generalizable dexterous manipulation. In 2024 7th International Symposium on Autonomous Systems (ISAS), 1–7 (IEEE, 2024).

Zhu, Y. et al. robosuite: A modular simulation framework and benchmark for robot learning. arXiv preprint arXiv:2009.12293 (2020).

Haarnoja, T., Zhou, A., Abbeel, P. & Levine, S. Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. In International conference on machine learning, 1861–1870 (PMLR, 2018).

Masson, W., Ranchod, P. & Konidaris, G. Reinforcement learning with parameterized actions. In Proceedings of the AAAI conference on artificial intelligence, vol. 30 (2016).

Achiam, J. et al. Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023).

Duan, J. et al. Distributional soft actor-critic: Off-policy reinforcement learning for addressing value estimation errors. IEEE Trans. Neural Netw. Learn. Syst. 33, 6584–6598 (2021).

Article
MathSciNet

Google Scholar

Mehta, S. A., Habibian, S. & Losey, D. P. Waypoint-based reinforcement learning for robot manipulation tasks. In 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 541–548 (IEEE, 2024).

Levenshtein, V. I. et al. Binary codes capable of correcting deletions, insertions, and reversals. In Soviet physics doklady, vol. 10, 707–710 (Soviet Union, 1966).

link

LLMs augmented hierarchical reinforcement learning with action primitives for long-horizon manipulation tasks

‘Playful’ teaching gaining credibility, say Lego researchers

AI-guided competitive docking for virtual screening and compound efficacy prediction

Stratasys launches multi-material 3D printed model preset for dental training

‘Playful’ teaching gaining credibility, say Lego researchers

AI-guided competitive docking for virtual screening and compound efficacy prediction

Stratasys launches multi-material 3D printed model preset for dental training

Dhammajarinee Witthaya School pioneers learning transformation through AI and Buddhist principles to build “Bridge of Opportunity” for Thai youth

K12 Education Technology Analysis Report 2026: $96.5 Bn

More Stories

You may have missed