COLLT: A Multi-Task Optimization Framework for Clarification-Oriented Tool Learning in Legal Large Language Models
Kaixin Yang, Jingyun Sun, Zhenxing Wang et al.
May 29, 2026

Abstract

Tool-augmented Large Language Models (LLMs) have demonstrated strong capabilities in language understanding and generation across many domains, with notable progress in legal applications. However, existing legal LLMs still face two major challenges in legal question answering: (1) persistent hallucinations in generated legal advice and (2) limited effectiveness in handling ambiguous legal queries. To address both issues, this article introduces COLLT (Clarification-Oriented Legal Large language models enhanced by Tool learning), a legal LLM framework built around a clarification-oriented tool learning mechanism. We model the clarification, tool-use, and response workflow as a budgeted sequential decision process. All components, including action selection, tool invocation, and response generation, are jointly optimized through multi-task instruction tuning. COLLT first determines whether a user query is ambiguous. If it is, the model actively guides the user to clarify their intent; if the query is clear, the model automatically invokes appropriate legal tools to improve the accuracy and reliability of its response. To support this mechanism, we propose an end-to-end training strategy that enables the model to learn to invoke tools adaptively across scenarios. Specifically, we construct an instruction-tuning dataset with action tags, tool tags, and optimized responses, and use it to train a series of COLLT models based on five Chinese foundation models: ChatGLM-6B, LLaMA-3-8B, InternLM-3-8B, Qwen-2.5-7B, and Baichuan-2-7B. Experimental results show that COLLT significantly outperforms baselines across nine standard legal NLP tasks. In a free-form QA evaluation based on 500 real-world legal consultation queries, COLLT achieves notable improvements in answer accuracy, legal knowledge coverage, and response appropriateness. Further visualization analysis reveals that COLLT consistently selects appropriate tools for queries with similar intents, and statistical analysis indicates that multi-turn clarification interactions contribute to higher-quality responses.
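
The clarify, tool, respond workflow described in the abstract can be pictured as a small control loop. The following is a minimal sketch under stated assumptions, not the authors' implementation: the names `Action`, `DialogueState`, and `run_episode`, the callback signatures, and the reading of "budgeted" as a cap on the number of clarification or tool steps before the model must answer are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List


class Action(Enum):
    """Hypothetical action tags; the paper's actual tag set is not given here."""
    CLARIFY = "clarify"
    CALL_TOOL = "call_tool"
    RESPOND = "respond"


@dataclass
class DialogueState:
    """Everything the model has seen so far for one user query."""
    query: str
    clarifications: List[str] = field(default_factory=list)
    tool_outputs: List[str] = field(default_factory=list)


def run_episode(
    state: DialogueState,
    policy: Callable[[DialogueState], Action],
    ask_user: Callable[[DialogueState], str],
    tools: Dict[str, Callable[[DialogueState], str]],
    pick_tool: Callable[[DialogueState], str],
    generate: Callable[[DialogueState], str],
    budget: int = 3,
) -> str:
    """Run one query through the clarify / tool / respond loop.

    Assumption: `budget` caps the number of decision steps, so the loop
    always terminates with a response even if the policy keeps asking.
    """
    for _ in range(budget):
        action = policy(state)
        if action is Action.CLARIFY:
            # Ambiguous query: ask the user and record the answer.
            state.clarifications.append(ask_user(state))
        elif action is Action.CALL_TOOL:
            # Clear query: choose a legal tool and record its output.
            tool_name = pick_tool(state)
            state.tool_outputs.append(tools[tool_name](state))
        else:
            # Action.RESPOND: stop gathering information.
            break
    # The final answer is conditioned on everything collected above.
    return generate(state)
```

In this sketch, `policy`, `pick_tool`, and `generate` are separate callbacks only for readability; in a jointly tuned model, as the abstract describes, a single LLM would play all three roles.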

IPC Classification

G06

Keywords

collt, multi-task, optimization, framework, clarification-oriented, tool, learning, legal, large, language, models, mathematics, tool-augmented, llms, demonstrated, remarkable, capabilities, understanding, generation, across, various, domains, notable, progress