GitHub Repository: keras-team/keras-io
Path: blob/master/guides/keras_hub/function_calling_with_keras_hub.py
"""
Title: Function Calling with KerasHub models
Author: [Laxmareddy Patlolla](https://github.com/laxmareddyp), [Divyashree Sreepathihalli](https://github.com/divyashreepathihalli)
Date created: 2025/07/08
Last modified: 2025/07/10
Description: A guide to using the function calling feature in KerasHub with Gemma 3 and Mistral.
Accelerator: GPU
"""

"""
## Introduction

Tool calling is a powerful new feature in modern large language models that allows them to use external tools, such as Python functions, to answer questions and perform actions. Instead of just generating text, a tool-calling model can generate code that calls a function you've provided, allowing it to interact with the real world, access live data, and perform complex calculations.

In this guide, we'll walk you through a simple example of tool calling with the Gemma 3 and Mistral models and KerasHub. We'll show you how to:

1. Define a tool (a Python function).
2. Tell the models about the tool.
3. Use the model to generate code that calls the tool.
4. Execute the code and feed the result back to the model.
5. Get a final, natural-language response from the model.

Let's get started!
"""

"""
## Setup

First, let's import the necessary libraries and configure our environment. We'll be using KerasHub to download and run the language models, and we'll need to authenticate with Kaggle to access the model weights.
"""

import os
import json
import random
import string
import re
import ast
import io
import sys
import contextlib

# Set backend before importing Keras
os.environ["KERAS_BACKEND"] = "jax"

import keras
import keras_hub
import kagglehub
import numpy as np

# Constants
USD_TO_EUR_RATE = 0.85

# Set the default dtype policy to bfloat16 for improved performance and reduced
# memory usage on supported hardware (e.g., TPUs, some GPUs)
keras.config.set_dtype_policy("bfloat16")

# Authenticate with Kaggle
# In Google Colab, you can set KAGGLE_USERNAME and KAGGLE_KEY as secrets,
# and kagglehub.login() will automatically detect and use them:
# kagglehub.login()
"""
62
## Loading the Model
63
64
Next, we'll load the Gemma 3 model from KerasHub. We're using the `gemma3_instruct_4b` preset, which is a version of the model that has been specifically fine-tuned for instruction following and tool calling.
65
"""
66
67
try:
68
gemma = keras_hub.models.Gemma3CausalLM.from_preset("gemma3_instruct_4b")
69
print("✅ Gemma 3 model loaded successfully")
70
except Exception as e:
71
print(f"❌ Error loading Gemma 3 model: {e}")
72
print("Please ensure you have the correct model preset and sufficient resources.")
73
raise
74
75
"""
76
## Defining a Tool
77
78
Now, let's define a simple tool that we want our model to be able to use. For this example, we'll create a Python function called `convert` that can convert one currency to another.
79
"""
80
81
82
def convert(amount, currency, new_currency):
83
"""Convert the currency with the latest exchange rate
84
85
Args:
86
amount: The amount of currency to convert
87
currency: The currency to convert from
88
new_currency: The currency to convert to
89
"""
90
# Input validation
91
if amount < 0:
92
raise ValueError("Amount cannot be negative")
93
94
if not isinstance(currency, str) or not isinstance(new_currency, str):
95
raise ValueError("Currency codes must be strings")
96
97
# Normalize currency codes to uppercase to handle model-generated lowercase codes
98
currency = currency.upper().strip()
99
new_currency = new_currency.upper().strip()
100
101
# In a real application, this function would call an API to get the latest
102
# exchange rate. For this example, we'll just use a fixed rate.
103
if currency == "USD" and new_currency == "EUR":
104
return amount * USD_TO_EUR_RATE
105
elif currency == "EUR" and new_currency == "USD":
106
return amount / USD_TO_EUR_RATE
107
else:
108
raise NotImplementedError(
109
f"Currency conversion from {currency} to {new_currency} is not supported."
110
)
111
112
113
"""
114
## Telling the Model About the Tool
115
116
Now that we have a tool, we need to tell the Gemma 3 model about it. We do this by providing a carefully crafted prompt that includes:
117
118
1. A description of the tool calling process.
119
2. The Python code for the tool, including its function signature and docstring.
120
3. The user's question.
121
122
Here's the prompt we'll use:
123
"""
124
125

message = '''
<start_of_turn>user
At each turn, if you decide to invoke any of the function(s), it should be wrapped with ```tool_code```. The python methods described below are imported and available, you can only use defined methods and must not reimplement them. The generated code should be readable and efficient. I will provide the response wrapped in ```tool_output```, use it to call more tools or generate a helpful, friendly response. When using a ```tool_call``` think step by step why and how it should be used.

The following Python methods are available:

```python
def convert(amount, currency, new_currency):
    """Convert the currency with the latest exchange rate

    Args:
      amount: The amount of currency to convert
      currency: The currency to convert from
      new_currency: The currency to convert to
    """
```

User: What is $200,000 in EUR?<end_of_turn>
<start_of_turn>model
'''
"""
147
## Generating the Tool Call
148
149
Now, let's pass this prompt to the model and see what it generates.
150
"""
151
152
print(gemma.generate(message))
153
154
"""
155
As you can see, the model has correctly identified that it can use the `convert` function to answer the question, and it has generated the corresponding Python code.
156
"""
157
158
"""
159
## Executing the Tool Call and Getting a Final Answer
160
161
In a real application, you would now take this generated code, execute it, and feed the result back to the model. Let's create a practical example that shows how to do this:
162
"""
163
164
# First, let's get the model's response
165
response = gemma.generate(message)
166
print("Model's response:")
167
print(response)
168
169
170
# Extract the tool call from the response
171
def extract_tool_call(response_text):
172
"""Extract tool call from the model's response."""
173
tool_call_pattern = r"```tool_code\s*\n(.*?)\n```"
174
match = re.search(tool_call_pattern, response_text, re.DOTALL)
175
if match:
176
return match.group(1).strip()
177
return None
178
179
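To see what this pattern captures, here it is applied to a hand-written sample response (the sample text is made up for illustration):

```python
import re

# A typical Gemma 3 tool-calling response wraps the code in ```tool_code``` fences
sample = 'Sure, I can help!\n```tool_code\nprint(convert(200000, "USD", "EUR"))\n```'
match = re.search(r"```tool_code\s*\n(.*?)\n```", sample, re.DOTALL)
print(match.group(1).strip())  # print(convert(200000, "USD", "EUR"))
```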


def capture_code_output(code_string, globals_dict=None, locals_dict=None):
    """
    Executes Python code and captures any stdout output.

    This function uses eval() and exec(), which can execute arbitrary code.
    NEVER use this function with untrusted code in production environments.
    Always validate and sanitize code from LLMs before execution.

    Args:
        code_string (str): The code to execute (expression or statements).
        globals_dict (dict, optional): Global variables for execution.
        locals_dict (dict, optional): Local variables for execution.

    Returns:
        The captured stdout output if any, otherwise the return value of the
        expression, or None if neither.
    """
    if globals_dict is None:
        globals_dict = {}
    if locals_dict is None:
        locals_dict = globals_dict

    output = io.StringIO()
    try:
        with contextlib.redirect_stdout(output):
            try:
                # Try to evaluate as an expression
                result = eval(code_string, globals_dict, locals_dict)
            except SyntaxError:
                # If not an expression, execute as statements
                exec(code_string, globals_dict, locals_dict)
                result = None
    except Exception as e:
        return f"Error during code execution: {e}"

    stdout_output = output.getvalue()
    if stdout_output.strip():
        return stdout_output
    return result
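The docstring's warning is worth taking seriously. One possible mitigation — shown here only as a sketch, not a real sandbox; the `is_safe_tool_call` helper and its allow-list are illustrative — is to reject code whose calls are not on an allow-list before handing it to `capture_code_output`:

```python
import ast


def is_safe_tool_call(code_string, allowed=frozenset({"convert", "print"})):
    """Best-effort check: only permit calls to allow-listed plain names."""
    try:
        tree = ast.parse(code_string)
    except SyntaxError:
        return False
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            # Reject attribute calls (obj.method) and any non-allow-listed name
            if not (isinstance(node.func, ast.Name) and node.func.id in allowed):
                return False
    return True


print(is_safe_tool_call('print(convert(200000, "USD", "EUR"))'))  # True
print(is_safe_tool_call('__import__("os").system("echo hi")'))    # False
```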

# Extract and execute the tool call
tool_code = extract_tool_call(response)
if tool_code:
    print(f"\nExtracted tool call: {tool_code}")
    try:
        local_vars = {"convert": convert}
        tool_result = capture_code_output(tool_code, globals_dict=local_vars)
        print(f"Tool execution result: {tool_result}")

        # Create the next message with the tool result
        message_with_result = f'''
<start_of_turn>user
At each turn, if you decide to invoke any of the function(s), it should be wrapped with ```tool_code```. The python methods described below are imported and available, you can only use defined methods and must not reimplement them. The generated code should be readable and efficient. I will provide the response wrapped in ```tool_output```, use it to call more tools or generate a helpful, friendly response. When using a ```tool_call``` think step by step why and how it should be used.

The following Python methods are available:

```python
def convert(amount, currency, new_currency):
    """Convert the currency with the latest exchange rate

    Args:
      amount: The amount of currency to convert
      currency: The currency to convert from
      new_currency: The currency to convert to
    """
```

User: What is $200,000 in EUR?<end_of_turn>
<start_of_turn>model
```tool_code
print(convert(200000, "USD", "EUR"))
```<end_of_turn>
<start_of_turn>user
```tool_output
{tool_result}
```
<end_of_turn>
<start_of_turn>model
'''

        # Get the final response
        final_response = gemma.generate(message_with_result)
        print("\nFinal response:")
        print(final_response)

    except Exception as e:
        print(f"Error executing tool call: {e}")
else:
    print("No tool call found in the response")
"""
273
## Automated Tool Call Execution Loop
274
275
Let's create a more sophisticated example that shows how to automatically handle multiple tool calls in a conversation:
276
"""


def automated_tool_calling_example():
    """Demonstrate automated tool calling with a conversation loop."""

    conversation_history = []
    max_turns = 5

    # Initial user message
    user_message = "What is $500 in EUR, and then what is that amount in USD?"

    # Define base prompt outside the loop for better performance
    base_prompt = f'''
<start_of_turn>user
At each turn, if you decide to invoke any of the function(s), it should be wrapped with ```tool_code```. The python methods described below are imported and available, you can only use defined methods and must not reimplement them. The generated code should be readable and efficient. I will provide the response wrapped in ```tool_output```, use it to call more tools or generate a helpful, friendly response. When using a ```tool_call``` think step by step why and how it should be used.

The following Python methods are available:

```python
def convert(amount, currency, new_currency):
    """Convert the currency with the latest exchange rate

    Args:
      amount: The amount of currency to convert
      currency: The currency to convert from
      new_currency: The currency to convert to
    """
```

User: {user_message}<end_of_turn>
<start_of_turn>model
'''

    for turn in range(max_turns):
        print(f"\n--- Turn {turn + 1} ---")

        # Build conversation context by appending history to base prompt
        context = base_prompt
        for hist in conversation_history:
            context += hist + "\n"

        # Get model response
        response = gemma.generate(context, strip_prompt=True)
        print(f"Model response: {response}")

        # Extract tool call
        tool_code = extract_tool_call(response)

        if tool_code:
            print(f"Executing: {tool_code}")
            try:
                local_vars = {"convert": convert}
                tool_result = capture_code_output(tool_code, globals_dict=local_vars)
                conversation_history.append(
                    f"```tool_code\n{tool_code}\n```<end_of_turn>"
                )
                conversation_history.append(
                    f"<start_of_turn>user\n```tool_output\n{tool_result}\n```<end_of_turn>"
                )
                conversation_history.append("<start_of_turn>model\n")
                print(f"Tool result: {tool_result}")
            except Exception as e:
                print(f"Error executing tool: {e}")
                break
        else:
            print("No tool call found - conversation complete")
            conversation_history.append(response)
            break

    print("\n--- Final Conversation ---")
    print(context)
    for hist in conversation_history:
        print(hist)


# Run the automated example
print("Running automated tool calling example:")
automated_tool_calling_example()
"""
357
## Mistral
358
359
Mistral differs from Gemma in its approach to tool calling, as it requires a specific format and defines special control tokens for this purpose. This JSON-based syntax for tool calling is also adopted by other models, such as Qwen and Llama.
360
361
We will now extend the example to a more exciting use case: building a flight booking agent. This agent will be able to search for appropriate flights and book them automatically.
362
363
To do this, we will first download the Mistral model using KerasHub. For agentic AI with Mistral, low-level access to tokenization is necessary due to the use of control tokens. Therefore, we will instantiate the tokenizer and model separately, and disable the preprocessor for the model.
364
"""

tokenizer = keras_hub.tokenizers.MistralTokenizer.from_preset(
    "kaggle://keras/mistral/keras/mistral_0.3_instruct_7b_en"
)

try:
    mistral = keras_hub.models.MistralCausalLM.from_preset(
        "kaggle://keras/mistral/keras/mistral_0.3_instruct_7b_en", preprocessor=None
    )
    print("✅ Mistral model loaded successfully")
except Exception as e:
    print(f"❌ Error loading Mistral model: {e}")
    print("Please ensure you have the correct model preset and sufficient resources.")
    raise
"""
381
Next, we'll define functions for tokenization. The `preprocess` function will take a tokenized conversation in list form and format it correctly for the model. We'll also create an additional function, `encode_instruction`, for tokenizing text and adding instruction control tokens.
382
"""


def preprocess(messages, sequence_length=8192):
    """Preprocess tokenized messages for the Mistral model.

    Args:
        messages: List of tokenized message sequences
        sequence_length: Maximum sequence length for the model

    Returns:
        Dictionary containing token_ids and padding_mask
    """
    concatd = np.expand_dims(np.concatenate(messages), 0)

    # Truncate if the sequence is too long
    if concatd.shape[1] > sequence_length:
        concatd = concatd[:, :sequence_length]

    # Calculate padding needed
    padding_needed = max(0, sequence_length - concatd.shape[1])

    return {
        "token_ids": np.pad(concatd, ((0, 0), (0, padding_needed))),
        "padding_mask": np.expand_dims(
            np.arange(sequence_length) < concatd.shape[1], 0
        ).astype(int),
    }
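To see the shapes this produces, here is the same concatenate-pad-mask logic applied to toy token IDs (the values are made up, and a real sequence length would be 8192):

```python
import numpy as np

# Two tokenized message chunks, concatenated then padded to a fixed length,
# mirroring what `preprocess` does for the Mistral model:
toy_messages = [np.array([1, 2]), np.array([3, 4, 5])]
sequence_length = 8

concatd = np.expand_dims(np.concatenate(toy_messages), 0)  # shape (1, 5)
padding_needed = max(0, sequence_length - concatd.shape[1])

token_ids = np.pad(concatd, ((0, 0), (0, padding_needed)))
padding_mask = np.expand_dims(
    np.arange(sequence_length) < concatd.shape[1], 0
).astype(int)

print(token_ids)     # [[1 2 3 4 5 0 0 0]]
print(padding_mask)  # [[1 1 1 1 1 0 0 0]]
```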


def encode_instruction(text):
    """Encode instruction text with Mistral control tokens.

    Args:
        text: The instruction text to encode

    Returns:
        List of tokenized sequences with instruction control tokens
    """
    return [
        [tokenizer.token_to_id("[INST]")],
        tokenizer(text),
        [tokenizer.token_to_id("[/INST]")],
    ]

"""
Now, we'll define a function, `try_parse_funccall`, to handle the model's function calls. These calls are identified by the `[TOOL_CALLS]` control token. The function will parse the subsequent data, which is in JSON format. Mistral also requires us to add a random call ID to each function call. Finally, the function will call the matching tool and encode its results using the `[TOOL_RESULTS]` control token.
"""


def try_parse_funccall(response):
    """Parse function calls from Mistral model response and execute tools.

    Args:
        response: Tokenized model response

    Returns:
        List of tokenized sequences including tool results
    """
    # Find the tool call in the response, if any
    tool_call_id = tokenizer.token_to_id("[TOOL_CALLS]")
    pos = np.where(response == tool_call_id)[0]
    if not len(pos):
        return [response]
    pos = pos[0]

    try:
        decoder = json.JSONDecoder()
        tool_calls, _ = decoder.raw_decode(tokenizer.detokenize(response[pos + 1 :]))
        if not isinstance(tool_calls, list) or not all(
            isinstance(item, dict) for item in tool_calls
        ):
            return [response]

        res = []  # Initialize result list
        # Assign a random call ID to each call
        for call in tool_calls:
            call["id"] = "".join(
                random.choices(string.ascii_letters + string.digits, k=9)
            )
            if call["name"] not in tools:
                continue  # Skip unknown tools
            res.append([tokenizer.token_to_id("[TOOL_RESULTS]")])
            res.append(
                tokenizer(
                    json.dumps(
                        {
                            "content": tools[call["name"]](**call["arguments"]),
                            "call_id": call["id"],
                        }
                    )
                )
            )
            res.append([tokenizer.token_to_id("[/TOOL_RESULTS]")])
        return res
    except (json.JSONDecodeError, KeyError, TypeError, ValueError) as e:
        # Log the error for debugging
        print(f"Error parsing tool call: {e}")
        return [response]

"""
We will extend our set of tools to include functions for currency conversion, finding flights, and booking flights. For this example, we'll use mock implementations for these functions, meaning they will return dummy data instead of interacting with real services.
"""

tools = {
    "convert_currency": lambda amount, currency, new_currency: (
        f"{amount*USD_TO_EUR_RATE:.2f}"
        if currency == "USD" and new_currency == "EUR"
        else (
            f"{amount/USD_TO_EUR_RATE:.2f}"
            if currency == "EUR" and new_currency == "USD"
            else f"Error: Unsupported conversion from {currency} to {new_currency}"
        )
    ),
    "find_flights": lambda origin, destination, date: [
        {"id": 1, "price": "USD 220", "stops": 2, "duration": 4.5},
        {"id": 2, "price": "USD 22", "stops": 1, "duration": 2.0},
        {"id": 3, "price": "USD 240", "stops": 2, "duration": 13.2},
    ],
    "book_flight": lambda id: {
        "status": "success",
        "message": f"Flight {id} booked successfully",
    },
}
"""
510
It's crucial to inform the model about these available functions at the very beginning of the conversation. To do this, we will define the available tools in a specific JSON format, as shown in the following code block.
511
"""
512
513
tool_definitions = [
    {
        "type": "function",
        "function": {
            "name": "convert_currency",
            "description": "Convert the currency with the latest exchange rate",
            "parameters": {
                "type": "object",
                "properties": {
                    "amount": {"type": "number", "description": "The amount"},
                    "currency": {
                        "type": "string",
                        "description": "The currency to convert from",
                    },
                    "new_currency": {
                        "type": "string",
                        "description": "The currency to convert to",
                    },
                },
                "required": ["amount", "currency", "new_currency"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "find_flights",
            "description": "Query price, time, number of stopovers and duration in hours for flights for a given date",
            "parameters": {
                "type": "object",
                "properties": {
                    "origin": {
                        "type": "string",
                        "description": "The city to depart from",
                    },
                    "destination": {
                        "type": "string",
                        "description": "The destination city",
                    },
                    "date": {
                        "type": "string",
                        "description": "The date in YYYYMMDD format",
                    },
                },
                "required": ["origin", "destination", "date"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "book_flight",
            "description": "Book the flight with the given id",
            "parameters": {
                "type": "object",
                "properties": {
                    "id": {
                        "type": "number",
                        "description": "The numeric id of the flight to book",
                    },
                },
                "required": ["id"],
            },
        },
    },
]
"""
581
We will define the conversation as a `messages` list. At the very beginning of this list, we need to include a Beginning-Of-Sequence (BOS) token. This is followed by the tool definitions, which must be wrapped in `[AVAILABLE_TOOLS]` and `[/AVAILABLE_TOOLS]` control tokens.
582
"""
583
584
messages = [
585
[tokenizer.token_to_id("<s>")],
586
[tokenizer.token_to_id("[AVAILABLE_TOOLS]")],
587
tokenizer(json.dumps(tool_definitions)),
588
[tokenizer.token_to_id("[/AVAILABLE_TOOLS]")],
589
]

"""
Now, let's get started! We will task the model with the following: **Book the most comfortable flight from Linz to London on the 24th of July 2025, but only if it costs less than 20€ as of the latest exchange rate.**
"""

messages.extend(
    encode_instruction(
        "Book the most comfortable flight from Linz to London on the 24th of July 2025, but only if it costs less than 20€ as of the latest exchange rate."
    )
)

"""
In an agentic AI system, the model interacts with its tools through a sequence of messages. We will continue to handle these messages until the flight is successfully booked.
For educational purposes, we will output the tool calls issued by the model; typically, a user would not see this level of detail. It's important to note that the data must be truncated after the tool-call JSON. If it isn't, a less capable model may "babble", outputting redundant or confused data.
"""

flight_booked = False
max_iterations = 10  # Prevent infinite loops
iteration_count = 0

while not flight_booked and iteration_count < max_iterations:
    iteration_count += 1
    # Query the model
    res = mistral.generate(
        preprocess(messages), max_length=8192, stop_token_ids=[2], strip_prompt=True
    )
    # Output the model's response, add separator line for legibility
    response_text = tokenizer.detokenize(
        res["token_ids"][0, : np.argmax(~res["padding_mask"])]
    )
    print(response_text, f"\n\n\n{'-'*100}\n\n")

    # Check for tool calls and track booking status
    tool_call_id = tokenizer.token_to_id("[TOOL_CALLS]")
    pos = np.where(res["token_ids"][0] == tool_call_id)[0]
    if len(pos) > 0:
        try:
            decoder = json.JSONDecoder()
            tool_calls, _ = decoder.raw_decode(
                tokenizer.detokenize(res["token_ids"][0][pos[0] + 1 :])
            )
            if isinstance(tool_calls, list):
                for call in tool_calls:
                    if isinstance(call, dict) and call.get("name") == "book_flight":
                        # Check if book_flight was called successfully
                        flight_booked = True
                        break
        except (json.JSONDecodeError, KeyError, TypeError, ValueError):
            pass

    # Perform tool calls and extend `messages`
    messages.extend(try_parse_funccall(res["token_ids"][0]))

if not flight_booked:
    print("Maximum iterations reached. Flight booking was not completed.")

"""
For clarity, here's the conversation as received by the model, i.e. with the output truncated after the tool-call JSON:

* **User:**
```
Book the most comfortable flight from Linz to London on the 24th of July 2025, but only if it costs less than 20€ as of the latest exchange rate.
```

* **Model:**
```
[{"name": "find_flights", "arguments": {"origin": "Linz", "destination": "London", "date": "20250724"}}]
```
* **Tool Output:**
```
[{"id": 1, "price": "USD 220", "stops": 2, "duration": 4.5}, {"id": 2, "price": "USD 22", "stops": 1, "duration": 2.0}, {"id": 3, "price": "USD 240", "stops": 2, "duration": 13.2}]
```
* **Model:**
```
Now let's convert the price from USD to EUR using the latest exchange rate:

[{"name": "convert_currency", "arguments": {"amount": 22, "currency": "USD", "new_currency": "EUR"}}]
```
* **Tool Output:**
```
"18.70"
```
* **Model:**
```
The price of the flight with the id 2 in EUR is 18.70. Since it is below the 20€ limit, let's book this flight:

[{"name": "book_flight", "arguments": {"id": 2}}]
```

It's important to acknowledge that you might have to run the model a few times to obtain a good output as depicted above. As a 7-billion-parameter model, Mistral may still make several mistakes, such as misinterpreting data, outputting malformed tool calls, or making incorrect decisions. However, the continued development in this field paves the way for increasingly powerful agentic AI in the future.
"""

"""
## Conclusion

Tool calling is a powerful feature that allows large language models to interact with the real world, access live data, and perform complex calculations. By defining a set of tools and telling the model about them, you can create sophisticated applications that go far beyond simple text generation.
"""