# Source: GitHub repository keras-team/keras-io, path blob/master/guides/keras_hub/function_calling_with_keras_hub.py
"""
Title: Function Calling with KerasHub models
Author: [Laxmareddy Patlolla](https://github.com/laxmareddyp), [Divyashree Sreepathihalli](https://github.com/divyashreepathihalli)
Date created: 2025/07/08
Last modified: 2025/07/10
Description: A guide to using the function calling feature in KerasHub with Gemma 3 and Mistral.
Accelerator: GPU
"""

"""
## Introduction

Tool calling lets a large language model use external tools, such as Python functions, to answer questions and perform actions. Instead of only generating text, a tool-calling model can generate code that calls a function you've provided, which lets it access live data, perform calculations, and interact with external systems.

In this guide, we'll walk through a simple example of tool calling with the Gemma 3 and Mistral models in KerasHub. We'll show you how to:

1. Define a tool (a Python function).
2. Tell the model about the tool.
3. Use the model to generate code that calls the tool.
4. Execute the code and feed the result back to the model.
5. Get a final, natural-language response from the model.

Let's get started!
"""

"""
## Setup

First, let's import the necessary libraries and configure our environment. We'll be using KerasHub to download and run the language models, and we'll need to authenticate with Kaggle to access the model weights.
"""

import os
import json
import random
import string
import re
import ast
import io
import sys
import contextlib

# Set backend before importing Keras
os.environ["KERAS_BACKEND"] = "jax"

import keras
import keras_hub
import kagglehub
import numpy as np

# Constants
USD_TO_EUR_RATE = 0.85

# Set the default dtype policy to bfloat16 for improved performance and
# reduced memory usage on supported hardware (e.g., TPUs, some GPUs)
keras.config.set_dtype_policy("bfloat16")

# Authenticate with Kaggle
# In Google Colab, you can set KAGGLE_USERNAME and KAGGLE_KEY as secrets,
# and kagglehub.login() will automatically detect and use them:
# kagglehub.login()

"""
## Loading the Model

Next, we'll load the Gemma 3 model from KerasHub. We're using the `gemma3_instruct_4b` preset, which is a version of the model that has been specifically fine-tuned for instruction following and tool calling.
"""

try:
    gemma = keras_hub.models.Gemma3CausalLM.from_preset("gemma3_instruct_4b")
    print("✅ Gemma 3 model loaded successfully")
except Exception as e:
    print(f"❌ Error loading Gemma 3 model: {e}")
    print("Please ensure you have the correct model preset and sufficient resources.")
    raise

"""
## Defining a Tool

Now, let's define a simple tool that we want our model to be able to use. For this example, we'll create a Python function called `convert` that can convert one currency to another.
"""


def convert(amount, currency, new_currency):
    """Convert the currency with the latest exchange rate

    Args:
      amount: The amount of currency to convert
      currency: The currency to convert from
      new_currency: The currency to convert to
    """
    # Input validation
    if amount < 0:
        raise ValueError("Amount cannot be negative")

    if not isinstance(currency, str) or not isinstance(new_currency, str):
        raise ValueError("Currency codes must be strings")

    # Normalize currency codes to uppercase to handle model-generated lowercase codes
    currency = currency.upper().strip()
    new_currency = new_currency.upper().strip()

    # In a real application, this function would call an API to get the latest
    # exchange rate. For this example, we'll just use a fixed rate.
    if currency == "USD" and new_currency == "EUR":
        return amount * USD_TO_EUR_RATE
    elif currency == "EUR" and new_currency == "USD":
        return amount / USD_TO_EUR_RATE
    else:
        raise NotImplementedError(
            f"Currency conversion from {currency} to {new_currency} is not supported."
        )


"""
## Telling the Model About the Tool

Now that we have a tool, we need to tell the Gemma 3 model about it. We do this by providing a carefully crafted prompt that includes:

1. A description of the tool calling process.
2. The tool's function signature and docstring (the function body is omitted).
3. The user's question.

Here's the prompt we'll use:
"""

message = '''
<start_of_turn>user
At each turn, if you decide to invoke any of the function(s), it should be wrapped with ```tool_code```. The python methods described below are imported and available, you can only use defined methods and must not reimplement them. The generated code should be readable and efficient. I will provide the response wrapped in ```tool_output```, use it to call more tools or generate a helpful, friendly response. When using a ```tool_call``` think step by step why and how it should be used.

The following Python methods are available:

```python
def convert(amount, currency, new_currency):
    """Convert the currency with the latest exchange rate

    Args:
      amount: The amount of currency to convert
      currency: The currency to convert from
      new_currency: The currency to convert to
    """
```

User: What is $200,000 in EUR?<end_of_turn>
<start_of_turn>model
'''
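
"""
The tool description in the prompt above is written by hand. As a sketch of an alternative, the standard `inspect` module can produce the same kind of stub (signature plus docstring, without the body) from the function object. `describe_tool` is a hypothetical helper for this illustration, and the `convert` below is a body-less stand-in for the tool defined earlier.

```python
import inspect


def describe_tool(func):
    # Return the lines of a stub: the signature and docstring, no body.
    quotes = '"' * 3
    lines = ["def " + func.__name__ + str(inspect.signature(func)) + ":"]
    lines.append("    " + quotes)
    for doc_line in (inspect.getdoc(func) or "").splitlines():
        lines.append(("    " + doc_line).rstrip())
    lines.append("    " + quotes)
    return lines


def convert(amount, currency, new_currency):
    '''Convert the currency with the latest exchange rate

    Args:
      amount: The amount of currency to convert
      currency: The currency to convert from
      new_currency: The currency to convert to
    '''
    raise NotImplementedError("body omitted in this sketch")


stub_lines = describe_tool(convert)
for line in stub_lines:
    print(line)
```

Generating the stub this way keeps the prompt in sync with the function when its signature or docstring changes.
"""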

"""
## Generating the Tool Call

Now, let's pass this prompt to the model and see what it generates.
"""

print(gemma.generate(message))

"""
The model should recognize that it can use the `convert` function to answer the question and respond with the corresponding Python code wrapped in a `tool_code` block.
"""

"""
## Executing the Tool Call and Getting a Final Answer

In a real application, you would now take this generated code, execute it, and feed the result back to the model. Let's create a practical example that shows how to do this:
"""

# First, let's get the model's response
response = gemma.generate(message)
print("Model's response:")
print(response)


# Extract the tool call from the response
def extract_tool_call(response_text):
    """Extract tool call from the model's response."""
    tool_call_pattern = r"```tool_code\s*\n(.*?)\n```"
    match = re.search(tool_call_pattern, response_text, re.DOTALL)
    if match:
        return match.group(1).strip()
    return None


def capture_code_output(code_string, globals_dict=None, locals_dict=None):
    """
    Executes Python code and captures any stdout output.

    This function uses eval() and exec() which can execute arbitrary code.
    NEVER use this function with untrusted code in production environments.
    Always validate and sanitize code from LLMs before execution.

    Args:
        code_string (str): The code to execute (expression or statements).
        globals_dict (dict, optional): Global variables for execution.
        locals_dict (dict, optional): Local variables for execution.

    Returns:
        The captured stdout output if any, otherwise the return value of the expression,
        or None if neither.
    """
    if globals_dict is None:
        globals_dict = {}
    if locals_dict is None:
        locals_dict = globals_dict

    output = io.StringIO()
    try:
        with contextlib.redirect_stdout(output):
            try:
                # Try to evaluate as an expression
                result = eval(code_string, globals_dict, locals_dict)
            except SyntaxError:
                # If not an expression, execute as statements
                exec(code_string, globals_dict, locals_dict)
                result = None
    except Exception as e:
        return f"Error during code execution: {e}"

    stdout_output = output.getvalue()
    if stdout_output.strip():
        return stdout_output
    return result
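

"""
The warning in `capture_code_output` suggests validating model-generated code before running it. One possible approach, sketched below under the assumption that tool calls are single expressions made of allow-listed function calls with literal arguments, is to inspect the code with the standard `ast` module first. `ALLOWED_FUNCTIONS` and `is_safe_tool_code` are hypothetical names introduced for this illustration. A check like this reduces risk but is not a substitute for sandboxing.

```python
import ast

ALLOWED_FUNCTIONS = {"convert", "print"}  # hypothetical allow-list


def is_safe_tool_code(code_string):
    # Accept only calls to allow-listed names whose arguments are literals
    # or further allow-listed calls, e.g. print(convert(200, "USD", "EUR")).
    try:
        tree = ast.parse(code_string, mode="eval")
    except SyntaxError:
        # Not a single expression (or not valid Python at all).
        return False

    def check(node):
        if isinstance(node, ast.Constant):
            return True
        if isinstance(node, ast.Call):
            return (
                isinstance(node.func, ast.Name)
                and node.func.id in ALLOWED_FUNCTIONS
                and all(check(arg) for arg in node.args)
                and all(check(kw.value) for kw in node.keywords)
            )
        return False

    return isinstance(tree.body, ast.Call) and check(tree.body)


print(is_safe_tool_code('print(convert(200000, "USD", "EUR"))'))
print(is_safe_tool_code('__import__("os").system("ls")'))
```

The check is deliberately strict: anything other than a literal or an allow-listed call is rejected, including arithmetic and negative numbers, so a real application would likely need to relax it carefully.
"""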


# Extract and execute the tool call
tool_code = extract_tool_call(response)
if tool_code:
    print(f"\nExtracted tool call: {tool_code}")
    try:
        local_vars = {"convert": convert}
        tool_result = capture_code_output(tool_code, globals_dict=local_vars)
        print(f"Tool execution result: {tool_result}")

        # Create the next message with the tool call the model actually
        # generated, followed by the tool result
        message_with_result = f'''
<start_of_turn>user
At each turn, if you decide to invoke any of the function(s), it should be wrapped with ```tool_code```. The python methods described below are imported and available, you can only use defined methods and must not reimplement them. The generated code should be readable and efficient. I will provide the response wrapped in ```tool_output```, use it to call more tools or generate a helpful, friendly response. When using a ```tool_call``` think step by step why and how it should be used.

The following Python methods are available:

```python
def convert(amount, currency, new_currency):
    """Convert the currency with the latest exchange rate

    Args:
      amount: The amount of currency to convert
      currency: The currency to convert from
      new_currency: The currency to convert to
    """
```

User: What is $200,000 in EUR?<end_of_turn>
<start_of_turn>model
```tool_code
{tool_code}
```<end_of_turn>
<start_of_turn>user
```tool_output
{tool_result}
```
<end_of_turn>
<start_of_turn>model
'''

        # Get the final response
        final_response = gemma.generate(message_with_result)
        print("\nFinal response:")
        print(final_response)

    except Exception as e:
        print(f"Error executing tool call: {e}")
else:
    print("No tool call found in the response")

"""
## Automated Tool Call Execution Loop

Let's create a more sophisticated example that shows how to automatically handle multiple tool calls in a conversation:
"""


def automated_tool_calling_example():
    """Demonstrate automated tool calling with a conversation loop."""

    conversation_history = []
    max_turns = 5

    # Initial user message
    user_message = "What is $500 in EUR, and then what is that amount in USD?"

    # The base prompt does not change between turns, so build it once
    base_prompt = f'''
<start_of_turn>user
At each turn, if you decide to invoke any of the function(s), it should be wrapped with ```tool_code```. The python methods described below are imported and available, you can only use defined methods and must not reimplement them. The generated code should be readable and efficient. I will provide the response wrapped in ```tool_output```, use it to call more tools or generate a helpful, friendly response. When using a ```tool_call``` think step by step why and how it should be used.

The following Python methods are available:

```python
def convert(amount, currency, new_currency):
    """Convert the currency with the latest exchange rate

    Args:
      amount: The amount of currency to convert
      currency: The currency to convert from
      new_currency: The currency to convert to
    """
```

User: {user_message}<end_of_turn>
<start_of_turn>model
'''

    for turn in range(max_turns):
        print(f"\n--- Turn {turn + 1} ---")

        # Build conversation context by appending history to base prompt
        context = base_prompt
        for hist in conversation_history:
            context += hist + "\n"

        # Get model response
        response = gemma.generate(context, strip_prompt=True)
        print(f"Model response: {response}")

        # Extract tool call
        tool_code = extract_tool_call(response)

        if tool_code:
            print(f"Executing: {tool_code}")
            try:
                local_vars = {"convert": convert}
                tool_result = capture_code_output(tool_code, globals_dict=local_vars)
                conversation_history.append(
                    f"```tool_code\n{tool_code}\n```<end_of_turn>"
                )
                conversation_history.append(
                    f"<start_of_turn>user\n```tool_output\n{tool_result}\n```<end_of_turn>"
                )
                conversation_history.append("<start_of_turn>model\n")
                print(f"Tool result: {tool_result}")
            except Exception as e:
                print(f"Error executing tool: {e}")
                break
        else:
            print("No tool call found - conversation complete")
            conversation_history.append(response)
            break

    # Print the base prompt once, then the history; printing `context` here
    # would repeat the history entries it already contains
    print("\n--- Final Conversation ---")
    print(base_prompt)
    for hist in conversation_history:
        print(hist)


# Run the automated example
print("Running automated tool calling example:")
automated_tool_calling_example()

"""
## Mistral

Mistral differs from Gemma in its approach to tool calling: it expects tool calls in a specific JSON-based format and defines special control tokens for this purpose. A similar JSON-based syntax for tool calling is used by other models, such as Qwen and Llama.

We will now extend the example to a more exciting use case: building a flight booking agent. This agent will be able to search for appropriate flights and book them automatically.

To do this, we will first download the Mistral model using KerasHub. Because tool calling with Mistral relies on control tokens, we need low-level access to tokenization. Therefore, we will instantiate the tokenizer and model separately, and disable the preprocessor for the model.
"""

tokenizer = keras_hub.tokenizers.MistralTokenizer.from_preset(
    "kaggle://keras/mistral/keras/mistral_0.3_instruct_7b_en"
)

try:
    mistral = keras_hub.models.MistralCausalLM.from_preset(
        "kaggle://keras/mistral/keras/mistral_0.3_instruct_7b_en", preprocessor=None
    )
    print("✅ Mistral model loaded successfully")
except Exception as e:
    print(f"❌ Error loading Mistral model: {e}")
    print("Please ensure you have the correct model preset and sufficient resources.")
    raise

"""
Next, we'll define functions for tokenization. The `preprocess` function will take a tokenized conversation in list form and format it correctly for the model. We'll also create an additional function, `encode_instruction`, for tokenizing text and adding instruction control tokens.
"""


def preprocess(messages, sequence_length=8192):
    """Preprocess tokenized messages for the Mistral model.

    Args:
        messages: List of tokenized message sequences
        sequence_length: Maximum sequence length for the model

    Returns:
        Dictionary containing token_ids and padding_mask
    """
    concatd = np.expand_dims(np.concatenate(messages), 0)

    # Truncate if the sequence is too long
    if concatd.shape[1] > sequence_length:
        concatd = concatd[:, :sequence_length]

    # Calculate padding needed
    padding_needed = max(0, sequence_length - concatd.shape[1])

    return {
        "token_ids": np.pad(concatd, ((0, 0), (0, padding_needed))),
        "padding_mask": np.expand_dims(
            np.arange(sequence_length) < concatd.shape[1], 0
        ).astype(int),
    }


def encode_instruction(text):
    """Encode instruction text with Mistral control tokens.

    Args:
        text: The instruction text to encode

    Returns:
        List of tokenized sequences with instruction control tokens
    """
    return [
        [tokenizer.token_to_id("[INST]")],
        tokenizer(text),
        [tokenizer.token_to_id("[/INST]")],
    ]


"""
Now, we'll define a function, `try_parse_funccall`, to handle the model's function calls. These calls are identified by the `[TOOL_CALLS]` control token. The function will parse the subsequent data, which is in JSON format. Mistral also requires us to add a random call ID to each function call. Finally, the function will call the matching tool and encode its results using the `[TOOL_RESULTS]` control token.
"""


def try_parse_funccall(response):
    """Parse function calls from Mistral model response and execute tools.

    Args:
        response: Tokenized model response

    Returns:
        List of tokenized sequences: the model's turn, truncated after the
        tool call JSON, followed by the tool results
    """
    # find the tool call in the response, if any
    tool_call_id = tokenizer.token_to_id("[TOOL_CALLS]")
    pos = np.where(response == tool_call_id)[0]
    if not len(pos):
        return [response]
    pos = pos[0]

    try:
        decoder = json.JSONDecoder()
        tool_calls, _ = decoder.raw_decode(tokenizer.detokenize(response[pos + 1 :]))
        if not isinstance(tool_calls, list) or not all(
            isinstance(item, dict) for item in tool_calls
        ):
            return [response]

        # assign a random call ID to each call
        for call in tool_calls:
            call["id"] = "".join(
                random.choices(string.ascii_letters + string.digits, k=9)
            )

        # keep the model's own turn, truncated after the tool call JSON, so
        # that the tool results below refer to calls present in the history
        res = [
            response[:pos],
            [tool_call_id],
            tokenizer(json.dumps(tool_calls)),
            [tokenizer.token_to_id("</s>")],
        ]

        # run each known tool and encode its result
        for call in tool_calls:
            if call["name"] not in tools:
                continue  # Skip unknown tools
            res.append([tokenizer.token_to_id("[TOOL_RESULTS]")])
            res.append(
                tokenizer(
                    json.dumps(
                        {
                            "content": tools[call["name"]](**call["arguments"]),
                            "call_id": call["id"],
                        }
                    )
                )
            )
            res.append([tokenizer.token_to_id("[/TOOL_RESULTS]")])
        return res
    except (json.JSONDecodeError, KeyError, TypeError, ValueError) as e:
        # Log the error for debugging
        print(f"Error parsing tool call: {e}")
        return [response]


"""
We will extend our set of tools to include functions for currency conversion, finding flights, and booking flights. For this example, we'll use mock implementations for these functions, meaning they will return dummy data instead of interacting with real services.
"""

tools = {
    "convert_currency": lambda amount, currency, new_currency: (
        f"{amount*USD_TO_EUR_RATE:.2f}"
        if currency == "USD" and new_currency == "EUR"
        else (
            f"{amount/USD_TO_EUR_RATE:.2f}"
            if currency == "EUR" and new_currency == "USD"
            else f"Error: Unsupported conversion from {currency} to {new_currency}"
        )
    ),
    "find_flights": lambda origin, destination, date: [
        {"id": 1, "price": "USD 220", "stops": 2, "duration": 4.5},
        {"id": 2, "price": "USD 22", "stops": 1, "duration": 2.0},
        {"id": 3, "price": "USD 240", "stops": 2, "duration": 13.2},
    ],
    "book_flight": lambda id: {
        "status": "success",
        "message": f"Flight {id} booked successfully",
    },
}

"""
It's crucial to inform the model about these available functions at the very beginning of the conversation. To do this, we will define the available tools in a specific JSON format, as shown in the following code block.
"""

tool_definitions = [
    {
        "type": "function",
        "function": {
            "name": "convert_currency",
            "description": "Convert the currency with the latest exchange rate",
            "parameters": {
                "type": "object",
                "properties": {
                    "amount": {"type": "number", "description": "The amount"},
                    "currency": {
                        "type": "string",
                        "description": "The currency to convert from",
                    },
                    "new_currency": {
                        "type": "string",
                        "description": "The currency to convert to",
                    },
                },
                "required": ["amount", "currency", "new_currency"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "find_flights",
            "description": "Query price, time, number of stopovers and duration in hours for flights for a given date",
            "parameters": {
                "type": "object",
                "properties": {
                    "origin": {
                        "type": "string",
                        "description": "The city to depart from",
                    },
                    "destination": {
                        "type": "string",
                        "description": "The destination city",
                    },
                    "date": {
                        "type": "string",
                        "description": "The date in YYYYMMDD format",
                    },
                },
                "required": ["origin", "destination", "date"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "book_flight",
            "description": "Book the flight with the given id",
            "parameters": {
                "type": "object",
                "properties": {
                    "id": {
                        "type": "number",
                        "description": "The numeric id of the flight to book",
                    },
                },
                "required": ["id"],
            },
        },
    },
]
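
"""
Because each definition lists the parameters a tool accepts and which of them are required, the same structure can be used to sanity-check the arguments of a generated call before the tool is invoked. The sketch below uses a single definition in the shape shown above. `check_arguments` is a hypothetical helper, not part of KerasHub, and it only checks argument names, not their types.

```python
# A single definition in the same shape as the entries of `tool_definitions`.
definition = {
    "type": "function",
    "function": {
        "name": "book_flight",
        "parameters": {
            "type": "object",
            "properties": {"id": {"type": "number"}},
            "required": ["id"],
        },
    },
}


def check_arguments(definition, arguments):
    # Return a list of problems; an empty list means the call looks valid.
    params = definition["function"]["parameters"]
    problems = []
    for name in params["required"]:
        if name not in arguments:
            problems.append("missing argument: " + name)
    for name in arguments:
        if name not in params["properties"]:
            problems.append("unexpected argument: " + name)
    return problems


print(check_arguments(definition, {"id": 2}))
print(check_arguments(definition, {"flight": 2}))
```

Rejecting a malformed call early gives a clearer error than the `TypeError` raised when a tool is called with the wrong keyword arguments.
"""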

"""
We will define the conversation as a `messages` list. At the very beginning of this list, we need to include a Beginning-Of-Sequence (BOS) token. This is followed by the tool definitions, which must be wrapped in `[AVAILABLE_TOOLS]` and `[/AVAILABLE_TOOLS]` control tokens.
"""

messages = [
    [tokenizer.token_to_id("<s>")],
    [tokenizer.token_to_id("[AVAILABLE_TOOLS]")],
    tokenizer(json.dumps(tool_definitions)),
    [tokenizer.token_to_id("[/AVAILABLE_TOOLS]")],
]
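
"""
As a mental model, the token sequence corresponds roughly to the text layout sketched below, including the `[INST]` wrapper that `encode_instruction` adds around a user message. `render_prompt` is a hypothetical debugging helper written for this illustration. In the actual pipeline the control tokens are inserted as token IDs via `tokenizer.token_to_id`, as done above, rather than being tokenized from text, so this string is for inspection only.

```python
import json


def render_prompt(tool_definitions, instruction):
    # Text view of the token sequence, with control tokens shown literally.
    return (
        "<s>[AVAILABLE_TOOLS]"
        + json.dumps(tool_definitions)
        + "[/AVAILABLE_TOOLS][INST]"
        + instruction
        + "[/INST]"
    )


# A shortened, made-up tool list keeps the printed output readable.
example_tools = [{"type": "function", "function": {"name": "book_flight"}}]
rendered = render_prompt(example_tools, "Book flight 2.")
print(rendered)
```
"""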

"""
Now, let's get started! We will task the model with the following: **Book the most comfortable flight from Linz to London on the 24th of July 2025, but only if it costs less than 20€ as of the latest exchange rate.**
"""

messages.extend(
    encode_instruction(
        "Book the most comfortable flight from Linz to London on the 24th of July 2025, but only if it costs less than 20€ as of the latest exchange rate."
    )
)

"""
In an agentic AI system, the model interacts with its tools through a sequence of messages. We will continue to handle these messages until the flight is successfully booked.
For educational purposes, we will output the tool calls issued by the model; typically, a user would not see this level of detail. Note that the model's output must be truncated after the tool call JSON; otherwise, a less capable model may 'babble', outputting redundant or confused data.
"""

flight_booked = False
max_iterations = 10  # Prevent infinite loops
iteration_count = 0

while not flight_booked and iteration_count < max_iterations:
    iteration_count += 1
    # query the model
    res = mistral.generate(
        preprocess(messages), max_length=8192, stop_token_ids=[2], strip_prompt=True
    )
    # output the model's response, add separator line for legibility
    response_text = tokenizer.detokenize(
        res["token_ids"][0, : np.argmax(~res["padding_mask"])]
    )
    print(response_text, f"\n\n\n{'-'*100}\n\n")

    # Check for tool calls and track booking status
    tool_call_id = tokenizer.token_to_id("[TOOL_CALLS]")
    pos = np.where(res["token_ids"][0] == tool_call_id)[0]
    if len(pos) > 0:
        try:
            decoder = json.JSONDecoder()
            tool_calls, _ = decoder.raw_decode(
                tokenizer.detokenize(res["token_ids"][0][pos[0] + 1 :])
            )
            if isinstance(tool_calls, list):
                for call in tool_calls:
                    if isinstance(call, dict) and call.get("name") == "book_flight":
                        # Check if book_flight was called successfully
                        flight_booked = True
                        break
        except (json.JSONDecodeError, KeyError, TypeError, ValueError):
            pass

    # perform tool calls and extend `messages`
    messages.extend(try_parse_funccall(res["token_ids"][0]))

if not flight_booked:
    print("Maximum iterations reached. Flight booking was not completed.")

"""
For clarity, here's the conversation as received by the model, i.e. when truncating after the tool calling JSON:

* **User:**
```
Book the most comfortable flight from Linz to London on the 24th of July 2025, but only if it costs less than 20€ as of the latest exchange rate.
```

* **Model:**
```
[{"name": "find_flights", "arguments": {"origin": "Linz", "destination": "London", "date": "20250724"}}]
```
* **Tool Output:**
```
[{"id": 1, "price": "USD 220", "stops": 2, "duration": 4.5}, {"id": 2, "price": "USD 22", "stops": 1, "duration": 2.0}, {"id": 3, "price": "USD 240", "stops": 2, "duration": 13.2}]
```
* **Model:**
```
Now let's convert the price from USD to EUR using the latest exchange rate:

 [{"name": "convert_currency", "arguments": {"amount": 22, "currency": "USD", "new_currency": "EUR"}}]
```
* **Tool Output:**
```
"18.70"
```
* **Model:**
```
The price of the flight with the id 2 in EUR is 18.70. Since it is below the 20€ limit, let's book this flight:

 [{"name": "book_flight", "arguments": {"id": 2}}]
```

You might have to run the model a few times to obtain output as good as the one shown above. As a 7-billion-parameter model, Mistral may still make mistakes, such as misinterpreting data, outputting malformed tool calls, or making incorrect decisions. However, continued development in this field paves the way for increasingly powerful agentic AI in the future.
"""

"""
## Conclusion

Tool calling is a powerful feature that allows large language models to interact with the real world, access live data, and perform complex calculations. By defining a set of tools and telling the model about them, you can create sophisticated applications that go far beyond simple text generation.
"""