GitHub Repository: IBM/watson-machine-learning-samples
Path: blob/master/cloud/notebooks/python_sdk/deployments/foundation_models/Use watsonx, and `mistral-small-3-1-24b-instruct-2503` to summarize legal Contracts documents.ipynb


Use watsonx, and mistral-small-3-1-24b-instruct-2503 to summarize legal Contracts documents

Disclaimers

  • Use only Projects and Spaces that are available in watsonx context.

Notebook content

This notebook contains the steps and code to demonstrate support of text summarization in watsonx. It introduces commands for data retrieval and model testing.

Some familiarity with Python is helpful. This notebook uses Python 3.11.

Learning goal

The goal of this notebook is to demonstrate how to use the mistral-small-3-1-24b-instruct-2503 model to summarize legal documents.

Contents

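This notebook contains the following parts:

  • Set up the environment
  • Data loading
  • Foundation Models on watsonx.ai
  • Generate document summary
  • Score the model
  • Summary and next steps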

Set up the environment

Before you use the sample code in this notebook, you must perform the following setup tasks:

Install and import dependencies

Note: ibm-watsonx-ai documentation can be found here.

%pip install wget | tail -n 1
%pip install requests | tail -n 1
%pip install -U ibm-watsonx-ai | tail -n 1
Successfully installed wget-3.2
Successfully installed certifi-2025.10.5 charset_normalizer-3.4.4 idna-3.11 requests-2.32.5 urllib3-2.5.0
Successfully installed anyio-4.11.0 cachetools-6.2.1 h11-0.16.0 httpcore-1.0.9 httpx-0.28.1 ibm-cos-sdk-2.14.3 ibm-cos-sdk-core-2.14.3 ibm-cos-sdk-s3transfer-2.14.3 ibm-watsonx-ai-1.4.4 jmespath-1.0.1 lomond-0.3.3 numpy-2.3.4 pandas-2.2.3 pytz-2025.2 sniffio-1.3.1 tabulate-0.9.0 tzdata-2025.2

Defining the watsonx.ai credentials

This cell defines the watsonx.ai credentials required to work with watsonx Foundation Model inferencing.

Action: Provide the IBM Cloud user API key. For details, see documentation.

import getpass

from ibm_watsonx_ai import Credentials

credentials = Credentials(
    url="https://us-south.ml.cloud.ibm.com",
    api_key=getpass.getpass("Please enter your watsonx.ai api key (hit enter): "),
)

Defining the project id

The Foundation Model requires a project id that provides the context for the call. The id is read from the PROJECT_ID environment variable of the project in which this notebook runs. If that variable is not set, you are prompted to enter the project id.

import os

try:
    project_id = os.environ["PROJECT_ID"]
except KeyError:
    project_id = input("Please enter your project_id (hit enter): ")

from ibm_watsonx_ai import APIClient

api_client = APIClient(credentials=credentials, project_id=project_id)

Data loading

Download the legal_contracts_summarization dataset. It contains different legal documents, e.g. terms & conditions or licences, together with their summaries written by humans.

import wget

filename = "contracts_summarization.json"
url = "https://raw.githubusercontent.com/lauramanor/legal_summarization/master/all_v1.json"

if not os.path.isfile(filename):
    wget.download(url, out=filename)
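The download step above skips the request when the file is already on disk. As a minimal sketch of the same idea using only the standard library instead of the `wget` package: the helper name `fetch_once` is hypothetical, and a local `file://` URL stands in for the real dataset URL so the example runs offline.

```python
import os
import tempfile
import urllib.request
from pathlib import Path


def fetch_once(url: str, filename: str) -> bool:
    """Download url to filename unless the file already exists.

    Returns True when a download happened, False when the existing copy was kept.
    """
    if os.path.isfile(filename):
        return False
    urllib.request.urlretrieve(url, filename)
    return True


# Offline demonstration: a temporary file served through a file:// URL
# plays the role of the remote JSON file (its content is made up).
workdir = Path(tempfile.mkdtemp())
source = workdir / "source.json"
source.write_text('{"doc": {"original_text": "a", "reference_summary": "b"}}')
target = workdir / "contracts_summarization.json"

first = fetch_once(source.as_uri(), str(target))   # copies the file
second = fetch_once(source.as_uri(), str(target))  # finds it already present
print(first, second)
```

With the notebook's own values, the call would take the dataset URL and "contracts_summarization.json" as arguments.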

Read the data.

import pandas as pd

data = pd.read_json("contracts_summarization.json").T
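The transpose in the cell above suggests that the file is a single JSON object keyed by document id, so that `pd.read_json` places each document in a column. The following sketch shows the same reshaping without pandas. The file layout is an assumption inferred from the `.T` call, and the two records are made up for illustration.

```python
import json

# Hypothetical miniature of the assumed layout: top-level keys are document ids,
# values are per-document records.
raw = json.loads(
    """
    {
      "doc_a": {"original_text": "you may not resell the service.",
                "reference_summary": "no reselling."},
      "doc_b": {"original_text": "we may update these terms at any time.",
                "reference_summary": "terms can change."}
    }
    """
)

# One row per document, which is the orientation the notebook obtains with .T
rows = [{"id": doc_id, **record} for doc_id, record in raw.items()]

for row in rows:
    print(row["id"], "->", row["reference_summary"])
```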

Inspect data sample.

import json

data_sample = data[17:27][["original_text", "reference_summary"]]
print(json.dumps(data_sample.values.tolist(), indent=2))
[ [ "these terms and any action related thereto will be governed by the laws of the state of california without regard to its conflict of laws provisions. these terms constitute the entire and exclusive understanding and agreement between niantic and you regarding the services and content and these terms supersede and replace any and all prior oral or written understandings or agreements between niantic and you regarding the services and content. if any provision of these terms is held invalid or unenforceable either by an arbitrator appointed pursuant to the terms of the dispute resolution section above or by a court of competent jurisdiction but only if you timely opt out of arbitration by sending us an arbitration opt out notice in accordance with the terms set forth above that provision will be enforced to the maximum extent permissible and the other provisions of these terms will remain in full force and effect. you may not assign or transfer these terms by operation of law or otherwise without niantic s prior written consent. any attempt by you to assign or transfer these terms without such consent will be null. niantic may freely assign or transfer these terms without restriction. subject to the foregoing these terms will bind and inure to the benefit of the parties their successors and permitted assigns. any notices or other communications provided by niantic under these terms including those regarding modifications to these terms will be given a via email or b by posting to the services. for notices made by email the date of receipt will be deemed the date on which such notice is transmitted. niantic s failure to enforce any right or provision of these terms will not be considered a waiver of such right or provision. the waiver of any such right or provision will be effective only if in writing and signed by a duly authorized representative of niantic. 
except as expressly set forth in these terms the exercise by either party of any of its remedies under these terms will be without prejudice to its other remedies under these terms or otherwise.", "california governs these terms. we ll let you know if we make changes to the terms and we might forget to enforce them sometimes." ] ]

Check the sample text and summary length.

The original text length statistics.

data.original_text.apply(lambda x: len(x.split())).describe()
count     446.000000
mean      101.894619
std       143.408492
min         7.000000
25%        32.250000
50%        58.500000
75%       103.000000
max      1077.000000
Name: original_text, dtype: float64

The reference summary length statistics.

data.reference_summary.apply(lambda x: len(x.split())).describe()
count    446.000000
mean      15.964126
std       11.344199
min        1.000000
25%        9.000000
50%       13.000000
75%       18.000000
max       80.000000
Name: reference_summary, dtype: float64
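The lengths reported above are whitespace-separated word counts, not model tokens. A small standard-library sketch of the same measurement follows. The three texts are made-up stand-ins for the `original_text` column.

```python
import statistics


def word_count(text: str) -> int:
    """Count whitespace-separated words, as len(x.split()) does in the cells above."""
    return len(text.split())


# Made-up stand-ins for entries of the original_text column.
texts = [
    "california governs these terms.",
    "we may change these terms and will notify you by email.",
    "you may not assign these terms without our written consent.",
]

counts = [word_count(t) for t in texts]
summary = {
    "count": len(counts),
    "mean": statistics.mean(counts),
    "min": min(counts),
    "median": statistics.median(counts),
    "max": max(counts),
}
print(summary)
```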

Foundation Models on watsonx.ai

List available models

All available models are listed under the TextModels class. For more information, refer to the documentation.

print([model.name for model in api_client.foundation_models.TextModels])
['DEFENCE_GRANITE', 'GRANITE_3_2_8B_INSTRUCT', 'GRANITE_3_2B_INSTRUCT', 'GRANITE_3_3_8B_INSTRUCT', 'GRANITE_3_8B_INSTRUCT', 'GRANITE_4_H_SMALL', 'GRANITE_8B_CODE_INSTRUCT', 'GRANITE_GUARDIAN_3_8B', 'GRANITE_VISION_3_2_2B', 'LLAMA_3_2_11B_VISION_INSTRUCT', 'LLAMA_3_2_90B_VISION_INSTRUCT', 'LLAMA_3_3_70B_INSTRUCT', 'LLAMA_3_405B_INSTRUCT', 'LLAMA_4_MAVERICK_17B_128E_INSTRUCT_FP8', 'LLAMA_GUARD_3_11B_VISION', 'MISTRAL_MEDIUM_2505', 'MISTRAL_SMALL_3_1_24B_INSTRUCT_2503', 'GPT_OSS_120B', 'ALLAM_1_13B_INSTRUCT']

You need to specify the model_id that will be used for inferencing:

model_id = api_client.foundation_models.TextModels.MISTRAL_SMALL_3_1_24B_INSTRUCT_2503

Defining the model parameters

You might need to adjust the model parameters for different models or tasks. To do so, please refer to the documentation.

from ibm_watsonx_ai.metanames import GenTextParamsMetaNames as GenParams

parameters = {
    GenParams.DECODING_METHOD: "greedy",
    GenParams.MIN_NEW_TOKENS: 1,
    GenParams.MAX_NEW_TOKENS: 150,
}

Initialize the model

Initialize the ModelInference class with the previously set parameters.

from ibm_watsonx_ai.foundation_models import ModelInference

model = ModelInference(
    model_id=model_id,
    params=parameters,
    credentials=credentials,
    project_id=project_id,
)

Model's details

model.get_details()

Generate document summary

Define instructions for the model.

instruction = "Generate a brief summary of this document:\n"

Prepare model inputs - build few-shot examples.

few_shot_input = []
few_shot_target = []
singleoutput = []

for i, tl in enumerate(data_sample.values):
    if (i + 1) % 2 == 0:
        # Every second row is the query: its summary is left for the model to write.
        singleoutput.append(f" document: {tl[0]} summary:")
        few_shot_input.append("".join(singleoutput))
        few_shot_target.append(tl[1])
        singleoutput = []
    else:
        # The row before it is a solved example that includes its reference summary.
        singleoutput.append(f" document: {tl[0]} summary: {tl[1]}")
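The loop above pairs rows up: the first row of each pair becomes a solved example and the second becomes the query. A self-contained sketch of that pairing on made-up data may make the resulting prompt shape easier to see. The function name `build_few_shot` and the sample pairs are hypothetical.

```python
from typing import List, Sequence, Tuple


def build_few_shot(pairs: Sequence[Tuple[str, str]]) -> Tuple[List[str], List[str]]:
    """Turn (document, summary) rows into one-shot prompts and their targets."""
    prompts: List[str] = []
    targets: List[str] = []
    # Walk the rows two at a time. A trailing unpaired row produces no prompt,
    # which matches the behavior of the loop above.
    for k in range(0, len(pairs) - 1, 2):
        (example_doc, example_sum), (query_doc, query_sum) = pairs[k], pairs[k + 1]
        prompts.append(
            f" document: {example_doc} summary: {example_sum}"
            f" document: {query_doc} summary:"
        )
        targets.append(query_sum)
    return prompts, targets


# Made-up rows standing in for data_sample.values
pairs = [
    ("terms are governed by california law.", "california law applies."),
    ("contact us by email with questions.", "email us with questions."),
    ("unpaired leftover row.", "dropped."),
]

prompts, targets = build_few_shot(pairs)
print(prompts[0])
print(targets)
```

Each prompt ends with an open "summary:" so that the model's continuation is the summary of the query document.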

Inspect an exemplary input of the few-shot prompt.

print(few_shot_input[0])
document: these terms and any action related thereto will be governed by the laws of the state of california without regard to its conflict of laws provisions. these terms constitute the entire and exclusive understanding and agreement between niantic and you regarding the services and content and these terms supersede and replace any and all prior oral or written understandings or agreements between niantic and you regarding the services and content. if any provision of these terms is held invalid or unenforceable either by an arbitrator appointed pursuant to the terms of the dispute resolution section above or by a court of competent jurisdiction but only if you timely opt out of arbitration by sending us an arbitration opt out notice in accordance with the terms set forth above that provision will be enforced to the maximum extent permissible and the other provisions of these terms will remain in full force and effect. you may not assign or transfer these terms by operation of law or otherwise without niantic s prior written consent. any attempt by you to assign or transfer these terms without such consent will be null. niantic may freely assign or transfer these terms without restriction. subject to the foregoing these terms will bind and inure to the benefit of the parties their successors and permitted assigns. any notices or other communications provided by niantic under these terms including those regarding modifications to these terms will be given a via email or b by posting to the services. for notices made by email the date of receipt will be deemed the date on which such notice is transmitted. niantic s failure to enforce any right or provision of these terms will not be considered a waiver of such right or provision. the waiver of any such right or provision will be effective only if in writing and signed by a duly authorized representative of niantic. 
except as expressly set forth in these terms the exercise by either party of any of its remedies under these terms will be without prejudice to its other remedies under these terms or otherwise. summary: california governs these terms. we ll let you know if we make changes to the terms and we might forget to enforce them sometimes. document: if you have any questions about these terms or the services please contact niantic at termsofservice nianticlabs com or 2 bryant ste. 220 san francisco ca 94105. summary:

Get the docs summaries.

results = []

for inp in few_shot_input:
    results.append(model.generate(" ".join([instruction, inp]))["results"][0])

Explore model output.

print(json.dumps(results, indent=2))
[
  {
    "generated_text": " contact niantic at termsofservice nianticlabs com or 2 bryant ste. 220 san francisco, ca 94105 with any questions about the terms or services.",
    "generated_token_count": 46,
    "input_token_count": 479,
    "stop_reason": "eos_token"
  }
]
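Each result carries the generated text together with token counts and a stop reason. The sketch below shows one way to pull out the summaries and to flag generations that did not end on an end-of-sequence token. `StubModel` is a hypothetical offline stand-in that only mimics the response shape printed above and performs no inference, and its values are made up.

```python
from typing import Dict, List


class StubModel:
    """Hypothetical stand-in for ModelInference; returns a fixed, made-up response."""

    def generate(self, prompt: str) -> Dict[str, List[dict]]:
        return {
            "results": [
                {
                    "generated_text": " placeholder summary.",
                    "generated_token_count": 4,   # made-up count
                    "input_token_count": 20,      # made-up count
                    "stop_reason": "eos_token",
                }
            ]
        }


instruction = "Generate a brief summary of this document:\n"
few_shot_input = [" document: example text summary: example document: query text summary:"]

model = StubModel()
results = [
    model.generate(" ".join([instruction, inp]))["results"][0]
    for inp in few_shot_input
]

# Keep the text without surrounding whitespace and record whether generation
# stopped on an end-of-sequence token rather than for some other reason.
summaries = [r["generated_text"].strip() for r in results]
ended_on_eos = [r["stop_reason"] == "eos_token" for r in results]
print(summaries, ended_on_eos)
```

A generation that stops for another reason may have been cut off, in which case a larger MAX_NEW_TOKENS value might be worth trying.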

Score the model

Note: To run the Score section on the whole legal_contracts_summarization dataset, transform the following markdown cells into code cells. Keep in mind that scoring the model on the whole dataset might use a significant amount of resources.

This sample notebook uses the spaCy implementation of cosine similarity with the en_core_web_md model to compare generated and reference summaries.

Tip: You might consider using a bigger language corpus, different word embeddings, and different distance metrics for scoring the output summary against the reference summary.

Get the true labels.

y_true = few_shot_target
y_true

Get the prediction labels.

y_pred = [result['generated_text'] for result in results]

Use spaCy and the en_core_web_md model to calculate the cosine similarity of the generated and reference summaries.

%pip install spacy | tail -1
!python -m spacy download en_core_web_md | tail -1

import en_core_web_md

nlp = en_core_web_md.load()

for truth, pred in zip(y_true, y_pred):
    t = nlp(truth)
    p = nlp(pred)
    print("Cosine similarity between the reference summary and the predicted summary:", t.similarity(p))
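spaCy computes this similarity on word vectors. To illustrate the cosine formula itself without a model download, the sketch below swaps in plain word-count vectors, so its scores will differ from spaCy's. The function name and the sample sentences are made up for illustration.

```python
import math
from collections import Counter


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine of the angle between two bag-of-words count vectors."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    # The dot product only needs the words the two texts share.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


reference = "california governs these terms"
prediction = "these terms are governed by california"
score = cosine_similarity(reference, prediction)
print(round(score, 4))
```

Word-count vectors only reward exact word matches ("governs" and "governed" count as different words), which is one reason the notebook relies on word embeddings instead.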

Rouge Metric

Note: The Rouge (Recall-Oriented Understudy for Gisting Evaluation) metric is a set of evaluation measures used in natural language processing (NLP) and specifically in text summarization and machine translation tasks. The Rouge metrics are designed to assess the quality of generated summaries or translations by comparing them to one or more reference texts.

The main idea behind Rouge is to measure the overlap between the generated summary (or translation) and the reference text(s) in terms of n-grams or longest common subsequences. By calculating recall, precision, and F1 scores based on these overlapping units, Rouge provides a quantitative assessment of the summary's content overlap with the reference(s).

Rouge-1 focuses on individual word overlap, Rouge-2 considers pairs of consecutive words, and Rouge-L takes into account the ordering of words and phrases. These metrics provide different perspectives on the similarity between two texts and can be used to evaluate different aspects of summarization or text generation models.
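To make the overlap counting concrete, here is a minimal pure-Python sketch of Rouge-N (clipped n-gram overlap) and Rouge-L (longest common subsequence) with precision, recall, and F1. It is an illustration under simple assumptions (whitespace tokenization, a single reference), not the implementation used by the rouge package, whose tokenization and counting details may differ. All function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def _prf(overlap: int, hyp_len: int, ref_len: int) -> Dict[str, float]:
    """Precision, recall and F1 from an overlap count and the two lengths."""
    p = overlap / hyp_len if hyp_len else 0.0
    r = overlap / ref_len if ref_len else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return {"p": p, "r": r, "f": f}


def rouge_n(hypothesis: str, reference: str, n: int = 1) -> Dict[str, float]:
    """Overlap of n-grams, with each n-gram counted at most as often as it
    appears in the other text."""
    def ngrams(tokens: List[str]) -> Counter:
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    hyp, ref = ngrams(hypothesis.split()), ngrams(reference.split())
    overlap = sum((hyp & ref).values())
    return _prf(overlap, sum(hyp.values()), sum(ref.values()))


def rouge_l(hypothesis: str, reference: str) -> Dict[str, float]:
    """Overlap measured by the longest common subsequence of words."""
    h, r = hypothesis.split(), reference.split()
    table = [[0] * (len(r) + 1) for _ in range(len(h) + 1)]
    for i, hw in enumerate(h, 1):
        for j, rw in enumerate(r, 1):
            if hw == rw:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return _prf(table[len(h)][len(r)], len(h), len(r))


reference = "california governs these terms"
hypothesis = "these terms follow california law"

r1 = rouge_n(hypothesis, reference, 1)
r2 = rouge_n(hypothesis, reference, 2)
rl = rouge_l(hypothesis, reference)
print(r1, r2, rl)
```

In this example three of the words match, but only "these terms" appears in the same order in both texts, so Rouge-L scores lower than Rouge-1.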

%pip install rouge
from rouge import Rouge

rouge = Rouge()
# get_scores expects the hypotheses (predictions) first and the references second
scores = rouge.get_scores(y_pred, y_true)
scores

Summary and next steps

You successfully completed this notebook!

You learned how to generate document summaries with the mistral-small-3-1-24b-instruct-2503 model on watsonx.

Check out our Online Documentation for more samples, tutorials, documentation, how-tos, and blog posts.

Authors

Mateusz Szewczyk, Software Engineer at watsonx.ai.

Copyright © 2024-2026 IBM. This notebook and its source code are released under the terms of the MIT License.