GitHub Repository: IBM/watson-machine-learning-samples
Path: blob/master/cloud/notebooks/python_sdk/deployments/foundation_models/Use watsonx, and `mistral-small-3-1-24b-instruct-2503` to summarize legal Contracts documents.ipynb

Use watsonx and `mistral-small-3-1-24b-instruct-2503` to summarize legal contract documents

Disclaimers

Use only Projects and Spaces that are available in the watsonx context.

Notebook content

This notebook contains the steps and code to demonstrate support for text summarization in watsonx. It introduces commands for data retrieval and model testing.

Some familiarity with Python is helpful. This notebook uses Python 3.11.

Learning goal

The goal of this notebook is to demonstrate how to use the mistral-small-3-1-24b-instruct-2503 model to summarize legal documents.

This notebook contains the following parts:

Set up the environment

Before you use the sample code in this notebook, you must perform the following setup tasks:

Create a watsonx.ai Runtime Service instance (a free plan is offered and information about how to create the instance can be found here).

Install and import dependencies

Note: ibm-watsonx-ai documentation can be found here.

In [1]:

%pip install wget | tail -n 1
%pip install requests | tail -n 1
%pip install -U ibm-watsonx-ai | tail -n 1

Out[1]:

Successfully installed wget-3.2
Successfully installed certifi-2025.10.5 charset_normalizer-3.4.4 idna-3.11 requests-2.32.5 urllib3-2.5.0
Successfully installed anyio-4.11.0 cachetools-6.2.1 h11-0.16.0 httpcore-1.0.9 httpx-0.28.1 ibm-cos-sdk-2.14.3 ibm-cos-sdk-core-2.14.3 ibm-cos-sdk-s3transfer-2.14.3 ibm-watsonx-ai-1.4.4 jmespath-1.0.1 lomond-0.3.3 numpy-2.3.4 pandas-2.2.3 pytz-2025.2 sniffio-1.3.1 tabulate-0.9.0 tzdata-2025.2

Defining the watsonx.ai credentials

This cell defines the watsonx.ai credentials required to work with watsonx Foundation Model inferencing.

Action: Provide the IBM Cloud user API key. For details, see documentation.

In [2]:

import getpass

from ibm_watsonx_ai import Credentials

credentials = Credentials(
    url="https://us-south.ml.cloud.ibm.com",
    api_key=getpass.getpass("Please enter your watsonx.ai api key (hit enter): "),
)

Defining the project id

The Foundation Model requires a project id that provides the context for the call. The id is read from the project in which this notebook runs. If it is not available there, provide the project id manually.

In [3]:

import os

try:
    project_id = os.environ["PROJECT_ID"]
except KeyError:
    project_id = input("Please enter your project_id (hit enter): ")

In [4]:

from ibm_watsonx_ai import APIClient

api_client = APIClient(credentials=credentials, project_id=project_id)

Data loading

Download the legal_contracts_summarization dataset. It contains different legal documents, e.g. terms & conditions or licences, together with their summaries written by humans.

In [5]:

import wget

filename = "contracts_summarization.json"
url = "https://raw.githubusercontent.com/lauramanor/legal_summarization/master/all_v1.json"

if not os.path.isfile(filename):
    wget.download(url, out="contracts_summarization.json")
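
The same download-once behavior can be sketched with the standard library alone, which avoids the extra `wget` dependency. This is a minimal sketch and not part of the original notebook: `download_if_missing` is a hypothetical helper name, and it assumes the URL is reachable from the environment in which the notebook runs.

```python
import os
import urllib.request


def download_if_missing(url: str, filename: str) -> bool:
    """Download `url` to `filename` unless the file already exists.

    Returns True when a download happened and False when the
    existing local copy was reused.
    """
    if os.path.isfile(filename):
        return False
    urllib.request.urlretrieve(url, filename)
    return True


# Hypothetical usage with the names defined in the cell above:
# download_if_missing(url, filename)
```

Returning a flag makes it easy to log whether the cached file or a fresh copy is being used.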

Read the data.

In [6]:

import pandas as pd

data = pd.read_json("contracts_summarization.json").T
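
The transpose (`.T`) matters because of how the file is presumably laid out. The sketch below assumes the JSON is an object keyed by document id, with one record per id; the toy ids and texts are made up for illustration. Under that assumption `read_json` turns the ids into columns, and transposing puts one document per row.

```python
import io

import pandas as pd

# Toy stand-in for the downloaded file: a JSON object keyed by document id.
raw = """
{
  "doc_1": {"original_text": "first long text", "reference_summary": "first"},
  "doc_2": {"original_text": "second long text", "reference_summary": "second"}
}
"""

# read_json treats the top-level keys as columns, so the ids become columns.
by_column = pd.read_json(io.StringIO(raw))

# The transpose turns each document into one row with named fields.
frame = by_column.T
print(frame[["original_text", "reference_summary"]])
```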

Inspect data sample.

In [7]:

import json

data_sample = data[17:27][["original_text", "reference_summary"]]
print(json.dumps(data_sample.values.tolist(), indent=2))

Out[7]:

[
  [
    "these terms and any action related thereto will be governed by the laws of the state of california without regard to its conflict of laws provisions. these terms constitute the entire and exclusive understanding and agreement between niantic and you regarding the services and content and these terms supersede and replace any and all prior oral or written understandings or agreements between niantic and you regarding the services and content. if any provision of these terms is held invalid or unenforceable either by an arbitrator appointed pursuant to the terms of the dispute resolution section above or by a court of competent jurisdiction but only if you timely opt out of arbitration by sending us an arbitration opt out notice in accordance with the terms set forth above that provision will be enforced to the maximum extent permissible and the other provisions of these terms will remain in full force and effect. you may not assign or transfer these terms by operation of law or otherwise without niantic s prior written consent. any attempt by you to assign or transfer these terms without such consent will be null. niantic may freely assign or transfer these terms without restriction. subject to the foregoing these terms will bind and inure to the benefit of the parties their successors and permitted assigns. any notices or other communications provided by niantic under these terms including those regarding modifications to these terms will be given a via email or b by posting to the services. for notices made by email the date of receipt will be deemed the date on which such notice is transmitted. niantic s failure to enforce any right or provision of these terms will not be considered a waiver of such right or provision. the waiver of any such right or provision will be effective only if in writing and signed by a duly authorized representative of niantic. 
except as expressly set forth in these terms the exercise by either party of any of its remedies under these terms will be without prejudice to its other remedies under these terms or otherwise.",
    "california governs these terms. we ll let you know if we make changes to the terms and we might forget to enforce them sometimes."
  ]
]

Check the sample text and summary length.

The original text length statistics.

In [8]:

data.original_text.apply(lambda x: len(x.split())).describe()

Out[8]:

count     446.000000
mean      101.894619
std       143.408492
min         7.000000
25%        32.250000
50%        58.500000
75%       103.000000
max      1077.000000
Name: original_text, dtype: float64

The reference summary length statistics.

In [9]:

data.reference_summary.apply(lambda x: len(x.split())).describe()

Out[9]:

count    446.000000
mean      15.964126
std       11.344199
min        1.000000
25%        9.000000
50%       13.000000
75%       18.000000
max       80.000000
Name: reference_summary, dtype: float64
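
The statistics above count whitespace-separated words. The following sketch reproduces the idea without pandas on made-up toy strings; `word_count` and `length_summary` are hypothetical helper names. Word counts are only a rough proxy for the token counts that generation limits are expressed in.

```python
import statistics


def word_count(text: str) -> int:
    """Count whitespace-separated words, as the pandas cells above do."""
    return len(text.split())


def length_summary(texts):
    """Return a few of the descriptive statistics that describe() reports."""
    counts = [word_count(t) for t in texts]
    return {
        "count": len(counts),
        "mean": statistics.mean(counts),
        "median": statistics.median(counts),
        "min": min(counts),
        "max": max(counts),
    }


toy_texts = [
    "california governs these terms",
    "we may change the terms",
    "contact us",
]
summary_stats = length_summary(toy_texts)
print(summary_stats)
```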

Foundation Models on `watsonx.ai`

List available models

All available models are presented under the TextModels class. For more information, refer to the documentation.

In [10]:

print([model.name for model in api_client.foundation_models.TextModels])

Out[10]:

['DEFENCE_GRANITE', 'GRANITE_3_2_8B_INSTRUCT', 'GRANITE_3_2B_INSTRUCT', 'GRANITE_3_3_8B_INSTRUCT', 'GRANITE_3_8B_INSTRUCT', 'GRANITE_4_H_SMALL', 'GRANITE_8B_CODE_INSTRUCT', 'GRANITE_GUARDIAN_3_8B', 'GRANITE_VISION_3_2_2B', 'LLAMA_3_2_11B_VISION_INSTRUCT', 'LLAMA_3_2_90B_VISION_INSTRUCT', 'LLAMA_3_3_70B_INSTRUCT', 'LLAMA_3_405B_INSTRUCT', 'LLAMA_4_MAVERICK_17B_128E_INSTRUCT_FP8', 'LLAMA_GUARD_3_11B_VISION', 'MISTRAL_MEDIUM_2505', 'MISTRAL_SMALL_3_1_24B_INSTRUCT_2503', 'GPT_OSS_120B', 'ALLAM_1_13B_INSTRUCT']

You need to specify model_id that will be used for inferencing:

In [11]:

model_id = api_client.foundation_models.TextModels.MISTRAL_SMALL_3_1_24B_INSTRUCT_2503

Defining the model parameters

You might need to adjust the model parameters for different models or tasks. To do so, refer to the documentation.

In [12]:

from ibm_watsonx_ai.metanames import GenTextParamsMetaNames as GenParams

parameters = {
    GenParams.DECODING_METHOD: "greedy",
    GenParams.MIN_NEW_TOKENS: 1,
    GenParams.MAX_NEW_TOKENS: 150,
}
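
To illustrate what greedy decoding with a cap on new tokens means, here is a toy sketch. It is not how the service is implemented: the score table, the `<start>` and `<eos>` markers, and `greedy_decode` are all hypothetical, and the minimum-token setting is not modeled. At each step the highest-scoring next token is taken, and generation stops at the end-of-sequence marker or when the cap is reached.

```python
def greedy_decode(next_token_scores, start, max_new_tokens, eos="<eos>"):
    """Pick the highest-scoring next token at every step."""
    tokens = []
    current = start
    for _ in range(max_new_tokens):
        # Unknown contexts end the sequence in this toy model.
        scores = next_token_scores.get(current, {eos: 1.0})
        best = max(scores, key=scores.get)
        if best == eos:
            break
        tokens.append(best)
        current = best
    return tokens


# Made-up scores: each key maps a token to candidate next tokens.
toy_scores = {
    "<start>": {"contact": 0.7, "call": 0.3},
    "contact": {"niantic": 0.9, "<eos>": 0.1},
    "niantic": {"<eos>": 0.6, "today": 0.4},
}

print(greedy_decode(toy_scores, "<start>", max_new_tokens=150))
print(greedy_decode(toy_scores, "<start>", max_new_tokens=1))
```

Because the choice at each step is deterministic, repeated runs with the same input give the same output, which is convenient when comparing summaries against references.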

Initialize the model

Initialize the `ModelInference` class with the previously set parameters.

In [13]:

from ibm_watsonx_ai.foundation_models import ModelInference

model = ModelInference(
    model_id=model_id, params=parameters, credentials=credentials, project_id=project_id
)

Model's details

In [14]:

model.get_details()

Generate document summary

Define instructions for the model.

In [15]:

instruction = "Generate a brief summary of this document:\n"

Prepare model inputs - build few-shot examples.

In [16]:

few_shot_input = []
few_shot_target = []
singleoutput = []

for i, tl in enumerate(data_sample.values):
    if (i + 1) % 2 == 0:
        singleoutput.append(f"    document: {tl[0]}    summary:")
        few_shot_input.append("".join(singleoutput))
        few_shot_target.append(tl[1])
        singleoutput = []
    else:
        singleoutput.append(f"    document: {tl[0]}    summary: {tl[1]}")
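
The loop above pairs consecutive rows: the first row of each pair becomes a solved example (document plus summary), and the second row becomes the open question whose summary the model must write. The sketch below wraps the same logic in a function and runs it on made-up pairs; `build_few_shot_prompts` is a hypothetical name used only for illustration.

```python
def build_few_shot_prompts(pairs):
    """Turn (document, summary) rows into one-shot prompts and their targets."""
    few_shot_input, few_shot_target, pending = [], [], []
    for i, (document, summary) in enumerate(pairs):
        if (i + 1) % 2 == 0:
            # Second row of a pair: leave the summary open for the model.
            pending.append(f"    document: {document}    summary:")
            few_shot_input.append("".join(pending))
            few_shot_target.append(summary)
            pending = []
        else:
            # First row of a pair: a complete in-context example.
            pending.append(f"    document: {document}    summary: {summary}")
    return few_shot_input, few_shot_target


toy_pairs = [("A", "a"), ("B", "b"), ("C", "c"), ("D", "d"), ("E", "e")]
prompts, targets = build_few_shot_prompts(toy_pairs)
print(prompts)
print(targets)
```

With an odd number of rows, the last row has no partner and is dropped, so ten sample rows yield five prompts and five reference summaries.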

Inspect an exemplary input of the few-shot prompt.

In [17]:

print(few_shot_input[0])

Out[17]:

    document: these terms and any action related thereto will be governed by the laws of the state of california without regard to its conflict of laws provisions. these terms constitute the entire and exclusive understanding and agreement between niantic and you regarding the services and content and these terms supersede and replace any and all prior oral or written understandings or agreements between niantic and you regarding the services and content. if any provision of these terms is held invalid or unenforceable either by an arbitrator appointed pursuant to the terms of the dispute resolution section above or by a court of competent jurisdiction but only if you timely opt out of arbitration by sending us an arbitration opt out notice in accordance with the terms set forth above that provision will be enforced to the maximum extent permissible and the other provisions of these terms will remain in full force and effect. you may not assign or transfer these terms by operation of law or otherwise without niantic s prior written consent. any attempt by you to assign or transfer these terms without such consent will be null. niantic may freely assign or transfer these terms without restriction. subject to the foregoing these terms will bind and inure to the benefit of the parties their successors and permitted assigns. any notices or other communications provided by niantic under these terms including those regarding modifications to these terms will be given a via email or b by posting to the services. for notices made by email the date of receipt will be deemed the date on which such notice is transmitted. niantic s failure to enforce any right or provision of these terms will not be considered a waiver of such right or provision. the waiver of any such right or provision will be effective only if in writing and signed by a duly authorized representative of niantic. 
except as expressly set forth in these terms the exercise by either party of any of its remedies under these terms will be without prejudice to its other remedies under these terms or otherwise.    summary: california governs these terms. we ll let you know if we make changes to the terms and we might forget to enforce them sometimes.    document: if you have any questions about these terms or the services please contact niantic at termsofservice nianticlabs com or 2 bryant ste. 220 san francisco ca 94105.    summary:

Generate the legal document summary using the `mistral-small-3-1-24b-instruct-2503` model.

Get the document summaries.

In [18]:

results = []

for inp in few_shot_input:
    results.append(model.generate(" ".join([instruction, inp]))["results"][0])
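
The loop prepends the instruction to each few-shot prompt and keeps the first entry of the `results` list from every response. The sketch below shows that flow without calling any service: `StubModel` is a hypothetical stand-in that mimics only the response fields read by this notebook, and `summarize_all` is an illustrative helper name.

```python
instruction = "Generate a brief summary of this document:\n"


class StubModel:
    """Hypothetical stand-in for the real model object; calls no service."""

    def generate(self, prompt):
        # Mimic only the response fields that the notebook reads.
        return {
            "results": [
                {
                    "generated_text": f" stub summary ({len(prompt)} chars)",
                    "stop_reason": "eos_token",
                }
            ]
        }


def summarize_all(model, prompts):
    """Send instruction + prompt to the model and collect the first result."""
    results = []
    for prompt in prompts:
        full_prompt = " ".join([instruction, prompt])
        results.append(model.generate(full_prompt)["results"][0])
    return results


outputs = summarize_all(StubModel(), ["    document: A    summary:"])
# Generated text can carry a leading space, so strip it before comparing.
summaries = [o["generated_text"].strip() for o in outputs]
print(summaries)
```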

Explore model output.

In [19]:

print(json.dumps(results, indent=2))

Out[19]:

[
  {
    "generated_text": " contact niantic at termsofservice nianticlabs com or 2 bryant ste. 220 san francisco, ca 94105 with any questions about the terms or services.",
    "generated_token_count": 46,
    "input_token_count": 479,
    "stop_reason": "eos_token"
  }
]

Score the model

Note: To run the Score section on the whole legal contracts dataset, convert the following markdown cells to code cells. Keep in mind that scoring the model on the whole dataset might use a significant amount of resources.

This sample notebook uses the spaCy implementation of cosine similarity with the en_core_web_md model to compare generated and reference summaries.

Tip: Consider using a larger spaCy model (for example en_core_web_lg), different word embeddings, or other distance metrics when scoring the output summary against the reference summary.

Get the reference summaries.

y_true = few_shot_target
y_true

Get the predicted summaries.

y_pred = [result['generated_text'] for result in results]

Use spaCy and the en_core_web_md model to calculate the cosine similarity of the generated and reference summaries.

%pip install spacy | tail -1
!python -m spacy download en_core_web_md | tail -1

import en_core_web_md
nlp = en_core_web_md.load()

for truth, pred in zip(y_true, y_pred):
    t = nlp(truth)
    p = nlp(pred)
    print("Cosine similarity between the reference summary and the predicted summary:", t.similarity(p))
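
To make the score easier to interpret, the sketch below computes cosine similarity by hand. It assumes, as is the default in spaCy, that a document vector is the average of its token vectors; the three-dimensional vectors are made up, and `cosine_similarity` and `mean_vector` are hypothetical helper names.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Treat an all-zero vector as having no similarity.
        return 0.0
    return dot / (norm_u * norm_v)


def mean_vector(vectors):
    """Average a list of equal-length vectors component by component."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


# Made-up token vectors for a reference and a predicted summary.
reference_doc = mean_vector([[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
predicted_doc = mean_vector([[1.0, 1.0, 2.0]])
print(cosine_similarity(reference_doc, predicted_doc))
```

The value depends only on the direction of the vectors, not their length, so a short and a long summary on the same topic can still score close to 1.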

Rouge Metric

Note: The Rouge (Recall-Oriented Understudy for Gisting Evaluation) metric is a set of evaluation measures used in natural language processing (NLP) and specifically in text summarization and machine translation tasks. The Rouge metrics are designed to assess the quality of generated summaries or translations by comparing them to one or more reference texts.

The main idea behind Rouge is to measure the overlap between the generated summary (or translation) and the reference text(s) in terms of n-grams or longest common subsequences. By calculating recall, precision, and F1 scores based on these overlapping units, Rouge provides a quantitative assessment of the summary's content overlap with the reference(s).

Rouge-1 focuses on individual word overlap, Rouge-2 considers pairs of consecutive words, and Rouge-L takes into account the ordering of words and phrases. These metrics provide different perspectives on the similarity between two texts and can be used to evaluate different aspects of summarization or text generation models.
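
The n-gram overlap idea can be sketched in a few lines. This is a simplified illustration with whitespace tokenization, not the implementation used by any particular package, so its numbers may differ from a library's output; `ngrams` and `rouge_n` are hypothetical helper names and the example sentences are made up.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Precision, recall, and F1 of n-gram overlap between two texts."""
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"p": precision, "r": recall, "f": f1}


candidate = "these terms are governed by california law"
reference = "california governs these terms"
rouge_1 = rouge_n(candidate, reference, n=1)
rouge_2 = rouge_n(candidate, reference, n=2)
print(rouge_1)
print(rouge_2)
```

Here three of the seven candidate words also appear in the four-word reference, so Rouge-1 precision is 3/7 and recall is 3/4. Only the bigram "these terms" is shared, so Rouge-2 is much lower.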

%pip install rouge

from rouge import Rouge

rouge = Rouge()
# get_scores expects the hypotheses (predictions) first and the references second
scores = rouge.get_scores(y_pred, y_true)
scores

Summary and next steps

You successfully completed this notebook!

You learned how to generate document summaries with mistral-small-3-1-24b-instruct-2503 on watsonx.

Check out our Online Documentation for more samples, tutorials, documentation, how-tos, and blog posts.

Authors

Mateusz Szewczyk, Software Engineer at watsonx.ai.
